k-shimi musicTech-life

Blog by Shimon Koifman

Flickr Architecture Lesson

Posted by shimikoif on January 10, 2008

I was reading a paper about Flickr's architecture and would really like to share some of its highlights with you.

Flickr faces a great challenge: it must handle a vast sea of non-stop expanding content, ever-increasing legions of users, and a constant stream of new features, all while providing excellent performance and usability. How do they do it?

The stats:

  • More than 4 billion queries per day.
  • 470M photos, with 4 or 5 sizes of each.
  • 2 PB of raw storage (consuming about 1.5 TB on a Sunday).
  • Over 400,000 photos added every day.

Hardware:

  • EM64T boxes with RHEL4 and 16 GB RAM.
  • 6-disk 15K RPM storage.
  • Data size is at 12 TB of user metadata (this is not photos, just InnoDB ibdata files; the photos are a lot larger).
  • 2U boxes. Each shard has ~120 GB of data.

General:

  • Use a shared-nothing architecture.
  • Everything (except photos) is stored in the database.
  • Statelessness means they can bounce people between servers, and it makes their APIs easier to build.
  • Scaled at first by replication, but that only helps with reads.
  • Create a search farm by replicating the portion of the database they want to search.
  • Use horizontal scaling (sharding), so they just need to add more machines (see the sketch after this list).
  • Handle pictures emailed in by users by parsing each email as it is delivered, in PHP. Each email is parsed for photos.
  • Earlier they suffered from master-slave lag: too much load, and they had a single point of failure.
  • Photos are stored on the filer. Upon upload, the photos are processed into the different sizes, and then the upload is complete. Metadata and pointers to the filers are stored in the database.
  • Tags do not fit well with traditional normalized RDBMS schema design. De-normalization or heavy caching is the only way to generate a tag cloud in milliseconds for hundreds of millions of tags.
  • Some of their data views are calculated offline by dedicated processing clusters that save the results into MySQL, because some relationships are so complicated to calculate that they would absorb all the database's CPU cycles.
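To make the shared-nothing, horizontally scaled idea concrete, here is a minimal sketch (in Python, for readability) of routing each user to a shard by hashing their user ID. The host names, DSNs and hashing scheme are my own assumptions for illustration, not Flickr's actual implementation, which used PHP and more sophisticated shard lookup.

# Minimal sketch of shard routing in a shared-nothing setup.
# Hosts and hashing scheme are illustrative assumptions, not Flickr's code.
import hashlib

# Hypothetical list of shard databases; each owns its own slice of user data.
SHARDS = [
    {"host": "db-shard-1.example.com", "dsn": "mysql://db-shard-1/flickr"},
    {"host": "db-shard-2.example.com", "dsn": "mysql://db-shard-2/flickr"},
    {"host": "db-shard-3.example.com", "dsn": "mysql://db-shard-3/flickr"},
]

def shard_for_user(user_id: int) -> dict:
    """Map a user ID to one shard deterministically.

    A stable hash keeps all of a user's rows on the same shard, so each
    2U box owns its ~120 GB slice independently (the shared-nothing property).
    """
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    index = int(digest, 16) % len(SHARDS)
    return SHARDS[index]

def add_capacity(new_host: str) -> None:
    """Horizontal scaling: adding machines means adding shards.

    In practice this needs rebalancing or a user-to-shard lookup table,
    so existing users do not silently move to a different shard.
    """
    SHARDS.append({"host": new_host, "dsn": f"mysql://{new_host}/flickr"})

if __name__ == "__main__":
    print(shard_for_user(42)["host"])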

Lessons Learned

Think of your application as more than just a web application. You’ll have REST APIs, SOAP APIs, RSS feeds, Atom feeds, etc.

Go stateless. Statelessness makes for a simpler, more robust system that can handle upgrades without flinching.

Re-architecting your database sucks.

Capacity plan. Bring capacity planning into the product discussion EARLY. Get buy-in from the $$$ people (and engineering management) that it’s something to watch.

Start slow. Don’t buy too much equipment just because you’re scared/happy that your site will explode.

Measure reality. Capacity planning math should be based on real things, not abstract ones.

Build in logging and metrics. Usage stats are just as important as server stats. Build custom metrics to measure real-world usage alongside server-based stats.

Cache. Caching and RAM are the answer to everything.
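As an illustration of the caching point, here is a minimal cache-aside sketch: check RAM first, fall back to the database on a miss, and invalidate on writes. Flickr used memcached for this layer; the in-process dictionary, TTL handling and the load_user_from_db() stand-in below are my own assumptions, not their code.

# Minimal cache-aside sketch: check RAM first, fall back to the database.
import time

CACHE: dict[str, tuple[float, object]] = {}   # key -> (expiry_time, value)
TTL_SECONDS = 300

def load_user_from_db(user_id: int) -> dict:
    # Placeholder for the real (expensive) database query.
    return {"id": user_id, "name": f"user{user_id}"}

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"
    hit = CACHE.get(key)
    if hit is not None and hit[0] > time.time():
        return hit[1]                          # served from RAM, no DB work
    value = load_user_from_db(user_id)         # cache miss: hit the database
    CACHE[key] = (time.time() + TTL_SECONDS, value)
    return value

def invalidate_user(user_id: int) -> None:
    # On writes, drop the cached copy so readers do not see stale data.
    CACHE.pop(f"user:{user_id}", None)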

Abstract. Create clear levels of abstraction between database work, business logic, page logic, page mark-up and the presentation layer. This supports quick-turnaround, iterative development.

Layer. Layering allows developers to create page-level logic which designers can use to build the user experience. Designers can ask for page logic as needed. It's a negotiation between the two parties.
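To make the abstraction and layering points concrete, here is a tiny sketch of the kind of separation being described: database access, business logic and page logic kept in distinct functions, so designers can build mark-up against the page-logic layer alone. The names and the fake schema are illustrative assumptions, not Flickr's code.

# Tiny sketch of layered separation; all names are illustrative.

def fetch_photo_rows(user_id: int) -> list[dict]:
    """Database layer: the only place that knows about tables and SQL."""
    # In reality this would query the user's shard.
    return [{"id": 1, "title": "Sunset", "views": 120}]

def recent_popular_photos(user_id: int, min_views: int = 100) -> list[dict]:
    """Business-logic layer: the rules about what 'popular' means."""
    return [p for p in fetch_photo_rows(user_id) if p["views"] >= min_views]

def photo_page_context(user_id: int) -> dict:
    """Page-logic layer: assembles exactly what the template needs,
    so designers build mark-up against a stable contract."""
    photos = recent_popular_photos(user_id)
    return {"photo_count": len(photos), "photos": photos}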

Release frequently. Even every 30 minutes.

Forget about small efficiencies, about 97% of the time. Premature optimization is the root of all evil.

Test in production. Build mechanisms into the architecture (config flags, load balancing, etc.) with which you can deploy new hardware easily into (and out of) production.
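As a sketch of the config-flag idea, here is one way to dial a new box into production gradually by weighting it in the host pool, and to pull it back out without a deploy. The flag names, hosts and weights are assumptions for illustration, not Flickr's actual tooling.

# Minimal sketch of config-driven traffic weighting for new hardware.
import random

# A config flag per host: weight 0 takes the host out of rotation entirely.
HOST_WEIGHTS = {
    "web-01.example.com": 100,
    "web-02.example.com": 100,
    "web-03.example.com": 5,    # new box, taking only a trickle of real traffic
}

def pick_host() -> str:
    """Choose a host in proportion to its configured weight."""
    hosts = [h for h, w in HOST_WEIGHTS.items() if w > 0]
    weights = [HOST_WEIGHTS[h] for h in hosts]
    return random.choices(hosts, weights=weights, k=1)[0]

def set_weight(host: str, weight: int) -> None:
    """Flip the flag: ramp a host up once it proves healthy,
    or set it to 0 to pull it out of production."""
    HOST_WEIGHTS[host] = weight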

Forget benchmarks. Benchmarks are fine for getting a general idea of capabilities, but not for planning. Artificial tests give artificial results, and the time is better spent testing for real.

Find ceilings (a small headroom sketch follows the list):
– What is the maximum something that every server can do?
– How close are you to that maximum, and how is it trending?
– MySQL (disk IO?)
– SQUID (disk IO? or CPU?)
– memcached (CPU? or network?)
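Here is a tiny illustration of tracking ceilings: compute the remaining headroom against a measured maximum, and the week-over-week trend toward it. The ceiling values and sample measurements are made-up numbers, purely for illustration.

# Tiny sketch of ceiling/headroom tracking; all numbers are made up.

# Hypothetical ceilings found by testing each box for real.
CEILINGS = {
    "mysql_disk_iops": 1200,     # max sustainable disk IO operations/sec
    "memcached_net_mbps": 900,   # max sustainable network throughput
}

def headroom(metric: str, current: float) -> float:
    """Fraction of the ceiling still unused (1.0 = idle, 0.0 = at the wall)."""
    return 1.0 - current / CEILINGS[metric]

def weekly_trend(samples: list[float]) -> float:
    """Average week-over-week change, to see how fast the ceiling approaches."""
    deltas = [b - a for a, b in zip(samples, samples[1:])]
    return sum(deltas) / len(deltas)

if __name__ == "__main__":
    weekly_iops = [640, 700, 760, 830]           # four weeks of peak readings
    print(f"headroom: {headroom('mysql_disk_iops', weekly_iops[-1]):.0%}")
    print(f"trend: +{weekly_trend(weekly_iops):.0f} IOPS/week")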

Be sensitive to the usage patterns for your type of application.
– Do you have event-related growth? For example: a disaster or a news event.
– Flickr gets 20-40% more uploads on the first work day of the year than on any peak day the previous year.
– 40-50% more uploads on Sundays than the rest of the week, on average.

Be sensitive to the demands of exponential growth. More users mean more content, more content means more connections, and more connections mean more usage.

Plan for peaks. Be able to handle peak loads up and down the stack.
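To make the peak-planning point concrete, here is a rough worked example using the numbers above. The arithmetic and the chosen factors are my own, not from the original post.

# Rough worked example of peak provisioning (my own arithmetic).
average_daily_uploads = 400_000          # from the stats above
sunday_factor = 1.5                      # Sundays run up to ~50% above the rest of the week
new_year_factor = 1.4                    # New Year peak up to ~40% above last year's peak

peak_uploads = average_daily_uploads * sunday_factor * new_year_factor
print(f"Provision for roughly {peak_uploads:,.0f} uploads/day")   # ~840,000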
