Saturday, March 28, 2009

Facebook architecture & lessons learned

There is a very nice and useful talk by Aditya Agarwal (Director of Engineering at Facebook) about Facebook's architecture and the lessons learned. The presentation is called Facebook: Science and the Social Graph (I am not sure why they picked this name; it sounds a little misleading to me).

I have watched it just once and I didn't catch everything, but a few points caught my attention:
  • No distributed transactions at the database level (the way they ensure, or rather repair, data integrity is similar to what eBay engineers do, i.e. offline background jobs). The database schema is probably very simple (keeping things simple helps a lot if your system is expected to grow fast and needs to be scalable [i.e. distributed]). They hold only static data in the database, and only the most recent data. Older data is moved to the warehouse. I guess it is then analyzed with Hadoop and Cassandra.
    There is also an older blog post by Greg Linden about scaling Facebook databases.
  • PHP for the front end works well now, BUT the question is whether they would choose it again if they were given a chance to rewrite the system. It turns out that PHP is a good language for web programming, but it is not a very fast language and the cost of maintaining the code base can grow high. It seems that the only justification for using PHP is (was) that it allowed them to start quickly, and it was probably the only front-end language they had enough experience with. They try to stick to the principle known as "Use the right language, library and tool for the task" (see 46:00), so it is funny to see Aditya get a question from the audience: "Is your bet on PHP driven by the same principle?" (see 48:40). To me it is obvious that the fact that PHP is a dynamic (no static typing = much harder code analysis and debugging) and interpreted language causes them a lot of trouble.
  • Memcached is one of the biggest stars in their stack, although they had to implement several customizations (probably the hardest one was switching it to UDP).
  • Thrift (50:00) seems to be one of their most interesting open source projects.
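
To make the first point more concrete, here is a minimal sketch of what "no distributed transactions, repair offline" could look like. It is only my illustration, not Facebook's code: the two in-memory "shards", the friend-list data model and all function names (shard_for, add_friendship, find_missing_edges, repair) are assumptions made up for this example. A friendship is written to two shards with two independent writes, and a background job later finds and repairs edges that exist in only one direction.

```python
from collections import defaultdict

# Hypothetical setup: each user's friend list lives on the shard that owns
# that user. Plain in-memory dicts stand in for separate database servers.
NUM_SHARDS = 2
shards = [defaultdict(set) for _ in range(NUM_SHARDS)]


def shard_for(user_id):
    """Pick the shard that owns a user (simple modulo placement)."""
    return shards[user_id % NUM_SHARDS]


def add_friendship(a, b, fail_second_write=False):
    """Record a friendship with two independent writes.

    There is no cross-shard transaction, so a failure between the two
    writes leaves the edge stored in one direction only.
    """
    shard_for(a)[a].add(b)
    if fail_second_write:
        return  # simulate a crash before the second write
    shard_for(b)[b].add(a)


def find_missing_edges():
    """Offline scan: every edge a -> b should have a matching b -> a."""
    missing = []
    for shard in shards:
        for user, friends in list(shard.items()):
            for friend in friends:
                # .get() avoids creating empty entries in the defaultdict
                if user not in shard_for(friend).get(friend, set()):
                    missing.append((friend, user))
    return missing


def repair():
    """Background job: add the reverse edge wherever it is missing."""
    fixed = find_missing_edges()
    for user, friend in fixed:
        shard_for(user)[user].add(friend)
    return fixed
```

Calling add_friendship(3, 4, fail_second_write=True) leaves user 4 without user 3 in their friend list; a later repair() run puts the missing edge back. The trade-off is that readers can see inconsistent data for a while, which is the price paid for avoiding coordination between databases on the write path.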
For more information you can also go to
