Facebook Messages and HBase

Data: Hadoop
Location: C123
Average rating: ★★★★ (4.38, 8 ratings)

When Facebook first started exploring the next generation of Messages, handling the sheer volume and growth of data was a primary concern. The current Messages infrastructure handles over 350 million users sending over 15 billion person-to-person messages per month. Our chat service supports over 300 million users who send over 120 billion messages per month. The previous storage implementation used MySQL, but many limitations arose as it scaled. Read-optimized data structures, manual sharding, rigid replication options, and operationally intensive scaling were MySQL pain points that prompted a look at alternative storage solutions. We set up test frameworks to evaluate clusters of MySQL, Apache Cassandra, Apache HBase, and a couple of other systems, and ultimately chose HBase as the best fit for our workload.

This is a technically dense talk that should primarily appeal to software engineers and system architects. No prior knowledge of HBase is necessary; however, we expect attendees to have a comfortable grasp of computer architecture and database fundamentals. The audience should leave with an understanding of the benefits of an HBase datastore, how Facebook uses HBase to reach massive scale with reliability, and common APIs and configuration parameters to consider when deploying your own HBase solution.
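For readers curious what "configuration parameters to consider" might look like in practice, here is a minimal, hypothetical `hbase-site.xml` fragment. The property names are real HBase settings, but the values are illustrative assumptions for a sketch, not Facebook's production tuning.

```xml
<!-- hbase-site.xml: illustrative values only, not Facebook's settings -->
<configuration>
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>30</value> <!-- RPC handler threads per RegionServer -->
  </property>
  <property>
    <name>hbase.hregion.max.filesize</name>
    <value>1073741824</value> <!-- region split threshold: 1 GB -->
  </property>
  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>134217728</value> <!-- flush MemStore to disk at 128 MB -->
  </property>
  <property>
    <name>hfile.block.cache.size</name>
    <value>0.25</value> <!-- fraction of heap for the block cache -->
  </property>
</configuration>
```

Settings like these trade off write throughput, compaction frequency, and read latency; the right values depend heavily on workload, which is part of what the talk explores.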

Nicolas Spiegelberg


Nicolas Spiegelberg is a storage engineer on the Facebook Messages team. He helped carry the HBase storage solution for Facebook Messages from design to deployment. Additionally, Nicolas is an HBase committer and PMC member who has contributed many critical features, such as HDFS data reliability, Bloom filters, and an enhanced compaction algorithm.

UA-Huntsville: Master's in Computer Engineering

Comments on this page are now closed.


08/11/2011 11:48pm PDT

Is there a webcast of this event?

Shu Zhang
07/27/2011 4:50am PDT

Don’t know if this is the right forum for it, but I didn’t get a chance to ask in person and didn’t see your email anywhere. So, splitting the data into different HBase clusters was only for partitioning, right? Is any replication done at all across geographies? How do you deal with minimizing latencies for data access across the world? Maybe it’s in the cache layer, or maybe the partitioning takes the user’s geography into account?

An unrelated question is, without any cross-cluster replication, how do you deal with a datacenter outage?

Thanks, my email is shuzhang0@gmail.com

Siddharth Anand
07/26/2011 5:13am PDT

Great talk!