Want to learn how Facebook scales its load-balancing infrastructure to support more than a billion users? This talk reveals the technologies and methods used to route and balance Facebook’s traffic, focusing on Facebook’s DNS load balancer and software load balancer, and on how these systems improve user performance, manage capacity, and increase reliability.
Facebook is used by people located all over the world, and its Traffic team is responsible for balancing that traffic and making our network as fast as possible. The Traffic team at Facebook has built several systems for managing and balancing our site traffic, including both a DNS load balancer and a software load balancer capable of handling several protocols.
Our DNS load balancer has two major components: a central GLB decision engine, written in Python, that makes all the traffic-balancing decisions and generates DNS maps; and an existing open source DNS server written in C (tinydns) that serves the actual DNS traffic, directing users to clusters based on a lookup table loaded from the DNS map.
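To make the division of labor concrete, here is a minimal sketch of the hand-off between the two components: the decision engine's output is just a mapping from names to cluster addresses, which can be rendered into tinydns's `+fqdn:ip:ttl` data-line format for the C server to load. The function name, the hostname, and the IP (from the TEST-NET example range) are all illustrative assumptions, not Facebook's actual code or data.

```python
# Hypothetical sketch: render a decision-engine output map into
# tinydns-data "+fqdn:ip:ttl" lines (the A-record line format tinydns reads).
def to_tinydns_lines(dns_map, ttl=30):
    """dns_map: {fqdn: chosen cluster VIP}; a short TTL lets new maps
    take effect quickly when the engine rebalances traffic."""
    return ["+%s:%s:%d" % (fqdn, ip, ttl)
            for fqdn, ip in sorted(dns_map.items())]

lines = to_tinydns_lines({"www.example.com": "192.0.2.1"})
print("\n".join(lines))  # -> +www.example.com:192.0.2.1:30
```

The short TTL is the key design lever here: it bounds how long users keep resolving to a cluster after the decision engine has moved them elsewhere.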
Our Python decision engine is named Cartographer. It gathers information on internet topology, user latency, user bandwidth, and compute cluster load/availability/performance, then crunches that data to determine the current best cluster to point each ISP’s users at. Cartographer also receives a continuous stream of updates from its different monitoring channels and automatically pushes new DNS maps to the DNS server whenever it needs to adjust cluster load or react to network problems. (It can react both to a gross interruption of service due to a problem with Facebook’s network or clusters and to localized outages for users in a given country or on a given ISP.)
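The core decision step can be sketched as a per-ISP selection over clusters. This is a hedged illustration only: the 90% load cap, the latency-first scoring rule, and all names are assumptions for the sake of the example, not Cartographer's actual algorithm.

```python
# Illustrative Cartographer-style decision step (assumed, not actual code):
# for each ISP, pick the lowest-latency cluster that still has spare capacity.
def best_cluster(clusters, latency_ms, load, load_cap=0.9):
    """latency_ms: {cluster: measured latency for this ISP's users};
    load: {cluster: current utilization in [0, 1]}."""
    usable = [c for c in clusters if load[c] < load_cap]  # capacity check (assumed threshold)
    return min(usable, key=lambda c: latency_ms[c])       # then optimize latency

def build_map(isps, clusters, latency_ms, load):
    """Produce the DNS map: each ISP mapped to its current best cluster."""
    return {isp: best_cluster(clusters, latency_ms[isp], load)
            for isp in isps}

dns_map = build_map(
    isps=["isp-a", "isp-b"],
    clusters=["east", "west"],
    latency_ms={"isp-a": {"east": 20, "west": 80},
                "isp-b": {"east": 90, "west": 30}},
    load={"east": 0.5, "west": 0.7},
)
# dns_map -> {"isp-a": "east", "isp-b": "west"}
```

Re-running this whenever the monitoring feeds change, and pushing the new map only when it differs from the last one, gives the continuous rebalancing behavior described above.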
We will talk about the structure of Cartographer and explain some of its core algorithms for programmatically balancing traffic. Because it handles traffic-routing decisions for more than a billion users on Facebook, it is a great example of a small Python application having an outsized impact.
Adam has spent the past 8 years diffusing the firehose of traffic for some of the biggest web sites on the internet. He has wholesale replaced the front end load balancing architecture on a massively growing site while it was serving… twice.
Adam is currently a Production Engineering Manager in the Traffic team at Facebook. He and his team build simple, reliable, and scalable components that adapt to the demands of the fastest moving site on the internet.
When Adam is not evangelizing the tenets of the UNIX philosophy, he is a dedicated father and amateur racing daydreamer.