Internet environments for consumer-facing applications routinely demand high throughput and sub-millisecond latencies for read/write transactions against terabytes of data, and service-level agreements demand 100% uptime. This session will review 10 proven practices for ensuring the high performance and availability that interactive Internet applications demand—even during power outages or natural disasters. These real-world lessons come from supporting the large-scale, multiple data center deployments of CTOs delivering platforms for the high-stakes ad sector, where speed means responses in 5 milliseconds or less, scale ranges from 200,000 to 2 million TPS against terabytes of data, and downtime is not an option. The lessons include:
1. When scaling, keep the architecture simple, so there are fewer points of failure. For instance, load balancers may fail at high transaction rates even as the database is cruising.
2. Provide full end-to-end automation. People make mistakes, and anything that’s not automated will have production issues.
3. Keep the system asynchronous; otherwise one small failure will quickly snowball into an avalanche of degradation.
4. Keep metrics of everything, because scale tends to creep up from behind, and no one wants to be caught blind.
5. Ensure full intra-data center redundancy because servers fail…often.
6. Extend full data redundancy across multiple data centers, so storms like Sandy don’t put operations out of commission.
7. Have a back-up plan for a remote graceful shutdown that accounts for IP-based security.
8. Make sure code is testable, so there’s a way to let the world know what’s going on.
9. Divide intelligence into online and offline, so all the heavy lifting with predictive modeling is offline.
10. Use the right data management tool for the job; too often “all-in-one” means mediocre for all.
Srini V. Srinivasan, Aerospike founder and vice president of engineering and operations, brings 20-plus years of experience in designing, developing and operating Web-scale infrastructures, including those of Aerospike customers. He holds over a dozen patents in database, Internet, mobile, and distributed system technologies. Srini co-founded Aerospike to solve the scaling problems he experienced with Oracle databases at Yahoo! where, as senior director of engineering, he had global responsibility for the development, deployment and 24×7 operations of Yahoo!’s mobile products, in use by tens of millions of users. Srini joined Yahoo! as part of the Verdisoft acquisition, where, as vice president of engineering, he oversaw the development of high-performance data synchronization products for mobile users. Srini also was chief architect of IBM’s DB2 Internet products, and he served as senior architect of digital TV products at Liberate Technologies.
View a complete list of Strata + Hadoop World 2013 contacts