Build Systems that Drive Business
Sep 30–Oct 1, 2018: Training
Oct 1–3, 2018: Tutorials & Conference
New York, NY

Resilient, Performant, & Secure Distributed Systems
Design and build secure, robust, complex systems

Failure is inevitable given the complexity of our systems. Hear insider accounts of failures and learn how to detect and recover from them efficiently, as well as how complex technical systems and problems introduce risk and security issues. Learn the principles and practices for designing and managing systems that are secure, robust, and adaptable and that can recover gracefully from failure. Topics include building and maintaining complex systems, site reliability engineering, infrastructure as code, chaos engineering, and more.

We'll help you solve your toughest challenges with real-world advice from leaders in the field who have grappled with the same problems you're facing today, including how to:

  • Make distributed system development easier and more reliable
  • Build robust distributed systems using containers
  • Increase system predictability by identifying processes that can benefit the most from automation
  • Keep your serverless apps secure and resilient
  • Deal with sensitive data in distributed systems
1:30pm–5:00pm Monday, October 1, 2018
Location: Sutton South/Regent Parlor Level: Beginner
Secondary topics:  Resilient, Performant & Secure Distributed Systems
Tammy Butow (Gremlin), Ana Margarita Medina (Gremlin), Patrick Higgins (Gremlin)
Average rating: ***..
(3.00, 2 ratings)
Chaos engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production. Tammy Butow, Ana Medina, and Patrick Higgins lead a hands-on deep dive into chaos engineering, covering the tools and practices you need to implement it in your organization.
11:35am–12:15pm Tuesday, October 2, 2018
Location: Nassau Level: Intermediate
Secondary topics:  Resilient, Performant & Secure Distributed Systems
Ameet Kotian (Slack)
Average rating: *****
(5.00, 1 rating)
Slack’s rapid growth over the last few years outpaced the scaling capacity of its original database, which negatively impacted the company's customers and engineers. Ameet Kotian explains how a small team of engineers set out to find the right database solution, a search that eventually led them to Vitess, an open source cluster database.
2:25pm–3:05pm Tuesday, October 2, 2018
Location: Murray Hill Level: Non-technical
Secondary topics:  Resilient, Performant & Secure Distributed Systems
Heidi Waterhouse (LaunchDarkly)
Average rating: *****
(5.00, 4 ratings)
Waffle House's hurricane disaster plan has everything you could want from an IT disaster plan, including contact trees, failover states, and runbooks on partial operation. Heidi Waterhouse shares lessons about state drawn from the world outside computers and explains how to quantify them using a finite state machine and implement them automatically while you are in a less-than-perfect condition.
2:25pm–3:05pm Tuesday, October 2, 2018
Location: Nassau Level: Intermediate
Secondary topics:  Resilient, Performant & Secure Distributed Systems
Kristina Bennett (Google)
Average rating: ***..
(3.50, 2 ratings)
Kristina Bennett shares best practices for practical data recoverability and highlights some of the pitfalls awaiting the unwary, based on lessons learned from five years of data integrity tooling and consulting across Google.
2:25pm–3:05pm Tuesday, October 2, 2018
Location: Beekman/Sutton North Level: Intermediate
Secondary topics:  Resilient, Performant & Secure Distributed Systems
Naoman Abbas (Pinterest)
Average rating: *****
(5.00, 1 rating)
Naoman Abbas offers an overview of the tools Pinterest built to process trace data, the use cases they’ve enabled, and some real-world examples. Join in to learn how to apply these techniques to your own challenges.
3:50pm–4:30pm Tuesday, October 2, 2018
Location: Nassau Level: Beginner
Secondary topics:  Resilient, Performant & Secure Distributed Systems
Bart De Vylder (CoScale)
Average rating: ****.
(4.00, 2 ratings)
Bart De Vylder shares his experience migrating an existing codebase and production environment to Kafka Streams, a relatively new and promising streaming library. Join in to see what aspects worked remarkably well and the challenges he ran into along the way.
3:50pm–4:30pm Tuesday, October 2, 2018
Location: Beekman/Sutton North Level: Beginner
Secondary topics:  Resilient, Performant & Secure Distributed Systems
Amy Nguyen (Stripe), Cory Watson (Stripe)
Average rating: ***..
(3.00, 1 rating)
You're unsatisfied with one of your monitoring providers. You've considered finding a new solution, but the thought of migrating your data off their platform sounds extremely painful. Amy Nguyen and Cory Watson explain how to meet a deadline for an infrastructure-critical software migration while ensuring that everyone's requirements are met and no data is lost.
4:45pm–5:25pm Tuesday, October 2, 2018
Location: Nassau Level: Beginner
Secondary topics:  Resilient, Performant & Secure Distributed Systems
James Meickle (Quantopian)
Average rating: *****
(5.00, 2 ratings)
Quantopian integrates financial data from vendors around the globe. As the scope of its operations outgrew cron, the company turned to Apache Airflow, a distributed scheduler and task executor. James Meickle explains how, in less than six months, Quantopian rearchitected brittle crontabs into resilient, recoverable pipelines defined in code to which anyone could contribute.
11:35am–12:15pm Wednesday, October 3, 2018
Location: Gramercy Level: Beginner
Secondary topics:  Resilient, Performant & Secure Distributed Systems
Liz Rice (Aqua Security)
Average rating: *****
(5.00, 1 rating)
Beyond looking out for a little green padlock in the browser bar, what do you need to know about secure connections as a programmer? What do people mean by terms like authentication, verifying a certificate, or signing a message? Join Liz Rice as she demystifies HTTPS, TLS, X.509, and more.
1:30pm–2:10pm Wednesday, October 3, 2018
Location: Gramercy Level: Intermediate
Secondary topics:  Resilient, Performant & Secure Distributed Systems
Seth Vargo (Google)
Average rating: ****.
(4.00, 2 ratings)
Seth Vargo outlines the key principles for securing microservices and distributed systems in the modern world, where applications run in cloud or hybrid cloud infrastructure.
1:30pm–2:10pm Wednesday, October 3, 2018
Location: Beekman/Sutton North Level: Intermediate
Secondary topics:  Resilient, Performant & Secure Distributed Systems
Average rating: ***..
(3.50, 4 ratings)
Michael Hausenblas walks you through troubleshooting applications running in Kubernetes, from application-level debugging to distributed tracing to chaos engineering.
2:25pm–3:05pm Wednesday, October 3, 2018
Location: Gramercy Level: Beginner
Secondary topics:  Resilient, Performant & Secure Distributed Systems
Molly Crowther (Pivotal)
Average rating: *****
(5.00, 1 rating)
Molly Crowther demonstrates how the enterprise can use cloud platforms to make security move at the pace of business, not the other way around.
3:50pm–4:30pm Wednesday, October 3, 2018
Location: Gramercy Level: Beginner
Secondary topics:  Resilient, Performant & Secure Distributed Systems
Guy Podjarny (Snyk)
Average rating: *****
(5.00, 1 rating)
Serverless shuffles security priorities, naturally mitigating certain risks while elevating others, as this live hacking session vividly demonstrates. Guy Podjarny breaks into a vulnerable demo serverless app while explaining each security mistake, its impact, and how it can be avoided. You'll leave knowing why you need to keep your functions secure and how to do it yourself.
4:45pm–5:25pm Wednesday, October 3, 2018
Location: Gramercy Level: Intermediate
Secondary topics:  Resilient, Performant & Secure Distributed Systems
Ian Coldwater (Heroku)
Ian Coldwater offers practical advice about securing your Kubernetes clusters from an attacker’s perspective.