Build Systems that Drive Business
June 11–12, 2018: Training
June 12–14, 2018: Tutorials & Conference
San Jose, CA

Resilient, Performant, & Secure Distributed Systems

Design and build secure, robust, complex systems

Failure is inevitable given the complexity of our systems. Hear insider accounts of failures and learn how to efficiently detect and recover from them, as well as how complex technical systems and problems introduce risk and security issues. Learn the principles and practices for designing and managing systems that are secure, robust, adaptable, and can gracefully recover from failure, including topics related to building and maintaining complex systems, site reliability engineering, infrastructure as code, chaos engineering, and more.

We'll help you solve your toughest challenges with real-world advice from leaders in the field who have grappled with the same problems you're facing today, such as how to:

  • Make distributed system development easier and more reliable
  • Build robust distributed systems using containers
  • Increase system predictability by identifying processes that can benefit the most from automation
  • Keep your serverless apps secure and resilient
  • Deal with sensitive data in distributed systems
9:00am–12:30pm Tuesday, June 12, 2018
Location: 230 A Level: Beginner
Secondary topics: Resilient, Performant & Secure Distributed Systems
Tammy Butow (Gremlin)
Average rating: 4.33 (3 ratings)
High-severity incident management is the practice of recording, triaging, tracking, and assigning business value to problems that impact critical systems in order to enhance the customer experience by improving your infrastructure reliability and upskilling your team. Tammy Butow walks you through establishing a high-severity incident management program and measuring its success.
9:00am–12:30pm Tuesday, June 12, 2018
Location: LL21 A/B Level: Beginner
Secondary topics: Resilient, Performant & Secure Distributed Systems
Nathen Harvey (Chef)
Average rating: 4.67 (6 ratings)
Join Nathen Harvey to learn how to easily integrate automated tests that check for adherence to policy into any stage of your deployment pipeline, using InSpec for compliance and Chef for remediation.
9:00am–12:30pm Tuesday, June 12, 2018
Location: LL20 A/B Level: Non-technical
Secondary topics: Resilient, Performant & Secure Distributed Systems
Will Gallego (Fastly)
Average rating: 4.00 (2 ratings)
Will Gallego walks you through the structure of postmortems used at large tech companies with real-world examples of failure scenarios and debunks myths regularly attributed to failures. You'll learn how to incorporate open dialogue within and between teams to bridge these gaps in understanding.
1:30pm–5:00pm Tuesday, June 12, 2018
Location: LL20 A/B Level: Intermediate
Secondary topics: Resilient, Performant & Secure Distributed Systems
Seth Vargo (Google)
Average rating: 5.00 (2 ratings)
Kubernetes is a popular application scheduler and orchestration tool, but its built-in secret storage does not provide the robustness many organizations require. In this interactive workshop, Seth Vargo demonstrates how to connect applications and services running under Kubernetes to HashiCorp Vault.
2:10pm–2:50pm Wednesday, June 13, 2018
Location: LL21 E/F Level: Non-technical
Secondary topics: Resilient, Performant & Secure Distributed Systems
Serena Chen (BNZ Digital)
Average rating: 5.00 (4 ratings)
What insights do we gain if we apply user experience design to information security? Serena Chen shares four strategies that apply design thinking to security problems, pinpointing which practices work and which are detrimental. Serena then walks you through some common flows and dissects how design decisions affect your personal security.
3:40pm–4:20pm Wednesday, June 13, 2018
Location: LL21 E/F Level: Intermediate
Secondary topics: Resilient, Performant & Secure Distributed Systems
Luis Colon (Amazon Web Services)
Average rating: 4.25 (4 ratings)
Many fundamental security practices and controls apply to serverless applications, including implementing proper monitoring and logging of all requests and events. Luis Eduardo Colon explores recommendations published by the Center for Internet Security (CIS), explains how to automate the deployment of some of these controls, and outlines considerations relevant to serverless functions.
4:35pm–5:15pm Wednesday, June 13, 2018
Location: LL20 A/B Level: Intermediate
Secondary topics: Resilient, Performant & Secure Distributed Systems
Ian Lewis (Google)
Average rating: 4.40 (5 ratings)
Ian Lewis shares the easiest and best ways to improve the security of your Kubernetes clusters.
11:25am–12:05pm Thursday, June 14, 2018
Location: LL21 E/F Level: Intermediate
Secondary topics: Resilient, Performant & Secure Distributed Systems
Abby Fuller (Amazon Web Services)
Average rating: 3.67 (3 ratings)
There are many conference sessions on "how to get started with X." But once you've gotten up and running, there isn't always a lot of guidance on how to solve harder issues. Abby Fuller takes you beyond getting started with containers on AWS, covering advanced topics like hybrid clusters, bringing your own AMI, working with Docker settings not supported in the UI, and debugging load balancers.
1:15pm–1:55pm Thursday, June 14, 2018
Location: LL21 E/F Level: Intermediate
Secondary topics: Resilient, Performant & Secure Distributed Systems
Liz Rice (Aqua Security)
Average rating: 4.33 (3 ratings)
Liz Rice leads a dive into what's easy, and what's not, about finding and patching security vulnerabilities in containers.
1:15pm–1:55pm Thursday, June 14, 2018
Location: LL21 A/B Level: Intermediate
Secondary topics: Resilient, Performant & Secure Distributed Systems
Manish Mehta (Netflix), Torin Sandall (Open Policy Agent Project)
Average rating: 4.33 (6 ratings)
Manish Mehta and Torin Sandall lead a deep dive into how Netflix enforces authorization policies (“who can do what”) at scale in its microservices ecosystem in a public cloud without introducing unreasonable latency in the request path.
2:10pm–2:50pm Thursday, June 14, 2018
Location: LL21 A/B Level: Intermediate
Secondary topics: Resilient, Performant & Secure Distributed Systems
Average rating: 2.50 (8 ratings)
Performance debugging is a crucial part of ensuring code is production ready, particularly as a company and its products grow. However, bottlenecks that hold these services back can be hard to identify. Christian Grabowski shares his experience debugging bottlenecks in distributed systems, at both a macro (metrics, distributed tracing) and a micro (user space and kernel space profiling) level.
2:10pm–2:50pm Thursday, June 14, 2018
Location: LL21 E/F Level: Intermediate
Secondary topics: Resilient, Performant & Secure Distributed Systems
Qingyang Chen (Google), Appu Goundan (Google)
Average rating: 5.00 (1 rating)
Qingyang Chen and Appu Goundan demonstrate how to speed up container-based development by building container images with Jib, a Google image build tool for Java applications.
2:10pm–2:50pm Thursday, June 14, 2018
Location: LL21 C/D Level: Intermediate
Secondary topics: Resilient, Performant & Secure Distributed Systems
Jessica DeVita (Microsoft)
Average rating: 4.50 (2 ratings)
Jessica DeVita tells the story of how a team at Microsoft challenged themselves to retrospect their retrospectives and shares what they learned about applying human factors ideas to software development.
3:40pm–4:20pm Thursday, June 14, 2018
Location: LL21 A/B Level: Intermediate
Secondary topics: Resilient, Performant & Secure Distributed Systems
Kyle Kingsbury (Jepsen)
Average rating: 5.00 (3 ratings)
Kyle Kingsbury offers an overview of Tesser, a Clojure library for writing commutative, parallel folds that can be chained and composed into complex single-pass reductions that are dramatically faster on multicore systems and can be transparently distributed over Hadoop.
4:35pm–5:15pm Thursday, June 14, 2018
Location: LL21 E/F Level: Intermediate
Secondary topics: Resilient, Performant & Secure Distributed Systems
Cynthia Thomas (Google)
Average rating: 3.33 (3 ratings)
Modern microservices architectures (like those run on Kubernetes) need modern security solutions to provide least-privilege security. Cynthia Thomas outlines traditional firewall methods and details the evolution of the distributed security model to enforce least privilege for microservices.