Schedule: Operations sessions

This track covers building resilience into applications and infrastructure, operations escalation and outage handling patterns, BYOD, networking, security, metrics and monitoring, hybrid cloud implementations, and more.

Track Host

John Allspaw has worked in systems operations for over fourteen years in biotech, government, and online media. He started out tuning parallel clusters running vehicle crash simulations for the U.S. government, and then moved on to the Internet in 1997. He built the backing infrastructures at Salon, InfoWorld, Friendster, and Flickr. He is now VP of Tech Operations at Etsy and the author of The Art of Capacity Planning, published by O'Reilly.

 
Operations
Beekman
Arun Kejariwal (Machine Zone), Piyush Kumar (Twitter Inc.)
Average rating: ***..
(3.25, 8 ratings)
Agile development has become predominant at most web companies. This, coupled with the dynamic nature of Twitter's traffic, may result in sudden breakouts, which manifest themselves as a mean shift or a ramp-up in application and system metrics. This talk presents a statistical approach to detecting breakouts automatically and in a timely fashion, thereby mitigating user impact.
Operations
Beekman
Doug Small (Intuit)
Average rating: ***..
(3.67, 9 ratings)
A system is not a server. A system is the complete set of technology used to deliver your product to the customer, yet systems are rarely tested or validated as a complete technology stack. This is the story of how we implement the fundamentals of systems testing for TurboTax Online, with practical real-world examples.
Operations
Beekman
Anna Shipman (Government Digital Service)
Average rating: ****.
(4.29, 7 ratings)
As fans of infrastructure as code, we wanted reliable tools to automate creation and configuration of machines and networks for GOV.UK and other projects. However, as the UK Government, we were limited in our suppliers, and the tools we wanted didn't exist. So we built them. This talk will cover the challenges and successes of doing modern infrastructure engineering in a traditional environment.
Operations
Beekman
Ryan Frantz (Etsy)
Average rating: ****.
(4.27, 26 ratings)
When it’s three in the morning, it’s hard enough waking up, let alone getting your brain in gear to fix problems. Computers should provide us with additional context around an alert, so that we can resolve issues faster and get back to sleep. This presentation discusses how to contextualize alerts automatically, so that engineers can address issues faster and get back to what they were doing.
Operations
Beekman
John Berryman (Eventbrite), Baron Schwartz (VividCortex)
Average rating: ****.
(4.29, 7 ratings)
The Log infrastructure, as introduced by Jay Kreps, is becoming a popular blueprint for architecting a Big Data infrastructure. By its nature, a Log-centered infrastructure is simpler and more robust to failures and to spiky, high-volume loading. This talk will cover VividCortex's transition from a more traditional API-based infrastructure to a Kafka Log-backed infrastructure.
Operations
Beekman
Ben Hughes (Airbnb), Jon Tai (Airbnb)
Average rating: ****.
(4.59, 17 ratings)
Airbnb has grown rapidly over the last few years, and our infrastructure was not able to keep up with the growth. Faced with a fast-approaching busy summer travel season, a small team formed to improve reliability and scalability. We were able to make significant progress without doing a major rewrite, removing functionality, or disrupting other teams.
Operations
Beekman
David Josephsen (Librato)
Average rating: ****.
(4.73, 15 ratings)
A discussion of best practices for creating and managing alerting infrastructures. Examines both technical and social factors. Explores common anti-patterns and how to avoid them.
Operations
Beekman
Brian Nuszkowski (Duo Security)
Average rating: ****.
(4.00, 11 ratings)
Congratulations! Your site has just gone big. Depending on how thorough you’ve been with load testing, you may or may not be celebrating this impending flood of user traffic. This talk will elaborate on the topic of load testing, defining what a load testing strategy should look like and examining several of its components.
Operations
Beekman
John Willis (Docker)
Average rating: ****.
(4.56, 9 ratings)
An exploration of DevOps and the network. Software-defined networking (SDN) is all the buzz, but the reality is that many network operations and engineering groups are dealing with an influx of highly virtualized tooling like Open vSwitch and OpenFlow and projects like OpenDaylight, Contrail, and NSX. Please join John to help him start the discussion of what DevOps in the network really means.
Operations
Beekman
Sean Kane (New Relic)
Average rating: ****.
(4.20, 5 ratings)
Today everyone is talking about the cloud, but frequently raw computing power and hardware are ideal. This talk covers how we made hardware deployments a breeze at New Relic. We will dive into the protocols, tools, and code that enable hardware provisioning and management, explain how they all fit together, and uncover gotchas and lessons learned.
Operations
Beekman
Rob Peters (EdgeCast Networks)
Average rating: ****.
(4.00, 2 ratings)
Verizon EdgeCast’s edge network provides global delivery for dynamic applications, websites, mobile apps, live and on-demand streams, large-file downloads, and more. Building and deploying the HTTP server software that delivers this traffic carries some unique challenges, which we address with a combination of new and old best practices along with lessons from our own experiences.
Operations
Beekman
Matteo Figus (OpenTable)
Average rating: ***..
(3.80, 5 ratings)
In a service-oriented architecture, it is important to know how our APIs react to high volumes of traffic, in order to learn their limits and keep their performance under control. During this talk I'll explain why this is so important and how easy it is to do with some open-source Node.js tools.
Operations
Regent
Tutorial Please note: to attend, your registration must include Tutorials on Monday.
John Allspaw (Etsy)
Average rating: ****.
(4.50, 14 ratings)
This 3-hour tutorial will cover the theory and fundamentals of "The New View" on complex systems failure and human error, as well as techniques for facilitating an adverse event debriefing. We will use case studies of known events and a good deal of interactive attendee participation to explore postmortem debriefing techniques and pitfalls.
Operations
Regent
Tutorial
Lindsay Holmwood (Australian Government Digital Transformation Office), Jesse Reynolds (Bulletproof Networks)
Average rating: **...
(2.60, 20 ratings)
Enter Flapjack: a monitoring alert routing system. Flapjack sits at the end of your monitoring pipeline and works out who it should send alerts to. Sounds pretty simple? Flapjack tries to make it so. There are still really hard problems to solve when working out who to notify about a detected failure, and what to do when lots of things fail simultaneously.
Operations
Regent
Tutorial
Tobi Knaup (Mesosphere)
Average rating: ****.
(4.12, 16 ratings)
DevOps teams everywhere spend countless hours building custom scale-out architectures for web apps. Marathon is a new framework built on Apache Mesos that simplifies and automates operations, and provides a simple self-serve interface for developers to launch their apps or Docker containers on a shared cluster in a scalable and fault-tolerant way.
Operations
Regent
John Allspaw (Etsy)
Average rating: ****.
(4.14, 7 ratings)
A continuation of the 9:00 am session.