Training: June 20–21, 2016
Tutorials: June 21, 2016
Keynotes & Sessions: June 22–23, 2016
Santa Clara, CA

Speaker slides & videos

Presentation slides will be made available after the session has concluded and the speaker has given us the files. Check back if you don't see the file you're looking for—it might be available later! (However, please note some speakers choose not to share their presentations.)

Jamie Wilkinson (Google)
Monitoring only sucks when the cost of maintenance scales proportionally with the size of the system being monitored. Recently, tools have emerged that assist with scaling out monitoring configurations sublinearly with the size of the system. Jamie Wilkinson explores time series-based alerting and offers practical examples that can be employed in your environment today.
Charity Majors (Honeycomb)
Charity Majors discusses making better choices with software. Whether you're selecting a new polyglot persistence layer, launching a startup from scratch, or modernizing a mature environment, there have never been more opportunities for chaos. Charity explains when you should use boring technology, when to take a flyer on the bleeding edge, and best practices for making solid technical decisions.
C.J. Jameson (Pivotal)
C.J. Jameson explains how to test all the things, even your bash scripts. Using the BATS framework, it's cheap to write both high- and low-level tests to drive out more modular and readable bash scripts.
Matthew Flaming (New Relic)
New Relic receives over half a trillion data points from customers that need to be processed, stored, and made available ASAP. Matthew Flaming explains why moving from a single application performance product to querying more than a billion events a second to serve multiple products has meant radically reinventing and re-architecting the whole platform multiple times.
The old wisdom about keeping engineers away from customers is bunk. Your product team may be experts on the customer perspective, but everyone can benefit from developing user empathy. Using the Heroku Postgres team as a case study, Peter van Hardenberg explains how to build a highly scaled organization with world-class operations and support and a deep appreciation for the challenges users face.
Diego Lapiduz (Pivotal)
There's a software culture revolution inside the government. User-centric design, lean, and agile are first citizens, but the increased velocity in development and testing requires a change in the way the government does deployment, security, and compliance. Diego Lapiduz shows how the team is building tools to achieve faster deployments and continuous compliance in a secure environment.
The advantages of containerized applications are increasingly recognized. Michael Hausenblas provides a gentle introduction to building and operating containerized applications at scale. The first day focuses on the basics of building apps using containers; the second day expands this knowledge, focusing on the operations (monitoring, upgrades, etc.) of these apps.
Ben Lavender (GitHub)
By now, you've probably heard of ChatOps (especially if you're in operations). GitHub has been using ChatOps for more than five years and continues to scale these practices. Ben Lavender explains the guidelines that GitHub has created to work with ChatOps and the lessons learned in the process.
Lee Atchison (New Relic)
As our applications grow, keeping them operational is challenging. High growth means more data, more computation, and more opportunities for problems. The cloud offers the ability to improve scalability while maintaining availability. Lee Atchison explains the “keep two mistakes high” principle and how to use the cloud to keep applications healthy and growing while keeping costs in line.
Karl Isenberg (Mesosphere)
The orchestration space is fast moving and full of competing products, platforms, and frameworks. How do you choose the right one for your requirements? Karl Isenberg explores the features of several container orchestrators, breaking down the feature sets and characteristics into categories and scoring multiple solutions—including Kubernetes, Marathon, and Docker Swarm—against each other.
Bridget Kromhout (Microsoft)
Bridget Kromhout explains why containers will not fix your broken culture. Microservices won’t prevent your two-pizza teams from needing to have conversations with one another over that pizza. No amount of industrial-strength job scheduling makes your organization immune to Conway’s law.
Artur Bergman (Fastly)
When a DDoS attack occurs, organizations respond using existing mental models of operations, not taking into account the emotional effects of the malicious nature of the attack. Artur Bergman explores how people react and why a DDoS attack is different than other challenges a company may face, proving that it isn't your fault and you're not alone.
Justin Lintz (Spring)
Justin Lintz defines some of the stresses operations people face, outlines methods for mitigating them, and discusses his personal experience of having an anxiety disorder while working in an operations role, raising awareness about the anxiety issues many people face but are afraid to talk about.
Donny Nadolny (PagerDuty)
Distributed systems are hard. They are complicated, hard to understand, and very challenging to manage. But they are critical to modern software, and when they have problems, we need to fix them. Donny Nadolny looks at what it takes to debug a problem in a distributed system like ZooKeeper, walking attendees through the process of finding and fixing one cause of many of these failures.
Tim Kadlec (Independent), Patrick Meenan (Facebook)
Tim Kadlec and Patrick Meenan explain how the construction of websites and applications impacts performance as well as how to quickly debug and resolve performance issues. Tim and Patrick dive into how browsers work, how web pages are delivered, backend and frontend issues, optimizations, and techniques to get the best performance and provide hands-on experience for working on web performance.
Betsy Nichols (Netuitive, Inc)
Effective monitoring for today’s agile environments is both science and art. (Analytics can provide the “science” while experts and business context can provide the “art.”) There is no perfect solution, but a framework for integrating these varied information sources as collaborators can drive continuous improvement. Elizabeth Nichols highlights (anonymized) examples from real environments.
Courtney Kissler (Starbucks)
Employee burnout is an overlooked anchor dragging down productivity and employee engagement in our industry. Courtney Kissler highlights some tactics and metrics to help leaders proactively address this issue.
Jon Hodgson (Riverbed Technology)
To analyze and improve the performance of modern applications, you must abandon outdated approaches and toolsets which are rooted to the physical topology of servers and JVMs. Jon Hodgson discusses a new paradigm to reveal unexpected relationships and hotspots obscured by the elasticity of containers and microservices so that you can find and fix issues with the most overarching business impact.
Harkeerat Bedi (Verizon Digital Media Services)
Large-scale cloud networks are constantly driven by the need for improved performance in communication between data centers. Such back-office communication makes up a large fraction of traffic in many cloud environments. Harkeerat Bedi offers an overview of a tool that improves the efficiency of data-center-to-data-center communication by learning the congestion level of links in between.
Bruce Lawson (Opera Software)
Ads are annoying and intrusive and can compromise privacy, but, worst of all, they're disastrous for website performance. Bruce Lawson outlines the performance gains Opera has made by deploying a native ad blocker in its flagship browsers, explains how Opera did it, and explores how the whole advertising ecosystem can (hopefully) improve.
Karan Kumar (Instart Logic)
Users' ad blockers are impacting your site's perceived performance, but measuring the impact of ad blockers on actual and perceived performance can be difficult. Karan Kumar offers an overview of new testing he has created that measures the overall impact ad blockers have on the quality of user experience and performance across a number of sites.
Philip Tellis (Akamai), Nic Jansma (Akamai)
Whenever we speak of measuring web performance, we always refer to measuring static events, like page load or time to first tweet. A performant user experience is much more than that. Philip Tellis and Nic Jansma explore methods of measuring web performance as it relates to continuous interactions between the user and a page.
Stephen Ludin (Akamai, Board Member ISRG)
What good are detailed web page timings if you are not measuring the right things? Stephen Ludin offers an overview of the User Timing API, exploring adoption rates, current levels of support, and a path toward universal adoption and usage.
Seth Vargo (Google)
Seth Vargo offers a comprehensive, engineer-led overview of two of HashiCorp's tools: Terraform and Atlas.
Dustin Whittle (AppDynamics)
Dustin Whittle offers a practical introduction to modern performance best practices for web apps, diving into the latest tools and best practices for launching an ideal end-user experience. Find out how you can leverage Chrome Developer Tools, Google PageSpeed, and WebPagetest to get started improving your applications.
Gianluca Borello (Sysdig)
Gianluca Borello explores the state of the art for visibility, monitoring, and troubleshooting for microservices and containers—including live demonstrations of popular tools and methods and the pros and cons of each—with special emphasis on sysdig, an open source system visibility tool.
Gabe Wishnie (Microsoft)
Gabe Wishnie explains how Microsoft monitors the cloud services it provides at high scale with low latency through a multidimensional metric (MDM) system. Gabe offers an introduction to the architecture Microsoft uses, lessons learned along the way, and the areas in which it is still investing.
Matthew Brender (Intel), Raj Dutt (raintank)
Data is beautiful when made visible. Matthew Brender and Raj Dutt offer a demo of Snap, an open telemetry framework designed to gather an increasingly diverse set of measurements from the cloud, and illustrate how to visualize data in Grafana with unprecedented ease.
Dieter Plaetinck (raintank)
Alerting on your stack is the key to happy customers and a healthy business. Dieter Plaetinck explains what's wrong with the oft-touted complicated alerting methods and explores how to get in-depth coverage and address complicated alerting needs using simple techniques, with a focus on the workflow using an alerting IDE.
Aneel Lakhani (SignalFx)
OODA isn’t just for DevOps. Mapping isn’t just for strategists. Antifragility isn’t just for Netflix. Once we get beyond cargo-culting ideas, we can begin to understand how they arose and in what context. But how do we apply them? Aneel Lakhani explores the practical application of these ideas through examples from his daily work.
Adam Auerbach (Capital One), Tapabrata Pal (Capital One)
Adam Auerbach and Tapabrata Pal discuss Capital One's transformation to continuous testing, covering core principles, tools, and best practices as well as common roadblocks and some recommendations on how best to remove them from the environment.
Buddy Brewer (SOASTA)
When most people think of performance monitoring tools, they think of things like time series charts, page load times, DNS resolution times, and backend service response times. Buddy Brewer explores alternative ways of visualizing performance data and explains how changing your perspective can sometimes lead you to surprising discoveries.
Richard Cook (Ohio State University SNAFUcatchers)
The C-suite (continuous delivery, continuous integration, continuous deployment, and their enablers like agile, scrum, and so on) is an investment in future adaptive capacity. Richard Cook explains the value of adaptive capacity, the ability to respond to new challenges and grasp new opportunities, and explores its far-reaching consequences.
Michelle Carrizosa (SOASTA), Iris Lieuw (Akamai)
Michelle Carrizosa and Iris Lieuw demonstrate how to prioritize improvements across your ecommerce site by identifying which of your pages are the most important to optimize and then looking at resource timing data to determine what affects those pages the most.
Michael Gooding (Akamai), Javier Garza (Akamai Technologies)
Michael Gooding and Javier Garza share their experiences using HTTP/2 over the last 12 months, exploring case studies that demonstrate how performance can be improved while also addressing backward compatibility, using RUM data to review performance-related observations of customers after making the switch, and giving hands-on demos of HTTP/2 with server push and HTTP/2 + QUIC.
Ritesh Maheshwari (LinkedIn), Yang Yang (LinkedIn)
For the past year, LinkedIn has been running and iteratively improving Luminol, its anomaly detection system for real user monitoring data. Ritesh Maheshwari and Yang Yang offer an overview of Luminol, focusing on how to build a low-cost end-to-end system that can leverage any algorithm, and explain lessons learned and best practices that will be useful to any engineering or operations team.
Alois Mayr (Dynatrace), Alexander Ramos (B2W digital)
Migrating toward microservices tends to result in a 20x larger environment than monolithic counterparts. While the bright side of microservices and their enabling container platforms is high availability and scalability, what about the dark side, the side that nobody talks about in their presentations? Alois Mayr and Alexander Ramos uncover the truth so you don’t have to learn it the hard way.
Patrick Meenan (Facebook)
Patrick Meenan outlines techniques for serving rich experiences to users on fast connections while still offering a fast experience for users on slow connections and addresses some eye-opening issues and solutions for serving content on mobile connections.
Sonia Burney (Akamai), Sabrina Burney (Akamai)
Security techniques have generally focused on protecting users by blocking requests going to the origin, but security is also a concern at the browser. Sonia Burney and Sabrina Burney explore how security can be enforced at the browser level through a combination of optimization techniques and security enhancements, which overall provide an optimal end-user experience.
Ines Sombra (Fastly), Caitie McCaffrey (Twitter)
Surprisingly enough, academic papers can be interesting and very relevant to the work we do in industry as practitioners. Ines Sombra and Caitie McCaffrey demonstrate how academic papers can radically change your perspective and introduce you to new ideas, offering a tour of papers that have reshaped the way they think about building large-scale distributed systems.
Alex Nobert (Flynn)
When you ask Ops engineers where they want to go in their careers, the only answer you get after "I don't know" is "management." But what does that entail, and how do you get there? Alex Nobert discusses his career transition from engineer to manager to director, describing the day-to-day work, expectations, priorities, and goals of each so that you can learn from his mistakes.
Timothy Gross (Joyent)
Microservice architectures manage the complexity of the development process, and application containers help manage the dependencies and deployment of those microservices. But deploying and connecting services together is a challenge because it forces developers to design for operationalization. Timothy Gross explores autopiloting applications as a powerful design pattern to solve this problem.
Mike Dvorkin (Cisco)
Service consumption chaos is currently one of the biggest challenges in microservice environments. Mike Dvorkin discusses how tackling the complexity of exploding service dependencies and their control through a consistent consumption abstraction can significantly simplify how teams develop, test, deploy, and control applications.
Yoav Weiss (Akamai)
Our love-hate relationship with third parties has taken a turn for the worse. While they often pay the bills, HTTP/2 means they’re more of a performance burden, ad blockers mean users have had enough, and projects like Google AMP mean that embedders feel the same. Yoav Weiss explores how to gain back control of your site, discussing mitigation tactics as well as a long-term plan to restore sanity.
Pete LePage (Google)
Pete LePage explores the fundamentals of progressive web apps, covering how to architect a single-page web app using the App Shell model, how to identify the different service worker caching strategies and choose the most appropriate one for a use case, and how to implement an installable web app using manifests, metatags, and other techniques.
Todd Reifsteck (Microsoft Edge), Philippe Le Hegaret (W3C)
Todd Reifsteck and Philippe Le Hegaret discuss the work the W3C Web Performance Working Group is doing, as well as performance-related efforts by other groups, so that you can be up to date with the latest developments and what's coming next. They also explain how easy it is to get involved, provide feedback, and influence the direction that these standards will take.
Ian Carrico (Vox Media), Jason Ormand (Vox Media)
A little over a year ago, Vox Media created a dedicated performance team, which immediately set out to make all Vox Media sites as fast as possible—and has since made significant progress. Ian Carrico and Jason Ormand discuss what the team has done, how it did it, and what it's still working on.
Dan Slimmon (Exosite)
Common ground, an important concept in recent teamwork research, helps us understand why collaborative troubleshooting breaks down over time, leading to wasted effort and mistakes. Drawing on common ground as well as some ideas from medical diagnosis, Dan Slimmon demonstrates that by extending ChatOps, we can make troubleshooting much easier without losing the benefits of fluid team conversation.
Ozan Turgut (SignalFx)
We are witnessing an explosion in the sheer mass and velocity of data. But this data is most useful if the actual builders and operators—the people with all the context—can understand it and react to it quickly. Ozan Turgut discusses how to use visualization and analytics to turn data into leverage for decision making.
Dean Hume (Settled)
As any web developer knows, the developer tools built into modern browsers are packed with loads of features. The question is, do you really understand how or when to use them? These tools are capable of so much more than just debugging and inspecting elements in the DOM. Dean Hume teaches you exactly how to use the tools to become a better developer, one web page at a time.
Patrick Meenan (Facebook), Tammy Everts (SpeedCurve)
Google partnered with SOASTA to train a machine-learning model on a large sample of real-world performance, conversion, and bounce data. Patrick Meenan and Tammy Everts offer an overview of the resulting model—able to predict the impact of performance work and other site metrics on conversion and bounce rates.
David Hayes (PagerDuty)
DevOps brings proven benefits by automating the deployment pipeline. David Hayes explores the benefits DevOps brings beyond automation, explaining the importance of shared operational responsibility on organizational culture, why reliability means both systems and people, and how aggregating alerts helps to maintain situational awareness.
Eleanor Saitta (Systems Structure Ltd.)
What if the answer to managing security issues starts with the product and design teams? Thinking about security design can drive everything from business decisions through operations, but it means rethinking what security is and building different kinds of relationships between teams. It's a long journey, but Eleanor Saitta outlines three steps to a safer future.
Tobias Baldauf (Akamai Technologies)
Tobias Baldauf explains how to use HTTP/2's superpowers to optimize image delivery, thereby increasing the perceived performance of your page, reducing load times, and driving conversions.
Volker Will (Microsoft)
Microsoft has evolved. Among many other things, it is investing and transforming to provide the best mobile app development experience on the planet. DevOps is the backdrop for Microsoft's transformation. Volker Will shares Microsoft's experiences to help you on your own journey and evolve your point of view.