Build Systems that Drive Business
June 11–12, 2018: Training
June 12–14, 2018: Tutorials & Conference
San Jose, CA

Speakers

Hear from a wide range of talented senior engineers, systems practitioners, and technical managers who are doing amazing things in distributed systems and DevOps. More speakers will be announced; please check back for updates.

Sean T. Allen is vice president of engineering at Wallaroo Labs and a member of the Pony core team. His turn-ons include programming languages, distributed computing, Hiwatt amplifiers, and Fender Telecasters. His turn-offs include mayonnaise, stirring yogurt, and sloppy code. He’s one of the authors of Storm Applied.

Presentations

Pat Helland and me: How to build stateful distributed applications that can scale almost infinitely Session

In 2007, Pat Helland published "Life Beyond Distributed Transactions: An Apostate’s Opinion," in which he conducts a thought experiment on how to design a distributed database that can scale almost infinitely. While the paper explicitly addresses distributed database design, Sean Allen shows that the ideas are far more widely applicable, particularly in scaling stateful applications.

Ben Amaba is chief innovation officer for the industrial sector for IBM’s Watson and Cloud Division, where he is responsible for industrial manufacturing, infrastructure, and logistics solutions and leads digital transformation of complex integrated systems where performance, quality, and sustainability measurements are foundational to a successful outcome. He has published globally on best practices and robust approaches to drive market innovation, economic development, and public-private partnerships. Ben holds a copyright for the Process Activity Flow Framework. An advocate of professional engineering licensure, he has been a keynote speaker at major professional and industry conferences around the world for the past 30 years and is prominently featured in Giving 2.0 by Laura Arrillaga-Andreessen. Ben is a member of the United Way Tocqueville Society, which recognizes local philanthropic leaders. He holds a PhD in industrial and systems engineering, an MBA/MS in engineering and operations, and a BS in electrical engineering. He is a registered and licensed professional engineer in several states with International Registry, is certified in production, operations, and inventory management by APICS, is a LEED Accredited Professional in leadership in energy and environmental design, and holds a certificate in corporate strategy from MIT.

Presentations

Digital disruptions and best practices (sponsored by IBM) Session

The digital world has produced efficiencies, new products, and closer customer relationships. Yet as systems continue to gain complexity through emerging technology, the failure rates in budgets, schedules, and quality goals become increasingly unmanageable. Ben Amaba explains how to effectively use frameworks and methods to create business models that are intelligent and resilient.

David Andrews leads CDN architecture and technical evangelism at Verizon Digital Media Services. Previously, he brought several web security products to market at Verizon and worked for startups in the Los Angeles area, building security products in the virtualization and content delivery network (CDN) spaces. He enjoys low-level security exploitation techniques and has an appreciation for the nuances and resulting surprised faces that accompany discovering failure modes in globally distributed systems. David holds a PhD in computer security from a small university in Australia.

Presentations

EdgeControl: CDN tools to appease your inner control freak (sponsored by Verizon Digital Media Services) Keynote

Change is inevitable, but the aftereffects can be both good and bad. Having the right tools is one way to meet this challenge. Dave Andrews explains how to wield the power of a global 50 Tbps application delivery network, featuring 125+ points of presence, to ensure maximum availability during and after a change.

Astrid Atkinson is director of software engineering at Google, where she leads development frameworks. During her 10+ years at Google, Astrid has built infrastructure and managed a variety of engineering teams and spent more than five years on call for Google.com. She has led teams across the infrastructure map, from the team responsible for running and building Google’s web-serving layer to App Engine and cloud systems to core search.

Presentations

Reliability from the ground up: Designing for five nines Keynote

Astrid Atkinson discusses techniques for building systems that are resilient by design.

Too big to change: When even your microservices are monoliths Session

Astrid Atkinson shares a microservices-based approach to tackling legacy and heterogeneity at Google.

Tamar Bercovici is a senior director of engineering at Box, where she leads the database, content, and enterprise teams in building out the core of the Box Content Management Platform. Tamar also runs the annual Box company-wide hackathon and has been frequently invited to speak at various leading technology conferences like Percona, Velocity, @Scale, and Grace Hopper. Tamar was named one of Business Insider’s most powerful women engineers in tech in 2014 and 2017. Previously, Tamar was an early-stage employee at XMPie (now a Xerox company). She holds a PhD in computer science from the Technion Israel Institute of Technology.

Presentations

Lessons learned while evolving Box’s database infrastructure Keynote

When Tamar Bercovici joined Box, the entire platform was running on a single MySQL DB host fronted by a simple pool of memcached servers. Tamar details how the team has evolved the Box database stack to handle an ever-growing query load and dataset. It now comprises hundreds of servers serving millions of queries per second over hundreds of billions of data records.

Bill Boulden is the chief technology officer of ClearView Social, where he has migrated a VM-driven infrastructure to an autoscaling application fleet with serverless components. Bill has been developing software since the age of six. Previously, he was an API architect at Delaware North Companies. Running serverless applications in production has given him a unique perspective on architecture and application delivery for modern companies. In his spare time, he’s a pink-haired house music DJ by the name of Spruke, who enjoys EDM and generative ambient music.

Presentations

Serverless APIs with AWS Lambda and API Gateway Tutorial

Serverless architectures remove load from web servers and scale flawlessly to handle any volume while keeping you from paying for an instant of wasted idle time. Bill Boulden walks you through creating a functioning serverless API that coexists alongside conventionally served web pages using AWS Lambda and API Gateway.

Donovan Brown, aka The Man in the Black Shirt, is a principal DevOps manager on Microsoft’s cloud developer advocacy team. Developer tools are his thing. Donovan has traveled the globe helping companies in the US, Canada, India, Germany, and the UK develop solutions using Agile practices, Visual Studio, and Team Foundation Server in industries such as communications, healthcare, energy, and financial services. Previously, Donovan spent seven years as a process consultant and a certified Scrum Master. Donovan’s an avid programmer, often finding ways to integrate software into his other hobbies and activities, one of which is professional air hockey, where he has ranked as high as 11th in the world.

Presentations

Enterprise transformation (and you can too) Session

“That would never work here.” You’ve likely heard this sentiment (or maybe you’ve even said it yourself). Good news: change is possible. Donovan Brown explains how Microsoft's Visual Studio Team Services (VSTS) went from a three-year waterfall delivery cycle to three-week iterations and open sourced the VSTS task library and the Git Virtual File System.

Michael Brunton-Spall is an independent security consultant. Previously, Michael was deputy director for technology and operations and head of cybersecurity at the UK Government Digital Service and held a number of jobs ranging from creating low-level embedded hardware to gaming development on consoles to scaling and operating the Guardian newspaper. He is a regular conference speaker, the author of Agile Application Security, and an enthusiastic Agilist and security geek.

Presentations

Attack trees: Security modeling for Agile teams Tutorial

Traditional security approaches to threat and risk management are highly optimized to work within a traditional software development lifecycle. Michael Brunton-Spall shares a new approach to reviewing systems along with real-life examples to help you prioritize where to focus security efforts and what sorts of security threats you should worry about.

Emily Burns is a senior software engineer on the delivery engineering team at Netflix. She is passionate about building software that makes it easier for people to do their job.

Presentations

Multicloud continuous delivery with Spinnaker Tutorial

Tomas Lin and Emily Burns walk you through building continuous delivery pipelines for deploying and promoting code across cloud virtual machines and containers using Netflix's Spinnaker continuous delivery platform.

Tammy Butow is a principal SRE at Gremlin, where she works on chaos engineering—the facilitation of controlled experiments to identify systemic weaknesses. Gremlin helps engineers build resilient systems using their control plane and API. Previously, Tammy led SRE teams at Dropbox responsible for the databases and storage systems used by over 500 million customers and was an IMOC (incident manager on call), where she was responsible for managing and resolving high-severity incidents across the company. She has also worked in infrastructure engineering, security engineering, and product engineering. Tammy is the cofounder of Girl Geek Academy, a global movement to teach one million women technical skills by 2025. Tammy is an Australian and enjoys riding bikes, skateboarding, snowboarding, and surfing. She also loves mosh pits, crowd surfing, metal, and hardcore punk.

Presentations

How to establish a high-severity incident management program Tutorial

High-severity incident management is the practice of recording, triaging, tracking, and assigning business value to problems that impact critical systems in order to enhance the customer experience by improving your infrastructure reliability and upskilling your team. Tammy Butow walks you through establishing a high-severity incident management program and measuring its success.

David Calavera is the CTO of Netlify, where he and his team are building the best platform for deploying and automating modern web projects. Previously, he was a core member of the Docker Engine project, where he helped developers build the container engine that started the container revolution. David also built enterprise tools for GitHub and has contributed to numerous open source projects such as Go, JRuby, and many others.

Presentations

How Netlify migrated to Kubernetes, migrated off, and migrated to it again, without anyone noticing Session

Netlify recently moved a production system to Kubernetes, but the story isn't so simple. David Calavera shares the lessons Netlify learned during the migration that made the company roll the migration back and explains how they rolled it out again, all without affecting production availability.

Francesc Campoy is the VP of Product at Dgraph: the most advanced distributed graph database.

Before that, he was VP of Product and Developer Relations at source{d}, the company enabling Machine Learning for large scale code analysis and building the platform for the future of developer tooling. Previously, he worked at Google as Senior Developer Advocate for Google Cloud Platform and the Go team.

He’s passionate about programming and programmers, especially Go and gophers. As part of his effort to help those learning, he’s given many talks and workshops at conferences like Google I/O, Gophercon(s), GOTO, or OSCON.

When he’s not on stage he’s probably coding, writing blog posts, or working on his justforfunc YouTube series where he hacks while cracking bad jokes.

Presentations

Go performance analysis in action Tutorial

Francesc Campoy Flores walks you through the tools that make Go a great programming language, from the well-known go tool to lesser-known tools that allow you to profile, debug, and understand the performance of your programs. Along the way, you'll learn how to tune Visual Studio Code as a Go editor, although you are welcome to use any other editor—most provide great integration with Go.

Qingyang “Q” Chen is a software engineer at Google, where he works on tools to improve the developer experience in the cloud. Previously, Q worked on a number of internet applications, including Google Slides, MongoDB, SubLite, and Scandux.

Presentations

Build containers faster with Jib, a Google image build tool for Java applications Session

Qingyang Chen and Appu Goundan demonstrate how to speed up container-based development by building container images with Jib, a Google image build tool for Java applications.

Serena Chen builds design frameworks at BNZ Digital. She is an ex-physicist/mathematician, teen magazine founder, and hacker at heart. She believes deeply in using technology to build a kinder, more compassionate, better world.

Presentations

Design for security Session

What insights do we gain if we apply user experience design to information security? Serena Chen shares four strategies that apply design thinking to security problems, pinpointing which practices work and which are detrimental. Serena then walks you through some common flows and dissects how design decisions affect your personal security.

David Cheney is a software engineer at VMware. David has been involved with the Go project for more than eight years. He is a regular contributor to the language, focusing on Go on ARM processors. Previously, he ported Go to FreeBSD/ARM and Solaris/AMD64 and is working on a port to Linux/ARM64. David writes frequently about Go on his blog and has spoken locally and internationally.

Presentations

How we built Contour and what you can learn from our experience Session

David Cheney shares real-world advice on how to extend the capabilities of a Kubernetes cluster, using the development of the open source Contour Ingress controller as a case study.

Jackie Chu is a lead software engineer at Salesforce, where he focuses on identifying performance bottlenecks, developing performance monitoring tools, tuning server-side components, and designing scalable solutions. Jackie has seven years of software engineering experience specializing in performance tuning and optimization in cloud distributed systems. He holds a degree in computer science from the University of California, San Diego.

Presentations

Using AI to solve performance problems (sponsored by Salesforce) Tutorial

Jasmin Nakic and Jackie Chu share techniques to identify performance challenges by analyzing production data from Salesforce and other sources and explore AI models for predicting trends, detecting anomalies, and troubleshooting performance problems.

Luis Eduardo Colon is a senior developer advocate for CloudFormation at Amazon Web Services. Previously, Luis was chief architect for TeamQuest and director of research and development for CDS Global. His areas of interest include DevOps automation, data science, Agile methodologies, and serverless applications. Luis holds a BS in computer engineering from Iowa State University and an MS in data analytics from SNHU.

Presentations

Deploy security controls for serverless apps with infrastructure-as-code tools Session

Many fundamental security practices and controls apply to serverless applications, including implementing proper monitoring and logging of all requests and events. Luis Eduardo Colon explores recommendations published by the Center for Internet Security (CIS), explains how to automate the deployment of some of these controls, and outlines considerations relevant to serverless functions.

Miro Cupak is a cofounder and VP of engineering at DNAstack, where he builds a leading genomics cloud platform. He is a Java enthusiast with expertise in distributed systems and middleware, passionate about genetics and making meaningful software. Miro is the creator of the largest search and discovery engine of human genetic data and the author of a book on parallelization of genomic queries. In his spare time, he blogs and contributes to several open source projects.

Presentations

How we built a global search engine for genetic data Session

The Beacon Network is the largest search and discovery engine of human genomic data in the world. Miro Cupak details the architecture and technologies behind the system with a focus on the technical decisions that allow it to scale and disrupt the perception of genetic data.

Jessica DeVita is a senior program manager for Visual Studio Team Services at Microsoft.

Presentations

A retrospective on retrospectives: How to be a nonexpert expert in system resilience Session

Jessica DeVita tells the story of how a team at Microsoft challenged themselves to retrospect their retrospectives and shares what they learned about applying human factors ideas to software development.

Ilya Dmitrichenko is a developer experience engineer at Weaveworks focused on making the adoption of microservices easier. Previously, Ilya worked at Xively, where he personally experienced the shift to a true DevOps culture. He began to focus further down the stack, becoming one of the early evangelists of and contributors to open source projects in the emerging Docker/container ecosystem.

Presentations

Introduction to systems and service monitoring with Prometheus 1-Day Training

Prometheus—an open source monitoring system and time series database—features a multidimensional data model and a flexible query language and integrates monitoring aspects from client-side instrumentation to alerting. Tamao Nakahara, Ilya Dmitrichenko, and Stefan Prodan offer an overview of Prometheus architecture and concepts and walk you through using Prometheus and PromQL.

Jaana B. Dogan is a software engineer at Google, where she works on observability of Go production services. She has a decade of experience building developer platforms and tools.

Presentations

Achieving observability at Google-scale with OpenCensus Session

Morgan McLean and Jaana Burcu Dogan detail how to quickly instrument your distributed services and gain visibility into their operation with OpenCensus.

Marcel Flores is a research scientist at Verizon Digital Media Services, where he explores transport layer optimizations, large-scale traffic management strategies, and cache optimizations. He holds a PhD from Northwestern University, where his research focused on enabling additional channels of communication in existing networks and improving network performance.

Presentations

Steering the Edgecast CDN with Heteractis Session

Marcel Flores explores the design and implementation of Heteractis, the traffic management system Verizon Digital Media Services uses to turn network telemetry data into automated decisions.

Dr. Nicole Forsgren is VP of Research & Strategy at GitHub. She is the author of the Shingo Publication Award-winning book Accelerate: The Science of Lean Software and DevOps and is best known as lead investigator on the largest DevOps studies to date. She has been a successful entrepreneur (with an exit to Google), a professor, a performance engineer, and a sysadmin. Her work has been published in several peer-reviewed journals.

Presentations

Secrets and surprises of high performance: What the data says Keynote

Nicole Forsgren shares results and stories from four years of research to uncover the secrets and surprises of what really makes high-performing technology-driven teams and organizations.

Abby Fuller works on the containers team at Amazon Web Services. Previously, Abby worked at a number of startups, including Airtime and Hailo.

Presentations

Containers: Let's get fancy. Session

There are many conference sessions on "how to get started with X." But once you've gotten up and running, there isn't always a lot of guidance on how to solve harder issues. Abby Fuller takes you beyond getting started with containers on AWS, covering advanced topics like hybrid clusters, bringing your own AMI, working with Docker settings not supported in the UI, and debugging load balancers.

Will Gallego is a senior engineer at Fastly, where he builds scalable, distributed backend systems and tools to help engineers grow. Will is a systems engineer with 15+ years of experience in web technologies. He believes in a free and open internet, blame-aware postmortems, and pronouncing GIF with a soft “G”.

Presentations

Architecting a postmortem Tutorial

Will Gallego walks you through the structure of postmortems used at large tech companies with real-world examples of failure scenarios and debunks myths regularly attributed to failures. You'll learn how to incorporate open dialogue within and between teams to bridge these gaps in understanding.

Javier Garza is a multilingual technology evangelist at Akamai, where he helps the largest websites on the internet run faster and more securely. Javier has been working with computers for the past 25 years in Spain, Germany, and the USA—he started hacking Basic-based computer games at the age of 9. He loves taking things apart, understanding how they work, and finding the best and most practical way of improving them.

Presentations

The secret to building and delivering amazing apps at scale (sponsored by Akamai) Keynote

We are more mobile now than ever. Although we use our mobile devices to optimize our time and do more anytime, anywhere, our apps are still too slow and cannot cope with our fast-paced lifestyle. Javier Garza details the ingredients you need to build and deliver an amazing app your users will love.

Laurent Gil works on security product strategy at Oracle + Dyn, an Oracle Cloud Infrastructure global business unit. Previously, he was cofounder of Zenedge (acquired by Oracle in March 2018) and cofounder and CEO of Ukraine-based Viewdle (acquired by Google in 2012), which focused on machine learning and computer vision. Laurent holds a Doctorate Honoris Causa from the Cybernetic Institute of Ukraine, an MBA from the Wharton School, an MSc in computer science and signal processing from Supélec, a postgraduate degree in management from the Collège des Ingénieurs in Paris, and a BS in mathematics (summa cum laude) from the University of Bordeaux.

Presentations

Bot or human? Applying machine learning to combating the bot epidemic (sponsored by Oracle + Dyn) Session

Bots now make up over 50% of website traffic and have become the primary source of malicious application attacks. Laurent Gil outlines what you need to know about bot traffic, discusses the types of bots you may encounter, from the simple to the sophisticated, and shares three real-world applications of machine learning and artificial intelligence to identify and defeat malicious bots.

Appu Goundan is a software engineer at Google, where he works on Java build tooling for developers targeting Google’s cloud. He would like container-based deployments to be fast and simple. He’ll also talk to you about home automation, basketball, surfing, and bread making.

Presentations

Build containers faster with Jib, a Google image build tool for Java applications Session

Qingyang Chen and Appu Goundan demonstrate how to speed up container-based development by building container images with Jib, a Google image build tool for Java applications.

Christian Grabowski is a backend software engineer at NS1, a next-generation DNS and traffic management company. Christian has worn many engineering hats over the course of his career and has worked on a variety of software but loves getting into the nitty-gritty, low-level code the most. When he’s not developing fast, intelligent DNS services, he’s rather active in the open source community, contributing to projects such as gobpf, BCC, and Kubernetes.

Presentations

Performance debugging: Finding bottlenecks in distributed systems Session

Performance debugging is a crucial part of ensuring code is production ready, particularly as a company and its products grow. However, bottlenecks that hold these services back can be hard to identify. Christian Grabowski shares his experience debugging bottlenecks in distributed systems, at both a macro (metrics, distributed tracing) and a micro (user space and kernel space profiling) level.

Julia Grace is the head of infrastructure engineering at Slack, where she grew her team from 10 people to 60+. She excels in high-velocity environments, especially during hypergrowth, and loves solving challenging engineering problems at scale. She advises early- and mid-stage startups and has extensive experience raising venture capital funding (including from top-tier investors such as Andreessen Horowitz). Previously, she was cofounder and CTO at Tindie, which she sold, and founded and sat on several advisory boards for startups and large multibillion-dollar nonprofits. Julia holds a BS with honors and an MS in computer science from the University of North Carolina, where her research focused on operating systems, including building a distributed system that allowed peer-to-peer data sharing from internet browser caches in low connectivity. She is an avid athlete and former collegiate rower and is always trying to squeeze a run in (even if it’s just chasing her young daughter).

Presentations

Scaling yourself during hypergrowth Keynote

When Julia Grace joined Slack two-and-a-half years ago, the company had fewer than 100 engineers. It's now at more than 350, and her own team grew from 10 to 50 people in 18 months. Julia shares tips and stories from the leadership front lines as she learned how to rapidly scale herself and her leadership team during a period when her job was substantially changing every six months.

Scaling yourself during hypergrowth: Lessons learned managing managers Session

When Julia Grace joined Slack two-and-a-half years ago, the company had fewer than 100 engineers. It's now at more than 350, and her own team grew from 10 to 50 people in 18 months. Expanding on her keynote, Julia shares tips and stories from the leadership front lines as she transitioned from line manager to managing managers.

Lena Hall is a senior software engineer and developer advocate at Microsoft working on Azure, where she focuses on large-scale distributed systems and modern architectures. Lena has more than 10 years of experience in software engineering with a focus on distributed cloud programming, real-time system design, highly scalable and performant systems, big data analysis, data science, functional programming, and machine learning. Previously, she was a senior software engineer at Microsoft Research. She’s an elected member of the F# Software Foundation’s board of trustees, co-organizes a conference called ML4ALL, and is often an invited member of program committees for conferences like Kafka Summit, Lambda World, and others. Lena holds a master’s degree in computer science.

Presentations

Distributed systems for stream processing: Apache Kafka and Spark Streaming Session

Data is generated at an ever-increasing rate, so your architecture for ingesting this incoming data needs to be flexible, scalable, fast, and resilient. Alena Hall walks you through using distributed systems like Apache Kafka and Spark Streaming to ingest data coming from multiple sources in real time, process it, and perform machine learning tasks.

Ben Hartshorne is an engineer at Honeycomb. For the last 12 years, Ben has built monitoring, alerting, and observability systems for companies ranging from startups like Simply Hired and Parse to large organizations such as Wikimedia and Facebook. Strangely, he actually enjoys this work and is happy to finally be building a company that will help tease out nuances in data that seem to be missing from all the other crappy open source systems he’s used. Though unlikely to pass on a good scotch, he’ll reach for the bourbon or rye first.

Presentations

End-to-end observability for fun and profit Tutorial

Ben Hartshorne and Christine Yen explore what it means for a system to be “up” by discussing end-to-end (e2e) checks (what makes a good one and what techniques are valuable when thinking about them). Along the way, you'll learn how to write and evolve an e2e check against a common API.

Nathen Harvey is vice president of community at Chef, where he helps the community whip up an awesome ecosystem built around the Chef platform. Nathen also spends much of his time helping people learn about the practices, processes, and technologies that support DevOps, continuous delivery, and high-velocity organizations. Previously, Nathen spent a number of years managing operations and infrastructure for a diverse range of web applications. Nathen is a cohost of the Food Fight Show, a podcast about Chef and DevOps. He is also an occasional farmer who loves eggs and actively supports #hugops.

Presentations

Introduction to continuous compliance and remediation Tutorial

Join Nathen Harvey to learn how to easily integrate automated tests that check for adherence to policy into any stage of your deployment pipeline, using InSpec for compliance and Chef for remediation.

Michael Hausenblas is a developer advocate at AWS, part of the container service team, focusing on container security. Michael shares his experience around cloud native infrastructure and apps through demos, blog posts, books, and public speaking engagements and contributes to open source software. Previously, he was at Red Hat, Mesosphere, and MapR and worked at two research institutions in Ireland and Austria.

Presentations

Introduction to container networking with Docker and Kubernetes 1-Day Training

Join Michael Hausenblas to learn everything you need to know to be successful with networking in a containerized setup, no matter if you’re a developer or an admin. You'll start with a simple case of Docker containers running on a single machine and move on to advanced networking with Kubernetes.

David Hayes is the director of platform strategy at PagerDuty, where he is scaling the most reliable way of waking up the IT world, and a full-time data nerd. Dave can be comfortably blamed for anything you hate about PagerDuty’s product, but he’d rather talk about integrating your product with PagerDuty, PagerDuty’s APIs, his JavaScript wrapper, rock climbing, or Mario Kart.

Presentations

Artificial intelligence versus actionable intelligence (sponsored by PagerDuty) Keynote

Artificial intelligence has been almost here for 50 years, but we don't need to wait for it to escape the laboratory. Adding a manageable dose of actionable intelligence to your operations management workflow can save you time and aggravation. PagerDuty will talk about AI's limitations and how it can decrease your noise and suggest possible courses of action.

Jon Hodgson is the principal scientist for APM at Riverbed Technology. For over a decade, Jon has helped hundreds of organizations around the world optimize the reliability and performance of their mission-critical applications. With a background in data science, application architecture, systems administration, networking, and programming, Jon employs a multidisciplinary approach to troubleshooting, enabling him to analyze and solve some of the most challenging performance issues in complex modern environments. When he’s not obsessing about data visualization and making things perform faster, Jon enjoys digging things up with his tractor at his home in Missouri.

Presentations

The "sound" of performance monitoring data (sponsored by Riverbed) Session

Much of the monitoring data we rely on is fundamentally flawed, lacking the resolution and accuracy needed to effectively detect and diagnose many issues. Digital signal processing science has overcome similar challenges for audio. Using sound as an example, Jon Hodgson explains how these principles are leveraged by organizations to improve the fidelity of their performance monitoring.

Alejandro (Alex) Jaimes is senior vice president of AI and data science at Dataminr. His work focuses on mixing qualitative and quantitative methods to gain insights on user behavior for product innovation. Alex is a scientist and innovator with 15+ years of international experience in research leading to product impact at companies including Yahoo, KAIST, Telefónica, IDIAP-EPFL, Fuji Xerox, IBM, Siemens, and AT&T Bell Labs. Previously, Alex was head of R&D at DigitalOcean, CTO at AiCure, and director of research and video products at Yahoo, where he managed teams of scientists and engineers in New York City, Sunnyvale, Bangalore, and Barcelona. He was also a visiting professor at KAIST. He has published widely in top-tier conferences (KDD, WWW, RecSys, CVPR, ACM Multimedia, etc.) and is a frequent speaker at international academic and industry events. He holds a PhD from Columbia University.

Presentations

Opportunities and challenges in applying machine learning Session

Machine learning (ML) is becoming a critical skill for developers and businesses alike. Applying ML successfully in real-world scenarios, however, remains challenging. Alex Jaimes discusses how to find opportunities to apply ML, the pitfalls in applying it, and the steps required to succeed—from data to metrics to testing to other critical factors.

Timirah James is a developer advocate for Platform9 Systems’s Fission, a FaaS built on top of Kubernetes. She is best known for her thought leadership in the Los Angeles and Silicon Beach tech community, her active roles in the hackathon realm, and her mentoring of women exploring the world of STEM through her meetup, TechniGal LA.

Presentations

Function composition in a serverless world Session

FaaS functions are great for small functionality but not for complex real-world applications. Soam Vasani and Timirah James explore four available options for composing functions, along with a deep dive into workflows.

Sacha Judd is managing director at Hoku Group. She’s also a startup champion, early-stage investor, cohost of Refactor, cofounder of Flounders’ Club, and creator of Back of a Napkin. Sacha is a frequent speaker at conferences and in-house events on startups, capital raising, diversity and inclusion in the tech sector, and how fans will transform the world.

Presentations

Superfan Session

Homogeneous teams are one proven cause of missteps and flaws in software products and pipelines. Sacha Judd offers a fresh perspective, detailing available tools to improve hiring, promotion, and internal culture.

Sean Kane is the lead site reliability engineer at New Relic. A longtime system administrator and operations engineer, Sean has worked in a range of industry segments, including biotech, defense, entertainment, and hardware and software engineering in locations ranging from Alaska to Pakistan over his 20-year career. He’s the coauthor of Docker: Up and Running and provides Docker-related training with O’Reilly. In his spare time, Sean enjoys photography and sharing with his children the endless wonders of science, the great outdoors, and rappelling down skyscrapers. If you are looking for a conversation starter, Sean graduated from the Barnum & Bailey Clown College, completed two summer internships with the CIA, and built the first website in the state of Alaska, as well as the original USPS site.

Presentations

Docker: Up & Running—Workshop edition 2-Day Training

Drawing on his book Docker: Up & Running, Sean Kane walks you through everything that you need to know to start using Docker successfully, from installing Docker and designing and building Docker images to deploying and managing Docker containers. You'll leave with a firm understanding of how containers can help you optimize your workflow.

Docker: Up and Running—Workshop edition (Day 2) Training Day 2

Sean P. Kane, coauthor of Docker: Up and Running and an experienced trainer for O’Reilly, teaches you everything that you need to know to start using Docker successfully, including how to install Docker, design and build Docker images, deploy and manage Docker containers, and think about containers and how they can help you optimize your workflow.

Brian Ketelsen is a cloud developer advocate at Microsoft. An experienced leader of technical teams with a strong focus on data warehouses and distributed computing, Brian has been writing software for various platforms since 1993. He has honed his broad technical skills in a variety of roles ranging from DBA to CIO. A prolific open source enthusiast, he has contributed to some of the largest Go projects, including Docker, Kubernetes, etcd, SkyDNS, Kong, Go-Kit, and Goa, and coauthored Go in Action from Manning Press. Brian spends much of his free time fostering the Go community; he co-organizes GopherCon, the yearly conference for Go programmers held each summer in Denver, and helps organize the Tampa Go meetup. Brian holds a bachelor’s degree in computer science.

Presentations

Kubernetes two-day kickstart 2-Day Training

Brian Ketelsen and Erik St. Martin guide you through key concepts and practices for deploying and maintaining applications using Kubernetes.

Kubernetes two-day kickstart (Day 2) Training Day 2

This class is intended for solutions developers, systems operations professionals, solution architects, and development operations professionals who develop, migrate, and deploy container-based applications in the public cloud and want to learn the key concepts and practices for deploying and maintaining applications using Kubernetes.

Kyle Kingsbury, aka Aphyr, is a computer safety researcher and independent consultant. He’s the author of the Riemann monitoring system, the Clojure from the Ground Up introduction to programming, and the Jepsen series on distributed systems correctness. He grills databases in the American Midwest.

Presentations

Improving performance with Tesser Session

Kyle Kingsbury offers an overview of Tesser, a Clojure library for writing commutative, parallel folds that can be chained and composed into complex single-pass reductions that are dramatically faster on multicore systems and can be transparently distributed over Hadoop.

Jepsen 9: The center cannot hold Keynote

Kyle Kingsbury explores anomalies in three distributed systems—Tendermint, Hazelcast, and Aerospike—and shares general strategies for correctness testing using Jepsen, a distributed system testing harness that applies property-based testing to databases to verify their correctness claims during common failure modes: network partitions, process crashes, and clock skew.

Tim Koopmans is director of load testing at Tricentis. A performance testing expert dedicated to making open source load testing tools and cloud testing infrastructure accessible to everyone, Tim engineered the Flood “shared nothing” architecture and holds the key to its unparalleled scalability and throughput. He also developed Ruby JMeter. Previously, Tim spent more than 10 years as a performance and development consultant for companies across the retail, finance, telecom, and government sectors and served as an Australian Army officer for 10 years.

Presentations

Load testing reinvented for DevOps (sponsored by Tricentis) Session

Tim Koopmans explains how load testing is being reinvented for DevOps, covering where traditional load testing approaches fall short for Agile and DevOps, what’s needed to rapidly expose performance issues before they impact users, and new approaches to making load testing faster, simpler, and more realistic.

Bridget Kromhout is a principal cloud advocate at Microsoft. Her CS degree emphasis was in theory, but she now deals with the concrete (if the cloud can be considered tangible). After 15 years as an operations engineer, Bridget traded being on call for being on a plane. A frequent speaker and program committee member for tech conferences, she leads the Devopsdays organization globally and the DevOps community at home in Minneapolis, Minnesota. She podcasts with Arrested DevOps, blogs at Bridgetkromhout.com, and is active in a Twitterverse near you.

Presentations

Kubernetes 101 Tutorial

In this hands-on Kubernetes workshop, Bridget Kromhout walks you through launching clusters and details all the moving parts you need to know about to use Kubernetes in production.

Brodie Kurczynski is a software engineer at Las Cumbres Observatory in Goleta, California, where he works closely with astronomers to gather scientific data.

Presentations

Real-time astronomical observations using a global network of telescopes Session

Brodie Kurczynski shares how Las Cumbres Observatory developed a stateless interface to take real-time observations on a private global telescope network over the internet on a nonprofit budget.

John La Barge is a cloud solutions architect at Google, where he focuses on mobile app development, Kubernetes, and big data pipelines. Previously, he was a lead engineer at Capital One Labs focused on developing iOS applications and APIs.

Presentations

Lightweight mobile DevOps on GCP (sponsored by Google Cloud) Session

John LaBarge details how to perform lightweight mobile DevOps on GCP, including building Android applications with Container Builder, doing functional testing with Firebase Device Lab, and distributing tested artifacts through Crashlytics Beta.

Lynn Langit is a cloud and big data architect, AWS Community Hero, Google Cloud Developer Expert, and Lynda.com author.

Presentations

Serverless SQL Session

Serverless data access (via SQL and other data query/processing languages such as Spark) is fast becoming the norm. Lynn Langit compares the state of public cloud serverless SQL via AWS Athena, Google BigQuery, and others and explores architectural patterns and examples of services for emerging serverless and data lake cloud pipelines.

Xavier Léauté is a software engineer at Confluent, where he is responsible for analytics infrastructure, including real-time analytics in Kafka Streams. Previously, Xavier was a quantitative researcher at BlackRock and served in various research and analytics roles at Barclays Global Investors and MSCI. He holds an MEng in operations research from Cornell University and a master’s degree in engineering from École Centrale Paris.

Presentations

Metrics are not enough: Monitoring Apache Kafka Session

Experienced Kafka admins don’t just collect metrics; they go the extra mile and use additional tools to validate availability and performance on both the Kafka cluster and their entire data pipelines. Gwen Shapira and Xavier Léauté share best practices for monitoring Apache Kafka, discussing critical metrics, common mistakes, what metrics don’t tell you, and how to cover these essential gaps.

Richard Lee is CEO of Netra, a company that helps enterprises make sense of the tsunami of imagery by adding structure to photos and videos, making them searchable and more useful. This technology enriches social media and digital video to help companies better understand and reach their customers based on what content they post, view, or engage with. Richard is based in Boston. He holds an MBA from MIT and a BS from Boston University.

Presentations

Netra Q&A: Scaling resource-intensive APIs (sponsored by Oracle + Dyn) Keynote

Kyle York and Richard Lee explore Netra’s high-performance computing environment, focusing on how the company's AI and deep learning models process tens of millions of images and videos each day in a time- and cost-effective manner. Along the way, they explain what worked, what didn't, and why you need an Agile, hybrid infrastructure if you want to build an AI business at the scale of social.

Ian Lewis is a Tokyo-based developer advocate on Google’s Cloud Platform team. Ian has held various developer and operations roles throughout his career and enjoys working in environments with diverse ways of thinking. He is passionate about DevOps, SRE, Python, Go, and container orchestration. When he’s not writing controllers and operators in Go, he runs the Kubernetes meetup in Tokyo and blogs about Kubernetes and containers.

Presentations

Kubernetes security best practices Session

Ian Lewis shares the easiest and best ways to improve the security of your Kubernetes clusters.

Bryan Liles is an engineer at Heptio. When he is not writing software to help move teams to Kubernetes, he gets to speak at conferences on topics ranging from machine learning to building the next generation of developers. In his free time, Bryan races cars in straight lines and around turns and builds robots and devices.

Presentations

Declarative application configuration: Mixing the old with the new Keynote

Declarative application management enables developers and operators to simplify their configurations while deploying into increasingly complex environments. Bryan Liles explains how to evaluate and integrate these new practices into existing continuous integration pipelines.

Tomas Lin is a senior software engineer on the delivery engineering team at Netflix. He has worked on Spinnaker, Netflix’s open source multicloud continuous delivery platform, since 2013.

Presentations

Multicloud continuous delivery with Spinnaker Tutorial

Tomas Lin and Emily Burns walk you through building continuous delivery pipelines for deploying and promoting code across cloud virtual machines and containers using Netflix's Spinnaker continuous delivery platform.

Donna Malayeri is a product and community manager at Pulumi, where she is responsible for the company’s core open source project. Previously, she was a product manager on the Azure Functions team at Microsoft, guiding the developer experience from its beta through the first year of general availability. She is passionate about creating products that developers love and previously worked on programming languages such as F# and Scala. In a past life, she was an academic but now vastly prefers the experience of shipping software.

Presentations

Tooling in the age of serverless computing Session

Tooling is necessary for serverless and service-full applications. Donna Malayeri shares a decision framework for choosing infrastructure deployment tools, based on whether you need flexibility and control or simplicity and ease-of-use. You'll learn how to evaluate several popular cloud automation tools, including AWS SAM, Terraform, Chalice, Serverless Framework, and more.

Paul McCallick is director of product discovery at Nordstrom, where he supports multiple teams responsible for delivering personalized and relevant product information to customers. His scope includes the recommendation, personalization, and search systems that power Nordstrom’s omnichannel customer experiences.

Presentations

There can be only one (environment): Production Session

Paul McCallick discusses how and why Nordstrom has moved to an only-production viewpoint, saving countless engineering cycles and putting effort where it matters.

Nikki McDonald is a content director at O’Reilly Media, where she writes, edits, and works with the industry’s leading practitioners to develop books, online courses, and training videos to help engineers and developers collaborate more effectively and create and deploy complex distributed systems. She also cochairs O’Reilly’s Velocity Conference, held annually in San Jose, New York, and London. Nikki started out as a features editor at MacUser magazine back when people were still dialing up to the internet with AOL. She lives in Ann Arbor, MI.

Presentations

Thursday opening welcome Keynote

Program chairs Nikki McDonald, Ines Sombra, and James Turnbull open the second day of keynotes.

Wednesday opening welcome Keynote

Program chairs Nikki McDonald, Ines Sombra, and James Turnbull open the first day of keynotes.

Morgan McLean is a project manager for distributed tracing, debugging, and profiling at Google, which includes the OpenCensus project.

Presentations

Achieving observability at Google-scale with OpenCensus Session

Morgan McLean and Jaana Burcu Dogan detail how to quickly instrument your distributed services and gain visibility into their operation with OpenCensus.

Tyler McMullen is CTO of Fastly, where he is responsible for the system architecture and leads the company’s technology vision. As part of the founding team, Tyler built the first versions of Fastly’s instant purging system, API, and real-time analytics. Previously, Tyler worked on text analysis and recommendations at Scribd. A self-described technology curmudgeon, Tyler has experience in everything from web design to kernel development and loathes all of it. Especially distributed systems.

Presentations

Isolation without containers Session

Tyler McMullen offers an overview of sandboxing compilers, which provide important benefits but are also challenging to make both safe and fast. Tyler covers machine code generation and optimization, trap handling, and memory sandboxing and illustrates how to integrate them into an existing system—all based on a real compiler and sandbox, currently in development.

Manish Mehta is a senior security software engineer at Netflix, where he designs and develops solutions around secure bootstrapping, authentication (service and user), and authorization for cloud-native infrastructure. He focuses on cybersecurity, particularly security solutions anchored in cryptography, and has authored several research and conference publications in the field. Manish holds both an MS and a PhD in computer science from the University of Missouri – Kansas City.

Presentations

The distributed authorization system: A Netflix case study Session

Manish Mehta and Torin Sandall lead a deep dive into how Netflix enforces authorization policies (“who can do what”) at scale in its microservices ecosystem in a public cloud without introducing unreasonable latency in the request path.

John Miller III has been a senior developer at Fauna since 2017, where he focuses on driving advanced technologies into the core server. Prior to Fauna, John spent 28 years as a senior technical staff member (STSM) at IBM. In his last year there, he led IBM’s Analytics Platform development group and brought advanced technology to the Next Generation Analytic Platform, which served the needs of data engineers and data scientists. In addition, he spent 26 years with Informix and 7 years as chief architect, where he designed many components and features, including the Hybrid JSON functionality. John has also focused on the embedded space, including a complete redesign of the Informix administration system and the OpenAdmin Tool for Informix. He has led key improvements in database backups, high availability, benchmarking, and scale-up and scale-out architectures and has patents granted and pending in these areas. John was also a member of the IBM Academy of Technology, served on numerous architecture and technical promotion boards, and was involved in many cross-IBM technology initiatives.

Presentations

Declarative automation for distributed databases Session

The complexity of distributed databases makes building tools for their declarative automation a daunting engineering challenge. Drawing from his experience of developing multiple configuration automation systems for databases, John Miller explores patterns that generally apply to building declarative management tooling for distributed stateful systems.

Imad Mouline is the chief technology officer at Everbridge, where he is responsible for Everbridge’s market strategy, product roadmap, innovation, and research and development. Previously, Imad was cofounder and CTO at enterprise cloud management company CloudFloor (acquired by Everbridge); CTO of Compuware’s Application Performance Management Solutions Division, which was formed when the company acquired Gomez, a provider of web performance management solutions where Imad was CTO; and CTO of S1 Corporation, a provider of financial services solutions. Imad is a regular presenter at industry, technology, and academic conferences, including APCO, NEDRIX, the World Conference on Disaster Management, Cloud Connect, Interop, Internet World, and the MIT CIO Symposium. He is frequently quoted in leading publications such as the New York Times, USA Today, BBC News, Businessweek, CNN Money, Fortune, Forbes, Investor’s Business Daily, Network World, CIO Zone, and InformationWeek. He is a graduate of the Massachusetts Institute of Technology and has been awarded five US patents.

Presentations

What performance engineering means when lives are at stake (sponsored by Everbridge) Session

During crisis situations when lives are at stake, your critical event management and messaging platform cannot allow even the tiniest performance glitch. Imad Mouline explores technical and compliance challenges for building highly reliable, highly scalable, and highly secure systems that meet the needs of the most demanding clients and comply with the highest levels of international regulations.

Neal Mueller is the product lead for Google Cloud Platform, where he focuses on security and BeyondCorp. Outside of Google, Neal is an adventurer. He has summitted Mount Everest unguided, sailed from Hawaii to San Francisco, swum the English Channel, and completed the first-ever row across the Arctic Ocean, for which he was awarded a Guinness World Record. Neal holds a BA from the University of Pennsylvania and an MBA from the University of Pennsylvania’s Wharton School, both with honors.

Presentations

You want to step outside? What our fight against phishing taught us and how it can help you Session

Google conducted the first longitudinal study of the underground ecosystem fueling credential theft and identified 12.4 million potential victims of phishing kits. Neal Mueller discusses this data and shares phishing demos and recommendations about the effectiveness of phishing prevention tools, including education, antivirus software, filtering, 2FA, password managers, and security keys.

John Mumm is a software architect at Wallaroo Labs based in the Netherlands, where he works on an open source high-performance framework for building stateful distributed applications. John holds a PhD in philosophy.

Presentations

Think local: Reducing coordination and improving performance when designing around distributed state Session

Coordination is a common source of performance problems when dealing with distributed state. John Mumm shares strategies for avoiding coordination and relying on local knowledge wherever possible along with pros and cons and tips for using in-memory state instead of the typical approach of using external data stores.

Tamao Nakahara is head of developer experience at Weaveworks and co-organizes DevXCon. Tamao has over 20 years of DevEx, ecosystem alliances, and event experience. Previously, she was director of developer relations at New Relic, ran open source community programs at VMware and Pivotal for Cloud Foundry, Spring, Hadoop, RabbitMQ, and Redis, and helped customers with Oracle virtualization at VMware.

Presentations

Introduction to systems and service monitoring with Prometheus 1-Day Training

Prometheus—an open source monitoring system and time series database—features a multidimensional data model and a flexible query language and integrates monitoring aspects from client-side instrumentation to alerting. Tamao Nakahara, Ilya Dmitrichenko, and Stefan Prodan offer an overview of Prometheus architecture and concepts and walk you through using Prometheus and PromQL.

Jasmin Nakic is lead software engineer on Salesforce’s Sales Cloud performance engineering team, where he helps optimize cloud-based applications and build solutions to analyze and predict performance. Jasmin focuses on bringing advanced predictive analytics and machine learning methods to massive amounts of system performance data. Previously, he did database and application development at Teradata, KickFire, KLA-Tencor, and Oracle. Jasmin started his computer science adventure in high school in his small hometown in Bosnia, where he wrote short programs to analyze results from astronomical observations on HP calculators. Later, at the University of Sarajevo, he studied system programming, compilers, advanced architectures, and networks and wrote a thesis on object-oriented approaches to database programming.

Presentations

Using AI to solve performance problems (sponsored by Salesforce) Tutorial

Jasmin Nakic and Jackie Chu share techniques to identify performance challenges by analyzing production data from Salesforce and other sources and explore AI models that predict trends, detect anomalies, and troubleshoot performance problems.

Ryan Neal is head of infrastructure and part of the founding team at Netlify. Previously, he worked on the infrastructure team at Yelp and at Palantir in the Middle East. Ryan is based in San Francisco. He loves big data, fire spinning, and his golden retriever.

Presentations

How Netlify migrated to a fully multicloud infrastructure without any service interruptions Session

Ryan Neal explains how Netlify planned, tested, and executed its first multicloud migration that could direct traffic to Google Cloud (GCP), AWS, and Rackspace Cloud on demand, without any service interruptions. Along the way, Ryan shares lessons learned and key takeaways you can apply to your own infrastructure decisions.

Victoria Nguyen is a network systems engineer at Fastly. She loves rock climbing and Halloween.

Presentations

Networks, echolocation, and fish GIFs Session

Victoria Nguyen explains how Fastly overhauled the monitoring and data collection of its globally distributed network without its caches noticing.

Kris Nova is a senior developer advocate at Heptio focusing on containers, infrastructure, and Kubernetes. Kris is also an ambassador for the Cloud Native Computing Foundation. Previously, she was a developer advocate and an engineer on Kubernetes in Azure at Microsoft. Kris has a deep technical background in the Go programming language and has authored many successful tools in Go. She is a Kubernetes maintainer and the creator of kubicorn, a successful Kubernetes infrastructure management tool. Kris organizes a special interest group in Kubernetes and is a leader in the community. She understands the grievances with running cloud-native infrastructure via a distributed cloud-native application and recently authored an O’Reilly book on the topic, Cloud Native Infrastructure. Kris lives in Seattle and spends her free time mountaineering and rock climbing.

Presentations

Moving enterprise Java applications to Kubernetes Session

Kris Nova leads a deep dive into the world of migrating a monolithic Java application to Kubernetes.

Running stateful applications in Kubernetes: Is it worth the risk? Keynote

Kris Nova explores the current state of running stateful applications in Kubernetes, the tooling gaps you'll want to watch out for, and the four metrics that will help you determine if it's worth the risk.

Renee Orser is the vice president of engineering at NS1, where she oversees all delivery and operations of NS1’s engineering organization. Renee brings deep expertise in facilitation, cross-functional communication, and brash problem solving to NS1’s teams. Previously, Renee spent a decade working and traveling in over 30 countries while managing teams delivering distributed, highly scalable digital healthcare products to governments and international nonprofits; her roles included senior program manager at ThoughtWorks, analyst at Partners In Health, and independent consultant. She holds a BA in international relations and Arabic from Tufts University.

Presentations

Observability of team health: Deciphering and reacting to organizational feedback (sponsored by NS1) Keynote

Engineering managers build the strongest teams by listening to their engineers, continuously calibrating their own alerts, and driving change management based on the feedback sourced from within their engineering organization. Renee Orser explains how to monitor the human networks within your engineering teams using models similar to your distributed technology systems.

Jeff Poole is an engineering director over a set of teams that handle both operations and software development at Vivint Smart Home, where he works on the backend platform that powers the smart home and security aspects of Vivint’s products. Over his career, he’s held a diverse collection of roles and responsibilities, including technical lead in creating a multi-data-center-hosted VoIP platform and principal engineer designing networking hardware for defense applications. An adrenaline junkie, he’s moved from skydiving to working on an ambulance and in an ER to working on production systems.

Presentations

More than a series of tubes: Networking in Kubernetes Session

Networking with Docker and Kubernetes is a lot more complex than with traditional servers and virtual machines. Jeff Poole offers an overview of the concepts involved and explains what tuning may be required to use Kubernetes successfully.

Mark Prichard is senior director of product management at AppDynamics, where he focuses on cloud-native technologies such as Docker and Kubernetes. He is a frequent speaker at events in the Java and DevOps communities. In prior interesting lives, Mark taught law and worked briefly for the British Diplomatic Service.

Presentations

Application and business transaction monitoring in a container-orchestrated world (sponsored by AppDynamics) Session

Mark Prichard reviews available metrics from infrastructure, Kubernetes, containers, and application code and shares options for viewing them holistically, thus providing a complete picture of how your applications are behaving and how users are experiencing them.

Stefan Prodan is a developer experience engineer at Weaveworks. Previously, he was a software architect and a DevOps consultant, helping companies embrace DevOps and the SRE movement. Stefan has over 15 years of experience with software development. He enjoys programming in Go and writing about distributed systems.

Presentations

Introduction to systems and service monitoring with Prometheus 1-Day Training

Prometheus—an open source monitoring system and time series database—features a multidimensional data model and a flexible query language and integrates monitoring aspects from client-side instrumentation to alerting. Tamao Nakahara, Ilya Dmitrichenko, and Stefan Prodan offer an overview of Prometheus architecture and concepts and walk you through using Prometheus and PromQL.

Liz Rice is the technology evangelist at container security specialists Aqua Security and coauthor of the O’Reilly report Kubernetes Security. She has a wealth of software development, team, and product management experience from her years spent working on network protocols and distributed systems and in digital technology sectors such as video on demand (VOD), music, and voice over internet protocol (VoIP). When not building startups and writing code, Liz loves riding bikes in places with better weather than her native London or racing in virtual reality on Zwift.

Presentations

What's so hard about container vulnerability scanning? Session

Liz Rice leads a dive into what's easy—and what's not—about finding and patching security vulnerabilities in containers.

Mike Roberts is a partner at Symphonia, a cloud technology consultancy based in New York City. Mike’s a longtime proponent of Agile and DevOps values and is excited by the role that cloud technologies have played in enabling such values for many high-functioning software teams. Mike can be reached at mike@symphonia.io.

Presentations

Crossing the serverless fireswamp Session

Mike Roberts leads a warts-and-all journey through some of the limitations of a serverless approach and shares a practical set of techniques for dealing with these concerns.

Mastering continuously deployed serverless applications 2-Day Training

Serverless applications have moved from interesting oddity to the mainstream. But how do teams take the raw ideas of serverless and apply them in a continuous deployment context, operate serverless applications with confidence, and scale them to handle whatever the world can throw at them? Mike Roberts guides you through the answers to these questions and more in this in-depth master class.

Mastering continuously deployed serverless applications (Day 2) Training Day 2

Serverless applications have moved from interesting oddity to the mainstream. But how do teams take the raw ideas of serverless and apply them in a continuous deployment context, operate serverless applications with confidence, and scale them to handle whatever the world can throw at them? Mike Roberts guides you through the answers to these questions and more in this in-depth master class.

Sunil Sadasivan is the former CTO at Buffer and now leads the modernization of VA Appeals at the Department of Veterans Affairs with Nava and the USDS.

Presentations

Leading an effective engineering team within the world's largest bureaucracy Session

Sunil Sadasivan compares the work environments of startups to those of bureaucracies and shares lessons for maintaining an optimal engineering work culture learned at the US Department of Veterans Affairs.

Christian Saide is a DevOps engineer at NS1, where he has been a key player in automating, hardening, and scaling out its systems, particularly by pushing more and more of its infrastructure into container-based architectures and implementing solutions to the tough problems surrounding global distribution. He also served a critical role in NS1’s move to software-defined networking and authored the primary software-defined networking device and network topology. Christian has been working in the technology sector for five years, focusing on networking and distributed systems. Previously, he was at Industrial Color Software, where he climbed from a midlevel software developer to director of development operations and was instrumental in taking the company’s aging infrastructure from a handful of bare-metal servers to multiple virtualization hosts running hundreds of virtual machines, which in turn supported hundreds of containers.

Presentations

Gaining efficiency with time series in ELK Session

Christian Saide explains how NS1 was able to reduce infrastructure, maintenance, and operational costs while simultaneously increasing throughput and visibility of key metrics by leveraging Elasticsearch as a time series database.

Torin Sandall is the cofounder and technical lead of the recent open source Open Policy Agent project. He spent 10 years as a software engineer working on large-scale distributed systems projects. Previously, Torin was a senior software engineer at Cyan (acquired by Ciena), where he designed and developed core components of its SDN/NFV platform. He’s a frequent speaker on policy-related topics in Kubernetes at KubeCon, ContainerDaysPDX, Kubernetes meetups, and more.

Presentations

The distributed authorization system: A Netflix case study Session

Manish Mehta and Torin Sandall lead a deep dive into how Netflix enforces authorization policies (“who can do what”) at scale in its microservices ecosystem in a public cloud without introducing unreasonable latency in the request path.

Ryan Schneider is a lead education engineer at VMware in the cloud native business. He has a passion for architecture and building great systems and is excited about the cloud native movement that the Kubernetes community is driving. Previously, he worked at Heptio, as a backend and distributed systems engineer in companies both large and small, and as an adjunct professor in the Software Engineering Department at the Rochester Institute of Technology (RIT). After years of software development and architecture in the industry, he decided to blend his love for teaching and open source software and took a position as education engineer at Elastic, where he taught and consulted with engineers worldwide about Elasticsearch. Ryan holds a BS in CS and an MS in software development and management.

Presentations

Introduction to containers and Kubernetes (sponsored by Heptio) Tutorial

Using a combination of lecture and hands-on exercises, Ryan Schneider walks you through deploying Kubernetes and containers to build out a microservices architecture.

Baron Schwartz is the founder and CTO of VividCortex, the best way to see what your production database servers are doing. Baron has written a lot of open source software and several books, including High Performance MySQL. He’s focused his career on learning and teaching about the performance and observability of systems in general (including the view that teams are systems and that culture influences their performance) and of databases in particular.

Presentations

How to monitor your database Session

Baron Schwartz demonstrates how to monitor a database by understanding the difference between workload and resource monitoring—and the golden signals for each.

Gwen Shapira is a system architect at Confluent, where she helps customers achieve success with their Apache Kafka implementations. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. Gwen currently specializes in building real-time reliable data processing pipelines using Apache Kafka. Gwen is an Oracle Ace Director, the coauthor of Hadoop Application Architectures, and a frequent presenter at industry conferences. She is also a committer on Apache Kafka and Apache Sqoop. When Gwen isn’t coding or building data pipelines, you can find her pedaling her bike, exploring the roads and trails of California and beyond.

Presentations

Metrics are not enough: Monitoring Apache Kafka Session

Experienced Kafka admins don’t just collect metrics; they go the extra mile and use additional tools to validate availability and performance on both the Kafka cluster and their entire data pipelines. Gwen Shapira and Xavier Léauté share best practices for monitoring Apache Kafka, discussing critical metrics, common mistakes, what metrics don’t tell you, and how to cover these essential gaps.

Natalie Silvanovich is a security researcher for Google’s Project Zero. Her current focus is on script engines, particularly understanding the subtleties of the scripting languages they implement and how they lead to vulnerabilities. She is a prolific finder of vulnerabilities in this area, reporting over a hundred vulnerabilities in Adobe Flash in the last year. Previously, she worked in mobile security on the Android Security Team at Google and as a team lead of the Security Research Group at BlackBerry, where her work included finding security issues in mobile software and improving the security of mobile platforms. Outside of work, Natalie enjoys applying her hacking and reverse engineering skills to unusual targets and has spoken at several conferences on the subject of Tamagotchi hacking.

Presentations

JavaScript, security, and the case for feature simplicity Keynote

JavaScript engines are frequently targeted by malicious attackers, and dozens of vulnerabilities are reported in them every year. Most of these occur due to errors made while implementing well-specified features. Natalie Silvanovich discusses the link between feature complexity, developer error, and security vulnerabilities and the importance of considering implementation difficulty in design.

Ines Sombra is director of engineering at Fastly, where she spends her time helping the web go faster. Ines holds an MS in computology with an emphasis on cheesy ’80s rock ballads. She has a fondness for steak, fernet, and a pug named Gordo. In a previous life, she was a data engineer.

Presentations

Thursday opening welcome Keynote

Program chairs Nikki McDonald, Ines Sombra, and James Turnbull open the second day of keynotes.

Wednesday opening welcome Keynote

Program chairs Nikki McDonald, Ines Sombra, and James Turnbull open the first day of keynotes.

Erik St. Martin is a senior cloud developer advocate at Microsoft. He has spent the last decade building and securing distributed systems for large enterprises such as cable providers, credit bureaus, and fraud detection companies. He coauthored a book on the Go programming language, podcasts with GoTimeFM, and co-organizes GopherCon, the annual conference for the Go community.

Presentations

Kubernetes two-day kickstart 2-Day Training

Brian Ketelsen and Erik St. Martin guide you through key concepts and practices for deploying and maintaining applications using Kubernetes.

Kubernetes two-day kickstart (Day 2) Training Day 2

This class is intended for solutions developers, systems operations professionals, solution architects, and development operations professionals who develop, migrate, and deploy container-based applications in the public cloud and want to learn the key concepts and practices for deploying and maintaining applications using Kubernetes.

Cynthia Thomas is a networking specialist at Google Cloud. She has spent more than 10 years in the networking industry, most recently with open source cloud and networking solutions. Cynthia has been an advocate of open source technologies while working on cloud-related technologies for the last five years. She is a frequent speaker at conferences, including DevOpsDays, DockerCon, Kubernetes meetups, and OpenStack events.

Presentations

How to reduce the attack surface of your container workloads Session

Modern microservices architectures (like those run on Kubernetes) need modern security solutions to provide least-privilege security. Cynthia Thomas outlines traditional firewall methods and details the evolution of the distributed security model to enforce least privilege for microservices.

Jon Tirsen is a software engineer at Square, where he works on backend scalability issues for Square’s Cash app. Jon has been building software for more than two decades at companies such as Google and ThoughtWorks. He’s lived all over the world but has now returned to his home country, Sweden—at least for now.

Presentations

Scaling Square's Cash app with Vitess Session

Jon Tirsen explains how Square scaled out the backend for its Cash app using Vitess, a database middleware for MySQL built at YouTube.

Matt Torrisi is an experienced solution architect in the Oracle + Dyn Edge Services Group, part of the Oracle Cloud Infrastructure organization. He has spent the last seven years advising companies from the smallest startups to the largest global enterprises on how to optimize their infrastructure posture and ensure brand resilience.

Presentations

Leveraging multiplatform DNS for web application resiliency (sponsored by Oracle + Dyn) Tutorial

Matt Torrisi demonstrates how to build domain traffic easily by enabling multiplatform DNS, covers the important criteria in assessing DNS network compatibility, and walks you through using DNS as a traffic-steering platform.

James Turnbull is VPE at Glitch. A longtime member of the open source community, James is the author of a number of books about open source software. Previously, he was a CTO in residence at Microsoft, founder and chief technology officer at Empatico and Kickstarter, VPE of Venmo, and an adviser at Docker. James likes food, wine, books, photography, and cats. He is not overly keen on long walks on the beach or holding hands.

Presentations

Thursday opening welcome Keynote

Program chairs Nikki McDonald, Ines Sombra, and James Turnbull open the second day of keynotes.

Wednesday opening welcome Keynote

Program chairs Nikki McDonald, Ines Sombra, and James Turnbull open the first day of keynotes.

Seth Vargo is an engineer at Google Cloud. Previously, he worked at HashiCorp, Chef Software, CustomInk, and some Pittsburgh-based startups. He is the author of Learning Chef and is passionate about reducing inequality in technology. When he is not writing, working on open source, teaching, or speaking at conferences, Seth advises nonprofits.

Presentations

Integrating Vault and Kubernetes Tutorial

Kubernetes is a popular application scheduler and orchestration tool, but its built-in secret storage does not provide the robustness many organizations require. In this interactive workshop, Seth Vargo demonstrates how to connect applications and services running under Kubernetes to HashiCorp Vault.

Service discovery. . .across clouds? Session

Local service discovery and availability is easy, but how do you discover services in other data centers or other cloud providers? Seth Vargo explains how HashiCorp Consul can provide service discovery, monitoring, and failover across many regions and multiple public and private cloud providers.

Soam Vasani is a software engineer at Platform9 Systems, where he created and works on the Fission framework and has also worked on Platform9’s Kubernetes cluster deployment and management product. His past work includes distributed filesystems and contributions to the GNU debugger and toolchain. He’s interested in distributed systems, DevOps tools and frameworks, and programming languages.

Presentations

Function composition in a serverless world Session

FaaS functions are great for small functionality but not for complex real-world applications. Soam Vasani and Timirah James explore four available options for composing functions, along with a deep dive into workflows.

Kathleen Vignos is a full stack engineer turned manager who has led engineering teams at Twitter and Wired. She’s worked at two startups (one of which she founded), traveled the western US for management consulting and professional services, taught business software programming at the university level, won a hackathon, and built dozens of websites. Other experiences include everything from being on call as a COBOL programmer for Y2K to modifying a React app for a hack week project. She holds engineering degrees from UCLA and Michigan.

Presentations

For managers: How to keep up your technical skills without annoying your team(s) Session

Engineering teams want technically competent managers, but they also often want managers to keep their hands off their code. So how can managers keep their technical skills relevant in order to add the most value? Kathleen Vignos shares creative strategies for developing and maintaining technical skills—some through the act of managing itself.

Bing Wei is a software engineer on the infrastructure team at Slack, working on its edge cache service. Previously, she was at Twitter, where she contributed to the open source RPC library Finagle, worked on core services for tweets and timelines, and led the migration of tweet writes from a monolithic Rails application to JVM-based microservices.

Presentations

From dandelion to tree: Scaling Slack Session

In 2016, Slack faced a problem: the load on its backend servers had increased by 1,000x. Bing Wei explains how rearchitecting the system with lazy loading, a publish/subscribe model, and an edge cache service overcame the problem with zero downtime, improved latency, and led to gains in reliability and availability.

Shannon Weyrick is vice president of architecture at NS1. A 20-year veteran of internet infrastructure, Shannon is an accomplished technical architect, developer, and leader whose experience encompasses both development and operations of globally distributed platforms. Previously, Shannon worked at INAP and F5. A regular open source contributor, he has led and worked on a wide range of infrastructure projects from high-performance servers to novel programming languages and runtimes, and he enjoys writing and speaking at industry conferences.

Presentations

Rebuilding the airplane in flight. . .safely Session

Rewriting the key software component of your platform from scratch is always intimidating, especially when you guarantee 100% uptime, your platform is in the critical application delivery path, and your environment is highly distributed. Shannon Weyrick discusses NS1's recent DNS server rewrite and the steps the company took to roll it out across its globally distributed network with no downtime.

Jamie Wilkinson is a site reliability engineer at Google. He’s a contributing author to the SRE Book and has presented on contemporary topics at prominent conferences such as Linux.conf.au, Monitorama, PuppetConf, Velocity, and SRECon. His interests began in monitoring and the automation of small installations and have continued with human factors in automation and systems maintenance on large systems. Despite his more than 15 years in the industry, he’s still trying to automate himself out of a job.

Presentations

Principia SLOdica: A treatise on the metrology of service level objectives Session

Jamie Wilkinson offers an overview of SLOs and the concept of the error budget, examines the motivation to move from cause-based to symptom-based alerting, and demonstrates how to implement it in your own projects.

Jeff Williams is cofounder and CTO of Contrast Security, an application security product designed for DevOps and CI/CD. He recently authored the DZone DevSecOps cheat sheet and speaks frequently on the topic. Previously, Jeff founded Aspect Security and served as the global chair of OWASP for eight years. Jeff created the OWASP Top 10, OWASP Enterprise Security API, OWASP Application Security Verification Standard, XSS Prevention Cheat Sheet, and many more popular open source projects.

Presentations

Jumpstarting your DevSecOps pipeline with IAST and RASP (sponsored by Contrast Security) Session

Jeff Williams explains how to layer security tools on a CI/CD pipeline without disrupting it and demonstrates a fast, effective, scalable DevSecOps pipeline using free tools.

Scott Wimer is a principal systems engineer at Smartsheet. Scott has been nerding for money since 1995. In that time, he’s done technical support, built PCs, built networks, written code for money in 13 different languages, spent more than a decade working on operating system kernels and device drivers, built and led technical teams, obtained a few patents, published some papers, given away open source code, developed and taught classes in Perl, Python, IPv6 networking, and Linux virtualization with KVM, designed distributed systems, spoken at a few conferences, mentored peers and proteges, written business plans, done sales cold calls, raised angel capital, and basically trod the road of a technology generalist addicted to learning.

Presentations

Deleting data for fun and profit^H^H^H^H^H^H loss avoidance Session

Scott Wimer explains how to support the GDPR’s Right to be Forgotten through targeted, secure data destruction.

Erica Windisch is the founder and CTO of IOpipe, where she brings her decades of experience in building developer and operational tooling to serverless applications. Erica also has more than 16 years of experience designing and building cloud infrastructure management solutions. She was an early and longtime contributor to OpenStack and a maintainer of the Docker project.

Presentations

The state of statelessness Session

Serverless and other stateless applications still manipulate state—somewhere. Erica Windisch explains why observing this state and knowing where, how, and why that state is manipulated is important for operational security and developer concerns such as debugging.

Martin Woodward is the principal program manager for DevOps at Microsoft, where he focuses on Visual Studio Team Services and Team Foundation Server. Previously, Martin was executive director of the .NET Foundation, helping drive Microsoft’s move to open source, and was responsible for the Java, Linux, and Mac tooling in the Developer division, where he helped introduce Git into Microsoft.

Presentations

How Microsoft does DevOps (sponsored by Microsoft) Session

Expanding on the concepts from his keynote, Martin Woodward digs in behind the data and gives a technical summary of the steps Microsoft has taken in its journey to DevOps. Join in to discover what Microsoft has learned so far and the next areas it will focus on.

Why Microsoft does DevOps (sponsored by Microsoft) Keynote

Martin Woodward leads a whistle-stop tour of Microsoft's seven-year DevOps journey, explaining why the company embarked on this transformation and what benefits it has already seen.

Jason Yee is a technical evangelist at Datadog, where he works to inspire developers and ops engineers with the power of metrics and monitoring. Previously, he was the community manager for DevOps and performance at O’Reilly Media and a software engineer at MongoDB. He’s currently exploring the world while living as a nomad and would love to hear about the part of the world that you call home.

Presentations

Canary deploys with Kubernetes and Istio Session

Jason Yee shows how you can more easily test code in production while isolating the effect of potential issues using container orchestration and service meshes.

Christine Yen is the cofounder of Honeycomb, a startup with a new approach to observability and debugging systems with data. Christine has built systems and products at companies large and small and likes to have her fingers in as many pies as possible. Previously, she built Parse’s analytics product (and leveraged Facebook’s data systems to expand it) and wrote software at a few now-defunct startups.

Presentations

End-to-end observability for fun and profit Tutorial

Ben Hartshorne and Christine Yen explore what it means for a system to be “up” by discussing end-to-end (e2e) checks (what makes a good one and what techniques are valuable when thinking about them). Along the way, you'll learn how to write and evolve an e2e check against a common API.

Kyle York is the general manager and vice president of business and product strategy for the Dyn global business unit at Oracle. He is a longtime Dyn executive, having joined in 2008. Over the years, Kyle has spearheaded company growth and corporate strategy, which led to the acquisition by Oracle. In his current role, Kyle focuses on product and business strategy and overall global business unit (GBU) operations.

Presentations

Netra Q&A: Scaling resource-intensive APIs (sponsored by Oracle + Dyn) Keynote

Kyle York and Richard Lee explore Netra’s high-performance computing environment, focusing on how the company's AI and deep learning models process tens of millions of images and videos each day in a time- and cost-effective manner. Along the way, they explain what worked, what didn't, and why you need an Agile, hybrid infrastructure if you want to build an AI business at the scale of social.

The internet versus your sites: Taking action against internet volatility (sponsored by Oracle + Dyn) Keynote

When the internet is not bombarding your DNS with bogus requests, it’s trying to execute malicious SQL queries and crawl your site with bots (some good, some bad). Join Kyle York to learn how to take action.