Building and maintaining complex distributed systems
June 19–20, 2017: Training
June 20–22, 2017: Tutorials & Conference
San Jose, CA

Speakers

New speakers are added regularly. Please check back to see the latest updates to the agenda.

Peter Alvaro is an assistant professor of computer science at the University of California, Santa Cruz, where he leads the Disorderly Labs research group. His research focuses on using data-centric languages and analysis techniques to build and reason about data-intensive distributed systems in order to make them scalable, predictable, and robust to the failures and nondeterminism endemic to large-scale distribution. Peter holds a PhD from UC Berkeley, where he studied with Joseph M. Hellerstein. He is a recipient of the NSF CAREER award.

Presentations

Orchestrating chaos: Applying database research in the wild Keynote

Lineage-driven fault injection (LDFI), a novel approach to automating failure testing, can greatly reduce the number of faults that must be explored via fault injection. Peter Alvaro explores LDFI’s theoretical roots in the database research notion of provenance and presents early results from the field and opportunities for near- and long-term future research.

Megan Anctil is a senior engineer on the Technical Operations team at Slack. She enjoys deep dives in debugging and long walks on the beach with her #MonitoringLove(s).

Presentations

Our many monitoring monsters Session

One size definitely doesn't fit all when it comes to open source monitoring solutions, and executing generally understood best practices in the context of unique distributed systems presents all sorts of problems. Megan Anctil shares pain points and lessons learned at Slack wrangling known technologies such as Icinga, Graphite, Grafana, and the Elastic Stack to best fit the company's use cases.

David Andrews is a CDN architect and evangelist at Verizon Digital Media Services. He enjoys low-level security exploitation techniques and has an appreciation for the nuances and resulting surprised faces that accompany discovering failure modes in globally distributed systems. Previously, Dave brought several web security products to market at Verizon Digital Media Services and worked for startups in the Los Angeles area, building security products in the virtualization and content delivery network (CDN) spaces. He holds a PhD in computer security from a small university in Australia.

Presentations

Preventing cascading failures in a global network (sponsored by Verizon Digital Media Services) Keynote

Cascading failures are every team's worst nightmare. Without the right monitoring, alerting, and containment in place, the failure of a system's key part can quickly result in the entire system failing. Dave Andrews shares strategies for addressing cascading failures at various scales: on a single system, within a given data center, and in a globally distributed environment.

Doug Barth is a site reliability engineer at Stripe. Doug has a deep interest in software, hardware, and production systems and has spent his career using computers to solve hard problems. He helped deploy PagerDuty’s IPsec mesh network and is now writing Zero Trust Networks.

Presentations

Zero Trust networks: Building systems in untrusted networks Session

Douglas Barth and Evan Gilman offer an overview of Zero Trust, a new security model that considers all parts of the network to be equally untrusted. Doug and Evan show how to leverage a network's strengths by combining traditional SRE security approaches with novel technological arrangements while using software and hardware to secure the systems operating in those networks.

Kristopher Beevers is founder and CEO of NS1, the next-gen DNS and traffic management company. Previously, Kristopher led platform development at Voxel.net (acquired by Internap), where he built cloud and bare metal platforms, content delivery networks, and other distributed infrastructure products. Kristopher holds BS, MS, and PhD degrees in computer science from RPI.

Presentations

Meet the Experts with Kristopher Beevers Meet the Experts

Kristopher works on DNS resiliency, traffic management setups, application delivery architectures, global infrastructure management, and service provider redundancy and would be happy to share ideas around best practices, scale-up architecture considerations, and the like.

Resiliency in a service provider world (sponsored by NS1) Keynote

Today we depend upon service providers (for storage, compute, network, DNS, CDN, and much more) to build and deliver our applications. Even when the most sophisticated service providers on the internet fail—and they do—it’s still possible to build resilient applications. Kristopher Beevers explores how ops teams and developers are thinking about resiliency in a service provider world.

Micheal Benedict leads product management for Pinterest’s cloud and data infrastructure. Previously, Micheal led products for Twitter Cloud Platform, building next-generation compute services that span internal and public clouds. He and his team built Kite, a service lifecycle manager and an infrastructure metering and chargeback system. Prior to that, he was an engineer building systems that powered Twitter’s observability and monitoring stack. Micheal holds a master’s degree in computer science from the State University of New York at Buffalo.

Presentations

Managing the microservices lifecycle: The what, why, and how Session

Companies like Twitter, Pinterest, and Uber are powered by thousands of microservices. Managing the lifecycle of services (i.e., creating them, provisioning resources, deploying, metering, charging, and deprecating) at scale proves to be challenging. Micheal Benedict discusses the need for a lifecycle manager, how to implement governance, and the impact of such a system on developer productivity.

Artur Bergman is the founder and CEO of Fastly, the future of content delivery. Previously, he served as CTO at Wikia, managed LiveJournal’s engineering team, and was an operations architect at Six Apart. In past lives, he was a committer to Varnish, built high-volume financial trading systems, reimplemented Perl 5’s threading system, and created djabberd.

Presentations

Future history Keynote

When Fastly CEO Artur Bergman helped organize the first Velocity event 10 years ago, the tech landscape was very different. Artur looks back at the last decade of DevOps and explores shifting patterns in operations, development, and systems through the lens of the Velocity Conference.

Marcus Blankenship is an author, trainer, and consultant who helps companies improve their software delivery teams and processes. Fifteen years ago, he made the leap from a senior programmer/architect designing product configuration expert systems to leading teams and departments, and he has done so at global enterprises and his own software consultancy. Marcus has worked extensively as a consultant and trainer with manufacturing, digital agencies, and SaaS companies. Marcus is also the author of 7 Habits That Ruin Your Technical Team.

Presentations

Technology leadership: Building and managing high-performance teams 2-Day Training

Understanding why the manager-engineer relationship is key to employee productivity and satisfaction requires learning a framework for building strong relationships within organizations, creating a driven culture, and communicating upward and outward to benefit teams. Marcus Blankenship explores why engineers must be prepared for the human and political challenges they face daily.

TRAINING: Technology leadership - Building and managing high-performance teams (Day 2) Training Day 2

Understanding why the manager-engineer relationship is key to employee productivity and satisfaction requires learning a framework for building strong relationships within organizations, creating a driven culture, and communicating upward and outward to benefit teams. Marcus Blankenship explores why engineers must be prepared for the human and political challenges they face daily.

Aaron Blohowiak is a senior software engineer on the Chaos and Traffic team at Netflix. Aaron has a decade of experience taking down production, learning from mistakes, and striving to build ever more resilient systems.

Presentations

Precision chaos Session

Chaos Monkey and Kong changed the culture around infrastructure failure, but the most common cause of downtime is service failure. Turning off an entire service in production is too risky. Aaron Blohowiak offers an overview of precision chaos techniques that verify service-level fault tolerance and reveal hidden resource constraints while minimizing potential fallout.

Refael Botbol is the BlazeMeter testing domain expert at CA Technologies, where he enables developers to achieve higher-quality applications by injecting testing throughout the software development lifecycle. He is passionate about helping organizations democratize access to testing tools (performance, functional), allowing development teams to successfully achieve sprint deadlines. Refael has nearly 15 years of end-to-end experience, ranging from development and system engineering up to ensuring delivery of high-quality applications. His proficiency includes operating systems and performance testing and leading multiple web-based platform projects, using technologies including Apache, JBoss, JMeter, Microsoft IIS, Selenium, Taurus, and other open source tools.

Presentations

Open source tool chains for continuous testing (sponsored by CA Technologies) Session

The goal of continuous testing is to find defects earlier and release software faster, which can be achieved by integrating a set of open source functional and performance testing tools in the early stages of the software delivery lifecycle. Refael Botbol explains how to integrate open source tools like Apache JMeter and Selenium with Taurus and Jenkins as part of a continuous testing effort.

Juan Pablo Buriticá is the vice president of engineering at Splice, where he leads a distributed team throughout the US and Latin America building a cloud platform for music creation, collaboration, and sharing. Juan Pablo has built effective software engineering organizations by emphasizing open source software values, technical excellence, trust, and empathy. He has organized five global software engineering conferences, spoken at multiple events, and founded and led the growth of Colombia’s JavaScript community, the largest Spanish-speaking JS community in the world, with more than 5,000 members.

Presentations

Technical decision making for teams, the open source way Session

Juan Pablo Buriticá explains how to use technical RFCs as a decision-making tool in your engineering organization to increase effectiveness. When implemented properly, technical RFCs can encourage trust and delegation, respectful discussions, knowledge sharing, and accountability and support good software design.

Brendan Burns is a partner architect at Microsoft Azure, where he runs the Container Service and Resource Manager teams, and a cofounder of the Kubernetes open source project. Previously, he worked at Google on cloud APIs and web search infrastructure and was a professor of computer science at Union College. Brendan holds a PhD in computer science from the University of Massachusetts Amherst and a BA in computer science and studio art from Williams College.

Presentations

Democratizing distributed systems: Building reusable distributed system patterns using containers Session

Building reliable distributed systems is challenging and often bespoke, so it's hard for developers to share implementations and best practices. Brendan Burns explores common patterns for composing reliable distributed systems and shows how these patterns can be expressed via containers, so that they can be reused throughout many different applications.

Tammy Butow is a site reliability engineering manager at Dropbox, where she is the team lead for the Databases and Magic Pocket SRE teams. She enjoys working on infrastructure engineering and is interested in chaos engineering, antifragile systems, automation, Go, and Linux. Previously, Tammy worked in security engineering and product engineering. She is the cofounder of Girl Geek Academy, a global movement to teach 1 million women technical skills by 2025. Girl Geek Academy received support from the Australian prime minister and a grant from the Australian government in 2016 to scale the Miss Makes Code program, which is aimed at teaching algorithms to 5- to 8-year-old girls. An Australian, Tammy currently lives in San Francisco, where she likes to ride bikes, skateboard, snowboard, and surf. She also loves mosh pits, crowd surfing, metal, and hardcore punk.

Presentations

Chaos engineering bootcamp Tutorial

Chaos engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production. Tammy Butow leads a hands-on tutorial on chaos engineering, covering the tools and practices you need to implement chaos engineering in your organization.

Lee Calcote is the senior director of technology strategy at SolarWinds. Lee is an innovative thought leader who is passionate about developer platforms and management software for clouds, containers, infrastructure, and applications and has consistently focused on advanced and emerging technologies throughout his career, at companies including Seagate, Cisco, and Pelco. Lee is active in the tech community and is an organizer of technology meetups and conferences, a writer, an author, and a speaker.

Presentations

The over-under on container networking Session

With application developers busily adopting container technologies, the time has come for network engineers to prepare for the unique challenges brought on by networking cloud-native applications. Lee Calcote walks you through available container connectivity options, explaining their function and when they should be used and comparing their performance characteristics.

Laine Campbell is a principal at OpsArtisan. Previously, Laine was a senior staff engineer for data infrastructure at Apple, the interim CTO at with.me, CEO and cofounder of Blackbird, and founder of PalominoDB. Laine has also held DBA roles at Oracle, MySQL, and Cassandra and was an architect and designer for 11 years with such companies as Obama for America, Travelocity, Zappos, Chegg, LiveJournal, Disney Mobile, and Adobe. She is an open source proponent and an advocate for bringing technology, job opportunities, and privileges to underserved populations.

Presentations

Database reliability engineering: What, why, and how? Session

SRE is becoming quite the ubiquitous term, but what about DBRE? Laine Campbell and Charity Majors dive into DBRE, exploring the paths to this craft and how to culturally evolve and support it. Laine and Charity focus on organizational scale, self-service, and force multipliers in recoverability, observability, availability, security, release management, and infrastructure.

Meet the Experts with Laine Campbell and Charity Majors Meet the Experts

Laine and Charity have three little words for you: database reliability engineering. Bring your questions and ideas.

Jack Chan is a senior engineering manager in Shutterfly’s Photos group. He was recently heavily involved in the company’s hybrid cloud migration, which pairs photos-related API services on AWS with a set of core services in a private data center. Jack has been working in software engineering development for quite some time, helping startups scale up to millions of users with cloud solutions. Previously, he worked in IT organizations at Adobe, Apple, and 3Com.

Presentations

How Shutterfly migrated 10+ billion photos to the cloud Session

Jack Chan describes how Shutterfly migrated metadata from over 10B photos from a private data center into AWS in 100 days and explores designs to absorb mountains of metadata, on-premises ecommerce integration, and parallel user experiences, all in a highly scalable fashion. Shutterfly Photos is now a hybrid cloud solution with images hosted on-premises and client-facing photos metadata on AWS.

Colin Charles is the chief evangelist at Percona. Previously, Colin was on the founding team of MariaDB Server, worked at MySQL, and worked actively on the Fedora and OpenOffice.org projects. Colin has been a MySQL user since 2000. He’s well known within open source communities in APAC and has spoken at many conferences.

Presentations

Best practices for MySQL high availability Tutorial

The MySQL world is full of trade-offs; choosing a high-availability solution is no exception, but only with high availability can you achieve distributed systems in your database layer. Colin Charles explores the MySQL high-availability landscape, offering deep dives into current technologies, recommendations, and what to look out for.

Pete Cheslock is the head of Threat Stack’s operations and support teams, where he focuses on delivering the highest level of service, reliability, and customer satisfaction to Threat Stack’s growing user base. An industry veteran with over 15 years’ experience in operations, Pete understands the challenges and issues faced by security, development, and operations professionals every day. Previously, Pete held senior positions at Dyn and Sonian, where he built, managed, and developed automation and release engineering teams and projects.

Presentations

Scale it to a billion: How to build it, keep it safe, and keep it running Session

Pete Cheslock shares the operational and security practices that helped Threat Stack scale while staying stable and secure, covering technology and tools and the various scale points that forced hard decisions.

Cliff Crocker is a product line director at Akamai Technologies, where he spends his time building product strategy for performance analytics. Previously, he was vice president of product at SOASTA and engineering leader for the performance, reliability, and site analytics initiatives at @WalmartLabs. Cliff is an active contributor in the web performance community, evangelizing the importance of speed as it relates to user behavior and ultimately business ROI. In his spare time, he enjoys skiing in the mountains of Colorado, where he resides with his wife and two boys.

Presentations

The false dichotomy of finders versus fixers (sponsored by SOASTA, now a part of Akamai) Keynote

Most tools designed to help you manage your systems fall into two categories: finders, such as monitoring services and log file analyzers, or fixers, such as cloud infrastructure providers or container orchestration. It's up to you to translate information from your finders into actions for your fixers. Cliff Crocker explains how to use intelligent analytics to connect data to actions.

Armon Dadgar is the CTO of HashiCorp, where he brings distributed systems into the world of DevOps tooling. Armon has a passion for distributed systems and their application to real-world problems. He has worked on Nomad, Vault, Terraform, Consul, and Serf at HashiCorp and maintains the Statsite and Bloomd OSS projects.

Presentations

Nomad and next-generation application architectures Session

Armon Dadgar offers an overview of Nomad, an application scheduler designed for both long-running services and batch jobs. Along the way, Armon explores the benefits of using schedulers for empowering developers and increasing resource utilization and how schedulers enable new next-generation application architectures.

Simon de Haan is the chief engineer at Praekelt.org. Simon has the rare talent to demystify software systems and platforms for nonengineers. Previously, he was the team lead on Praekelt.org’s Vumi platform, an open source messaging platform that allows for interactive conversations over SMS, USSD, Google Talk, and other basic technologies at low cost and at population scale in the majority world. Vumi is the technology that powers various groundbreaking initiatives such as Wikipedia Text, PeaceTXT, MomConnect, MAMA, and the Libyan election registrations. Prior to joining Praekelt, Simon was CTO at Soocial.com, a senior developer at Eight.nl, and the owner of Fission.nl. Simon has hosted various talks, webinars, and hackathons about his passion for development and for building systems that can scale and can help our partners reach their audiences with life-saving information. Growing up in the Middle East ruined him for the ordinary as those years left an unmistakable impression and set him on a direct course to be involved in community development, entrepreneurship, and technology.

Presentations

Meet the Experts with Simon de Haan and Milton Madanda Event

Chat with Simon and Milton about using tech for social impact, designing public services, and designing within the technical constraints of emerging markets.

No place like home: Building resilient distributed systems locally in Africa Session

Developing reliable healthcare systems requires careful integration of a country’s health, tech, and legal ecosystems. In Africa, locally built resilient distributed systems are needed to meet the demand of national-scale digital health services and data sovereignty laws. Simon de Haan explores the challenges of building in these environments and the solutions that have proven to work.

Bart De Vylder is a data scientist at CoScale. Previously, Bart was active in software engineering and architecture, with a focus on distributed systems. His interests lie in machine learning and building reliable, scalable data processing systems. Bart holds a PhD in artificial intelligence from the Free University of Brussels.

Presentations

A hands-on data science crash course for modeling and predicting the behavior of (large) distributed systems Tutorial

Data science is a hot topic. Bart De Vylder offers a practical introduction that goes beyond the hype, exploring data analysis, visualization, and machine-learning techniques using Python for modeling the behavior of distributed systems. You'll leave with a solid starting point to implement data science techniques in your infrastructure or domain of interest.

Dinesh Dutt is chief scientist at Cumulus Networks. Dinesh has been in the networking industry for 15 years, most of it spent at Cisco Systems, where he was involved in enterprise and data center networking technologies, including the design of many of the ASICs that powered Cisco’s megaswitches, such as Cat6K and the Nexus family of switches. He also has experience in storage networking from his days at Andiamo Systems and in the design of FCoE. Dinesh is a coauthor of TRILL and VxLAN and has filed over 40 patents.

Presentations

Troubleshooting data center networks: Fresh tools and perspectives Tutorial

Dinesh Dutt explores network troubleshooting and explains how to avoid common network problems ranging from misconfigured cabling to misbehaving protocols, how a modern networking tool chest can help simplify network configurations, and how automation is improving troubleshooting turnaround times to minimize downtime.

Devin Elliot is the founder of Unoceros. Previously, Devin worked in weather data tech, designed flavor molecules from bacteria, painted houses, and was a professional snowboarder.

Presentations

Edge infrastructure will save you from your mobile traffic nightmares Session

It takes more than a one-tenth scale server-based test environment to seamlessly load balance and deliver content to millions of mobile users. Devin Elliot explains how UX for customers of major media and live streaming events was improved by leveraging idle distributed networks of smartphones and smart devices to repeatedly map, measure, and load test at scale.

Tammy Everts is a co-chair of O’Reilly Fluent.

Tammy has spent the past two decades studying how people use the web. Since 2009, she’s focused on the intersection between web performance, user experience, and business metrics. Her book, Time Is Money: The Business Value of Web Performance (O’Reilly), is a distillation of much of this research. She co-curates (with Tim Kadlec) WPO Stats, a collection of performance case studies.

Tammy is chief experience officer at SpeedCurve, where she helps companies understand how visitors use their websites.

Presentations

Performance is about people, not metrics Keynote

Tammy Everts walks you through a brief history of UX and web performance research, highlighting key studies that connect the dots between performance and user experience and sharing some educated guesses about new metrics that are just around the corner.

Stephen Feloney is vice president of products for the Continuous Delivery business unit at CA Technologies, where he is responsible for the company’s service virtualization, application test, release automation, and test data management solutions. Previously, Stephen held product management roles with a focus on enterprise software at companies ranging from HP to startups. His most recent position was senior director of products at Dynatrace, where he worked on analytics and application monitoring. Stephen also spent 12+ years as a software engineer. He holds a BS in computer engineering from Santa Clara University.

Presentations

Continuous delivery made easy: Removing barriers in the modern software factory (sponsored by CA Technologies) Session

Delivering software continuously is a common ambition, but many face challenges pursuing this goal. Stephen Feloney shares new technologies, solutions, and best practices that make it easier for organizations to attain continuous delivery and leads a live demonstration showing end-to-end orchestration throughout the continuous delivery toolchain.

Bret Fisher is a Virginia Beach-based freelance DevOps and Docker consultant, trainer, speaker, and open source volunteer. Bret has been a cloud and data center ops and system administrator for 20 years. Currently, he helps teams Dockerize their apps and systems and improve their speed of deployment, resiliency, metrics, and awareness (all that DevOps-y stuff). Bret is a Docker Captain and Code for America Brigade Captain. He runs several monthly meetups, speaks at conferences, and is obsessed with containerizing any app he sees. (He’ll likely talk your ear off about it next time you meet.) Bret also develops in Node.js, Bash, and general web, usually for open source projects. In his free time, he does CrossFit, surfs a little, geeks out in the awesome local dev community in Virginia Beach, and travels with his wife.

Presentations

Docker production: Orchestration, security, and beyond Tutorial

Starting where previous Docker workshops leave off, Bret Fisher, Laura Frank, and Tony Pujals dive into the new Swarm mode clustering (services), failover, blue-green deployments, monitoring, logging, troubleshooting, and security, covering the latest built-in features and common third-party tools as they walk you through installing them on your own five-node cloud Swarm cluster.

Meet the Experts with Bret Fisher, Laura Frank, and Tony Pujals Meet the Experts

If you want the lowdown on using Docker in production, bring any and all questions to Bret, Laura, and Tony. They've got you covered.

Nicole Forsgren is the CEO and chief scientist at DevOps Research and Assessment (DORA). Nicole is an IT impacts expert who is best known for her work with tech professionals and as the lead investigator on the largest DevOps studies to date. She is a consultant, expert, and researcher in knowledge management, IT adoption and impacts, and DevOps. In a previous life, she was a professor, sysadmin, and hardware performance analyst. Nicole has been awarded public and private research grants (funders include NASA and the NSF), and her work has been featured in various media outlets, peer-reviewed journals, and conferences. She holds a PhD in management information systems and a master’s degree in accounting.

Presentations

Are we there yet? Signposts on your journey to awesome Session

When embarking on a journey of transformation, you want to measure your current status and subsequent progress while keeping tabs on factors that drive improvement in technology performance. Nicole Forsgren explains the importance of knowing how (and what) to measure—ensuring you catch successes and failures when they first show up, not just when they’re epic.

Camille Fournier is the former head of engineering at Rent the Runway. She was previously a vice president at Goldman Sachs. Camille is an Apache ZooKeeper committer and PMC member and a Dropwizard framework PMC member.

Presentations

The role of being technical in technical leadership Keynote

There is compelling evidence that technical workers want leaders who are strong technologists, leaders they believe they can learn from. What does this mean for those who wish to become engineering managers and technical leaders? How can you be an effective noncoding technical leader? Camille Fournier explores this conundrum and shares strategies to overcome it.

Laura Frank is a Docker Captain and the director of engineering at Codeship, where she works on improving the Docker infrastructure and overall experience for all users of the CI/CD platform. Previously, she worked on several open source projects to support Docker in the early stages of the project, including Panamax and ImageLayers. Laura lives in Berlin, where she can be found eating döner or attempting to try every type of gin in the world.

Presentations

Docker production: Orchestration, security, and beyond Tutorial

Starting where previous Docker workshops leave off, Bret Fisher, Laura Frank, and Tony Pujals dive into the new Swarm mode clustering (services), failover, blue-green deployments, monitoring, logging, troubleshooting, and security, covering the latest built-in features and common third-party tools as they walk you through installing them on your own five-node cloud Swarm cluster.

Everything you thought you already knew about orchestration Session

Do you understand how quorum, consensus, leader election, and different scheduling algorithms can impact your running application? Could you explain these concepts to the rest of your team? Laura Frank explores the algorithms that power all modern container orchestration platforms and shares actionable steps to keep your highly available services highly available.

Meet the Experts with Bret Fisher, Laura Frank, and Tony Pujals Meet the Experts

If you want the lowdown on using Docker in production, bring any and all questions to Bret, Laura, and Tony. They've got you covered.

Vicky Villalobos is a senior product manager at HPE, where she leads a team that handles all phases of her product lifecycle. Vicky has a passion for translating customer requirements into product features and technical specs so that at the end of the day her team can deliver products that meet customers’ needs and enable them to master performance engineering. With roots as a software engineer, Vicky has built a successful career leading IT product management initiatives for enterprise software and cloud computing projects and thrives on collaboration with multiple business units and customers all over the globe. An animal lover, Vicky fosters kittens to prepare them for adoption and dreams about puppies.

Presentations

Performance in a hyperscaling world (sponsored by Hewlett Packard Enterprise) Session

Vicky Villalobos explores some of the best practices and tooling used to load and monitor a system in order to find performance and behavior across any OS, deployment environment, or device and shares real-life success stories and best practices of teams who are navigating these challenges on a daily basis.

Evan Gilman is a site reliability engineer currently focusing on Zero Trust research. With roots in academia, Evan finds passion in both reliable, performant systems and the networks they run on. When he’s not building automated network systems, he can be found at the nearest pinball table or working on his upcoming book, Zero Trust Networks.

Presentations

Zero Trust networks: Building systems in untrusted networks Session

Douglas Barth and Evan Gilman offer an overview of Zero Trust, a new security model that considers all parts of the network to be equally untrusted. Doug and Evan show how to leverage a network's strengths by combining traditional SRE security approaches with novel technological arrangements while using software and hardware to secure the systems operating in those networks.

Sebastien Goasguen is senior director of cloud technologies at Bitnami, where he leads all the Kubernetes efforts. Sebastien joined Bitnami through the acquisition of his startup Skippbox. Sebastien is a 20-year open source veteran. A member of the Apache Software Foundation, he worked on Apache CloudStack and Libcloud for several years before diving into the container world. He is an avid blogger and enjoys spreading the word about new cutting-edge technologies. He also trains developers and sysadmins on all things Docker and Kubernetes. Sebastien is the author of the O’Reilly Docker Cookbook and 60 Recipes for Apache CloudStack.

Presentations

Meet the Experts with Sebastien Goasguen Meet the Experts

Stop by and talk with Sebastien. He’s got a ton of topics he’s happy to discuss: getting started and building distributed apps with Kubernetes, the Kubernetes Python client, Helm charts, Google's Summer of Code project, and more.

Scheduling containers with Kubernetes: Is it that different than other schedulers? Session

Kubernetes has emerged as one of the leading container orchestrators. Sebastien Goasguen explores its architecture and compares it with other orchestration/scheduling systems, outlining the similarities and explaining why Kubernetes API primitives make all the difference.

Sasha Goldshtein is the CTO of Sela Group, a Microsoft C# MVP and Azure MRS, a Pluralsight author, and an international consultant and trainer. Sasha’s consulting work revolves mainly around distributed architecture, production debugging, and mobile application development. Sasha is the author of Introducing Windows 7 for Developers (Microsoft Press) and Pro .NET Performance (Apress). He is also a prolific blogger and the author of numerous training courses, including .NET Debugging, .NET Performance, Android Application Development, and Modern C++.

Presentations

Linux performance monitoring with BPF Tutorial

Sasha Goldshtein leads a hands-on workshop on Linux dynamic tracing. You'll explore the BPF Compiler Collection (BCC), a set of tools and libraries for dynamic tracing, and gain firsthand experience of memory leak analysis, generic function tracing, kernel tracepoints, static tracepoints in user-space programs, and the baked-in tools for file I/O, network, and CPU analysis.

Oliver Gould is the CTO of Buoyant, where he leads open source development efforts. Previously, he was a staff infrastructure engineer at Twitter, where he was the tech lead of the Observability, Traffic, and Configuration and Coordination teams. Oliver is the creator of linkerd and a core contributor to Finagle, the high-volume RPC library used at Twitter, Pinterest, SoundCloud, and many other companies.

Presentations

The service mesh: Distributed resilience for a cloud-native world Session

Modern application architecture is becoming cloud native: containerized, "microserviced," and orchestrated. But resilience is more than just Docker and Kubernetes. Oliver Gould explains why companies like PayPal, Ticketmaster, and Monzo are adopting the service mesh model, where internal, service-to-service traffic is managed and instrumented with a mesh of load-balancing proxies.

Julia Grace is the director of infrastructure engineering at Slack. Previously, she was cofounder and CTO of Tindie, a marketplace for electronics funded by Andreessen Horowitz, where she built out and led the engineering team from founding through acquisition. Prior to joining the startup world, she spent several years building systems at IBM Research. Julia holds a BS and MS in computer science from the University of North Carolina at Chapel Hill with a focus on distributed systems. She is an avid runner and once starred in a TV commercial.

Presentations

10,000 messages a minute: Lessons learned from building engineering teams under pressure Session

Julia Grace has built teams at IBM Research, startups, and Slack and has done due diligence for venture capitalists to determine how well a startup’s engineering team is working together. Drawing on this knowledge, Julia attempts to answer the question, Why do some teams ship features rapidly, support each other, and effectively communicate while others struggle?

Alexander (Alex) Grbic is vice president of product marketing in the Programmable Solutions group at Intel Corporation, where he is responsible for defining, promoting, and managing innovative hardware, software, and intellectual property products for Intel’s programmable logic solutions, spanning the full product lifecycle. Alex joined Intel in 2015 with the acquisition of Altera Corporation, where he was senior director of software and IP marketing, managing product planning and outbound marketing for multiple Altera products, including the Quartus II design software and IP cores, digital signal processing design tools, and Altera’s software development kit for OpenCL. Previously, he spent several years overseeing Altera’s applications engineering—a role that included responsibility for escalated customer support, initiatives for early adoption of new products, and technical collateral—and worked in the company’s software and IP engineering organization, where he led the development of external memory interfaces IP, work on visualization and analysis tools, and performance analysis of Altera’s products. Earlier in his career, Alex was a hardware designer working on shared-memory multiprocessors in the Department of Electrical and Computer Engineering (ECE) at the University of Toronto. He also taught advanced computer science courses at the university and continues to serve on the university’s ECE board of advisors. He authored multiple patents in software design flows. Alex holds a bachelor’s and a master’s degree in applied science and a PhD in computer engineering, all from the University of Toronto.

Presentations

Achieve predictable performance (sponsored by Intel) Keynote

Alex Grbic explains how a single FPGA can deliver significant acceleration for multiple workloads. This new approach of integrating data analytics frameworks and existing databases enables enterprise customers to run unmodified applications without requiring any FPGA expertise and can be used with unstructured, NoSQL, and traditional relational databases, such as Swarm64.

Brendan Gregg is a senior performance architect at Netflix, where he does large-scale computer performance design, evaluation, analysis, and tuning. Previously, Brendan worked as a performance and kernel engineer. He has created performance analysis tools included in multiple operating systems, as well as visualizations and methodologies. Brendan is the author of Systems Performance. He received the USENIX LISA Award for outstanding achievement in system administration.

Presentations

Performance analysis superpowers with Linux eBPF Session

Advanced performance observability and debugging has arrived in Linux 4.x, with enhanced BPF (eBPF). Brendan Gregg offers an overview of Linux's new dynamic and static tracing tools for the analysis of filesystems, storage, CPUs, TCP, and more. Join in to explore a new generation of tools and visualizations.

Timothy Gross is a product manager for Joyent, providers of the Triton Elastic Container Service. Previously, Tim ran ops at DramaFever, where he and his scrappy team ran Docker in production to serve a few million fans their daily dose of dramas, documentaries, and gross-out horror movies. In another life, Tim was an architect (buildings, not software). He took the leap into programming and operations after he discovered he could automate away almost everything boring in his life.

Presentations

Software-defined culture Session

Conway's law tells us that "organizations which design systems. . .are constrained to produce designs which are copies of the communication structures of these organizations." What if we turn Conway's law around? Timothy Gross explores how to make technology choices that improve the culture of your organization.

Jason Hansen is a program manager working on Azure Container Service (ACS) at Microsoft, which he joined through its acquisition of Deis. Jason has a knack for breaking things—it may be the result of years of bad luck or that he spent too much time grappling with infrastructure software and hardware. Either way, Kubernetes is his home now, and he spends his time working to make it easier for teams to build amazing applications.

Presentations

Real-world Kubernetes 2-Day Training

Kubernetes has emerged as the leading platform for containerized applications. Jason Hansen and Sean Knox offer a deep dive into Kubernetes, from concept to implementation, sharing detailed explanations of its architecture, security, and use cases.

TRAINING: Real-world Kubernetes (Day 2) Training Day 2

Kubernetes has emerged as the leading platform for containerized applications. Jason Hansen and Sean Knox offer a deep dive into Kubernetes, from concept to implementation, sharing detailed explanations of its architecture, security, and use cases.

David Hayes is a full-time data nerd and the director of platform strategy at PagerDuty, where he is scaling the most reliable way of waking up the IT world. Dave can be comfortably blamed for anything you hate about PagerDuty’s product, but he’d rather talk about integrating your product with PagerDuty, PagerDuty’s APIs, his JavaScript wrapper, or rock climbing and Mario Kart.

Presentations

DevOps and incident management: A recipe for success (sponsored by PagerDuty) Keynote

Growing companies are customer-centric, and all members of an organization are now responsible for contributing to the customer experience. David Hayes explains why DevOps is a requirement for success and outlines some of the challenges that all DevOps teams will face over the next five years.

Micha “Mies” Hernandez van Leuffen is a hacker entrepreneur and the founder and CEO of Wercker, which he created in order to make developers’ lives easier by building the next generation of developer automation for the modern cloud.

Presentations

Cloud-native development: You're doing it wrong (sponsored by Oracle) Keynote

Developing and running applications in a cloud-native world has a problem set of its own. While this paradigm solves a lot of old challenges, moving to containers and microservices and launching them at scale on schedulers such as Kubernetes requires a different approach. Micha Hernandez van Leuffen shares five best practices for developing cloud-native applications.

Kelsey Hightower has worn every hat possible throughout his career in tech but most enjoys leadership roles focused on making things happen and shipping software. Kelsey is a strong open source advocate focused on building simple tools that make people smile. When he is not slinging Go code, you can catch him giving technical workshops covering everything from programming and system administration to his favorite Linux distro of the month.

Presentations

Patrick Hill is the engineering team lead at Atlassian and recently transferred from Sydney to the Austin office. (G’day, y’all!) In his free time, Patrick enjoys taking his beard from “distinguished professor” to “lumberjack” and back again.

Presentations

It's not just automation: Culture matters for when things break in DevOps (sponsored by Atlassian) Session

Ever had an incident that didn't go as planned? Patrick Hill shares five values developed by Atlassian SREs to better handle incident management.

Jeff Holoman is a systems engineer at Cloudera. Jeff is a Kafka contributor and has focused on helping customers with large-scale Hadoop deployments, primarily in financial services. Prior to his time at Cloudera, Jeff worked as an application developer, system administrator, and Oracle technology specialist.

Presentations

When it absolutely, positively has to be there: Reliability guarantees in Kafka Session

Kafka provides the low latency, high throughput, high availability, and scale that financial services firms require. But can it also provide complete reliability? Gwen Shapira and Jeff Holoman walk you through everything that happens to a message, from producer to consumer, and pinpoint all the places where data can be lost if you're not careful.

Sneha Inguva is an enthusiastic software engineer working on building developer tooling at DigitalOcean. Previously, Sneha worked at a number of startups. Her experience across an eclectic range of verticals, from education to 3D printing to casinos, has given her a unique perspective on building and deploying software. When she isn’t bashing away on a project or reading about the latest emerging technology, Sneha is busy molding the minds of young STEM enthusiasts in local NYC schools.

Presentations

Observability in a dynamically scheduled world Session

Over the past year, DigitalOcean's Delivery team has been building a runtime platform based on Kubernetes with the goal of making shipping code easier. A core component of this system is a monitoring and alerting system based on Prometheus and Alertmanager. Sneha Inguva offers an overview of the system and shares problems encountered, potential solutions, and key lessons learned in the process.

Karl Isenberg is a distributed systems architect at Mesosphere working on DC/OS (the Datacenter Operating System). Prior to Mesosphere, Karl worked on CloudFoundry and BOSH at Pivotal. Karl’s current side projects include Probe (a service-ready check), Inject (a Golang dependency injection library), and Mesos Compose Docker-in-Docker. Karl is, as of this writing, the only person to have been a committer on CloudFoundry, Kubernetes, and DC/OS, so he is uniquely qualified to address the container platform market, cloud-native frameworks, lifecycle management strategies, and deployment tools in general. Karl’s publications include Obfuscation, an irregularly updated tech blog, and a more active stream of technology-related tweets.

Presentations

Container orchestration wars Session

The orchestration space is fast moving and full of competing products, platforms, and frameworks. How do you choose the right one for your requirements? Karl Isenberg explores the features of several container orchestrators, breaking down the feature sets and characteristics into categories and scoring multiple solutions against each other, and discusses what's new this year.

Twelve-year system operations veteran Adam Jacob is the CTO of Chef, a company whose mission is to bring infrastructure automation to the masses. He is the primary author of Chef.

Presentations

The future works like people Keynote

Velocity helped define the era of DevOps and usher in the deep transformation of the field. Adam Jacob looks back on why this happened and explains how we need to shift our perspective to design organizations that can cope with not only what's new but also what's coming next.

Samir Jafferali is a staff SRE at LinkedIn. Samir is passionate about everything that makes the internet tick.

Presentations

Orchestrating multihomed cloud services for a fast and resilient edge Session

With members in every corner of the world, LinkedIn has built services around six CDNs, numerous PoPs, and three DNS platforms. Samir Jafferali explains how LinkedIn uses big data to steer DNS intelligently, optimizes the CDNs for performance, mitigates DDoSes, and measures metrics using RUM and synthetic monitoring, and shares best practices that edge teams of all sizes can benefit from.

Dan Jones is the CTO and cofounder at VictorOps, where he supports the company’s goal of making on-call suck less. He is intimately familiar with what it takes to keep a business running when the slightest outage means lost revenue and unhappy customers. With almost 30 years in the software industry, Dan has spent the last 20 years architecting and building scalable 24/7 internet services designed to be “always on.” Previously, Dan was chief architect and vice president of engineering at two successful startups, Raindance Communications and Lijit Networks.

Presentations

The move to event sourcing and CQRS in distributed systems Session

Dan Jones discusses VictorOps's transition to event sourcing and CQRS in distributed systems. Through the use of persistent actors, VictorOps was able to redesign, rebuild, and deploy the entire underlying infrastructure without any noticeable impact to end users.

Nora Jones is a senior chaos engineer at Netflix. Nora is passionate about delivering high-quality software, improving processes, and promoting efficiency within architecture. Occasionally, she pokes holes in distributed systems to make them more resilient.

Presentations

The road to chaos Session

Chaos engineering isn't always the most popular practice among your developers. Nora Jones covers the specifics of designing a chaos engineering solution and explains how to increment your solution technically and culturally, the socialization and evangelism pieces that tend to get overlooked in the process, and how to get developers excited about purposefully injected failure.

Dharmesh Kakadia is a developer and a researcher at Microsoft, where he works on distributed systems. Dharmesh is the author of Apache Mesos Essentials. He is passionate about open source and likes to work at the intersection of data and cloud. He enjoys reading in his free time.

Presentations

Scheduling deep dive for orchestration systems Session

Orchestration systems all have different design trade-offs. Despite best efforts, these choices are not always clear to developers using these systems. Dharmesh Kakadia describes the fundamentals of scheduling and explores the scheduling algorithms implemented by various orchestration systems, highlighting similarities, differences, and the consequences of the design choices for the users.

Peco Karayanev, an APM domain expert, currently serves as senior field applications engineer for Riverbed Technology. Previously, he served in the same role at OPNET (the leader in application and network performance management) prior to its 2012 acquisition by Riverbed. Before that, Peco spent seven years at National Instruments, where he worked most recently in the R&D division, developing a control and provisioning framework (PIE) for several new cloud-based SaaS products, and as a web systems engineer supporting ni.com. Peco has presented at a variety of technology conferences, including Vignette Village, OPNETWORK, Oracle Open World, and LASCON, on a panoply of systems and application performance topics.

Presentations

Ensuring performance in complex architectures: Why integrated visibility is needed (sponsored by Riverbed) Session

If you truly care about end-user experience and need to build highly scalable applications, you must stop treating your users, code, servers, and networks as independent systems. Peco Karayanev discusses a modern integrated visibility approach, where all monitoring shares a common data model that reveals issues previously hidden or misdiagnosed.

Suman Karumuri is the lead for distributed tracing at Pinterest. Previously, he served as the lead for the Zipkin project at Twitter. He is the author of the upcoming book Distributed Tracing from O’Reilly.

Presentations

PinTrace: A distributed tracing pipeline Session

Distributed tracing is an emerging field of monitoring distributed systems. Suman Karumuri shares the challenges of building and deploying distributed tracing at scale using PinTrace, one of the largest distributed tracing pipelines. Drawing on real-world examples, Suman explains how traces can be used to understand, debug, and optimize your production workflows.

Ranjeeth Kathiresan is a senior performance engineer at Salesforce, where he focuses primarily on improving the performance, scalability, and availability of applications by assessing and tuning the server-side components in terms of code, design, configuration, and so on, particularly with Apache HBase. Ranjeeth is an admirer of performance engineering and is especially fond of tuning an application to perform better.

Presentations

Scaling HBase for big data (sponsored by Salesforce) Session

Even though HBase is considered a highly scalable distributed solution, there are cases where the schema design of HBase tables or the way a client uses an HBase cluster may impact the scalability factor of HBase. Ranjeeth Karthik Selvan Kathiresan and Gurpreet Multani outline the most important things to consider when scaling your HBase cluster to accommodate high-volume and high-velocity data.

Michael Kehoe is an engineer at LinkedIn working on architecting and maintaining reliable, scalable large system infrastructure. He possesses high-level skills in maintaining Linux and Windows servers and their respective infrastructure services. Michael’s previous work experience has included building small satellites at NASA and writing thermal environments software at Rio Tinto. He holds a degree from the University of Queensland, where he focused on networks.

Presentations

Traffic shifts: Avoiding disasters at scale Session

LinkedIn conducts regular traffic shifts during peak hours to ensure that it has sufficient capacity to handle extra load during disaster situations. Michael Kehoe and Anil Mallapur discuss how LinkedIn uses traffic shifts to mitigate user impact by migrating live traffic between its data centers and stress test site-wide services for improved capacity handling and member experience.

Ann Kilzer is a site reliability engineer at Indeed. Previously, she worked in backend development and privacy research. Ann holds a master’s degree in computer science from the University of Texas. She enjoys textile arts and trains with the local circus.

Presentations

Canary in a coal mine: Building infrastructure resiliency with canary data reloads Session

Remember the old practice of the canary in the coal mine, where miners used fragile feathered friends as a failure detector for toxic gasses? In software, a canary run is a trial executed on one machine before the rest of the cluster runs. Ann Kilzer explains how Indeed created a canary service leveraging Consul’s key value store to improve the resilience of data reloads in any infrastructure.

Ananth Kini is director of central product management with Oracle, where he and his team help Oracle sales and partners understand and implement PaaS products and extend SaaS with PaaS.

Presentations

DevOps to the cloud: Modern application development (sponsored by Oracle) Tutorial

Ananth Kini explores how to develop apps in the cloud. Ananth walks you through the software development lifecycle (SDLC) for cloud-native projects, touching on simplifying deployment with Agile development, collaborating, and automating DevOps and continuous delivery, all in the cloud.

Matt Klein is a software engineer at Lyft and the architect of Envoy. Matt has been working on operating systems, virtualization, distributed systems, and networking and making systems easy to operate for 15 years across a variety of companies. Some highlights include leading the development of Twitter’s C++ L7 edge proxy and working on high-performance computing and networking in Amazon’s EC2.

Presentations

Lyft's Envoy: Experiences operating a large service mesh Session

Over the past several years, Lyft has migrated from a monolith to a sophisticated service mesh powered by Envoy. Matt Klein explains why Lyft developed Envoy, focusing primarily on the operational agility that the burgeoning service mesh SoA paradigm provides, and shares lessons learned along the way.

Sean Knox is a friendgineer working on Azure Container Services and Kubernetes at Microsoft. Previously, he was part of Deis’s solutions architecture team helping teams make this crazy containers thing work best for them. He lives in San Francisco with his partner and their tuxedo cat.

Presentations

Real-world Kubernetes 2-Day Training

Kubernetes has emerged as the leading platform for containerized applications. Jason Hansen and Sean Knox offer a deep dive into Kubernetes, from concept to implementation, sharing detailed explanations of its architecture, security, and use cases.

TRAINING: Real-world Kubernetes (Day 2) Training Day 2

Kubernetes has emerged as the leading platform for containerized applications. Jason Hansen and Sean Knox offer a deep dive into Kubernetes, from concept to implementation, sharing detailed explanations of its architecture, security, and use cases.

Justin Li is a production engineer at Shopify, where he works on performance, parsers, and distributed systems. To unwind after making the computers go fast, he attempts to make the office karts go fast instead.

Presentations

Standing on the shoulders of giants: Unleashing the power of scriptable load balancers Session

Once reserved for companies large enough to write a load balancer from scratch, load balancer middleware can be a powerful tool for scaling applications. Emil Stolarsky and Justin Li explain how Shopify uses scriptable load balancers to solve difficult infrastructure problems, such as sharding across data centers, handling flash sales, and responding quickly to DDoS attacks.

Bryan Liles is a principal engineer on the cloud engineering team at Capital One. When not helping a huge bank move to the public cloud, he gets to speak at conferences on topics ranging from machine learning to building the next generation of developers. In his free time, Bryan races cars in straight lines and around turns and builds robots and devices.

Presentations

Application tracing tutorial Tutorial

In the past, applications were monolithic, and tracing flows for performance and bottlenecks was straightforward, as there was likely a single code base. In today's world, with multiple processes constituting a single application, tracing becomes more challenging. Bryan Liles offers a hands-on demonstration for implementing tracing in modern applications.

Phillip Liu is the CTO and a founder of SignalFx. Phil has more than 20 years of experience in distributed systems. Previously, he was a software architect at Facebook, where he led development of Facebook’s infrastructure-as-a-service platform and several key, web-scale application management solutions and played a pioneering role in the development of the data center automation category as a distinguished technologist at Hewlett Packard and chief architect at Opsware.

Presentations

Removing engineering friction: Creating an evolutionary culture (sponsored by SignalFx) Keynote

Phillip Liu explores the one thing that has become a driver of ever better engineering: constant removal of friction for engineers to not only build and ship code but also understand how code is used and how it works and operates. The end result is a culture that promotes many possible ways to address given challenges and surfaces novel approaches, which may have never arisen otherwise.

Charity Majors is the cofounder and CTO of Honeycomb, a new startup focused on mining machine data. Previously, Charity ran infrastructure at Parse and was an engineering manager at Facebook. She also worked with the RocksDB team to build and deploy the world’s first Mongo + Rocks in production. Charity likes single malt scotch.

Presentations

Database reliability engineering: What, why, and how? Session

SRE is becoming quite the ubiquitous term, but what about DBRE? Laine Campbell and Charity Majors dive into DBRE, exploring the paths to this craft and how to culturally evolve and support it. Laine and Charity focus on organizational scale, self-service, and force multipliers in recoverability, observability, availability, security, release management, and infrastructure.

Meet the Experts with Laine Campbell and Charity Majors Meet the Experts

Laine and Charity have three little words for you: database reliability engineering. Bring your questions and ideas.

Anil Mallapur is a site reliability engineer at LinkedIn, where he works on automating load testing of LinkedIn data centers using live traffic. Previously, he worked on protocols like InfiniBand and RDMA (remote direct memory access) and built distributed applications. He is interested in building scalable internet applications. Anil holds an MS in computing from the University of Utah.

Presentations

Traffic shifts: Avoiding disasters at scale Session

LinkedIn conducts regular traffic shifts during peak hours to ensure that it has sufficient capacity to handle extra load during disaster situations. Michael Kehoe and Anil Mallapur discuss how LinkedIn uses traffic shifts to mitigate user impact by migrating live traffic between its data centers and stress test site-wide services for improved capacity handling and member experience.

Dianne Marsh is a director of engineering at Netflix, where she leads a team responsible for tools and systems used for continuous integration, delivery, and deployment to the AWS cloud by nearly all engineers in the company—which are often released as open source tools to the broad community. Dianne coauthored Atomic Scala with Bruce Eckel. She holds a master of science degree in computer science from Michigan Technological University.

Presentations

Looking back to move forward Keynote

Our industry moves fast. What does that mean for our teams, for our careers, for companies that we join or create? Beyond committing to a career of continuous learning, what does remaining relevant in tech look like? Dianne Marsh looks back at 30 years as a software professional, addressing how things have changed and how they've stayed the same.

Avan (Avantika) Mathur is the product manager for ElectricFlow. Previously, Avan was the global technical account manager at Electric Cloud, helping large enterprises across the finserv, retail, and embedded industries accelerate their DevOps adoptions. Prior to Electric Cloud, she worked as a Linux kernel developer at IBM. Avan has worked with customers to design complex automation solutions and optimize their delivery pipeline, speed up software-driven innovation, and increase Agile throughput. She holds a degree in computer science.

Presentations

CD for DBs: Database deployment strategies Session

Avan Mathur shares strategies for database deployments and rollbacks as well as some patterns and best practices for reliably deploying databases as part of your CD pipeline, safely rolling back database code, ensuring data integrity, and more.

Caitie McCaffrey is a backend brat and distributed systems diva at Twitter. Previously, Caitie spent the majority of her career at 343 Industries, Microsoft Game Studios, and HBO building the large-scale services and systems that power the entertainment industry. Caitie has a degree in computer science from Cornell University and has worked on several video games including Gears of War 2, Gears of War 3, Halo 4, and Halo 5. She maintains a blog at CaitieM.com and frequently discusses technology on Twitter.

Presentations

The verification of a distributed system Session

Testing and verifying distributed systems is critically important. Caitie McCaffrey shares strategies for proving a distributed system is correct, including both formal methods and more practical forms of testing, such as fault injection and property-based testing, ensuring you are confident that your systems are doing the right thing.

Allison Miller works in product management at Google, mitigating risks to Google and end users. Previously, Allison held technical and leadership roles in security, risk analytics, and payments/commerce at Electronic Arts, Tagged.com, PayPal/eBay, and Visa International. Allison is a proven innovator in the security industry and regularly presents research on risk analytics, cybersecurity, and economics. She is known for her expertise in designing and implementing real-time risk prevention and detection systems running at internet scale.

Presentations

Ground truth in cyberspace: How to launch effective defenses built out of AI Session

Automation is critical for effective operations and security ops. In large-scale systems, manual intervention has to be the exception, not the expectation. But how can security be automated, given the complexity involved? Many platforms turn to ML or AI deployed in risk models. Allison Miller discusses data-driven decision tech and explains how ML and automation create better defenses.

Eric Minick is an offering management lead at IBM, where he is responsible for leading the product management team overseeing the UrbanCode suite and related offerings. Previously, Eric was a developer, technical seller, and evangelist at UrbanCode before its acquisition by IBM. Eric is an internationally recognized authority on continuous delivery and DevOps. He has a background in developing delivery tools and has helped dozens of Global 2000 companies implement modern delivery approaches. Eric shares this experience as a speaker, blogger, and author of Application Release and Deployment for Dummies.

Presentations

Transform organizational culture for DevOps success (sponsored by IBM) Session

An organization’s ability to adopt a DevOps approach for software delivery often hinges on a cultural transformation that may be more difficult than technology issues. Eric Minick explains how high-performing organizations have embraced culture change, as well as the impact on organizations that haven’t. If you're thinking about embarking on your own DevOps journey, remember—culture is key.

Arijit Mukherji was the first employee at SignalFx, where he has spent the last four years designing, developing, and managing many aspects of the product. Arijit has focused on the monitoring space for the past 10 years, in a career that has spanned IP telephony, VoIP conferencing, and network virtualization. Previously, he was an original developer on Facebook’s Metric Infrastructure (ODS) team and managed Facebook’s network tools development as well as data visualization for monitoring. He holds a BTech from the Indian Institute of Technology and an MS from UC Davis.

Presentations

Lessons and best practices learned from monitoring next-generation infrastructure (sponsored by SignalFx) Session

Modern infrastructure and DevOps practices are evolving rapidly. These trends pose a new set of monitoring challenges. Arijit Mukherji shares real-world examples demonstrating what these challenges are, some approaches that worked, and metrics system capabilities that helped SignalFx deal with the challenge.

Gurpreet Multani is a principal performance engineer at Salesforce, where he leads initiatives to scale various big data technologies, such as Apache HBase, Apache Solr, and Apache Kafka. He is particularly interested in finding ways to optimize code to reduce bottlenecks, consume fewer resources, and achieve more out of available capacity in the process.

Presentations

Scaling HBase for big data (sponsored by Salesforce) Session

Even though HBase is considered a highly scalable distributed solution, there are cases where the schema design of HBase tables or the way a client uses an HBase cluster may impact the scalability factor of HBase. Ranjeeth Karthik Selvan Kathiresan and Gurpreet Multani outline the most important things to consider when scaling your HBase cluster to accommodate high-volume and high-velocity data.

Rock Mutchler has architected, worked on, and managed many high-profile applications over the last 18 years, everything from 100-user sites that cater to specialized niche needs to sites that deliver a responsive and unique environment to millions of users a day. Rock is on the cutting edge of development operations and brings a unique and multifaceted perspective to maximize the value of every client’s investment. He is frequently called in to fight fires and put teams back on a well-oiled and efficient path for future successes and has worked on some of the most well-known brands in the world, including Activision, Apple, eBay, PayPal, LiveNation, Verizon, Starbucks, the NASA Ames Research Center, Shutterfly, NameCheap, Career Builder, ustream.tv, Facebook, TheRealReal.com, Signatures Network, u2.com, Kiss.com, and 200 other music and fan sites; the list goes on and on, reaching as far back as Friendster. Rock is also widely known and respected as a performance expert in the PHP community. Contributing back to the open source community, Rock ran two San Francisco Bay Area PHP user groups that he built to over 1,000 members in a little over a year.

Presentations

Building resilient systems on AWS 2-Day Training

Rock Mutchler shares best practices for designing and deploying resilient, fault-tolerant systems on AWS and offers deep dives into managed versus unmanaged services, monitoring and observability, high-availability design patterns, fault-tolerant and self-healing systems, disaster recovery and business continuity approaches, and DDoS mitigation.

Jasmin Nakic is lead software engineer on Salesforce’s Frontier Scale performance team, where he helps analyze enterprise customer workloads and simulate large-scale benchmarks. Jasmin focuses on bringing advanced predictive analytics and machine-learning methods to massive amounts of system performance data. Previously, he did database and application development at Teradata, KickFire, KLA-Tencor, and Oracle. Jasmin started his computer science adventure in high school in his small hometown in Bosnia, where he wrote short programs to analyze results from astronomical observations on HP calculators. Later, at the University of Sarajevo, he studied system programming, compilers, advanced architectures, and networks and wrote a thesis on object-oriented approaches to database programming.

Presentations

Predictive system performance data analysis (sponsored by Salesforce) Tutorial

Jasmin Nakic and Samir Pilipovic examine the application of a linear regression predictive model on time series performance data, discussing and evaluating different models to find the optimal choice for a given dataset. All steps will be supported with Python-based scripts so that you can easily implement similar models on your own data.

Sangeeta Narayanan leads the Edge Developer Experience team at Netflix, which focuses on creating solutions that increase development velocity and provide operational insight into system health and behavior. Sangeeta has held various roles in her career in fields such as test engineering, sales engineering, and engineering management. Throughout all those experiences, the common theme has been her passion for simplifying the process of developing and operating software.

Presentations

Lessons learned from operating a serverless-like platform at scale Session

Netflix operates a customizable API that allows the creation of optimized experiences on 1,000+ devices by providing developers a serverless-like platform and experience. Sangeeta Narayanan shares lessons learned operating and scaling the platform over the years and Netflix's approaches to some of the challenges it faced.

Paco Nathan leads the Learning Group at O’Reilly Media. Known as a “player/coach” data scientist, Paco led innovative data teams building ML apps at scale for several years and more recently was evangelist for Apache Spark, Apache Mesos, and Cascading. Paco has expertise in machine learning, distributed systems, functional programming, and cloud computing with 30+ years of tech-industry experience, ranging from Bell Labs to early-stage startups. Paco is an advisor for Amplify Partners and was cited in 2015 as one of the top 30 people in big data and analytics by Innovation Enterprise. He is the author of Just Enough Math, Intro to Apache Spark, and Enterprise Data Workflows with Cascading.

Presentations

Computable content: Notebooks, containers, and data-centric organizational learning

O'Reilly recently launched Oriole, a new learning medium for online tutorials that combines Jupyter notebooks, video timelines, and Docker containers run on a Mesos cluster, based on the pedagogical theory of computable content. Paco Nathan explores the system architecture, shares project experiences, and considers the impact of notebooks for sharing and learning across a data-centric organization.

Dawn Parzych is the director of product and solution marketing at Catchpoint. Dawn enjoys researching, writing, and speaking about trends related to application performance, user perception, and how they impact the digital experience. In her 15+ year career, Dawn has held a wide variety of roles in the application performance space at Instart Logic, F5 Networks, and Gomez.

Presentations

Perception and bias and metrics, oh my! (sponsored by Catchpoint) Keynote

Human perception and biases can influence how metrics are interpreted. While valid metrics can open lines of communication across and within teams, using vanity metrics or data to shame others can be counterproductive. Dawn Parzych explains how you can make a real and lasting impact on your organization by understanding the influence assumptions and biases have and how to present credible data.

Lisa Phillips is vice president of site reliability engineering at Fastly. With 18 years of experience in Internet and Web technologies, with an emphasis on systems and database administration, architecture, engineering, and management, Lisa isn’t afraid of hard problems or scale. She brings extensive experience in the implementation and management of Internet services to ensure the highest levels of system availability and performance globally.

Presentations

Incident Command: The far side of the edge Session

Fastly operates the edge for many large web properties. To deal with emerging threats to its network, Fastly created a process that allows it to respond effectively to incidents: Incident Command, which rapidly coordinates teams during an incident. Maarten Van Horenbeeck and Lisa Phillips take you to the far side of the edge, demonstrating the protocols that work during an incident.

Samir Pilipovic is a senior software engineer on the Performance Engineering team at Salesforce. His primary interest is performance and optimization of large enterprise deployments in the cloud.

Presentations

Predictive system performance data analysis (sponsored by Salesforce) Tutorial

Jasmin Nakic and Samir Pilipovic examine the application of a linear regression predictive model on time series performance data, discussing and evaluating different models to find the optimal choice for a given dataset. All steps will be supported with Python-based scripts so that you can easily implement similar models on your own data.

Tony Pujals is a Docker Captain and the director of cloud engineering at Appcelerator, where he focuses on improving the process of building, deploying, orchestrating, and monitoring containerized microservices. Tony is fanatical about Docker, Go, Node.js, APIs, microservices, serverless computing, distributed systems, and scalable cloud architecture. He is a co-organizer of the Mountain View Docker meetup.

Presentations

Docker production: Orchestration, security, and beyond Tutorial

Starting where previous Docker workshops leave off, Bret Fisher, Laura Frank, and Tony Pujals dive into the new Swarm mode clustering (services), failover, blue-green deployments, monitoring, logging, troubleshooting, and security, covering the latest built-in features and common third-party tools as they walk you through installing them on your own five-node cloud Swarm cluster.

Meet the Experts with Bret Fisher, Laura Frank, and Tony Pujals Meet the Experts

If you want the lowdown on using Docker in production, bring any and all questions to Bret, Laura, and Tony. They've got you covered.

David Radcliffe is a production engineer lead at Shopify. He moonlights on the Ops team for RubyGems.org and is active in the open source community.

Presentations

Genesis: Automating data center management with help from PXE and Chef Session

The flexibility and speed offered by cloud computing solutions have raised the bar for bare metal deployments. Automation is essential to speedy, reliable provisioning and capacity management. David Radcliffe explores the tools Shopify uses, such as Genesis, to automate its data center and empower developers to move quickly and keep up with the times.

Roy Rapoport manages the Insight Engineering organization at Netflix, which writes the powerful telemetry platform, along with the graphics, alerting, and analytics systems built on top of it, that give Netflix complete real-time visibility into its operations and systems, whether in the cloud, on customer devices, or anywhere else Netflix operates. Roy has more than 20 years of experience in technology in a wide variety of disciplines, from Unix systems engineering to network architecture and implementation, software development and software testing, and release management. He has a passion for leading highly effective, highly motivated teams and the technical background and acumen to pitch in and be hands-on. Roy is a master of no trade but has a keen holistic vision for how people and technologies work together most effectively. He codes for fun.

Presentations

From placid planners to passionate pioneers: In pursuit of the next thing Session

When you're a scrappy startup, being nimble, agile, and flexible comes with the territory. But how do you maintain agility when you're a much, much larger company? Hope is not lost. Roy Rapoport shares critical leadership practices—focusing on encouraging failure, growing heretics, and empowering dissent—that will help you maintain a technical and organizational edge.

Meet the Experts with Roy Rapoport Meet the Experts

Roy invites you to chat with him about implementing the ideas he discussed in his presentation, his experience with Netflix's culture, the repercussions to other organizations, and any other non-NDAed area you might be interested in.

Patrick Reynolds leads the Git Infrastructure team at GitHub, providing Git as a service to the rest of GitHub and to users at large. He brings more than 10 years of academic and startup expertise on distributed systems, replication, and application performance management, but he had to learn all the Git stuff on the job. He enjoys sailing, cycling, and teaching kids, including his own, to code.

Presentations

Git as a multisite service Session

GitHub uses Spokes, a custom application-level replication system, to provide redundancy and scalable capacity for the Git service. Originally, Spokes was limited to a single physical site. Patrick Reynolds offers an overview of Spokes and explains how GitHub extended it to span multiple sites, transparently providing read-anywhere, write-anywhere replication for all Git content.

Henry Robinson is a software engineer with experience in open source distributed systems, including Apache ZooKeeper, Apache Flume, and Apache Impala.

Presentations

How to scale a distributed system Session

It seems like everyone is building a distributed system. However, there's no common body of knowledge about how these systems should be built and scaled, beyond what is squirreled away in various academic papers. Henry Robinson shares lessons learned from over eight years spent building distributed systems and outlines a framework for thinking about distributed scaling challenges.

Michael Sage is technical lead for customer engagements at Fugue. He has over 15 years’ experience as a solutions architect and consultant helping teams of all sizes with software delivery and performance management. Previously, Michael worked with industry-leading companies Mercury Interactive, New Relic, BlazeMeter, and Sauce Labs. He lives in San Francisco.

Presentations

Automating cloud infrastructure for CI/CD pipelines (sponsored by Fugue) Session

With the ready availability of cloud services, teams no longer need to invest in expensive testing environments, and no longer need to wait their turn to use them. Michael Sage demonstrates how to spin up and tear down exact clones of production environments using Jenkins 2 multibranch pipelines and Fugue.

Corey Scobie is vice president of Open Platform and product experience at Akamai Technologies, where he leads strategy and delivery of Akamai’s command and control console, Luna Control Center, as well as the portfolio of customer- and developer-facing applications and APIs. Corey leads the Open Platform Initiative, a company-wide effort to simplify and expand access to Akamai’s core technologies, products, and services using a developer-driven, self-service approach. Corey has over 20 years of experience in IT and has been involved in the API and Open Platform revolution for much of the past 15 years. Previously, Corey was chief strategist for appliances and application optimization at IBM and held senior positions with Akana (previously SOA Software), DataPower Technologies, and Orsus Solutions.

Presentations

Internet traffic growth: Why platforms are critical for developers (sponsored by Akamai) Keynote

Using statistics about internet traffic patterns and growth from the past decade as a backdrop, Corey Scobie shares insights as to how and why edge computing clouds are so critical to the success of builders of scalable apps.

Gwen Shapira is a system architect at Confluent, where she helps customers achieve success with their Apache Kafka implementation. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. Gwen currently specializes in building real-time reliable data-processing pipelines using Apache Kafka. Gwen is an Oracle Ace Director, the coauthor of Hadoop Application Architectures, and a frequent presenter at industry conferences. She is also a committer on Apache Kafka and Apache Sqoop. When Gwen isn’t coding or building data pipelines, you can find her pedaling her bike, exploring the roads and trails of California and beyond.

Presentations

When it absolutely, positively has to be there: Reliability guarantees in Kafka Session

Kafka provides the low latency, high throughput, high availability, and scale that financial services firms require. But can it also provide complete reliability? Gwen Shapira and Jeff Holoman walk you through everything that happens to a message, from producer to consumer, and pinpoint all the places where data can be lost if you're not careful.

Adam Shepard is a senior software architect at AudienceScience.

Presentations

Scaling a user delivery network for real-time audience targeting Session

Adam Shepard peels back the covers on a user delivery network—a worldwide distributed data store powering over 80 billion transactions a day at millisecond speed. Join in to learn about eventually consistent data architectures, tiered and hybrid storage layers, and what it takes to manage that much data at scale.

Dharma Shukla is a distinguished engineer at Microsoft and the founder of Azure Cosmos DB, a globally distributed, multitenant database service on Azure. Over his career, Dharma has worked on a range of distributed systems and databases at Microsoft and elsewhere.

Presentations

Lessons learned from building a globally distributed database service from the ground up Keynote

Dharma Shukla explores Azure Cosmos DB, discussing the internals of the system design and the various design trade-offs Azure had to make while building the service. Dharma also shares his experience and lessons learned operating a globally distributed database service worldwide while maintaining comprehensive service level agreements.

Ben Sigelman is the cofounder and CEO of LightStep, where he’s building reliability management for modern systems. An expert in distributed tracing, Ben is the coauthor of the OpenTracing standard, a project within the Linux Foundation’s Cloud Native Computing Foundation (CNCF). Previously, he built Dapper, Google’s production distributed systems tracing infrastructure, and Monarch, Google’s fleet-wide time series collection, storage, analysis, and alerting system. Ben holds a BSc in mathematics and computer science from Brown University.

Presentations

The holy grail of systems analysis: From what to where to why Session

Most sudden latency regressions in a distributed system are throughput or queueing problems. Now that some monitoring technologies can observe a system with full fidelity, we can connect the dots from a high-latency outlier request to the contended resource it’s waiting on. Ben Sigelman explains why this workflow could change the way we understand critical-path latency in distributed systems.

Andy Smith (aka “Termie”) is CTO of Wercker. Previously, he was a cofounder at OpenStack building cloud infrastructure and worked on the platform layer at Google App Engine, the application layer at Jaiku, and the client side at Flock. Andy is an open source Golang and Python programmer. His notable side projects include OAuth and BarCamp.

Presentations

Moving fast with microservices: Building and deploying containerized applications in a cloud-native world (sponsored by Oracle) Session

Micha Hernandez van Leuffen explains how current delivery systems are falling behind and why we need to change the mental model, create new best practices, and treat containers as first-class citizens. Along the way, Micha shares how Wercker implements continuous delivery in combination with Kubernetes.

Ines Sombra is director of engineering at Fastly, where she spends her time helping the web go faster. Ines holds an MS in computology with an emphasis on cheesy ’80s rock ballads. She has a fondness for steak, fernet, and a pug named Gordo. In a previous life, she was a data engineer.

Presentations

Thursday opening welcome Keynote

Program chairs Mary Treseler, James Turnbull, and Ines Sombra open the second day of keynotes.

Wednesday opening welcome Keynote

Program chairs Mary Treseler, James Turnbull, and Ines Sombra open the first day of keynotes.

Daniel “Spoons” Spoonhower is a cofounder at LightStep, a company that makes complex microservice applications more transparent and reliable. An expert in distributed tracing, he is a core contributor to the OpenTracing project, a CNCF project. Previously, Spoons spent almost six years as a staff software engineer on Google’s Infrastructure and Cloud Platform teams, where he ate lots of snacks and worked on various products. Spoons holds a PhD from Carnegie Mellon University in computer science.

Presentations

Distributed tracing and the future of chargeback and capacity planning Session

As software grows more complex, doing chargebacks and capacity planning gets more challenging. Specifically, it becomes more difficult to attribute storage and other low-level requests to high-level services. Daniel Spoonhower shows how the distributed tracing concept of context propagation can be used to overcome this problem, without any maintenance costs.

Phil Stanhope is vice president of technology strategy at Dyn, where he is responsible for the strategy and planning for telemetry services that help unify engineering, infrastructure, architecture, systems engineering, analytics, security, cloud and CDN monitoring, and network operations. Previously, Phil held technology and executive leadership positions at companies like Yottaa and Wimba and was the chief architect at Aspen Technology, CTO at Adesso Systems, director at Perot Systems, founder of Cambridge Object Technologies, and platform architect at Lotus. Phil is a known thought leader and has served on numerous advisory boards and technology adoption programs over the past 25 years. He is a regular speaker at conferences like Velocity, Security, Monitorama, and WebPerfDays. He is the author of Get in the Groove by Wiley, which focuses on building peer-to-peer solutions with the Groove (Microsoft) platform. Phil holds a BS in computer and information science from the University of Massachusetts, Amherst. He and his wife live in Newburyport, MA.

Presentations

Rethink DNS for DevOps: Three ways DNS with intelligent response makes your applications better (sponsored by Oracle + Dyn) Session

For more than 30 years, the DNS has been one of the fundamental protocols of the internet, yet, despite its accepted importance, it has never quite gotten the due it deserves. Andy Smith explains why it's time to rethink DNS and realize the role it can play in building and running high-performance, distributed web applications.

Karl W. Stewart is senior product manager for web experience at SOASTA. Karl is a 25-year product executive, with stints at Oracle, Palm, Verizon Wireless, and Qualcomm, and has served as a product consultant for a number of startups.

Presentations

Uncovering abandoned revenue with massive-scale microfocus testing (sponsored by SOASTA, now a part of Akamai) Session

The next wave of testing is massive-scale microfocus testing, and it is uncovering millions of dollars of abandoned revenue. Karl Stewart explains how digital leaders are using massive-scale microfocus testing to guarantee their success.

Emil Stolarsky is a production engineer at Shopify, where he works on performance, scriptable load balancers, and DNS tooling. When he’s not trying to make Shopify’s global performance heat map green, he’s shivering over a spiked cup of coffee in the great Canadian north.

Presentations

Standing on the shoulders of giants: Unleashing the power of scriptable load balancers Session

Once reserved for companies large enough to write a load balancer from scratch, load balancer middleware can be a powerful tool for scaling applications. Emil Stolarsky and Justin Li explain how Shopify uses scriptable load balancers to solve difficult infrastructure problems, such as sharding across data centers, handling flash sales, and responding quickly to DDoS attacks.

Brad Stoner is a senior engineer at AppDynamics. Over his 14-year career, Brad has held roles in performance engineering, systems engineering, and operations management. Previously, he managed the Load and Performance team at H&R Block in pursuit of improved application performance and quality. He is also the founder of Sandbreak Digital Solutions, a consulting company specializing in web application performance testing, web page optimization, frontend optimization, capacity testing, infrastructure validation, and cloud testing.

Presentations

Bulletproof your CI pipeline: Using APM to augment your automated performance testing (sponsored by AppDynamics) Session

As release velocity increases, teams are finding innovative ways to detect and resolve performance issues earlier in the development cycle. Brad Stoner explores how to implement an automated performance testing strategy and explains how leveraging APM (application performance management) tools can reduce time to market while increasing overall quality.

Mary Treseler is vice president of content strategy at O’Reilly Media, where she leads an editorial team that covers a wide range of topics from DevOps to design, and the chair of O’Reilly’s Velocity Conference. Mary has been working on technical content for 25 years, acquiring and developing content in areas such as programming, software engineering, and product design. A Boston native, Mary lives oceanside in Padanaram, MA.

Presentations

Thursday opening welcome Keynote

Program chairs Mary Treseler, James Turnbull, and Ines Sombra open the second day of keynotes.

Wednesday opening welcome Keynote

Program chairs Mary Treseler, James Turnbull, and Ines Sombra open the first day of keynotes.

James Turnbull is the CTO of Empatico. A longtime member of the open source community, James is the author of nine technical books about open source software: The Terraform Book, The Art of Monitoring, The Logstash Book, The Docker Book, Pro Puppet, Pulling Strings with Puppet, Pro Linux System Administration, Pro Nagios 2.0, and Hardening Linux. He was formerly CTO at Kickstarter and an advisor at Docker. James likes food, wine, books, photography, and cats. He is not overly keen on long walks on the beach or holding hands.

Presentations

Thursday opening welcome Keynote

Program chairs Mary Treseler, James Turnbull, and Ines Sombra open the second day of keynotes.

Wednesday opening welcome Keynote

Program chairs Mary Treseler, James Turnbull, and Ines Sombra open the first day of keynotes.

Lisa van Gelder is senior vice president of technology at Bauer Xcel Media. Lisa has been writing software for over 17 years, ever since she started building websites in 1999, but she soon discovered that backend problems were way more fun. Her career has taken her from small startups to large media organizations between London and New York, including the Guardian newspaper and the BBC. She is mostly powered by coffee.

Presentations

A/B testing sexism: Interviewing as a female executive in tech Session

Lisa van Gelder shares what she learned from an accidental A/B test. Last year, she interviewed for a new executive job at the same time as two (white, male) friends, and they compared notes. Lisa explains how "unqualified" is used to reject marginalized groups in tech and what we can do about it—both as individuals interviewing and as hiring managers looking to improve the interview process.

Maarten Van Horenbeeck is vice president of security engineering at Fastly, a content delivery network that speeds up web properties around the world. He is also a board member and former chairman of the Forum of Incident Response and Security Teams (FIRST), the largest association of security teams, counting 300 members in over 70 countries. Previously, Maarten managed the Threat Intelligence team at Amazon and worked on the Security teams at Google and Microsoft. Maarten holds a master’s degree in information security from Edith Cowan University and a master’s degree in international relations from the Freie Universität Berlin. When not working, he enjoys backpacking, sailing, and collecting first-edition travel literature.

Presentations

Incident Command: The far side of the edge Session

Fastly operates the edge for many large web properties. To deal with emerging threats to its network, Fastly created a process that allows it to respond effectively to incidents: Incident Command, which rapidly coordinates teams during an incident. Maarten Van Horenbeeck and Lisa Phillips take you to the far side of the edge, demonstrating the protocols that work during an incident.

Meet the Experts with Maarten Van Horenbeeck and Lisa Phillips Meet the Experts

Maarten and Lisa are here to talk about infosec in distributed systems, security incident management, and compliance and security in a DevOps environment. Stop by and chat.

Seth Vargo is the director of technical advocacy at HashiCorp. Previously, he worked at Chef (Opscode), CustomInk, and a few Pittsburgh-based startups. He is the author of Learning Chef. Seth is passionate about reducing inequality in technology. When he is not writing, working on open source, teaching, or speaking at conferences, Seth enjoys spending time with his friends and advising nonprofits. He loves all things bacon.

Presentations

Microservices secrets management with Vault Tutorial

It's great that you've moved to microservices, but how are you distributing secrets? Seth Vargo offers an overview of Vault's unique approach to secret management by providing secrets as a service for your services (and humans too), which is highly scalable and easily customizable to fit any environment.

Kathleen Vignos is a full stack engineer turned manager who has led engineering teams at Twitter and Wired. She’s worked at two startups (one of which she founded), traveled the western US for management consulting and professional services, taught business software programming at the university level, won a hackathon, and built dozens of websites. Other experiences include everything from being on call as a COBOL programmer for Y2K to modifying a React app for a hack week project. She holds engineering degrees from UCLA and Michigan.

Presentations

Managing engineering teams through constant change Session

Constant change—caused by high attrition, frequent reorganization, shifting priorities, and management turnover, among other reasons—is the new normal. It takes months to onboard a new team member and get them adding value. Kathleen Vignos offers tips, shortcuts, and stories for stabilizing a team and finding a path to productivity amid the chaos.

Miles Ward is global head of solutions for Google Cloud, where he focuses on delivering next-generation solutions to challenges in big data and analytics, application migration, infrastructure automation, and cost optimization. Miles is a three-time technology startup entrepreneur with a decade of experience building cloud infrastructures. Previously, he was a core part of the Obama for America 2012 “tech” team, crashed Twitter a few times, helped NASA stream the Curiosity Mars Rover landing, and put Skype back online in a pinch. He also plays a mean electric sousaphone.

Presentations

Google Cloud Spanner: Global consistency at scale Session

Google Cloud Spanner, Google's public launch of the internal Spanner service, makes available a new basic primitive for application design: globally consistent transactions. Want to know how it all works? Join Miles Ward for a detailed, demo-filled, nuanced look at the useful applications of Spanner for your workload.

James Wickett is head of research at Signal Sciences, where he works at the intersection of the DevOps and security communities. James is a supporter of the Rugged Software and Rugged DevOps movements. Seeing the gap in software testing, James founded Gauntlt, an open source project, to serve as a Rugged testing framework. He is the author of Hands-on Gauntlt and DevOps Fundamentals on Lynda.com. James got his start in technology when he founded a startup as a student at the University of Oklahoma. He has worked in environments ranging from large, web-scale enterprises to small, rapid-growth startups. He is a dynamic speaker on topics in DevOps, infosec, cloud security, security testing, Rugged DevOps, and serverless. James is the creator and founder of the Lonestar Application Security Conference, the largest annual security conference in Austin, TX. He also runs DevOps Days Austin and is on the global DevOps Days board. James holds several security certifications, including CISSP and GWAPT. In his spare time, he’s trying to learn how to make a perfect BBQ brisket.

Presentations

Serverless security: A pragmatic primer for builders and defenders Session

Serverless is the design pattern for writing applications at scale without the necessity of managing infrastructure. It adds simplicity and a new economic model to cloud computing, but it creates some unique security challenges. James Wickett explores practical security approaches for serverless in four key areas: the software supply chain, the delivery pipeline, data flow, and attack detection.

Dominic Williams is chief scientist of the DFINITY project, headquartered in Zug, Switzerland, and president and CTO of String Labs, a Palo Alto-based studio, incubator, and investor focused on advanced open protocol projects. His recent technical works include DFINITY technologies such as the Threshold Relay/Probabilistic Slot Protocol blockchain consensus mechanisms, the Blockchain Nervous System, and the PHI “crypto fiat” autonomous loan issuance system.

Presentations

Toward an intelligent and infinitely scalable decentralized cloud (sponsored by DFINITY) Session

DFINITY, a new kind of open cloud computing resource, takes the form of a decentralized network that conjures a performant "blockchain computer" with unbounded capacity that will act much like a gigantic shared mainframe for the world. Dominic Williams introduces the project and explores the foundational decentralized computing techniques it makes use of.

Jamie Winsor is a lead engineer at Chef Software and the coauthor of Habitat, an open source project built on the Butterfly distributed systems protocol that gives software developers a self-healing, self-configuring, stack-agnostic, frictionless abstraction for running applications, regardless of their complexity. Jamie has been a software engineer in the video game industry for 10 years, focusing on networked application servers on such titles as League of Legends, Lord of the Rings Online, and Dungeons and Dragons Online. One of Jamie’s responsibilities during his game development tenure was to bring what we know today as DevOps into the daily lives of the other developers on his team, which he accomplished by building, evangelizing, and teaching methods to his peers. He draws on that experience today in building Habitat, helping all software developers, regardless of their experience, bring their ideas to life without investing in the details of operationalizing an application.

Presentations

Building distributed systems is accessible. I promise. Session

Understanding and building distributed systems can be a daunting task, but like most other software development patterns, distributed systems mimic concepts in the real world that you're already familiar with. Jamie Winsor walks you through building a mental model to help you understand the basics of building distributed systems based on concrete, real-world systems.

Martin Woodward is the principal program manager for DevOps at Microsoft, where he focuses on Visual Studio Team Services and Team Foundation Server. Previously, Martin was executive director of the .NET Foundation, helping drive Microsoft’s move to open source, and was responsible for the Java, Linux, and Mac tooling in the Developer division, where he helped introduce Git into Microsoft.

Presentations

In depth: What we learned moving 65,000 Microsofties to DevOps on the public cloud (sponsored by Microsoft) Session

Martin Woodward tells the full story of transforming Microsoft’s internal engineering systems from a collection of in-house tools built up over decades to One Engineering System with a globally distributed 24x7x365 service on the public cloud, utilizing modern techniques and industry-recognized open source technologies.

What we learned moving 65,000 Microsofties to DevOps on the public cloud (sponsored by Microsoft) Keynote

Martin Woodward tells the story of how Microsoft’s internal engineering systems are being transformed from a collection of disparate in-house tools built up over decades to One Engineering System with a globally distributed 24x7x365 service on the public cloud, utilizing modern techniques and industry-recognized open source technologies.

Christine Yen is the cofounder of Honeycomb, a startup with a new approach to observability and debugging systems with data. Christine has built systems and products at companies large and small and likes to have her fingers in as many pies as possible. Previously, she built Parse’s analytics product (and leveraged Facebook’s data systems to expand it) and wrote software at a few now-defunct startups.

Presentations

The problem with preaggregated metrics Session

Preaggregated metrics and time series form the backbone of many monitoring setups. They have many redeeming qualities but simply aren't sufficient for capturing or responding to the many ways things can go wrong in modern or complex systems. Christine Yen outlines the problems inherent in the use and implementation of preaggregated metrics.