This talk is about the unexpected things Dave learned while trying to convince programmers to try Go and how they might translate to the experiences we all have working in an ecosystem of open source projects.
Like Cinderella's "The good in the potty, the bad in the croppy," Sabine Wojcieszak explains why you should take a closer look at your habits and decide which of them will support your DevOps endeavors and which will harm them.
GDPR was likely one of the biggest challenges in data management that occurred in 2018. Yulia Trakhtenberg dives into a one-year retrospective about how it was executed in reality at a large-scale data organization.
The Splice engineering team grew almost 10 times in 18 months. The delivery practices that worked when it was 5 people broke way before it got to 50. Juan Pablo Buritica explains how the engineering team accelerated delivery using industry insights and data.
Andy Kwiatkowski takes a deep dive into how Shopify saved a million dollars a year in infrastructure costs by rolling its own autoscaler.
Sweden's Television manages online products that range from providing news to TV series and are used by millions of people. To make sure that it creates content that engages, entertains, and educates, it started its own platform for collecting and analyzing user data. Ismail Elouafiq highlights the architectural choices the company made and the lessons it learned in building its data ecosystem.
Psychological safety is one of the leading indicators of a high-performing team. Yet, Lena Reinhard explains, forging deep human relationships and building trust can be difficult when your team is distributed or largely interacts on screens.
If you're interested in learning a framework of reference to enable continuous deployment to Kubernetes for business-critical production applications, join in. Priyanka Sharma bridges the gap between how to make large-scale migrations of production applications and the nitty-gritty details that engineering managers and leaders need to consider.
Ho-Ming Li outlines how to use chaos engineering to accelerate your understanding of how your network can break (packet loss, black hole attacks, latency injection, and packet corruption) and impact your services.
While it’s great to think big, it's important to start small and sensibly. David Jungwirth explains how Enterprise Studio by HCL Technologies helped an enterprise achieve a deployment time reduction of 99%, double its releases, and massively reduce its overhead costs for each release with a few small improvements over a period of two and a half years.
Software is eating the world, and security will be eaten as well if it doesn't evolve. Kelly Shortridge exposes why chaos and resilience engineering represents the future of security programs—and why it catalyzes the dawn of defensive innovation. You'll examine how adopting distributed, immutable, and ephemeral infrastructure (the "DIE" triad) can create powerful security benefits.
Molly Struve gives you the tools and strategies you need to build a monitoring system that will scale with your team and your infrastructure.
We never change the amount of work or technical debt; we just shift it, and with it, we change how it emerges and appears. Heidi Waterhouse explains how you can handle this level of uncertainty.
Robin Marx (University of Hasselt, Expertise Centre for Digital Media EDM)
Deploying HTTP/2 correctly can be challenging in practice, and HTTP/3 will make things even more difficult as the underlying QUIC protocol runs over User Datagram Protocol (UDP). Robin Marx explores practical proxying, caching, load balancing, and routing issues and how to overcome them.
GitOps is the practice of continuous delivery using Git repos as the single source of truth, managing infrastructure and applications in an immutable and declarative manner. Michael Hausenblas motivates the model and shows it in action, using Kubernetes and a number of tools.
Michael Hobbs takes a look at how best to ensure your service owners can succeed with responsibilities and concerns that were traditionally the domain of ops teams prior to the deployment of Kubernetes for production load within a business.
Out-of-the-box Kubernetes makes it easy to deploy and scale your applications within one Kubernetes cluster in one single region. But it's also possible to deploy an application over multiple clusters in different regions, so it becomes truly highly available even if a complete region fails. Learn how to deploy one application across multiple federated Kubernetes clusters with Bastian Hofmann.
The reliability of cloud services tends to operate in the perpetual present tense, focused more on maintaining systems right now than on preparing for a far future. Ingrid Burrington explores how reframing the time scales of computation can change and maybe improve the way you build infrastructure.
Gilles Dubuc takes a deep dive into how Wikipedia interprets large amounts of real user performance data and the many pitfalls you can fall into when doing so.
Knative is a Kubernetes-based platform to build, deploy, and manage modern serverless workloads. It provides a set of middleware components that are essential to build modern, source-centric, and container-based applications that can run anywhere. Join Nikhil Barthwal to explore using Knative to build and deploy modern serverless workloads in a vendor-neutral fashion.
Karthik Gaekwad explores why the Oracle Container Engine for Kubernetes (OKE) is one of Oracle Cloud's most popular platforms. You'll hear the good and some of the ugly lessons learned along the way about how to manage, operate, and scale Kubernetes at cloud-provider scale.
Laurent Bernaille examines the lessons he learned operating large Kubernetes clusters.
Rob Skillington and Łukasz Szczęsny explore scaling monitoring, alerting, and configurational complexity for a single view of your applications, databases, infrastructure, and operations across all regions using M3 and Prometheus.
Julia Biro explains technical solutions and insights from building a true multiregion active-active file service using Lambda@Edge and S3 (with buckets in multiple AWS regions).
Once restricted to companies like Netflix, chaos engineering is becoming a common practice in organizations of all sizes. Paul Osman outlines techniques Under Armour uses to measure service health with chaos engineering. He details its operational maturity model and how the company uses it to blamelessly identify teams that need additional help and action items to improve resiliency and happiness.
By empowering you to ask new questions of your software, observability fuels curiosity about the world as it is, not how you expect it to be. In the end, after all, Christine Yen explains, "Nines don't matter if your users aren't happy."
Operating cloud native infrastructure is more than just spinning up a container orchestrator. Auxiliary services are required in order to operate effectively and provide developers with a true platform experience. Josh Michielsen explores how Condé Nast operates multiple Kubernetes clusters across the world, with a focus on observability, testing, app delivery, and developer experience.
In the world of software development or technology in general, performance often gets overlooked or is looked at late. Daniel Drozdzewski examines the philosophical aspects and benefits of keeping performance at the forefront of your mind.
Dmitrii Dolgov takes a deep dive into how to troubleshoot intricate performance issues in PostgreSQL using tools such as strace, perf, and extended Berkeley Packet Filter (eBPF), and encourages you to stay curious.
Time and money are generally the resources we focus on when building applications. Yet we can’t buy trust; it builds slowly and can be broken quickly when we don’t factor it into our development process. Jennifer Davis examines how to leverage security practices to enable an all-team approach to security.
For years, Janna Brummel and Robin van Zijll have been told no to any external hosting. They've always lost time by not being able to use open source and cloud native products without adjustments. All because they work for a bank. Things are changing now: Janna and Robin are proving it's possible to run APIs in a secure container platform in the public cloud.
Ana Oprea examines SRE and security best practices for designing, operating, and scaling dependable infrastructure.
In software development, test-driven development (TDD) is the process of writing tests and then developing functionality to pass the tests. Rosemary Wang explores methods of adapting and applying TDD to configuring and deploying infrastructure as code.
Even software written in a high-level, cross-platform language with no assembly can fail in multiple ways when ported to a different CPU architecture. Ignat Korchagin examines the issues Cloudflare encountered when porting its software stack to ARM64.
Build pipelines are commonly used in the industry to build and roll out changes to cloud accounts. Typically, wide permissions are granted to those systems, making them an interesting attack vector. Take a look with Andreas Sieferlinger at typical vulnerabilities, examine the case of the confused deputy (a trusted third party), and learn how these vulnerabilities can be mitigated in real life.
Regardless of all the technical benefits that Kubernetes brings, team interactions are still key for successfully delivering and running services. Manuel Pais explores how team design affects the success of Kubernetes adoption.
Switching databases requires a lot of effort from engineering teams, and Christian Grabowski walks you through steps you can take to reduce the amount of work needed to achieve payoff. NS1 created an abstraction layer of wire protocols between old and new databases, which allowed it to develop advanced functionality in new services, while legacy services required minimal changes.
Jenn Strater walks you through the best practices she's learned since transitioning from a software engineer at various product companies to working for a company that focuses on build automation.
This talk shows how good abstractions make it possible to identify and apply solutions to seemingly unrelated problems from different disciplines to build better systems with less effort.
Building and maintaining distributed systems is hard. Industry tools and recommended practices are evolving at an ever-increasing velocity. New platform choices reduce infrastructure management but add operational complexity, obscuring the value of operations skills. Often, bureaucratic decisions drive practices and tool choices.
Jonathan Johnson introduces you to Kubernetes for software engineers through concepts and hands-on tutorials using KataCoda.com/javajon.
Open source tools for dashboarding and metrics have seen massive adoption in recent years. Riding the hype, the new, shiny tools are inevitably confronted with overblown expectations and problematic usage patterns, causing frustration and criticism. Björn Rabenstein outlines how to use dashboards and metrics effectively rather than condemning them altogether.