4–7 Nov 2019

Autoscaling in reality: Lessons learned from adaptively scaling Kubernetes

Andy Kwiatkowski (Shopify)
14:20–15:00 Wednesday, 6 November 2019
Location: Hall A1
Average rating: 4.67 (3 ratings)

Who is this presentation for?

  • SREs and developers who work with large-scale, complex deployments with varying traffic and usage patterns




Cloud providers often offer a checkbox to enable a simple CPU-based autoscaler. However, if your application runs complex deployments on thousands of servers across multiple regions and has to wrestle with the occasional celebrity flash sale, you might need to go further to react more quickly, allow for more complex scaling rules, and create extra fail-safes to prevent capacity shortages.

Andy Kwiatkowski dives into what it took for Shopify to create its own autoscaler, from writing traffic-smoothing algorithms to dealing with regional evacuations and handling noise from a system continuously deployed 50 times a day. He details creating a more useful utilization signal and shares battle-tested ideas for creating a highly fault-tolerant tool you can trust to scale your entire infrastructure.
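
As a rough illustration of the ideas named above, and not Shopify's implementation (which this abstract does not describe), the sketch below pairs the proportional replica rule used by the Kubernetes Horizontal Pod Autoscaler with a simple asymmetric smoothing step that follows traffic rises quickly and decays slowly. The function names, the smoothing weights, the floor of one replica, and the sample values are all assumptions made for illustration.

```python
import math


def desired_replicas(current_replicas: int, current_util: float, target_util: float) -> int:
    """Proportional rule: scale the replica count by observed / target utilization.

    Utilization values are percentages. The floor of one replica is an
    assumption of this sketch.
    """
    return max(1, math.ceil(current_replicas * current_util / target_util))


def smooth(samples, alpha_up=0.8, alpha_down=0.1):
    """Asymmetric exponential smoothing (hypothetical weights).

    A rising sample is weighted heavily so the signal reacts fast to spikes;
    a falling sample is weighted lightly so capacity is released slowly.
    """
    smoothed = samples[0]
    out = [smoothed]
    for x in samples[1:]:
        alpha = alpha_up if x > smoothed else alpha_down
        smoothed = alpha * x + (1 - alpha) * smoothed
        out.append(smoothed)
    return out


if __name__ == "__main__":
    # Made-up utilization samples (percent): a brief spike, then a return to baseline.
    utilization = [50, 100, 50]
    for value in smooth(utilization):
        print(round(value, 2), desired_replicas(10, value, 50))
```

Smoothing the signal before applying the proportional rule is one way to keep a short dip, such as the noise from a deploy, from immediately shrinking capacity while still letting a sudden spike trigger a scale-up.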

Prerequisite knowledge

  • General knowledge of large-scale server architectures

What you'll learn

  • Discover the benefits and drawbacks of automatically scaling your infrastructure
  • Identify if and when you should consider writing your own autoscaler
  • Learn best practices for creating a fault-tolerant system that manages critical infrastructure

Andy Kwiatkowski


Andy Kwiatkowski is a senior production engineer at Shopify, where he helps drive capacity planning, autoscaling, and job infrastructure. Shopify is a hyper-growth company whose infrastructure needs are constantly changing, and Andy works with the world-class team there to keep up with ever-growing demand. Previously, he was an engineering manager at D2L, a leader in cloud-based learning management systems, and he spent 12 years as a developer in the video game industry, working for large companies such as Electronic Arts and Rockstar Games as well as his own development studio.
