Build & maintain complex distributed systems
17–18 October 2017: Training
18–20 October 2017: Tutorials & Conference
London, UK

SRE classroom: A hands-on tutorial

Salim Virji (Google)
13:30–17:00 Wednesday, 18 October 2017
Systems Engineering
Location: King's Suite - Sandringham
Level: Intermediate
Average rating: 4.40 (5 ratings)

Who is this presentation for?

  • Site reliability engineers, system engineers, system administrators, and engineering managers who want to better understand quantitative evaluation of microservices-based projects

Prerequisite knowledge

  • Experience using a text editor
  • Familiarity with the Go programming language (writing and running a "Hello, World" program) and Google Compute Engine

Materials or downloads needed in advance

  • A laptop
  • A GitHub account

What you'll learn

  • How to evaluate and build microservices-based systems
  • The key SRE principles of failure tolerance, capacity planning, and load balancing


Salim Virji explores the key concepts behind microservices before guiding you through applying the concepts to evaluate and build systems of your own.

Topics include:

  • Consensus in distributed systems
  • Request routing and load balancing
  • Capacity planning
  • Failure tolerance

Salim Virji


Salim Virji is a site reliability engineer at Google working on user-facing applications such as Drive and Spreadsheets. Salim’s experience includes planet-scale storage, low-latency distributed applications, and his favorite, distributed consensus.

Comments on this page are now closed.


30/10/2017 10:50 GMT

It was very good to have real Google SREs available to speak with. In contrast, a lot of the time SRE translates to “ops work” or “toil”, which are well understood, so more of that sort of thing is not helpful.