Build resilient systems at scale
May 27–29, 2015 • Santa Clara, CA

Lessons learned for large-scale apps running in a hybrid cloud environment – Intuit’s journey

Dana Quinn (Intuit)
9:55am–10:05am Thursday, 05/28/2015
Location: Mission City Ballroom
Average rating: 3.31 (45 ratings)

Although Intuit is still on its journey to the public cloud, there’s enough road behind us to learn from what’s in the rear-view mirror. We’ll share three major lessons we’ve learned as we adopted a hybrid cloud environment, with some workloads shifting to the public cloud while others stayed on our internal servers.

Lesson 1: What workloads to shift to the public cloud?
We started our journey assuming there would be very little we could host in the cloud. Intuit fiercely guards its customers’ sensitive financial data, and cloud security was a big concern.

As we dug in, we realized we had many more workloads that could run in the public cloud than we expected. Not every workload touched sensitive data, and cloud security was sufficient in some cases. And the number of cloud-capable workloads keeps increasing as we solve encryption challenges.

I’ll give two examples that illustrate how to identify which of your services have security needs the public cloud can meet and stand to benefit from cloud advantages.

Lesson 2: Hybrid toolset or cloud native toolset?
Standard hybrid cloud toolsets offer the benefit of using the same tools across both internal and public cloud environments. But is this ease worth what you give up? A cloud-native toolset offers best-of-breed management tools for cloud environments. We chose to use cloud-native tools because we’re able to take advantage of new capabilities and tools as soon as our cloud provider releases them. Another benefit of cloud-native tools is that people you hire will have far more experience with native tools than with hybrid tools. I’ll give examples of how our use of cloud-native tools has given us the capabilities we need.

Lesson 3: Bumps along the way
Watch your spending! You want to give engineering teams the ability to get up and running quickly, so you let them launch infrastructure themselves. And so it’s not surprising that you will quickly have many more servers running than before. Scaling up to run a large load test is great, but someone needs to remember to shut those servers off when they’re no longer needed!
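One way to make "someone needs to remember" less dependent on memory is to have every launch declare how long the servers are expected to live, and then report anything that has outlived that window. The sketch below is only an illustration under assumptions: the `Instance` record, its `ttl_hours` field, and the instance names are hypothetical, and in practice the inventory would come from your cloud provider's API rather than a hand-built list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Instance:
    """Hypothetical inventory record; not any provider's real data model."""
    instance_id: str
    owner: str
    launched_at: datetime
    ttl_hours: float  # assumed: lifetime the team declared at launch


def expired_instances(instances, now):
    """Return the instances whose declared lifetime has elapsed as of `now`."""
    return [
        inst for inst in instances
        if now >= inst.launched_at + timedelta(hours=inst.ttl_hours)
    ]


if __name__ == "__main__":
    now = datetime(2015, 5, 28, 12, 0, tzinfo=timezone.utc)
    fleet = [
        # Load-test server launched two days ago with an 8-hour lifetime.
        Instance("loadtest-1", "team-a", now - timedelta(days=2), ttl_hours=8),
        # Long-lived web server, still well inside its declared lifetime.
        Instance("web-1", "team-b", now - timedelta(days=2), ttl_hours=24 * 30),
    ]
    for inst in expired_instances(fleet, now):
        print(f"{inst.instance_id} (owner: {inst.owner}) is past its declared lifetime")
```

Reporting to the owner rather than shutting servers off automatically is a deliberate choice in this sketch; whether to terminate automatically is a policy decision for each team.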

Don’t treat your cloud like your data center: doing things by hand doesn’t work in a cloud environment! Automate, automate, automate! You need to develop new patterns for key activities.

You have all these new automation capabilities, so you need to practice them! Recycle all your instances every month, whether they need it or not. If you don’t, can you trust your automation when you desperately need it? This means you may want to track new metrics, like the average age of your instances, to gauge the maturity of your automation capabilities.
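As a rough illustration of the instance-age metric, the sketch below computes the average age of a fleet from launch timestamps and lists the instances that are older than the recycling window. The instance names, the shape of the `launch_times` mapping, and the 30-day reading of "every month" are assumptions made for the example; real launch times would come from your cloud provider's inventory.

```python
from datetime import datetime, timedelta, timezone

SECONDS_PER_DAY = 86400


def instance_ages_days(launch_times, now):
    """Map each instance ID to its age in days as of `now`."""
    return {
        instance_id: (now - launched).total_seconds() / SECONDS_PER_DAY
        for instance_id, launched in launch_times.items()
    }


def average_age_days(launch_times, now):
    """Mean instance age in days; 0.0 for an empty fleet."""
    ages = instance_ages_days(launch_times, now)
    if not ages:
        return 0.0
    return sum(ages.values()) / len(ages)


def overdue_for_recycle(launch_times, now, max_age_days=30):
    """Sorted IDs of instances older than the recycling window (assumed 30 days)."""
    ages = instance_ages_days(launch_times, now)
    return sorted(iid for iid, age in ages.items() if age > max_age_days)


if __name__ == "__main__":
    now = datetime(2015, 5, 28, tzinfo=timezone.utc)
    # Hypothetical fleet: instance ID -> launch time.
    fleet = {
        "web-1": now - timedelta(days=10),
        "web-2": now - timedelta(days=20),
        "batch-1": now - timedelta(days=60),
    }
    print(average_age_days(fleet, now))      # 30.0
    print(overdue_for_recycle(fleet, now))   # ['batch-1']
```

An average that keeps creeping upward, or a list of overdue instances that never shrinks, suggests the recycling automation is not being exercised as often as intended.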

You probably have more workloads that you can shift to the public cloud than you realize. We recommend using cloud-native toolsets so you can access all your provider’s tools and attract people with the right experience. Lastly, you will have bumps along the way. Learn from them and keep going!

This keynote is sponsored by Intuit

Dana Quinn


Dana has 15 years of experience working in large-scale, mission-critical web environments. Dana has been with Intuit for three years as director of App Ops for the CTO Dev organization, leading App Ops teams for a key set of online platforms used by Intuit’s products. Prior to working at Intuit, Dana was senior director of Cloud Service Engineering at Yahoo! and senior Unix systems administrator at PlanetOut. Dana loves living in Oakland, California with his family and enjoys being schooled in Minecraft by his 8-year-old son.