Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Database migrations don't have to be painful, but the road will be bumpy

Adrian Lungu (Adobe), Serban Teodorescu (Adobe)
3:50pm-4:30pm Thursday, March 28, 2019
Secondary topics:  Data Platforms

Who is this presentation for?

Engineers, database engineers, data scientists and product managers

Level

Intermediate

Prerequisite knowledge

For the first part, there are no technical prerequisites. For the second part, operating systems knowledge is important.

What you'll learn

The main takeaway is how to extend testing in production to databases. You'll see how the technique was applied to large-scale databases and hear a real story of how having such a safety net in place let the team push the limits of the technology.

Description

“Change is the only constant”… Upgrades are inevitable! Believe it or not, that also applies to your database. There are many drivers for such changes: technology stack updates, new data centres, or even cloud migrations. A database migration is not an easy task. Preparation and execution take time and leave no room for mistakes. And there is no going back! Is that the only way, though?

Inspired by the Green / Blue deployment technique, the Adobe Audience Manager team developed an Active / Passive database migration procedure that was successfully applied twice to upgrade the entire database technology stack. The first migration focused only on software upgrades. For the second upgrade, the team’s confidence was so high that a twist was added: several more changes besides the Cassandra version, including the AWS instance type, operating system, disk settings, memory settings, JVM, and a few more. What could go wrong after all? Well… ALL of them.

For the first part, Adrian Lungu and Serban Teodorescu describe the migration technique and present an extensible database client that makes all the active / passive management possible with just a configuration change. Yes, you heard it right. No code changes, just configurations.
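
To make the idea of configuration-driven active / passive management concrete, here is a minimal sketch. It is not the Adobe client: the class names, the config keys ("active", "passive"), and the in-memory cluster stand-in are all hypothetical, and the sketch assumes that the active cluster serves reads and writes while the passive cluster only receives mirrored writes.

```python
class InMemoryCluster:
    """Hypothetical stand-in for a real cluster session."""

    def __init__(self, name):
        self.name = name
        self.rows = {}

    def write(self, key, value):
        self.rows[key] = value

    def read(self, key):
        return self.rows.get(key)


class MigrationClient:
    """Routes traffic by role; roles come from configuration only (assumed design)."""

    def __init__(self, clusters, config):
        self.clusters = clusters
        self.passive_errors = []
        self.apply_config(config)

    def apply_config(self, config):
        # Swapping roles is a configuration change, not a code change.
        self.active = self.clusters[config["active"]]
        passive_name = config.get("passive")
        self.passive = self.clusters[passive_name] if passive_name else None

    def write(self, key, value):
        # A failure on the active cluster propagates to the caller.
        self.active.write(key, value)
        if self.passive is not None:
            try:
                # A failure on the passive cluster is recorded, not raised.
                self.passive.write(key, value)
            except Exception as exc:
                self.passive_errors.append((key, exc))

    def read(self, key):
        return self.active.read(key)


old, new = InMemoryCluster("old"), InMemoryCluster("new")
client = MigrationClient({"old": old, "new": new},
                         {"active": "old", "passive": "new"})
client.write("user:1", "segment-a")

# Promote the new cluster by changing configuration only.
client.apply_config({"active": "new", "passive": "old"})
result = client.read("user:1")
```

In this sketch, rolling back is the same operation as promoting: apply the previous configuration again. A real client would also need to handle data written before mirroring started, which is outside the scope of this example.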

But a database migration is never a smooth road, especially when your databases are 200-node Cassandra clusters with hundreds of TB of data and downtime is not an option. The presentation goes through a series of tales and lessons learned during the migration of over 500 Cassandra nodes. Most of the incidents are not Cassandra-related but rather generic: hardware, drivers, the operating system, or the JVM. There are a lot of lessons that we learned the hard way, debugging for days and searching for that metric anomaly or that log line that would give us a hint about what went wrong. From these events, we gained valuable information that we’d like to share with any DevOps practitioner or engineer.

Adrian Lungu

Adobe

Adrian Lungu is a Computer Scientist at Adobe working with Audience Manager, a leading solution in the DMP market. Since he joined the team over four years ago, his main focus has been the Cassandra clusters, building a scalable architecture that keeps up with the exponential growth of the product. Adrian holds a degree in Computer Science and Engineering from Politehnica University of Bucharest and a DataStax Certified Apache Cassandra Professional certification.

Serban Teodorescu

Adobe

Serban Teodorescu is an SRE at Adobe, where he is part of a small team that manages 20+ Cassandra clusters for Adobe Audience Manager. Before this he was a Python programmer, and he’s still trying to find out how a developer who preferred SQL databases ended up as an SRE for a Cassandra team. Apart from Cassandra and Python, he’s also interested in automating infrastructure provisioning with Terraform.
