How do you test to ensure the stability of a production platform that currently supports 23 million customers, 19 million brokerage accounts, over 400,000 trades per day, and assets close to 5 trillion dollars? After the flash crash and other market issues, we were asked to figure it out, and soon. To protect our customers, we needed to test at larger-than-production scale.
Even without the scale and the stakes, performance testing can be a daunting task when the data and the infrastructure do not match production. This is especially true in a world-class brokerage system that has evolved over 30 years and mixes the newest technologies with massive legacy back ends. We knew building a duplicate environment for end-to-end production scale testing was prohibitively expensive, but we also knew there had to be a way.
Was it somehow possible to run production peak, end-to-end performance tests using our current platforms? Upon review, we quickly realized that the systems that were, at the time, dedicated to performance testing would not scale to the level we needed, and there was no way to get the results we required. We needed production. The answer was clear. So, we began to explore.
We looked at our DR (disaster recovery) systems and realized that, coupled with some production front ends, we just might be able to build “production” from a bunch of parts. The more we dug, the better it looked. We found our DR mainframe to be the ideal back-end target: it is constantly synchronized with production and has all production code, all production data, production-equivalent processing power and storage, and support teams that understand how it all works. We also learned that we could turn up the power and push tests at 2x and 3x the peak volume the market has shown. We would build “production!”
After months of preparation, we ran our first set of back-end-focused tests, using production-scale load from within the firm. While not perfect out of the gate, the tests proved highly successful, leading to continued expansion of the program, which now encompasses driving full-scale market-open testing from the cloud.
Now, test volumes of 1x and 2x production are the norm. We developed synthetic users, synthetic accounts, synthetic instruments, the ability to replay production transactions, and even tested multi-site operations, all at a scale equivalent to the entire production capacity of the firm. Our testing has led to findings that have greatly strengthened the stability and the resiliency of our platforms, assuring the best possible experience for our customers.
So, join us if you want to learn how to break your production systems without ruining your company!
Kyle Parrish has worked across many industries and many roles throughout his career, and is currently a performance architect at a global financial services and brokerage firm, working to assure that every customer’s experience is exceptional. He started out running systems and teams in electronic tax filing, migrated into academic and research support, and then ventured into consulting. Now he finds himself driving new test data and process strategies across business units and environments in the financial services industry. When not finding new ways to do old things, he can be found outside with his wife and kids in North Carolina.
Dave Halsey is the VP of Performance Engineering for Fidelity Institutional, a division of Fidelity Investments servicing financial intermediary firms. Dave has over 15 years of experience building and leading Performance Engineering organizations specializing in the financial services industry.
©2015, O'Reilly Media, Inc.