Enterprises are increasingly comfortable moving traditional workloads to Spark. Despite its popularity, however, Spark remains an esoteric technology within enterprises, and many organizations for which technology is not a core competence are wary of building internally managed applications on Spark, in part owing to the lack of a steady talent pool and a fear of budget overruns. As a result, enterprises still struggle to balance support for advanced technology platforms against the realities of matrix organizations, complex funding channels, and business demands.
Vickye Jain and Raghav Sharma explain how they built a high-performance data processing platform powered by Spark that balances the considerations of extreme performance, speed of development, and cost of maintenance, and how they negotiated the conflicting objectives involved.
Vickye and Raghav also offer an overview of the architecture itself, which consists of several elastic clusters, external orchestrators providing full visibility into jobs, a combination of job servers and traditional Spark applications, and deep integration of technical experts with domain experts for rapid development.
Vickye Jain is a technology manager at ZS Associates, where he jointly runs the big data expertise center. Vickye has extensive experience implementing large-scale big data platforms for Fortune 200 companies in the US. He and his team have implemented very large-scale ETL offloading use cases, data lakes, and high-performance data processing platforms that have had a transformational business impact on commercial, R&D, and operations organizations within life sciences.
Raghav Sharma is a solution delivery manager at ZS Associates, where he specializes in big data platforms, cloud-based analytical solutions, and information architecture. He helps lead the delivery of technology consulting engagements in the big data space for life sciences clients.
©2017, O'Reilly Media, Inc.