Many applications in fields like healthcare, genomics, financial services, self-driving technology, telecommunications, ads services, government, and media are being built on what is popularly known as the big data stack. What’s unique about the big data stack is that it’s composed of multiple distributed systems, and almost every application interacts with several of them. For example, an SQL query may interact with Spark for computation, with YARN for resource allocation and scheduling, and with HDFS or S3 for data access and I/O. A streaming application may interact with Kafka, Flink, and HBase. Such distributed applications interact with many components that may be independent or interdependent, which popular literature often describes as “having many moving parts.” Despite the extremely high complexity of the big data stack, enterprises still need to be able to provision for resources, usage, cost, job scheduling, and so on.
Predictive modeling lets you see the future. Accurate predictions depend on having the right monitoring data and the right model. Collecting and storing all the monitoring data from the big data stack in a single place opens up interesting opportunities to apply statistical analysis and learning algorithms to that data. These algorithms can generate insights that, in turn, can be applied manually by the user or automatically.
Shivnath Babu and Alkis Simitsis detail how to build a Magic 8 Ball for the big data stack—a decomposable time series model for optimal cost and resource allocation that offers enterprises a glimpse into their future needs and enables effective and cost-efficient project and operational planning.
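The abstract does not spell out the speakers' model, so the following is only a minimal sketch of what a decomposable time series forecast can look like, under stated assumptions: an additive form (linear trend plus a repeating seasonal component), a hypothetical daily resource-usage metric with a weekly cycle, and made-up numbers. The function names `fit_decomposable` and `forecast` are illustrative and do not come from the talk.

```python
from statistics import mean


def fit_decomposable(series, period):
    """Fit y[t] ~ intercept + slope * t + seasonal[t % period].

    Assumption: an additive model with a linear trend; this is a generic
    textbook decomposition, not the model presented in the talk.
    """
    n = len(series)
    if n < 2 * period:
        raise ValueError("need at least two full periods of data")

    # Ordinary least-squares fit of the linear trend.
    t_mean = (n - 1) / 2
    y_mean = mean(series)
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean

    # Seasonal component: average detrended value at each phase of the cycle.
    detrended = [y - (intercept + slope * t) for t, y in enumerate(series)]
    seasonal = [mean(detrended[p::period]) for p in range(period)]
    return slope, intercept, seasonal


def forecast(model, start, horizon, period):
    """Project trend plus seasonality for time steps start .. start+horizon-1."""
    slope, intercept, seasonal = model
    return [
        intercept + slope * t + seasonal[t % period]
        for t in range(start, start + horizon)
    ]


if __name__ == "__main__":
    # Synthetic example: four weeks of daily memory usage (arbitrary units)
    # with steady growth and a weekly pattern. The values are made up.
    weekly = [3, -1, -2, 0, -2, -1, 3]
    history = [100 + 2 * day + weekly[day % 7] for day in range(28)]
    model = fit_decomposable(history, period=7)
    print(forecast(model, start=28, horizon=7, period=7))
```

Keeping trend and seasonality as separate components is what makes such a model decomposable: each part can be inspected on its own, for example to tell steady growth in demand apart from a recurring weekly peak when planning capacity.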
Shivnath Babu is the CTO at Unravel Data Systems and an adjunct professor of computer science at Duke University. His research focuses on ease of use and manageability of data-intensive systems, automated problem diagnosis, and cluster sizing for applications running on cloud platforms. Shivnath cofounded Unravel to solve the application management challenges that companies face when they adopt systems like Hadoop and Spark. Unravel originated from the Starfish platform built at Duke, which has been downloaded by over 100 companies. Shivnath has won a US National Science Foundation CAREER Award, three IBM Faculty Awards, and an HP Labs Innovation Research Award.
Alkis Simitsis is a chief scientist for cybersecurity analytics at Micro Focus. Alkis has more than 15 years of experience building innovative information and data management solutions in areas like real-time business intelligence, security, massively parallel processing, systems optimization, data warehousing, graph processing, and web services. He holds 26 US patents and has filed over 50 patent applications in the US and worldwide. He’s published more than 100 papers in refereed international journals and conferences (top publications cited 5,000+ times) and frequently serves in various roles on program committees of top-tier international scientific conferences. He’s also an IEEE senior member and a member of the ACM.