Once upon a time, a company’s analytic infrastructure lived in a relational database, and it took an ETL tool to get data into it. Then came the Cambrian explosion of big data technologies: the cloud, with its expectation of elastic scalability; fast data technologies, which brought an expectation that all data be available in near real time; and the machine learning renaissance, with its advanced data-processing techniques. As a result, stakeholders now expect to get the data they need in near real time, along with the tools and capabilities to work with it with minimal friction.
Michael Bevilacqua-Linn shares an architecture for a cloud-based end-to-end data infrastructure that handles everything from classic analytic use cases to real-time operational analysis to modern machine learning techniques in an elastically scalable and secure manner. Michael explains how Comcast is continuing to evolve this architecture within the company, where it collects millions of events per second, stores petabytes of data per day, and makes it all accessible through a variety of tools for a large set of stakeholders.
Michael Bevilacqua-Linn is a software engineer on Facebook’s Canopy team, where he works to scale and expand Facebook’s tracing system and other observability tooling.
Michael has been programming computers ever since he dragged an Apple IIGS that his parents got for opening a bank account into his fifth-grade class to explain loops and variables to a bunch of preteenagers. His current hobbies consist entirely of changing diapers and expounding upon the benefits of footie pajamas to a skeptical 2-year-old.
©2018, O'Reilly Media, Inc.