What are the essential components of a data platform? John Akred, Stephen O’Sullivan, and Gary Dusbabek explain how the various parts of the Hadoop, Spark, and big data ecosystems fit together in production to create a data platform supporting batch, interactive, and real-time analytical workloads.
By tracing the flow of data from source to output, John, Stephen, and Gary explore the options and considerations for components, including:
Other topics include:
With over 15 years in advanced analytical applications and architecture, John Akred is dedicated to helping organizations become more data driven. As CTO of Silicon Valley Data Science, John combines deep expertise in analytics and data science with business acumen and dynamic engineering leadership.
A leading expert on big data architectures, Stephen O’Sullivan has 25 years of experience creating scalable, high-availability data and applications solutions. A veteran of Silicon Valley Data Science, @WalmartLabs, Sun, and Yahoo. Stephen is an independent adviser to enterprises on all things data..
An Apache Cassandra committer and PMC member, Gary Dusbabek specializes in building distributed systems. His recent experience includes creating an open source high-volume metrics processing pipeline and building out several geographically distributed API services in the cloud.
Comments on this page are now closed.
©2016, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • email@example.com
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.