Brian Bloechle demonstrates how to implement typical data science workflows using Apache Spark. You’ll learn how to wrangle and explore data using Spark SQL DataFrames and how to build, evaluate, and tune machine learning models using Spark MLlib.
Outline
Demonstrations and exercises will be conducted in Python using Cloudera Data Science Workbench.
Brian Bloechle is an industrial mathematician and data scientist as well as a technical instructor at Cloudera.
Glynn Durham is a senior instructor at Cloudera. Previously, he worked for Oracle, Forté Software, MySQL, and Cloudera, spending five or more years at each.
For exhibition and sponsorship opportunities, email strataconf@oreilly.com
For information on trade opportunities with O'Reilly conferences, email partners@oreilly.com
View a complete list of Strata Data Conference contacts
©2018, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com