Presented By O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Spark camp: Apache Spark 2.0 for analytics and text mining with Spark ML

Jacob D Parr (JParr Productions)
9:00am–5:00pm Tuesday, March 6, 2018

What you'll learn

  • Explore Apache Spark 2.0 core concepts with a focus on Spark's machine learning library

Description

Jacob Parr introduces you to Apache Spark 2.0 core concepts with a focus on Spark’s machine learning library, using text mining on real-world data as the primary end-to-end use case. Join Jacob to explore and wrangle data using Spark’s Dataset and DataFrame abstractions. You’ll use the Spark ML API to build an ML pipeline that transforms free text into useful features via Spark ML’s Transformer abstraction (e.g., one-hot encoding and term frequency counting). You’ll also learn about model selection, training and fitting, validation and inspection, and parameter tuning with grid search.
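To give a feel for the workflow described above, here is a minimal sketch in Scala of a Spark ML text pipeline with grid search. It is not taken from the course materials. The toy sentences, the column names, the object name TextMiningSketch, and the choice of logistic regression as the model are all assumptions made for illustration. The sketch covers term frequency counting only and leaves out one-hot encoding.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}
import org.apache.spark.sql.SparkSession

// Hypothetical example object; names and data are illustrative assumptions.
object TextMiningSketch {
  // Transformer: split free text into lowercase words.
  val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")

  // Transformer: hash words into a fixed-size term frequency vector.
  val hashingTF = new HashingTF()
    .setInputCol(tokenizer.getOutputCol)
    .setOutputCol("features")

  // Estimator: the model choice here is an assumption, not the course's.
  val lr = new LogisticRegression().setMaxIter(10)

  // Chain the stages so they are fitted and applied as a single unit.
  val pipeline = new Pipeline().setStages(Array(tokenizer, hashingTF, lr))

  // Grid search: 3 feature sizes x 2 regularization values = 6 candidates.
  val paramGrid = new ParamGridBuilder()
    .addGrid(hashingTF.numFeatures, Array(10, 100, 1000))
    .addGrid(lr.regParam, Array(0.1, 0.01))
    .build()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("TextMiningSketch")
      .master("local[*]")
      .getOrCreate()

    // Toy labeled data: label 1.0 when the text mentions "spark".
    val training = spark.createDataFrame(Seq(
      (0L, "a b c d e spark", 1.0),
      (1L, "b d", 0.0),
      (2L, "spark f g h", 1.0),
      (3L, "hadoop mapreduce", 0.0),
      (4L, "b spark who", 1.0),
      (5L, "g d a y", 0.0),
      (6L, "spark fly", 1.0),
      (7L, "was mapreduce", 0.0),
      (8L, "e spark program", 1.0),
      (9L, "a e c l", 0.0),
      (10L, "spark compile", 1.0),
      (11L, "hadoop software", 0.0)
    )).toDF("id", "text", "label")

    // Cross-validation fits the pipeline once per grid point and fold,
    // then refits the best parameter combination on all training data.
    val cv = new CrossValidator()
      .setEstimator(pipeline)
      .setEvaluator(new BinaryClassificationEvaluator)
      .setEstimatorParamMaps(paramGrid)
      .setNumFolds(2)

    val cvModel = cv.fit(training)

    // Apply the selected model to unlabeled text.
    val test = spark.createDataFrame(Seq(
      (12L, "spark i j k"),
      (13L, "l m n"),
      (14L, "mapreduce spark"),
      (15L, "apache hadoop")
    )).toDF("id", "text")

    cvModel.transform(test)
      .select("id", "text", "probability", "prediction")
      .show(false)

    spark.stop()
  }
}
```

Wrapping the whole pipeline in the cross-validator, rather than only the classifier, lets the grid search tune feature extraction settings such as the hashed vector size alongside the model's own parameters.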

The class will consist of approximately 50% hands-on programming labs in Scala and 50% lecture and discussion.

Jacob D Parr

JParr Productions

Jacob Parr is a Databricks-certified Apache Spark instructor and assists Databricks with the development of its Apache Spark curriculum. Besides teaching Spark, Jacob delivers training to dozens of companies on topics spanning REST, microservices, Selenium, and big data. In his more than 20 years in software development, Jacob has developed applications in telecom, real estate, ecommerce, enterprise taxation, and more. Jacob has been a frequent guest speaker at Spark Summit and continues to speak and teach at events like Strata Data Conference. Happily married with three adult children, Jacob and his wife Angela love to travel around the US or simply cuddle up by a warm fire with their two Boston Terriers. In his spare time, Jacob plays practical jokes, flies drones, chases his nieces and nephews with an arsenal of Nerf guns, or simply piddles around with his N scale train sets.

Comments

Shrinivas Deshpande | MANAGER
02/06/2018 9:35am PST

Will this course start with fundamentals, or does it expect attendees to know the basic concepts beforehand? How can someone who has not worked with Spark before, but is interested, get the most benefit?