Mar 15–18, 2020

Data engineering workshop (Day 2)

Location: Winchester 1/2

Who is this presentation for?

Data engineers, data architects, developers




Day 1

Data ingestion

  • Learn to ingest structured data
  • Learn to ingest semistructured data

Data exploration

  • Explore data with Spark
  • Explore data with Hive
  • Build dashboards

Data batch pipelines

  • Build batch data pipelines
  • Orchestrate data pipelines with Airflow

Processing your data

  • Join structured and unstructured datasets
  • Derive value from your joined datasets

Day 2

Optimizing your cloud platform

  • Autoscaling rules
  • Managing heterogeneous clusters
  • Estimating cost

Fill your data lake with batch and streaming data

  • Learn how to take advantage of batch and streaming datasets

Data mining

  • Learn how to use Spark MLlib for data mining
  • Build a simple recommendation engine

Contest awards

Prerequisite knowledge

  • Experience with object-oriented programming and writing SQL
  • A Gmail account (for sign up and account enablement)
  • A basic understanding of big data, Apache Spark, Apache Hive, Spark SQL, and cloud computing (useful but not required)

What you'll learn

  • How to ingest data, build data pipelines, and deploy analytics and machine learning applications using popular data processing engines such as Apache Spark and Apache Hive

Jorge Villamariona


Jorge Villamariona is a senior technical marketing engineer on the product marketing team at Qubole. Over the years, Jorge has acquired extensive experience in relational databases, business intelligence, big data engines, ETL, and CRM systems. He enjoys complex data challenges and helping customers gain greater insight and value from their existing data.

