July 20–24, 2015
Portland, OR
Paco Nathan

Director, O'Reilly Learning, O'Reilly Media

Website | @pacoid

Paco Nathan is an O’Reilly author (Just Enough Math and Enterprise Data Workflows with Cascading) and a “player/coach” who’s led innovative data teams building large-scale apps. He is director of community evangelism for Apache Spark with Databricks, and an advisor to Amplify Partners. Paco is an expert in machine learning, cluster computing, and enterprise use cases for big data. His interests include Spark, Ag+Data, open data, Mesos, PMML, Cascalog, Scalding, Clojure, Python, Chatbots, and NLP.

Sessions

1:30pm–5:00pm Tuesday, 07/21/2015
Paco Nathan (O'Reilly Media), Haichuan Wang (Huawei), Jacky Li (Huawei technology), Vimal Das Kammath V (Huawei)
This tutorial provides a hands-on introduction to Apache Spark, with coding exercises for Spark apps in Python, Scala, R, and SQL. We will review the Spark core API, show how to build a pipeline with SQL + DataFrames, and survey the broader Spark ecosystem: Tungsten, Streaming, MLlib, and GraphX.
10:40am–11:20am Thursday, 07/23/2015
Data Portland 256
Paco Nathan (O'Reilly Media)
Herein, an open source developer community considers itself _algorithmically_. This project shows how to surface data insights from the developer email forums for just about any Apache open source project. It leverages machine learning and advanced analytics in Apache Spark, but also makes use of Docker containers for standalone NLP services.
4:10pm–5:40pm Thursday, 07/23/2015
Sponsored E 143/144
Paco Nathan (O'Reilly Media), Jacky Li (Huawei technology)
This session provides an introduction to Apache Spark, starting with a brief overview of how and why it evolved. It then covers the Spark core API with examples in Python and Scala, shows how to build a pipeline with SQL + DataFrames, and surveys the broader Spark ecosystem: Tungsten, Streaming, MLlib, GraphX, Packages, etc. It also includes many links to case studies of production use of Spark at scale.