Presented by O'Reilly and Cloudera
Make Data Work

Tricks of the trade to be an Apache Spark rock star

This will be presented in English.

Ted Malaska (Capital One)
14:00–14:40 Friday, 2017-07-14
Spark & beyond, Presented in English
Location: Function Room 2 Level: Intermediate

Prerequisite Knowledge

A working knowledge of Spark

What you'll learn

Learn how to build and run unit tests for Spark

Description

It’s one thing to write an Apache Spark application that gets you to an answer. It’s another thing to know you used all the tricks in the book to make it run as fast as possible. Ted Malaska shares some of those tricks.

Join Ted to discover patterns and approaches that may not be apparent at first glance but that can be game-changing when applied to your use cases. You’ll learn about nested types, multithreading, skew, reducing, Cartesian joins, and other fun stuff.

Ted Malaska

Capital One

Ted Malaska is a director of enterprise architecture at Capital One. Previously, he was the director of engineering in the Global Insight Department at Blizzard; a principal solutions architect at Cloudera, helping clients find success with the Hadoop ecosystem; and a lead architect at the Financial Industry Regulatory Authority (FINRA). He has contributed code to Apache Flume, Apache Avro, Apache YARN, Apache HDFS, Apache Spark, Apache Sqoop, and many more. Ted is a coauthor of Hadoop Application Architectures, a frequent conference speaker, and a frequent blogger on data architectures.


