Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Big data at speed

Ted Malaska (Capital One), Mark Grover (Lyft)
2:00pm–2:40pm Thursday, 09/13/2018
Data engineering and architecture
Location: 1A 06/07
Level: Intermediate
Secondary topics: Transportation and Logistics

Who is this presentation for?

  • Big data architects and developers

Prerequisite knowledge

  • A basic understanding of the Hadoop ecosystem and streaming architectures

What you'll learn

  • Best practices for low-latency big data processing

Description

Batch remains big data’s first and most formidable use case. However, the industry's needs are shifting toward speed (i.e., making decisions as quickly as possible). Many details go into building a big data system for speed, from determining an acceptable latency until data becomes accessible and deciding where to store the data to solving multiregion problems, or even knowing just what data you have and where stream processing fits in. Mark Grover and Ted Malaska share challenges, best practices, and lessons learned from doing big data processing and analytics at scale and at speed.

Ted Malaska

Capital One

Ted Malaska is a director of enterprise architecture at Capital One. Previously, he was the director of engineering in the Global Insight Department at Blizzard; principal solutions architect at Cloudera, helping clients find success with the Hadoop ecosystem; and a lead architect at the Financial Industry Regulatory Authority (FINRA). He has contributed code to Apache Flume, Apache Avro, Apache YARN, Apache HDFS, Apache Spark, Apache Sqoop, and many more. Ted is a coauthor of Hadoop Application Architectures, a frequent speaker at many conferences, and a frequent blogger on data architectures.

Mark Grover

Lyft

Mark Grover is a product manager at Lyft. Mark’s a committer on Apache Bigtop, a committer and PPMC member on Apache Spot (incubating), and a committer and PMC member on Apache Sentry. He’s also contributed to a number of open source projects, including Apache Hadoop, Apache Hive, Apache Sqoop, and Apache Flume. He’s a coauthor of Hadoop Application Architectures and wrote a section in Programming Hive. Mark is a sought-after speaker on topics related to big data. He occasionally blogs on topics related to technology.

Comments

Mark Grover | PRODUCT MANAGER
09/13/2018 5:40am EDT

Hi all,
Ted and I are super excited to see you all real soon!
We’ve got a lot of fun and somewhat controversial discussions lined up for you. See you soon!

The slides are posted at go.lyft.com/big-data-at-speed