Presented By O'Reilly and Cloudera
December 5–6, 2016: Training
December 6–8, 2016: Tutorials & Conference
Singapore

Storage designs done right equal faster processing and access

Ted Malaska (Blizzard Entertainment)
4:15pm–4:55pm Wednesday, December 7, 2016
Hadoop use cases
Location: 321/322 Level: Intermediate
Average rating: 5.00 (2 ratings)

Prerequisite Knowledge

  • Familiarity with data modeling

What you'll learn

  • Explore a set of optimal storage design patterns and learn why each one delivers the performance and speed it does

Description

Recent advances in distributed processing engines, from Spark and Impala to Spark Streaming and Storm, have been exciting. However, if your design focuses only on the processing layer for speed and power, you may be missing half the story and leaving a significant amount of optimization untapped. Ted Malaska looks down the stack and describes a set of storage design patterns and schemas implemented on HBase, Kudu, Kafka, Solr, HDFS, and S3. Carefully tailoring how data is stored for each use case can reduce processing and access times by two to three orders of magnitude.

Ted Malaska

Blizzard Entertainment

Ted Malaska is a group technical architect on the Battle.net team at Blizzard, helping support great titles like World of Warcraft, Overwatch, and Hearthstone. Previously, Ted was a principal solutions architect at Cloudera, where he helped clients find success with the Hadoop ecosystem, and a lead architect at the Financial Industry Regulatory Authority (FINRA). He has also contributed code to Apache Flume, Apache Avro, Apache YARN, Apache HDFS, Apache Spark, Apache Sqoop, and many more. Ted is a coauthor of Hadoop Application Architectures, a frequent conference speaker, and a frequent blogger on data architectures.

Comments

01/12/2017 9:33pm +08

Hi Ted,
I know this is a little late but are you able to share the slides to this talk?
Best regards,
Anton