Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Foundations for successful data projects

Jonathan Seidman (Cloudera), Ted Malaska (Capital One)
9:00am-12:30pm Tuesday, March 26, 2019
Average rating: 4.00 (6 ratings)

Who is this presentation for?

  • Technical leads, architects, managers, CTOs, CDOs, CIOs, and developers working on data projects



Prerequisite knowledge

  • Familiarity with data management concepts and systems such as relational databases
  • Experience building large software projects and knowledge of newer data management systems such as Hadoop or Cassandra (useful but not required)

What you'll learn

  • Learn best practices for delivering successful data projects


Most organizations have developed processes and practices for data management and for developing large software projects. While many of these processes and practices are still relevant and valuable, the dramatic growth in the volume and variety of data, along with new tools to manage this data, has left these same organizations struggling to adapt to the new landscape. The challenges include understanding how to evaluate new data management systems, how to staff projects to ensure success, and how to evaluate and manage risks when working with these new data management systems.

Jonathan Seidman and Ted Malaska share guidelines and practices to provide a path through the process of developing data projects, from planning to implementation. You’ll leave with insights on managing and delivering your own successful data projects based on Jonathan’s and Ted’s years of experience working with multiple companies and customers.

Topics include:

  • Starting the planning process by understanding the key data project types
  • Selecting data management software in the new enterprise data space
  • Managing project risk, including technology risk, team risk, and requirements risk
  • Ensuring integrity of data throughout your entire data pipeline
  • Ensuring the integrity of data through effective data governance and data management

Jonathan Seidman


Jonathan Seidman is a software engineer on the cloud team at Cloudera. Previously, he was a lead engineer on the big data team at Orbitz, helping to build out the Hadoop clusters supporting the data storage and analysis needs of one of the most heavily trafficked sites on the internet. Jonathan is a cofounder of the Chicago Hadoop User Group and the Chicago Big Data Meetup and a frequent speaker on Hadoop and big data at industry conferences such as Hadoop World, Strata, and OSCON. Jonathan is the coauthor of Hadoop Application Architectures from O’Reilly.


Ted Malaska

Capital One

Ted Malaska is a director of enterprise architecture at Capital One. Previously, he was the director of engineering in the Global Insight Department at Blizzard; principal solutions architect at Cloudera, helping clients find success with the Hadoop ecosystem; and a lead architect at the Financial Industry Regulatory Authority (FINRA). He has contributed code to Apache Flume, Apache Avro, Apache YARN, Apache HDFS, Apache Spark, Apache Sqoop, and many more. Ted is a coauthor of Hadoop Application Architectures, a frequent speaker at many conferences, and a frequent blogger on data architectures.

Comments on this page are now closed.


Jonathan Seidman | SOFTWARE ENGINEER
03/21/2019 11:02am PDT

This tutorial will be presentation only. There are no prerequisites, such as downloading software.

daniella birch | WEB DEVELOPER
03/21/2019 5:48am PDT

are there any technology requirements to follow along with this tutorial?