Presented By O'Reilly and Cloudera
Make Data Work
September 26–27, 2016: Training
September 27–29, 2016: Tutorials & Conference
New York, NY

Path-to-purchase analytics using a data lake and Spark

Joe Caserta (Caserta Concepts)
2:05pm–2:45pm Thursday, 09/29/2016
Sponsored
Location: 1B 03/04

What you'll learn

  • Learn how one of the largest membership interest groups in the country makes sense of the influx of information from streaming external data sources

Description

    While traditional methods may be adequate for collecting and analyzing uniform data, using multiple structured and unstructured external data sources can be challenging. Joe Caserta explains how one of the largest membership interest groups in the country makes sense of the influx of information from streaming external data sources. What makes this challenge exciting is that, in addition to collecting data from its ~40 million members, the group also needs to monitor digital and traditional interactions cohesively to predict and optimize a member’s path to purchase.

    Path-to-purchase analytics is at the core of the solution: it segments and individualizes potential member interactions on- and offline and increases high-value member loyalty. Joe outlines the architecture of the ingestion, data lake, data science, and data warehouse components built on AWS and Spark. He discusses how his team designed and implemented a data lake in S3, ETL in Spark, member matching with GraphFrames, and a DW in Redshift to help revolutionize the way this membership interest group uses its data to become an analytics-driven company. You’ll learn how to organize data within the lake to encourage data science experimentation and how to create models that build lasting engagement with your members.
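
    The abstract does not say how member matching is done. One common way to frame it is as a connected-components problem, which is an operation GraphFrames provides. The sketch below shows that idea in plain Python with a union-find structure rather than Spark, so it runs anywhere. The record fields (id, email, phone), the sample data, and the function name match_members are all hypothetical and are not taken from the session.

```python
from collections import defaultdict


def match_members(records, keys=("email", "phone")):
    """Cluster record ids that share any identifier value (hypothetical fields)."""
    # Each record starts as its own cluster.
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent pointers to the cluster root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Merge the clusters containing a and b.
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[max(root_a, root_b)] = min(root_a, root_b)

    # Link every record to the first record seen with the same identifier value.
    first_seen = {}
    for r in records:
        for key in keys:
            value = r.get(key)
            if not value:
                continue  # missing identifiers never link records
            token = (key, value)
            if token in first_seen:
                union(first_seen[token], r["id"])
            else:
                first_seen[token] = r["id"]

    # Collect the members of each cluster.
    groups = defaultdict(list)
    for r in records:
        groups[find(r["id"])].append(r["id"])
    return sorted(sorted(ids) for ids in groups.values())


# Illustrative data only: three touchpoints for one person, one for another.
records = [
    {"id": "web-1", "email": "a@example.com", "phone": None},
    {"id": "store-7", "email": None, "phone": "555-0100"},
    {"id": "app-3", "email": "a@example.com", "phone": "555-0100"},
    {"id": "web-9", "email": "b@example.com", "phone": None},
]
clusters = match_members(records)
print(clusters)
```

    Here the web, store, and app records fall into one cluster because the app record shares an email with one and a phone number with the other, while the last record stays on its own. At scale, the same linkage could be expressed as vertices and edges and handed to a distributed connected-components routine.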

    This session is sponsored by Caserta Concepts.


    Joe Caserta

    Caserta Concepts

    Joe Caserta is president of Caserta Concepts, an award-winning New York-based innovation consulting and technology implementation firm specializing in big data analytics, data warehousing, and business intelligence solutions and in helping clients maximize data value. A recognized big data strategy consultant, author, and educator, Joe is coauthor of the best-selling book The Data Warehouse ETL Toolkit (Wiley, 2004), a contributor to industry publications, and a frequent keynote speaker and expert panelist at industry conferences and events. He also serves on the advisory boards of financial and technical institutions and is the organizer and host of the Big Data Warehousing Meetup group in NYC.