While traditional methods may be adequate for collecting and analyzing uniform data, drawing on multiple structured and unstructured external data sources can be challenging. Joe Caserta explains how one of the largest membership interest groups in the country makes sense of the influx of information from streaming external data sources. The challenge is exciting because, aside from collecting data from its ~40 million members, the group also needs to monitor digital and traditional interactions cohesively to predict and optimize a member’s path to purchase.
Path-to-purchase analytics is at the core of the solution, which aims to segment and individualize potential member interactions online and offline and to increase loyalty among high-value members. Joe outlines the architecture of the ingestion, data lake, data science, and data warehouse components built on AWS and Spark. He discusses how his team designed and implemented a data lake in S3, ETL in Spark, member matching with GraphFrames, and a data warehouse in Redshift to help revolutionize the way this membership interest group uses its data to become an analytics-driven company. You’ll learn how to organize data within the lake to encourage data science experimentation and how to create models that build lasting engagement with your members.
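To make the member-matching step concrete, here is a minimal sketch of the underlying idea: records from different channels that share an identifier are linked, and each connected group of records is treated as one member. The abstract names GraphFrames for this step; the sketch below is a plain-Python stand-in that uses union-find instead of a Spark graph job, and the record ids, field names, and the match_members helper are hypothetical and not taken from the session.

```python
from collections import defaultdict


def match_members(records):
    """Group record ids that share any identifier value, directly or transitively.

    `records` maps a record id to a dict of identifier fields.
    This is an illustrative stand-in for a connected-components graph job.
    """
    parent = {rid: rid for rid in records}

    def find(x):
        # Walk up to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller id as the root so results are deterministic.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a

    first_seen = {}  # (field, value) -> first record id carrying that value
    for rid, identifiers in records.items():
        for field, value in identifiers.items():
            if value is None:
                continue  # a missing identifier never links two records
            token = (field, value)
            if token in first_seen:
                union(rid, first_seen[token])
            else:
                first_seen[token] = rid

    groups = defaultdict(list)
    for rid in records:
        groups[find(rid)].append(rid)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical interactions from web, call-center, and mail channels.
RECORDS = {
    "web-1": {"email": "a@example.com", "phone": None},
    "call-7": {"email": None, "phone": "555-0100"},
    "mail-3": {"email": "a@example.com", "phone": "555-0100"},
    "web-9": {"email": "b@example.com", "phone": None},
}

if __name__ == "__main__":
    for group in match_members(RECORDS):
        print(group)
```

In this sample, the mail record carries both the email seen on the web and the phone number seen by the call center, so all three records resolve to a single member, while the unrelated web record stays on its own. At the scale described here, the same linking logic would more plausibly run as a distributed connected-components computation rather than in a single process.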
This session is sponsored by Caserta Concepts.
Joe Caserta is president of Caserta Concepts, an award-winning New York-based innovation consulting and technology implementation firm specializing in big data analytics, data warehousing, business intelligence solutions, and helping clients maximize data value. A recognized big data strategy consultant, author, and educator, Joe is coauthor of the best-selling book The Data Warehouse ETL Toolkit (Wiley, 2004), a contributor to industry publications, and a frequent keynote speaker and expert panelist at industry conferences and events. He also serves on the advisory boards of financial and technical institutions and is the organizer and host of the Big Data Warehousing Meetup group in NYC.
©2016, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • email@example.com
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.