Presented By O'Reilly and Cloudera
Make Data Work
September 26–27, 2016: Training
September 27–29, 2016: Tutorials & Conference
New York, NY

Yellow Pages (Canada): Our journey to speed of thought interactive analytics on top of Hadoop

Richard Langlois (IT Architecture & Strategy)
1:15pm–1:55pm Thursday, 09/29/2016
Location: 1 C04 / 1 C05

What you'll learn

  • Explore an internal use case at Yellow Pages (Canada) that delivers real-time analytics with Tableau, using OLAP on Hadoop and enabled by its stack (HDFS, Parquet, Hive, Impala, and AtScale)
Description

    Using Hadoop and other big data technologies, the YP Analytics application allows advertisers and media and advertising consultants to understand their digital presence and ROI. Richard Langlois explains how Yellow Pages (YP) used this expertise for an internal use case that delivers real-time analytics with Tableau, using OLAP on Hadoop and enabled by its stack (HDFS, Parquet, Hive, Impala, and AtScale).

    Yellow Pages’ first big data analytics use case, the YP Analytics application, uses Hadoop (Cloudera) and other big data technologies to help YP’s 244,000 advertisers understand their digital presence (ranking) and ROI with regard to the products and services they use with YP. With the delivery of YP Analytics, YP realized that its nationwide media and advertising consultants (MACs) needed the same information when meeting the advertisers. The MACs used, and still use, a different sales application called Compass. To ensure that information stays consistent between these two applications, which were built by different teams with different technologies, the YP team created a series of data services that can be used by any consuming application, such as YP Analytics and Compass.

    The success of these applications led YP’s internal teams to ask, “What about us?” For YP Analytics and Compass, all queries were known in advance and always ran in the context of a merchant or an account, which allowed the team to apply multiple optimizations. However, these optimizations did not help internal ad hoc queries in contexts other than a merchant or an account, so YP decided to use OLAP on Hadoop. Richard offers an overview of the stack that has enabled OLAP on Hadoop (with more than 75 billion rows in production). The stack includes HDFS, Parquet, Hive, Impala, and AtScale for incredibly fast, real-time analytics and data exploration through Tableau, the tool chosen by YP’s end users. Richard also describes other recent use cases in advanced analytics for marketing campaign automation and sales recommendation engines using Spark, as well as recent work on reducing data analytics silos and experiments with search-based analytics.

    This session is sponsored by Tableau Software.

    Richard Langlois

    IT Architecture & Strategy

    Richard Langlois is the president of IT Architecture & Strategy, which provides training and consulting services in big data, analytics, BI, enterprise architecture, and data governance. Previously, Richard was the director of search and big data analytics and director of enterprise data management for Yellow Pages (Canada), where his team provided solution development, data architecture and governance, and metadata management for all operational and analytics needs of Yellow Pages. Prior to his roles at Yellow Pages, Richard was enterprise architect adviser at National Bank and Desjardins Group and global chief architect at Tata Communications, and he led the Canadian BI practice at Capgemini. He also worked directly or through consulting mandates at Air Canada, Bell Canada, CN, Canadian Tire, GM, Hydro-Quebec, Investors Group, Seer Technologies, Sikorsky Aircraft, Texas Instruments, Unisys, and multiple government agencies.