Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY
Mauricio Aristizabal

Data Architect, Impact

@mauincali

Mauricio Aristizabal is the data pipeline architect at Impact (formerly Impact Radius), a marketing technology company that helps brands grow by optimizing their paid marketing and media spend. Mauricio is responsible for massively scaling and modernizing the company’s analytics capabilities, selecting data stores and processing platforms, and designing many of the jobs that process internally and externally captured data and make it available to report and dashboard users, analytic applications, and machine learning jobs. He also assists the operations team with maintaining and tuning its Hadoop and Kafka clusters.

Sessions

2:55pm–3:35pm Wednesday, 09/12/2018
Location: 1A 23/24 Level: Intermediate
Secondary topics:  Data Integration and Data Pipelines
Average rating: 2.67 (3 ratings)
Mauricio Aristizabal shares lessons learned from migrating Impact's traditional ETL platform to a real-time platform on Hadoop (leveraging the full Cloudera EDH stack). Mauricio also discusses the company's data lake in HBase, its Spark Streaming jobs (with Spark SQL), its use of Kudu for "fast data" BI queries, and its use of Kafka as a data bus for loose coupling between components.