Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Audi's journey to an enterprise big data platform

Carsten Herbe (Audi Business Innovation GmbH), Matthias Graunitz (Audi AG)
14:05–14:45 Wednesday, 23 May 2018
Data engineering and architecture
Location: S11B Level: Intermediate

Who is this presentation for?

Anyone building an enterprise big data platform, as well as anyone using one who wants to understand the complexity behind the scenes.

Prerequisite knowledge

A basic understanding of Hadoop, Kafka, data warehousing, and business intelligence is required.

What you'll learn

An overview of the steps required to build an enterprise big data platform, enriched with the experience we gained at Audi.

Description

This talk covers Audi's journey from a first Hadoop PoC to a multi-tenant enterprise platform. Why a big data platform at all? We explain the requirements that drove the development of the platform and the decisions we had to make along the way.

While setting up our big data infrastructure, we often had to strike the right balance between enterprise integration and speed. For instance, should we use the existing Active Directory for both LDAP and the Kerberos KDC, or set up our own KDC? Using a shared enterprise service like Active Directory means following certain naming conventions and accepting restricted access, whereas running our own KDC offers much more flexibility but adds another component to our platform that must be maintained. We show the advantages and disadvantages of each option and explain why we chose a particular approach.
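To make this trade-off concrete, here is a minimal sketch of how the choice shows up in a cluster node's Kerberos configuration. The realm and host names are hypothetical illustrations, not Audi's actual setup:

    # Sketch of /etc/krb5.conf on a cluster node -- the two alternatives
    # differ mainly in which KDC the realm points at (hostnames hypothetical).

    [libdefaults]
        default_realm = CORP.EXAMPLE.COM

    [realms]
        # Option 1: reuse the corporate Active Directory as KDC (and LDAP).
        # Principals must follow AD naming conventions; access is restricted.
        CORP.EXAMPLE.COM = {
            kdc = ad-dc01.corp.example.com
            admin_server = ad-dc01.corp.example.com
        }

        # Option 2: dedicated MIT KDC run by the platform team -- full control
        # over principals and policies, but one more component to operate.
        HADOOP.EXAMPLE.COM = {
            kdc = kdc01.hadoop.example.com
            admin_server = kdc01.hadoop.example.com
        }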

For data ingestion of both batch and streaming data, we use Apache Kafka. We explain why we installed a Kafka cluster separate from our Hortonworks platform. We discuss the pros and cons of using the Kafka binary protocol and the HTTP REST protocol, not only from a technical perspective but also from the organisational perspective, since source systems are required to push data into Kafka.
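As a rough illustration of the two ingestion paths from a source system's point of view, consider the following sketch. The topic, hostnames, and payload are hypothetical, and the REST variant assumes a Kafka REST proxy (such as Confluent's) in front of the cluster; the talk does not prescribe a specific proxy:

    // Option 1: native Kafka binary protocol via the Java client
    // (requires the org.apache.kafka:kafka-clients dependency)
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import java.util.Properties;

    public class IngestSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "kafka01.example.com:9092"); // hypothetical broker
            props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("sensor-events", "robot-42", "{\"temp\":71.3}"));
            }

            // Option 2: HTTP REST via a proxy in front of Kafka -- easier for
            // source systems that cannot embed a Kafka client, at the cost of
            // an extra hop and another component to operate.
            var request = java.net.http.HttpRequest.newBuilder()
                .uri(java.net.URI.create("https://kafka-rest.example.com/topics/sensor-events"))
                .header("Content-Type", "application/vnd.kafka.json.v2+json")
                .POST(java.net.http.HttpRequest.BodyPublishers.ofString(
                    "{\"records\":[{\"key\":\"robot-42\",\"value\":{\"temp\":71.3}}]}"))
                .build();
            java.net.http.HttpClient.newHttpClient()
                .send(request, java.net.http.HttpResponse.BodyHandlers.ofString());
        }
    }

In general, the binary protocol gives source systems better throughput and delivery guarantees, while HTTP lowers the integration barrier for teams that cannot embed a Kafka client.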

We give an overview of our current architecture, including how some use cases are implemented on it. Some of them run exclusively on our new big data stack, while others use it in conjunction with our data warehouse. The use cases cover a wide range of data, from sensor data of robots in our plants to click streams from web applications.

Building an enterprise platform consists not only of technical tasks but also of organisational ones: data ownership, authorization to access certain data sets, and more financial matters such as internal pricing and SLAs.

Although we have already achieved quite a lot, our journey has not yet ended. There are still open topics to address, such as providing a unified logging solution for applications spanning multiple platforms, finally offering a notebook such as Zeppelin to our analysts (which will require an upgrade to the next HDP release), and tackling legal issues such as GDPR.

We will conclude our talk with a short glimpse into our ongoing extension of our on-premises platform into a hybrid cloud platform.

Carsten Herbe

Audi Business Innovation GmbH

Carsten Herbe works as a Big Data Architect at Audi Business Innovation GmbH, a subsidiary of Audi and a small company focused on developing new mobility services as well as innovative IT solutions for Audi. Carsten has more than 10 years of experience delivering data warehouse and BI solutions to his customers. He started working with Hadoop in 2013 and has since focused on both big data infrastructure and solutions. Currently, Carsten is helping Audi build up its big data platform based on Hadoop and Kafka, and as a solution architect he is responsible for developing and running the first analytical applications on that platform.

Matthias Graunitz

Audi AG

Matthias Graunitz works as an architect at Audi's Competence Center for Big Data and Business Intelligence. AUDI AG is a German automobile manufacturer that designs, engineers, produces, markets, and distributes luxury vehicles. A member of the Volkswagen Group with its roots in Ingolstadt, Bavaria, Germany, Audi produces branded vehicles in nine production facilities worldwide. Matthias has more than 10 years of experience in the field of business intelligence and big data. He is responsible for the architectural framework of the Hadoop ecosystem and a separate Kafka cluster, as well as for the data science toolkits provided by the Competence Center to all business departments at Audi.
