Presented By O'Reilly and Cloudera
Make Data Work
5–7 May, 2015 • London, UK

Hadoop Platform conference sessions

A deep dive into the dominant big data stack, with practical lessons, integration tricks, and a glimpse of the road ahead.

Tuesday, 05 May

9:00–12:30 Tuesday, 5/05/2015
Location: King's Suite - Balmoral
Gwen Shapira (Confluent), Mark Grover (Cloudera), Ted Malaska (Blizzard), Jonathan Seidman (Cloudera)
Average rating: ****.
(4.20, 20 ratings)
This tutorial will be valuable for developers, architects, or project leads who are already knowledgeable about Hadoop and are now looking for more insight into how it can be leveraged to implement real-world applications.
13:30–17:00 Tuesday, 5/05/2015
Location: King's Suite - Balmoral
Tom White (Cloudera), Joey Echeverria (Rocana), Ryan Blue (Cloudera)
Average rating: ***..
(3.50, 12 ratings)
In the second (afternoon) half of the Architecture Day tutorial, attendees will apply the best practices they learned in the morning session to build a data application for sessionizing user data.

Wednesday, 06 May

10:55–11:35 Wednesday, 6/05/2015
Location: King's Suite - Sandringham
Marcel Kornacker (Cloudera)
Average rating: ****.
(4.08, 13 ratings)
In this talk, attendees will learn about Impala’s approach to on-the-fly, automatic data transformation, which, in conjunction with the ability to handle nested structures such as JSON and XML documents, addresses the needs of at-source analytics, including direct querying of your input schema, immediate querying of data as it lands in HDFS, and high performance on par with specialized engines.
11:45–12:25 Wednesday, 6/05/2015
Location: King's Suite - Sandringham
Average rating: ****.
(4.50, 4 ratings)
Cloudera Impala can be considered an alternative to a relational database for data warehouse-like workloads. The CERN database community closely evaluated the Impala engine with respect to CERN's needs. In this presentation we will discuss our experience with the technology and report on query performance in comparison with data access using an Oracle RDBMS.
13:45–14:25 Wednesday, 6/05/2015
Location: King's Suite - Sandringham
Luke Han (Kyligence Inc), Yang Li (eBay)
Average rating: ****.
(4.00, 4 ratings)
Apache Kylin is an open source distributed analytics engine contributed by eBay Inc. that provides a SQL interface and multi-dimensional analysis (OLAP) on Hadoop, supporting extremely large datasets. It was accepted as an Apache Incubator project on Nov 25, 2014. Website: http://kylin.io
14:35–15:15 Wednesday, 6/05/2015
Location: King's Suite - Sandringham
Yanpei Chen (Cloudera), Dileep Kumar (Cloudera Inc)
Average rating: ****.
(4.50, 2 ratings)
SQL-on-Hadoop systems that support business intelligence (BI) use cases must handle hundreds or even thousands of concurrent users. We will talk about how to scale your SQL-on-Hadoop system to a large number of concurrent users, and how to verify that your system can support BI.
16:15–16:55 Wednesday, 6/05/2015
Location: King's Suite - Sandringham
Jairam Ranganathan (Cloudera)
Average rating: *****
(5.00, 3 ratings)
With hundreds of developers from a variety of organizations participating, Hadoop moves quickly. This talk will survey the important changes admins and users should be aware of and their impact on various use cases.
17:05–17:45 Wednesday, 6/05/2015
Location: King's Suite - Sandringham
Neil Martin (comparethemarket.com)
Average rating: ***..
(3.00, 1 rating)
Compare the Market’s senior project manager Neil Martin will present the lessons learned whilst delivering a successful yet complex, multifaceted project to reinvigorate the organization’s data infrastructure.

Thursday, 07 May

10:55–11:35 Thursday, 7/05/2015
Location: King's Suite - Sandringham
Joanne Hannaford (Goldman Sachs)
Average rating: ****.
(4.00, 7 ratings)
Goldman Sachs is a leading global investment banking, securities, and investment management firm that provides a wide range of financial services. Goldman executes hundreds of millions of financial transactions per day across nearly every market in the world. Learn how Goldman is harnessing knowledge, data, and compute power to maintain and increase its competitive edge.
11:45–12:25 Thursday, 7/05/2015
Location: King's Suite - Sandringham
Mark Samson (Cloudera)
Average rating: ****.
(4.50, 4 ratings)
The Hadoop ecosystem makes it possible to build an enterprise data hub capable of storing and analysing a wide variety of data. However, a platform with such broad capability raises a question: how should the myriad data sets be organised so that users can explore and access the data they need? This session will propose an information architecture for Hadoop that enables this.
13:45–14:25 Thursday, 7/05/2015
Location: King's Suite - Sandringham
Joey Echeverria (Rocana)
Average rating: ****.
(4.80, 5 ratings)
As the volume of data and number of applications moving to Apache Hadoop have increased, so has the need to secure that data and those applications. In this presentation, we'll take a brief look at where Hadoop security is today and then peer into the future.
14:35–15:15 Thursday, 7/05/2015
Location: King's Suite - Sandringham
Charles Lamb (Cloudera), Andrew Wang (Cloudera)
Average rating: ****.
(4.50, 2 ratings)
Encryption is a requirement for many business sectors dealing with confidential information. To meet these requirements, transparent, end-to-end encryption was added to HDFS. This protects data both in flight and at rest, and is compatible with existing Hadoop apps. We will cover the design and implementation of transparent encryption in HDFS, as well as performance results.
16:15–16:55 Thursday, 7/05/2015
Location: King's Suite - Sandringham
Alan Gates (Hortonworks)
Average rating: ****.
(4.00, 2 ratings)
Starting in Hive 0.14, insert values, update, and delete have been added to Hive SQL. In addition, ACID compliant transactions have been added so users get a consistent view of data while reading and writing. This talk will cover the intended use cases, architecture, and performance of insert, update, and delete in Hive.