Presented By O'Reilly and Cloudera
Make Data Work
Feb 17–20, 2015 • San Jose, CA

Strata + Hadoop World in San Jose 2015 Schedule

Our friends at Dato developed a prototype for a session recommender for Strata + Hadoop World. Check it out and send us any feedback via @strataconf

Tuesday, 02/17/2015

9:00am

9:00am–5:00pm Tuesday, 02/17/2015
Training
Location: 211 A
Dustin Clute (Cloudera), Michael Judd (Cloudera)
Average rating: ***..
(3.00, 1 rating)
Cloudera University’s four-day course for designing and building Big Data applications prepares you to analyze and solve real-world problems using Apache Hadoop and associated tools in the enterprise data hub. Read more.
9:00am–5:00pm Tuesday, 02/17/2015
Training
Location: 211 B
Average rating: ****.
(4.67, 3 ratings)
Learn to develop machine learning, exploratory and predictive models at scale on data stored in-memory. This hands-on course will address exploratory statistical modeling with SAS Visual Statistics, a GUI designed for rapidly screening models and segments. Read more.
9:00am–5:00pm Tuesday, 02/17/2015
Training
Location: 211 C
Sameer Farooqui (Databricks), Jesse Anderson (Smoking Hand)
This three-day curriculum features advanced lectures and hands-on technical exercises for advanced Spark usage in data exploration, analysis, and building Big Data applications. Course materials emphasize architectural design patterns and best practices for leveraging Spark in the context of other popular, complementary frameworks for building and managing Enterprise data workflows. Read more.

Wednesday, 02/18/2015

8:00am

8:00am–9:00am Wednesday, 02/18/2015
Location: Coffee Break
Coffee Break Sponsored by Dataguise (1h)

9:00am

9:00am–5:00pm Wednesday, 02/18/2015
Training
Location: 211 A
Dustin Clute (Cloudera), Michael Judd (Cloudera)
Cloudera University’s four-day course for designing and building Big Data applications prepares you to analyze and solve real-world problems using Apache Hadoop and associated tools in the enterprise data hub. Read more.
9:00am–5:00pm Wednesday, 02/18/2015
Training
Location: 211 B
Average rating: ***..
(3.00, 2 ratings)
Learn to develop machine learning, exploratory and predictive models at scale on data stored in-memory. This hands-on course will address exploratory statistical modeling with SAS Visual Statistics, a GUI designed for rapidly screening models and segments. Read more.
9:00am–5:00pm Wednesday, 02/18/2015
Training
Location: 211 C
Sameer Farooqui (Databricks), Jesse Anderson (Smoking Hand)
This three-day curriculum features advanced lectures and hands-on technical exercises for advanced Spark usage in data exploration, analysis, and building Big Data applications. Course materials emphasize architectural design patterns and best practices for leveraging Spark in the context of other popular, complementary frameworks for building and managing Enterprise data workflows. Read more.
9:00am–5:00pm Wednesday, 02/18/2015
Data Science
Location: LL20 A
Garrett Grolemund (RStudio), Nicholas Horton (Amherst College), Winston Chang (RStudio)
Average rating: ****.
(4.36, 11 ratings)
From advanced visualization, collaboration, and reproducibility to data manipulation, R Day at Strata covers a raft of current topics that analysts and R users need to pay attention to. The R Day tutorials come from leading luminaries and R committers, the folks keeping the R ecosystem abreast of the challenges facing analysts and others who work with data. Read more.
9:00am–5:00pm Wednesday, 02/18/2015
Hardcore Data Science
Location: LL20 BC
Ben Lorica (O'Reilly Media), Ben Recht (University of California, Berkeley), Chris Re (Stanford University), Maya Gupta (Google), Alyosha Efros (UC Berkeley), Eamonn Keogh (University of California - Riverside), John Myles White (Facebook), Fei-Fei Li (Stanford University), Tara Sainath (Google), Michael Jordan (UC Berkeley), Anima Anandkumar (UC Irvine), John Canny (UC Berkeley), David Andrzejewski (Sumo Logic)
Average rating: ****.
(4.86, 7 ratings)
All-Day: Strata's regular data science track has great talks with real world experience from leading edge speakers. But we didn't just stop there—we added the Hardcore Data Science day to give you a chance to go even deeper. The Hardcore day will add new techniques and technologies to your data science toolbox, shared by leading data science practitioners from startups, industry, consulting... Read more.
9:00am–5:00pm Wednesday, 02/18/2015
Data Science
Location: LL20 D
Shawn Scully (Dato), Carlos Guestrin (Dato), Alice Zheng (Dato), Chris DuBois (Dato), Yucheng Low (Dato)
Average rating: ****.
(4.43, 7 ratings)
This all-day, hands-on training program provides a quick start to building and deploying predictive applications at scale. You will learn simple and effective ways of building powerful machine learning models and deploying them. We will walk you through all the steps of prototyping and production: data cleaning, feature engineering, model building and evaluation, and deployment. Read more.
9:00am–5:00pm Wednesday, 02/18/2015
Data Science
Location: LL21 B
Andreas Mueller (NYU, scikit-learn), Jennifer Klay (Cal Poly San Luis Obispo), Peter Wang (Continuum Analytics, Inc.), Travis Oliphant (Continuum Analytics, Inc.), Andy Terrel (Fashion Metric), Matthew Rocklin (Continuum), William McKinney (Cloudera), Stefan van der Walt (UC Berkeley), Jonathan Frederic (IPython), Kyle Kelley (Rackspace)
Average rating: ****.
(4.62, 8 ratings)
Python has become an increasingly important part of the data engineer and analytic tool landscape. PyData at Strata provides in-depth coverage of the tools and techniques gaining traction with the data audience, including IPython Notebook, NumPy/matplotlib for visualization, SciPy, scikit-learn, and how to scale Python performance, including how to handle large, distributed data sets. Read more.
9:00am–5:00pm Wednesday, 02/18/2015
Data-Driven Business Day
Location: LL21 C/D
Alistair Croll (Solve For Interesting), Cait O'Riordan (Shazam), Lutz Finger (LinkedIn), Kuang Chen (Captricity), Emi Nomura (Jawbone), AJ Loiacono (Truveris), Rosie Atkins (Groupon), Anne Johnson (Credit Suisse), Jerry Overton (CSC), Ann Johnson (Interana), Mark Madsen (Third Nature), Leah Hunter (Tech Journalist), Ellen Friedman (Independent), India Swearingen (United Way of the Bay Area), Satyam Priyadarshy (Halliburton), Joerg Blumtritt (Datarella)
Average rating: ***..
(3.50, 18 ratings)
All-Day: For business strategists, marketers, product managers, and entrepreneurs, Data-Driven Business looks at how to use data to make better business decisions faster. Packed with case studies, panels, and eye-opening presentations, this fast-paced day focuses on how to solve today's thorniest business problems with Big Data. It's the missing MBA for a data-driven, always-on business world. Read more.
9:00am–5:00pm Wednesday, 02/18/2015
Hadoop & Beyond
Location: LL21 E/F
Paco Nathan (O'Reilly Media), Holden Karau (IBM), Krishna Sankar (Volvo Cars), Reza Zadeh (Stanford University), Denny Lee (Concur Technologies), Chris Fregly (Flux Capacitor AI)
Average rating: ***..
(3.71, 17 ratings)
A full-day, hands-on tutorial introducing Apache Spark and libraries for building workflows: Spark SQL, Spark Streaming, MLlib, GraphX, etc. Read more.
9:00am–12:30pm Wednesday, 02/18/2015
Hadoop & Beyond
Location: 210 A/E
John Russell (Cloudera), Alan Choi (Cloudera)
Average rating: *....
(1.80, 5 ratings)
Impala is the massively parallel analytic database delivering interactive performance on Hadoop. In this half-day tutorial, we'll walk you through hands-on exercises, taking you from zero to up and running with Impala. Read more.
9:00am–12:30pm Wednesday, 02/18/2015
Hadoop in Action
Location: 210 D/H
Mark Grover (Cloudera), Jonathan Seidman (Cloudera), Gwen Shapira (Confluent), Ted Malaska (Cloudera)
Average rating: ****.
(4.54, 13 ratings)
Are you looking for a deeper understanding of how to integrate components in the Apache Hadoop ecosystem to implement data management and processing solutions? Then this tutorial is for you. We'll provide a clickstream analytics example illustrating how to architect solutions with Apache Hadoop along with providing best practices and recommendations for using Hadoop and related tools. Read more.
9:00am–12:30pm Wednesday, 02/18/2015
Hadoop & Beyond
Location: 210 B/F
Patrick McFadin (Datastax)
Average rating: ***..
(3.62, 8 ratings)
Apache Cassandra has proven to be one of the best solutions for storing and retrieving time series data. Add in Apache Spark and Kafka, and you have an amazing time series solution. We will talk data models, and go through deployment and code to build a functional, real-time application. Languages used: Java, Scala. Read more.
9:00am–12:30pm Wednesday, 02/18/2015
Design & Interfaces
Location: 210 C/G
Jonathan Dinu (Zipfian Academy)
Average rating: **...
(2.43, 7 ratings)
The best insight you produce is only as good as your ability to explain it. As data scientists and engineers, our task is not only to execute robust analyses, but also to convince decision-makers to act on data. Through an example-driven approach, attendees will examine features of great graphics and techniques of effective visualization, and learn to use D3.js to create their own data narrative. Read more.

1:30pm

1:30pm–5:00pm Wednesday, 02/18/2015
Hadoop Platform
Location: 210 A/E
Kathleen Ting (Cloudera), Philip Zeyliger (Cloudera), Philip Langdale (Cloudera, Inc.), Miklos Christine (Databricks)
Average rating: ****.
(4.33, 6 ratings)
Hadoop is emerging as the standard for big data processing & analytics. However, as usage of Hadoop clusters grows, so do the demands of managing and monitoring these systems. In this tutorial, attendees will get an overview of all phases for successfully managing Hadoop clusters, with an emphasis on production systems. Read more.
1:30pm–5:00pm Wednesday, 02/18/2015
Hadoop Platform
Location: 210 D/H
Tom White (Cloudera), Joey Echeverria (Rocana), Ryan Blue (Cloudera)
Average rating: ***..
(3.67, 3 ratings)
In the second (afternoon) half of the Architecture Day tutorial, attendees will apply the best practices they learned in the morning session to build a data application for sessionizing user data. Read more.
1:30pm–5:00pm Wednesday, 02/18/2015
Data Science
Location: 210 B/F
Kurt Hurtado (Elasticsearch Inc), Tal Levy (Elasticsearch)
Average rating: ****.
(4.50, 4 ratings)
This tutorial will provide an introduction to the individual components of the ELK stack followed by a discussion of use cases and a hands-on lab. This includes installing and configuring Elasticsearch, Logstash, and Kibana. Read more.
1:30pm–5:00pm Wednesday, 02/18/2015
Hadoop Platform
Location: 210 C/G
Manu Mukerji (TiVo), John Akred (Silicon Valley Data Science), Stephen O'Sullivan (Silicon Valley Data Science)
Average rating: ***..
(3.25, 16 ratings)
What are the essential components of a data platform? This tutorial will explain how the various parts of the Hadoop and big data ecosystem fit together in production to create a data platform supporting batch, interactive and realtime analytical workloads. Read more.

5:00pm

5:00pm–6:30pm Wednesday, 02/18/2015
Events
Location: Expo Hall (Hall 1,2,3)
Average rating: ****.
(4.00, 1 rating)
Grab a drink, mingle with fellow Strata + Hadoop World participants, and see the latest technologies and products from leading companies in the data space. Read more.

6:30pm

6:30pm–8:00pm Wednesday, 02/18/2015
Events
Location: Grand Ballroom 220
Average rating: ***..
(3.67, 3 ratings)
What new companies are at the leading edge of the data space? Meet some of the best, most innovative founders as they demonstrate their game-changing ideas at the Startup Showcase. Read more.
6:30pm–8:00pm Wednesday, 02/18/2015
Events
Location: Lower Level Foyer
Average rating: *****
(5.00, 1 rating)
If you’re a woman looking for like-minded communities to join, c’mon down to our meetup on Wednesday evening after the Opening Reception for more appetizers, drinks, and networking. In addition to meeting other women (and men) in the community, you’ll hear lightning talks from representatives of groups, companies, and projects that support diversity in the technology community. Read more.

Thursday, 02/19/2015

6:30am

6:30am–7:30am Thursday, 02/19/2015
Events
Location: W. San Carlos St.
Average rating: *****
(5.00, 5 ratings)
Please join Cloudera and O'Reilly Media for the Data Dash run / walk, held in conjunction with Strata + Hadoop World San Jose 2015. Read more.

8:45am

8:45am–8:55am Thursday, 02/19/2015
Keynotes
Location: Grand Ballroom 220
Roger Magoulas (O'Reilly Media), Doug Cutting (Cloudera), Alistair Croll (Solve For Interesting)
Average rating: ***..
(3.12, 8 ratings)
Program Chairs Roger Magoulas, Doug Cutting, and Alistair Croll welcome you to the first day of Strata + Hadoop World Keynotes. Read more.

8:55am

8:55am–9:10am Thursday, 02/19/2015
Keynotes
Location: Grand Ballroom 220
Amr Awadallah (Cloudera, Inc.)
Average rating: **...
(2.77, 26 ratings)
As Hadoop and the surrounding projects & vendors mature, their impact on the data management sector is growing. Amr will talk about his views on how that impact will change over the next five years. How central will Hadoop be to the data center of 2020? What industries will benefit most? Which technologies are at risk of displacement or encroachment? Read more.

9:00am

9:00am–5:00pm Thursday, 02/19/2015
Training
Location: 211 A
Dustin Clute (Cloudera), Michael Judd (Cloudera)
Cloudera University’s four-day course for designing and building Big Data applications prepares you to analyze and solve real-world problems using Apache Hadoop and associated tools in the enterprise data hub. Read more.
9:00am–5:00pm Thursday, 02/19/2015
Training
Location: 211 C
Sameer Farooqui (Databricks), Jesse Anderson (Smoking Hand)
This three-day curriculum features advanced lectures and hands-on technical exercises for advanced Spark usage in data exploration, analysis, and building Big Data applications. Course materials emphasize architectural design patterns and best practices for leveraging Spark in the context of other popular, complementary frameworks for building and managing Enterprise data workflows. Read more.

9:10am

9:10am–9:15am Thursday, 02/19/2015
Keynotes, Sponsored
Location: Grand Ballroom 220
Eric Frenkiel (MemSQL)
Average rating: **...
(2.57, 21 ratings)
MemSQL CEO Eric Frenkiel will discuss the need for simplicity in enterprise data architecture, the convergence of transactions and analytics, and what is required to operationalize Spark and Hadoop in the enterprise. Read more.

9:15am

9:15am–9:25am Thursday, 02/19/2015
Keynotes
Location: Grand Ballroom 220
Lisa Hammitt (Salesforce)
Average rating: **...
(2.44, 34 ratings)
Wearables contribute to Big Data, and the resulting insights are already producing significant gains in key industries. Read more.

9:25am

9:25am–9:35am Thursday, 02/19/2015
Keynotes, Sponsored
Location: Grand Ballroom 220
Anil Gadre (MapR)
Average rating: ***..
(3.10, 21 ratings)
To get value out of today’s big and fast data, organizations must evolve beyond traditional analytic cycles that are heavy with data transformation and schema management. . . Read more.

9:35am

9:35am–9:40am Thursday, 02/19/2015
Keynotes, Sponsored
Location: Grand Ballroom 220
Average rating: *....
(1.92, 25 ratings)
In a landmark partnership, IBM and Twitter are combining advances in analytics, cloud and cognitive computing in a manner that has the potential to transform how institutions understand customers, markets and trends. Adam Kocoloski, CTO of IBM Cloud Data Services and co-founder of Cloudant, will explain how, when it comes to gaining insights from Big Data, the future is brighter than we know. Read more.

9:40am

9:40am–9:50am Thursday, 02/19/2015
Keynotes
Location: Grand Ballroom 220
DJ Patil (White House Office of Science and Technology Policy)
Average rating: ****.
(4.12, 33 ratings)
Data Science, where are we going? What impact can we expect? Read more.

9:50am

9:50am–10:00am Thursday, 02/19/2015
Keynotes
Location: Grand Ballroom 220
Solomon Hsiang (UC Berkeley)
Average rating: ***..
(3.55, 22 ratings)
Advances in data science empower leaders to make better decisions for society. By using new kinds of information unavailable during the last several millennia of government, we can avoid mistakes of the past. We will discuss how data and statistical inference are informing how we manage the global climate rationally, a defining policy challenge for our generation. Read more.

10:00am

10:00am–10:10am Thursday, 02/19/2015
Keynotes
Location: Grand Ballroom 220
Poppy Crum (Dolby Laboratories | Stanford University)
Average rating: ***..
(3.48, 21 ratings)
Our experience of the sensory world does not need to be constrained by our physical limitations. When navigating the environment our senses interact to perceive a robust non-veridical experience. Understanding these interactions and being able to define them perceptually and algorithmically allows technological developments that can facilitate sensory enhancement and optimization. Read more.

10:40am

10:40am–11:20am Thursday, 02/19/2015
Data Science
Location: LL20 A
Shankar Vedaraman (Netflix), Christopher Colburn (Netflix)
Average rating: ****.
(4.65, 23 ratings)
In this session we will talk through the challenges of anomaly detection in high cardinality dimensions, and specifically how we derive value through a combination of data science and traditional business intelligence. Read more.
10:40am–11:20am Thursday, 02/19/2015
Business & Industry
Location: LL20 BC
Ross Fubini (Canaan Partners), Ari Gesher (Palantir Technologies), Wei Zheng (Trifacta), Omer Trajman (ScalingData), Sylvain Le Borgne (Havas Media)
Average rating: *****
(5.00, 2 ratings)
Big Data is exiting its buzzword phase, and we are seeing applications that use big data infrastructure to power everyday lives. This is a discussion from the front lines with panelists from industry and startups describing real deployed applications that are powered by big data but are happy to hide the elephant behind beautiful interfaces. Read more.
10:40am–11:20am Thursday, 02/19/2015
Sponsored
Location: LL20 D
Eric Frenkiel (MemSQL)
Average rating: ***..
(3.67, 3 ratings)
This session will cover approaches to building real-time pipelines with MemSQL, Hadoop, and Spark, including: how Novus built the premier financial portfolio management platform using MemSQL as a real-time data store and query engine; an introduction to the MemSQL Spark connector; and strategies for integrating Spark and Hadoop with real-time systems for transaction processing and operational analytics. Read more.
10:40am–11:20am Thursday, 02/19/2015
Law, Ethics & Open Data
Location: LL21 B
Laura Fennell (Intuit), Bill Loconzolo (Intuit)
Average rating: *****
(5.00, 5 ratings)
When your company stores some of the most sensitive customer data that exists, how do you build game changing big data innovations while maintaining customer trust and loyalty? Combine the two groups responsible for that vision--legal and data science--and unite them toward a common goal! We'll discuss how Intuit turned the typical data-legal model on its head to boost data-driven innovation. Read more.
10:40am–11:20am Thursday, 02/19/2015
Enterprise Adoption
Location: LL21 C/D
Douglas Turnbull (OpenSource Connections)
Average rating: **...
(2.33, 3 ratings)
Today we've got NoSQL. But relational databases were once the noSomething. What was that something? Why and where did relational databases come from? Then why, years later, are we seemingly focused on rejecting the lessons that led us to relational databases? This talk draws lessons from the past that help strike a balance between the dueling promises of SQL and NoSQL. Read more.
10:40am–11:20am Thursday, 02/19/2015
Design & Interfaces
Location: LL21 E/F
Average rating: ***..
(3.71, 7 ratings)
The most frustrating part of data science is when customers don’t “get it”: endless revisions, recommendations not implemented, or data products not adopted. Exciting new research in neurology, cognitive psychology, and behavioral economics has a lot to say about why. We’ll explore the findings and implications for designing more successful “human-data interfaces.” Read more.
10:40am–11:20am Thursday, 02/19/2015
Hadoop in Action
Location: 210 A/E
Josh Baer (Spotify), Rafal Wojdyla (Spotify)
Average rating: ****.
(4.25, 4 ratings)
There are many confusing and painful things about setting up and operating a 900-node Hadoop cluster used as the centerpiece of many of Spotify's Big Data initiatives. We'll go over a few interesting stories and frustrations that have influenced the direction of our architectural choices, and the lessons we've learned from them. Read more.
10:40am–11:20am Thursday, 02/19/2015
Sponsored
Location: 210 D/H
Ted Dunning (MapR Technologies), Ellen Friedman (Independent)
What’s important about a technology is what you can use it to do. We’ve looked at what a number of groups are doing with Apache Hadoop and NoSQL in production, and we’d like to relay what worked well for them and what did not. . . Read more.
10:40am–11:20am Thursday, 02/19/2015
Hadoop Platform
Location: 210 B/F
Aaron Myers (Cloudera, Inc.), Daniel Templeton (Cloudera, Inc.)
Average rating: ***..
(3.00, 1 rating)
The Hadoop ecosystem is a vibrant and growing set of tools for taming data at massive scales. It's also less than straightforward at times. During this talk we'll take a light-hearted and interactive plunge into the dark corners of Hadoop to shine light on some of the trap doors and blind alleys one may encounter in the wild. Attendees will leave dazed, confused, and hopefully a little wiser. Read more.
10:40am–11:20am Thursday, 02/19/2015
Spark in Action
Location: 210 C/G
Reynold Xin (Databricks), Matei Zaharia (Databricks)
Average rating: ****.
(4.00, 8 ratings)
Spark users have been pushing the boundary of data analytics. In this talk, we focus on the scalability dimension, including: multiple real-world use cases on PBs of data and on clusters with thousands of machines; configuration and performance tuning tips learned from these deployments; and changes in recent releases of Spark for better handling of these workloads. Read more.
10:40am–11:20am Thursday, 02/19/2015
Machine Data / IoT
Location: 230 A
Chad Meley (Teradata), John Kreisa (Hortonworks)
Hadoop and the Internet of Things have enabled data-driven companies to leverage new data sources and apply new analytical techniques in creative ways that provide competitive advantage. We will discuss real-world case studies from the field that describe the strategies, architectures, and results from forward-thinking companies across a variety of verticals. Read more.
10:40am–11:20am Thursday, 02/19/2015
Sponsored
Location: 230 B
Lance Olson (Microsoft)
Average rating: **...
(2.00, 1 rating)
In this session, we will show you how easy it is to spin up a 32-node Storm cluster and give all attendees a free, unlimited 30-day pass to deploy their own Hadoop cluster on Microsoft Azure. Read more.
10:40am–11:20am Thursday, 02/19/2015
Hadoop & Beyond
Location: 230 C
Jay Kreps (Confluent)
Average rating: ****.
(4.85, 13 ratings)
What happens if you take everything that is happening in your company--every click, every impression, every database change, every application log--and make it all available as a real-time stream of well structured data? Companies such as LinkedIn have done this experiment and this talk will describe how this changes the way data is thought about and put to use in an organization. Read more.
10:40am–11:20am Thursday, 02/19/2015
Sponsored
Location: LL21 A
HP will discuss two innovations to help you take on analytics for your Hadoop Data. Read more.

11:30am

11:30am–12:10pm Thursday, 02/19/2015
Data Science
Location: LL20 A
Adam Silberstein (Trifacta), Joe Hellerstein (UC Berkeley)
Average rating: ****.
(4.33, 12 ratings)
Leveraging a dataset’s summary or data profile to inform the analysis process isn't a new concept but in the changing data landscape this process needs to be rethought to handle the different shapes and sizes of big data. Trifacta's Joe Hellerstein and Adam Silberstein discuss new approaches to data profiling specifically designed for quickly understanding the content & quality of modern datasets. Read more.
11:30am–12:10pm Thursday, 02/19/2015
Business & Industry
Location: LL20 BC
Michael Abbott (Kleiner Perkins Caufield & Byers), Eugene Mandel (Jawbone), Christopher Pouliot (Lyft), Mike Polcari (23andMe)
Average rating: ****.
(4.40, 10 ratings)
Most people are familiar with the basic principles driving today’s hottest big data and enterprise companies. But what’s really going on underneath the hood? In this session, Kleiner Perkins Caufield & Byers General Partner Michael Abbott unboxes a variety of startups in the space to examine the technology, architecture, and innovations they’ve harnessed to deliver superior products and services. Read more.
11:30am–12:10pm Thursday, 02/19/2015
Sponsored
Location: LL20 D
Annika Jimenez (Pivotal), Kaushik Das (Pivotal), Rashmi Raghu (Pivotal), Woo Jung (Pivotal), Srivatsan Ramanujam (Pivotal)
Average rating: **...
(2.00, 2 ratings)
With a global team of 30 Data Scientists doing innovative work in almost every vertical market, Pivotal has a rich view into the trends impacting enterprises and their approach to Big Data. Read more.
11:30am–12:10pm Thursday, 02/19/2015
Law, Ethics & Open Data
Location: LL21 B
Jonathan King (CenturyLink)
Average rating: ****.
(4.00, 2 ratings)
Our modern world is one where virtually everything is public by default, making the very notion of privacy radically different than in the “private by default” era when the concept was first enshrined in law. This session will explore what we can do with the exploding volume of our personal data alongside the increasingly important question of what we should be doing with this data. Read more.
11:30am–12:10pm Thursday, 02/19/2015
Enterprise Adoption
Location: LL21 C/D
Vnayak Borkar (X15 Software)
Average rating: ****.
(4.33, 3 ratings)
We will take a close look at use cases related to log data processing, listing the fundamental requirements that must be satisfied by log management systems. We will look at existing products and technologies harnessed for ingesting, storing, querying, and analyzing machine data. Finally, we will attempt to construct the archetype of the ideal platform for the management of log data. Read more.
11:30am–12:10pm Thursday, 02/19/2015
Data Science
Location: LL21 E/F
Average rating: **...
(2.75, 4 ratings)
In far too many organizations, data scientists and designers work in silos, and quibble about who’s more important. This is a huge missed opportunity. At Intuit, we are reimagining how our data and design teams work together to fuel innovation and surpass Intuit’s business goals. I will walk through methods we are using to bridge these two wildly different groups and share stories of success. Read more.
11:30am–12:10pm Thursday, 02/19/2015
Hadoop in Action
Location: 210 A/E
Eric Sammer (Rocana)
Average rating: ****.
(4.00, 4 ratings)
While we frequently talk about how to build interesting products on top of machine and event data, the reality is that collecting, organizing, providing access to, and managing this data is where most people get stuck. In this session, we’ll describe one such system, in detail, handling terabytes an hour of event-oriented data, providing real time streaming, search, and SQL access to data. Read more.
11:30am–12:10pm Thursday, 02/19/2015
Sponsored
Location: 210 D/H
Average rating: **...
(2.80, 5 ratings)
R has emerged as the language of data science. In this session, IBM will discuss and demonstrate Big R, a comprehensive set of capabilities that provides end-to-end integration with open source R, transparent execution on Hadoop, and seamless access to machine learning algorithms based on SystemML. Learn also about how Big R and Spark can be used with new geo-spatial and text analytic tooling. Read more.
11:30am–12:10pm Thursday, 02/19/2015
Hadoop Platform
Location: 210 B/F
Joey Echeverria (Rocana)
Average rating: ***..
(3.57, 7 ratings)
As the volume of data and number of applications moving to Apache Hadoop have increased, so has the need to secure that data and those applications. In this presentation, we'll take a brief look at where Hadoop security is today and then peer into the future. Read more.
11:30am–12:10pm Thursday, 02/19/2015
Spark in Action
Location: 210 C/G
Xuefu Zhang (Cloudera), Chengxiang Li (Intel)
Average rating: **...
(2.50, 10 ratings)
Hive is Hadoop's de facto standard SQL on big data, and Spark is gaining significant momentum as a new data processing platform beyond MapReduce. The marriage of the two will generate more waves of momentum in both communities. Read more.
11:30am–12:10pm Thursday, 02/19/2015
Machine Data / IoT
Location: 230 A
Anirudh Todi (Twitter Inc.)
Average rating: ****.
(4.25, 4 ratings)
Twitter's users generate tens of billions of tweet views per day. Aggregating these events in real time, in a robust enough way to incorporate into our products, presents a massive scaling challenge. In this talk I'll introduce TSAR (the TimeSeries AggregatoR), a robust, flexible, and scalable service for real-time event aggregation designed to solve this problem and a range of similar ones. Read more.
11:30am–12:10pm Thursday, 02/19/2015
Sponsored
Location: 230 B
Vin Sharma (Intel), Jason Dai (Intel)
Join this session to hear about lessons learnt in building these domain-specific solutions, Intel’s reference architecture for data science and analytics services deployment in the cloud, and the new Intel initiative to advance the state of the art in big data analytics on Hadoop and Spark. Read more.
Add to your personal schedule
11:30am–12:10pm Thursday, 02/19/2015
Hadoop & Beyond
Location: 230 C
Jim Scott (MapR Technologies, Inc.)
Average rating: ****.
(4.50, 6 ratings)
Processing data from social media streams and sensor devices in real time is becoming increasingly prevalent, and there are plenty of open source solutions to choose from. To help practitioners decide what to use when, we compare three popular Apache projects for stream processing: Apache Storm, Apache Spark and Apache Samza. Read more.
Add to your personal schedule
11:30am–12:10pm Thursday, 02/19/2015
Sponsored
Location: LL21 A
Daniel Eklund (Think Big, a Teradata Company), Rick Stellwagen (Think Big, a Teradata Company)
Average rating: ***..
(3.00, 1 rating)
This presentation will highlight why the concept of the "data lake" is a game changing paradigm. Attendees will examine data lake architecture and tradeoffs in construction and operations in data lake design as seen from Think Big consulting engagements. Read more.

12:10pm

Add to your personal schedule
12:10pm–1:30pm Thursday, 02/19/2015
Events
Location: Lunch - Expo Hall (Hall 1,2,3)
Birds of a Feather (BoF) sessions are informal roundtable discussions happening during lunch on Thursday, February 19 and Friday, February 20. You can join any BoF table or start your own with a topic of your choice. The BoF sign-up board will be near the Registration area. Read more.

1:30pm

Add to your personal schedule
1:30pm–2:10pm Thursday, 02/19/2015
Ask Us Anything
Location: 211 B
Moderated by:
Mark Grover (Cloudera)
Panelists:
Jonathan Seidman (Cloudera), Gwen Shapira (Confluent), Ted Malaska (Cloudera)
Join the authors of Hadoop Application Architectures for an open Q/A session on considerations and recommendations for architecture and design of applications using Hadoop. Talk to us about your use-case and its big data architecture, or just come to listen in. Read more.
Add to your personal schedule
1:30pm–1:50pm Thursday, 02/19/2015
Data Science
Location: LL20 A
Sasha Laundy (Polynumeral)
Average rating: *****
(5.00, 1 rating)
Many development teams fail to set up logging properly, so that when they bring in a data science team down the road, their data is missing, wrong, or lacking key fields. A quick data audit could catch many of these common mistakes, saving money, time, and insight. This talk covers the three things to check and several handy tools so you can sleep soundly, knowing your data is being collected safely. Read more.
Add to your personal schedule
1:30pm–1:50pm Thursday, 02/19/2015
Business & Industry
Location: LL20 BC
Vijay Subramanian (Rent the Runway)
Average rating: ****.
(4.00, 5 ratings)
At Rent the Runway, we have focused on using data to make decisions since day 1. But, the best manifestation is driving the strategy and building products using data, which has been critical to our growth. This talk will share examples that illustrate this, and how data is an unlikely hero behind the scenes of successfully renting sparkly designer dresses. Read more.
Add to your personal schedule
1:30pm–2:10pm Thursday, 02/19/2015
Sponsored
Location: LL20 D
Learn how SAS applications use YARN in order to be a good citizen in a busy Hadoop cluster. Best practices and customer examples for several different user scenarios will be shared and discussed. Read more.
Add to your personal schedule
1:30pm–2:10pm Thursday, 02/19/2015
Law, Ethics & Open Data
Location: LL21 B
Tatsiana Maskalevich (Stitch Fix)
Average rating: ****.
(4.00, 3 ratings)
During the last government shutdown, on "The Daily Show with Jon Stewart," John Oliver noted that Congress has a 90% retention rate despite a 10% approval rating. Why? Gerrymandering has become a prime suspect. Is this true, or just truthy? Come find out how a state with a 51% Democrat, 49% Republican electorate enjoys a lopsided congressional delegation of 4 Democrats and 9 Republicans. Read more.
Add to your personal schedule
1:30pm–2:10pm Thursday, 02/19/2015
Enterprise Adoption
Location: LL21 C/D
Average rating: ****.
(4.33, 3 ratings)
Explore several different approaches taken by organizations embarking on a data governance journey to meet their own unique business objectives. Review best practices and lessons learned. Read more.
Add to your personal schedule
1:30pm–2:10pm Thursday, 02/19/2015
Design & Interfaces
Location: LL21 E/F
Alonzo Canada (Interana)
Average rating: ***..
(3.89, 9 ratings)
Data products are poised to go mainstream, but only if they are designed well. Most data products are designed by developers for developers. This talk discusses methods from Stanford's D.School used by companies like Yahoo!, Samsung, and Audi to design break-out products. These principles can help developers get beyond technology and design products for everyday users. Read more.
Add to your personal schedule
1:30pm–2:10pm Thursday, 02/19/2015
Hadoop in Action
Location: 210 A/E
Sheetal Dolas (Hortonworks)
Average rating: ***..
(3.36, 11 ratings)
Businesses are moving from large-scale batch data analysis to large-scale real-time data analysis. Apache Storm has emerged as one of the most popular platforms for the purpose. This talk covers proven design patterns for real time stream processing. Patterns that have been vetted in large-scale production deployments that process 10s of billions of events/day and 10s of terabytes of data/day. Read more.
Add to your personal schedule
1:30pm–2:10pm Thursday, 02/19/2015
Sponsored
Location: 210 D/H
Jagane Sundar (WANdisco)
Hadoop is now widely used to support mission-critical applications that operate within a ‘data lake’ infrastructure, but how can it overcome complete data center failures to guarantee continuous operation? In this session, we lay out the blueprint for a multi-data center Hadoop that solves the storage and compute problems in operating over the WAN using a single coordinated, Paxos-based file system. Read more.
Add to your personal schedule
1:30pm–2:10pm Thursday, 02/19/2015
Hadoop Platform
Location: 210 B/F
Spencer Herath (Accenture), Aaron Benz (Accenture)
Average rating: ****.
(4.50, 2 ratings)
HBase can be a good solution for hierarchical time series data. And we can access the data using both R and Python. This case study is a sanitized version of a solution we brought to a client that provided real business value—without requiring significant investment or time. We show how to move to a simple, scalable NoSQL solution without alienating the scientists who work with the data. Read more.
Add to your personal schedule
1:30pm–2:10pm Thursday, 02/19/2015
Hadoop & Beyond
Location: 210 C/G
Richard Williamson (Silicon Valley Data Science)
Average rating: ***..
(3.00, 3 ratings)
Getting the full value from data often requires combining stream processing on new events with large-scale historical analysis. While both these activities are served by Spark’s execution framework, leveraging multiple persistence layers is key to efficiently and extensibly enabling these use cases. Read more.
Add to your personal schedule
1:30pm–2:10pm Thursday, 02/19/2015
Machine Data / IoT
Location: 230 A
Ian Eslick (VitalLabs)
Average rating: ****.
(4.00, 3 ratings)
Capturing and integrating device-based and other health data for research is frustratingly difficult. We explain the open source technology framework we developed, working with O'Reilly and the Robert Wood Johnson Foundation, for capturing and routing device-based health data for use by healthcare providers and for access, via a Trusted Analytic Container, by researchers. Read more.
Add to your personal schedule
1:30pm–2:10pm Thursday, 02/19/2015
Sponsored
Location: 230 B
Dorman Bazzell (Capgemini), Goutham Belliappa (Capgemini), David Freeman (Pentaho)
Tasked with improving engagement and data integrity with emphasis on a self-serve framework, Sears Hometown and Outlet (SHO) forged ahead along their journey in Big Data. With the help of Pentaho and Capgemini, SHO has transitioned from costly and rigid legacy systems to a dynamic, company owned/managed system. . . Read more.
Add to your personal schedule
1:30pm–2:10pm Thursday, 02/19/2015
Hadoop & Beyond
Location: 230 C
Eric Schmidt (Google)
Average rating: ***..
(3.71, 7 ratings)
MapReduce, MillWheel and other technologies changed the way data scientists approached data problems. New technologies like Spark and Cloud Dataflow deal with the complexity of stringing together MapReduce jobs and creating end-to-end programming logic from multiple steps by making Big Data into a concrete set of executable operations. Gain insights into programming options and what comes next. Read more.
Add to your personal schedule
1:30pm–2:10pm Thursday, 02/19/2015
Sponsored
Location: LL21 A
Emma McGrattan (Actian)
Average rating: *****
(5.00, 1 rating)
In this session you will hear of some of the fascinating use cases for SQL in Hadoop based on real-world customer examples. You will learn some of the innovative techniques that have emerged to overcome limitations of the Hadoop platform that enable features one expects in a proven mature database. Read more.

1:50pm

Add to your personal schedule
1:50pm–2:10pm Thursday, 02/19/2015
Data Science
Location: LL20 A
Jake Klamka (Insight), Kathy Copic (Insight Data Science)
Average rating: ****.
(4.00, 3 ratings)
Scientists make the best data scientists. Yet there is a skills gap between quantitative data analysis done in a research context and data science in industry. The Data Science Fellows Program has helped over 150 PhDs make the transition. In this session, its founder will share lessons learned in bridging that gap, and the lessons that can be applied to building data science teams. Read more.
Add to your personal schedule
1:50pm–2:10pm Thursday, 02/19/2015
Business & Industry
Location: LL20 BC
Adam Jorgensen (Pragmatic Works)
Average rating: ***..
(3.71, 7 ratings)
Retail buyers are the backbone of the industry’s profitability. These individuals drive organizational goals with their performance. Many decisions are made by intuition and “gut” feeling, where predictive analytics would have made significant improvements in outcomes. This session takes real-world experiences and shows how to transform retail performance through data-driven buying decisions. Read more.

2:20pm

Add to your personal schedule
2:20pm–3:00pm Thursday, 02/19/2015
Ask Us Anything
Location: 211 B
Andreas Mueller (NYU, scikit-learn), Jennifer Klay (Cal Poly San Luis Obispo), Peter Wang (Continuum Analytics, Inc.), Travis Oliphant (Continuum Analytics, Inc.), Andy Terrel (Fashion Metric), Matthew Rocklin (Continuum), William McKinney (Cloudera), Stefan van der Walt (UC Berkeley), Kyle Kelley (Rackspace), Jonathan Frederic (IPython)
Average rating: *****
(5.00, 1 rating)
Join the presenters of the PyData Tutorials for further discussions on some of the most used tools in the Python data stack. This is a great opportunity to ask questions and share insight with those who have authored or contributed to: * scikit-learn * NumPy * Bokeh * IPython * Numba * Blaze * pandas * scikit-image Read more.
Add to your personal schedule
2:20pm–2:40pm Thursday, 02/19/2015
Data Science
Location: LL20 A
Michelangelo D'Agostino (Civis Analytics)
Average rating: ****.
(4.50, 2 ratings)
If we want to use data to understand human behavior and to design successful interventions to change that behavior, social scientists and data scientists will need to work together. However, the two often approach problems differently and speak strikingly different languages. This talk will present success stories and tips for productive collaboration between social scientists and data scientists. Read more.
Add to your personal schedule
2:20pm–2:40pm Thursday, 02/19/2015
Business & Industry
Location: LL20 BC
Lu Cheng (Airbnb), Lisa Qian (Airbnb)
Average rating: ****.
(4.71, 7 ratings)
According to industry research, only half of travelers today know exactly where they want to go before planning a trip. This session will provide a holistic view on some of our recent personalization products aimed at inspiring travel. We will start with the product vision, describe the powerful algorithms deployed, and finally explain how we evaluated the long term effects of our product. Read more.
Add to your personal schedule
2:20pm–3:00pm Thursday, 02/19/2015
Sponsored
Location: LL20 D
Fintan Quill (Kx Systems Inc.), Doug Talbott (Bedarra Research Labs)
Average rating: ****.
(4.33, 3 ratings)
One of the first industries to invest heavily in Big Data analytics was financial services, where firms have been pushing the boundaries on speed and scale in dynamically processing large volumes of structured market data for the past twenty years to gain competitive advantage. . . Read more.
Add to your personal schedule
2:20pm–3:00pm Thursday, 02/19/2015
Law, Ethics & Open Data
Location: LL21 B
Alysa Z. Hutnik (Kelley Drye & Warren LLP), Lauri Mazzuchetti (Kelley Drye)
Average rating: ***..
(3.00, 2 ratings)
Privacy laws governing a company’s obligations on data collection, use, and disclosure are changing rapidly. Failing to understand how the laws affect a company’s personal data assets can result in media exposés, regulatory investigations, Congressional hearings and lawsuits. This session will provide guidance on “privacy by design” compliance and practical tips to avoid becoming a target of scrutiny. Read more.
Add to your personal schedule
2:20pm–3:00pm Thursday, 02/19/2015
Enterprise Adoption
Location: LL21 C/D
Steven Beeckman (Ministry of Defence of Belgium)
Average rating: ***..
(3.33, 3 ratings)
While cutting-edge startups use Spark to see their data analysed in real time, older and bigger organisations still struggle to share their data in a structural way between their HR, finance and operations departments. This talk will discuss how the Belgian MoD opened its data using open source tools and is becoming more and more data-driven. Read more.
Add to your personal schedule
2:20pm–3:00pm Thursday, 02/19/2015
Design & Interfaces
Location: LL21 E/F
Etan Lightstone (New Relic)
Average rating: ***..
(3.60, 5 ratings)
As Director of UX Design at software analytics company New Relic, my core focus is trying to present the over 200 billion data points across more than three million applications we monitor in a way that provides meaning so customers can make good decisions for their business. Today I’ll share some of what I’ve learned along the way. Read more.
Add to your personal schedule
2:20pm–3:00pm Thursday, 02/19/2015
Hadoop in Action
Location: 210 A/E
Tigran Khrimian (FINRA)
Average rating: ****.
(4.25, 4 ratings)
FINRA, an independent regulator charged with protecting investors, processes 30 billion market events per day and analyzes the data in search of patterns that indicate possible manipulation of US financial markets. This talk provides an overview of FINRA's Big Data architecture behind that process. Read more.
Add to your personal schedule
2:20pm–3:00pm Thursday, 02/19/2015
Sponsored
Location: 210 D/H
Average rating: **...
(2.00, 1 rating)
This talk discusses how to do realtime analytics with a SQL-like query language. We will discuss the role of Complex Event Processing in realtime analytics, and then discuss a scalable CEP engine that lets users write their queries in a declarative, SQL-like CEP query language but executes those queries using a graph of CEP nodes deployed on top of Apache Storm. Read more.
Add to your personal schedule
2:20pm–3:00pm Thursday, 02/19/2015
Hadoop Platform
Location: 210 B/F
Sumeet Singh (Yahoo), Thiruvel Thirumoolan (Yahoo!, Inc.)
Average rating: ****.
(4.50, 2 ratings)
Hadoop has allowed us to move towards a unified source of truth for all of an organization's data. Managing data location, schema knowledge and evolution, fine-grained business-rules-based access control, and audit and compliance needs will become critical with increasing scale of operations. In this talk, we will share an approach to tackling these challenges with a data discovery tool. Read more.
Add to your personal schedule
2:20pm–3:00pm Thursday, 02/19/2015
Spark in Action
Location: 210 C/G
Cliff Click (0xdata), Michal Malohlava (0xdata, Inc)
Average rating: ***..
(3.77, 13 ratings)
H2O's powerful machine learning algorithms, coupled with Spark's SQL and Scala data munging, make a potent combination for solving real-world problems. Read more.
Add to your personal schedule
2:20pm–3:00pm Thursday, 02/19/2015
Machine Data / IoT
Location: 230 A
Anant Jhingran (Apigee)
Average rating: ****.
(4.00, 1 rating)
In this session, Apigee VP of products Anant Jhingran will discuss how the combination of APIs and data is leading to the next generation of adaptive apps. Read more.
Add to your personal schedule
2:20pm–3:00pm Thursday, 02/19/2015
Sponsored
Location: 230 B
Andy Palmer (Tamr, Inc.)
Average rating: *****
(5.00, 2 ratings)
As IT and big data/analytics investments increase, so do data silos. To get full value from these investments, businesses must embrace the variety of data silos - now. Current top-down approaches are tapped out. Innovations in data unification can overcome silos virtually, delivering 360-degree views of customers, long-tail opportunities in supply chains, and other business opportunities. Read more.
Add to your personal schedule
2:20pm–3:00pm Thursday, 02/19/2015
Hadoop & Beyond
Location: 230 C
Jacques Nadeau (Dremio)
Average rating: ****.
(4.75, 8 ratings)
I will talk about how Drill achieves high performance with flexibility and ease of use. Includes: First read planning and statistics. Flexible code generation depending on workload. Code optimization and planning techniques. Dynamic schema subsets. Advanced memory use and moving between Java and C. Making a static typing appear dynamic through any-time and multi-phase planning. Read more.
Add to your personal schedule
2:20pm–3:00pm Thursday, 02/19/2015
Sponsored
Location: LL21 A
Matt Ingenthron (Couchbase, Inc.), Justin Michaels (Couchbase), Michael Kehoe (LinkedIn)
Average rating: ****.
(4.00, 1 rating)
Justin Michaels of Couchbase will provide an overview of the use case and review how this is handled within Couchbase while providing real-time access to user data. Matt Ingenthron of Couchbase will talk about key features of the underlying components to enable processing at the scale required by deployments such as AT&T and PayPal. Read more.

2:40pm

Add to your personal schedule
2:40pm–3:00pm Thursday, 02/19/2015
Data Science
Location: LL20 A
Noelle Sio (Pivotal)
Average rating: ****.
(4.33, 3 ratings)
While many companies use data science to increase profits, some nonprofits are using it to save lives! Crisis Text Line connects teens in crisis to counselors via text message, and recently partnered with DataKind and Pivotal on a pro bono project to more quickly route teens to help. Go behind the scenes to learn how they came together to make an impact and how you can too! Read more.
Add to your personal schedule
2:40pm–3:00pm Thursday, 02/19/2015
Business & Industry
Location: LL20 BC
Dave Holtz (Airbnb)
Average rating: ****.
(4.71, 7 ratings)
In a 2013 New York Times column, Thomas Friedman claimed that “Airbnb’s real innovation is not online rentals. It’s ‘trust’.” This session will discuss recent experiments conducted at Airbnb to improve the frequency and honesty of reviews, and the methods used to evaluate changes in the quality of subsequent reviews and the impact of these changes on other key business metrics. Read more.

4:00pm

Add to your personal schedule
4:00pm–4:40pm Thursday, 02/19/2015
Data Science
Location: LL20 A
Average rating: **...
(2.00, 4 ratings)
When building a real-time data application, we must decide what tradeoffs are permissible without eroding core functionality. As the purposes of data applications become more complex and the data stores analyzed grow, maintaining integrity and speed becomes increasingly difficult. Read more.
Add to your personal schedule
4:00pm–4:40pm Thursday, 02/19/2015
Business & Industry
Location: LL20 BC
Moderated by:
Alistair Croll (Solve For Interesting)
Panelists:
Jeremy Edberg (CloudNative), Jerry Overton (CSC), Tatsiana Maskalevich (Stitch Fix), Anne Johnson (Credit Suisse)
Average rating: ***..
(3.80, 10 ratings)
Ruthless optimization squeezes every ounce of advantage from the current business model. But it takes a leap of faith—not something the numbers tend to encourage—to truly innovate. When we’re informed by data, are we blinded by opportunity? Or does data pave the way for the best innovations, forcing us to take a harder look at bad ideas that will never work out? Read more.
Add to your personal schedule
4:00pm–4:40pm Thursday, 02/19/2015
Sponsored
Location: LL20 D
Jeff Pollock (Oracle)
In this session you’ll learn about how to apply Data Discovery and Deep Data Storage for new breakthroughs in data warehousing. We’ll discuss the benefits of using Hadoop technologies like Spark, Kafka, and Hive together with enterprise information architecture and data governance best practices. Read more.
Add to your personal schedule
4:00pm–4:40pm Thursday, 02/19/2015
Law, Ethics & Open Data
Location: LL21 B
Pankaj Mathur (Acxiom)
Average rating: *....
(1.00, 2 ratings)
Companies using data, especially those deploying analytics-driven workflows, are challenged to find the right mix of first-party and third-party data. Many of the challenges stem from a lack of clarity about data sources and their reliability, privacy laws, and the logistics needed for mass-scale data aggregation. Read more.
Add to your personal schedule
4:00pm–4:40pm Thursday, 02/19/2015
Enterprise Adoption
Location: LL21 C/D
Martin Waterhouse (Chevron)
Average rating: **...
(2.00, 1 rating)
Efficiency, cost effectiveness, organizational capability, corporate standards, risk aversion, shareholder returns, innovation and talent management, are stated essential ingredients for any large enterprise. When it comes to today's challenge of obtaining, engaging, developing and retaining dynamic technical talent there are few LESS appealing places to seek employment. Read more.
Add to your personal schedule
4:00pm–4:40pm Thursday, 02/19/2015
Design & Interfaces
Location: LL21 E/F
Ari Gesher (Palantir Technologies), James Thompson (Palantir Technologies)
Average rating: ****.
(4.90, 10 ratings)
From its inception, Palantir Technologies has been about integrating the best of big data technology into systems that enable subject matter experts (as opposed to data scientists or programmers) to move through huge volumes of data and do their own data analysis. Equal parts data system design and UX, we break down the design of building systems usable by mere mortals. Read more.
Add to your personal schedule
4:00pm–4:40pm Thursday, 02/19/2015
Hadoop in Action
Location: 210 A/E
John Carnahan (Ticketmaster)
Average rating: ****.
(4.57, 7 ratings)
We will describe how we have used Storm, stream processing and machine-learned classifiers to improve access to tickets during onsales and how this can extend to similar recipes for trend prediction and anomaly detection. We will also describe how we use tools such as Kafka, Storm and HBase to build an optimal solution for real-time “n-squared” marketing. Read more.
Add to your personal schedule
4:00pm–4:40pm Thursday, 02/19/2015
Sponsored
Location: 210 D/H
Bill Schmarzo (EMC Consulting)
Average rating: *****
(5.00, 1 rating)
CIOs and business executives alike are looking for ways to mine the potential value of their customer, product and operational data as they consider where and how to start their Big Data journey. What are the organizational ramifications of big data? How can CIOs foster a culture of data-driven decision-making? How can the data lake support an organization’s business transformation efforts? Read more.
Add to your personal schedule
4:00pm–4:40pm Thursday, 02/19/2015
Hadoop Platform
Location: 210 B/F
Kathleen Ting (Cloudera), Miklos Christine (Databricks)
Average rating: ***..
(3.50, 2 ratings)
The next generation of MapReduce, YARN, has widely touted job throughput and Apache Hadoop cluster utilization benefits. Less known are the pitfalls littering the migration path to YARN. Learn from our extensive field experience to avoid those pitfalls and get your YARN cluster configured right the first time. Read more.
Add to your personal schedule
4:00pm–4:40pm Thursday, 02/19/2015
Spark in Action
Location: 210 C/G
Tathagata Das (Databricks)
Average rating: ****.
(4.17, 6 ratings)
Spark Streaming extends Apache Spark to do large-scale stream processing. In this talk, I will explain how Spark Streaming is revolutionizing the way Big "Streaming" Data applications are written, making it as easy as 1-2-3. Read more.
Add to your personal schedule
4:00pm–4:40pm Thursday, 02/19/2015
Machine Data / IoT
Location: 230 A
Kirk Borne (George Mason University)
Average rating: ***..
(3.67, 3 ratings)
I will introduce the USA’s next big astronomy project (LSST) and describe how this telescope requires massive data stream analytics – to discover and respond to exotic rapidly changing events in the Universe. I will discuss parallels between big data astronomy and Decision Science-as-a-Service for Business, Cybersecurity Information and Event Management, and Marketing Automation using Hadoop. Read more.
Add to your personal schedule
4:00pm–4:40pm Thursday, 02/19/2015
Sponsored
Location: 230 B
George Corugedo (RedPoint Global Inc.)
Learn how Hadoop 2.0 and its YARN architecture can make a serious impact on the previously intractable problem of data quality and serve as a super-charged marshaling area for accessing, cleansing and delivering high-quality data. Read more.
Add to your personal schedule
4:00pm–4:40pm Thursday, 02/19/2015
Hadoop & Beyond
Location: 230 C
Michael Stonebraker (Tamr, Inc.)
Average rating: ***..
(3.88, 8 ratings)
The explosion of internal data sources, data “lakes” (e.g., Hadoop), external public data sources, and feeds from the Internet of Things is creating a tsunami of diverse data sources for enterprises to leverage. Top-down data-integration and data-scientist tools won’t scale to meet integration demands. Learn how a scalable data curation platform can help enterprises with data integration at scale. Read more.
Add to your personal schedule
4:00pm–4:40pm Thursday, 02/19/2015
Sponsored
Location: LL21 A
Greg Goldsmith (Attivio)
In this session Greg will share his insights on this gap from his years of experience in the visual exploratory data discovery & advanced analytics space working with customers and most of the major players in the Big and small data management ecosystem. Read more.

4:50pm

Add to your personal schedule
4:50pm–5:30pm Thursday, 02/19/2015
Data Science
Location: LL20 A
Brian Granger (Cal Poly San Luis Obispo), Fernando Pérez (University of California at Berkeley), Kyle Kelley (Rackspace)
Average rating: ****.
(4.00, 7 ratings)
The Jupyter/IPython Notebook is a web-based interactive computing platform for Data Science in Python, Julia, R and other languages. In this talk we will describe our recent work to bring the Notebook to larger groups of users, both on the open web and within organizations. The focus will be on new collaboration features and deploying the Notebook in these contexts. Read more.
Add to your personal schedule
4:50pm–5:30pm Thursday, 02/19/2015
Business & Industry
Location: LL20 BC
Michael Dauber (Amplify), Matthew Ocko (Data Collective), Max Gazor (Charles River Ventures), Cack Wilhelm (Scale Venture Partners), Arif Janmohamed (Lightspeed Venture Partners)
Average rating: ****.
(4.50, 4 ratings)
To anticipate who will succeed and invest wisely, investors spend a lot of time trying to understand the longer-term trends within an industry. In this panel discussion, we’ll consider the big trends in Big Data, asking top-tier VCs to look over the horizon and discuss the visions they have for two or more years in the future. Read more.
Add to your personal schedule
4:50pm–5:30pm Thursday, 02/19/2015
Sponsored
Location: LL20 D
Rado Kotorov (Information Builders)
Average rating: ***..
(3.50, 2 ratings)
This session uses actual case studies to illustrate how organizations are innovating, changing and growing their businesses with Big Data. The presentation will discuss the data requirements and the front end analytic applications used to deliver game changing Big Data initiatives. . . Read more.
Add to your personal schedule
4:50pm–5:30pm Thursday, 02/19/2015
Business & Industry
Location: LL21 B
Scott Donaldson (FINRA)
In 2014, FINRA developed a new system to analyze the complex linkages between orders and trades in the US equities capital markets. This session will highlight the outcomes of the big data solution that allowed FINRA’s analysts to more efficiently conduct regulatory reviews and improve accessibility to over a trillion market events, and highlight the lessons learned during the implementation. Read more.
Add to your personal schedule
4:50pm–5:30pm Thursday, 02/19/2015
Enterprise Adoption
Location: LL21 C/D
Pamela Peele (UPMC)
Average rating: ****.
(4.14, 7 ratings)
Big data is the sexy new frontier for many businesses but it’s expensive to stand up in an organization and expensive to buy from an external vendor. What is the most fundamental way to demonstrate that data science matters to the organization? This session covers the meaningful data consumption metric that every data science group needs to track. Read more.
Add to your personal schedule
4:50pm–5:30pm Thursday, 02/19/2015
Design & Interfaces
Location: LL21 E/F
Julie Rodriguez (Sapient Global Markets)
Average rating: ****.
(4.57, 7 ratings)
Designing data visualizations presents us with unique and interesting challenges: how to tell a compelling story; how to deliver important information in a forthright, clear format; and how to make visualizations beautiful and engaging. In this talk, Julie will share a few disruptive designs and connect those back to vizipedia, her compiled data visualization library. Read more.
Add to your personal schedule
4:50pm–5:30pm Thursday, 02/19/2015
Hadoop in Action
Location: 210 A/E
Nick Curcuru (MasterCard Advisors), Craig Duncan (MasterCard), Craig Hibbeler (MasterCard Advisors)
Average rating: **...
(2.20, 5 ratings)
In this session, attendees will gain an understanding of the technology and processes crucial to delivering a secure platform. Additionally, they’ll benefit from insights on how to ensure their organization’s Hadoop environment complies with stringent security requirements. Recent implementations of compliance programs will be highlighted as part of the discussion. Read more.
Add to your personal schedule
4:50pm–5:30pm Thursday, 02/19/2015
Sponsored
Location: 210 D/H
Mike Flannagan (Cisco)
Organizations are experiencing unprecedented complexity in managing their data, with the rise of Big Data, Cloud and overall hyper connectivity of our world. Cisco is building solutions to help our customers adopt Big Data solutions, solve business problems using Analytics, and harness the power of an intelligent infrastructure to provide highly differentiated Data and Analytics solutions. Read more.
Add to your personal schedule
4:50pm–5:30pm Thursday, 02/19/2015
Hadoop Platform
Location: 210 B/F
Yuliya Feldman (MapR Technologies)
Average rating: *....
(1.50, 2 ratings)
The good news: Hadoop has a lot of tools. The bad news: Hadoop has a lot of tools, and conflicting priorities. This talk shows how advances in YARN and Mesos allow you to run multiple distinct workloads together. We show how to use SLA and latency rules along with preemption in YARN to maintain high throughput while guaranteeing latency for applications such as HBase and Drill. Read more.
Add to your personal schedule
4:50pm–5:30pm Thursday, 02/19/2015
Spark in Action
Location: 210 C/G
LianHui Wang (Tencent)
Average rating: ***..
(3.00, 3 ratings)
In this talk, we introduce the general data architecture of Tencent with a focus on our Spark use cases on a GAIA (our improved resource manager based on YARN) cluster of 8000+ nodes. We contrast Spark with the previous MapReduce use cases, followed by tuning methods and optimizations for large scale clusters. Read more.
Add to your personal schedule
4:50pm–5:30pm Thursday, 02/19/2015
Machine Data / IoT
Location: 230 A
Jeremy Heffner (Azavea)
Average rating: ****.
(4.83, 6 ratings)
We often need to analyze counts of discrete events that occur at a specific time and place, whether they be crime events, taxi requests, or phone calls. Forecasting these space-time events brings particular challenges: finding suitable tools for geographic processing and techniques for modeling the data. The session will cover the lessons learned in building such a system. Read more.
Add to your personal schedule
4:50pm–5:30pm Thursday, 02/19/2015
Sponsored
Location: 230 B
Josh Byrd (GoPro), Darren Chinen (GoPro)
Average rating: ****.
(4.75, 8 ratings)
In this session, GoPro discusses how transforming the extreme volume and variety of datasets landing in its data lake into usable formats for analysis tools or predictive modeling demands improving its ability to overcome the current technical and human bottlenecks that typically limit the productivity of these efforts. Read more.
Add to your personal schedule
4:50pm–5:30pm Thursday, 02/19/2015
Hadoop & Beyond
Location: 230 C
Randy Guck (Dell Software)
Average rating: *****
(5.00, 1 rating)
Not all big data problems require big cluster solutions. Doradus OLAP compresses data into compact shards, yielding fast analytical queries using little disk even for big data sets. Learn how Doradus leverages OLAP techniques, columnar storage, and Cassandra to yield sophisticated query features while using amazingly little disk space. Read more.

5:30pm

Add to your personal schedule
5:30pm–7:00pm Thursday, 02/19/2015
Events
Location: Expo Hall (Hall 1,2,3)
Average rating: *****
(5.00, 2 ratings)
Quench your thirst with vendor-hosted libations and snacks while you check out all the cool stuff in the Expo Hall. Read more.

7:00pm

Add to your personal schedule
7:00pm–9:00pm Thursday, 02/19/2015
Events
Location: City National Civic
Average rating: *****
(5.00, 2 ratings)
Join Data After Dark on our World Tour! Celebrate the global reach of Strata + Hadoop World as we pay homage to some of big data’s biggest markets. Read more.

Friday, 02/20/2015

8:45am

Add to your personal schedule
8:45am–8:50am Friday, 02/20/2015
Keynotes
Location: Grand Ballroom 220
Alistair Croll (Solve For Interesting), Doug Cutting (Cloudera), Roger Magoulas (O'Reilly Media)
Average rating: **...
(2.50, 4 ratings)
Program Chairs Roger Magoulas, Doug Cutting, and Alistair Croll welcome you to the second day of Strata + Hadoop World keynotes. Read more.

8:50am

Add to your personal schedule
8:50am–9:00am Friday, 02/20/2015
Keynotes
Location: Grand Ballroom 220
Eddie Garcia (Cloudera)
Average rating: **...
(2.52, 21 ratings)
Open data is quickly gaining momentum, and when applied as data for good, it becomes a much more powerful concept that we should all consider as good data stewards. Everyone from organizations to cities is starting to share data, such as traffic conditions or climate sensor readings, and allowing others to use this open data to improve quality of life. Read more.

9:00am

Add to your personal schedule
9:00am–5:00pm Friday, 02/20/2015
Training
Location: 211 A
Dustin Clute (Cloudera), Michael Judd (Cloudera)
Average rating: **...
(2.00, 1 rating)
Cloudera University’s four-day course for designing and building Big Data applications prepares you to analyze and solve real-world problems using Apache Hadoop and associated tools in the enterprise data hub. Read more.
Add to your personal schedule
9:00am–9:05am Friday, 02/20/2015
Keynotes
Location: Grand Ballroom 220
Alistair Croll (Solve For Interesting)
Average rating: ****.
(4.17, 24 ratings)
Roughly every decade, some kind of military or enterprise technology makes its way into the mainstream: the personal computer; the consumer Internet; the mobile phone; the Internet of Things. What happens when Big Data turns into a consumer product? Strata chair Alistair Croll offers some speculation about what data will do to the way we live, love, work, and play. Read more.

9:05am

Add to your personal schedule
9:05am–9:10am Friday, 02/20/2015
Keynotes, Sponsored
Location: Grand Ballroom 220
Michael Greene (Intel)
Average rating: **...
(2.20, 25 ratings)
The exponential growth of digitally stored data and the transition of data science from academia to real world applications hold the promise of improving nearly every aspect of our lives. Read more.

9:10am

Add to your personal schedule
9:10am–9:25am Friday, 02/20/2015
Keynotes
Location: Grand Ballroom 220
Eden Medina (Indiana University, Bloomington)
Average rating: ****.
(4.23, 44 ratings)
We are often told that the past holds lessons on how to approach the present, but we rarely look to older technologies for inspiration. Rarer still do we look at the historical experiences of less industrialized nations to teach us about the technological problems of today. Read more.

9:25am

Add to your personal schedule
9:25am–9:35am Friday, 02/20/2015
Keynotes
Location: Grand Ballroom 220
Matei Zaharia (Databricks)
Average rating: ****.
(4.11, 37 ratings)
As the Apache Spark userbase grows, the developer community is working to adapt it for ever-wider use cases. 2014 saw fast adoption of Spark in the enterprise and major improvements in its performance, scalability and standard libraries. Read more.

9:35am

Add to your personal schedule
9:35am–9:40am Friday, 02/20/2015
Keynotes, Sponsored
Location: Grand Ballroom 220
Moderated by:
Roman Shaposhnik (Pivotal Inc.)
Panelists:
Average rating: **...
(2.24, 33 ratings)
In the wake of the Open Data Platform initiative announced earlier this week, Roman Shaposhnik, Director of Open Source strategy at Pivotal and a VP of Apache Software Foundation Incubator, will talk about how a well-defined, fully validated ODP common core platform is going to address some of the biggest customer pain points around rapid evolution and standardization in the big data area. Read more.

9:40am

Add to your personal schedule
9:40am–9:50am Friday, 02/20/2015
Keynotes, Sponsored
Location: Grand Ballroom 220
Joseph Sirosh (Microsoft)
Average rating: ****.
(4.67, 33 ratings)
Join Microsoft’s Joseph Sirosh for a surprising conversation about a farmer's dilemma, a professor's ingenuity and how cloud, data and devices came together to fundamentally re-imagine an age old way of doing business. Read more.

9:50am

Add to your personal schedule
9:50am–10:00am Friday, 02/20/2015
Keynotes
Location: Grand Ballroom 220
Jeffrey Heer (Trifacta | University of Washington)
Average rating: ****.
(4.39, 36 ratings)
Keynote with Jeffrey Heer, Co-Founder, Trifacta. Read more.

10:40am

Add to your personal schedule
10:40am–12:40pm Friday, 02/20/2015
Data Science
Location: 211 C
Get certified as a Spark Developer at Strata + Hadoop World in San Jose. Read more.
Add to your personal schedule
10:40am–11:20am Friday, 02/20/2015
Data Science
Location: LL20 A
Irina Borisova (Chegg), Asim Mathur (eBay)
Average rating: ****.
(4.50, 6 ratings)
In this talk we address two aspects of machine translation development at eBay: leveraging huge amounts of transactional and behavioral data for development and evaluation of our MT systems; and adapting evaluation metrics to reflect the eBay buyer experience, measuring translation quality and its impact on the shopping experience of our international users. Read more.
Add to your personal schedule
10:40am–11:20am Friday, 02/20/2015
Business & Industry
Location: LL20 BC
Moderated by:
Cornelia Levy-Bencheton (CLB Strategic Consulting, LLC.)
Panelists:
Michele Chambers (RapidMiner), Alice Zheng (Dato), Neha Narkhede (Confluent)
Average rating: *****
(5.00, 2 ratings)
The future is all about information. It will belong to those who can find it, understand it and know how to use it. In this panel discussion, we explore evidence-based benefits of welcoming more women into the tech community and of increasing female talent power on work teams. Read more.
Add to your personal schedule
10:40am–11:20am Friday, 02/20/2015
Sponsored
Location: LL20 D
Scott Gray (IBM), Adriana Zubiri (IBM)
This session covers a number of the challenges that Hadoop presents to efficient query processing and discusses a number of the novel approaches that modern SQL-on-Hadoop solutions take in order to overcome these hurdles. Read more.
Add to your personal schedule
10:40am–11:20am Friday, 02/20/2015
Design & Interfaces
Location: LL21 B
Tye Rattenbury (Trifacta), Jeffrey Heer (Trifacta | University of Washington)
Average rating: ****.
(4.50, 4 ratings)
The ability of software to recognize patterns in usage, data, or other inputs to improve a user’s experience & productivity is an expected attribute of modern software. Trifacta’s Jeffrey Heer and Tye Rattenbury discuss design and software architecture principles for creating intelligent software that incorporates learning to make the process of transforming data more intuitive and efficient. Read more.
Add to your personal schedule
10:40am–11:20am Friday, 02/20/2015
Connected World
Location: LL21 C/D
Oliver Mainka (SAP Labs LLC)
Average rating: ****.
(4.50, 2 ratings)
In Predictive Maintenance and Service, the makers of assets (as in automotive) or the operators of assets (as in mining or manufacturing) bring together machine sensor data and maintenance data to better understand when and why machines fail, and also to predict future failures and needed business activities. This presentation gives an overview of the topic and what SAP customers have done. Read more.
Add to your personal schedule
10:40am–11:20am Friday, 02/20/2015
Machine Data / IoT
Location: LL21 E/F
Average rating: ****.
(4.00, 1 rating)
Sensor devices are proliferating fast now that manufacturing price is under $100. And while more data is being generated, more is also being thrown out because of the resource gap between compute and network. We must make new computational trade-offs whilst ensuring quality. In this talk, we discuss these trade-offs and examine architectures for peer-based and crowd-sourced model generation. Read more.
Add to your personal schedule
10:40am–11:20am Friday, 02/20/2015
Hadoop & Beyond
Location: 210 A/E
Kurt Brown (Netflix)
Average rating: ****.
(4.83, 18 ratings)
The Netflix Data Platform is a constantly evolving, large scale infrastructure running in the (AWS) cloud. We are especially focused on performance and ease of use, with initiatives including Presto integration, Spark, and our Big Data Portal and API. This talk will dive into the various technologies we use, the motivations behind our approach, and the business benefits we get. Read more.
Add to your personal schedule
10:40am–11:20am Friday, 02/20/2015
Sponsored
Location: 210 D/H
Joseph Sirosh (Microsoft)
Average rating: ****.
(4.60, 5 ratings)
Armed with just a browser, data scientists can develop sophisticated machine learning models, and deploy them in a few clicks in cloud-hosted APIs that can be called from any device. The APIs scale elastically to power high volume intelligent apps for phones, websites and the internet of things. . . Read more.
Add to your personal schedule
10:40am–11:20am Friday, 02/20/2015
Hadoop in Action
Location: 210 B/F
Gwen Shapira (Confluent)
Average rating: ****.
(4.43, 7 ratings)
Organizations do not store, process and analyze data for their amusement. They plan to use the data to drive business decisions. If data validity is uncertain, the data is useless for decision making. In this session we will show how to design architectures that allow you to prove and improve data validity at every step of the decision making process. Read more.
Add to your personal schedule
10:40am–11:20am Friday, 02/20/2015
Hadoop Platform
Location: 210 C/G
Nasser Manesh (Altiscale, Inc.)
Average rating: ****.
(4.00, 1 rating)
In this from-the-trenches, DevOps-focused talk we explore operational issues in running Hadoop on top of Docker containers in a production, multi-tenant setup. With Hadoop's native Docker support still in the works and Docker being more of a development tool, a production deployment of the two together is like swimming in treacherous waters... Here's a lantern and a lifeboat to the rescue. Read more.
Add to your personal schedule
10:40am–11:20am Friday, 02/20/2015
Security
Location: 230 A
Ram Shankar Siva Kumar (Microsoft (Azure Security Data Science)), Marco Di Placido (O365 Security Signals)
The audience will learn novel ways of using ranking algorithms in intrusion detection systems and how to provide consistent security monitoring in a constantly changing environment. Finally, data scientists will walk away with an actionable framework for testing their system even when labelled attack data is lacking. Read more.
Add to your personal schedule
10:40am–11:20am Friday, 02/20/2015
Sponsored
Location: 230 B
Nitesh Ambastha (Credit Suisse), David Brewster (Paxata), Nenshad Bardoliwalla (Paxata)
Average rating: ****.
(4.00, 1 rating)
In this lively, technical session, Nitesh Ambastha, Global Head of Data IT, Private Banking & Wealth Management Products at Credit Suisse, talks about what his organization demands from vendors who sell data preparation, data quality and governance technologies. Read more.
Add to your personal schedule
10:40am–11:20am Friday, 02/20/2015
Spark in Action
Location: 230 C
Dean Wampler (Lightbend)
Average rating: ****.
(4.40, 5 ratings)
Spark is an open-source computation platform for Big Data. All the major Hadoop vendors have embraced Spark as a replacement for MapReduce, because Spark offers much better performance, a more powerful and productive API, and support for event processing. Spark's secrets for success are the Scala programming language and Functional Programming. We'll explore why. Read more.
Add to your personal schedule
10:40am–11:20am Friday, 02/20/2015
Sponsored
Location: LL21 A
Carter Shanklin (Hortonworks), Mostafa Mokhtar (Hortonworks)
Average rating: *****
(5.00, 2 ratings)
This session will examine Hive performance past, present and future. Read more.

11:30am

Add to your personal schedule
11:30am–12:10pm Friday, 02/20/2015
Ask Us Anything
Location: 211 B
Manu Mukerji (TiVo), John Akred (Silicon Valley Data Science), Stephen O'Sullivan (Silicon Valley Data Science), Tatsiana Maskalevich (Stitch Fix), Harrison Mebane (Silicon Valley Data Science)
What does successful big data and data science really look like? As consultants out in the field, we've learned a lot of lessons and have great stories to tell about success, failure, and how to negotiate a path through a fast-moving technology landscape. Read more.
Add to your personal schedule
11:30am–12:10pm Friday, 02/20/2015
Data Science
Location: LL20 A
Anuranjita Tewary (Intuit), Lucian Lita (Intuit), Jonathan Goldman (Intuit)
Average rating: ****.
(4.25, 4 ratings)
Data scientists navel gazing in a corner. Engineers not thinking, just refactoring. Product just making slides. That’s no way to build data products. Is it even possible to have them play well together, without promising free lunches, unlimited gummy bears, and a Red Bull IV? We share our experience about what worked and what didn’t, both in a startup and in a big company environment. Read more.
Add to your personal schedule
11:30am–12:10pm Friday, 02/20/2015
Business & Industry
Location: LL20 BC
Azarias Reda (Republican National Committee)
Average rating: ***..
(3.80, 5 ratings)
In the 2014 election cycle, the Republican National Committee spent a significant amount of resources on engineering and data science to help GOP Senate candidates across the country. As the first ever Chief Data Officer of the RNC, Azarias led this effort. In this talk, he will discuss some of the lessons learned helping the Republican Party use data and engineering to win the US Senate. Read more.
Add to your personal schedule
11:30am–12:10pm Friday, 02/20/2015
Sponsored
Location: LL20 D
Richard Caudle (DataSift)
This session will outline strategies for cost effectively turning enormous streams of Social Data into intelligence for use in any application. Read more.
Add to your personal schedule
11:30am–12:10pm Friday, 02/20/2015
Design & Interfaces
Location: LL21 B
Danyel Fisher (Microsoft Research), Miriah Meyer (University of Utah)
Average rating: ****.
(4.00, 3 ratings)
We call lots of things "data visualization," from a news interactive, to spreadsheets, to an infographic counting calories. These surface similarities hide deep differences in what it means to interact with data. In this talk, we cross disciplines, from data science to design, to enliven our techniques and encourage us to try new methods for creating visualizations. Read more.
Add to your personal schedule
11:30am–12:10pm Friday, 02/20/2015
Data Science
Location: LL21 C/D
Shai Fine (Intel)
Average rating: ****.
(4.00, 3 ratings)
We will introduce the concept of Machine Learning Building Blocks: elements that can be mapped into hardware and software primitives and patterns. We demonstrate the implications of this concept in designing some specific workloads. Finally, we look at the Workload Optimization Framework, which includes a comprehensive Machine Learning workload suite composed of sampled and constructed workloads. Read more.
Add to your personal schedule
11:30am–12:10pm Friday, 02/20/2015
Machine Data / IoT
Location: LL21 E/F
Joseph Adler (Confluent), Robert Johnson (Interana)
Average rating: ***..
(3.57, 7 ratings)
Thirty years ago, data warehouses revolutionized data storage at big companies, storing summarized data in a strict structure and making it possible to efficiently analyze data. We believe that modern technology lets you adopt a simpler and more powerful scheme to organize historical data: time-ordered raw event logs. In this session, we'll explain why raw data is better. Read more.
Add to your personal schedule
11:30am–12:10pm Friday, 02/20/2015
Hadoop & Beyond
Location: 210 A/E
Costin Leau (Elastic)
Average rating: ***..
(3.00, 4 ratings)
Search is more than typing words into a box. It's evolved into the backbone for today’s analytics demands, and an asset for businesses to ask the right questions to make sense of their data. This session will highlight how a versatile, agile search and analytics platform can give shape to data, and uncover the "uncommonly common" trends within, to make the right data-driven decisions. Read more.
Add to your personal schedule
11:30am–12:10pm Friday, 02/20/2015
Sponsored
Location: 210 D/H
Jon Bock (Snowflake)
Tens of thousands of simultaneous game players generate lots of data. For online game-maker Kixeye, that data provides insights that drive decisions about game play and monetization. In this session Kixeye and Snowflake will discuss Snowflake’s data warehouse cloud service and how Kixeye uses it to get data insight with the performance, elasticity, and flexibility made possible by the cloud. Read more.
Add to your personal schedule
11:30am–12:10pm Friday, 02/20/2015
Hadoop in Action
Location: 210 B/F
Marcel Kornacker (Cloudera)
Average rating: *****
(5.00, 1 rating)
In this talk, attendees will learn about Impala’s approach to on-the-fly, automatic data transformation, which, in conjunction with the ability to handle nested structures such as JSON and XML documents, addresses the needs of at-source analytics, including direct querying of your input schema, immediate querying of data as it lands in HDFS, and high performance on par with specialized engines. Read more.
Add to your personal schedule
11:30am–12:10pm Friday, 02/20/2015
Hadoop Platform
Location: 210 C/G
Ron Bodkin (Think Big Analytics)
Average rating: ****.
(4.50, 2 ratings)
YARN has featured in the marketing of Hadoop distributions for the past 2 years. All the major vendors now ship production versions. What is the real world state of deployment? What does it let you do? What are the limitations? In this talk we review three distinct deployments, look at benefits and challenges, and highlight lessons for those considering taking the plunge. Read more.
Add to your personal schedule
11:30am–12:10pm Friday, 02/20/2015
Security
Location: 230 A
David Freeman (LinkedIn)
Average rating: ***..
(3.71, 7 ratings)
LinkedIn's Security Data Science team is tasked with detecting bad activity on the LinkedIn site and building proactive solutions to keep it from happening in the first place. In this talk we'll explore various types of abuse we see at LinkedIn and discuss some of the solutions we've built to defend against it. Read more.
Add to your personal schedule
11:30am–12:10pm Friday, 02/20/2015
Sponsored
Location: 230 B
David Dobbins (Rackspace Hosting), Chris Lalonde (ObjectRocket)
Average rating: **...
(2.00, 2 ratings)
During this session, learn how you can rapidly deploy a modern data platform and watch a live demo that highlights how our easy-to-use control panels and APIs with simple bridges allow you to manage, integrate, and gain insights from your data environments in minutes. Read more.
Add to your personal schedule
11:30am–12:10pm Friday, 02/20/2015
Spark in Action
Location: 230 C
Patrick Wendell (Databricks)
Average rating: *****
(5.00, 3 ratings)
Apache Spark is a popular engine for large scale analytics. This talk will give insights into tuning and debugging a production Spark deployment. It will start with details about Spark internals and an overview of the runtime behavior of a Spark application. I'll explain how to diagnose performance bottlenecks and get the best performance out of Spark jobs. Read more.
Add to your personal schedule
11:30am–12:10pm Friday, 02/20/2015
Sponsored
Location: LL21 A
Ozgun Erdogan (Citus Data)
Average rating: *****
(5.00, 1 rating)
PostgreSQL has recently become the most popular database for technology companies. In part, it owes this success to rethinking the monolithic SQL database, and providing an extensible architecture instead. In this talk, we will describe key challenges associated with scaling out SQL. We will then show PostgreSQL extensions that overcome these challenges, and describe how they do so. Read more.

12:10pm

Add to your personal schedule
12:10pm–1:30pm Friday, 02/20/2015
Events
Location: Lunch - Expo Hall (Hall 1,2,3)
Birds of a Feather (BoF) sessions are informal roundtable discussions happening during lunch on Thursday, February 19 and Friday, February 20. You can join any BoF table or start your own with a topic of your choice. The BoF sign-up board will be near the Registration area. Read more.

1:30pm

Add to your personal schedule
1:30pm–1:50pm Friday, 02/20/2015
Data Science
Location: LL20 A
Harrison Mebane (Silicon Valley Data Science)
Average rating: *....
(1.67, 3 ratings)
This session will discuss how to build a resilient, multi-modal event-detection system based on error-prone sources—video, audio, natural language, and external APIs. We will briefly review event-detection techniques and then demonstrate how to combine these across multiple data streams. Read more.
Add to your personal schedule
1:30pm–1:50pm Friday, 02/20/2015
Business & Industry
Location: LL20 BC
Nima Sarshar (inPowered)
Average rating: ****.
(4.50, 2 ratings)
There is no shortage of opinions expressed across the Web on virtually any topic. This enables a diversity of voices to be heard, but often leaves users overwhelmed. We report on our implementation of a big data platform that identifies and ranks experts on a large number of topics. It allows users to cut through the noise and discover opinions expressed by credible experts in topics of interest. Read more.
Add to your personal schedule
1:30pm–1:50pm Friday, 02/20/2015
Connected World
Location: LL21 C/D
Matt Asay (Adobe)
Average rating: ****.
(4.33, 3 ratings)
Silicon Valley may be the center of Big Data technology production, but its application is having a far bigger impact on old-school industries like agriculture and brick-and-mortar retailing. This session will detail some of the world's most innovative applications from some of the world's oldest organizations. Read more.
Add to your personal schedule
1:30pm–2:10pm Friday, 02/20/2015
Machine Data / IoT
Location: LL21 E/F
Robert Grossman (University of Chicago)
Average rating: ****.
(4.60, 15 ratings)
Finding anomalies is essential for a wide range of applications, including cybersecurity, event detection and health and status monitoring. Anomaly techniques that scale successfully to large datasets tend to integrate machine learning with good data engineering. We discuss three case studies and extract eight techniques that have proved effective for detecting anomalies in large scale systems. Read more.
Add to your personal schedule
1:30pm–2:10pm Friday, 02/20/2015
Hadoop & Beyond
Location: 210 A/E
Jairam Ranganathan (Cloudera)
With hundreds of developers from a variety of organizations participating, Hadoop moves quickly. This talk will survey the important changes admins and users should be aware of and their impacts to various use cases. Read more.
Add to your personal schedule
1:30pm–2:10pm Friday, 02/20/2015
Sponsored
Location: 210 D/H
Rahul Pathak (Amazon Web Services)
Average rating: ***..
(3.00, 1 rating)
Join us as we explore the big data services of AWS. Watch a speaker-led tutorial and get a link to a lab that you can take with you. Read more.
Add to your personal schedule
1:30pm–2:10pm Friday, 02/20/2015
Hadoop in Action
Location: 210 B/F
Yanpei Chen (Cloudera), Karthik Kambatla (Cloudera)
Average rating: ***..
(3.00, 1 rating)
You will never look at SSDs the same after this presentation. We discuss how SSDs improve the performance of MapReduce workloads. We identify cost-per-performance as a more pertinent metric than cost-per-capacity when evaluating SSDs versus HDDs for performance, and quantify that SSDs can achieve up to 70% higher performance for 2.5x higher cost-per-performance. Read more.
Add to your personal schedule
1:30pm–2:10pm Friday, 02/20/2015
Hadoop Platform
Location: 210 C/G
Julien Le Dem (Dremio)
Average rating: *....
(1.40, 5 ratings)
Parquet is a columnar format designed to be efficient and interoperable across the Hadoop ecosystem. Its integration in most processing frameworks and serialization models makes it easy to use in existing ETL and processing pipelines, while giving flexibility of choice on the query engine. Read more.
Add to your personal schedule
1:30pm–2:10pm Friday, 02/20/2015
Security
Location: 230 A
Terence Spies (Voltage Security)
This talk by Voltage Security CTO Terence Spies presents options for securing data and speeding Hadoop implementation. Attendees will leave with strategies to de-risk Hadoop implementations in multi-platform Enterprise environments. Read more.
Add to your personal schedule
1:30pm–2:10pm Friday, 02/20/2015
Sponsored
Location: 230 B
Anand Venugopal (Impetus Technologies Inc.)
This talk will address an emerging problem in the Modern Enterprise Data Landscape and a possible realization of a "Smart Enterprise Big Data Bus" using an open source stack including Apache Kafka and Apache Storm. Read more.
Add to your personal schedule
1:30pm–2:10pm Friday, 02/20/2015
Spark in Action
Location: 230 C
Bing Xiao (Huawei)
Average rating: **...
(2.33, 3 ratings)
In this talk, we’ll walk through a recent implementation at one of the world’s top 5 mobile operators (a company with 300 million subscribers). Read more.
Add to your personal schedule
1:30pm–2:10pm Friday, 02/20/2015
Sponsored
Location: LL21 A
Ryan Michaluk (Allstate), Alexander Gray (Skytree, Inc.)
Allstate’s foundation is data. We extract value from our data by applying machine learning to make data-driven decisions. In this session, we discuss Allstate’s drive for better business results by using machine learning on Hadoop. Read more.

1:50pm

Add to your personal schedule
1:50pm–2:10pm Friday, 02/20/2015
Data Science
Location: LL20 A
Daniel Crankshaw (UC Berkeley)
Average rating: ****.
(4.67, 3 ratings)
In this talk, I will introduce Velox, the newest component of the Berkeley Data Analytics Stack. Velox is the missing piece in the predictive analytics stack enabling interactive applications ranging from content recommendations to personalized search by addressing the challenges of serving and managing personalized machine learning models at scale. Read more.
Add to your personal schedule
1:50pm–2:10pm Friday, 02/20/2015
Business & Industry
Location: LL20 BC
John Haddad (Informatica)
Average rating: *****
(5.00, 1 rating)
What types of organizations generate the most revenue and profits from Big Data initiatives? They need to be agile and adopt new technologies, grow existing resource skills while attracting new skills, and yet still manage and govern data. In this session we’ll describe organization structures, roles, skills, and interactions that make these types of data-driven organizations successful. Read more.
Add to your personal schedule
1:50pm–2:10pm Friday, 02/20/2015
Connected World
Location: LL21 C/D
Adam Smith (Automated Insights)
Average rating: *****
(5.00, 1 rating)
This session tells the story of how a young technology company helped The Associated Press embrace cutting-edge data innovation at the heart of its business: the automation of corporate earnings stories. The challenges along the way offer valuable lessons for any brand engaged in a major data implementation – and any vendor who wants to help. Read more.

2:20pm

Add to your personal schedule
2:20pm–3:00pm Friday, 02/20/2015
Ask Us Anything
Location: 211 B
Paco Nathan (O'Reilly Media), Matei Zaharia (Databricks), Michael Armbrust (Databricks), Reynold Xin (Databricks), Holden Karau (IBM), Reza Zadeh (Stanford University), Sameer Farooqui (Databricks), Denny Lee (Concur Technologies), Chris Fregly (Flux Capacitor AI)
Join the Spark team for an informal question and answer session. Several of the Spark committers, trainers, etc., from Databricks will be on hand to field a wide range of detailed questions. Even if you don't have a specific question, join in to hear what others are asking. Read more.
Add to your personal schedule
2:20pm–2:40pm Friday, 02/20/2015
Data Science
Location: LL20 A
Michael Brown (comScore, Inc.)
Average rating: **...
(2.00, 2 ratings)
Bots don't drink soda, so advertisers don’t want to advertise to them. Accurately counting real people is critical in the digital ad industry. This session will show how comScore uses over 1.5 trillion events of data to separate real people from bots. I’ll describe how we use correlations at scale, heuristic classification, and multi-source anomaly detection to make decisions in real time. Read more.
Add to your personal schedule
2:20pm–3:00pm Friday, 02/20/2015
Business & Industry
Location: LL20 BC
Chris Neumann
Average rating: **...
(2.00, 3 ratings)
Creating data analytics solutions for the cloud requires a new way of thinking about data architectures. Users expect to combine data seamlessly across services while IT demands that new tools leverage existing investments in security and administration. This talk will discuss the challenges of architecting for the cloud and present real-world case studies of the benefits of these architectures. Read more.
Add to your personal schedule
2:20pm–3:00pm Friday, 02/20/2015
Design & Interfaces
Location: LL21 B
Eric Colson (Stitch Fix)
Average rating: ****.
(4.80, 5 ratings)
Even the most data-driven organizations still incorporate “art” into their decision-making process. Values, culture, social norms, and biases influence decisions as much as the data. This isn’t always a bad thing—data can sometimes fail to tell the whole story. And, by combining data with the intellectual assets that reside in the heads of employees we can create new capabilities. Read more.
Add to your personal schedule
2:20pm–2:40pm Friday, 02/20/2015
Connected World
Location: LL21 C/D
Doug Stein (metacog, Inc.)
Average rating: ***..
(3.33, 3 ratings)
Big Data can transform learning from the past's one-way push of finely crafted content "at" learners to a two-way data conversation that streams real-time feedback from students as learning challenges are tackled. What kind of rich opportunities exist in analyzing and visualizing that two-way data stream as learners interact with open-ended content? Read more.
Add to your personal schedule
2:20pm–3:00pm Friday, 02/20/2015
Machine Data / IoT
Location: LL21 E/F
Subutai Ahmad (Numenta, Inc.)
Average rating: ****.
(4.00, 1 rating)
The unprecedented increase in streaming data sources requires a new approach to analytical algorithms. Systems must be highly automated, adapt to changing statistics, and naturally deal with temporal data streams. They must require no batch training and should deploy custom models on the fly. It will be impossible to build scalable practical systems without these properties. Read more.
Add to your personal schedule
2:20pm–3:00pm Friday, 02/20/2015
Hadoop & Beyond
Location: 210 A/E
Fangjin Yang (Imply), Vadim Ogievetsky (Imply)
Average rating: ****.
(4.00, 2 ratings)
The maturation of big data technologies has enabled numerous organizations to derive insights from vast quantities of data. The next set of challenges we face involves building applications that allow us to visualize, navigate, and interpret this data. Creating intuitive user interfaces is often a cumbersome process requiring complex data transformations, integrations, and queries. Read more.
Add to your personal schedule
2:20pm–3:00pm Friday, 02/20/2015
Hadoop in Action
Location: 210 B/F
Ajit Gaddam (VISA)
Average rating: *****
(5.00, 2 ratings)
Vendors and pundits suggest plug-n-play options for Hadoop security: do this and, in under 20 minutes, your petabytes of data are secure. What happens when PowerPoint approaches fail in a real-world enterprise deployment? In this session, we will review techniques that worked, controls that completely failed, and the business processes we had to stand up. Read more.
Add to your personal schedule
2:20pm–3:00pm Friday, 02/20/2015
Hadoop Platform
Location: 210 C/G
Alan Gates (Hortonworks)
Average rating: ****.
(4.75, 4 ratings)
Starting in Hive 0.14, insert values, update, and delete have been added to Hive SQL. In addition, ACID-compliant transactions have been added so that users get a consistent view of data while reading and writing. This talk will cover the intended use cases, architecture, and performance of insert, update, and delete in Hive. Read more.
Add to your personal schedule
2:20pm–3:00pm Friday, 02/20/2015
Security
Location: 230 A
Gary Davis (McAfee, a division of Intel Security)
Consumers are widely adopting wearable technology – Deloitte predicts there will be 100 million wearable cameras, smartwatches, fitness trackers and other gadgets on the market by 2020. With this mass adoption of wearable devices comes a new data ecosystem that must be protected. Embracing the protection of this new, intricate data ecosystem is imperative to the success of the wearable industry. Read more.
Add to your personal schedule
2:20pm–3:00pm Friday, 02/20/2015
Sponsored
Location: 230 B
Clint Sharp (Splunk)
Average rating: **...
(2.00, 1 rating)
In this session, you will hear from big data expert Clint Sharp, who brings real-world experience with the architectural patterns and platform integrations used to solve real business problems with data. Read more.
Add to your personal schedule
2:20pm–3:00pm Friday, 02/20/2015
Hadoop & Beyond
Location: 230 C
Ted Dunning (MapR Technologies)
Average rating: ***..
(3.50, 2 ratings)
YARN and Mesos are often positioned as competitors for managing datacenter resources, but in reality they work together to seamlessly share datacenter resources. Why force IT to choose between these two great technologies when we can show you how they work in concert? Read more.

2:40pm

Add to your personal schedule
2:40pm–3:00pm Friday, 02/20/2015
Data Science
Location: LL20 A
Jike Chong (Simply Hired)
Average rating: ***..
(3.67, 3 ratings)
Learn how tools based on nationwide job market data can help both students and institutions improve outcomes, from the job market level down to curriculum and course choice. Read more.
Add to your personal schedule
2:40pm–3:00pm Friday, 02/20/2015
Connected World
Location: LL21 C/D
June Andrews (Pinterest)
Average rating: ****.
(4.17, 6 ratings)
With LinkedIn's wealth of data, we can answer questions previously limited by human resources. Which industries have the most ties with health care? How do you meet Richard Branson? More seriously, what types of connections are used to find jobs? To answer these questions, we weave the algorithmic complexities and data harvesting into stories that enrich our understanding of the answers. Read more.

4:00pm

Add to your personal schedule
4:00pm–4:40pm Friday, 02/20/2015
Data Science
Location: LL20 A
Michael Conover (LinkedIn)
Average rating: *****
(5.00, 4 ratings)
Building real-time relevance systems for mobile presents a unique blend of challenges from both modeling and architectural perspectives. In this talk, we’ll take an in-depth look at the machine learning infrastructure that powers Connected, LinkedIn’s mobile application that helps our members nurture and leverage their professional networks. Read more.
Add to your personal schedule
4:00pm–4:40pm Friday, 02/20/2015
Business & Industry
Location: LL20 BC
Brian Ulicny (Thomson Reuters)
Average rating: ***..
(3.67, 3 ratings)
As the leading source of intelligent information, Thomson Reuters delivers must-have insight to the world’s financial and risk, legal, tax and accounting, intellectual property and science and media professionals, supported by the world’s most trusted news organization. Read more.
Add to your personal schedule
4:00pm–4:40pm Friday, 02/20/2015
Design & Interfaces
Location: LL21 B
Arianna McClain (IDEO), Coe Leta Stafford (IDEO), Kevin Ho (IDEO)
Average rating: ****.
(4.67, 6 ratings)
IDEO's Hybrid team brings all the design tools from IDEO's product design process to work with clients on data-oriented projects. The team will share elements of their process and case studies to show how incorporating human-centered techniques from design can improve data as an input to decision making. Read more.
Add to your personal schedule
4:00pm–4:40pm Friday, 02/20/2015
Connected World
Location: LL21 C/D
Stewart Collis (aWhere Inc.)
As climate change increases weather variability, farmers must adapt. Add to this global population growth and diet changes, which require world food production to increase by 70 percent by 2050, and farmers will struggle to meet demand. Of the 580 million farmers in the world, 500 million have little access to technology or information to ensure agile adaptation. Big Data helps solve this problem. Read more.
Add to your personal schedule
4:00pm–4:40pm Friday, 02/20/2015
Machine Data / IoT
Location: LL21 E/F
Ben Hamner (Kaggle)
Average rating: *****
(5.00, 2 ratings)
The US is in an oil boom, driven by new technologies that enable the economic production of shale resources. Conventional exploration techniques don’t work well for these unconventional reserves. In this talk, Kaggle’s Chief Scientist will discuss Kaggle’s pioneering work in machine learning for oil exploration. ML for energy applications differs dramatically from consumer web applications. Read more.
Add to your personal schedule
4:00pm–4:40pm Friday, 02/20/2015
Data Science
Location: 210 A/E
Oscar Celma (Pandora)
Average rating: ****.
(4.40, 5 ratings)
Pandora is not the “Netflix for music.” The success of Pandora lies in the unique combination of a curated music catalog of 1.5M+ tracks with a sophisticated set of machine learning models that integrates contextual user feedback from more than 250M people. This talk will unveil Pandora’s dynamic ensemble learning approach that provides a truly personalized experience for each of our listeners. Read more.
Add to your personal schedule
4:00pm–4:40pm Friday, 02/20/2015
Hadoop in Action
Location: 210 B/F
Allen Day (MapR Technologies), Sungwook Yoon (MapR)
Genomics applications like the Genome Analysis Toolkit (GATK) have long used techniques like MapReduce to parallelize I/O, but have never before run on Hadoop. We will describe what we did to build an end-to-end GATK-based genome analysis pipeline on Hadoop, show how it scaled at lower platform cost, and demonstrate the results. Read more.
Add to your personal schedule
4:00pm–4:40pm Friday, 02/20/2015
Hadoop Platform
Location: 210 C/G
Monte Zweben (Splice Machine Inc.)
Once just the realm of Java jockeys and data scientists, Hadoop has become a mainstream tool for business analysts with the rapid proliferation of SQL-on-Hadoop solutions. But there are pitfalls that can plague implementations as IT teams get their first exposure to production Hadoop environments. We’ll discuss the most common pitfalls companies face and how to get around them. Read more.
Add to your personal schedule
4:00pm–4:40pm Friday, 02/20/2015
Security
Location: 230 A
Woody Christy (Cloudera), Steve Anderson (Intel), Patrick Schots (Intel), Floris Grandvarlet (Cisco)
Average rating: **...
(2.00, 2 ratings)
There is often debate in the Hadoop community about the correct hardware combination for a cluster. In this talk, attendees will learn how varying different components impacts performance and how to choose the right components for their own workloads. Read more.
Add to your personal schedule
4:00pm–4:40pm Friday, 02/20/2015
Spark in Action
Location: 230 C
Vida Ha (Databricks), Holden Karau (IBM)
Average rating: ****.
(4.00, 3 ratings)
Writing efficient Spark programs requires a deeper understanding of Spark internals. In this talk, we present practical tips for writing better Spark programs for the beginner or intermediate Spark programmer. Read more.

4:40pm

Add to your personal schedule
4:40pm–5:45pm Friday, 02/20/2015
Events
Location: The Hub
Join attendees, speakers, and exhibitors as we end the conference on a sweet note with some gelato. Read more.