Presented By O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Schedule: Data-driven business management sessions

Learn how to build strategies and data-driven business models that deliver customer insight, drive efficiency and innovation in products and services, modernize architecture, reduce costs, and lower risk.

9:00am–12:30pm Tuesday, March 6, 2018
Location: 210 A/E Level: Beginner
Burcu Baran (LinkedIn), Wei Di (LinkedIn), Michael Li (LinkedIn), Chi-Yi Kuan (LinkedIn)
Average rating: ****.
(4.44, 9 ratings)
Burcu Baran, Wei Di, Michael Li, and Chi-Yi Kuan walk you through the big data analytics and data science lifecycle and share their experience and lessons learned leveraging advanced analytics and machine learning techniques such as predictive modeling to drive and grow business at LinkedIn.

9:00am–5:00pm Tuesday, March 6, 2018
Location: LL20 A
Madhav Madaboosi (BP), Meenakshisundaram Thandavarayan (Infosys), Matt Conners (Microsoft), Katie Malone (Civis Analytics), Mike Prorock (mesur.io), Thomas Miller (Northwestern University), Ann Nguyen (Whole Whale), Jennie Shin (Kaiser Permanente), Val Bercovici (PencilDATA), Wayde Fleener (General Mills), Joe Dumoulin (Next IT), Jules Malin (GoPro), Taylor Martin (O'Reilly Media), Divya Ramachandran (Captricity)
Hear practical insights from household brands and global companies: the challenges they tackled, the approaches they took, and the benefits (and drawbacks) of their solutions.

9:00am–5:00pm Tuesday, March 6, 2018
Location: LL20 B
David Boyle (MasterClass), Violeta Hennessey (Warner Bros.), April Chen (Civis Analytics), Sridhar Alla (Comcast), Noah Gift (UC Davis), Blake Irvine (Netflix), Kevin Lyons (Nielsen Marketing Cloud), Jennifer Webb (SuprFanz), Rizwan Patel (Caesars Entertainment), Anthony Accardo (Disney), Amanda Gerdes (Blizzard Entertainment), Aneesh Karve (Quilt), Peter Skomoroch (SkipFlag)
Hear from innovators in ad tech, measurement, automation, and audience engagement about where the media industry is today and where it's likely to go next.

1:30pm–5:00pm Tuesday, March 6, 2018
Location: LL21 C/D Level: Intermediate
Ronny Kohavi (Microsoft), Alex Deng (Microsoft), Somit Gupta (Microsoft), Paul Raff (Microsoft)
Average rating: ****.
(4.00, 3 ratings)
Controlled experiments such as A/B tests have revolutionized the way software is being developed, allowing real users to objectively evaluate new ideas. Ronny Kohavi, Alex Deng, Somit Gupta, and Paul Raff lead an introduction to A/B testing and share lessons learned from one of the largest A/B testing platforms on the planet, running at Microsoft, which executes over 10K experiments a year.

1:30pm–5:00pm Tuesday, March 6, 2018
Location: 210 A/E Level: Non-technical
Nick Elprin (Domino Data Lab)
Average rating: *****
(5.00, 2 ratings)
The honeymoon era of data science is ending, and accountability is coming. Not content to wait for results that may or may not arrive, successful data science leaders deliver measurable impact on an increasing share of an enterprise's KPIs. Nick Elprin details how leading organizations have taken a holistic approach to people, process, and technology to build a sustainable competitive advantage.

11:50am–12:30pm Wednesday, March 7, 2018
Location: 210 C/G Level: Beginner
Matthew Granade (Domino Data Lab)
Average rating: **...
(2.00, 1 rating)
Predictive analytics and artificial intelligence have become critical competitive capabilities. Yet IT teams struggle to provide the support data science teams need to succeed. Matthew Granade explains how leading banks, insurance and pharmaceutical companies, and others manage data science at scale.

1:50pm–2:30pm Wednesday, March 7, 2018
Location: 210 A/E Level: Non-technical
Frances Haugen (Pinterest), Patrick Phelps (Pinterest)
Average rating: ****.
(4.67, 3 ratings)
Data science is most powerful when combined with deep domain knowledge, but those with domain knowledge don't work on data-focused teams. So how do you empower employees with diverse backgrounds and skill sets to be effective users of data? Frances Haugen and Patrick Phelps dive into the social side of data and share strategies for unlocking otherwise unobtainable insights.

2:40pm–3:20pm Wednesday, March 7, 2018
Location: 210 C/G Level: Non-technical
Katie Malone (Civis Analytics), Skipper Seabold (Civis Analytics)
Average rating: *****
(5.00, 1 rating)
A huge challenge for data science managers is determining priorities for their teams, which often have more good ideas than they have time. Katie Malone and Skipper Seabold share a framework that their large and diverse data science team uses to identify, discuss, select, and manage data science projects for a fast-moving startup.

2:40pm–3:20pm Wednesday, March 7, 2018
Location: Expo Hall 1 Level: Advanced
Secondary topics:  Expo Hall, Graphs and Time-series
Yu Xu (TigerGraph)
Average rating: *****
(5.00, 2 ratings)
Graph databases are the fastest growing category in data management. However, most graph queries only traverse two hops in big graphs due to limitations in most graph databases. Real-world applications require deep link analytics that traverse far more than three hops. Yu Xu offers an overview of a fraud detection system that manages 100 billion graph elements to detect risk and fraudulent groups.

4:20pm–5:00pm Wednesday, March 7, 2018
Location: 210 A/E Level: Intermediate
Ted Malaska (Blizzard Entertainment), Jonathan Seidman (Cloudera)
Average rating: ****.
(4.67, 3 ratings)
Recent years have seen dramatic advancements in the technologies available for managing and processing data. While these technologies provide powerful tools to build data applications, they also require new skills. Ted Malaska and Jonathan Seidman explain how to evaluate these new technologies and build teams to effectively leverage them and achieve ROI with your data initiatives.

4:20pm–5:00pm Wednesday, March 7, 2018
Location: 210 D/H Level: Intermediate
Kapil Surlaker (LinkedIn), Ya Xu (LinkedIn)
Average rating: *****
(5.00, 3 ratings)
Metrics measurement and experimentation play crucial roles in every product decision at LinkedIn. Kapil Surlaker and Ya Xu explain why, to meet the company's needs, LinkedIn built the UMP and XLNT platforms for metrics computation and experimentation, respectively, which have allowed the company to perform measurement and experimentation very efficiently at scale while preserving trust in data.

5:10pm–5:50pm Wednesday, March 7, 2018
Location: 210 C/G Level: Non-technical
Stephanie Beben (Booz Allen Hamilton)
How can you most effectively use machine intelligence to drive strategy? By merging it in the right way with the human ingenuity of leaders throughout your organization. Stephanie Beben shares insights from her work with pioneering companies, government agencies, and nonprofits that are successfully navigating this partnership by becoming “mathematical corporations.”

9:10am–9:30am Thursday, March 8, 2018
Location: Grand Ballroom 220 Level: Intermediate
Eric Colson (Stitch Fix)
Average rating: ****.
(4.50, 10 ratings)
While companies often use data science as a supportive function, the emergence of new business models has made it possible for some companies to differentiate via data science. Eric Colson explores what it means to differentiate by data science and explains why companies must now think very differently about the role and placement of data science in the organization.

11:00am–11:40am Thursday, March 8, 2018
Location: LL20 A Level: Intermediate
Clare Gollnick (Terbium Labs)
Average rating: ****.
(4.86, 7 ratings)
At the heart of the reproducibility crisis in the sciences is the widespread misapplication of statistics. Data science relies on the same statistical methodology as these scientific fields. Can we avoid the same crisis of integrity? Clare Gollnick considers the philosophy of data science and shares a framework that explains (and even predicts) the likelihood of success of a data project.

11:00am–11:40am Thursday, March 8, 2018
Location: 210 C/G Level: Beginner
Secondary topics:  Graphs and Time-series
Michael Schrenk (Self-Employed)
Average rating: ****.
(4.00, 5 ratings)
Big data becomes much more powerful when it has context. Fortunately, creative data scientists can create needed context through the use of metadata. Michael Schrenk explains how metadata is created and used to gain competitive advantages, predict troop strength, or even guess Social Security numbers.

11:50am–12:30pm Thursday, March 8, 2018
Location: 210 A/E Level: Intermediate
David Talby (Pacific AI)
Average rating: ***..
(3.50, 4 ratings)
Machine learning and data science systems often fail in production in unexpected ways. David Talby shares real-world case studies showing why this happens and explains what you can do about it, covering best practices and lessons learned from a decade of experience building and operating such systems at Fortune 500 companies across several industries.

11:50am–12:30pm Thursday, March 8, 2018
Location: 210 C/G Level: Beginner
Paco Nathan (O'Reilly Media)
Average rating: ****.
(4.25, 4 ratings)
Human in the loop (HITL) has emerged as a key design pattern for managing teams where people and machines collaborate. Such systems are mostly automated, with exceptions referred to human experts, who help train the machines further. Paco Nathan offers an overview of HITL from the perspective of a business manager, focusing on use cases within O'Reilly Media.

11:50am–12:30pm Thursday, March 8, 2018
Location: 210 D/H Level: Intermediate
Average rating: *****
(5.00, 2 ratings)
With so many business intelligence tools in the Hadoop ecosystem and no common measure to identify the efficiency of each tool, where do you begin to build or modify your enterprise data lake strategy? Sagar Kewalramani shares real-world BI problems and how they were resolved with Hadoop tools and demonstrates how to build an effective data lake strategy with open source tools and components.

1:50pm–2:30pm Thursday, March 8, 2018
Location: LL20 C Level: Non-technical
Veronica Mapes (Pinterest), Garner Chung (Pinterest)
Average rating: *****
(5.00, 3 ratings)
Veronica Mapes and Garner Chung detail the human evaluation platform Pinterest developed to better serve its deep learning and operational teams when its needs grew beyond platforms like Mechanical Turk. Along the way, they cover tricks for increasing data reliability and judgement reproducibility and explain how Pinterest integrated end-user-sourced judgements into its in-house platform.

1:50pm–2:30pm Thursday, March 8, 2018
Location: 210 D/H Level: Intermediate
Chris Chapo (Gap Inc.)
Average rating: ****.
(4.20, 5 ratings)
Chris Chapo walks you through real-world examples of companies that are driving transformational change by leveraging data science and analytics, paying particular attention to established organizations where these capabilities are newer concepts.

2:40pm–3:20pm Thursday, March 8, 2018
Location: LL21 E/F Level: Intermediate
Ajay Mothukuri (Sapient), Dr. Vijay Srinivas Agneeswaran (SapientRazorfish)
Ajay Mothukuri and Vijay Srinivas Agneeswaran explain how to use open source blockchain technologies such as Hyperledger to implement the European Union's General Data Protection Regulation (GDPR).

2:40pm–3:20pm Thursday, March 8, 2018
Location: 210 A/E Level: Non-technical
Anjali Thakur (Accenture)
Average rating: *....
(1.00, 5 ratings)
Whether you are a technology or a services provider, understanding your value in the ecosystem and focusing on the right partners to reach your market goals is critical. Anjali Thakur shares examples of teaming models and leading practices for accelerating value from your ecosystem strategy.

2:40pm–3:20pm Thursday, March 8, 2018
Location: 210 C/G Level: Beginner
Brian Karfunkel (Pinterest)
Average rating: ****.
(4.50, 2 ratings)
When software companies use A/B tests to evaluate product changes and fail to accurately estimate the long-term impact of such experiments, they risk optimizing for the users they have at the expense of the users they want to have. Brian Karfunkel explains how to estimate an experiment’s impact over time, thus mitigating this risk and giving full credit to experiments targeted at noncore users.

2:40pm–3:20pm Thursday, March 8, 2018
Location: 210 D/H
Matt Derda (Trifacta), Harrison Lynch (Consensus Corporation)
Average rating: **...
(2.00, 1 rating)
Matt Derda and Harrison Lynch explain how Consensus leverages the combined power of data wrangling and machine learning to more efficiently identify and reduce retail fraud and how adopting data wrangling technology has helped Trifacta reduce time spent data wrangling from six weeks to one week.

4:20pm–5:00pm Thursday, March 8, 2018
Location: 210 A/E Level: Non-technical
Jesse Anderson (Big Data Institute)
Average rating: ****.
(4.00, 1 rating)
There's been an explosion of new architectures, but is this because engineers love new things or is there a good business reason for these changes? Jesse Anderson explores new architectures and the actual business problems they solve. You may find out that your team would be far more productive if you moved to these architectures.

4:20pm–5:00pm Thursday, March 8, 2018
Location: 210 C/G Level: Beginner
Felix Gorodishter (GoDaddy)
Average rating: ***..
(3.00, 2 ratings)
GoDaddy ingests and analyzes over 100,000 data points per second. Felix Gorodishter discusses the company's big data journey from ingest to automation, how it is evolving its systems to scale to over 10 TB of new data per day, and how it uses tools like anomaly detection to produce valuable insights, such as the worth of a reminder email.

4:20pm–5:00pm Thursday, March 8, 2018
Location: 210 D/H Level: Beginner
Marcin Pilarczyk (Ryanair)
Average rating: *****
(5.00, 2 ratings)
Managing fuel at a company flying 120 million passengers yearly is not a trivial task. Marcin Pilarczyk explores the main aspects of fuel management of a modern airline and offers an overview of machine learning methods supporting long-term planning and daily decisions.