Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY
Strata Business Summit

Make data work for business.

The 2018 Strata Business Summit will give you a thorough understanding of how some of the world’s leading companies build successful data strategies. You’ll discover game-changing technologies and their business applications, and learn how to move your enterprise forward to bridge the gap. You’ll also receive a hand-picked lineup of executive briefings on key issues such as predictive analytics and machine learning, cloud strategy, governance, security and privacy, IoT, artificial intelligence, and more.

In just 3 days, you’ll have the intel you need to build strategies and data-driven business models that deliver customer insight, drive efficiency and innovation in products and services, modernize architecture, reduce costs, and lower risk.

Gold and Silver pass holders have access to Strata Business Summit on Tues–Thurs. Platinum and Bronze pass holders have access to Strata Business Summit on Wed–Thurs.

Tuesday Sep 11: Tutorials (Gold & Silver passes)
Wednesday Sep 12: Keynotes & Sessions (Platinum, Gold, Silver & Bronze passes)
9:00am | Location: 3E
Strata Data Conference Keynotes
10:50am
Morning break
Thursday Sep 13: Keynotes & Sessions (Platinum, Gold, Silver & Bronze passes)
9:00am | Location: 3E
Strata Data Conference Keynotes
10:50am
Morning break
9:00am–5:00pm Tuesday, 09/11/2018
Location: 1A 04/05
Acquiring machine-learning (ML) technology is relatively straightforward, but ML must be applied to be useful. In this one-day boot camp, we teach students how to apply advanced analytics in ways that reshape the enterprise and improve outcomes. This training is equal parts hackathon, presentation, and group participation.
9:00am–12:30pm Tuesday, 09/11/2018
Location: 1A 10 Level: Non-technical
Secondary topics:  Machine Learning in the enterprise
Nick Elprin (Domino Data Lab)
The honeymoon era of data science is ending, and accountability is coming. Not content to wait for results that may or may not arrive, successful data science leaders deliver measurable impact on an increasing share of an enterprise’s KPIs. Nick Elprin details how leading organizations have taken a holistic approach to people, process, and technology to build a sustainable competitive advantage.
9:00am–5:00pm Tuesday, 09/11/2018
Location: 1A 08
Alistair Croll (Solve For Interesting), Amro Alkhatib (National Health Insurance Company - Daman), Mridul Mishra (Fidelity Investments), Patrick Angeles (Cloudera), Andreas Kohlmaier (MunichRe), Paul Lashmet (Arcadia Data), Laura Eisenhardt (iKnow Solutions), Robin Way (Corios), Theresa Johnson (Airbnb), Jane Tran (Unqork)
From analyzing risk and detecting fraud to predicting payments and improving customer experience, take a deep dive into the ways data technologies are transforming the financial industry.
9:00am–5:00pm Tuesday, 09/11/2018
Location: 1E 10
Alistair Croll (Solve For Interesting), Katharina Warzel (EveryMundo), Mike Berger (Mount Sinai Health System), Sam Helmich (Deere & Company), Stephanie Fischer (datanizing GmbH), Maryam Jahanshahi (TapRecruit), Greg Quist (SmartCover Systems), Ann Nguyen (Whole Whale), Abhimanyu Verma (Novartis), Steve Otto (Navistar), Jennifer Lim (Cerner), Anand S (Gramener)
Hear practical insights from household brands and global companies: the challenges they tackled, the approaches they took, and the benefits and drawbacks of their solutions.
1:30pm–5:00pm Tuesday, 09/11/2018
Location: 1A 10 Level: Beginner
Secondary topics:  Machine Learning in the enterprise
This tutorial is a primer on crafting well-conceived data science projects that stay on course toward uncovering valuable business insights. Using case studies and hands-on skills development, we will teach techniques that are essential for a variety of audiences invested in effecting real business change.
11:20am–12:00pm Wednesday, 09/12/2018
Location: 1E 10/11 Level: Non-technical
Secondary topics:  Data preparation, governance and privacy, Machine Learning in the enterprise
JF Gagne (Element AI)
The CIO is going to need a broader mandate in the company to better align its AI training and outcomes with business goals and compliance. This mandate should include an AI governance team that is well staffed and deeply established in the company in order to catch biases that can develop from faulty goals or flawed data.
11:20am–12:00pm Wednesday, 09/12/2018
Location: 1E 12/13 Level: Intermediate
Secondary topics:  Data preparation, governance and privacy, Ethics and Privacy
Anthony Hsu (LinkedIn), Issac Buenrostro (LinkedIn)
With over 100 million LinkedIn members in the EU, enforcing GDPR compliance is challenging. In this talk, we explain the architecture of our system and how we leverage Hive, Kafka, Gobblin, and WhereHows to ensure compliance.
11:20am–12:00pm Wednesday, 09/12/2018
Location: 1E 14 Level: Intermediate
Secondary topics:  Data preparation, governance and privacy, Ethics and Privacy
Mark Donsky (Cloudera), Steven Ross (Cloudera)
The General Data Protection Regulation (GDPR) went into effect in May 2018 for firms doing any business in the EU. However, many companies aren't prepared for the strict regulation or the fines for noncompliance (up to €20 million or 4% of global annual revenue, whichever is higher). This session will explore the capabilities your data environment needs in order to simplify compliance with GDPR, as well as with future regulations.
11:20am–12:00pm Wednesday, 09/12/2018
Location: Expo Hall Level: Non-technical
Secondary topics:  Data Integration and Data Pipelines, Financial Services
Usama Fayyad (Open Insights), Troels Oerting (Barclays UK)
This presentation will share the main outcomes and learnings from building and deploying global data fusion, incident analysis and visualization, and effective cybersecurity defense based on big data and AI at a major EU bank, in collaboration with several financial services institutions. The focus is on learnings and breakthroughs gleaned from making the systems work.
1:15pm–1:55pm Wednesday, 09/12/2018
Location: 1E 12/13 Level: Advanced
Secondary topics:  Data preparation, governance and privacy, Ethics and Privacy
Les McMonagle (BlueTalon)
"Privacy by Design" is a fundamentally important approach to achieving compliance with GDPR and other data privacy or data protection regulations. This session will outline how organizations can save time and money while improving data security and regulatory compliance and dramatically reducing the risk of a data breach or expensive penalties for noncompliance.
1:15pm–1:55pm Wednesday, 09/12/2018
Location: 1E 10/11 Level: Non-technical
Secondary topics:  Machine Learning in the enterprise, Retail and e-commerce
Erin Coffman (Airbnb)
Airbnb has open-sourced many high-leverage data tools: Airflow, Superset, and the Knowledge Repo. However, adoption of these tools across Airbnb was relatively low. To make data more accessible and utilized in decision-making, Airbnb launched Data University in early 2017. Since the launch, over a quarter of the company has participated in the program, and data tool utilization rates have doubled.
1:15pm–1:55pm Wednesday, 09/12/2018
Location: 1E 14 Level: Non-technical
Secondary topics:  Machine Learning in the enterprise
Tony Baer (Ovum), Florian Douetteau (DATAIKU)
Ovum will present the results of research cosponsored by Dataiku, surveying a specially selected sample of chief data officers and data scientists, on how to map roles and processes to make success with AI in the business repeatable.
2:05pm–2:45pm Wednesday, 09/12/2018
Location: 1E 10/11 Level: Beginner
Lawrence Cowan (Cicero Group)
We've worked with firms and seen over and over that they are struggling to leverage their data. We've developed a methodology for assessing four critical areas that firms must consider when looking to make the analytical leap: Data Strategy; Data Culture; Data Analysis & Implementation; and Data Management & Architecture.
2:05pm–2:45pm Wednesday, 09/12/2018
Location: 1E 12/13 Level: Beginner
Secondary topics:  Ethics and Privacy
Harry Glaser (Periscope Data)
What is the moral responsibility of a data team today? As AI & machine learning technologies become part of our everyday life, and as data becomes accessible to everyone, CDOs and data teams are taking on a very important moral role as the conscience of the corporation. This session will highlight the risks companies will face if they don't empower data teams to lead the way for ethical data use.
2:05pm–2:45pm Wednesday, 09/12/2018
Location: 1E 14 Level: Intermediate
Secondary topics:  Machine Learning in the enterprise, Model lifecycle management
David Talby (Pacific AI)
Machine learning and data science systems often fail in production in unexpected ways. David Talby shares real-world case studies showing why this happens and explains what you can do about it, covering best practices and lessons learned from a decade of experience building and operating such systems at Fortune 500 companies across several industries.
2:55pm–3:35pm Wednesday, 09/12/2018
Location: 1E 12/13 Level: Non-technical
Secondary topics:  Data preparation, governance and privacy, Ethics and Privacy
Andrew Burt (Immuta)
Machine learning is becoming prevalent across industries, creating new types of risk. Managing this risk is quickly becoming the central challenge of major organizations, one that strains data science teams, legal personnel, and the C-suite alike. This talk will highlight lessons from past regulations focused on similar technology and conclude with a proposal for new ways to manage risk in ML.
2:55pm–3:35pm Wednesday, 09/12/2018
Location: 1E 14 Level: Intermediate
Secondary topics:  Machine Learning in the enterprise, Media, Marketing, Advertising
Ted Malaska (Blizzard Entertainment), Jonathan Seidman (Cloudera)
Creating a successful big data practice in your organization presents new challenges in managing projects and teams. In this session, we'll provide guidance and best practices to help technical leaders deliver successful projects from planning to implementation.
4:35pm–5:15pm Wednesday, 09/12/2018
Location: 1E 10/11 Level: Beginner
Adil Aijaz (Split Software)
Many products, whether data-driven or not, chase “the one metric that matters.” It may be engagement, revenue, or conversion, but the common theme is the pursuit of improvement in one metric. Product development teams should instead focus on designing metrics that measure their goals. Adil will present an approach to designing metrics and discuss best practices and common pitfalls that you may run into.
4:35pm–5:15pm Wednesday, 09/12/2018
Location: 1E 12/13 Level: Non-technical
Secondary topics:  Ethics and Privacy, Machine Learning in the enterprise
Kimberly Nevala (SAS Institute)
Too often, the discussion of AI and ML includes an expectation, if not a requirement, for infallibility. But as we know, this expectation is not realistic. So what’s a company to do? While risk can’t be eliminated, it can be rationalized. This session will demonstrate how an unflinching risk assessment enables AI/ML adoption and deployment.
4:35pm–5:15pm Wednesday, 09/12/2018
Location: 1E 14 Level: Non-technical
Secondary topics:  Machine Learning in the enterprise, Media, Marketing, Advertising
Cassie Kozyrkov (Google)
Many organizations aren’t aware that they have a blind spot with respect to their lack of data effectiveness, and hiring experts doesn’t seem to help. This session examines what it takes to build a truly data-driven organizational culture and highlights a vital, yet often neglected, job function: the data science manager.
5:25pm–6:05pm Wednesday, 09/12/2018
Location: 1E 10/11 Level: Non-technical
Friederike Schuur (Cloudera), Rita Ko (USA for UNHCR)
The Hive and Cloudera Fast Forward Labs share how they transformed USA for UNHCR (UN Refugee Agency) to use data science and machine learning (DS/ML) to address the refugee crisis. From identifying use cases and success metrics to showcasing the value of DS/ML, we cover the development and implementation of a DS/ML strategy, hoping to inspire other organizations looking to derive value from data.
5:25pm–6:05pm Wednesday, 09/12/2018
Location: 1E 12/13 Level: Non-technical
Secondary topics:  Media, Marketing, Advertising
John Thuma (Arcadia Data)
Forget about fake news: data and analytics are what drive elections. While proposing analytical solutions to the RNC and DNC, I faced ethical dilemmas. Not only did I help causes I disagreed with, but I also armed politicians with “REAL-TIME” data to manipulate voters. Politics is a business, and today’s modern data infrastructure optimizes campaign funds more effectively than ever.
5:25pm–6:05pm Wednesday, 09/12/2018
Location: 1E 14 Level: Intermediate
Secondary topics:  Data preparation, governance and privacy
Sanjeev Mohan (Gartner)
If the last few years were spent proving the value of data lakes, the emphasis now is on monetizing big data architecture investments. The rallying cry is to onboard new workloads efficiently. But how can you do so if you don’t know what data is in the lake, the level of its quality, and the trustworthiness of its models? This is why data governance becomes the linchpin of a data lake's success.
11:20am–12:00pm Thursday, 09/13/2018
Location: 1E 14 Level: Intermediate
Secondary topics:  Machine Learning in the enterprise, Retail and e-commerce
Mikio Braun (Zalando SE)
In order to become "AI ready", an organization has to not only provide the right technical infrastructure for data collection and processing but also learn new skills. In this talk, I will highlight three such missing pieces: making the connection between business problems and AI technology, AI-driven development, and how to run AI-based projects.
11:20am–12:00pm Thursday, 09/13/2018
Location: 1E 10/11 Level: Non-technical
Secondary topics:  Machine Learning in the enterprise, Retail and e-commerce, Transportation and Logistics
Data scientists are hard to hire. But too often, companies struggle to find the right talent only to make avoidable mistakes that cause their best data scientists to leave. From org structure and leadership to tooling and infrastructure to continuing education, this talk will offer concrete (and inexpensive) tips for keeping your data scientists engaged, productive, and adding business value.
11:20am–12:00pm Thursday, 09/13/2018
Location: 1E 12/13 Level: Beginner
Secondary topics:  Ethics and Privacy
Nuria Ruiz (Wikimedia)
The Wikipedia community feels strongly that you shouldn’t have to provide personal information to participate in the free knowledge movement. In this talk, we will go into the challenges that this strong privacy stance poses for the Wikimedia Foundation (WMF), including how it affects data collection and some creative workarounds that allow WMF to calculate metrics in a privacy-conscious way.
1:10pm–1:50pm Thursday, 09/13/2018
Location: 1E 12/13 Level: Intermediate
Secondary topics:  Data preparation, governance and privacy, Ethics and Privacy
GDPR is more than another regulation to be handled by your back office. Enacting the Data Subject Access Rights (DSAR) requires practical actions. In this session, we will discuss the practical steps to deploy governed data services.
1:10pm–1:50pm Thursday, 09/13/2018
Location: 1E 14 Level: Non-technical
Secondary topics:  Machine Learning in the enterprise, Transportation and Logistics
Brandy Freitas (Pitney Bowes)
Data science is an approachable field given the right framing. Often, though, practitioners and executives describe opportunities using completely different languages. In this session, Harvard biophysicist turned data scientist Brandy Freitas will work with participants to develop context and vocabulary around data science topics to help build a culture of data within their organization.
2:00pm–2:40pm Thursday, 09/13/2018
Location: 1E 10/11 Level: Intermediate
Secondary topics:  Health and Medicine
Niraj Nagrani (Ancestry)
Ancestry has more than 10 petabytes of structured and unstructured data. Ancestry’s SVP of platform, Niraj Nagrani, will discuss how companies can build a data platform that uses cloud computing, data science, artificial intelligence, and machine learning to analyze complex data sets at scale and provide personalized insights and a relationship graph to consumers.
2:00pm–2:40pm Thursday, 09/13/2018
Location: 1E 12/13 Level: Non-technical
Secondary topics:  Health and Medicine, Text and Language processing and analysis
Chiny Driscoll (Metistream Inc.), Jawad Khan (Rush University Medical Center)
This Cloudera/MetiStream solution lets healthcare providers automate the extraction, processing, and analysis of clinical notes within the Electronic Health Record in batch or in real time. Improve care, identify errors, and recognize efficiencies in billing and diagnoses by leveraging NLP capabilities to conduct fast analytics in a distributed environment. Use case by Rush University Medical Center.
2:00pm–2:40pm Thursday, 09/13/2018
Location: 1E 14 Level: Non-technical
Dean Wampler (Lightbend)
Streaming data systems, so-called "Fast Data", promise accelerated access to information, leading to new innovations and competitive advantages. But they aren't just "faster" versions of Big Data. They force architecture changes to meet new demands for reliability and dynamic scalability, making them more like microservices. This talk tells you what you need to know to exploit Fast Data successfully.
3:30pm–4:10pm Thursday, 09/13/2018
Location: 1E 10/11 Level: Non-technical
Secondary topics:  Data Platforms, Media, Marketing, Advertising, Retail and e-commerce
Francesco Mucio (Zalando SE)
The story of how Zalando went from old-school BI to an AI-driven company built on a solid data platform, what we learned in the process, and the challenges we still see ahead of us.
3:30pm–4:10pm Thursday, 09/13/2018
Location: 1E 14 Level: Intermediate
Secondary topics:  Machine Learning in the enterprise
Jennifer Prendki (Figure Eight)
The Agile methodology has been widely successful for software engineering teams but seems inappropriate for data science teams. This is because data science is part engineering, part research. In this talk, I will show how, with a minimum amount of tweaking, data science managers can adapt the techniques used in Agile and establish best practices to make their teams more efficient.
4:20pm–5:00pm Thursday, 09/13/2018
Location: 1E 10/11 Level: Beginner
Secondary topics:  Transportation and Logistics
Yasuyuki Kataoka (NTT Innovation Institute, Inc.)
One of the challenges of sports data analytics is how to deliver machine intelligence beyond a mere real-time monitoring tool. This session highlights various real-time machine learning models in both IndyCar and the Tour de France. The talk covers the real-time data processing architecture, the machine learning models, and a demonstration that delivers meaningful insights for players and fans.
4:20pm–5:00pm Thursday, 09/13/2018
Location: 1E 12/13 Level: Beginner
Secondary topics:  Machine Learning in the enterprise
Francesca Lazzeri (Microsoft), Jaya Mathew (Microsoft)
What profession did Harvard Business Review call the Sexiest Job of the 21st Century? With the growing buzz around data science, several professionals have approached us at various events to learn more about how to become a data scientist. This session aims to raise awareness of what it takes to become a data scientist and how artificial intelligence solutions have started to reinvent businesses.
4:20pm–5:00pm Thursday, 09/13/2018
Location: 1A 06/07 Level: Non-technical
Secondary topics:  Machine Learning in the enterprise
Bill Franks (International Institute For Analytics)
The International Institute For Analytics studied the analytics maturity level of large enterprises. The talk will cover how maturity varies by industry and some of the key steps organizations can take to move up the maturity scale. The research also correlates analytics maturity with a wide range of corporate success metrics including financial and reputational measures.