Presented By O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA
Strata Business Summit

The missing MBA for data-driven business.

Tailored for executives, business leaders, and strategists, Strata Business Summit shows you how some of the world's leading companies build modern data strategies. Discover game-changing technologies and their business applications, along with concrete methodologies to move your company forward.

You'll also have access to a hand-picked lineup of Executive Briefings on key issues such as: artificial intelligence; predictive analytics and machine learning; cloud strategy; governance, security, and privacy; bot strategy and automation; and the Internet of Things.

Make data work for business

  • From banking to biotech, retail to government, entertainment to energy—every sector is changing in the face of abundant data. Executives need to make data serve the strategic imperatives of their business.
  • At Strata Business Summit, get the intel you need to build data strategies that drive efficiency and innovation in your business.

All Strata Data Conference Gold and Silver passes include access to Strata Business Summit Tuesday through Thursday. Platinum and Bronze passes include access to Strata Business Summit Wednesday and Thursday.

Tuesday, March 6: Tutorials (Gold & Silver passes)
Wednesday, March 7: Keynotes & Sessions (Platinum, Gold, Silver & Bronze passes)
8:45am | Location: Salon 1&2
Strata Data Conference Keynotes
10:30am
Morning break
Thursday, March 8: Keynotes & Sessions (Platinum, Gold, Silver & Bronze passes)
8:45am | Location: Salon 1&2
Strata Data Conference Keynotes
10:30am
Morning break
9:00am–5:00pm Monday, March 5 & Tuesday, March 6
Location: 212 D
Angie Ma (ASI), Maria Diaz (ASI Data Science)
Average rating: ****.
(4.00, 2 ratings)
Angie Ma offers a condensed introduction to key data science and machine learning concepts and techniques, showing you what is (and isn't) possible with these exciting new tools and how they can benefit your organization. Read more.
9:00am–12:30pm Tuesday, March 6, 2018
Location: 210 A/E Level: Beginner
Burcu Baran (LinkedIn), Wei Di (LinkedIn), Michael Li (LinkedIn), Chi-Yi Kuan (LinkedIn)
Average rating: ****.
(4.44, 9 ratings)
Burcu Baran, Wei Di, Michael Li, and Chi-Yi Kuan walk you through the big data analytics and data science lifecycle and share their experience and lessons learned leveraging advanced analytics and machine learning techniques such as predictive modeling to drive and grow business at LinkedIn. Read more.
9:00am–5:00pm Tuesday, March 6, 2018
Location: LL20 A
Madhav Madaboosi (BP), Meenakshisundaram Thandavarayan (Infosys), Matt Conners (Microsoft), Katie Malone (Civis Analytics), Mike Prorock (mesur.io), Thomas Miller (Northwestern University), Ann Nguyen (Whole Whale), Jennie Shin (Kaiser Permanente), Valentin Bercovici (PencilDATA), Wayde Fleener (General Mills), Joe Dumoulin (Next IT), Jules Malin (GoPro), Taylor Martin (O'Reilly Media), Divya Ramachandran (Captricity)
Hear practical insights from household brands and global companies: the challenges they tackled, approaches they took, and the benefits—and drawbacks—of their solutions. Read more.
1:30pm–5:00pm Tuesday, March 6, 2018
Location: 210 A/E Level: Non-technical
Nick Elprin (Domino Data Lab)
Average rating: *****
(5.00, 2 ratings)
The honeymoon era of data science is ending, and accountability is coming. Not content to wait for results that may or may not arrive, successful data science leaders deliver measurable impact on an increasing share of an enterprise's KPIs. Nick Elprin details how leading organizations have taken a holistic approach to people, process, and technology to build a sustainable competitive advantage. Read more.
11:00am–11:40am Wednesday, March 7, 2018
Location: 210 A/E Level: Intermediate
Mark Madsen (Think Big Analytics), Shant Hovsepian (Arcadia Data)
Average rating: ***..
(3.29, 7 ratings)
There are 70+ BI tools in the market and a dozen or more SQL- or OLAP-on-Hadoop open source projects. Mark Madsen and Shant Hovsepian outline the trade-offs between a number of architectures that provide self-service access to data and discuss the pros and cons of architectures, deployment strategies, and examples of BI on big data. Read more.
11:00am–11:40am Wednesday, March 7, 2018
Location: 210 C/G Level: Intermediate
Anne Buff (SAS)
Average rating: ****.
(4.50, 2 ratings)
Emerging technologies such as the IoT, AI, and ML present businesses with enormous opportunities for innovation, but to maximize the potential of these technologies, businesses must radically shift their approach to governance. Anne Buff explains what it takes to shift the focus of governance from standards, conformity, and control to accountability, extensibility, and enablement. Read more.
11:00am–11:40am Wednesday, March 7, 2018
Location: 210 D/H Level: Intermediate
Mauro Damo (Dell EMC), Wei Lin (Dell EMC)
Average rating: ***..
(3.50, 2 ratings)
Classifying diseases with image recognition will minimize the possibility of medical mistakes, improve patient treatment, and speed up patient diagnosis. Mauro Damo and Wei Lin offer an overview of an approach to identify bladder cancer in patients using unsupervised and supervised machine learning techniques on more than 5,000 magnetic resonance images from the Cancer Imaging Archive.
11:50am–12:30pm Wednesday, March 7, 2018
Location: 210 A/E Level: Non-technical
Yishay Carmiel (IntelligentWire)
Average rating: ****.
(4.00, 3 ratings)
One of the most important tasks of AI has been to understand humans. People want machines to understand not only what they say but also what they mean and to take particular actions based on that information. This goal is the essence of conversational AI. Yishay Carmiel explores the latest breakthroughs and revolutions in this field and the challenges still to come. Read more.
11:50am–12:30pm Wednesday, March 7, 2018
Location: 210 C/G Level: Beginner
Matthew Granade (Domino Data Lab)
Average rating: **...
(2.00, 1 rating)
Predictive analytics and artificial intelligence have become critical competitive capabilities. Yet IT teams struggle to provide the support data science teams need to succeed. Matthew Granade explains how leading banks, insurance and pharmaceutical companies, and others manage data science at scale. Read more.
11:50am–12:30pm Wednesday, March 7, 2018
Location: 210 D/H Level: Beginner
Ari Gesher (Kairos Aerospace)
A warming planet needs precise, localized predictions about the effects of climate change to support good long-term and medium-term economic decision making. Ari Gesher demonstrates how to use a mix of physical simulation, enhanced scientific models, machine learning verification, and high-scale computing to predict and package climate predictions as data products.
1:50pm–2:30pm Wednesday, March 7, 2018
Location: 210 A/E Level: Non-technical
Frances Haugen (Pinterest), Patrick Phelps (Pinterest)
Average rating: ****.
(4.67, 3 ratings)
Data science is most powerful when combined with deep domain knowledge, but those with domain knowledge don't work on data-focused teams. So how do you empower employees with diverse backgrounds and skill sets to be effective users of data? Frances Haugen and Patrick Phelps dive into the social side of data and share strategies for unlocking otherwise unobtainable insights. Read more.
1:50pm–2:30pm Wednesday, March 7, 2018
Location: 210 C/G Level: Beginner
John Mertic (The Linux Foundation), Maryna Strelchuk (ING)
John Mertic and Maryna Strelchuk detail the benefits of a vendor-neutral approach to data governance, explain the need for an open metadata standard, and share how companies like ING, IBM, Hortonworks, and more are delivering solutions to this challenge as an open source initiative. Read more.
1:50pm–2:30pm Wednesday, March 7, 2018
Location: 210 D/H Level: Beginner
Ayin Vala (DeepMD | Foundation for Precision Medicine)
Average rating: ****.
(4.33, 3 ratings)
Complex diseases like Alzheimer’s cannot be cured by pharmaceutical or genetic sciences alone, and current treatments and therapies have had mixed success. Ayin Vala explains how to use the power of big data and AI to treat challenging diseases with personalized medicine, which takes into account individual variability in medicine intake, lifestyle, and genetic factors for each patient.
2:40pm–3:20pm Wednesday, March 7, 2018
Location: 210 A/E
Michael Chui (McKinsey Global Institute)
Average rating: ****.
(4.83, 6 ratings)
After decades of extravagant promises, artificial intelligence is finally starting to deliver real-life benefits to early adopters. However, we're still early in the cycle of adoption. Michael Chui explains where investment is going, patterns of AI adoption and value capture by enterprises, and how the value potential of AI across sectors and business functions is beginning to emerge. Read more.
2:40pm–3:20pm Wednesday, March 7, 2018
Location: 210 C/G Level: Non-technical
Katie Malone (Civis Analytics), Skipper Seabold (Civis Analytics)
Average rating: *****
(5.00, 1 rating)
A huge challenge for data science managers is determining priorities for their teams, which often have more good ideas than they have time. Katie Malone and Skipper Seabold share a framework that their large and diverse data science team uses to identify, discuss, select, and manage data science projects for a fast-moving startup. Read more.
2:40pm–3:20pm Wednesday, March 7, 2018
Location: 210 D/H Level: Beginner
Or Herman-Saffar (Dell), Ran Taig (Dell EMC)
Average rating: *....
(1.67, 3 ratings)
What if we could predict when and where crimes will be committed? Or Herman-Saffar and Ran Taig offer an overview of Crimes in Chicago, a publicly published dataset of reported incidents of crime that have occurred in Chicago since 2001. Or and Ran explain how to use this data to explore committed crimes to find interesting trends and make predictions for the future. Read more.
4:20pm–5:00pm Wednesday, March 7, 2018
Location: 210 A/E Level: Intermediate
Ted Malaska (Blizzard Entertainment), Jonathan Seidman (Cloudera)
Average rating: ****.
(4.67, 3 ratings)
Recent years have seen dramatic advancements in the technologies available for managing and processing data. While these technologies provide powerful tools to build data applications, they also require new skills. Ted Malaska and Jonathan Seidman explain how to evaluate these new technologies and build teams to effectively leverage these technologies and achieve ROI with your data initiatives. Read more.
4:20pm–5:00pm Wednesday, March 7, 2018
Location: 210 C/G
Moderated by:
Lisha Li (Amplify Partners)
Panelists:
Katherine Boyle (General Catalyst), Wayne Hu (SignalFire), Andrew Parker (Spark Capital), Brandon Reeves (Lux Capital)
Average rating: ****.
(4.00, 1 rating)
To anticipate who will succeed and to invest wisely, investors spend a lot of time trying to understand the longer-term trends within an industry. In this panel discussion, top-tier VCs look over the horizon to consider the big trends in how data is being put to work in startups and share what they think the field will look like in a few years (or more). Read more.
4:20pm–5:00pm Wednesday, March 7, 2018
Location: 210 D/H Level: Intermediate
Kapil Surlaker (LinkedIn), Ya Xu (LinkedIn)
Average rating: *****
(5.00, 3 ratings)
Metrics measurement and experimentation play crucial roles in every product decision at LinkedIn. Kapil Surlaker and Ya Xu explain why, to meet the company's needs, LinkedIn built the UMP and XLNT platforms for metrics computation and experimentation, respectively, which have allowed the company to perform measurement and experimentation very efficiently at scale while preserving trust in data. Read more.
5:10pm–5:50pm Wednesday, March 7, 2018
Location: 210 A/E
Alysa Z. Hutnik (Kelley Drye & Warren LLP), Crystal Skelton (Kelley Drye & Warren LLP)
Average rating: *****
(5.00, 1 rating)
Big data promises enormous benefits for companies. But what about privacy, data protection, and consumer laws? Having a solid understanding of the legal and self-regulatory rules of the road is key to maximizing the value of your data while avoiding data disasters. Alysa Hutnik and Crystal Skelton share legal best practices and practical tips to avoid becoming a big data “don’t.”
5:10pm–5:50pm Wednesday, March 7, 2018
Location: 210 C/G Level: Non-technical
Stephanie Beben (Booz Allen Hamilton)
How can you most effectively use machine intelligence to drive strategy? By merging it in the right way with the human ingenuity of leaders throughout your organization. Stephanie Beben shares insights from her work with pioneering companies, government agencies, and nonprofits that are successfully navigating this partnership by becoming “mathematical corporations.” Read more.
5:10pm–5:50pm Wednesday, March 7, 2018
Location: 210 D/H
Derek Ruths (CAI)
Unreasonable sales forecasts, badly overstocked inventory, misguided investments . . . bad analyses happen all the time, leading to bad decisions and costing businesses millions of dollars. Derek Ruths shares the five most common issues that lead to bad data-informed thinking. Read more.
11:00am–11:40am Thursday, March 8, 2018
Location: 210 A/E
Mike Olson (Cloudera)
Average rating: ****.
(4.75, 4 ratings)
Mike Olson shares examples of real-world machine learning applications, explores a variety of challenges in putting these capabilities into production (including the speed with which technology is moving, cloud versus in-data-center consumption, security and regulatory compliance, and skills and agility in getting data and answers into the right hands), and outlines proven ways to meet them.
11:00am–11:40am Thursday, March 8, 2018
Location: 210 C/G Level: Beginner
Secondary topics: Graphs and Time-series
Michael Schrenk (Self-Employed)
Average rating: ****.
(4.00, 5 ratings)
Big data becomes much more powerful when it has context. Fortunately, creative data scientists can create needed context through the use of metadata. Michael Schrenk explains how metadata is created and used to gain competitive advantages, predict troop strength, or even guess Social Security numbers.
11:00am–11:40am Thursday, March 8, 2018
Location: 210 D/H Level: Beginner
Average rating: **...
(2.00, 1 rating)
Sugreev Chawla offers an overview of Spotlight, a tool created by Thorn, a nonprofit that uses technology to fight online child sexual exploitation. It allows law enforcement to process millions of escort ads per month in an effort to fight sex trafficking, using graph analysis, time series analysis, and NLP techniques to surface important networks of ads and characterize their behavior over time. Read more.
11:50am–12:30pm Thursday, March 8, 2018
Location: 210 A/E Level: Intermediate
David Talby (Pacific AI)
Average rating: ***..
(3.50, 4 ratings)
Machine learning and data science systems often fail in production in unexpected ways. David Talby shares real-world case studies showing why this happens and explains what you can do about it, covering best practices and lessons learned from a decade of experience building and operating such systems at Fortune 500 companies across several industries. Read more.
11:50am–12:30pm Thursday, March 8, 2018
Location: 210 C/G Level: Beginner
Paco Nathan (derwen.ai)
Average rating: ****.
(4.25, 4 ratings)
Human in the loop (HITL) has emerged as a key design pattern for managing teams where people and machines collaborate. Such systems are mostly automated, with exceptions referred to human experts, who help train the machines further. Paco Nathan offers an overview of HITL from the perspective of a business manager, focusing on use cases within O'Reilly Media. Read more.
11:50am–12:30pm Thursday, March 8, 2018
Location: 210 D/H Level: Intermediate
Average rating: *****
(5.00, 2 ratings)
With so many business intelligence tools in the Hadoop ecosystem and no common measure to identify the efficiency of each tool, where do you begin to build or modify your enterprise data lake strategy? Sagar Kewalramani shares real-world BI problems and how they were resolved with Hadoop tools and demonstrates how to build an effective data lake strategy with open source tools and components. Read more.
1:50pm–2:30pm Thursday, March 8, 2018
Location: 210 A/E Level: Intermediate
Mark Donsky (Okera), Steven Ross (Cloudera)
In May 2018, the General Data Protection Regulation (GDPR) goes into effect for firms doing business in the EU, but many companies aren't prepared for the strict regulation or fines for noncompliance (up to €20 million or 4% of global annual revenue, whichever is higher). Mark Donsky and Steven Ross outline the capabilities your data environment needs to simplify compliance with GDPR and future regulations.
1:50pm–2:30pm Thursday, March 8, 2018
Location: 210 C/G
Alex Rosenblat (Data & Society Research Institute)
Average rating: *****
(5.00, 1 rating)
Ride-hail drivers work alone, but they’re banding together online to compare notes, uncover new policies, and help each other navigate a workplace characterized by information scarcity. Alex Rosenblat explores how ride-hail workers are using online forums to create their own workplace culture as employment relationships grow more remote and algorithms replace human managers. Read more.
1:50pm–2:30pm Thursday, March 8, 2018
Location: 210 D/H Level: Intermediate
Chris Chapo (Gap Inc.)
Average rating: ****.
(4.20, 5 ratings)
Chris Chapo walks you through real-world examples of companies that are driving transformational change by leveraging data science and analytics, paying particular attention to established organizations where these capabilities are newer concepts. Read more.
2:40pm–3:20pm Thursday, March 8, 2018
Location: 210 A/E Level: Non-technical
Anjali Thakur (Accenture)
Average rating: *....
(1.00, 5 ratings)
Whether you are a technology or a services provider, understanding your value in the ecosystem and focusing on the right partners to reach your market goals is critical. Anjali Thakur shares examples of teaming models and leading practices for accelerating value from your ecosystem strategy. Read more.
2:40pm–3:20pm Thursday, March 8, 2018
Location: 210 C/G Level: Beginner
Brian Karfunkel (Pinterest)
Average rating: ****.
(4.50, 2 ratings)
When software companies use A/B tests to evaluate product changes and fail to accurately estimate the long-term impact of such experiments, they risk optimizing for the users they have at the expense of the users they want to have. Brian Karfunkel explains how to estimate an experiment’s impact over time, thus mitigating this risk and giving full credit to experiments targeted at noncore users. Read more.
2:40pm–3:20pm Thursday, March 8, 2018
Location: 210 D/H
Matt Derda (Trifacta), Harrison Lynch (Consensus Corporation)
Average rating: **...
(2.00, 1 rating)
Matt Derda and Harrison Lynch explain how Consensus leverages the combined power of data wrangling and machine learning to more efficiently identify and reduce retail fraud and how adopting data wrangling technology has helped Trifacta reduce time spent data wrangling from six weeks to one week. Read more.
4:20pm–5:00pm Thursday, March 8, 2018
Location: 210 A/E Level: Non-technical
Jesse Anderson (Big Data Institute)
Average rating: ****.
(4.00, 1 rating)
There's been an explosion of new architectures, but is this because engineers love new things or is there a good business reason for these changes? Jesse Anderson explores new architectures and the actual business problems they solve. You may find out that your team would be far more productive if you moved to these architectures. Read more.
4:20pm–5:00pm Thursday, March 8, 2018
Location: 210 C/G Level: Beginner
Felix Gorodishter (GoDaddy)
Average rating: ***..
(3.00, 2 ratings)
GoDaddy ingests and analyzes over 100,000 data points per second. Felix Gorodishter discusses the company's big data journey from ingest to automation, how it is evolving its systems to scale to over 10 TB of new data per day, and how it uses tools like anomaly detection to produce valuable insights, such as the worth of a reminder email. Read more.
4:20pm–5:00pm Thursday, March 8, 2018
Location: 210 D/H Level: Beginner
Marcin Pilarczyk (Ryanair)
Average rating: *****
(5.00, 2 ratings)
Managing fuel at a company flying 120 million passengers yearly is not a trivial task. Marcin Pilarczyk explores the main aspects of fuel management of a modern airline and offers an overview of machine learning methods supporting long-term planning and daily decisions.