Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Schedule: Sponsored sessions

9:00am–5:00pm Tuesday, 09/11/2018
Location: 1A 04/05
Jerry Overton (DXC), Ashim Bose (DXC), Samir Sehovic (DXC)
Average rating: *****
(5.00, 1 rating)
Acquiring machine learning (ML) technology is relatively straightforward, but ML must be applied to be useful. In this one-day boot camp that is equal parts hackathon, presentation, and group participation, Jerry Overton, Ashim Bose, and Samir Sehovic teach you how to apply advanced analytics in ways that reshape the enterprise and improve outcomes. Read more.
9:25am–9:35am Wednesday, 09/12/2018
Location: 3E
Ted Dunning (MapR)
Average rating: **...
(2.79, 19 ratings)
There’s real value in big data and more waiting when you add real-time, but to get the payoff, you need successful deployments of your AI and data-intensive applications. You need to be ready with your current applications in production but must have an architecture and infrastructure that are ready for the next ones as well. Ted Dunning explores how others have fared in this journey. Read more.
9:50am–9:55am Wednesday, 09/12/2018
Location: 3E
DD Dasgupta (Cisco)
Average rating: ***..
(3.60, 15 ratings)
DD Dasgupta explores the exciting development of the edge-cloud continuum, which is redefining business models and technology strategies while creating a vast array of new applications that will power the digital age. The continuum is also destroying what we know about the centralized data centers and cloud computing infrastructures that were so vital to the success of the previous computing eras. Read more.
10:15am–10:25am Wednesday, 09/12/2018
Location: 3E
Drew Paroski (MemSQL), Aatif Din (Fanatics)
Average rating: **...
(2.92, 13 ratings)
Today’s successful businesses utilize data better than their competitors; however, data sprawl and inefficient data infrastructure restrict what’s possible. Blending the best of the past with the software innovations of today will solve future data challenges. Drew Paroski shares how to develop modern database applications without sacrificing cost savings, data familiarity, and flexibility. Read more.
11:20am–12:00pm Wednesday, 09/12/2018
Location: 1A 01/02
Jim Scott (NVIDIA)
Drawing on his experience working with customers across many industries, including chemical sciences, healthcare, and oil and gas, Jim Scott details the major impediments to successful completion of deep learning projects and solutions while walking you through a customer use case. Read more.
11:20am–12:00pm Wednesday, 09/12/2018
Location: 1A 03
Han Yang (Cisco)
Data is the lifeblood of an enterprise, and it's being generated everywhere. To overcome the challenges of data gravity, data analytics, including machine learning, is best done where the data is located: ubiquitous machine learning. Han Yang explains how to overcome the challenges of machine learning everywhere. Read more.
11:20am–12:00pm Wednesday, 09/12/2018
Location: 1E 17
Tim Davis (IBM)
Average rating: ***..
(3.33, 3 ratings)
Tim Davis discusses key pain points and solutions to problems many enterprises face with data in silos, poor-quality data that cannot always be trusted, and managing and making large volumes of data available to derive more accurate insights and machine learning models. Read more.
11:20am–12:00pm Wednesday, 09/12/2018
Location: 1E 15
Ward Eldred (NVIDIA)
Average rating: *****
(5.00, 2 ratings)
Ward Eldred offers an overview of the types of analytical problems that can be solved using deep learning and shares a set of heuristics that can be used to evaluate the feasibility of analytical AI projects. Read more.
11:20am–12:00pm Wednesday, 09/12/2018
Location: 1E 06
Paul Kent (SAS)
Average rating: *****
(5.00, 1 rating)
Software is eating the world, and open source is eating the software. Most contemporary analytics shops use a lot of open source software in their analytics platform. So where does commercial software like SAS fit? Paul Kent explains how you can achieve the best of both worlds by combining your favorite open source software with the power of SAS analytics. Read more.
11:20am–12:00pm Wednesday, 09/12/2018
Location: 1A 04/05
Petrus Smith (PwC)
Peet Smith explains how PwC is using modern database tools with a combination of open source technologies to automate and scale data ingestion and transformation to get data to engagement teams to help them streamline and accelerate client service delivery. Read more.
1:15pm–1:55pm Wednesday, 09/12/2018
Location: 1A 01/02
Skyler Thomas (MapR)
Average rating: *****
(5.00, 2 ratings)
In the past, there have been major challenges in quickly creating machine learning training environments and deploying trained models into production. Skyler Thomas details how Kubernetes helps data scientists and IT work in concert to speed model training and time-to-value. Read more.
1:15pm–1:55pm Wednesday, 09/12/2018
Location: 1A 04/05
Arakere Ramesh (Intel), Bharath Yadla (Aerospike)
Persistent memory accelerates analytics, database, and storage workloads across a variety of use cases, bringing new levels of speed and efficiency to the data center and to in-memory computing. Arakere Ramesh and Bharath Yadla offer an overview of the newly announced Intel Optane data center persistent memory and share the exciting potential of this technology in analytics solutions. Read more.
1:15pm–1:55pm Wednesday, 09/12/2018
Location: 1E 17
Zhi Zhu (China Construction Bank), Luke Han (Kyligence)
When China Construction Bank wanted to migrate 23,000+ reports to mobile, it chose Apache Kylin as the high-performance and high-concurrency platform to refactor its data warehouse architecture to serve 400K+ users. Zhi Zhu and Luke Han detail the necessary architecture and best practices for refactoring a data warehouse for mobile analytics. Read more.
1:15pm–1:55pm Wednesday, 09/12/2018
Location: 1E 15
Darrin Johnson (NVIDIA)
While every enterprise is on a mission to infuse its business with deep learning, few know how to build the infrastructure to get them there. Darrin Johnson shares insights and best practices learned from NVIDIA's deep learning deployments around the globe that you can leverage to shorten deployment timeframes, improve developer productivity, and streamline operations. Read more.
1:15pm–1:55pm Wednesday, 09/12/2018
Location: 1A 03
Srikanth Desikan (Oracle)
Average rating: *****
(5.00, 1 rating)
SparklineData is an in-memory distributed scale-out analytics platform built on Apache Spark to enable enterprises to query on data lakes directly with instant response times. Srikanth Desikan offers an overview of SparklineData and explains how it can enable new analytics use cases working on the most granular data directly on data lakes. Read more.
1:15pm–1:55pm Wednesday, 09/12/2018
Location: 1E 06
Interested in how Ebates is using a hybrid on-premises and cloud implementation to scale out its centralized business intelligence and data hub? Mark Stange-Tregear shares the history, business context, and technical plan around Ebates’s hybrid Hadoop-AWS cloud approach. Read more.
2:05pm–2:45pm Wednesday, 09/12/2018
Location: 1A 03
Chris Wojdak (Symcor)
Average rating: ****.
(4.67, 3 ratings)
Chris Wojdak explains how Symcor has transformed its big data architecture using Informatica’s comprehensive machine learning-based solutions for data integration, data quality, data cataloging, and data governance. Read more.
2:05pm–2:45pm Wednesday, 09/12/2018
Location: 1E 06
Randy Lea (Arcadia Data)
Average rating: *****
(5.00, 1 rating)
The use of data lakes continues to grow, and the right business intelligence (BI) and analytics tools on data lakes are critical to data lake success. Randy Lea explains why existing BI tools work well for data warehouses but not data lakes and why every organization should have two BI standards: one for data warehouses and one for data lakes. Read more.
2:05pm–2:45pm Wednesday, 09/12/2018
Location: 1E 17
Intelligent enterprises—fueled by rapid advances in artificial intelligence (AI), machine learning (ML), and the internet of things (IoT)—promise significant business value. Richard Mooney explains how to achieve the game-changing outcomes of an intelligent enterprise, delivering value across business functions with the synergy of machine and human intelligence. Read more.
2:05pm–2:45pm Wednesday, 09/12/2018
Location: 1E 15
Michael Balint (NVIDIA)
Michael Balint explains how NVIDIA employs its own distribution of Kubernetes, in conjunction with DGX hardware, to make the most efficient use of GPU resources and scale its efforts across a cluster, allowing multiple users to run experiments and push their finished work to production. Read more.
2:05pm–2:45pm Wednesday, 09/12/2018
Location: 1A 01/02
Anand Raman (Microsoft), Wee Hyong Tok (Microsoft)
Anand Raman and Wee Hyong Tok walk you through applying AI technologies in the cloud. You'll learn how to add prebuilt AI capabilities like object detection, face understanding, translation, and speech to applications; build cognitive search applications that understand deep content in images, text, and other data; use the Azure platform to accelerate machine learning; and more. Read more.
2:05pm–2:45pm Wednesday, 09/12/2018
Location: 1A 04/05
Ben Sharma (Zaloni), Selwyn Collaco (TMX)
Average rating: *****
(5.00, 2 ratings)
Selwyn Collaco and Ben Sharma share insights from their real-world experience, discuss best practices for architecture, technology, data management, and governance to enable centralized data services, and explain how to leverage the Zaloni Data Platform (ZDP), an integrated self-service data platform, to operationalize the enterprise data lake. Read more.
2:55pm–3:35pm Wednesday, 09/12/2018
Location: 1A 03
Mathew Lodge (Anaconda)
Average rating: *****
(5.00, 1 rating)
The days of deploying Java code to Hadoop and Spark data lakes for data science and ML are numbered. Welcome to the future. Containers and Kubernetes make great language-agnostic distributed computing clusters: it's just as easy to deploy Python as it is Java. Mathew Lodge shows you how. Read more.
2:55pm–3:35pm Wednesday, 09/12/2018
Location: 1E 06
Basil Faruqui (BMC Software)
Average rating: **...
(2.00, 1 rating)
Basil Faruqui demonstrates how to simplify the automation and orchestration of an IoT-driven data pipeline in a cloud environment where machine learning algorithms predict failures. Read more.
2:55pm–3:35pm Wednesday, 09/12/2018
Location: 1E 17
Chris Stirrat (Eagle Investment Systems)
Average rating: ***..
(3.00, 1 rating)
Eagle Investment Systems, a leading provider of financial services technology, is building a new Hadoop and cloud-based data management solution. Chris Stirrat explains how Eagle went from incubation to an enterprise-scale solution in just 10 months, using a Hadoop-based big data stack and multitenant architecture, transforming software creation, delivery, quality, technology, and culture. Read more.
2:55pm–3:35pm Wednesday, 09/12/2018
Location: 1A 04/05
Michael Mahoney (Kinetica)
Michael Mahoney demonstrates how to leverage the power of GPUs to converge streaming data analysis, location analysis, and streamlined machine learning with a single engine. Along the way, Michael shares real-world case studies on how Kinetica is used to solve complex data challenges. Read more.
2:55pm–3:35pm Wednesday, 09/12/2018
Location: 1A 01/02
Sara Alavi (Bell Canada)
Bell Canada, Canada's largest communications company, leads the industry in providing world-class broadband communications services to consumers and business customers. Join Sara Alavi to learn how the network big data and AI team within Bell is using modern data environments and applying a startup mindset to transform traditional networks into insight-driven intelligent networks. Read more.
4:35pm–5:15pm Wednesday, 09/12/2018
Location: 1A 01/02
Average rating: *****
(5.00, 3 ratings)
TD Bank’s data analytics team has undertaken a multiyear journey to modernize its data infrastructure for current and future needs. Joseph DosSantos explains how the team built a governed data lake foundation, enabling business users to leverage its big data environment to extract analytical insights while minimizing risks. Read more.
4:35pm–5:15pm Wednesday, 09/12/2018
Location: 1E 17
Paul Scott-Murphy (WANdisco)
Average rating: ****.
(4.50, 2 ratings)
Every organization is considering its storage options, with an eye toward the cloud. Paul Scott-Murphy explores what makes different large-scale storage systems and services unique, their clear (and unexpected) differences, the options you have to use them, and the surprises you can expect along the way. Read more.
4:35pm–5:15pm Wednesday, 09/12/2018
Location: 1A 03
As the data authority for hybrid cloud for big data analytics and AI, NetApp understands the value of the access, management, and control of data. Karthikeyan Nagalingam discusses the NetApp Data Fabric, which provides a unified data management environment that spans edge devices, data centers, and multiple hyperscale clouds using ONTAP software, all-flash systems, ONTAP Select, and cloud volumes. Read more.
4:35pm–5:15pm Wednesday, 09/12/2018
Location: 1E 15
Alen Capalik (FASTDATA.io), Jim McHugh (NVIDIA), SriSatish Ambati (H2O.ai), Tim Delisle (Datalogue)
Explore case studies from Datalogue, FASTDATA.io, and H2O.ai that demonstrate how GPU-accelerated analytics, machine learning, and ETL help companies overcome slow queries and tedious data preparation processes, dynamically correlate data, and enjoy automatic feature engineering. Read more.
4:35pm–5:15pm Wednesday, 09/12/2018
Location: 1E 06
Bruno Faria (Amazon Web Services)
Bruno Faria explains how to identify the components and workflows in your current environment and shares best practices to migrate these workloads to AWS. Read more.
4:35pm–5:15pm Wednesday, 09/12/2018
Location: 1A 04/05
Anand Raman (Impetus Technologies)
Average rating: *....
(1.00, 1 rating)
Is a single source of truth across the enterprise possible, or is it just an expensive myth? Anand Raman explains why you need a holistic decision framework that addresses multiple facets from platform to processes. Join in to explore EDW modernization strategies, self-service analytics, and interactive insights on big data and discover a process to get to a unified data model. Read more.
5:25pm–6:05pm Wednesday, 09/12/2018
Location: 1A 01/02
Ivan Jibaja (Pure Storage)
Average rating: *****
(5.00, 1 rating)
Pure Storage runs over 70,000 tests per day. Using Spark’s flexible computing platform, the company can write a single application for both streaming and batch jobs so the company's team of triage engineers can understand the state of the continuous integration pipeline. Ivan Jibaja discusses the use case for big data analytics technologies, the architecture of the solution, and lessons learned. Read more.
5:25pm–6:05pm Wednesday, 09/12/2018
Location: 1A 03
Dan Adams (Pitney Bowes)
The role of data and the demand to get it right, coupled with competitive pressures to move faster, have dramatically increased. Companies now recognize data as an asset and need to manage it that way. Join Dan Adams for the insights you need to ensure that your data addresses current and future needs and that your organization is set up for success. Read more.
5:25pm–6:05pm Wednesday, 09/12/2018
Location: 1E 15
Renee Yao (NVIDIA)
Average rating: *****
(5.00, 1 rating)
Renee Yao explains how generative adversarial networks (GANs) are successfully used to improve data generation and explores specific real-world examples where customers have deployed GANs to solve challenges in the healthcare, space, transportation, and retail industries. Read more.
5:25pm–6:05pm Wednesday, 09/12/2018
Location: 1A 04/05
Mark Huang (Bell Canada)
Like all telecommunication giants, Bell Canada relies on huge volumes of data to make accurate business decisions and deliver better services. Mark Huang discusses why Bell Canada chose Kyvos’s OLAP on big data technology to achieve multidimensional analytics and how it helped the company meet its growing business reporting demands. Read more.
5:25pm–6:05pm Wednesday, 09/12/2018
Location: 1E 17
Sam Chance (Cambridge Semantics), Partha Bhattachargee (Cambridge Semantics)
Ben Szekely shares a vision for digital innovation: The data fabric connects enterprise data for unprecedented access in an overlay fashion that does not disrupt current investments. Interconnected and reliable data drives business outcomes by automating scalable AI and ML efforts. Graph technology is the way forward to realize this future. Read more.
5:25pm–6:05pm Wednesday, 09/12/2018
Location: 1E 06
Randy Lea (Arcadia Data)
The use of data lakes continues to grow, and the right business intelligence (BI) and analytics tools on data lakes are critical to data lake success. Randy Lea explains why existing BI tools work well for data warehouses but not data lakes and why every organization should have two BI standards: one for data warehouses and one for data lakes. Read more.
9:15am–9:20am Thursday, 09/13/2018
Location: 3E
Average rating: **...
(2.87, 15 ratings)
IBM Analytics’s Dinesh Nirmal solves school lunch and the struggle to keep ahead of regulations. With AI tech like deep learning and NLG, supplying meals to California’s kids leaps from enriching metadata for compliance to actionable insights for the business. Read more.
9:30am–9:35am Thursday, 09/13/2018
Location: 3E
Average rating: ***..
(3.67, 9 ratings)
Data is the fuel for analytics and AI workloads, but the challenges in using it are constant. Ziya Ma discusses how recent innovations from Intel in high-capacity persistent memory and open source software are accelerating production-scale deployments, delivering breakthrough optimizations and faster insights to a wide range of opportunities in the digital enterprise. Read more.
9:55am–10:00am Thursday, 09/13/2018
Location: 3E
Chad W. Jennings (Google)
Average rating: ***..
(3.45, 11 ratings)
Cities all over the world are using data and analytics to optimize infrastructure, but city planners are often held back by outdated data gathering methods and legacy analysis tools. Chad Jennings details how Geotab, a leader in IoT fleet logistics, brought BigQuery's unique machine learning and geospatial capabilities to its existing datasets to deliver a more capable solution to city planners. Read more.
10:20am–10:25am Thursday, 09/13/2018
Location: 3E
Ben Sharma (Zaloni)
Average rating: ***..
(3.00, 12 ratings)
Once, a company could live 60-70 years on the S&P 500. Now it averages 15 years. If companies were people, this would be an epidemic on par with the Black Plague. But the same things that dragged humanity out of that dark age can drag companies out of this one. Read more.
11:20am–12:00pm Thursday, 09/13/2018
Location: 1A 03/04/05
Bob Bradley (Geotab), Chad W. Jennings (Google)
Average rating: ****.
(4.50, 4 ratings)
If your company isn’t good at analytics, it’s not ready for AI. Bob Bradley and Chad W. Jennings explain how the right data strategy can set you up for success in machine learning and artificial intelligence—the new ground for gaining competitive edge and creating business value. You'll then see an in-depth demonstration of Google technology from smart cities innovator Geotab. Read more.
11:20am–12:00pm Thursday, 09/13/2018
Location: 1A 01/02
Jennifer Shin (8 Path Solutions | NYU Stern | IBM)
Common wisdom dictates that we should never make assumptions, but assumptions are essential in the creation of statistical models. Jennifer Shin explores how assumptions fit into the creation of a statistical model, the pitfalls of applying a model to data without taking the underlying assumptions into account, and how to identify datasets where the model and its assumptions are applicable. Read more.
11:20am–12:00pm Thursday, 09/13/2018
Location: 1E 06
Arun Murugan (GE Digital), Jeff Miller (GE)
Average rating: **...
(2.00, 2 ratings)
Arun Murugan and Jeff Miller detail how complex relationships are discovered and modeled to simplify analytics while keeping an Agile architecture for data acquisition. You’ll see how GE uses machine learning (powered by Io-Tahoe) for data discovery and profiling, supporting the data engineering needed to develop a standard data model essential to enterprise use cases. Read more.
1:10pm–1:50pm Thursday, 09/13/2018
Location: 1E 10/11
Bruno Faria (Amazon Web Services)
Average rating: ****.
(4.00, 1 rating)
Bruno Faria explains how to identify the components and workflows in your current environment and shares best practices to migrate these workloads to AWS. Read more.
1:10pm–1:50pm Thursday, 09/13/2018
Location: 1E 06
Dave Huh (Hitachi Vantara), Kevin Haas (Hitachi Vantara)
Data in most organizations today is massive, messy, and often found in silos. With so many sources to analyze, data engineers need to construct robust data pipelines using automation and minimize duplicate processes, as computation is costly for big data. David Huh shares strategies to construct data pipelines for machine learning, including one to reduce time to insight from weeks to hours. Read more.
1:10pm–1:50pm Thursday, 09/13/2018
Location: 1A 01/02
Shivnath Babu (Unravel Data Systems | Duke University), Madhusudan Tumma (TIAA)
Average rating: ****.
(4.00, 1 rating)
Operationalizing big data apps in a quick, reliable, and cost-effective manner remains a daunting task. Shivnath Babu and Madhusudan Tumma outline common problems and their causes and share best practices to find and fix these problems quickly and prevent such problems from happening in the first place. Read more.
1:10pm–1:50pm Thursday, 09/13/2018
Location: 1A 03/04/05
Ian Swanson (Oracle)
Ian Swanson explores why and how data scientists and line-of-business leaders must treat AI as a team sport and explains what tools are needed to deploy models and applications that truly inform decision making. Read more.
2:00pm–2:40pm Thursday, 09/13/2018
Location: 1A 03/04/05
Kyle Davis (Redis Labs)
Average rating: *****
(5.00, 1 rating)
Kyle Davis explains how Redis can be used for ingesting high-velocity data from large-scale platforms and IoT data collections as well as for storing and querying data using probabilistic data structures that trade some precision for both higher speed and lower storage requirements. Along the way, Kyle shares examples and a demo of the solution. Read more.
2:00pm–2:40pm Thursday, 09/13/2018
Location: 1A 01/02
Patrick Nussbaumer (Alteryx)
There is a lot of buzz around data science and machine learning in the world today. Unfortunately, to truly innovate with data and advanced capabilities, organizations need to expand their focus beyond just a few specialists. Patrick Nussbaumer details how focusing on people can help improve analytic value and drive innovation. Read more.
2:00pm–2:40pm Thursday, 09/13/2018
Location: 1E 06
Deborah Reynolds (Pfizer), Kurt Muehmel (Dataiku)
Average rating: ****.
(4.00, 2 ratings)
By creating a collaborative and interactive analytic environment, a forward-thinking company may harness the best capabilities of its business analysts and data scientists to answer the company’s most pressing business questions. Deborah Reynolds and Kurt Muehmel explain how large enterprises can successfully put data at the core of everyday business decisions. Read more.
3:30pm–4:10pm Thursday, 09/13/2018
Location: 1E 06
Antonio Fragoso (Globant)
Average rating: *....
(1.00, 1 rating)
Antonio Fragoso explores the key aspects of implementing a natural language processing project within your organization and reveals the necessary steps for making it a success. Antonio focuses on how to leverage an iterative process that can pave the way toward building a successful product. Read more.
4:20pm–5:00pm Thursday, 09/13/2018
Location: 1E 14
Mathew Lodge (Anaconda)
Average rating: *****
(5.00, 1 rating)
The days of deploying Java code to Hadoop and Spark data lakes for data science and ML are numbered. Welcome to the future. Containers and Kubernetes make great language-agnostic distributed computing clusters: it's just as easy to deploy Python as it is Java. Mathew Lodge shows you how. Read more.
4:20pm–5:00pm Thursday, 09/13/2018
Location: 1A 01/02
Jennifer Shin (8 Path Solutions | NYU Stern | IBM)
Common wisdom dictates that we should never make assumptions, but assumptions are essential in the creation of statistical models. Jennifer Shin explores how assumptions fit into the creation of a statistical model, the pitfalls of applying a model to data without taking the underlying assumptions into account, and how to identify datasets where the model and its assumptions are applicable. Read more.