Presented By O'Reilly and Cloudera
Make Data Work
December 1–3, 2015 • Singapore
 

    321-322
    11:50am Apache SINGA: A flexible and scalable deep learning platform for big data analytics Ju Fan (National University of Singapore), Wei Wang (National University of Singapore)
    1:30pm Data modeling for data science: Simplify your workload with complex types Marcel Kornacker (Cloudera), Skye Wanderman-Milne (Cloudera)
    2:20pm Boosting consumer engagement at PayPal Sujit Mathew (PayPal), Yew Yap Goh (PayPal)
    4:00pm Technology solutions for data analytics with privacy and data control Stephen Hardy (National ICT Australia)
    4:50pm Using EEG and machine learning for lie detection Jennifer Marsman (Microsoft)
    324
    11:50am Monitoring traffic in Singapore using telco data Thomas Holleczek (Singtel)
    1:30pm GearPump: Real time DAG processing at scale Sean Zhong (Previously Intel)
    2:20pm Application of Spark on analyzing massive GIS data for a large number of mobile objects Masaru Dobashi (NTT DATA Corporation), Yoshitaka Suzuki (IHI Corporation)
    4:00pm Modeling machine failure in the IoT era Danielle Dean (Microsoft)
    331
    11:50am Why you need a data strategy Edd Wilder-James (Silicon Valley Data Science)
    4:50pm Don't believe everything you see on CSI: Beyond predictive policing Hong Eng Koh (Oracle), Vladimir Videnovic (Oracle)
    334-335
    1:30pm Multitenant Hadoop across geographically distributed data centers Heesun Won (ETRI), Minh Chau Nguyen (ETRI)
    2:20pm Hadoop in the cloud: An architectural how-to Jairam Ranganathan (Cloudera)
    4:50pm Evolving from RDBMS to NoSQL + SQL Jim Scott (MapR Technologies, Inc.)
    328-329
    11:00am Fast big data analytics with Spark on Tachyon in Baidu Bin Fan (Alluxio), Xiang Wen (Baidu)
    11:50am Adaptive, demand-driven shared transit through crowdsourcing and big data Feng-Yuan Liu (Infocomm Development Authority of Singapore)
    1:30pm Reliable data propagation between SQL and NoSQL databases using Aesop Regunath Balasubramanian (Flipkart Internet)
    4:00pm Architectural patterns for streaming applications Ted Malaska (Blizzard), Mark Grover (Cloudera)
    4:50pm The evolution of massive-scale data processing Tyler Akidau (Google)
    332
    11:00am Road to real-time digital business Rod Smith (IBM Emerging Internet Technologies)
    11:50am The journey to value using advanced analytics Thomas Beaujard (Accenture Digital), Tom Ridsdill-Smith (Woodside)
    1:30pm Demystifying analytics: Idea to insight in 7 minutes Christopher Harrold (Talend)
    2:20pm Avoiding big data becoming a big problem Raghunath Nambiar (Cisco)
    4:00pm Big solution in manufacturing industry: Ask hadoop. Hadoop answers. SeongHwa Ahn (SK telecom), Jisung Kim (SK telecom)
    333
    4:00pm Real-world success with big data Neil Mendelson (Oracle)
    4:50pm Faster time to insight using Spark, Tachyon, and Zeppelin Nirmal Ranganathan (Rackspace)
    7:30am Coffee Break
    Room: Summit 1-2
    8:45am Plenary
    Room: Summit 1-2
    Wednesday keynote welcome Roger Magoulas (O'Reilly Media), Doug Cutting (Cloudera), Alistair Croll (Solve For Interesting)
    8:55am Plenary
    Room: Summit 1-2
    The Next Generation of Analytics Mike Olson (Cloudera)
    9:10am Plenary
    Room: Summit 1-2
    Taxi Uncle, where are you?: Using machine learning to predict taxi availability Kevin Lee (GrabTaxi)
    9:25am Plenary
    Room: Summit 1-2
    Road to real-time digital business, sponsored by IBM Rod Smith (IBM Emerging Internet Technologies)
    9:35am Plenary
    Room: Summit 1-2
    Data ‘daddying’ vs. data empowerment Tara Hirebet (R/GA)
    9:50am Plenary
    Room: Summit 1-2
    Patterns from the future, sponsored by SAS Deepak Ramanathan (SAS Asia Pacific)
    9:55am Plenary
    Room: Summit 1-2
    Music science: Applying streaming data to map a billion behaviors Rishi Malhotra (Saavn)
    10:05am Plenary
    Room: Summit 1-2
    Data-driven, insight-powered transformation, sponsored by Accenture Amit Bansal (Accenture Digital)
    10:10am Plenary
    Room: Summit 1-2
    When AI joins the team: Onboarding the next generation of employees Jana Eggers (Nara Logics)
    10:30am Morning Break Sponsored by SAS
    Room: Concourse 1-4 (Sponsor Pavilion)
    12:30pm Lunch sponsored by IBM
    Room: Concourse 1-4 (Sponsor Pavilion)
    Lunch / Wednesday Industry Tables
    3:00pm Afternoon break Sponsored by Accenture
    Room: Concourse 1-4 (Sponsor Pavilion)
    5:30pm Sponsored by Fusionex and IBM
    Room: Concourse 1-4 (Sponsor Pavilion)
    Attendee Reception
    11:00am-11:40am (40m) Data Science and Advanced Analytics
    Building South East Asia's largest E-commerce Recommender
    Kai Xin Thia (Lazada)
    Southeast Asia provides a unique challenge to large recommender systems: how will you design one system that recommends products to millions of users, many of whom are spread across several countries, with their own language and cultural preferences? Well, you don't. Instead, we will explore a hybrid system that integrates inputs from a variety of recommenders and deploys it on a distributed system.
    11:50am-12:30pm (40m) Data Science and Advanced Analytics
    Apache SINGA: A flexible and scalable deep learning platform for big data analytics
    Ju Fan (National University of Singapore), Wei Wang (National University of Singapore)
    We will introduce Apache SINGA, a flexible and scalable deep learning platform for big data analytics. SINGA is flexible enough to support various deep learning models and general enough to provide a scalable training architecture. We will also show two applications to demonstrate how SINGA is helpful for healthcare data analytics: predicting risk-of-readmission and modeling chronic disease progression.
    1:30pm-2:10pm (40m) Data Science and Advanced Analytics
    Data modeling for data science: Simplify your workload with complex types
    Marcel Kornacker (Cloudera), Skye Wanderman-Milne (Cloudera)
    In this talk, we will explain how data scientists use nested data structures to increase analytic productivity. We will use two well-known relational schemas - TPC-H and Twitter - to demonstrate how to simplify data science workloads with nested schemas. Also, we will outline best practices for converting flat relational schemas into nested ones, and give examples of data science-style analysis.
    2:20pm-3:00pm (40m) Hadoop Use Cases
    Boosting consumer engagement at PayPal
    Sujit Mathew (PayPal), Yew Yap Goh (PayPal)
    Our team’s main focus at PayPal is to boost customer engagement. This talk is about how we use predictive modeling to recommend products to consumers. We will talk about the technologies we use and how we deploy our models to production.
    4:00pm-4:40pm (40m) Data Science and Advanced Analytics
    Technology solutions for data analytics with privacy and data control
    Stephen Hardy (National ICT Australia)
    Privacy in the world of big data is often considered as a legal or regulatory function. However, there are technology solutions for analytics that can be used today to protect users' privacy and to enable applications over data that is too sensitive to share. We will illustrate the state-of-the-art in privacy-preserving machine learning, including new techniques we have developed.
    4:50pm-5:30pm (40m) Data Science and Advanced Analytics
    Using EEG and machine learning for lie detection
    Jennifer Marsman (Microsoft)
    Using the EPOC headset from Emotiv, I can capture the big data stream of EEG from our brains. I will share my results on a “lie detector” experiment comparing brain waves when telling the truth and lying. I have built classifiers based on the EEG data using Azure Machine Learning to predict whether a subject is telling the truth. The effectiveness of multiple classifiers can be easily compared.
    11:00am-11:40am (40m) IoT and Real-time
    Invigorating the Telco landscape: How telcos can use data assets to create new applications
    Amy Shi-Nash (Singtel)
    This talk will broach the topic of how DataSpark has created an innovative way of understanding people and what is important to them, by leveraging advanced data science and the wealth of data in an aggregated manner, while adhering to high standards of data privacy.
    11:50am-12:30pm (40m) IoT and Real-time
    Monitoring traffic in Singapore using telco data
    Thomas Holleczek (Singtel)
    We present a traffic measurement system that monitors subway and expressway traffic from telco location data.
    1:30pm-2:10pm (40m) IoT and Real-time
    GearPump: Real time DAG processing at scale
    Sean Zhong (Previously Intel)
    GearPump is an Akka-based framework that processes real-time data across a DAG of actors. Its data delivery is highly scalable, with at-least-once delivery guarantees.
    2:20pm-3:00pm (40m) IoT and Real-time
    Application of Spark on analyzing massive GIS data for a large number of mobile objects
    Masaru Dobashi (NTT DATA Corporation), Yoshitaka Suzuki (IHI Corporation)
    We are developing a platform to process massive sensor data obtained from social infrastructures and industrial machinery all over the world, in order to achieve advanced safety management. In this session, we'll talk about the capability of Spark to realize time-series data processing, the best practices of application development, and realistic lessons on operating Spark on YARN.
    4:00pm-4:40pm (40m) IoT and Real-time
    Modeling machine failure in the IoT era
    Danielle Dean (Microsoft)
    Predictive maintenance is a technique to predict when an in-service machine will fail so that maintenance can be planned in advance. This talk introduces the landscape and challenges of predictive maintenance applications in the industry. Through a real-world example, the talk also illustrates how to formulate a predictive maintenance problem with three machine learning models.
    4:50pm-5:30pm (40m) IoT and Real-time
    Using machine learning to identify fraud on Telecom networks
    Arshak Navruzyan (Startup.ML)
    Like most large internet sites, Telecom networks are constantly under attack by highly sophisticated fraudsters. Historically, carriers have tried to isolate fraudulent behavior through complex rules. However, increasingly there is a need to use machine learning algorithms that can keep up with the changing face of Telecom fraud.
    11:00am-11:40am (40m) Data-driven Business
    Democratizing big data: Riding the curve from descriptive to prescriptive intelligence
    Tushar Shanbhag (Adatao, Inc)
    All analytics is prescriptive analytics; it just depends on who's writing the prescription, human or machine. In this talk, I will present how to humanize the big data experience, promote collaboration between business users and data scientists, and bridge the gap between human and machine.
    11:50am-12:30pm (40m) Data-driven Business
    Why you need a data strategy
    Edd Wilder-James (Silicon Valley Data Science)
    Big data and data science have great potential for accelerating business, but how do you reconcile the opportunity with the sea of possible technologies? Conventional data strategy has little to guide us, focusing more on governance than on creating new value. In this talk, we explain how to create a modern data strategy that powers data-driven business.
    1:30pm-2:10pm (40m) Data-driven Business
    Building a self-serve real-time reporting platform at LinkedIn
    Shirshanka Das (LinkedIn)
    LinkedIn describes how they’ve built a self-serve petabyte-scale reporting platform centered around Hadoop, that powers all business decision making at LinkedIn. We describe how we overcame challenges to scale to over a thousand analysts, over a thousand metrics, and provide daily, hourly, as well as real-time reports. This has reduced turnaround times for dashboards from weeks to a few hours.
    2:20pm-3:00pm (40m) Data-driven Business
    Data-savvy leaders of the future: Designing an applied analytics course for MBAs
    Hallie Benjamin (Accenture)
    As the lead for the Accenture-UC Berkeley Data & Analytics Partnership, Hallie has worked with a multidisciplinary team of professors, data scientists, and students to design the Applied Learning Course in Data Science. She will talk about her experience and lessons learned for bringing together technical and business minds to help data science feature more prominently in business strategy.
    4:00pm-4:40pm (40m) Data-driven Business
    The 3 key barriers keeping companies from acting upon the possibilities that big data has to offer
    Pauline Brown (Dataiku)
    Getting from raw data to deploying data-driven solutions requires technology, data, and people. All of which exist. So why aren’t we seeing more truly data-driven companies: what's missing and why? Find out how lack of collaboration is what is keeping companies from imagining and actually doing what is possible to accomplish with big data.
    4:50pm-5:30pm (40m) Data-driven Business
    Don't believe everything you see on CSI: Beyond predictive policing
    Hong Eng Koh (Oracle), Vladimir Videnovic (Oracle)
    Public safety and national security are increasingly being challenged by technology; the need to use data to detect and investigate criminal activities has increased dramatically. But with the sheer volume of data and noise, law enforcement organisations are struggling to keep up. This session will examine trends and use cases on how big data can be utilised to make the world a safer place.
    11:00am-11:40am (40m) Hadoop Platform
    Hadoop's storage gap: Resolving transactional access/analytic performance trade-offs with Kudu
    Todd Lipcon (Cloudera)
    This session will investigate the trade-offs between real-time transactional access and fast analytic performance in Hadoop from the perspective of storage engine internals. We will discuss recent advances, evaluate benchmark results from current generation Hadoop technologies, and propose potential ways ahead for the Hadoop ecosystem to conquer its newest set of challenges.
    11:50am-12:30pm (40m) Hadoop Platform
    Designing an SQL-on-Hadoop cluster using Impala simulator: A use case for the banking and financial services sector
    Jun Liu (Intel), Zhaojuan Bian (Intel)
    Based on previous experience, there are many challenges in designing an Impala cluster for production, such as table schema, data placement, file format selection, hardware selection, and software stack parameters tuning. We will walk through a real-world case study in the banking and financial services sector to illustrate how we use our simulator-based approach to design an Impala cluster.
    1:30pm-2:10pm (40m) Hadoop Platform
    Multitenant Hadoop across geographically distributed data centers
    Heesun Won (ETRI), Minh Chau Nguyen (ETRI)
    This session will address how one single Hadoop cluster can be built across many geographically distributed data centers to provide multitenant analytics services. We extend the overall architecture of Hadoop so that multiple tenants can securely access, share, and analyze data in their own isolated executing environments.
    2:20pm-3:00pm (40m) Hadoop Platform
    Hadoop in the cloud: An architectural how-to
    Jairam Ranganathan (Cloudera)
    Apache Hadoop was designed when cloud models were in their infancy. Despite this fact, Hadoop has proven remarkably adept at migrating its architecture to work well in the context of the cloud, as production workloads migrate to a cloud environment. This talk will cover several topics on adapting Hadoop to the cloud.
    4:00pm-4:40pm (40m) Hadoop Platform
    From Oracle to Hadoop: Unlocking Hadoop for your RDBMS with Apache Sqoop and other tools
    Guy Harrison (Dell Software)
    When people think of big data processing, they think of Apache Hadoop, but that doesn't mean traditional databases don't play a role. In most cases users will still draw from data stored in RDBMS systems. Apache Sqoop can be used to unlock that data and transfer it to Hadoop, enabling users with information stored in existing SQL tables to use new analytic tools.
    4:50pm-5:30pm (40m) Hadoop Platform
    Evolving from RDBMS to NoSQL + SQL
    Jim Scott (MapR Technologies, Inc.)
    Application developers have long created complex schemas to handle storing data with minor relationships in an RDBMS. This talk will show how to convert an existing (complicated schema) music database to HBase for transactional workloads, plus how to use Drill against HBase for real-time queries. HBase column families will also be discussed.
    11:00am-11:40am (40m) Hadoop & Beyond
    Fast big data analytics with Spark on Tachyon in Baidu
    Bin Fan (Alluxio), Xiang Wen (Baidu)
    Baidu runs Tachyon in production with more than 100 nodes managing 2PB space! In this talk we will focus on how Tachyon can help improve big data analytics (ad-hoc query) with 30X performance improvement within Baidu.
    11:50am-12:30pm (40m) Hadoop & Beyond
    Adaptive, demand-driven shared transit through crowdsourcing and big data
    Feng-Yuan Liu (Infocomm Development Authority of Singapore)
    At IDA’s Government Analytics department, our team of data scientists work with bus operators to offer demand-driven express bus routes by combining crowdsourcing and big data. We use Apache Spark to analyze ticketing, taxi, and crowdsourced data to find bus routes that are both time-saving and financially viable. We show how these insights are delivered into a new transport option for commuters.
    1:30pm-2:10pm (40m) Hadoop & Beyond
    Reliable data propagation between SQL and NoSQL databases using Aesop
    Regunath Balasubramanian (Flipkart Internet)
    Aesop is an open source, reliable change data propagation system. It has been used to build tiered data stores using best-in-class SQL and NoSQL databases. Aesop provides simple pubsub-like interfaces with implementations for popular technologies like MySQL, HBase, Redis, Elasticsearch, and Kafka. Aesop scales to multi-node clusters that process millions of data records.
    2:20pm-3:00pm (40m) Hadoop & Beyond
    When it absolutely, positively, has to be there: Reliability guarantees in Kafka
    Gwen Shapira (Confluent)
    Kafka provides the low latency, high throughput, high availability, and scale that financial services firms require. But can it also provide complete reliability? In this session, we will go over everything that happens to a message - from producer to consumer - and pinpoint all the places where data can be lost if you are not careful.
    4:00pm-4:40pm (40m) Hadoop & Beyond
    Architectural patterns for streaming applications
    Ted Malaska (Blizzard), Mark Grover (Cloudera)
    In this session, we will discuss common architectural patterns for building streaming applications.
    4:50pm-5:30pm (40m) Hadoop & Beyond
    The evolution of massive-scale data processing
    Tyler Akidau (Google)
    Join me for a whirlwind tour of the conceptual building blocks of massive-scale data processing systems over the last decade, comparing and contrasting systems at Google with popular open source systems in use today.
    11:00am-11:40am (40m) Sponsored
    Road to real-time digital business
    Rod Smith (IBM Emerging Internet Technologies)
    Big data and analytics continue to be a disruptive business force. Are we entering another phase – real-time digital business transformation, where businesses are realizing that the time to adjust to market and customer opportunities and threats is shrinking quickly?
    11:50am-12:30pm (40m) Sponsored
    The journey to value using advanced analytics
    Thomas Beaujard (Accenture Digital), Tom Ridsdill-Smith (Woodside)
    In 2015 Woodside is working with Accenture to deliver predictive analytics to Woodside’s LNG operations. By combining Accenture’s expertise in data analytics and Woodside’s leading operational experience in oil and gas, valuable, actionable insights have been discovered throughout 2015.
    1:30pm-2:10pm (40m) Sponsored
    Demystifying analytics: Idea to insight in 7 minutes
    Christopher Harrold (Talend)
    In this practical demonstration, participants will see how they can perform a simple but meaningful analysis of social sentiment data using freely available and easy-to-deploy tools. Participants will be equipped with the download links, scripts, and a complete step-by-step walkthrough of the analysis from start to finish.
    2:20pm-3:00pm (40m) Sponsored
    Avoiding big data becoming a big problem
    Raghunath Nambiar (Cisco)
    Join us as we review the Big Data landscape and reflect on Big Data lessons being learned in enterprise over the last few years and how these organisations are avoiding their Big Data environments becoming unmanageable by using simplex management for deployment, administration, monitoring and reporting no matter how much the environment scales.
    4:00pm-4:40pm (40m) Sponsored
    Big solution in manufacturing industry: Ask hadoop. Hadoop answers.
    SeongHwa Ahn (SK telecom), Jisung Kim (SK telecom)
    Using a big data system built on the Hadoop platform, we resolved the performance problems of our existing RDBMS-based legacy system. We also set up a real-time pattern analysis system using Spark. It gives hands-on workers an easier and quicker way to monitor and diagnose manufacturing processes than the traditional RDBMS-based legacy system.
    11:00am-11:40am (40m) Sponsored
    Analytics in action – The analytics lifecycle from data discovery to deployment
    Deepak Ramanathan (SAS Asia Pacific)
    With Hadoop becoming the chosen Data Platform across enterprises, analytical lifecycles are now being powered with Hadoop being the centrepiece for discovery and deployment. During this talk, attendees will get insights from organisations that are building and deploying thousands of analytical models into their operational environments.
    11:50am-12:30pm (40m) Sponsored
    Hadoop data replication: Guaranteeing consistency across distributions, versions and datacenters while active
    Paul Scott-Murphy (WANdisco)
    Hadoop lacks a mechanism to extend the distributed file system beyond the confines of a single cluster. Done right, active-active consensus can guarantee consistency of replicated file system changes regardless of Hadoop versions, distributions and communication latency. Find out how to perform selective data replication for cluster migration, disaster recovery, multi-site ingest, backup and more.
    1:30pm-2:10pm (40m) Sponsored
    On the edge of everything – from edge security to edge analytics: Emerging technologies define the way progressive organizations will interact with data
    Joanna Schloss (Dell Software)
    Join us to hear how Big Data and Analytics Subject Matter Expert, Joanna Schloss, envisions emerging technologies shaping the future of mission critical initiatives such as security and analytics. How will in-memory, big data, and IoT shape and guide businesses with deployment and maintenance of key capabilities?
    2:20pm-3:00pm (40m) Sponsored
    Scaling document data up (way up) while scaling complexity down
    Ted Dunning (MapR Technologies)
    Flexible data model. SQL compatibility. Unlimited scale. Nearly all data systems require that you pick at most one or two of three. You can now have them all. I will show how a real-world relational database can be massively simplified using document structure, how that database can be queried using SQL and how it can scale to the trillion-row, TB-scale required by modern applications.
    4:00pm-4:40pm (40m) Sponsored
    Real-world success with big data
    Neil Mendelson (Oracle)
    Companies who are successful with big data need to be analytics-driven. During this session, Neil will look at new analytics capabilities that are essential for big data to deliver results, and discuss how to maximize the time you spend providing differentiation for your organization. This session will also cover some common big data use cases in both industry and government.
    4:50pm-5:30pm (40m) Sponsored
    Faster time to insight using Spark, Tachyon, and Zeppelin
    Nirmal Ranganathan (Rackspace)
    In this talk, we will discuss how a streamlined Spark stack including Tachyon and Zeppelin can solve both the need for speed and reduced development time. We will walk through a sample use case that utilizes 20 years of data to look for insights and create a predictive model from the dataset.
    7:30am-8:45am (1h 15m)
    Break: Coffee Break
    8:45am-8:55am (10m)
    Wednesday keynote welcome
    Roger Magoulas (O'Reilly Media), Doug Cutting (Cloudera), Alistair Croll (Solve For Interesting)
    Strata + Hadoop World Program Chairs Roger Magoulas, Doug Cutting, and Alistair Croll welcome you to the first day of keynotes.
    8:55am-9:10am (15m)
    The Next Generation of Analytics
    Mike Olson (Cloudera)
    Hadoop has come a long way from monolithic storage and batch processing; today the ecosystem is diverse and flexible and is emerging as the foundation of next-generation analytic applications. Join Mike Olson, Cloudera's Chief Strategy Officer, as he discusses new innovations across the ecosystem and gives a vision for Hadoop as an architectural must-have for analytics transformation.
    9:10am-9:25am (15m)
    Taxi Uncle, where are you?: Using machine learning to predict taxi availability
    Kevin Lee (GrabTaxi)
    Why do taxi drivers not want to pick me up when I most need a taxi? Join GrabTaxi's Kevin Lee to learn how GrabTaxi uses machine learning to answer this age-old question and build models for predicting taxi availability in order to improve matching on the platform.
    9:25am-9:35am (10m) Sponsored
    Road to real-time digital business, sponsored by IBM
    Rod Smith (IBM Emerging Internet Technologies)
    Big data and analytics continue to be a disruptive business force. Are we entering another phase – real-time digital business transformation, where businesses are realizing that the time to adjust to market and customer opportunities and threats is shrinking quickly?
    9:35am-9:50am (15m)
    Data ‘daddying’ vs. data empowerment
    Tara Hirebet (R/GA)
    When data is hidden and crunched, and used purely for organization and optimization, we may be losing out on a crucial value it can offer – that of empowerment, engagement and impactful behavioral change.
    9:50am-9:55am (5m) Sponsored
    Patterns from the future, sponsored by SAS
    Deepak Ramanathan (SAS Asia Pacific)
    Join this keynote presentation to get tips from the future and hear about key patterns emerging from a wide cross section of corporate and institutional Hadoop journeys. Perhaps they’ll inspire yours.
    9:55am-10:05am (10m)
    Music science: Applying streaming data to map a billion behaviors
    Rishi Malhotra (Saavn)
    In this session, we’ll take a look at how music streaming delivers real-time data that enables us to proxy a billion behaviors and apply the signals to other industries. Rishi was also a participant in the O’Reilly Study “Music Science”, published in 2015 by Alistair Croll.
    10:05am-10:10am (5m) Sponsored
    Data-driven, insight-powered transformation, sponsored by Accenture
    Amit Bansal (Accenture Digital)
    Learn how the intersection of people, data and intelligent machines will have far-reaching impact on the productivity, efficiency and operations of industries around the world as organizations transform to become data-driven, insight-powered enterprises.
    10:10am-10:25am (15m)
    When AI joins the team: Onboarding the next generation of employees
    Jana Eggers (Nara Logics)
    Within the next decade, 16 percent of current US jobs will be done by artificial intelligences. It’s time to start thinking about how we onboard these employees. While we’ll look at what it takes to get started with machine learning projects, our focus will be on the top 5 things you need to consider when your next employee is an AI.
    10:30am-11:00am (30m)
    Break: Morning Break Sponsored by SAS
    12:30pm-1:30pm (1h) Event
    Lunch / Wednesday Industry Tables
    Industry Table discussions are a great way to informally network with people in similar industries or interested in the same topics.
    3:00pm-4:00pm (1h)
    Break: Afternoon break Sponsored by Accenture
    5:30pm-6:30pm (1h) Event
    Attendee Reception
    Grab a drink, mingle with fellow Strata + Hadoop World participants, and see the latest technologies and products from leading companies in the data space.