Presented By O'Reilly and Cloudera
Make Data Work
March 13–14, 2017: Training
March 14–16, 2017: Tutorials & Conference
San Jose, CA

Schedule: Text sessions

9:00am–5:00pm Tuesday, March 14, 2017
Spark & beyond
Location: San Jose Ballroom, Marriott
Andy Konwinski (Databricks)
Average rating: 4.43 (7 ratings)
Andy Konwinski introduces you to Apache Spark 2.0 core concepts with a focus on Spark's machine-learning library, using text mining on real-world data as the primary end-to-end use case.
1:50pm–2:30pm Wednesday, March 15, 2017
Data science & advanced analytics
Location: 230 C
Level: Intermediate
David Talby (Pacific AI), Claudiu Branzan (Accenture)
Average rating: 4.14 (7 ratings)
David Talby and Claudiu Branzan offer a live demo of an end-to-end system that makes nontrivial clinical inferences from free-text patient records. Infrastructure components include Kafka, Spark Streaming, Spark, and Elasticsearch; data science components include spaCy, custom annotators, curated taxonomies, machine-learned dynamic ontologies, and real-time inferencing.
4:20pm–5:00pm Wednesday, March 15, 2017
Business case studies, Strata Business Summit
Location: 210 D/H
Level: Intermediate
Alan Chaney (Bitvore Corp)
Average rating: 3.50 (2 ratings)
Bitvore Corp’s Bitvore for Munis personalized news surveillance system is rapidly becoming a must-have for all major fixed-income securities analysts, investors, and brokers working in the three-trillion-dollar municipal bond market in the USA. Alan Chaney explains how Bitvore delivers the few important and relevant articles out of thousands each day, saving users many hours daily.
11:00am–11:40am Thursday, March 16, 2017
Dorna Bandari (Jetlore)
Average rating: 4.00 (2 ratings)
Most internet companies record a constant stream of logs as a user interacts with their application. Depending on the complexity of the application, the logs can be extremely difficult to decipher. Dorna Bandari presents a novel NLP-based method for clustering user sessions in consumer internet applications, which has proved to be extremely effective in both driving strategy and personalization.
1:50pm–2:30pm Thursday, March 16, 2017
Grace Huang (Pinterest)
Average rating: 3.33 (3 ratings)
With over 75 billion pins, the Pinterest content corpus is one of the largest human-curated collections of ideas. Grace Huang walks you through the lifecycle of a piece of content in Pinterest, a portfolio of metrics developed to monitor the health of the content corpus, and the story of creating a cross-functional initiative to preserve a healthy, sustainable content ecosystem.
4:20pm–5:00pm Thursday, March 16, 2017
Business case studies, Strata Business Summit
Location: 210 D/H
Level: Intermediate
Mahesh Goud T (Ticketmaster)
Average rating: 2.00 (1 rating)
Mahesh Goud shares success stories using Ticketmaster's large-scale contextual bandit platform for SEM, which determines the optimal keyword bids under evolving keyword contexts to meet different business requirements, and explores Ticketmaster's streaming pipeline, consisting of Storm, Kafka, HBase, the ELK Stack, and Spring Boot.