Presented By O'Reilly and Cloudera
Make Data Work
Sept 29–Oct 1, 2015 • New York, NY

Office Hours conference sessions

Office Hours gives you a chance to meet face to face, in a small-group setting, with expert Strata + Hadoop World presenters. Discuss the speaker’s area of expertise, give feedback about their sessions, or ask questions.

Sign up now by adding a session to your personal schedule. Seating is limited.

Office Hours will take place in the Expo Hall.

    Wednesday, September 30

    11:20am–12:00pm Wednesday, 09/30/2015
    Location: Table A (O'Reilly Booth)
    Sean Owen (Cloudera)

    If you want to apply large-scale data science techniques and technologies to finance, come by and see Sean. He can also answer questions about:

    • Apache Spark
    • Apache Mahout
    • Data Science on Hadoop
    11:20am–12:00pm Wednesday, 09/30/2015
    Location: Table B (O'Reilly Booth)
    Anima Anandkumar (UC Irvine)

    Wondering what tensor methods can do for you? Anima will answer questions about:

    • How tensor methods can yield rich discriminative features for classification tasks
    • How tensor methods serve as an alternative method for training neural networks
    • Tensor methods as a new paradigm for training probabilistic models and for feature learning
    1:15pm–1:55pm Wednesday, 09/30/2015
    Location: Table A (O'Reilly Booth)
    Tim Berglund (Confluent)

    If you want to create robust, reliable distributed systems, stop by and see Tim. He’s happy to talk to you about:

    • Apache Cassandra
    • Distributed storage
    • Computation, timing, messaging and consensus
    2:05pm–2:45pm Wednesday, 09/30/2015
    Location: Table A (O'Reilly Booth)
    Alice Zheng (Amazon)

    Spend a little time with Alice. She offers invaluable advice on topics like:

    • Building machine learning models for intelligence applications
    • Evaluating machine learning models
    • Managing ML in production
    2:55pm–3:35pm Wednesday, 09/30/2015
    Location: Table A (O'Reilly Booth)
    Hossein Falaki (Databricks Inc.)

    Handling large or distributed data with R is challenging. Hossein can help you integrate Spark and R, and is happy to answer questions on:

    • Spark
    • SparkR
    • R Notebooks
    • Big data visualization and exploration
    2:55pm–3:35pm Wednesday, 09/30/2015
    Location: Table B (O'Reilly Booth)
    Juliet Hougland (Cloudera)

    Join Juliet if you have questions about:

    • Machine learning on Spark
    • Time series forecasting
    • Anomaly detection in time series
    • Featurizing high-dimensional data, such as sensor data or time series
    4:35pm–5:15pm Wednesday, 09/30/2015
    Location: Table A (O'Reilly Booth)
    Garrett Grolemund (RStudio)

    Working with R? Garrett’s your guy. Stop by and talk to Garrett about:

    • How to use SparkR and database connections to work with Big Data in R
    • How to manipulate, visualize, model, and report on data with R
    • How to build interactive data products with R and Shiny
    5:25pm–6:05pm Wednesday, 09/30/2015
    Location: Table A (O'Reilly Booth)
    Martin Kleppmann (University of Cambridge)

    Martin can help you decide which framework is best suited to what kind of application. He’s ready to discuss:

    • Stream processing architectures for solving real-time data issues
    • The trade-offs made by different stream processing frameworks (such as Samza, Storm and Spark Streaming)
    • How to figure out which is best suited to what kind of application
    5:25pm–6:05pm Wednesday, 09/30/2015
    Location: Table B (O'Reilly Booth)
    Jim Scott (NVIDIA)

    Meet Jim to talk about:

    • Enterprise Messaging and Streaming Engines
    • Scalable applications using Zeta Architecture
    • High-speed time series applications and databases
    • Briefcase in a cluster and other IoT use cases

    Thursday, October 1

    11:20am–12:00pm Thursday, 10/01/2015
    Location: Table A (O'Reilly Booth)
    Claudia Perlich (Dstillery)

    As we’ve been able to access more granular information, old standby metrics like clickthrough rate are becoming meaningless. Join Claudia to discuss new approaches to:

    • Digital Marketing
    • Predictive Modeling
    • Data Science Education and Management
    • Causal Methods in Observational Data
    11:20am–12:00pm Thursday, 10/01/2015
    Location: Table B (O'Reilly Booth)
    Ted Dunning (MapR)

    Stop by and meet Ted for advice on a range of topics including:

    • Streaming and real-time data architectures
    • Large-scale NoSQL systems
    • Apache Drill, ZooKeeper, Kylin, Flink, or Mahout
    1:15pm–1:55pm Thursday, 10/01/2015
    Location: Table A (O'Reilly Booth)

    Come talk to Michael if you’re interested in:

    • Apache Mesos, Marathon, Docker, DCOS
    • Stream processing with Spark/Storm/Kafka
    • Data Engineering
    1:15pm–1:55pm Thursday, 10/01/2015
    Location: Table B (O'Reilly Booth)
    Robert Grossman (University of Chicago)

    If you’re building an operational system for detecting anomalies and creating alerts, you need to talk with Robert. He’ll answer all your questions about:

    • Anomaly detection and change detection
    • Using segmented models to manage very large datasets
    2:05pm–2:45pm Thursday, 10/01/2015
    Location: Table A (O'Reilly Booth)
    Uri Laserson (Cloudera)

    Uri has some fascinating insight on:

    • Genomics on Hadoop
    • ETL of large-scale DNA sequencing data and other types of data (e.g., BAM/VCF)
    • Building a scalable variant store, including annotation data (e.g., ENCODE, dbSNP)
    • Other uses of Hadoop ecosystem in place of scientific HPC
    2:05pm–2:45pm Thursday, 10/01/2015
    Location: Table B (O'Reilly Booth)
    Kurt Brown (Netflix)

    Curious what the Netflix data platform team is up to? Chat with Kurt about that, and:

    • Large-scale data infrastructure
    • Presto integration
    • The motivations behind the Netflix architecture & approach—and what you can learn from them
    3:45pm–4:25pm Thursday, 10/01/2015
    Location: Table A (O'Reilly Booth)
    Dean Wampler (Lightbend)

    Dean will discuss all things Spark, including:

    • Stream processing
    • Deployment platforms, such as Mesos and Hadoop
    3:45pm–4:25pm Thursday, 10/01/2015
    Location: Table B (O'Reilly Booth)
    Fangjin Yang (Imply), Gian Merlino (Imply)

    Interested in the lambda architecture? Fangjin and Gian are available to discuss:

    • Building real-time analytic stacks with open source technologies
    • Motivation for the Kafka, Samza, Hadoop, and Druid real-time analytics stack
    • Implementation details for those interested in trying the stack out at home
    • How to scale the stack and make it highly available