Office Hours

Office Hours gives you a chance to meet face-to-face in a small group setting with expert Strata + Hadoop World presenters. Discuss the speaker's area of expertise, give feedback about their sessions, or ask questions.

Sign up now by adding a session to your personal schedule. Seating is limited.

Office Hours takes place in the O'Reilly booth in the Sponsor Pavilion on Thursday and Friday.

    Location: Table A
    Average rating: *****
    (5.00, 1 rating)

    If you need to communicate data clearly to a non-data scientist audience, stop by to see Naomi. She’ll:

    • Critique your graphs
    • Offer suggestions on how to use scales to present information clearly
    • Answer questions about the graphical display of data
    Location: Table B
    Sameer Farooqui (Databricks)

    Working with Spark? Chat with Sameer about things like:

    • How the Apache Spark compute engine integrates with the Apache Cassandra Database
    • Business use cases for Spark
    • Spark’s architecture, core engine, JVM interactions, etc.
    Location: Table A
    Sebastian Gutierrez

    If you’re interested in D3.js, data visualization, data visualization tools, or the last mile of data science (going from scientists to users), stop by and see Sebastian. He’s happy to talk with you about:

    • Visual perception, trends in data visualization tools, and data visualization
    • The last mile of data science
    • Data Scientists at Work – what they do, how they do it, where they do it, what their goals are
    Location: Table B
    Amy Heineike (Primer)
    Average rating: *****
    (5.00, 1 rating)

    Amy will have Quid available to play with, and can answer all your data-driven business day questions, as well as discuss:

    • How data is driving high-level decision making
    • What ‘external’ data is available (news, patents, Twitter, oh my!) – and what has to be done to make it meaningful
    • Why network views are beautiful and what they can teach you
    Location: Table A
    Mark Grover (Lyft), Ted Malaska (Capital One)

    Ted and Mark are available for discussions about:

    • Best practices for data modelling and processing in Hadoop
    • Batch processing applications on Hadoop
    • Considerations and recommendations for architecting Hadoop Applications
    Location: Table B
    Uwe Weiss (Blue Yonder)

    Interested in decision automation? Uwe is ready for in-depth conversations about his work at Blue Yonder, and things like:

    • Blue Yonder’s experiences with decision automation in industry and retail
    • Managing change when introducing decision automation
    • Cross-industry applications of machine learning
    Location: Table A
    Guy Ernest (Amazon Web Services)

    If you’re using multiple tools to solve data problems, or wish you were, talk to Guy. He’ll give you tips on how to integrate tools and chat with you about:

    • Best practices to deploy big data
    • Real-time streaming and analytics systems built in an agile and elastic way on the Amazon cloud
    • Amazon Kinesis, Elastic MapReduce, Redshift, DynamoDB, CloudSearch, S3 (EMRFS)
    Location: Table B
    Assaf Araki (Intel)

    If you’re interested in analytics for the IoT, wearables, or predictive analytics, stop by and see Assaf. He’ll share his experience with:

    • Predictive analytics in the Big Data domain (distributed computing)
    • Internet of things analytics
    • Use of wearables in the healthcare industry
    Location: Table A
    Sean Owen (Cloudera)

    If you want to use Apache Spark and clustering for anomaly detection, stop by and see Sean. He’ll answer all your questions on things like:

    • Large scale machine learning on Hadoop
    • Using Spark, MLlib, Mahout
    • Connecting R, SAS, et al. to Hadoop for analytics
    Location: Table B
    Jodok Batlogg (CRATE Technology GmbH)

    Got a data-intensive app? Or another project that requires massive concurrency together with real-time processing? Jodok’s your man. Talk to him about:

    • Distributed SQL
    • Easy horizontal scaling
    • Storage and processing of time-series data
    Location: Table A
    Melissa Santos (Big Cartel)

    Melissa can answer questions about any part of the data pipeline from ETLs to modeling. She’s particularly talented at translating hard-core data science into plain English, so talk to Melissa if you’re trying to:

    • Build data communities, in and outside of your company
    • Make your data tools available to even the most unexpected people you work with
    • Work across data teams to reach common goals
    Location: Table B
    Lars George (Cloudera), Jonathan Hsieh (Cloudera, Inc)

    If you have HBase questions, Jon and Lars are your guys. Ask them your deep technical or architectural questions, such as:

    • Where does HBase work best?
    • How can I scale my use-case to make the most out of HBase?
    • How does HBase compare to other NoSQL and emerging MPP query engine solutions?
    Location: Table A

    If you’re interested in large-scale data integration, the Internet of Things, or web applications, you’ll want to chat with Michael. He’ll also answer questions on:

    • Lambda Architecture
    • Stream processing
    • Apache Spark
    • Internet of Things
    Location: Table A
    Jordan Tigani (Google)
    Average rating: ***..
    (3.00, 1 rating)

    Jordan correctly predicted the outcome of 14 of 16 games in the World Cup. If you want to make predictive analytics work for you, stop by and talk with Jordan about:

    • Machine Learning with Google Cloud Platform
    • Predictive Sports Analytics
    • Google BigQuery
    Location: Table B
    Paco Nathan

    Paco can answer all your Spark-related questions. Ask him about:

    • PySpark
    • Spark SQL
    • Spark Streaming
    • Spark integration with Cassandra
    Location: Table A
    Mikio Braun (Zalando)

    Mikio will discuss technological and application aspects of real-time data analysis, in particular:

    • Technology aspects, algorithms, and technology choices (for example, streaming vs. batch, Spark vs. Storm)
    • Applications of real-time big data (for example, clickstream analysis, monitoring, alerting, analytics)
    Location: Table B
    Marcelo Soria-Rodriguez (BBVA Data & Analytics)
    Average rating: *****
    (5.00, 1 rating)

    Interested in financial data or open innovation, or most interestingly, both? Meet Marcelo to discuss these opportunities (and challenges) and things like:

    • Openness in private companies (exposing data or capabilities to third parties)
    • Dealing with customers’ privacy in data-based products
    • Service design
    Location: Table A
    John Akred (Silicon Valley Data Science), Edd Wilder-James (Google)

    If you’re creating a data strategy for your organization, spend some time with Edd and John. They’ll help you figure out:

    • What platforms and frameworks you need to store and analyze your data
    • How to find the business value in big data technology investment
    • How to retrieve data locked in difficult formats
    Location: Table B
    Aurélie Pols (Mind Your Privacy)

    Where do you start with data security and consumer privacy? Aurélie can help you create a framework to deal with things like:

    • Privacy and data protection challenges within the digital marketing ecosystem
    • Compliance and regulatory best practices in light of increasing privacy legislation
    • What to do, where to start?
    Location: Table A
    Yves-Alexandre de Montjoye (Imperial College London | MIT Media Lab)

    If you want to figure out how to use data while preserving people’s privacy, talk to Yves-Alexandre. He’s available to discuss things like:

    • Privacy of metadata datasets
    • Privacy of location data and unicity
    • Use of large-scale mobile-phone metadata
    Location: Table B
    Lisa Green (Common Crawl), Peter Adolphs (Neofonie)

    Do you want to scrape billions of Web pages to dig up the information nuggets relevant to your business? Come chat with Lisa and Peter. Share your ideas, and they’ll help you:

    • Find and evaluate facts on the Web
    • Use NLP methods to get from text to structured data
    • Analyze Common Crawl with SQL in MIA
    • Use your custom data and algorithms for Web analysis
    Location: Table C
    Kathleen Ting (Cloudera)

    If you’re looking to improve job throughput and cluster utilization, and to let different computational frameworks run on Hadoop, stop by to see Kathleen. She’ll share some useful configuration tweaks, as well as:

    • Best practices around migrating from MR1 to YARN
    • Recommended YARN configurations
    Location: Table A
    Nick Dimiduk (Hortonworks, Inc)

    If you’re looking at NoSQL options, have adopted Hadoop into your Data Warehouse, or are considering HBase, you need to talk to Nick. Chat with him about:

    • Consistency as a priority in data storage systems.
    • HBase architecture compared to other NoSQLs.
    • HBase roadmap, 1.0 and beyond.
    Location: Table B
    David Boyle (Audience Strategies)

    Want to make better decisions with data? (Of course you do.) David is passionate about data-driven decision making, too. Stop by and discuss:

    • Teasing inspiring, actionable insight from complex data
    • Engaging a business with insight and changing ways of working
    • The role and opportunities for data in a creative business
    Location: Table C
    Ted Dunning (MapR)

    I will be happy to talk about:

    • Approximation algorithms for median, percentiles, top-40 on streaming data
    • Scaling time-series databases to 100M points / second
    • Wind-powered open source, big data in the 19th century (really!) (with pictures!)
    • Anything else interesting that somebody wants to talk about!