Make Data Work
Oct 15–17, 2014 • New York, NY

Office Hours

Office Hours gives you a chance to meet face-to-face in a small group setting with expert Strata + Hadoop World presenters. Discuss the speaker's area of expertise, give feedback about their sessions, or ask questions.

Sign up now by adding a session to your personal schedule. Seating is limited.

Office Hours will take place in the Expo Hall.

    Thursday, October 16

    10:30am–11:00am Thursday, 10/16/2014
    Location: Table A
    Mark Grover (Lyft), Ted Malaska (Capital One)

    Mark and Ted are happy to share a wealth of information on things like:

    • Best practices for data modelling and processing in Hadoop
    • Batch processing applications on Hadoop
    • Considerations and recommendations for architecting Hadoop Applications
    10:30am–11:00am Thursday, 10/16/2014
    Location: Table B
    Garrett Grolemund (RStudio)

    If you want to do reliable, insightful data analysis with R, Garrett’s the guy to talk to. He’ll answer all your questions about:

    • Web applications with R and Shiny
    • Interactive documents with R Markdown and Shiny
    • Programming with R
    10:30am–11:00am Thursday, 10/16/2014
    Location: Table C
    Gwen Shapira (Confluent), Jonathan Seidman (Cloudera)

    Gwen and Jonathan will be on hand to share with you:

    • Best practices for ingestion and orchestration in Hadoop
    • Real-time processing applications on Hadoop
    • Considerations and recommendations for architecting Hadoop Applications
    10:30am–11:00am Thursday, 10/16/2014
    Location: Table D
    Miriah Meyer (University of Utah)

    If you have questions about visualizations, bring them to Miriah. She’ll help you with:

    • The visualization design process
    • Designing visualizations for domain experts
    • Creating effective visualizations
    10:30am–11:00am Thursday, 10/16/2014
    Location: Table E
    Andy Terrel (NumFOCUS)

    Andy leads the Blaze team taking the Python data stack to the next generation of scalable tools, so he’s a great person to ask about:

    • Python programming language
    • High Performance Python
    • Parallel computing
    10:30am–11:00am Thursday, 10/16/2014
    Location: Table F
    Doug Cutting (Cloudera)

    Here’s your chance to talk to the creator of Hadoop. Doug can answer questions on all things Hadoop, but especially:

    • Recent developments in the Hadoop ecosystem
    • The future of Data Management
    • Big Data for better decision making
    11:00am–11:40am Thursday, 10/16/2014
    Location: Table A
    Matei Zaharia (Databricks), michael dddd (Databricks), Paco Nathan (derwen.ai), Tathagata Das (Databricks)

    Talk with Apache Spark developers about:

    • The latest updates in Spark
    • Use cases and deployment
    • Getting started in Spark
    11:00am–11:40am Thursday, 10/16/2014
    Location: Table C
    Hadley Wickham (Rice University / RStudio)

    Stop by and meet Hadley to have fascinating and wide-ranging conversations on:

    • R
    • Data analysis
    • Visualisation
    11:00am–11:40am Thursday, 10/16/2014
    Location: Table D
    Joe Caserta (Caserta Concepts)

    This is a great opportunity to meet Joe, who is happy to talk with you about:

    • How to monetize your data
    • How to create an effective Data Governance program
    • How to manage, manipulate and understand your data and make better decisions
    11:00am–11:40am Thursday, 10/16/2014
    Location: Table E
    Ben Lorica (O'Reilly Media)

    Ben specifically wants to talk to you about:

    • Your feedback on his O’Reilly Radar report “Big Data’s Big Ideas”
    • What you see as the most important trends in data engineering and data science
    • The topics and speakers you want to see at future Strata+Hadoop World Conferences
    11:50am–12:30pm Thursday, 10/16/2014
    Location: Table A
    Michael Stonebraker (Tamr, Inc.)

    Enterprise context is king—neither machine learning alone nor people alone can deliver it. An artful combination of the two is needed. Michael can help you combine the two, as well as explore things like:

    • Updating your traditional methods of data integration and curation in order to do everything you want with all the data you have
    • Finding ways around traditional top-down integration methods and data science tools focused on individual productivity that don’t scale
    11:50am–12:30pm Thursday, 10/16/2014
    Location: Table B
    Max Shron (Warby Parker)

    Max is here to help you:

    • Translate business problems into data problems
    • Understand how and when to invest in predictive and prescriptive analytics
    • Select pilot projects for advanced data techniques at established companies
    11:50am–12:30pm Thursday, 10/16/2014
    Location: Table C
    Stephen O'Sullivan (Data Whisperers), Richard Williamson (Silicon Valley Data Science)

    Are you trying to create an intelligent data strategy for your organization? Stephen and Richard are happy to help. Talk to them about:

    • How to get started
    • The platforms and frameworks available—and which ones are right for you
    • How to find business value in big data technology investment
    11:50am–12:30pm Thursday, 10/16/2014
    Location: Table D
    Dafna Shahaf (The Hebrew University of Jerusalem)

    Trying to make sense of large amounts of data? Bring your questions to Dafna. This could be a fascinating conversation about:

    • Metro maps of information
    • Computational creativity
    • Computational insight
    11:50am–12:30pm Thursday, 10/16/2014
    Location: Table E
    Anna Gilbert (University of Michigan)

    Analyzing data is only half the battle. Anna focuses on how to collect more relevant, useful data and how to process it more efficiently. Stop by and discuss:

    • Streaming/sketching algorithms for big data
    • Statistical algorithms for large data analysis
    • Compressive sampling and sensing of large data
    11:50am–12:30pm Thursday, 10/16/2014
    Location: Table F
    Jeroen Janssens (Data Science Workshops B.V.)

    Meet Jeroen to talk about everything related to data science at the command line, including:

    • Becoming more proficient at the command line
    • Performing a particular data science task using the command line
    • Customizing your own data science toolbox with new command-line tools
    1:45pm–2:25pm Thursday, 10/16/2014
    Location: Table A
    Bob Mankoff (The New Yorker Magazine)

    Have some fun! Talk to Bob about:

    • Evolutionary perspectives on humor
    • What having a good sense of humor means
    • Gender differences in humor and their relevance for messaging and marketing
    1:45pm–2:25pm Thursday, 10/16/2014
    Location: Table B
    Joe Hellerstein (UC Berkeley)

    Brainstorm with Joe about:

    • Data transformation
    • Agile data analysis
    • The state of the big data management ecosystem
    1:45pm–2:25pm Thursday, 10/16/2014
    Location: Table C
    Julian Hyde (Hortonworks)

    Ask Julian anything about:

    • Apache Optiq
    • Query optimization in Apache Hive and Apache Drill
    • The Mondrian OLAP engine
    1:45pm–2:25pm Thursday, 10/16/2014
    Location: Table D
    Sebastian Gutierrez (DashingD3js.com)

    If you’re interested in D3.js, data visualization, data visualization tools, or the last mile of data science (going from scientists to users), stop by and see Sebastian. He’s happy to talk with you about:

    • Visual perception, trends in data visualization tools, and data visualization
    • The last mile of data science
    • Data scientists at work – what they do, how they do it, where they do it, what their goals are
    1:45pm–2:25pm Thursday, 10/16/2014
    Location: Table E
    Cameron Turner (The Data Guild)

    Get advice from Cameron on:

    • Understanding user experience
    • Building product strategy
    • Energy-related machine learning
    1:45pm–2:25pm Thursday, 10/16/2014
    Location: Table F
    Altan Khendup @madmongol (Teradata Corporation)

    Tap into Altan’s two decades of experience architecting and implementing big data solutions across multiple industries. Ask him about:

    • Big Data Apps with the Lambda Architecture
    • Teradata and Real-Time Big Data
    2:35pm–3:15pm Thursday, 10/16/2014
    Location: Table A
    Paco Nathan (derwen.ai), Allen Day (MapR Technologies)

    Paco and Allen have a great perspective on DNA sequencing, biomedical data, data best practices, and future opportunities—as well as adjacent areas like ag and manufacturing. Ask them about:

    • How to put business context into mathematical techniques, especially for the parts “beyond calculus” that enable high-ROI solutions today
    • Application areas of advanced math that have proven to be game-changers at large scale in common business use cases, and how that informs the selection of open source platforms
    • Computational thinking: a way of understanding problems, decomposing them into recognized patterns, then articulating steps toward computable solutions useful both for developing algorithms and for articulating shared process within a team
    2:35pm–3:15pm Thursday, 10/16/2014
    Location: Table B
    Chris Harland (Textio)

    If you are stumped by a scenario that can’t adequately be answered by A/B testing, you need to talk to Chris. He’ll give you techniques for dealing with these “non-standard” quasi-experimental events, share some fun data stories, and chat with you about:

    • Causal inference
    • Experimentation
    • Tools of the trade
    2:35pm–3:15pm Thursday, 10/16/2014
    Location: Table C
    John Mount (Win-Vector LLC)

    John is happy to converse on all things data science, particularly:

    • Data inventories
    • The general practice of data science
    • Practical Data Science with R
    2:35pm–3:15pm Thursday, 10/16/2014
    Location: Table D
    Ted Dunning (MapR)

    Ted’s deeply involved with MapR, the Apache Mahout, Drill, and Zookeeper projects, Storm, Spark, recommendation systems, and fraud detection systems. So you know you have questions for Ted. He’ll also talk with you about things like:

    • Approximation algorithms, particularly heavy hitters and the t-digest
    • Time series algorithms and databases
    • Anything to do with MapR software
    2:35pm–3:15pm Thursday, 10/16/2014
    Location: Table E
    Scott Nicholson (Poynt)

    Bring Scott any data problem that’s weighing on your mind. He’ll discuss:

    • How to use econometric tools to establish causality outside of randomized testing frameworks
    • How to build data science teams and culture
    • Data usage, privacy, anonymization
    2:35pm–3:15pm Thursday, 10/16/2014
    Location: Table F
    Vaibhav Nivargi (ClearStory Data)

    Looking for a data analytics solution to quickly explore and analyze disparate data for fast answers? Vaibhav can help. Ask him about:

    • ClearStory’s interactive and visual analysis solution
    • How companies are driving faster holistic insights across private and premium data sources
    • How to facilitate faster diagnosis and discovery and consistent decision-making across teams with ClearStory’s visual and collaborative user application
    3:25pm–4:05pm Thursday, 10/16/2014
    Location: Table A
    Stefan Heeke (SumAll.org)

    Meet with Stefan to discuss practical aspects of using data analytics in social impact projects, including:

    • Should non-profits, NGOs, and social enterprises use “open source” or commercial enterprise tools?
    • Using predictive analytics to identify and better serve at-risk populations
    • How to communicate social impact related data & insights
    3:25pm–4:05pm Thursday, 10/16/2014
    Location: Table B
    Barry Devlin (9sight Consulting)

    Grab this opportunity to discuss and debate the differences between a modern Data Warehouse environment and a Data Lake. Barry will talk about things like:

    • What is important in a Data Warehouse that a Data Lake can’t offer
    • Why one technology base may not be the answer to all questions
    • The relative strengths of relational and Hadoop environments
    3:25pm–4:05pm Thursday, 10/16/2014
    Location: Table C
    Doug Bryan (RichRelevance)

    Doug has insight into how retailers like L.L. Bean are successfully using data. Talk to Doug about:

    • Real-time personalization and product recommendations
    • Continuous recommendation algorithm optimization
    • In-store applications of personalization (cross-sell, up-sell and storytelling)
    3:25pm–4:05pm Thursday, 10/16/2014
    Location: Table D
    Joel Gurin (Center for Open Data Enterprise)

    Want to explore the opportunities that Open Data opens up? Come talk to Joel about your ideas as well as things like:

    • The relationship between open data and big data
    • How entrepreneurs can use government open data to build their businesses or start new ones
    • The kinds of open data that are most useful in different sectors (healthcare, finance, education, energy, etc.)
    3:25pm–4:05pm Thursday, 10/16/2014
    Location: Table E
    John Akred (Silicon Valley Data Science), Edd Wilder-James (Google)

    If you are struggling with where to start in planning your data strategy, a few minutes with John and Edd could prove invaluable. They’ll talk with you about:

    • How to choose platforms and frameworks to store and analyze your data
    • How to find business value in big data technology investment
    • How to access and utilize data locked in difficult formats
    3:25pm–4:05pm Thursday, 10/16/2014
    Location: Table F
    Eugene Kolker (Seattle Children's)

    Not only does Eugene have some valuable experience with predictive analytics: he’s also done a deep forensic look at success and failure in predictive analytics for healthcare, so he has insights to share on:

    • How to complement analytics with best business practices, project management, and communication
    • How to consider costs and organizational readiness to justify change
    • How to use data and analytics to build shared vision of both problem and solution
    4:15pm–4:55pm Thursday, 10/16/2014
    Location: Table A
    Chris Wilson (L.L.Bean)

    Chris has some great stories to tell about how L.L. Bean is using data. Ask him anything about his experiences and things like:

    • How to rapidly develop and test omnichannel use cases
    • How to host big data boot camps for employees
    • How to create OLAP cubes to democratize data across your organization
    4:15pm–4:55pm Thursday, 10/16/2014
    Location: Table B
    Tim Kraska (Brown University)

    If you’ve got an interest in machine learning and hybrid human/machine database systems, your time with Tim is bound to be fascinating. He’ll also discuss:

    • Query compilation techniques for SQL and machine learning
    • Asynchronous machine learning models
    • Geo-replication (specifically MDCC/PLANET)
    4:15pm–4:55pm Thursday, 10/16/2014
    Location: Table C
    Fangjin Yang (Imply), Xavier Léauté (Confluent)

    Fangjin and Xavier are happy to discuss Druid and all big data related miscellany, including:

    • Architecting systems for efficiency and cost in multi-tenant environments
    • Common tradeoffs in system design
    • Druid use cases and scaling the system without incurring downtime
    4:15pm–4:55pm Thursday, 10/16/2014
    Location: Table D
    ed00425e 963b0803 (MapR Technologies), Sridhar Reddy (MapR Technologies)

    Want to discuss HBase development? Carol and Sridhar will answer questions about:

    • The HBase data model and HBase architecture
    • HBase Schema design
    • Advanced Java APIs for performing scans and light-weight transactions
    4:15pm–4:55pm Thursday, 10/16/2014
    Location: Table E
    Jana Eggers (Nara Logics)

    If you want to encourage innovation in your organization, Jana can describe:

    • How to set up listening posts around the company to bring up innovation ideas
    • How you can sort through and manage ideas across a company
    • How to marry customer interaction with data to help prioritize ideas
    4:15pm–4:55pm Thursday, 10/16/2014
    Location: Table F
    Adam Fuchs (Sqrrl)

    Come meet Sqrrl’s CTO and co-founder of the Apache Accumulo project. Adam will field questions like:

    • Is this use case a good fit for Sqrrl Enterprise?
    • Why is data-centric security for Hadoop-based graphs so important?
    • How do I get involved with the Apache Accumulo project?
    5:05pm–5:45pm Thursday, 10/16/2014
    Location: Table A
    Brigitte Piniewski (nonaffiliated)

    Hoping to conquer the community health paradox? Brigitte is, too. She’ll share specifics of two youth mentoring projects, and discuss:

    • Innovative Workforce development strategy
    • Youth mentoring to drive sustainable community engagement
    • Digital inclusion strategy to unlock your full economic potential
    5:05pm–5:45pm Thursday, 10/16/2014
    Location: Table B
    Monte Zweben (Splice Machine Inc.)

    If you’re weighing the pros and cons of traditional relational databases like Oracle and IBM vs. distributed computing solutions, you’ll want to talk to Monte. Discuss things like:

    • Replacing Oracle with the Hadoop RDBMS
    • RDBMS vs SQL-on-Hadoop
    • ACID properties and why they are important
    5:05pm–5:45pm Thursday, 10/16/2014
    Location: Table C
    Haoyuan Li (Alluxio)

    Haoyuan is here to answer your questions about Tachyon, a memory-centric, fault-tolerant distributed file system that enables reliable file sharing at memory speed across cluster frameworks such as Spark and MapReduce. He’ll also share:

    • Tachyon use cases
    • The Tachyon road map (including exciting features from AMPLab and industry partners)
    5:05pm–5:45pm Thursday, 10/16/2014
    Location: Table D
    Manish Devgan (Software AG)

    Are you adopting in-memory computing (or thinking about it)? Manish can give you guidance on:

    • The technology landscape for in-memory data management platforms
    • The convergence of In-Memory, NoSQL, Hadoop, and other “Big Data” solutions
    • Real-world deployments and use cases leveraging In-Memory Data Management
    5:05pm–5:45pm Thursday, 10/16/2014
    Location: Table E
    Bahman Bahmani (Rakuten)

    If security is a priority for you, stop by to visit with Bahman. (If it isn’t a priority, you should probably attend his session.) He’ll talk about:

    • Machine learning algorithms for detecting adversaries
    • Attacks against machine learning algorithms
    • Making machine learning algorithms robust against adversaries
    5:05pm–5:45pm Thursday, 10/16/2014
    Location: Table F
    Michael Rosenbaum (Pegged Software)

    If you are interested in applying data and predictive analytics to hiring and team assembly, Michael can help you reduce turnover by up to 75%. He’ll give you tips on how to do that, and answer any questions you may have about:

    • Big data applied to talent and HR
    • Big data applications in organizational effectiveness
    • Healthcare

    Friday, October 17

    10:30am–11:00am Friday, 10/17/2014
    Location: Table A
    Olivier Grisel (Inria & scikit-learn)

    Do you work with the PyData stack? Spend a few minutes with Olivier, who is happy to discuss:

    • Recent developments in scikit-learn
    • Machine learning in Python
    • Predictive analytics in Python
    10:30am–11:00am Friday, 10/17/2014
    Location: Table B
    srowen om (Cloudera)

    Sean is the Director of Data Science at Cloudera. Chat with Sean about:

    • Large scale machine learning on Hadoop
    • Using Spark, MLlib, Mahout
    • Connecting R, SAS, et al to Hadoop for analytics
    10:30am–11:00am Friday, 10/17/2014
    Location: Table C
    Brian Granger (Cal Poly San Luis Obispo), Fernando Perez (UC Berkeley and Lawrence Berkeley National Laboratory)

    Stop by and check in with two of the co-founders of the Jupyter Project (and the new multiuser Jupyter Notebook server). Brian and Fernando can tell you about:

    • Using the IPython/Jupyter Notebook for telling data- and code-driven stories
    • Using the Notebook with languages other than Python (such as R and Julia)
    • How to integrate the Notebook with Google Drive
    10:30am–11:00am Friday, 10/17/2014
    Location: Table D
    Lelanie Moll (FICO), Deb Brooks (FICO), Silaphet Mounkhaty (FICO)

    Lelanie, Deb, and Silaphet will share their perspective on Hadoop administration from a DBA point-of-view, and can discuss their hard-won experience with things like:

    • The challenges faced by traditional IT Organizations in adopting Big Data strategies
    • Moving a SQL Server DBA to Hadoop
    • Moving an Oracle DBA to Hadoop
    10:30am–11:00am Friday, 10/17/2014
    Location: Table E
    Samuel Kommu (Cisco Systems)

    Working with big data in the enterprise? Stop by to see Samuel and talk about things like:

    • Compute and Network considerations for building Big Data clusters
    • Optimizing multi-tenant Big Data clusters
    • Best practices and validated designs
    10:30am–11:00am Friday, 10/17/2014
    Location: Table F
    Alice Zheng (1977)

    Alice is an expert in Machine Learning algorithms (among other things). She’ll help you:

    • Scale up data science on Big Data
    • Gain better intuition about how machine learning algorithms work
    • Learn some practical data science tips
    11:00am–11:40am Friday, 10/17/2014
    Location: Table A
    Jaroslav Cecho (Cloudera), Gwen Shapira (Confluent), Abraham Elmahrek (Fossa), Kathleen Ting (Cloudera), David Robson (Dell Software), Guy Harrison (Dell Software)

    Looking for a deeper understanding of how to integrate components in the Apache Hadoop ecosystem? Meet with Jarcec, Gwen, Abe, Kathleen, Guy, and David for an insightful conversation on things like:

    • The refactoring and new features of Apache Sqoop 2
    • Troubleshooting common Sqoop errors
    • Learning how to contribute from the committers
    11:00am–11:40am Friday, 10/17/2014
    Location: Table D
    Vishal Bamba (Transamerica), David Beaudoin (Transamerica), Stephen Lloyd (Transamerica)

    Vishal, David, and Stephen are implementing a Hadoop cluster adoption spanning multiple divisions at Transamerica. Come learn from their experiences—and their successful POC. They’re available to provide details and to talk about:

    • Solution architecture
    • Best practices
    • Do’s & don’ts
    11:00am–11:40am Friday, 10/17/2014
    Location: Table E
    Julia Angwin (ProPublica)

    Concerned about protecting privacy while managing large data sets? Stop by to say hi to Julia, who will be happy to talk about things like:

    • Common privacy mistakes
    • The latest privacy-protecting tools
    • The arms race between data collectors and those who want to control their data
    11:00am–11:40am Friday, 10/17/2014
    Location: Table F
    Paul Zikopoulos (IBM CANADA)

    Paul is hosting office hours to:

    • Elaborate on his keynote
    • Explore more closely an actual application that builds on in-place existing technologies such as Hadoop to deliver understandable results
    • Take a deeper dive into a story where analytics at rest was applied to forms of unstructured data using a simple SQL-like development environment, and findings were promoted to the frontier of the business to score, in real time, monetizable intent, assess reputations, predict health outbreaks, and more
    11:50am–12:30pm Friday, 10/17/2014
    Location: Table A
    Martin Kleppmann (University of Cambridge)

    Bring Martin your real-time data problems to explore:

    • Stream processing architectures for solving them
    • The trade-offs made by different stream processing frameworks (such as Samza, Storm and Spark Streaming)
    • How to figure out which framework is best suited to your application
    11:50am–12:30pm Friday, 10/17/2014
    Location: Table B
    Adam Pilz (SAS)

    Bring Adam all of your SAS questions, including things like:

    • Statistical techniques and their use for your business
    • How SAS and Hadoop work together
    • Challenges you face with data needs and how to resolve them
    11:50am–12:30pm Friday, 10/17/2014
    Location: Table C
    Nick Dimiduk (Hortonworks, Inc), Nicolas Liochon (Scaled Risk)

    Nick and Nicolas want to talk about your experiences in Hadoop/HBase and other systems. Ask them about:

    • Remaining challenges to lower the post .99 latency
    • Configuration settings relevant to latency
    • Gotchas in running latency experiments
    11:50am–12:30pm Friday, 10/17/2014
    Location: Table D
    Joshua Patterson (NVIDIA), Nathan Shetterley (Accenture), Mike Wendt (NVIDIA)

    Joshua, Nathan and Michael will have a custom visualization that you can play with and will talk about things like:

    • How they use a Data Science Team approach to producing advanced visualizations (and what skills are necessary)
    • The underlying architecture, analytics, and/or the visual method of their demo viz
    • Their agile approach to translating data into insights and then rapidly putting them into action (and what technologies and methods work best)
    1:45pm–2:25pm Friday, 10/17/2014
    Location: Table A
    Jonathan Hsieh (Cloudera, Inc), Jean-Daniel Cryans (Cloudera), Lars George (Cloudera), Amandeep Khurana (Cloudera)

    Interested in Apache HBase? This is your chance to meet the team from Cloudera to discuss:

    • Application archetypes
    • Schema design
    • Deployments and support stories
    1:45pm–2:25pm Friday, 10/17/2014
    Location: Table B
    michael dddd (Databricks), Reynold Xin (Databricks)

    Michael and Reynold are happy to talk about:

    • Spark SQL features and roadmap
    • How to contribute to Spark SQL
    • And other Spark use cases
    1:45pm–2:25pm Friday, 10/17/2014
    Location: Table C
    Douglas Moore (Think Big Analytics)

    Thinking about high-level architecture patterns? You’re not alone. Come by and talk to Douglas about:

    • Anti-patterns
    • Use case to architecture pattern
    • Big data warehousing
    2:35pm–3:15pm Friday, 10/17/2014
    Location: Table A
    Greg Rahn (Cloudera)

    Greg is on hand to discuss:

    • Open-source SQL on Hadoop
    • SQL engines for Hadoop (Impala, Hive, Presto, etc.)
    • Good/bad benchmarking practices
    2:35pm–3:15pm Friday, 10/17/2014
    Location: Table B
    Cliff Click (0xdata)

    Conversations with Cliff may include a deep dive into the Distributed Parallel Gradient Boosting Machine, or:

    • R & H2O
    • Spark & H2O
    • Doing Math at Scale with H2O
    2:35pm–3:15pm Friday, 10/17/2014
    Location: Table C
    Wei Zheng (Trifacta), Stephanie McKinley (Independent Consultant)

    Wei and Stephanie have valuable advice on agile data transformation in Hadoop. Drop by and ask questions about their session, Trifacta products, and things like:

    • How to structure, standardize, enrich and distill your data for different use cases through a single application
    • How raw weblogs, JSON and AVRO formats can be wrangled just as easily as row and column data
    • How to increase data transformation agility in your organization
    2:35pm–3:15pm Friday, 10/17/2014
    Location: Table D
    Alex Gorelik (Waterline Data)

    If you’re thinking of deploying a Hadoop data lake, you need to talk to Alex. He’s open to conversations on any aspect of the challenge, including how to:

    • Create a data inventory of data assets in Hadoop
    • Discover and document lineage of Hadoop files
    • Discover quality metrics and sensitive data
    2:35pm–3:15pm Friday, 10/17/2014
    Location: Table E
    Trina Chiasson (Tableau Software)

    If you’re interested in discovering the best ways to communicate your data (and who isn’t?), Trina is the woman to talk to. Ask her about:

    • Visualization and information design
    • What to include—and what to leave out
    • User experience and design for visualization-based web applications
    3:25pm–4:05pm Friday, 10/17/2014
    Location: Table A
    Jodok Batlogg (CRATE Technology GmbH)

    Are you looking for a super simple Internet of Things backend? Jodok can show you how to create one with Crate Data and Twitter Storm. He’ll also answer questions on a wide range of topics, including:

    • Scaling SQL data stores
    • Storage of time-series data
    • SQL with Elasticsearch
    3:25pm–4:05pm Friday, 10/17/2014
    Location: Table B
    Roy Singh (Guavus)

    Anomaly detection, causality analysis, and anomaly prediction can be pretty complex. Roy is available to answer your security-related questions, offer approaches to anomaly detection, and talk about:

    • How streaming analytics provides end-to-end visibility for the enterprise
    • Data lakes vs. data streams: what’s best for the enterprise?
    • How an operational intelligence platform based on Apache Spark can provide a pipeline for anomaly detection, causality analysis, anomaly prediction, and actionable alerts
    3:25pm–4:05pm Friday, 10/17/2014
    Location: Table C
    Anil Madan (PayPal)

    Want to learn from PayPal’s experiences in behavioral analytics, personalization, and marketing? Anil is your man. He’ll also answer questions on:

    • Infrastructure
    • Building & Scaling Real Time
    • NoSQL Solutions
    3:25pm–4:05pm Friday, 10/17/2014
    Location: Table D
    Indrajit Roy (HP Labs), Sunil Venkayala (HP)

    HP Vertica and Distributed R together provide high performance statistical analysis, while retaining the simplicity of R. Ask Indrajit and Sunil questions about:

    • How to use Distributed R algorithms on your data, whether the data resides in files or a database
    • How to deploy data mining models inside the database for real-time prediction on incoming data
    • How to use a familiar front end such as R/RStudio with Distributed R, and run on your private servers or in the public cloud
    3:25pm–4:05pm Friday, 10/17/2014
    Location: Table E
    Ailey Crow (Pivotal)

    Bring Ailey all your follow-up questions from her session on image processing on Hadoop. She’s ready to tackle the following subjects and more:

    • Image analytics for large images or a large number of images
    • Image storage in a database or in Hadoop
    • Digital pathology and medical imaging