Schedule: Office Hour sessions

Table A
Rodney Mullen (Almost Skateboards)
Rodney is probably the most influential skateboarder in history. He’ll gladly discuss how to balance the analytic methods that help us learn with the internal feel for what we are learning. You’ll get tips for perfecting your Heelflip, and discover how skateboarders perform death-defying stunts without, you know, dying. Read more.
Table A
Ted Dunning (MapR Technologies)
Do you need an anomaly detection system? Meet with Ted and his pet anomaly to discuss what anomaly detection is, and how you can do it. You’ll also find out about open source solutions for anomaly detection. Read more.
Table B
Max Gasner (Salesforce.com)
Want to build a predictive platform that’s as easy to use as a relational database? Meet Max to discuss API design for general-purpose predictive platforms and services, and UX for configuring predictive systems and validating predictive performance. He’ll also explore information design for communicating interpretable model structure, structural uncertainty, and uncertain predicted values. Read more.
Table C
Alexander Gray (Skytree, Inc.), Leland Wilkinson (Skytree)
Do you want to achieve best-in-class results with high-performance learning in your application? Join Alexander and Leland to find out how to do just that, and to discuss: Read more.
Table A
Ben Waber (Sociometric Solutions)
If you want to determine your organization’s best practices with sensors, meet with Ben to explore what makes people effective at work today—and what could make them effective in the future. That includes People Analytics—using big data about human behavior to manage organizations—and deriving meaningful metrics out of sensor data. Read more.
Table B
Richard Williamson (Silicon Valley Data Science)
Working with Hadoop? Richard has built multiple Hadoop platforms and has worked on many existing Hadoop deployments. Meet with him to learn about the Hadoop pitfalls he’s encountered and how you can avoid them. Understand how to articulate the value of investing in scalable data platforms, and learn how to use MADlib on Impala. Read more.
Table C
Mike Stringer (Datascope Analytics), Dean Malmgren (Datascope Analytics)
If you’ve got the data down, but design thinking is still a headache, talk to Michael and Dean about how to build rapid iteration into your data science processes. You’ll find out how to distill dashboards into their essential parts, and make complex analyses understandable to the layperson. Read more.
Table A
Stephen O'Sullivan (Data Whisperers)
If you’re building a data platform, Stephen has solid advice on navigating the changing technology landscape. That includes how to choose platforms and frameworks, and where to start with data strategy for your organization. He’ll also show you how to determine the business value in big data technology investment. Read more.
Table B
Edd Wilder-James (Google)
If you’ve been to Strata before, you know Edd (if you’re new here, Edd was the founding Strata Chair). This is your chance to sit down with Edd to discuss large-scale data strategy, such as choosing big data solutions, and building data science teams. Edd will also pontificate on the future of data science. Read more.
Table C
Jane Kell (AutoTrader.com)
If you want to transform your BI tools into a powerful data analytics stack, or are looking for best practices in web analytics, talk to Jane. She’s happy to discuss tried and true tactics to increase consumer engagement, and actionable metrics to represent consumer engagement. Read more.
Table A
Michael Conover (LinkedIn)
Michael’s talk at Strata explores how LinkedIn uses information visualization technologies and details best practices. Meet with Michael if you’d like to learn techniques for information visualization at scale, and best practices for building robust, reliable production machine-learning workflows. Read more.
Table B
Max Shron (Warby Parker)
It’s problem formation that makes data science live up to its potential. If you want advice on problem formation and communication, meet with Max to discuss how to frame data problems effectively, and how to keep from working on things that later turn out to be useless. Find out what you can learn from other disciplines about how to ask good questions before you collect data. Read more.
Table C
Jeffrey Heer (Trifacta | University of Washington)
If you’re spending more time “wrangling” data than performing analysis, talk to Jeffrey about interactive data analysis, data transformation and preparation, and data visualization. Read more.
Table A
Ben Hamner (Kaggle)
Got gremlins? Ben has seen many of the most common machine-learning mistakes, gremlins, and pitfalls that can get in the way of a successful project. He’s happy to meet with you to discuss the gremlins snarling your project and other topics, such as: how to structure a machine-learning problem, and how to pick a machine-learning method. Read more.
Table B
Adam Fuchs (Sqrrl)
Adam is the Chief Technology Officer and co-founder of Sqrrl. Meet with Adam if you want to know how to construct a data-centric security ecosystem. Probe the inner workings of Apache Accumulo, and learn how Sqrrl Enterprise can help with your application. Read more.
Table C
Soam Acharya (Altiscale), Charles Wimmer (Altiscale)
Soam’s talk on “Making Big Data Portable” will get you thinking about the potential pitfalls of moving large quantities of data between Hadoop clusters. Meet with Soam to explore the limitations of using Hadoop’s DistCp to copy data from one cluster to another. And learn how to secure in-transit data, including when to transfer data over networks and when to use disks. Read more.
Table A
Marc Smith (Connected Action Consulting Group)
If you have questions on social network analysis, Marc is your man. Meet with him to talk about collecting, storing, analyzing, visualizing, summarizing, and publishing social network maps and reports with just a few clicks—and no coding! Learn how to identify key people in a conversation network, and how to compare your networks to basic network structures and types in social media. Read more.
Table B
Leo Meyerovich (Graphistry)
Interested in how to leverage new browser APIs and scale data visualization? Meet with Leo to discuss using WebGL/WebCL, OpenGL/OpenCL/CUDA, and Superconductor with GPUs. Or how to work with multicore processors, using web workers, RiverTrail/ParallelJS, ASMJS/NaCL, and compiler tricks. He’ll also explore visualization design in theory and practice. Read more.
Table C
Ben Lorica (O'Reilly Media)
Interested in hardcore data science? Ben is Chief Data Scientist at O’Reilly Media. He’s available to talk with you about emerging tools and best practices in data science, and what’s next in data infrastructures, including Hadoop, BDAS, and beyond. You’ll also learn how to manage complex data science workflows. Read more.
Table A
John Akred (Silicon Valley Data Science)
John loves interesting and challenging problems—and figuring out how to make an impact with data. Talk to John about how to navigate the changing technology landscape, and how to determine the best platforms and frameworks for your data infrastructure. Learn how to prioritize technology investments to deliver on your business objectives. Read more.
Table B
Eva Andreasson (Cloudera), Justin Langseth (Zoomdata, Inc.), Gus Hunt (CIA)
If you’re looking for new insights from previously siloed data sources, or ideas on how to instantly understand, visualize, and use data, meet with Eva, Justin, and Gus. They’ll answer questions on the enterprise data hub: what is it, why enterprises are executing against EDH strategies, and where Cloudera plays a role. Read more.
Table C
David Epstein (Sports Illustrated)
Curious about what makes a super-athlete? David, the author of The Sports Gene, is available to autograph copies of his book and talk about how an individual’s biology interacts with his/her training, environment, and culture. Find out how athletes engineer for top-tier performance, and how the 10,000-hour rule hinders peak performance. Read more.
Table A
Lukas Biewald (CrowdFlower)
Interested in crowdsourcing? Meet with Lukas to discuss how to collect large data sets or conduct big data experiments using the crowd. Or train machine-learning models using results from the crowd. Find out how to augment, label, and categorize data sets using microtasking. Read more.
Table B
Patrick McFadin (Datastax)
Interested in Apache Cassandra? Patrick can talk about anything related to Cassandra, including installation and deployment, data modeling, application design, and architecture decisions. Come on by and pick his brain. Read more.
Table C
Rachel Poulsen (Silicon Valley Data Science)
If you’re a manager trying to understand your data scientist—or a data scientist trying to get through to your manager—you need to meet Rachel. A data geek with a background in both stats and communications, she can help bridge the divide. Visit with Rachel to talk about things like statistics for dummies, experimental design, and A/B testing. Read more.
Table A
Kurt Brown (Netflix)
If Kurt’s talk about Netflix’s unconventional approach to development has given you some great ideas on how to get the most out of your data, stop by to talk with him about anything data-platform related. Drill into topics such as using Hadoop in the cloud, low latency querying on big data, or crafting a developer-friendly data platform. Read more.
Table B
Nick Kolegraff (Rackspace)
Are you building a data team? Nick offers great advice, starting with lessons he learned from multiple attempts at building a team. He’ll also discuss organizational challenges with data. Read more.
Table C
Steven Hillion (Alpine Data Labs)
If you’re looking to build a data science team, or use agile development in your data projects, Steven is the man to talk to. Meet him to discuss how to develop analytics workflows collaboratively, and how to conduct predictive analytics, particularly on Hadoop. Read more.
Table A
Andy Konwinski (Databricks), Matei Zaharia (Databricks), Patrick Wendell (Databricks), Pat McDonough (Databricks)
Andy, Matei, Patrick, and Pat are the Spark and Databricks developers. If you have an interest in the Spark project (and who doesn’t?), stop by during their office hours to ask about Spark’s current features and future release plan. Learn about upcoming projects like GraphX, Tachyon, and SparkR. Read more.
Table A
Joe Hellerstein (UC Berkeley)
Looking to cement your position as an agile data wrangler? Want to shorten “time to insight” and use your data to enable better business decisions? Meet Joe to talk about the human side of data science, and the role of discovery, structure, and content in data transformation. Find out about technology trends in data, including research and impacts on practice. Read more.
Table B
Max Richman (Mobile Accord - GeoPoll)
Max’s field of expertise includes using data for international development and emerging markets. Meet with Max to discuss the data landscape for international development, including the actors, tools, and resources. Learn about research design and survey methodology, such as lessons learned from sending millions of surveys on mobile phones around the world. Read more.
Table C
Julie Steele (Silicon Valley Data Science)
If you’re interested in data visualization or data in the healthcare field, stop by and see Julie. She’s happy to talk with you about trends in personalized healthcare, medical data and meaningful use, and visual perception and data visualization. Read more.
Table A
Shrikanth Shankar (Qubole Inc.)
Looking to build high-performance, scalable queries or deploy User Defined Functions (UDFs) in Apache Hive? Shrikanth will give you the advice you seek. Talk to him about using advanced Hive techniques, getting started with Hadoop in the cloud, or deciding between AWS and Google Compute. Read more.
Table B
Abe Gong (Superconductive Health)
Interested in using small data to increase the value of big data? Meet with Abe for a freewheeling conversation on patterns and anti-patterns for data science (including the Sidekick Pattern), and learn how data structures interact with the process of analysis. Read more.
Table C
Eric Pugh (OpenSource Connections)
Eric submitted Solr’s most popular patch and co-authored Solr Enterprise Search Server. If you’re interested in Solr, spend some time with him exploring the stability of the Solr community, and what’s next for this enterprise search platform. And be sure to ask Eric about HeliosSearch. Read more.
Table A
Felienne Hermans (Delft University of Technology)
Meet Felienne to talk about everything to do with spreadsheets, such as measuring spreadsheet quality, keeping track of spreadsheets in an organization, and improving “legacy” spreadsheets. Learn how to test existing and new spreadsheets, and how to migrate from spreadsheets to other BI tools. Read more.
Table B
Jim Stogdill (O'Reilly Media, Inc.)
Jim runs O’Reilly’s Strata, Solid, and Radar groups. Stop by and tell Jim what you think of Strata and how we can make it better. Talk with him about the cool project you’re working on, and find out about interesting developments in our industry. Read more.
Table A
Hadley Wickham (Rice University / RStudio)
Hadley has written or contributed to more than 30 R packages. If you want to express yourself with this language and platform, talk to Hadley about using R for data science and analysis and interactive exploration of large in-memory datasets. Learn how to understand and visualize data with R. Read more.
Table B
Brian Granger (Cal Poly San Luis Obispo)
Discuss the IPython Notebook with Brian and other leaders of the IPython project. Learn how to create and use JavaScript widgets in the Notebook, and find out how to use it with different programming languages. Use the IPython Notebook Viewer (http://nbviewer.ipython.org) to share Notebooks on the Web. Read more.
Table C
Adam Marcus (B12)
Want to make sense of your data by learning how to build mixed crowdsourced machine-learning pipelines? See Adam and find out how to train a classifier that makes judgments better than the crowd that trained it. Learn how to apply lessons from fields such as user interface design and cognitive science to your crowd-powered workflows. Read more.
Table A
Fangjin Yang (Imply)
If you want to build a real-time analytics stack, meet with Fangjin to hear the argument for using Kafka, Storm, Hadoop, and Druid. He’ll provide implementation details for those of you interested in trying out this unique stack at home, including how to scale the stack and make it highly available. Read more.
Table B
Chris Harland (Textio)
Wherever you sit in the data pipeline, if you want to create an effective data workflow from source to insight delivery, talk to Chris. He’ll give you the low-down on machine-learning tools, implementations, and use cases. You’ll learn about fundamental problems in user behavior and measurement, and predictive modeling for user retention and feature improvement. Read more.
Table C
Scott Lee (Knowledgent)
Scott Lee is an information strategist who can help you explore complex issues like the evolving nature of data governance, and the effort to transform big data methodologies to include data governance & quality disciplines. Learn about balancing holistic and agile aspects of enterprise information management (EIM), especially across analytic use cases and constituents. Read more.
Table A
Paco Nathan (derwen.ai)
If you want to build scalable, fault-tolerant data workflows atop Apache Mesos, and master Mesos workloads, ask Paco. He’ll tell you how to use Mesos as an SDK for building distributed frameworks, and how to build enterprise data workflows with Cascading. You might get him to mention PMML, an open standard for migrating predictive models. Read more.
Table B
Rahul Pathak (Amazon Web Services)
Do you manage large datasets? Join Rahul to discuss AWS big data services, including Amazon S3, Elastic MapReduce, DynamoDB, Redshift, and Kinesis. Learn how to build a good big data architecture and elastically grow your resources with these products. Read more.
Table C
Anjul is the Vice President of Big Data Products at IBM. If you’re interested in data analytics, visit Anjul for in-depth conversations about services from IBM’s Watson Group. He’ll tell you how to get started with your cognitive system, using Watson services to find and capitalize on actionable insights. Find out what happens when Watson meets Hadoop. Read more.