New speakers are added regularly. Please check back to see the latest updates to the agenda.
Jose Abelenda is the director of marketing analytics at Hotwire. Prior to that, he worked as a data scientist at PayPal.
Lior Abraham is a cofounder of Interana, Inc. Lior was instrumental in scaling Facebook’s infrastructure from a few million users to over a billion, as well as bringing its most technically challenging products to market. This included building many of the backend systems that powered...
Armando Acosta has been in the IT industry for 14 years. With experience ranging from sales to product marketing and management to developing big data solutions, recently Armando has been focusing on the hyperscale market in server product management, hardware design, and big data solutions....
Joseph Adler has many years of experience in data mining and data analysis at companies including DoubleClick, Verisign, and LinkedIn. Currently, he is director of product management and data science at Confluent. He is the holder of several patents for computer security and cryptography and...
Nidhi Aggarwal leads strategy and marketing at Tamr. Prior to joining Tamr, Nidhi founded Cloud vLab, makers of qwikLAB, a software-learning platform used to create and deploy on-demand lab environments. In the years before Cloud vLab, Nidhi worked at McKinsey & Company, advising Fortune...
Sara Ahmadian is Seamless Planet’s globetrotting CEO and the relentless catalyst for Seamless Planet’s great journey. Sara has several years of experience developing large-scale business infrastructures at successful B2B startups in Silicon Valley. She was invited by President Obama to participate in the 2015...
With over 15 years in advanced analytical applications and architecture, John Akred is dedicated to helping organizations become more data driven. As CTO of Silicon Valley Data Science, John combines deep expertise in analytics and data science with business acumen and dynamic engineering leadership....
Brad Allen’s career has centered around the development and scaling of technologies that provide broad social benefit—for example, Peloton Technology (automated transportation), Aclima (air pollution sensing), and Embrace (developing-world healthcare). At Embrace, Brad led the late design and early market development trials for the Embrace...
T.J. Alumbaugh is a developer at Continuum Analytics. He likes array-oriented computing, Python, and C++.
Franz Aman is senior vice president of brand and demand at Informatica, where he is responsible for branding, global demand generation, marketing operations, content, and digital marketing. Previously, Franz held numerous executive positions within industry-leading technology companies, including SAP, BusinessObjects, BEA Systems,
Xavier Amatriain is VP of engineering at Quora, where he leads the team building the best source of knowledge on the Internet. With over 50 publications in different fields, Xavier is best known for his work on machine learning in general and recommender systems in...
Jeremy Anderson is a design lead at the Spark Technology Center in San Francisco, focused on designing better experiences for the open source data community. Jeremy’s team has been active in contributing to Apache projects, including Zeppelin and SystemML. Prior to joining IBM, Jeremy...
Jesse Anderson is a data engineer, creative engineer, and managing director of the Big Data Institute. Jesse trains employees on big data—including cutting-edge technology like Apache Kafka, Apache Hadoop, and Apache Spark. He has taught thousands of students at companies ranging from startups to...
Erik Andrejko leads the data science and research organization at the Climate Corporation, which applies large-scale statistical machine learning and data science to solve challenging problems in numerous domains such as climatology, agronomic modeling, and geospatial applications. Erik’s contributions to the Climate Corporation include defining...
Bruce Andrews was confirmed as the deputy secretary of commerce on July 24, 2014, after being named acting deputy secretary of commerce by President Obama and Secretary Penny Pritzker on June 9, 2014. Previously, Bruce served as chief of staff to the secretary at the...
Ian Andrews is VP of products at Pivotal, where he is responsible for product strategy and marketing for Pivotal Cloud Foundry, Spring, and Big Data Suite. Prior to Pivotal, Ian was involved with market-defining startups such as Netscape, Opsware, and Aster Data.
Michael Armbrust is the lead developer of the Spark SQL project at Databricks. Michael’s interests broadly include distributed systems, large-scale structured storage, and query optimization. Michael holds a PhD from UC Berkeley, where his thesis focused on building systems that allow developers to rapidly...
Robert Bagley is the vice president of analytics science at ClickFox, where he oversees the innovative application of data science practices for client engagements and product features, charting future analytic and technology strategies. Prior to his current role, he held various leadership positions at ClickFox...
Brandon Ballinger is a cofounder at Cardiogram. Previously a cofounder at Sift Science and an engineer at Google on speech recognition and ads quality, Brandon was also called in by the White House to help fix Healthcare.gov. He graduated from the University of Washington with...
Vishal Bamba is vice president of strategy and architecture at Transamerica Technology, where he leads a team focusing on innovation initiatives within the enterprise. Vishal has over 15 years of experience in distributed systems and has led many innovation projects. He has consulted and worked...
Nenshad Bardoliwalla is the founding vice president of products at Paxata, where he is responsible for product strategy, product management, and product marketing. Nenshad is an executive and thought leader with a proven track record of success leading product strategy, product management, and development in...
Paul Barth is founder and CEO of Podium Data, creator of the industry-leading Podium data lake software platform, which is redefining enterprise data management. He has spent decades developing advanced data and analytics solutions for Fortune 100 companies and is a recognized thought leader...
Pierre Thomas Barthelemy is the engineering lead of the Data Infrastructure team at Coursera. The team is responsible for introducing core data systems (e.g., data warehouse using Redshift, ETL using Data Pipeline and Scalding), while also helping build products that create a developer-friendly ecosystem...
Joel Baxter is an engineer at BlueData, where he focuses on virtualization, containers, and Hadoop-related technologies to build an infrastructure platform for big data analytics. His background is in the provisioning and configuration of virtual compute, storage, and networking to serve the needs of application...
Maxime Beauchemin recently joined Airbnb as a data engineer developing tools to help streamline and automate data-engineering processes. He mastered his data-warehousing fundamentals at Ubisoft and was an early adopter of Hadoop/Pig while at Yahoo in 2007. More recently, at Facebook, he developed analytics-as-a-service frameworks...
Marie Beaugureau is the lead data editor for O’Reilly Media.
Alex Behm is a software engineer at Cloudera, working on the Impala team. He holds a PhD in computer science from UC Irvine.
John Belchamber is global head of business intelligence for the Advanced Analytics team at Telefónica. John has 10 years of experience in marketing and 10 more in the telecom industry, where he held strategic roles in innovation and business intelligence. Recognized as Data Professional of...
Tim Berglund is a teacher, author, and technology leader with DataStax. He has spoken at numerous conferences internationally and in the United States and contributes to the Denver tech community as president of the Denver Open Source User Group. He is the copresenter of...
Lucy Bernholz is a senior research scholar at Stanford University, where she runs the Digital Civil Society Lab. Lucy blogs about philanthropy, nonprofits, and technology at Philanthropy2173.com.
Christopher Berry is a data scientist at the Canadian Broadcasting Corporation and the founder of Authintic (acquired by 500px). Christopher has implemented breakthrough social-analytics programs for AB-Inbev, Research In Motion, and Coca-Cola. He participated in ecommerce redesigns at Gucci and Dell, mobile integrations for Best...
John Berryman’s first career was as an aerospace engineer, but after several years in aerospace, he found that he most loved his job when he was either programming or working on a good math problem. Eventually, John cut out the aircraft and satellites and started...
David Beyer is currently an investor with Amplify Partners, a $50M VC firm focused exclusively on early-stage IT infrastructure and data companies. David began his career in technology as the cofounder and CEO of Chartio.com, a pioneering provider of cloud-based data visualization and analytics....
Milind Bhandarkar was the founding member of the team at Yahoo that took Apache Hadoop from a 20-node prototype to a data center-scale production system and has been contributing to and working with Hadoop since version 0.1.0. Milind started the Yahoo Grid solutions team focused on training, consulting,...
Anurag Bhardwaj currently leads data science efforts at Quad Analytix, where he focuses on large-scale product classification, large-scale smart extraction, and various other machine-learning techniques. Previously, he worked on image understanding at eBay Research Labs. Anurag received his PhD and MS from the State University...
Lori Bieda is VP of business analytics and insights at the Bank of Montreal. Lori has 20 years of analytics, marketing, technology, and leadership experience across the financial services, insurance, telecommunications, technology, retail, publishing, and marketing service provider sectors. Lori’s proven successes include leading all...
Keith Bigelow leads the commercial cloud and drone division at 3DR. Prior to 3DR, Keith was the GM and SVP of the Analytics Cloud, the fastest-growing product line in Salesforce history. Previously, Keith held executive positions at SAP, where he served as
Sarah Bird is a software engineer at Continuum Analytics. She has been a core Bokeh developer since 2015 and has given numerous talks and tutorials on Bokeh. Previously, she worked at Aptivate as a full stack web developer building IT solutions for the international development...
Joerg Blumtritt is the founder and CEO of Datarella, a computational social science startup delivering mobile analytics, self-tracking solutions, and data science consulting. After graduating from university with a thesis on machine learning, Joerg worked as a researcher in behavioral sciences, focused on nonverbal...
Farrah Bostic created the Difference Engine based on her belief that deep understanding of customer needs is essential to growing businesses through great products and services. Farrah has honed her customer-centric insights as an advisor to some of the world’s most respected brands, including Apple,...
Yaron “Roni” Burd is a principal program manager on the Big Data team at Microsoft working on Hadoop and Azure Data Lake, where he focuses on making machine learning with big data scalable and easy. Roni has spent eight years helping Microsoft build its internal...
Mark Burnette is a director of sales engineering and major accounts at Pentaho, where he leads teams of engineers across the western US and Japan that focus on designing and proving out big data and embedded solutions for Fortune 500 companies, including cybersecurity, telematics, mobile network...
Matt Butner is CTO and cofounder of Stride Health, which delivers intelligent healthcare, coverage, and tax compliance to self-employed and independent working Americans. Stride’s suite of benefits for independents is directly integrated into the largest on-demand marketplaces, including Uber, Postmates, and TaskRabbit. Backed by...
Mike Cafarella is one of the cofounders of the Apache Hadoop and Nutch open source projects. Mike is also an assistant professor of computer science and engineering at the University of Michigan. His research interests include databases, information extraction, data integration, and data mining. Recently,...
JR Cahill leads the Enterprise Analytics Architecture team at Kellogg, supporting the Global Analytics team that consists of Data Science, Advanced Analytics, Data Services, Visualizations and Reporting. JR has 20 years of operational development and architecture experience in data warehousing and analytics. He is also...
Arturo Canales leads the analytics team in the Global BI & Big Data unit at Telefónica. Arturo has been involved in the creation of many data products for internal BI teams across all the countries where Telefónica operates, from a social-network-analysis-approach product to better understand...
Arno Candel is the chief architect at H2O, a distributed and scalable open source machine-learning platform. Arno is also the main author of H2O’s Deep Learning. Before joining H2O, Arno was a founding senior MTS at Skytree, where he designed and implemented high-performance machine-learning...
John F. Canny is a computer scientist and the Paul and Stacy Jacobs Distinguished Professor of Engineering in the Computer Science Department of the University of California, Berkeley. John has made significant contributions in various areas of computer science and mathematics, including artificial intelligence, robotics,...
Matt Cardillo is a senior director of FINRA technology. Matt is an avid Scrum evangelist at FINRA and exercises it in the delivery of highly usable, innovative big data analytic solutions.
Amber Case is the director of Esri’s R&D Center in Portland, where she works on open source developer tools and next-generation location-based technology. Previously, Amber was the CEO and cofounder of Geoloqi, a location-based software company acquired by Esri in 2012. She is...
An entrepreneurial executive with over 25 years of industry experience, Michele Chambers is currently CMO of Continuum Analytics. Prior to Continuum Analytics, Michele held executive leadership roles at database and analytic companies Netezza, IBM, Revolution Analytics, MemSQL, and RapidMiner. In her career, Michele...
Evan Chan is a distinguished software engineer at Tuplejump. Evan loves to design, build, and improve bleeding-edge distributed data and backend systems using the latest open source technologies. He has led the design and implementation of multiple big data platforms based on Storm, Spark, Kafka,...
Vinoth Chandar works on data infrastructure at Uber, with a focus on Hadoop and Spark. Vinoth has a keen interest in unified architectures for data analytics and processing. Previously, Vinoth was the LinkedIn lead on Voldemort and worked on Oracle server’s replication engine, HPC, and...
Manjeet Chayel is a specialist SA for AWS working on big data technology solutions. Manjeet focuses on Amazon EMR and helps customers solve their big data problems using the right techniques and tools for the job.
Ewen Cheslack-Postava is an engineer at Confluent building a stream data platform based on Apache Kafka to help organizations reliably and robustly capture and leverage all their real-time data. Ewen received his PhD from Stanford University, where he developed Sirikata, an open source system for...
Adam Cheyer is cofounder and VP of engineering at Viv. Previously, Adam was cofounder and VP of engineering at Siri. He’s also served as program director in SRI’s Artificial Intelligence Center and chief architect of the CALO/PAL project. A pioneer in the areas...
Trina Chiasson lives at the intersection of data, design, and code. Trina is a senior product manager at Tableau Software, where she enjoys helping people see and understand data. Previously, she was the cofounder and CEO of Infoactive, a web app for turning live...
Mok Choe is an accomplished technologist whose career spans a diverse group of financial services businesses and successful Internet companies. Mok is a proven transformational leader, with extensive experience leading enterprise architecture at firms including TD Bank Group, Commonwealth Bank of Australia, Union Bank of...
Kelvin Chu is a founding member of the Hadoop team at Uber, where he creates tools and services on top of Spark to support multitenancy and large-scale computation-intensive applications. Kelvin is the creator and lead engineer of the Spark Uber development kit, Paricon, SparkPlug, and...
Brian Clapper is a senior instructor and curriculum developer at Databricks. Brian has more than 32 years’ experience as a software developer. Brian has worked for a stock exchange, the US Navy, a large software company, several startups, and small companies and, most recently, as...
Brian Clark is VP of product management at Objectivity. Brian has nearly 30 years of software and technology experience and was one of the early architects of Objectivity/DB. Before joining Objectivity, Brian worked at Automation Technology Products, providing leading tools in the MCAD market....
Christopher Colburn is just another data scientist at Netflix.
Eric Colson is chief algorithms officer at Stitch Fix as well as an advisor to several big data startups. Previously, Eric was vice president of data science and engineering at Netflix. He holds a BA in economics from SFSU, an MS in information systems...
Mike Conover builds machine-learning technologies that leverage the behavior and relationships of hundreds of millions of people. A staff data scientist at LinkedIn, Mike has a PhD in complex systems analysis with a focus on information propagation in large-scale social networks. His work has appeared...
James Crawford has two decades of experience leading innovative software projects, including empowering farmers with climate data at the Climate Corporation, working to put a commercial robot on the moon at Moon Express, making the world’s books searchable at Google, and managing robotics at NASA’s...
Charlie Crocker is a data geek with 20 years of experience bringing data out of the shadows to drive business value and optimize operational costs. At Autodesk, he is currently working across divisions to identify and validate potential reliable data sources and access mechanisms, while...
Alistair Croll is an entrepreneur with a background in web performance, analytics, cloud computing, and business strategy. In 2001, he cofounded Coradiant (acquired by BMC in 2011) and has since helped launch Rednod, CloudOps, Bitcurrent, Year One Labs, and several other early-stage companies. He works...
Nick Curcuru has been delivering analytics solutions for nearly 20 years in operations and consulting. He is currently principal of the big data analytics practice at MasterCard Advisors, where he works with the executive suite cascading to the operational level to enable data-driven strategy for...
Doug Cutting is the chief architect at Cloudera and the founder of numerous successful open source projects, including Lucene, Nutch, Avro, and Hadoop. Doug joined Cloudera from Yahoo, where he was a key member of the team that built and deployed a production Hadoop storage-and-analysis...
Michelangelo D’Agostino is the director of data science R&D at Civis Analytics, where he leads a team that develops statistical models and writes software to help companies and nonprofits leverage their data. As a reformed particle physicist turned data scientist, Michelangelo loves mungeable datasets, machine...
Timothy Danford is a computer scientist working on advanced automation approaches to big data variety in the pharmaceutical and healthcare industries. Previously, Timothy was a software architect, engineer, and founding team member for Genome Bridge LLC, a Broad Institute subsidiary organized to develop...
Tathagata Das is an Apache Spark committer and a member of the PMC. He is the lead developer behind Spark Streaming, which he started while a PhD student in the UC Berkeley AMPLab, and is currently employed at Databricks. Prior to Databricks, Tathagata worked...
Sudipto Shankar Dasgupta is an AVP and head of engineering for the Platforms group at Infosys Ltd., where he works on big data and analytics platform development for large enterprises. Prior to that, he was chief architect with SAP, working on SAP...
Prior to joining Amplify as a general partner, Mike Dauber spent over six years at Battery Ventures, where he led early-stage enterprise investments on the West Coast, including Battery’s investment in a stealth security company that is also in Amplify’s portfolio. Most recently, Mike sat...
Bolke de Bruin is putting advanced analytics in the heart of the wholesale business line of European bank ING.
Donna Denio is a communications and business development specialist who is passionate about teamwork and generating productive relationships. Donna has over 20 years’ experience helping leaders of multinational companies identify and secure new business opportunities in design and construction. Ten years ago, Donna’s search for...
Anthony Dina serves as the director of enterprise technologists at Dell, Inc., where he leads a team of solutions architects with expertise in big data and application acceleration to work with customers on how to transform IT into better business outcomes. Anthony has 17 years...
Renee DiResta is the vice president of business development at Haven, a private marketplace for booking ocean freight shipments. Previously, Renee was a principal at seed-stage VC fund O’Reilly AlphaTech Ventures (OATV) and spent seven years as a trader at Jane Street Capital, a...
Scott Donaldson is senior director for Market Regulation Technology at FINRA. Scott leads the data and analytics teams responsible for the surveillance of US equities and fixed-income markets.
Mark Donsky leads data management and governance solutions at Cloudera. Previously, Mark held product management roles at companies such as Wily Technology, where he managed the flagship application performance management solution, and Silver Spring Networks, where he managed big data analytics solutions that reduced greenhouse...
Scott Draves is an award-winning software artist, VJ, and pioneer of the open source movement. His clients and exhibitions range from the likes of MoMA.org, LACMA, Google, and the Adler Planetarium to Skrillex. He has a PhD in computer science from Carnegie Mellon University...
Chris DuBois is a data scientist focused on building tools for other data scientists. At Dato, Chris has helped design and implement tools for creating recommendation systems and for large-scale text analysis. His current work makes it simpler to train models that generalize well. After...
Ted Dunning has been involved with a number of startups—the latest is MapR Technologies, where he is chief application architect working on advanced Hadoop-related technologies. Ted is also a PMC member for the Apache Zookeeper and Mahout projects and contributed to the Mahout clustering,...
Don Bosco Durai is an Apache committer and currently working as a security architect at Hortonworks, focused on enabling enterprise-grade security within the Hadoop platform. Bosco brings years of experience building and managing enterprise data security products. Before Hortonworks, Bosco was the cofounder and chief...
Glynn Durham is a senior instructor at Cloudera. Previously, he worked for Oracle, Forté Software, MySQL, and Cloudera, spending five or more years at each.
An Apache Cassandra committer and PMC member, Gary Dusbabek is a lifelong programmer specializing in distributed systems. His past experience includes working with large-scale text and image indexes in the newspaper industry and building a multi-data-center distributed metrics and monitoring system for a large...
Joey Echeverria is the director of engineering at Rocana, where he builds applications for scaling IT operations built on the Apache Hadoop platform. Joey is a committer on the Kite SDK, an Apache-licensed data API for the Hadoop ecosystem. Joey was previously a...
A committer to several open source projects, including the Spark Cassandra Connector and Cassandra Kafka Connector, she previously contributed to Akka (two new features in Akka Cluster), Spring Integration, and several others. She is also a speaker at international big data and Scala conferences: Kafka Summit, Spark...
Alexei (Alyosha) Efros joined UC Berkeley in 2013 as associate professor of electrical engineering and computer science. Prior to that, Alyosha spent nine years on the faculty of Carnegie Mellon University. He has also been affiliated with École Normale Supérieure/INRIA and the University of...
Jana Eggers is a tech executive focused on products and the messages surrounding them. Jana has started and grown companies and led large organizations within even bigger companies. She supports, subscribes to, and contributes to customer-inspired innovation, systems thinking, lean analytics, and Autonomy/Mastery/Purpose-style leadership. Jana’s...
Stephen Elston is an experienced big data geek, data scientist, and software business leader. Steve is principal consultant at Quantia Analytics, LLC, where he leads the building of new business lines, manages P&L, and takes software products from concept and financing through development, intellectual...
Bin Fan is a software engineer at Alluxio and a PMC member of the Alluxio project. Prior to Alluxio, Bin worked at Google building next-generation storage infrastructure, where he won Google’s Technical Infrastructure award. Bin has a PhD in computer science from Carnegie Mellon...
Moty Fania is a principal engineer for big data analytics at Intel IT, where he drives the overall technology and architectural roadmap and owns development and architecture. Moty has over 13 years of experience in BI, data warehousing, and decision-support solutions. He holds a bachelor’s...
Faisal Farooq is currently the principal scientist in the Watson Health group of IBM Watson, where he works on next-generation healthcare software to improve patient care. Faisal is an expert in applying machine learning in the healthcare domain, and his general areas of interest...
Sameer Farooqui is a client services engineer at Databricks, where he works with customers on Apache Spark deployments. Sameer works with the Hadoop ecosystem, Cassandra, Couchbase, and general NoSQL domain. Prior to Databricks, he worked as a freelance big data consultant and trainer globally and...
Camille Fournier is the former head of engineering at Rent the Runway. She was previously a vice president at Goldman Sachs. Camille is an Apache ZooKeeper committer and PMC member and a Dropwizard framework PMC member.
Michael Franklin is the Thomas M. Siebel Professor of Computer Science at UC Berkeley and the director of the AMPLab. The AMPLab, which received an NSF CISE Expeditions in Computing award announced as part of the White House Big Data Research Initiative in...
Eric Frenkiel is the cofounder and CEO of MemSQL, an in-memory distributed database that combines real-time and historical big data analytics. MemSQL is a Y Combinator company that has raised more than $45M in venture capital. Prior to MemSQL, Eric worked at Facebook on...
Julia Galef cofounded the Center for Applied Rationality (CFAR), a nonprofit devoted to developing cognitive science-based strategies for reasoning and decision making. In addition to research, CFAR runs workshops for companies and talented individuals who want to use rationality to address global problems....
Tanya Gallagher is a veteran technical instructor with thousands of hours of classroom experience across a 20-year career. Tanya has spent the past two years at DataStax writing curriculum and leading the curriculum development team. Prior to DataStax, she was a curriculum developer and technical...
Ilya Ganelin is a roboticist turned data engineer. After a few years building self-discovering robots at the University of Michigan and another few years working on embedded DSP software with cell phones and radios at Boeing, he landed in the world of big data...
Siddha Ganju is a master’s student of computational data science at Carnegie Mellon University and was a 2015 summer openlab intern at CERN. She has implemented several projects at the junction of machine learning, natural language processing, and information retrieval, and her research also...
Yael Garten leads a team of data scientists at LinkedIn that focuses on understanding and increasing growth and engagement of LinkedIn’s 400 million members across mobile and desktop consumer products. Yael is an expert at converting data into actionable product and business insights that impact...
Deepak Gattala is a big data architect in IT project management at Dell.
Matthew Gee is cofounder and principal at the Impact Lab, a data-analytics company focused exclusively on developing scalable data science solutions to social-sector problems. He is also a senior research scientist at the University of Chicago’s Center for Data Science and Public Policy and a...
Lise Getoor is a professor in the Computer Science Department at the University of California, Santa Cruz. Her research areas include machine learning, data integration, and reasoning under uncertainty, with an emphasis on graph and network data. Lise was recently recognized as one of the...
Charles Givre is an unapologetic data geek who is passionate about helping others learn about data science and become passionate about it themselves. For the last five years, Charles has worked as a data scientist at Booz Allen Hamilton for various government clients and has...
Colette Glaeser is a principal data strategist at Silicon Valley Data Science. With a proven track record in applying analytics to provide a competitive advantage, Colette brings over 20 years of experience in driving business development, customer insight, operational analysis, and continuous process improvement across...
Dennis Gleeson is the chief evangelist at 1010data. Prior to joining 1010data, Dennis was a director of strategy in the Central Intelligence Agency (CIA)’s Directorate of Analysis. He began his career with the CIA in 2002 as a political analyst. In 2009, he...
Scott Gnau is the CTO of Hortonworks, a company at the forefront of emerging connected data platforms, where he works intimately with leaders in the Fortune 1000 undergoing business transformation through real-time data. Scott has spent his entire career in the data industry; previously,...
Joe Goldberg is the lead solutions marketing manager at BMC Software, where he helps BMC products leverage new technology to deliver market-leading solutions with a focus on workload automation and big data. Joe has more than 35 years of experience in the design,...
Kevin Goode is the director of platform engineering at Inmar. Kevin has 20 years of IT experience, 19 of which have been focused on SQL Server, starting with version 6.5. For the past four years, he has been focused on big data, Hadoop, and NoSQL.... Read More.
Alex Gorelik is the founder and CEO of Waterline Data, a startup focused on enhancing the value of Hadoop through data self-service and governance. Alex is a serial entrepreneur and innovator who has spent over 25 years inventing and bringing to market cutting-edge data-oriented... Read More.
Jon Gosier is a serial tech entrepreneur and venture capitalist working at the intersection of data science and design. Based in Philadelphia, Jon is also the cofounder of Predictive Pop (aka PredPop), a data company changing the way the music industry monitors and monetizes music. Prior... Read More.
Alexander Gray is an associate professor at Georgia Tech and the CEO of Skytree, Inc. His research focuses on scaling up all of the major practical methods of machine learning (ML) to massive datasets. Alex began working on this problem at NASA in... Read More.
Dave Gray is the founder and chairman of XPLANE, the visual thinking company. Founded in 1993, XPLANE has grown to be the world’s leading consulting and design firm focused on information-driven communications. Dave spends his time researching and writing about visual business, as... Read More.
Garrett Grolemund is a data scientist and chief instructor for RStudio, Inc. Garrett is a longtime user and advocate of R; he wrote the popular lubridate package for working with dates and times in R. Garrett designed and delivered the highly rated O’Reilly video series... Read More.
Mark Grover is a software engineer working on Apache Spark at Cloudera. Mark is a committer on Apache Bigtop and a committer and PMC member on Apache Sentry and has contributed to a number of open source projects including Apache Hadoop, Apache Hive, Apache... Read More.
Carlos Guestrin is the Amazon Professor of Machine Learning in Computer Science & Engineering at the University of Washington and the cofounder and CEO of Dato. Carlos also coteaches the Machine Learning Specialization through UW and Coursera. His previous positions include the Finmeccanica Associate... Read More.
Kanu Gulati is a senior associate at Zetta Venture Partners. Kanu has over 10 years of operating experience as an engineer, scientist, and strategist. She owned Intel’s multicore CAD algorithms research roadmap, developed advanced parallel CAD solutions, and pioneered metrics-driven methodology improvements for... Read More.
Sijie Guo is a staff software engineer at Twitter, where he is tech lead of the Message team. He is also the founder of Apache DistributedLog (incubating) and the PMC chair of Apache BookKeeper.
Vida Ha is currently a solutions engineer at Databricks. Previously, she worked on scaling Square’s reporting analytics system. Vida first began working with distributed computing at Google, where she improved search rankings of mobile-specific web content and built and tuned language models for speech recognition... Read More.
Patrick Hall is a senior staff scientist at SAS and an adjunct professor in the Department of Decision Sciences at George Washington University. Patrick designs new data-mining and machine-learning technologies. He is the 11th person worldwide to become a Cloudera certified data scientist. Patrick... Read More.
Jordan Hambleton is a solutions architect for Cloudera, based in the San Francisco office. While at Cloudera, his focus has been partnering with customers to build and manage scalable enterprise products on the Hadoop stack. Prior to Cloudera, Jordan was a member of technical staff... Read More.
Bob Hansen, the engineer in charge of making Vertica a vibrant part of the greater Hadoop ecosystem, turns customers’ needs into new features, making Vertica a peaceful island floating in the center of your data lake. Over his entire career, Bob has been dedicated to... Read More.
Moritz Hardt is a senior research scientist at Google Research, where his mission is to build the theory and tools that make machine learning more reliable. After obtaining a PhD in computer science from Princeton University, Moritz spent three years at IBM Research Almaden... Read More.
Todd Harple is an experience engineer at Intel, where he has worked since 2005. Todd has conducted global ethnographic and design research and presently he leads strategic innovation and pathfinding activities within Intel’s New Devices Group. Over the past three years, his focus has increasingly... Read More.
Derrick Harris works for datacenter software startup Mesosphere. He was previously a technology journalist, most notably covering cloud computing, big data, and other emerging IT trends for Gigaom since 2009. There’s a strong possibility that Derrick has written the words “cloud” and “Hadoop” more than... Read More.
Kate Heddleston is a software engineer who focuses on using open source tools to build web applications, with a particular interest in the portions of the product that interface with the user. When she’s not programming, Kate is involved with organizations like Hackbright Academy, PyLadies,... Read More.
Joseph M. Hellerstein is the Jim Gray Chair of Computer Science at UC Berkeley and cofounder and CSO at Trifacta. Joe’s work focuses on data-centric systems and the way they drive computing. He is an ACM fellow, an Alfred P. Sloan fellow, and... Read More.
Hylke Hendriksen is a data scientist at ING. Hylke studied computer science at Delft University of Technology. After demonstrating his graduate thesis project to the ING Wholesale Banking Advanced Analytics team on real-time anomalous click path detection, Hylke is now implementing this in... Read More.
Bill Hinderman is the engineering manager for air site optimization at Expedia, and was the senior site optimization UI engineer at Orbitz Worldwide. In human terms: he built the A/B testing development practice from the ground up. He and his team focus on experimenting and... Read More.
Throughout his eight-year tenure in the advanced electronics industry, Allen Hoem has focused on process optimization and product development. At Roku Inc., Allen streamlined the firmware deployment model for the New Products, Television division. Prior to that, Allen was the development lead and process coach for developing
Joshua Hoffman is the CEO of Zymergen. Prior to Zymergen, Josh was a partner at Norcob Capital and before that a managing director in merchant banking at Rothschild, where he was a member of the Management Committee. He began his career at McKinsey &... Read More.
Jeff Holoman is a systems engineer at Cloudera. Jeff is a Kafka contributor and has focused on helping customers with large-scale Hadoop deployments, primarily in financial services. Prior to his time at Cloudera, Jeff worked as an application developer, system administrator, and Oracle technology specialist.... Read More.
Jeremy Howard is a serial entrepreneur, business strategist, developer, and educator. He is the CEO of Enlitic, a startup he founded to use recent advances in machine learning to transform the practice of medicine and bring modern medical diagnostics to billions of people in... Read More.
Johnson Hsieh is a cofounder at Cardiogram, where he is applying deep learning to medicine. Previously, he was a software engineer at Google, building user models (e.g., user interests) to improve cross-product personalization and recommendation using various ML techniques. He also worked on the Google Voice Assistant (a.k.a. “Ok... Read More.
John Hugg has spent his entire career working with databases and information management. In 2008, John was lured away from a PhD program by Mike Stonebraker to work on what became VoltDB. As the first engineer on the product, he liaised with a team of... Read More.
Leah Hunter writes about the human side of tech for Fast Company, the Guardian, and O’Reilly. She is authoring two upcoming books—one on augmented reality from O’Reilly and the other on the future in five years. Leah speaks about both topics, as well as fashion... Read More.
Alysa Z. Hutnik is a partner in the Advertising & Marketing and Privacy & Information Security practices at Kelley Drye & Warren LLP in Washington, DC. Her practice represents clients in all forms of consumer-protection matters, from counseling to defending regulatory investigations and litigation.... Read More.
Tim Hwang is a lawyer and researcher focusing on the intersection of intelligent agents and society, currently at the Intelligence and Autonomy project at Data & Society in New York. He has formerly served in research roles with the Stanford Center for Legal Informatics, the... Read More.
Noah Iliinsky is a senior UX architect with Amazon Web Services. Noah strongly believes in the power of intentionally crafted communication and has spent the last decade researching, writing, and speaking about best practices for designing visualizations, informed by his graduate work in user experience... Read More.
Mario Inchiosa’s passion for data science and high-performance computing drives his work at Microsoft, where he focuses on delivering parallelized, scalable advanced analytics integrated with the R language. Previously, Mario served as Revolution Analytics’s chief scientist and as analytics architect in IBM’s Big Data organization,... Read More.
Alex Ingerman leads the product management team for Amazon Machine Learning. He joined Amazon in 2012 after working on products including web-scale search, content recommendation systems, immersive data-exploration environments, and enterprise email and content servers. Alex holds a BS in computer science and an MS... Read More.
Marco M. Ippolito is the data model architect for the France-based geophysical services company CGG, Inc., a fully integrated geoscience company providing leading geological, geophysical, and reservoir capabilities to a broad base of customers primarily from the global oil and gas industry. Since joining
Sreeni Iyer is CTO, CIO, and cofounder of Quad Analytix, a big data company in the ecommerce vertical. Sreeni is focused on machine learning, big data in batch and quasi real time, and insightful visualizations. Sreeni’s previous positions include director of architecture for... Read More.
Mridul Jain is a senior principal architect for Yahoo’s monitoring platform. He has been using Storm and Kafka to solve various real-time problems at Yahoo for almost three years. Mridul is also the author of Pig on Storm. His interests are mostly in the area... Read More.
Rohit Jain is the CTO at Esgyn for Trafodion, a transactional SQL-on-HBase RDBMS. Rohit worked for Hewlett-Packard for 28 years on applications and databases, undertaking such roles as solutions architect, consultant, software engineer, architect, development and QA manager, product manager, and chief... Read More.
Jeroen Janssens is the founder of Data Science Workshops, which provides on-the-job training and coaching in data visualisation, machine learning, and programming. For one day a week, Jeroen is an assistant professor at Jheronimus Academy of Data Science. Previously, he was a data scientist... Read More.
Calvin Jia is a software engineer at Tachyon Nexus and a top contributor to Tachyon.
Aaron Kalb has spent his career crafting and empowering delightful human-computer interactions, especially through natural language interfaces. Aaron currently leads the design team and guides the product vision at Alation, after leaving Stanford with a BS and an MS in symbolic systems and working at... Read More.
Dave Kale is a PhD student in computer science and an Alfred E. Mann Innovation in Engineering fellow at the University of Southern California. His research uses machine learning to extract insight from digital data in high-impact domains, including, but not limited to, healthcare. His... Read More.
Holden Karau is a software development engineer at IBM and is active in open source. Prior to IBM, she worked on a variety of big data, search, and classification problems at Alpine, Databricks, Google, Foursquare, and Amazon. Holden is the author of Learning... Read More.
Aneesh Karve is cofounder and CTO at Quilt, a data virtualization platform for data scientists. Previously, Aneesh worked as a product manager, lead designer, and software engineer at companies like Microsoft, NVIDIA, and Matterport. Aneesh was the general manager for AdJitsu, the first... Read More.
Mubashir Kazia is a solutions architect at Cloudera focusing on security. Mubashir started the initiative integrating Cloudera Manager with Active Directory for kerberizing the cluster and provided sample code. Mubashir has also contributed patches to Apache Hive that fixed security-related issues.
Brian Kent is a machine-learning engineer at Dato. His passion is developing statistical and machine learning tools and using these tools to help people solve problems with data. Brian holds a PhD in statistics from Carnegie Mellon University. His research focused on clustering methods and... Read More.
Paul Kent is vice president of big data initiatives at SAS, where he divides his time between customers, partners, and the Research & Development teams discussing, evangelizing, and developing software at the confluence of big data and high-performance computing. Paul was previously vice president... Read More.
Grega Kešpret is the director of engineering for analytics at Celtra, where he builds analytics pipeline and optimization systems. Grega also leads teams of engineers and data scientists in San Francisco and Ljubljana working on Celtra’s analytics platform. Prior to Celtra, Grega worked at
Amandeep Khurana is a solutions architect at Cloudera, where he’s involved in the entire lifecycle of Hadoop adoption for customers from use-case discovery to taking systems to production. Amandeep is also a coauthor of HBase In Action, a book geared toward building applications using HBase.... Read More.
Spencer Kimball is the cofounder and CEO of Cockroach Labs, where he maintains a delicate balance between a love for programming distributed systems and the excitement of helping the company grow smoothly. He cut his teeth on databases during the dot-com heyday and had... Read More.
Jonathan H. King is the head of cloud strategy for Ericsson. He was previously head of cloud strategy and business development for CenturyLink Technology Solutions. Prior to that, Jonathan was SVP of WW business development at Joyent, an innovative cloud-computing company based in San... Read More.
Adam is an IBM Distinguished Engineer and CTO of the Cloud Data Services group. He joined IBM in 2014 via the acquisition of Cloudant, where he built a highly available, scalable database and drove the development of the systems required to offer... Read More.
Benedikt Koehler studied sociology, anthropology, and psychology in Munich, where he received his PhD in 2006. After founding a mobile-Web startup in the late 1990s, he worked as a consultant for Internet and media companies. In 2008, Benedikt cofounded the Social Media Association, the first... Read More.
Marcel Kornacker is a tech lead at Cloudera and the architect of Apache Impala (incubating). Marcel has held engineering jobs at a few database-related startup companies and at Google, where he worked on several ad-serving and storage infrastructure projects. His last engagement was as the... Read More.
Jay Kreps is the cofounder and CEO of Confluent, a company focused on Apache Kafka. Previously, Jay was one of the primary architects for LinkedIn, where he focused on data infrastructure and data-driven products. He was among the original authors of a number of... Read More.
Balaji Krishna has been with SAP for over 16 years, with customer-facing experience as support consultant, RIG, solution management, and currently product management. He has been a trusted advisor to customers in architecting and implementing the best end-to-end EDW and analytics solutions.... Read More.
Balaji Krishnapuram is responsible for analytics at IBM Watson Health, where he currently leads the development of two products and a cloud-based analytics platform for healthcare. Previously, Balaji led teams that launched seven commercially successful products using machine learning over the last 10 years... Read More.
Chi-Yi Kuan is director of business analytics at LinkedIn. He has over 15 years of extensive experience in applying big data analytics, business intelligence, risk and fraud management, data science, and marketing mix modeling across various business domains (social network, ecommerce, SaaS, and consulting) at... Read More.
Scott Kurth is the vice president of advisory services at Silicon Valley Data Science, where he helps clients define and execute the strategies and data architectures that enable differentiated business growth. Building on 20 years of experience making emerging technologies relevant to enterprises, he has... Read More.
Yann Landrin is a data scientist and data engineer with over 15 years of personalization and big data experience. He has worked on all aspects of big data, from large-scale machine learning to infrastructure optimization. At Autodesk, he is working on the next-generation data platform,... Read More.
Costin Leau is an engineer at Elasticsearch, where he leads big data efforts. An open source veteran, Costin led various Spring projects (Spring OSGi, GemFire, Redis, Hadoop) and authored an OSGi spec. He has spoken about Java, big data, and Elasticsearch-related topics at a number... Read More.
Alex Leblang is an engineer at Cloudera on the RecordService team. Previously, Alex was an Apache Impala (incubating) engineer and interned at Vertica. He holds a bachelor’s degree from Brown University with concentrations in computer science and Latin American studies.
Erin Ledell is a statistician and machine-learning scientist at H2O.ai. Erin is the main author of H2O Ensemble. Before joining H2O, she was the principal data scientist at Wise.io and Marvin Mobile Security (acquired by Veracode in 2012) and the founder of DataScientific, Inc. Erin... Read More.
Mike Lee Williams is director of research at Fast Forward Labs, an applied machine intelligence lab in New York City, where he builds prototypes that bring the latest ideas in machine learning and AI to life and helps Fast Forward Labs’s clients understand how to... Read More.
Bob Levy is CEO of Virtual Cove, focused on emerging technologies for improving human data processing potential. He has over two decades of executive, product, marketing, and R&D experience with firms including IBM, MathWorks, Hancock Software, Harte Hanks, and Rational Software. Bob was... Read More.
Linus Liang is a serial entrepreneur with expertise in technology, medical devices, and social enterprises. He most recently cofounded Embrace, a social enterprise that develops and distributes low-cost infant incubators to developing countries. Unlike traditional incubators that cost up to $20,000, the Embrace Infant... Read More.
Todd Lipcon is an engineer at Cloudera, where he primarily contributes to open source distributed systems in the Apache Hadoop ecosystem. Previously, he focused on Apache HBase, HDFS, and MapReduce, where he designed and implemented redundant metadata storage for the NameNode (QuorumJournalManager), ZooKeeper-based automatic... Read More.
Zack Lipton is a graduate student in the Artificial Intelligence group at the University of California, San Diego. He works on the theory and application of machine learning, particularly deep learning and multilabel classification, and develops algorithms to exploit sparsity, enabling the efficient training of... Read More.
Darren Lo is currently a lead engineer on Cloudera Manager. He previously worked on the Model Repository Server at Informatica.
Bill Loconzolo is the vice president of Intuit’s Data Engineering and Analytics team, where he leads the development of Intuit’s central big data platform, which leverages the power of the collective data of 45 million Intuit customers. The platform creates unique data-driven insights and product... Read More.
David Loftesness has been a software engineer and manager at a range of tech companies, including Amazon, Twitter, Xmarks, and Geoworks, each with its own unique strengths and challenges. David is currently taking time to share what he’s learned through talks and blog posts before... Read More.
Michael Lopp is a Silicon Valley-based engineering leader who builds both teams and software at companies such as Borland, Netscape, Palantir, and Apple. Michael has written two books. His first book, Managing Humans, a popular guide to the art of engineering leadership, clearly explains that... Read More.
Ben Lorica is the chief data scientist at O’Reilly Media. Ben has applied business intelligence, data mining, machine learning, and statistical analysis in a variety of settings, including direct marketing, consumer and market research, targeted advertising, text mining, and financial engineering. His background includes stints... Read More.
Michael Ludden is an IBMer in developer relations at Watson. Previously, Michael was developer marketing manager lead at Google, head of developer marketing at Samsung, a developer evangelist at HTC, and global director of developer relations at startups Quixey and Nexmo and was involved... Read More.
Roger Magoulas is the research director at O’Reilly Media and chair of the Strata + Hadoop World conferences. Roger and his team build the analysis infrastructure and provide analytic services and insights on technology-adoption trends to business decision makers at O’Reilly and beyond. He and... Read More.
Seshadri Mahalingam is a software engineer at Trifacta, where, in addition to building out Wrangle, Trifacta’s domain-specific language for expressing data transformation, he develops the low-latency compute framework that powers Trifacta’s fluid and immersive data wrangling experience. Seshadri holds a BS in EECS from... Read More.
Ted Malaska is a senior solution architect at Blizzard. Previously, he was a principal solutions architect at Cloudera. Ted has 18 years of professional experience working for startups, the US government, some of the world’s largest banks, commercial firms, bio firms, retail firms, hardware appliance... Read More.
James Malone is a product manager for Google Cloud Platform and manages Cloud Dataproc and Apache Beam (incubating). Previously, James worked at Disney and Amazon. James is a big fan of open source software because it shows what is possible when people come together to... Read More.
Vikash Mansinghka is a research scientist at MIT, where he leads the Probabilistic Computing Project, as well as a cofounder of Empirical Systems, a new venture-backed AI startup aimed at improving the credibility and transparency of statistical inference. Previously, Vikash cofounded a venture-backed startup... Read More.
Keith is the CTO focused on Analytics for Dell EMC. He brings more than 24 years of Identity Fraud Analytics experience, alternative and traditional data architectures experience, and Financial Systems and Analytics experience. Keith is an advisory board member of the University of... Read More.
As the CEO of Silicon Valley Data Science, Sanjay Mathur has brought together a team of world-class data scientists and engineers to help companies become more data driven. Previously, Sanjay was SVP of product management for LiveOps, where he was responsible for LiveOps’s... Read More.
Drew Mattison is a connector, facilitator, communicator, rationalizer, strategist, and advocate who helps clients get things done. For the last 20 years, he has worked where business and strategy intersect with design and communications. Drew is responsible for ensuring XPLANE teams exceed expectations and... Read More.
Patrick McFadin is one of the leading experts in Apache Cassandra and data-modeling techniques. As a consultant and the chief evangelist for Apache Cassandra at DataStax, Patrick has helped build some of the largest and most exciting deployments in production. Prior to DataStax, he was... Read More.
Pat McGarry brings extensive technology and leadership experience in hardware and software engineering to his role as vice president of engineering at Ryft. Pat joined Ryft from Ixia Communications, where he was responsible for federal security systems engineering programs. During his tenure at Ixia and... Read More.
Emma McGrattan is SVP of engineering at Actian, where she leads the Actian Vector, Actian Vector Hadoop Edition, and Actian Matrix development teams. A leading authority in DBMS technologies, Emma has over 20 years’ experience managing, supporting, and developing a variety of databases,... Read More.
Denise McInerney is a data professional with over 16 years of experience. Denise began her career as a database administrator, managing and developing databases for online transactional systems. She now works as a data architect at Intuit, where she designs and implements BI and analytics... Read More.
Wes McKinney is a software architect at Two Sigma Investments. He is the creator of Python’s pandas library, and he is a PMC member for Apache Arrow and Apache Parquet. He wrote the book Python for Data Analysis. Previously, Wes worked for Cloudera, and... Read More.
Eric McNulty helps leaders and organizations create long-term value and increase their positive impact on the full range of stakeholders. Eric is a writer, speaker and conversation catalyst, teacher, and advisor and holds an appointment as director of research and professional programs at the National... Read More.
Stephen Merity is a senior research scientist at MetaMind, part of Salesforce Research, where he works on researching and implementing deep learning models for vision and text, with a focus on memory networks and neural attention mechanisms for computer vision and natural language processing... Read More.
Leo Meyerovich cofounded Graphistry, Inc. to scale visual graph analysis (think exploring security alerts) by connecting browsers to GPU clusters. Graphistry builds upon the founding team’s work at UC Berkeley on the first parallel web browser and Superconductor, a declarative GPU-accelerated data visualization... Read More.
Claire Mitchell is a product experience designer at Temboo in NYC. With experience ranging from creative strategy to design, Claire has designed interfaces that show the potential for the future, developed conceptual pitches for award-winning commercials, and built and managed teams that can effectively... Read More.
I’m a neurosurgery resident at Stanford. I have doctorates in physics, medicine, and neuroscience. Surgically, I am focused on epilepsy, brain tumors, and deep brain stimulation. My research focuses on neural engineering, human electrophysiology, and imaging in neurosurgery. I’m enthusiastic about how machine learning and... Read More.
Donald Miner is the founder of the data science firm Miner & Kasch, where he specializes in Hadoop enterprise architecture and applying machine learning to real-world business problems. Donald is author of MapReduce Design Patterns and the forthcoming Enterprise Hadoop, both published by O’Reilly Media.... Read More.
As an executive at mobile pioneers such as Facebook, Trulia, and Nokia, SC Moatti has launched and monetized mobile products that are used by billions of people and have received prestigious awards, including an Emmy nomination. Currently, SC runs Products That Count, a company that... Read More.
Prat Moghe is the founder and CEO of Cazena. Prat is a successful big data entrepreneur with nearly 20 years of experience inventing next-generation products and building strong teams in the technology sector. Prior to founding Cazena, as SVP of strategy, products, and... Read More.
Rajat Monga leads TensorFlow, an open source machine-learning library and the center of Google’s efforts at scaling up deep learning. He is one of the founding members of the Google Brain team and is interested in pushing machine-learning research forward toward general AI. Previously, Rajat... Read More.
Aurelia Moser is a developer and curious cartographer building communities around code at Mozilla Open Science. Previously of Ushahidi, Internews Kenya, and CartoDB, she’s been working in the open tech and nonprofit science space for a few years. Recent projects include mapping sensor data to... Read More.
Conrad Mulcahy is an associate managing director and director of data analytics in K2 Intelligence’s New York Office. In his time at K2 Intelligence, Conrad has conducted numerous investigations targeting risk, fraud, corruption, anti-money laundering, and bankruptcy for clients such as law firms, government agencies,... Read More.
Sean Patrick Murphy serves as the chief data scientist for PingThings, an Industrial Internet of Things (IIoT) startup bringing advanced data science and machine learning to the nation’s electric grid. He also advises several startups and provides learning-analytics consulting for EverFi. Previously, he served as... Read More.
Justin Murray is a technical product marketing manager in big data at VMware, where he works with VMware’s customers and field engineering to create guidelines and best practices for using virtualization technology for big data. He has spoken at a variety of conferences on these... Read More.
Jacques Nadeau is the cofounder and CTO of Dremio. He is also the founding PMC chair of the open source Apache Drill project, spearheading the project’s technology and community. Previously, Jacques was the architect and engineering manager for Drill and other distributed systems... Read More.
As a catalyst of systems change, Nina Narelle brings over 15 years of experience in organizational design and systems thinking to inform her work leading organizational transformation. Nina helps groups dream big about their future state and emerge with stronger relationships and clear agreements for... Read More.
Neha Narkhede is the cofounder and head of engineering at Confluent, a company backing the popular Apache Kafka messaging system. Prior to founding Confluent, Neha led streams infrastructure at LinkedIn, where she was responsible for LinkedIn’s petabyte-scale streaming infrastructure built on top of Apache Kafka... Read More.
Tony Ng is a director of engineering at eBay, where he leads the User Behavior Analytics, Experimentation, and Marketing Platform products. Tony is involved in building eBay’s core platforms and services, including cloud, big data analytics, real-time streaming, web services, and messaging systems. Prior to... Read More.
Christopher Nguyen is CEO and cofounder of Arimo (née Adatao), the leader in collaborative, predictive intelligence for enterprises. Previously, Christopher served as engineering director of Google Apps and cofounded two successful startups. As a professor, he also cofounded the computer engineering program at
Robert Nishihara is a fourth-year PhD student working in the UC Berkeley RISELab with Michael Jordan. He works on machine learning, optimization, and artificial intelligence.
Alex Nisnevich is a data scientist at Bayes Impact. Previously, he worked on machine-learning pipelines at Workday and built natural language interfaces for databases at UPSHOT. He received his MS in NLP at UC Berkeley.
Jack Norris is the senior vice president of data and applications at MapR Technologies. Jack has a wide range of demonstrated successes, from defining new markets for small companies to increasing sales of new products for large public companies, in his 20 years spent in... Read More.
Kevin O’Dell currently works as a field engineer for Rocana, helping companies take IT operations to the next level, and has been an HBase contributor since 2012. Kevin regularly works to architect, size, and deploy big data applications across a wide variety of verticals in... Read More.
A leading expert on big data architecture and Hadoop, Stephen O’Sullivan has 20 years of experience creating scalable, high-availability data and applications solutions. A veteran of WalmartLabs, Sun, and Yahoo, Stephen leads data architecture and infrastructure at Silicon Valley Data Science.
Amy O’Connor is a big data evangelist and telecommunications specialist at Cloudera, the leading big data vendor. She advises customers globally as they introduce big data solutions and adopt enterprise-wide big data delivery capabilities. Amy was recently named one of Information Management’s 10 Big Data... Read More.
As CEO of Continuum Analytics, Travis Oliphant engages customers in all industries, develops business strategy, and helps guide the technical direction of the company. Travis actively contributes to software development and engages with the wider open source community in the Python ecosystem. He has... Read More.
With a background in computer engineering and visual analytics, Silvia Oliveros has worked on several projects helping clients explore and analyze their data. Silvia is interested in building and optimizing the infrastructure and data pipelines used to gather insights from various datasets.
Matt Olson is a principal network architect at CenturyLink. Matt’s current focus is on big data analytics for SDN/NFV performance management with the aim of building automated feedback loops for adaptive intelligent network services. Matt has many years of experience leveraging data analytics,... Read More.
John Omernik is currently a data architect at Secureworks, where he helps build up systems to bridge the gap between data and security. John has been active in the banking industry; he began in systems architecture before moving to information security and finally to fraud... Read More.
Jerry Overton is a data scientist, distinguished engineer, and head of advanced analytics research at CSC. Jerry is also the chief data scientist for Industrial Machine Learning (a strategic alliance between CSC and Microsoft)—96 enterprise-scale applications across the banking and capital markets, energy... Read More.
Todd Palino is a site reliability engineer at LinkedIn tasked with keeping ZooKeeper, Kafka, and Samza deployments fed and watered. His days are spent, in part, developing monitoring systems and tools to make that job a breeze. Previously, Todd was a systems engineer at Verisign,... Read More.
Ganesan Pandurangan is a principal technology architect at Infosys Ltd. He has around 20 years of experience in building large-scale online, batch, and data warehouse systems. Ganesan is currently leading the development and implementation of Infosys’s big data platform, the Infosys Information Platform, and has... Read More.
Josh Patterson is the director of field engineering for Skymind. Previously, Josh ran a big data consultancy, worked as a principal solutions architect at Cloudera, and was an engineer at the Tennessee Valley Authority, where he was responsible for bringing Hadoop into the smart grid... Read More.
Joshua Patterson is the Director of Applied Solution Engineering at Nvidia and a former Presidential Innovation Fellow. His current passions are graph analytics, GPUs, and advanced visualization. Josh also loves storytelling with data, and some of his work can be seen at Hotshotcharts,
Robert Peglar is vice president of advanced storage solutions at Micron Technology. A 38-year industry veteran and published author, Robert leads efforts in advanced storage systems strategy, leads executive-level planning with key customers and partners worldwide for Micron’s Storage Business Unit, and defines future storage... Read More.
Paulo Pereira is the GE executive responsible for the technical aspects of data security and governance for GE Digital. In this capacity, Paulo leads the efforts around big data cloud infrastructure, governance, and security. Working with heavily regulated data from several GE businesses and clients, Paulo... Read More.
Don Perigo is the IT chief enterprise architect of GE Power Services, a $15B organization within GE Power. Power Services is a combination of two of the best service teams in the power industry—GE’s Power Generation Services and the former Alstom Thermal Services (acquired Q4... Read More.
Daniella Perlroth is chief data scientist at Lyra Health, a technology company transforming behavioral health with data and a human touch, where she is developing treatment and provider recommendations to help mental health patients get access to the best quality care. Prior to Lyra, Daniella... Read More.
Thomas Phelan is cofounder and chief architect of BlueData. Prior to BlueData, Tom was an early employee at VMware and as senior staff engineer was a key member of the ESX storage architecture team. During his 10-year stint at VMware, he designed and developed... Read More.
Sébastien Pierre is the director of FFunction, an award-winning data visualization studio. He has worked with clients such as HP, National Geographic, the Bill & Melinda Gates Foundation, Edelman, and many other high-profile organizations. Trained both as a software engineer and a designer, Sébastien regularly... Read More.
Jeff Pohlmann is the vice president of NA big data at Oracle. Jeff has more than 30 years of leadership and management experience with over 15 years of it managing and consulting with Fortune 500 companies deploying analytical information solutions. Prior to joining Oracle, Jeff... Read More.
Jake Porway is the founder and executive director of DataKind, a nonprofit that harnesses the power of data science in the service of humanity. He is an alum of the New York Times R&D Lab and has worked at Google and Bell Labs. A recognized... Read More.
Timothy Potter is a senior member of the engineering team at Lucidworks, a committer on the Apache Solr project, and the coauthor of Solr In Action, a comprehensive guide to using Solr 4. Tim focuses on scalability and hardening the distributed features in Solr. Previously,... Read More.
Thirty-two years ago, Paula Poundstone climbed on a Greyhound bus and traveled across the country—stopping in at open mic nights at comedy clubs as she went. Today, she is one of our country’s foremost humorists. You can hear her through your laughter as a regular... Read More.
Prabhat leads the Data and Analytics Services team at NERSC. His current research interests include scientific data management, parallel I/O, high-performance computing, and scientific visualization. He is also interested in applied statistics, machine learning, computer graphics, and computer vision. Prabhat received an ScM in... Read More.
Nitin Prabhu has been with Transamerica for over a decade in IT roles. He now serves as manager for strategy and architecture.
Peter Prettenhofer is a data scientist and software engineer at DataRobot. He is a contributor to scikit-learn, where he coauthored a number of modules such as Gradient Boosted Regression Trees, Stochastic Gradient Descent, and Decision Trees. Peter studied computer science at Graz University of Technology,... Read More.
As the executive director at the Human Rights Data Analysis Group, Megan Price designs strategies and methods for statistical analysis of human rights data for projects in a variety of locations including Guatemala, Colombia, and Syria. Megan’s work in Guatemala includes serving as the lead... Read More.
Richard Probst is VP of infrastructure technology strategy at SAP and is currently focused on working with SAP partners on innovative cloud architectures for SAP application landscapes to help SAP customers become more agile.
Yvonne Quacken is a senior big data architect and engineer at Siemens. In her role as BI and big data technology lead, Yvonne is responsible for platform and solution architecture, cloud integration, and DevOps for BI and big data. Yvonne has been working in this... Read More.
Rachel Quint is a fellow in the Global Development and Population Program at the Hewlett Foundation. Before joining the Hewlett Foundation, Rachel lived in Addis Ababa, Ethiopia, where she worked in the UN World Food Programme’s Africa office, serving as a liaison to the African... Read More.
Mohammad Quraishi is a senior principal technologist at Cigna with 20 years of experience in application architecture, design, and development. Mohammad has specific experience in mobile native applications, SOA platform implementation, web development, distributed applications, object-oriented analysis and design, requirements analysis, data modeling, and... Read More.
Phillip Radley is chief data architect on BT’s core Enterprise Architecture team, where he is responsible for data architecture across BT Group Plc. Based at BT’s Adastral Park campus in the UK, Phill currently leads BT’s MDM and big data initiatives, driving associated strategic... Read More.
Siva Raghupathy leads the Americas Big Data Solutions Architecture team at AWS, where he guides developers and architects in building successful big data solutions on AWS. Previously, as a principal technical program manager for AWS Database Service, Siva gathered emerging NoSQL requirements... Read More.
Karthik Ramasamy is the engineering manager and technical lead for real-time analytics at Twitter. Karthik is the cocreator of Heron and has more than two decades of experience working in parallel databases, big data infrastructure, and networking. He cofounded Locomatix, a company that specializes in... Read More.
Jun Rao is the cofounder of Confluent, a company that provides a stream data platform on top of Apache Kafka. Previously, Jun was a senior staff engineer at LinkedIn, where he led the development of Kafka, and a researcher at IBM’s Almaden research data center,... Read More.
Naveen Rao is cofounder and CEO of Nervana, where he brings together engineering disciplines and neural computational paradigms to build state-of-the-art technology that makes machines smarter. Naveen’s fascination with computation in synthetic and neural systems began when, at nine years old, he began learning... Read More.
Chris Rawles is a data scientist at Pivotal, where he works with customers across a variety of domains, building models to derive insight and business value from their data. Prior to joining Pivotal, Chris worked in both the oil and gas and alternative energy industries,... Read More.
Atish Ray has more than 15 years of technology and management experience in application architecture and delivery. He has broad experience in planning, estimation, design, integration and implementation of tiered web-centric applications. Based in the Washington DC metro area, he is the Data Engineering Lead... Read More.
Tom Reilly is the CEO of Cloudera. Tom has had a distinguished 30-year career in the enterprise software market. Previously, Tom was vice president and general manager of enterprise security at HP; CEO of enterprise security company ArcSight, where he led the company... Read More.
Dieter Reuther is a leadership consultant who focuses on people, process, and technology. He helps organizations balance creative chaos with structure to bring out the best in teams and individuals. As a strong believer in the power positive leadership can have on people’s motivation, performance,... Read More.
Evan Richards is a member of the Hadoop Compute Platform team at Uber, where he works as the tech lead for the schemas and schema management projects. Previously, Evan interned with Zappos, helping monitor the migration of their catalog from their proprietary format and local... Read More.
Katrina Riehl is a senior data scientist at Continuum Analytics, where she leads the Memex team. Over the last decade, Katrina has worked extensively in the fields of scientific computing, machine learning, data mining, and visualization. Most notably, she worked at Enthought, the signal and... Read More.
Travis Ringger is a manager in PwC’s Risk & Compliance Systems and Analytics practice. Travis specializes in designing and delivering analytical solutions that provide better awareness and understanding of compliance and business risk, particularly solutions incorporating unstructured data and natural language processing. He has deep... Read More.
Cody Rioux is a senior analytics engineer at Netflix working in the real-time analytics space to design fully autonomous systems that support availability and reliability in the Netflix cloud environment on Amazon Web Services. Cody is passionate about using stream processing, functional programming, and Bayesian... Read More.
Julie Rodriguez is associate creative director at Sapient Global Markets. Julie is an experience designer focusing on user research, analysis, and design for complex systems. Julie has patented her work in data visualizations for MATLAB, compiled a data visualization pattern library, and publishes... Read More.
Monica Rogati is an independent data science executive and advisor who has built key data products and teams at Jawbone and LinkedIn; she now helps startups make the most out of their data. As the VP of data at Jawbone, Monica built Jawbone’s data science... Read More.
Bob Rogers is chief data scientist for big data solutions at Intel, where he applies his experience solving problems with big data and analytics to help Intel build world-class customer solutions. Prior to joining Intel, Bob was cofounder and chief scientist at Apixio, a big... Read More.
Brandon Rohrer is a data scientist in Microsoft’s Azure Machine Learning group. He creates end-to-end data science solutions for external customers and supports the development of core algorithms and functionality in Azure ML. Brandon obtained his data science skills working in a variety of applications,... Read More.
Irene Ros is the director of data visualization at Bocoup and the program chair of OpenVis Conf, a two-day conference on data visualization on the open Web. Irene is an information visualization researcher and developer, making engaging, informative, and interactive data-driven stories,... Read More.
Alan Ross is a senior principal engineer and chief cloud security architect at Intel. Alan has more than 20 years of information security experience in various capacities, from policy and awareness and security/risk analysis to engineering and architecture. Previously, Alan worked as a security administrator... Read More.
Laurel Ruma is the director of talent for O’Reilly Media. Most recently, Laurel cochaired Where 2.0, OSCON Java, and Gov 2.0 Expo. She joined O’Reilly after working for five years at various IT analyst firms in the Boston area. Laurel is the coeditor of... Read More.
Sandy Ryza is a senior data scientist at Clover Health. He was previously at Cloudera doing engineering and data science. He is an author of O’Reilly’s Advanced Analytics with Spark, as well as a Spark committer and member of the Hadoop project management committee. He... Read More.
Mohan Sadashiva is VP of products at Waterline Data, where he leverages his extensive experience in managing large-scale software products and cloud services to drive new innovations in big data. Previously, Mohan was the SVP of products and business development at Narus, a cybersecurity... Read More.
Neelesh Srinivas Salian is a software engineer on the Data Platform team at Stitch Fix, where he works closely with the Apache Spark ecosystem as part of the infrastructure group. Previously, he worked at Cloudera where he was working with Apache projects like YARN,... Read More.
Chris Sanden is a senior analytics engineer at Netflix with a focus on real-time analytics and machine learning. He is part of the Insight Engineering team responsible for building systems that allow everyone at Netflix visibility into the state of the cloud environment. Chris is... Read More.
Majken Sander is a data nerd, business analyst, and solution architect at TimeXtender. Majken has worked with IT, management information, analytics, BI, and DW for 20+ years. Armed with strong analytical expertise, she is keen on “data driven” as a business principle, data science, the... Read More.
Krishna Sankar is a consulting data scientist working on retail analytics, social media data science, and forays into deep learning, as well as codeveloping the DeepLearnR package interfacing R over TensorFlow/Skflow. Previously, Krishna was a chief data scientist at Blackarrow.tv, where he focused on... Read More.
Kaz Sato is a staff developer advocate on the Cloud Platform team at Google, where he leads the developer advocacy team for machine-learning and data analytics products such as TensorFlow, the Vision API, and BigQuery. Kaz has been leading and supporting developer communities for... Read More.
Bill Schmarzo is responsible for setting the strategy and defining the service line offerings and capabilities for the EMC Consulting Enterprise Information Management and Analytics service line. Bill has more than two decades of experience in data warehousing, BI, and analytic applications. Bill has... Read More.
Andreas Schmidt is a product manager at Blue Yonder, a leading European company for predictive applications in retail. Previously, he was a senior data scientist there for several years, designing and implementing applications such as replenishment optimization for fresh and perishable goods. During his PhD... Read More.
Jim Scott is the director of enterprise strategy and architecture at MapR Technologies, Inc. Across his career, Jim has held positions running operations, engineering, architecture, and QA teams in the consumer packaged goods, digital advertising, digital mapping, chemical, and pharmaceutical industries. Jim has built systems... Read More.
Kim Malone Scott is an advisor at Dropbox, Kurbo, Qualtrics, Rolltape, Shyp, Twitter, and several Silicon Valley startups. Kim was a member of the faculty at Apple University and before that led AdSense, YouTube, and DoubleClick online sales and operations at Google. Known for her... Read More.
Currently the CTO of Robin Systems, Partha Seetala has more than 16 years of technology and product expertise. Previously, Partha was a distinguished engineer and senior director of engineering at Veritas, Symantec’s information management business, where he conceived, architected, and led engineering teams to... Read More.
Jonathan Seidman is a software engineer on the Partner Engineering team at Cloudera. Previously, he was a lead engineer on the Big Data team at Orbitz Worldwide, helping to build out the Hadoop clusters supporting the data storage and analysis needs of one of the... Read More.
Debora Seys works on delivering a trusted self-service data experience at eBay. She’s been helping users help themselves to find, use, and collaborate with information and data for 15+ years. Prior to her current role, Deb drove search and taxonomy technology capabilities at Kaiser Permanente... Read More.
Hiren Shah is currently a principal program manager in Microsoft’s Cortana Analytics group, where he focuses on big data analytics and data science. Over the last seven years, Hiren has worked on a variety of big data technologies in Bing and Azure. Hiren has a... Read More.
Abin Shahab is a senior software engineer at Altiscale as well as a contributor to Docker and LXC. Abin’s work at Altiscale is focused on multitenant Hadoop clusters using Docker containers. Prior to joining Altiscale, Abin worked on graph databases and search engines at Guidewire, Symantec, and Vivisimo... Read More.
Gwen Shapira is a system architect at Confluent, where she helps customers achieve success with their Apache Kafka implementation. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. Gwen currently specializes in... Read More.
Ben Sharma is CEO and cofounder of Zaloni. Ben is a passionate technologist with experience in solutions architecture and service delivery of big data, analytics, and enterprise infrastructure solutions. With previous experience in technology leadership positions for NetApp, Fujitsu, and others, Ben’s expertise ranges... Read More.
Chang She is a software engineer at Cloudera currently working on metadata management tools for Hadoop. Prior to joining Cloudera, Chang was cofounder and CTO of DataPad, a next-gen BI/analytics company. An early core contributor to the pandas library, Chang’s passion is creating data... Read More.
Jayant Shekhar is the founder of Sparkflows Inc., which enables machine learning on large datasets using Spark ML and intelligent workflows. Jayant focuses on Spark, streaming, and machine learning and is a contributor to Spark. Previously, Jayant was a principal solutions architect at Cloudera working... Read More.
Jeff Shmain is a principal solutions architect at Cloudera. He has 16+ years of financial industry experience with a strong understanding of security trading, risk, and regulations. Over the last few years, Jeff has worked on various use-case implementations at 8 out of 10 of... Read More.
Alex Silva is a chief data architect at Pluralsight, where he leads the development of the company’s data infrastructure and services. He has been instrumental in establishing Pluralsight’s data initiative by architecting a platform that is used to capture valuable insights on real-time video analytics... Read More.
Jiri Simsa is a software engineer at Alluxio, Inc., where he is one of the maintainers and top contributors of the Alluxio open source project. Before joining Alluxio, Inc., Jiri was a software engineer at Google, working on yet another distributed applications framework. He earned... Read More.
Sumeet Singh is a senior director of product management for cloud and big data platforms at Yahoo. In his current role, he leads the Hadoop products team responsible for both Apache open source contributions and Yahoo projects. Sumeet is responsible for introducing several new multitenant... Read More.
Vartika Singh is a solutions architect at Cloudera with over 10 years of experience applying machine-learning techniques to big data problems.
Joseph Sirosh is the corporate vice president of Microsoft’s Data group, leading the database, big data, and machine-learning products, as well as a talented team of engineers, data scientists, and product leaders who are developing tools and services to transform data at scale into actionable... Read More.
Ram Shankar is a security data wrangler in Azure Security Data Science, where he works on the intersection of ML and security. Ram’s work at Microsoft includes a slew of patents in the large intrusion detection space (called “fundamental and groundbreaking” by evaluators). In addition,... Read More.
Paul Soldera is currently head of strategy at Equation Research, a full-service market-research company focused on helping clients design, execute, and internalize data and insights from customer- and consumer-focused surveys. Paul also acts as an advisor to other companies looking to grow internal research and... Read More.
Jean-Marc Spaggiari is a Cloudera senior solution architect with many years’ experience as a big data architect, specializing in HBase solutions. An active HBase contributor, Jean-Marc has contributed more than 50 patches to the community and participates in all release testing. Prior to Cloudera, Jean-Marc... Read More.
Ben Spivey is a principal solutions architect at Cloudera providing consulting services for large financial-services customers. Ben specializes in Hadoop security and operations. He is the coauthor of Hadoop Security from O’Reilly Media (2015).
Vikram Sreekanti is a software engineer working on research in the AMPLab at UC Berkeley. A graduate of Berkeley’s computer science department, he has served as a teaching assistant at Berkeley and an intern at Cloudera and Yammer.
Krishna Sridhar is a data scientist at Dato. He holds a PhD in computer science from the University of Wisconsin-Madison, where he worked on high-performance software for large-scale problems in mathematical optimization and data analysis. Krishna’s work has been used in applications such as healthcare,... Read More.
Vikram Srivastava is a software engineer at Cloudera.
Jeremy is currently the VP of data science at Instacart, where he works closely with data scientists who are integrated into product teams to drive growth and profitability through logistics, catalog, search, consumer, shopper, and partner applications. Previously, Jeremy was chief data scientist and
Sandy Steier is the cofounder and CEO of 1010data. With more than a quarter century of industry experience, Sandy is recognized as an innovator behind the adoption of advanced analytic technologies by financial services institutions. Before cofounding 1010data, Sandy was a vice president and... Read More.
As community strategist at Age of Peers, Louis Suarez-Potts strategizes the formation of and manages productive commons-based peer networks (open source communities). Louis helps communities consolidate good work, good connections, and good intentions into a force held in common, producing something all can look at... Read More.
Anand Subbaraj is a principal program manager in the Microsoft Information Management & Machine Learning division. Anand has over 12 years of experience in the IT industry delivering products and services that solve challenging business problems and delight customers. Anand currently specializes in big data... Read More.
Brian Suda is a master informatician currently residing in Reykjavík, Iceland. Since first logging on in the mid-’90s, he has spent a good portion of each day connected to the internet. When he is not hacking on microformats or writing about web technologies, he enjoys... Read More.
Adam Sugano serves as the head of predictive modeling and advanced analytics at Autodesk, where he leads a team of both internal and external data scientists charged with delivering innovative, actionable data-driven solutions that help empower Autodesk’s customer-retention and engagement-optimization efforts across the customer lifecycle.... Read More.
Roshan Sumbaly currently leads the Content Experience and Teaching team at Coursera. Prior to that he worked at LinkedIn, where he led the Data Platform team responsible for serving all social feeds and gestures across LinkedIn. He also worked on various data-mining-based products, while also... Read More.
Chao Sun is currently a software engineer at Cloudera working on the RecordService project. Before that, Chao worked on the Hive on Spark project. He holds a PhD in computer science from the University of Wisconsin-Milwaukee, where he focused on type systems and programming languages.... Read More.
Jagane Sundar is the CTO at WANdisco. Jagane has extensive big data, cloud, virtualization, and networking experience. He joined WANdisco through its acquisition of AltoStor, a Hadoop-as-a-service platform company. Previously, Jagane was founder and CEO of AltoScale, a Hadoop- and HBase-as-a-platform company acquired... Read More.
David Taieb is the STSM for the Cloud Data Services Developer Advocacy team at IBM, leading a team of avid technologists with the mission of educating developers on the art of the possible with cloud technologies. Previously, David was the lead architect for the... Read More.
Roopa Tangirala is an experienced engineering leader with extensive background in databases, be they distributed or relational. She manages the database engineering team at Netflix responsible for operating cloud persistent and semipersistent run-time stores for Netflix, which includes Cassandra, Elasticsearch, and MySQL databases, by ensuring... Read More.
Piotr Teterwak works on the toolkit development team at Dato. He received a BA in computer science from Dartmouth College, where he conducted work exploring the learning of convolutional deep neural nets with applications in computer vision.
Arun Thangamani is a software architect for CDK Global (formerly ADP Dealer Services), where he helped lay the foundation for the Open BI Platform (a big-data initiative), which provides integrated value to CDK Global customers. Before CDK, Arun spent about a... Read More.
Robin Thottungal is the EPA’s first chief data scientist focused on creating and implementing an agency-wide vision on analytics for effective decision making. Prior to joining the EPA, Robin was at Deloitte Consulting, where he focused on selling and delivering large-scale analytics projects for... Read More.
Kathleen Ting is currently a technical account manager at Cloudera, where she helps strategic customers deploy and use the Hadoop ecosystem in production. Kathleen has spoken on Hadoop, ZooKeeper, and Sqoop at many big data conferences, including Hadoop World, ApacheCon, and OSCON. She’s contributed... Read More.
Sravya Tirukkovalur is a software engineer at Cloudera focusing on Hadoop security, specifically working on authorization. Sravya is one of the core contributors of Apache Sentry. She is also a committer and a PPMC member of the project driving the Apache community. Sravya has... Read More.
Steven Totman is the financial services industry lead for Cloudera’s Field Technology Office, where he helps companies monetize their big data assets using Cloudera’s Enterprise Data Hub. Prior to Cloudera, Steve ran strategy for a mainframe-to-Hadoop company and drove product strategy at IBM for... Read More.
Anh Trinh is a software architect at Arimo (née Adatao), where he coauthored three patent-pending inventions: the Distributed Data Framework for Data Analytics, Collaboration using Shared Documents for Processing Distributed Data, and Multi-language Support for Interfacing with Distributed Data. He is also a coauthor of... Read More.
Eric Tschetter is the creator and one of the main contributors to Druid, an open source, real-time analytical data store. Eric is currently a distinguished engineer at Yahoo, where he works on speeding up analytics with a mix of data science and traditional BI. Eric... Read More.
Daniel Tunkelang is a data science and engineering executive who has built and led some of the strongest teams in the software industry. He was a founding employee and chief scientist of Endeca, a search pioneer that Oracle acquired for $1.1B. He led a local... Read More.
Joseph Turian is currently a principal engineer at Workday. He headed the machine-learning consultancy MetaOptimize LLC and founded the startup UPSHOT (acquired by Workday), which allowed users to query enterprise data from a mobile device using natural language. Joseph holds a PhD in... Read More.
Nick Turner has made a career in data that spans more than 25 years. Since 2013, Nick has led the Enterprise Data team at Markerstudy, where he oversees the award-winning Big Data Insights project and is responsible for the collection, analysis, and visualization of hundreds... Read More.
Kostas Tzoumas is a PMC member of the Apache Flink project and cofounder of data Artisans, the company founded by the original development team that created Flink. Kostas has spoken extensively about Flink, including at Hadoop Summit San Jose 2015.
Alexander Ulanov is a senior researcher at Hewlett Packard Labs, where he focuses his research on machine learning on a large scale. Currently, Alexander works on deep learning and graphical models. He has made several contributions to Apache Spark; in particular, he implemented the multilayer... Read More.
Matt van Adelsberg is chief data scientist at CACI, where he is responsible for managing the development of advanced, scalable solutions to complex data-analytics problems from small to big data regimes. Matt’s data science team provides end-to-end solutions to support customers throughout the commercial... Read More.
Bryan Van de Ven is a software engineer at Continuum Analytics. Previously, Bryan worked at the Applied Research Labs, developing software for sonar feature detection and classification systems on US Naval submarine platforms, and Enthought, where he worked on problems in financial risk modeling and... Read More.
Jake Vanderplas is the director of research in the physical sciences at the University of Washington’s eScience Institute, where his research is primarily in the area of data-driven astronomy and astrophysics. In addition, Jake is a maintainer and/or frequent contributor to many open source Python... Read More.
Krishnan Venkata is the director for the US West Coast at LatentView Analytics, where he’s responsible for sales leadership and relationship management for LatentView’s clients, especially in the technology sector. Krishnan has over 11 years of experience in global IT services delivery in the US,... Read More.
Mythili Venkatakrishnan is an IBM senior technical staff member and is the z Systems architecture and technology lead. Mythili has been with IBM for 25 years, all in the mainframe environment working with clients in various capacities. Her focus areas have been diverse... Read More.
Pratik Verma is the founder and chief product officer at BlueTalon. Pratik founded BlueTalon to accelerate big data deployments and remove security as a barrier to adoption. Previously, he led AgeTak, a healthcare startup built on technologies created by Rakesh Verma. He is an angel... Read More.
Amit Walia is the executive vice president and chief product officer at Informatica, where he is responsible for product development, product management, product marketing, and engineering. Previously, Amit was the senior vice president and general manager for Informatica’s Data Integration and Data Security business unit.... Read More.
Laura Waller is an assistant professor at UC Berkeley in the Department of Electrical Engineering and Computer Sciences (EECS) and a senior fellow at the Berkeley Institute of Data Science (BIDS), with affiliations in Bioengineering and Applied Sciences & Technology. Previously, Laura was... Read More.
Guozhang is an engineer at Confluent, building a stream data platform on top of Apache Kafka. Prior to Confluent, Guozhang was a senior software engineer at LinkedIn, developing and maintaining its backbone streaming infrastructure on Apache Kafka and Apache Samza. He holds a PhD... Read More.
Haojun Wang is a tech lead on Baidu’s US autonomous driving car team. Currently, Haojun is driving the in-car computing platform and offline data platform. Prior to Baidu, he worked at the IBM Silicon Valley Lab, focusing on database core development and big data... Read More.
Wei Wang is the senior director of product marketing at Hortonworks, where she serves as the primary leadership force behind strategic marketing execution, with a focus on boosting Hortonworks Data Platform market expansion and revenue generation globally. Wei is an accomplished international marketing executive with... Read More.
Daniel Weeks manages the Big Data Compute team at Netflix and is a Parquet committer. Prior to joining Netflix, Daniel focused on research in big data solutions and distributed systems.
Director managing development of global integrated marketing solutions, processes, and technologies for Dell marketing units. Marketing thought leader for researching emerging technology, solutions, and business development opportunities across worldwide groups.
Dave Wells is actively involved in information management and business management, especially at their intersection. Dave is a consultant and educator dedicated to building meaningful connections throughout the path from data to business value. Knowledge sharing and skills development are Dave’s passions, carried out through... Read More.
Mike Wendt is a principal engineer at Accenture Cyber Security Labs in Washington, DC. Since joining Accenture Labs, Mike has led engineering work on big data technologies like Hadoop, Datastax Cassandra, Storm, Spark, and others. In addition to his research on optimal Hadoop deployments, Mike... Read More.
Timoni West leads design for Unity Labs, focusing on new game development and creation tools in VR. Previously, Timoni was SVP of design at Alphaworks, a new startup helping to democratize small business funding, and cofounder and creative director of Recollect, a social backup... Read More.
Cack Wilhelm is a principal at Scale Venture Partners, where she focuses on investments in early-stage software companies, with an eye toward those helping businesses better utilize data, automate workflows, incorporate AI, and build more resilient software. Looking further ahead, Cack is watching closely as... Read More.
Roseanne Wincek joined IVP in March 2015. She focuses on investing in later-stage, high-growth consumer and enterprise companies. Roseanne actively works with IVP portfolio companies Compass and PopSugar. Roseanne was previously a principal at Canaan Partners, a leading early-stage venture firm, where she... Read More.
Christina Wodtke has led redesigns and initial product offerings for such companies as LinkedIn, Myspace, Zynga, Yahoo, Hot Studio, and eGreetings. Christina has founded two consulting startups, a product startup, and Boxes and Arrows, an online magazine of design. She also cofounded the Information Architecture... Read More.
Kristi Wolff is special counsel in Kelley Drye’s Washington, DC, office. Kristi’s practice focuses on food, dietary supplements, medical devices, and emerging health/wearable technology and privacy issues. Kristi has extensive experience advising clients whose products are within the overlapping jurisdictions of the Food and Drug... Read More.
Steve Wooledge is vice president of product marketing at MapR, where he is responsible for communicating the business value and technical advantages of MapR innovations and solutions for Hadoop. Steve was previously vice president of marketing for Teradata Unified Data Architecture, where he drove big... Read More.
Kristine Woolsey is the practice lead for creative environments at MAYA, a design and technology innovation consultancy. Kristi is well known as a behavioral strategist with years of speaking and research on the impact that the physical environment has on human behavior. She joined... Read More.
Ian Wrigley has taught tens of thousands of students over the last 25 years in subjects ranging from C programming to Hadoop development and administration. Ian is currently the director of education services at Confluent, where he heads the team building and delivering courses focused... Read More.
Jennifer Wu is director of product management for cloud at Cloudera, where she focuses on cloud strategy and solutions. Before joining Cloudera, Jennifer worked as a product line manager at VMware, working on the vSphere and Photon system management platforms.
Yinglian Xie is the CEO and cofounder of DataVisor, a startup in the area of big data analytics for security. Yinglian has been working in the area of internet security and privacy for over 10 years and has helped improve the security of billions... Read More.
Reynold Xin is a cofounder and chief architect at Databricks as well as an Apache Spark PMC member and release manager for Spark’s 2.0 release. Prior to Databricks, Reynold was pursuing a PhD at the UC Berkeley AMPLab, where he worked on large-scale data... Read More.
Caiming Xiong is a senior researcher at Metamind. Before that, he was a postdoctoral researcher in the Department of Statistics at the University of California, Los Angeles. Caiming holds a PhD in computer science and engineering from SUNY Buffalo and a BS and MS... Read More.
Fangjin Yang is a coauthor of the open source Druid project and a cofounder of Imply, a data analytics startup based in San Francisco. Previously, Fangjin held senior engineering positions at Metamarkets and Cisco Systems. Fangjin holds a BASc in electrical engineering and an MASc... Read More.
Chuck Yarbrough is the senior director of solutions marketing and management at Pentaho, a leading big data analytics company that helps organizations engineer big data connections, blend data, and report and visualize all of their data. Chuck is responsible for creating and driving Pentaho solutions... Read More.
Martin Yip is a product line marketing manager for VMware’s Cloud Platform business unit, where he oversees product marketing for a portfolio of products including vSphere, vSphere with Operations Management, and Big Data. Martin has been in the high technology industry for over 10 years... Read More.
Mike Yoder is a software engineer at Cloudera who has worked on a variety of Hadoop security features and internal security initiatives. Most recently, he implemented log redaction and the encryption of sensitive configuration values in Cloudera Manager. Prior to Cloudera, he was a security... Read More.
Jin Zhang is a passionate technology leader who is currently leading analytics at CA Technologies. Previously, Jin led Apigee to its IPO as their VP of engineering and was an engineering executive with IBM, where she was responsible for managing large teams as... Read More.
Owen Zhang is the chief product officer at DataRobot. Owen spent most of his career in the property and casualty insurance industry. Most recently, Owen served as vice president of modeling of the newly formed AIG Science team. After spending several years in IT... Read More.
Weidong Zhang is an engineering manager on the Data Analytics Infrastructure team at LinkedIn and leads the marketing and customer-service data warehouse vertical. Weidong has a passion for analytics, research, and data-driven decision making. He spent 10+ years in the data warehouse ETL and... Read More.
Yongzheng Zhang is a business analytics manager at LinkedIn and an active researcher and practitioner of text mining and machine learning. He has developed many practical and scalable solutions for utilizing unstructured data for ecommerce and social-networking applications, including search, merchandising, social commerce, and customer-service... Read More.
Alice Zheng manages the optimization team on Amazon’s Ad Platform. Alice specializes in research and development of machine-learning methods, tools, and applications. Outside of work, she is writing a book, Mastering Feature Engineering. Previously, Alice worked at GraphLab/Dato/Turi, where she led the machine-learning toolkits team... Read More.
As vice president of products at Trifacta, Wei Zheng combines her passion for technology with experience in enterprise software to define and shape Trifacta’s product offerings. Having founded several startups of her own, Wei believes strongly in innovative technology that solves real-world business problems. Most... Read More.
Shivon Zilis is a venture capitalist and founding member of Bloomberg Beta, where she focuses on early-stage data and machine-intelligence investments. Shivon has led 12 investments since launch. One, Newsle, was acquired by LinkedIn; others include Context Relevant, Alation, and InfluxDB. She recently released a... Read More.
Nina Zumel is cofounder and principal at Win-Vector LLC, a data science consultancy based in San Francisco. She frequently writes and speaks on statistics and machine learning. She is also the coauthor of the popular book Practical Data Science with R (Manning 2014).
©2016, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • email@example.com
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.