Ben Lorica hosts a conversation with Doug Cutting and Mike Cafarella, the cofounders of Apache Hadoop.
Ben Lorica is the chief data scientist at O’Reilly Media. Ben has applied business intelligence, data mining, machine learning, and statistical analysis in a variety of settings, including direct marketing, consumer and market research, targeted advertising, text mining, and financial engineering. His background includes stints with an investment management company, internet startups, and financial services.
Doug Cutting is the chief architect at Cloudera and the founder of numerous successful open source projects, including Lucene, Nutch, Avro, and Hadoop. Doug joined Cloudera from Yahoo, where he was a key member of the team that built and deployed a production Hadoop storage-and-analysis cluster for mission-critical business analytics. Doug holds a bachelor’s degree from Stanford University and sits on the board of the Apache Software Foundation.
Mike Cafarella is one of the cofounders of the Apache Hadoop and Nutch open source projects. Mike is also an assistant professor of computer science and engineering at the University of Michigan. His research interests include databases, information extraction, data integration, and data mining. Recently, he cofounded Lattice Data (http://lattice.io), a company that aims to transform “dark data,” such as unstructured text documents and reports, into high quality structured databases.
©2016, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • email@example.com
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.