Get the free Ebook:
Private and Open Data in Asia: A Regional Guide.
Scikit-learn has emerged as one of the most popular open source machine learning toolkits, now widely used in academia and industry. Scikit-learn provides easy-to-use interfaces to perform advanced analysis and build powerful predictive models. The tutorial will cover basic concepts of machine learning, such as supervised and unsupervised learning, cross validation, and model selection. We will see how to prepare data for machine learning, and go from applying a single algorithm to building a machine learning pipeline. We will also cover how to build machine learning models on text data, and how to handle very large data sets.
Andreas Mueller received his PhD in machine learning from the University of Bonn. After working as a machine learning researcher on computer vision applications at Amazon for a year, he recently joined the Center for Data Science at New York University. In the last four years, he has been maintainer and one of the core contributors of scikit-learn, a machine learning toolkit widely used in industry and academia, and author and contributor to several other widely-used machine learning packages. His mission is to create open tools to lower the barrier of entry for machine learning applications, promote reproducible science, and democratize access to high-quality machine learning algorithms.
©2015, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • firstname.lastname@example.org
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.