Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Introduction to natural language processing with Python

Barbara Fusinska (Google)
9:00–12:30 Tuesday, 22 May 2018
Secondary topics: Text and Language processing and analysis
Average rating: 4.33 (3 ratings)

Who is this presentation for?

  • Software developers and data scientists

Prerequisite knowledge

  • Programming experience, preferably in Python

Materials or downloads needed in advance

  • A WiFi-enabled laptop with a modern browser installed
  • A Katacoda account

What you'll learn

  • Understand basic machine learning and NLP techniques
  • Learn how to build a bag-of-words text representation for text classification


Natural language processing techniques help address tasks like text classification, information extraction, and content generation. They can give the impression that machines understand humans and respond more naturally.

Barbara Fusinska offers an overview of natural language processing and walks you through building a bag-of-words representation with Python and its machine learning libraries, then using it for text classification. The same approach can be used to recognize the sentiment, category, or author of a document.
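
To make the idea concrete, here is a minimal sketch of a bag-of-words representation feeding a text classifier. It is not the session's material: it uses only the Python standard library rather than a machine learning library, the classifier (multinomial Naive Bayes with add-one smoothing) is an assumed choice, and all function names and the toy sentences are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(text, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocabulary)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = count
    return vector


def train_naive_bayes(vectors, labels):
    """Fit multinomial Naive Bayes with add-one (Laplace) smoothing."""
    n_features = len(vectors[0])
    model = {}
    for label in set(labels):
        rows = [v for v, y in zip(vectors, labels) if y == label]
        totals = [sum(col) for col in zip(*rows)]  # per-word counts in class
        denom = sum(totals) + n_features
        model[label] = {
            "log_prior": math.log(len(rows) / len(labels)),
            "log_likelihood": [math.log((t + 1) / denom) for t in totals],
        }
    return model


def predict(model, vector):
    """Pick the label with the highest log posterior score."""
    def score(label):
        params = model[label]
        return params["log_prior"] + sum(
            c * ll for c, ll in zip(vector, params["log_likelihood"])
        )
    return max(model, key=score)


# Toy, made-up training data (illustrative only).
train_texts = [
    "I loved this film, great acting",
    "What a great and wonderful story",
    "Terrible plot and awful acting",
    "I hated this boring, awful film",
]
train_labels = ["positive", "positive", "negative", "negative"]

vocabulary = build_vocabulary(train_texts)
train_vectors = [bag_of_words(t, vocabulary) for t in train_texts]
model = train_naive_bayes(train_vectors, train_labels)

prediction = predict(model, bag_of_words("a wonderful, great film", vocabulary))
print(prediction)
```

The representation discards word order and keeps only per-word counts, which is why the same pipeline can be retrained for sentiment, category, or author labels simply by changing the labels. In practice a library vectorizer and classifier would typically replace the hand-written functions.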

Barbara Fusinska


Barbara Fusinska is a machine learning strategic cloud engineering manager at Google with a strong software development background. Previously, she worked at companies including ABB, Base, Trainline, and Microsoft, where she gained experience building diverse software systems before focusing on data science and machine learning. Barbara believes in the importance of data and metrics when growing a successful business. In her free time, Barbara enjoys programming activities and collaborating around data architecture. She can be found on Twitter as @BasiaFusinska and blogs at