Industrialized capsule networks for text analytics

Vijay Srinivas Agneeswaran (Walmart Labs), Abhishek Kumar (Publicis Sapient)

1:45pm–2:25pm Thursday, September 12, 2019

Location: 230 A

Implementing AI

Secondary topics: Deep Learning, Health and Medicine, Machine Learning, Text, Language, and Speech

Who is this presentation for?

Data scientists, data engineers, ML engineers, CxOs, and data architects

Level

Intermediate

Description

Vijay Agneeswaran and Abhishek Kumar explore multilabel text classification problems, where multiple tags or categories have to be associated with a given text or document. Multilabel text classification occurs in numerous real-world scenarios, for instance, in news categorization and in bioinformatics (such as the gene classification problem). The Kaggle dataset is representative of the problem.
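As a rough illustration of the multilabel setting described above, the sketch below encodes the tags attached to one document as a multi-hot vector, which is a common target format for multilabel classifiers. The label vocabulary, the sample tags, and the helper names `to_multi_hot` and `from_multi_hot` are hypothetical and are not taken from the talk or from any particular dataset.

```python
# Hypothetical label vocabulary, for illustration only.
LABELS = ["business", "health", "politics", "sports", "technology"]


def to_multi_hot(tags, labels=LABELS):
    """Encode a collection of tags as a 0/1 vector with one slot per label."""
    tag_set = set(tags)
    unknown = tag_set - set(labels)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if label in tag_set else 0 for label in labels]


def from_multi_hot(vector, labels=LABELS):
    """Decode a 0/1 vector back into the list of labels that are switched on."""
    return [label for label, bit in zip(labels, vector) if bit]


# A single document can carry several tags at once.
doc_tags = ["health", "technology"]
encoded = to_multi_hot(doc_tags)
print(encoded)                  # [0, 1, 0, 0, 1]
print(from_multi_hot(encoded))  # ['health', 'technology']
```

Unlike single-label classification, more than one position in the vector can be 1, so a model typically scores each label independently.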

Several other interesting problems in text analytics exist, such as abstractive summarization, sentiment analysis, search and information retrieval, entity resolution, document categorization, document clustering, machine translation, etc. Deep learning has been applied to solve many of the above problems, such as applying a convolutional network to make effective use of word order in text categorization. Recurrent neural networks (RNNs) have been effective in various tasks in text analytics. Significant progress has been achieved in language translation by modeling machine translation using an encoder-decoder approach with the encoder formed by a neural network.

However, certain cases require modeling the hierarchical relationships in text data, which is difficult to achieve with traditional deep learning networks because linguistic knowledge may have to be incorporated into these networks to achieve high accuracy. Moreover, deep learning networks don’t consider hierarchical relationships between local features, as the pooling operation of convolutional neural networks (CNNs) loses information about the hierarchical relationships.
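The information loss from pooling can be seen in a small sketch. It assumes global max pooling over token positions, one common pooling choice in text CNNs; the activation values and the helper name `max_pool_over_time` are made up for illustration. Two inputs whose features fire in opposite orders produce the same pooled vector, so the arrangement of the local features is no longer recoverable.

```python
def max_pool_over_time(feature_maps):
    """Global max pooling: keep each filter's largest activation across positions."""
    return [max(channel) for channel in feature_maps]


# Hypothetical activations of two filters over four token positions.
# In doc_1, filter 0 fires early and filter 1 fires late.
doc_1 = [
    [0.9, 0.1, 0.0, 0.2],  # filter 0
    [0.0, 0.1, 0.8, 0.3],  # filter 1
]
# doc_2 has the same activations in reversed order along the sequence.
doc_2 = [list(reversed(channel)) for channel in doc_1]

pooled_1 = max_pool_over_time(doc_1)
pooled_2 = max_pool_over_time(doc_2)

print(pooled_1)              # [0.9, 0.8]
print(pooled_1 == pooled_2)  # True: where each feature occurred is lost
```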

Vijay and Abhishek examine an industrial-scale use case of capsule networks they’ve implemented for their client in the realm of text analytics—news categorization. They use the precision, recall, and F1 metrics to detail the performance of capsule networks on the news categorization task. They benchmark the performance of recurrent capsule networks (RCNs) for the same task and compare the two implementations against a baseline model. Importantly, they outline how to tune key hyperparameters of capsule networks such as batch size, number and size of filters, initial learning rate, and number and dimension of capsules, as well as the key challenges they faced.
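For reference, precision, recall, and F1 can be computed for multilabel predictions as in the sketch below. It assumes micro-averaging over all (document, label) pairs; the abstract does not say which averaging scheme the speakers used, and the sample labels and the function name `micro_prf` are hypothetical.

```python
def micro_prf(y_true, y_pred):
    """Micro-averaged precision, recall, and F1 over multi-hot label rows."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1  # label predicted and actually present
            elif p:
                fp += 1  # label predicted but not present
            elif t:
                fn += 1  # label present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up gold and predicted labels for three documents and three categories.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 1, 0]]

precision, recall, f1 = micro_prf(y_true, y_pred)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.8 0.8 0.8
```

Macro-averaging (computing the metrics per label and then averaging) would weight rare categories more heavily, which can matter when the label distribution is skewed.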

Prerequisite knowledge

A basic understanding of NLP and deep learning

What you'll learn

Discover the motivation for capsule networks and how they can be used in text analytics
Gain an overview of RCNs
Understand the implementation of RCNs in TensorFlow and PyTorch
Learn how to benchmark capsule networks with dynamic routing and RCNs for a real multilabel text classification use case for news categorization

Vijay Srinivas Agneeswaran

Walmart Labs

Vijay Srinivas Agneeswaran is a director of data sciences at Walmart Labs in India, where he heads the machine learning platform development and data science foundation teams, which provide platform and intelligent services for Walmart businesses around the world. He’s spent the last 18 years creating intellectual property and building data-based products in industry and academia. Previously, he led the team that delivered real-time hyperpersonalization for a global automaker, as well as other work for various clients across domains such as retail, banking and finance, telecom, and automotive; he built PMML support into Spark and Storm and implemented several machine learning algorithms such as LDA and random forests over Spark; he led a team that designed and implemented a big data governance product for role-based, fine-grained access control inside Hadoop YARN; and he and his team also built the first distributed deep learning framework on Spark. He’s been a professional member of the ACM and the IEEE (senior) for the last 10+ years. He has five full US patents and has published in leading journals and conferences, including IEEE Transactions. His research interests include distributed systems, artificial intelligence, and big data and other emerging technologies. Vijay has a bachelor’s degree in computer science and engineering from SVCE, Madras University, an MS (by research) from IIT Madras, and a PhD from IIT Madras, and he held a postdoctoral research fellowship in the LSIR Labs, Swiss Federal Institute of Technology, Lausanne (EPFL).

Abhishek Kumar

Publicis Sapient

Abhishek Kumar is a senior manager of data science in Publicis Sapient’s India office, where he looks after scaling up the data science practice by applying machine learning and deep learning techniques to domains such as retail, ecommerce, marketing, and operations. Abhishek is an experienced data science professional and technical team lead who specializes in building and managing data products from conceptualization to deployment and is interested in solving challenging machine learning problems. Previously, he worked in the R&D center for the largest power-generation company in India on various machine learning projects involving predictive modeling, forecasting, optimization, and anomaly detection and led the center’s data science team in the development and deployment of data science projects at several thermal and solar power plant sites. Abhishek is a technical writer and blogger as well as a Pluralsight author and has created several data science courses. He’s also a regular speaker at various national and international conferences and universities. Abhishek holds a master’s degree in information and data science from the University of California, Berkeley. Abhishek has spoken at past O’Reilly conferences, including Strata 2019, Strata 2018, and AI 2019.