Mar 15–18, 2020

Attention networks all the way to production using Kubeflow

Vijay Srinivas Agneeswaran (Walmart Labs), Pramod Singh (Walmart Labs), Akshay Kulkarni (Publicis Sapient)
1:30pm–5:00pm Monday, March 16, 2020
Location: 210 F

Level

Beginner

According to industry estimates, more than 80% of generated data is unstructured: text, images, audio, video, and so on. Data is generated as we speak, write, tweet, use social media platforms, send messages on messaging platforms, shop on ecommerce sites, and engage in various other activities. The majority of this data exists in textual form.

Many insights can be mined from this huge repository of unstructured data, but doing so requires a sophisticated approach. Text is the most common kind, accounting for more than 50% of unstructured data. Vijay Srinivas Agneeswaran, Pramod Singh, and Akshay Kulkarni focus on text data, uncovering different methodologies to reveal real value. To produce significant and actionable insights from text data, you’ll use natural language processing (NLP) coupled with machine learning, deep learning, and state-of-the-art techniques.

With the latest developments and improvements in the field of deep learning and artificial intelligence, many demanding NLP tasks have become easy to implement and execute. Text summarization is one of the tasks that can be done using attention networks. You’ll learn how to efficiently build and use NLP-based applications for text summarization using attention networks on TensorFlow 2.0. Text summarization requires a great deal of abstraction, so you’ll use sequence-to-sequence models and bidirectional encoders and decoders. You’ll get to see the notebooks for the problems outlined above and learn how some text analytics can be implemented on top of Kubeflow, which helps build scalable, production-ready implementations.
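As a rough illustration of the attention idea mentioned above, here is a minimal, framework-free sketch of a single dot-product attention step. It is not taken from the session's TensorFlow 2.0 notebooks; the function names, the dot-product scoring choice, and the toy values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(query, encoder_states):
    """Return (weights, context) for one decoder step.

    query: the decoder hidden state, a list of floats (hypothetical input).
    encoder_states: one list of floats per source token.
    """
    # Score each encoder state by its dot product with the query.
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    # Normalize the scores into attention weights that sum to 1.
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the encoder states.
    dim = len(query)
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three source tokens with 2-dimensional states (made-up values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
weights, context = dot_product_attention(query, encoder_states)
```

In a summarization model, the decoder would typically combine a context vector like this with its own state at every output step, so each generated word can draw on the most relevant parts of the source text.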

Outline:

History of NLP

  • NLP curves and current stage
  • Categories of NLP tasks
  • Different NLP methodologies

Introduction to text summarization

  • Deep dive into topic modeling
  • Encoder-decoder model

TensorFlow 1.x versus TensorFlow 2.x

  • New model: Eager execution
  • tf.keras coupling

Text summarization using an attention network

  • Text processing
  • Build attention network in TF 2.0
  • Evaluate performance of the model

Introduction to ML Deployment using Kubeflow

  • Challenges of manual deployment
  • Introduction to Kubeflow

Productionization of an attention network

  • Containerization of the TF model
  • Deployment of attention network model as a service on Kubeflow

Evaluation, challenges, and the way forward

  • Pushing changes into Kubeflow
  • Standard practices for Kubeflow

Prerequisite knowledge

  • A basic understanding of NLP
  • Familiarity with deep learning
  • A working knowledge of machine learning principles
  • General knowledge of linear algebra and calculus (useful but not required)

Materials or downloads needed in advance

  • A WiFi-enabled laptop with a browser installed (All notebooks and required datasets will be provided using a cloud-hosted environment.)
  • A working GCP account and credit points

What you'll learn

  • Understand text summarization using an attention network
  • Discover how to use TensorFlow 2.0 to build attention networks
  • Learn about the end-to-end productionization of an attention network using Kubeflow

Vijay Srinivas Agneeswaran

Walmart Labs

Vijay Srinivas Agneeswaran is a director of data sciences at Walmart Labs in India, where he heads the machine learning platform development and data science foundation teams, which provide platform and intelligent services for Walmart businesses around the world. He’s spent the last 18 years creating intellectual property and building data-based products in industry and academia. Previously, he led the team that delivered real-time hyperpersonalization for a global automaker, as well as other work for various clients across domains such as retail, banking and finance, telecom, and automotive; he built PMML support into Spark and Storm and realized several machine learning algorithms such as LDA and random forests over Spark; he led a team that designed and implemented a big data governance product for role-based fine-grained access control inside Hadoop YARN; and he and his team also built the first distributed deep learning framework on Spark. He’s been a professional member of the ACM and the IEEE (senior) for the last 10+ years. He has five full US patents and has published in leading journals and conferences, including IEEE Transactions. His research interests include distributed systems, artificial intelligence, big data, and other emerging technologies. Vijay has a bachelor’s degree in computer science and engineering from SVCE, Madras University, an MS (by research) from IIT Madras, and a PhD from IIT Madras, and he held a postdoctoral research fellowship in the LSIR Labs, Swiss Federal Institute of Technology, Lausanne (EPFL).

Pramod Singh

Walmart Labs

Pramod Singh is a senior machine learning engineer at Walmart Labs. He has extensive hands-on experience in machine learning, deep learning, AI, data engineering, designing algorithms, and application development. He has spent more than 10 years working on multiple data projects at different organizations. He’s the author of three books: Machine Learning with PySpark, Learn PySpark, and Learn TensorFlow 2.0. He’s also a regular speaker at major conferences such as the O’Reilly Strata Data and AI Conferences. Pramod holds a BTech in electrical engineering from BATU and an MBA from Symbiosis University. He has also completed a data science certification from IIM–Calcutta. He lives in Bangalore with his wife and three-year-old son. In his spare time, he enjoys playing guitar, coding, reading, and watching football.

Akshay Kulkarni

Publicis Sapient

Akshay Kulkarni is a senior data scientist on the core AI and data science team at Publicis Sapient, where he’s part of strategy and transformation interventions through AI, manages high-priority growth initiatives around data science, and works on various machine learning, deep learning, natural language processing, and artificial intelligence engagements by applying state-of-the-art techniques. He’s a renowned AI and machine learning evangelist, author, and speaker. Recently, he was recognized as one of the “Top 40 under 40 Data Scientists” in India by Analytics India Magazine. He’s consulted with several Fortune 500 and global enterprises in driving AI and data science-led strategic transformation. Akshay has rich experience in building and scaling AI and machine learning businesses and creating significant client impact. He’s actively involved in next-gen AI research and is also part of the next-gen AI community. Previously, he was at Gartner and Accenture, where he scaled the AI and data science business. He’s a regular speaker at major data science conferences and recently gave a talk on “Sequence Embeddings for Prediction Using Deep Learning” at GIDS. He’s the author of a book on NLP with Apress and is authoring a couple more books with Packt on deep learning and next-gen NLP. Akshay is a visiting faculty member (industry expert) at a few of the top universities in India. In his spare time, he enjoys reading, writing, coding, and helping aspiring data scientists.
