Presented By O'Reilly and Cloudera
Make Data Work
31 May–1 June 2016: Training
1 June–3 June 2016: Conference
London, UK

Deep learning and natural language processing with Spark

Andy Petrella (Kensu), Melanie Warrick (Google)
11:15–11:55 Thursday, 2/06/2016
Data science & advanced analytics
Location: Capital Suite 8/9 Level: Advanced
Average rating: 3.35 (17 ratings)

Prerequisite knowledge

Attendees should be proficient in probability, statistics, and algebra as well as programming and familiar with distributed computing techniques.


Deep learning is taking data science by storm. Unfortunately, most existing solutions aren’t particularly scalable. Andy Petrella and Melanie Warrick show how to implement a Spark-ready version of the long short-term memory (LSTM) neural network, widely used in the hardest natural language processing and understanding problems, such as automatic summarization, machine translation, question answering, and discourse. Andy and Melanie then demo an LSTM network with interactive, real-time visualizations using the Spark Notebook and Spark Streaming.

Andy Petrella


Andy is an entrepreneur with a background in mathematics and distributed data, focused on unlocking untapped business potential with new technologies in machine learning, artificial intelligence, and cognitive systems. He is also known for the Spark Notebook and as a public speaker who keynotes at international events on data governance and data science, such as Strata Data and Spark Summit.

Andy is the CEO of Kensu, which has an ambitious mission to monitor and enable the sustainability of data-driven companies, ensuring that economic and competitive advantages are created in an ethical and efficient fashion.

Melanie Warrick


Melanie Warrick is a senior developer advocate at Google with a passion for machine learning problems at scale. Melanie’s previous experience includes work as a founding engineer on Deeplearning4j and as a data scientist and engineer at