Deep learning is taking data science by storm. Unfortunately, most existing solutions aren’t particularly scalable. Andy Petrella and Melanie Warrick show how to implement a Spark-ready version of the long short-term memory (LSTM) neural network, widely used in the hardest natural language processing and understanding problems, such as automatic summarization, machine translation, question answering, and discourse. Andy and Melanie then demo an LSTM network with interactive, real-time visualizations using the Spark Notebook and Spark Streaming.
Andy Petrella is a mathematician turned distributed computing entrepreneur. Besides being a Scala/Spark trainer, Andy has participated in many projects built using Spark, Cassandra, and other distributed technologies in various fields, including geospatial analysis, the IoT, automotive, and smart cities. Andy is the creator of the Spark Notebook, the only reactive and fully Scala notebook for Apache Spark. In 2015, Andy cofounded Data Fellas with Xavier Tordoir around their product, the Agile Data Science Toolkit, which facilitates the productization of data science projects and guarantees their maintainability and sustainability over time. Andy is also a member of the program committees for the O’Reilly Strata, Scala eXchange, Data Science eXchange, and Devoxx events.
Melanie Warrick is a senior developer advocate at Google with a passion for machine learning problems at scale. Melanie’s previous experience includes work as a founding engineer on Deeplearning4j and as a data scientist and engineer at Change.org.
©2016, O’Reilly UK Ltd • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with, and does not endorse or review, the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.