Presented By O'Reilly and Cloudera
Make Data Work
31 May–1 June 2016: Training
1 June–3 June 2016: Conference
London, UK

Text conference sessions

16:00–16:30 Wednesday, 1/06/2016
Roxana Danger (reed.co.uk)
One of the main challenges organizations face is the semantic categorization of textual data. Roxana Danger offers an overview of ROOT, the reed online occupational taxonomy, which was constructed to improve the quality of services at reed.co.uk, and discusses the semisupervised methodology used to generate (and maintain) such taxonomies from large collections of textual data.
14:05–14:45 Friday, 3/06/2016
Alyona Medelyan (Thematic)
With the rise of deep learning, natural language understanding techniques are becoming more effective and are not as reliant on costly annotated data. This opens up many new possibilities for what businesses can do with language. Alyona Medelyan explains what the newest NLU tools can achieve and presents their common use cases.
11:15–11:55 Thursday, 2/06/2016
Andy Petrella (Kensu), Melanie Warrick (Google)
Deep learning is taking data science by storm, due to the combination of stable distributed computing technologies, increasing amounts of data, and available computing resources. Andy Petrella and Melanie Warrick show how to implement a Spark-ready version of the long short-term memory (LSTM) neural network, widely used in the hardest natural language processing and understanding problems.
12:00–12:30 Wednesday, 1/06/2016
Piotr Mirowski (Google DeepMind)
Piotr Mirowski looks under the hood of recurrent neural networks and explains how they can be applied to speech recognition, machine translation, sentence completion, and image captioning.
12:05–12:45 Thursday, 2/06/2016
David Talby (Pacific AI), Claudiu Branzan (Accenture AI)
David Talby and Claudiu Branzan offer a live demo of an end-to-end system that makes nontrivial clinical inferences from free-text patient records. Infrastructure components include Kafka, Spark Streaming, Spark, Titan, and Elasticsearch; data science components include custom UIMA annotators, curated taxonomies, machine-learned dynamic ontologies, and real-time inferencing.