Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Using LSTMs To Aid Professional Translators

Darren Cook (QQ Trend Ltd.)
17:25–18:05 Wednesday, 23 May 2018
Data science and machine learning, Emerging technologies and case studies
Location: Capital Suite 13
Level: Intermediate

Who is this presentation for?

Primarily data scientists and computational linguists.

Prerequisite knowledge

While aimed at data scientists and computational linguists, the talk skips the hard theory (which is widely available in books, videos, and courses) and focuses on the design decisions needed to get all the pieces to work together. It should therefore be accessible to a wider audience of software developers and project managers interested in NLP and/or deep learning.

What you'll learn

Design patterns for integrating LSTMs, and similar bleeding-edge technology, into practical NLP applications. A good feel for what is possible, and what is unrealistic, in the realm of machine translation.


Completely automatic machine translation still has a long way to go to achieve professional quality, particularly for challenging language pairs such as Japanese/English. QQ Trend Ltd. have been working with a professional translation company to see how the latest machine learning and NLP technology can be used to improve their efficiency. The aim is to work in partnership with the human professional rather than to replace them.

This talk covers the core technologies being used, particularly deep learning LSTM networks. It focuses on the practical issues of integrating them with the latest tokenizers, dictionaries, structured and unstructured data sources, and customer style sheets, while keeping one eye on performance and platform portability. Adapting to new terminology, or simply to the translator’s preferred vocabulary and writing style, will also be covered.

For the examples, we will use Japanese as the source language, with English as the target language, explaining which parts of the process are language-specific, and which are more generic.

Darren Cook

QQ Trend Ltd.

Darren Cook is technical director at QQ Trend, a financial data analysis and data products company. Darren has over 25 years of experience as a software developer, data analyst, and technical director and has worked on everything from financial trading systems to NLP, data visualization tools, and PR websites for some of the world’s largest brands. He is skilled in a wide range of computer languages, including R, C++, PHP, JavaScript, and Python. Darren is the author of two books, Data Push Apps with HTML5 SSE and Practical Machine Learning with H2O, both from O’Reilly, as well as a Coursera course on machine learning and H2O.
