Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Using LSTMs to aid professional translators

Darren Cook (QQ Trend)
17:25–18:05 Wednesday, 23 May 2018
Data science and machine learning, Emerging technologies and case studies
Location: Capital Suite 13
Level: Intermediate
Secondary topics: Text and Language processing and analysis

Who is this presentation for?

  • Data scientists and computational linguists

What you'll learn

  • Explore design patterns for integrating LSTMs and similar bleeding-edge technology into practical NLP applications
  • Understand what is currently possible (and what is unrealistic) in the realm of machine translation

Description

Fully automatic machine translation still has a long way to go before it reaches professional quality, particularly for challenging language pairs such as Japanese and English. QQ Trend has been working with a professional translation company, applying the latest machine learning and NLP technology to improve the company’s efficiency. In other words, this is machine learning working in partnership with human professionals, not trying to replace them.

Darren Cook demonstrates how to use LSTMs to tackle translation, focusing on one of the most difficult language pairs: Japanese to English. Darren covers the core technologies; the practical issues of integrating them with the latest tokenizers, dictionaries, structured and unstructured data sources, and customer style sheets; and the solution’s performance and platform portability. He also explains how to adapt the solution to new terminology or to a translator’s preferred vocabulary and writing style.

Darren Cook

QQ Trend

Darren Cook is a director at QQ Trend, a financial data analysis and data products company. Darren has over 20 years of experience as a software developer, data analyst, and technical director and has worked on everything from financial trading systems to NLP, data visualization tools, and PR websites for some of the world’s largest brands. He is skilled in a wide range of programming languages, including R, C++, PHP, JavaScript, and Python. Darren is the author of two books, Data Push Apps with HTML5 SSE and Practical Machine Learning with H2O, both from O’Reilly.