Making Open Work
May 8–9, 2017: Training & Tutorials
May 10–11, 2017: Conference
Austin, TX

Global empire: Building for fun and profit

Michelle Casbon (Qordoba)
1:45pm–2:25pm Wednesday, May 10, 2017
The Cutting Edge
Location: Meeting Room 18 A/B
Level: Intermediate
Average rating: 4.50 (2 ratings)

Who is this presentation for?

  • Engineers, architects, and those working on product

Prerequisite knowledge

  • Basic familiarity with distributed systems is expected
  • Knowledge of machine learning, natural language processing, or statistical machine translation (useful but not required)

What you'll learn

  • How Qordoba uses common open source tools in its machine learning-centered globalization platform

Description

To establish a global user base, a product needs to support a variety of locales. The challenge with supporting multiple locales is the maintenance and generation of localized strings. Michelle Casbon explains how open source tools like Scala, Apache Spark, Apache Kafka, and Apache PredictionIO (incubating) provide structure for a scalable localization platform with machine learning at its core.
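As a rough illustration of what maintaining localized strings involves, the sketch below looks up a string for a requested locale and falls back to progressively less specific locales when a translation is missing. It is not Qordoba's implementation: the `CATALOG` contents, the function names `fallback_chain` and `lookup`, and the choice of `en` as the default locale are all assumptions made for this example.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory catalog: locale -> {string key -> localized text}.
# A real platform would keep these in a datastore; this is illustration only.
CATALOG: Dict[str, Dict[str, str]] = {
    "en": {"greeting": "Hello", "farewell": "Goodbye"},
    "fr": {"greeting": "Bonjour"},
    "fr-CA": {"farewell": "Bye"},
}


def fallback_chain(locale: str, default: str = "en") -> List[str]:
    """List the locales to try, most specific first, ending with the default."""
    parts = locale.split("-")
    # "fr-CA" yields ["fr-CA", "fr"]; each step drops the last subtag.
    chain = ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]
    if default not in chain:
        chain.append(default)
    return chain


def lookup(key: str, locale: str) -> Optional[str]:
    """Return the first available translation along the fallback chain."""
    for candidate in fallback_chain(locale):
        text = CATALOG.get(candidate, {}).get(key)
        if text is not None:
            return text
    return None  # No locale in the chain has this string.
```

With this catalog, `lookup("greeting", "fr-CA")` finds no `fr-CA` entry and falls back to the `fr` translation, while a key missing from every locale in the chain returns `None`. Every such gap is a string that someone, or some model, still has to generate, which is the maintenance burden the talk describes.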

Michelle explains how Qordoba has addressed these challenges by using highly scalable technologies and machine learning to automate the process: generating high-quality translations in many languages and making them available in real time across platforms (e.g., mobile, print, and web). Such a platform offers continuous deployment of localized strings, live syncing across platforms (mobile, web, Photoshop, Sketch, help desk, etc.), content generation for any locale, and emotional response.

Michelle also shares Qordoba’s architecture for handling billions of localized strings in many different languages, which uses:

  • Scala and Akka as an orchestration layer
  • Apache Cassandra and MariaDB as a storage layer
  • Apache Spark, Apache PredictionIO (incubating), Apache HBase, and Elasticsearch for natural language processing
  • Apache Kafka as a message bus for reporting, billing, and notifications
  • Docker, Marathon, and Apache Mesos for containerized deployment

Michelle Casbon

Qordoba

Michelle Casbon is director of data science at Qordoba. Michelle’s development experience spans more than a decade across various industries, including media, investment banking, healthcare, retail, and geospatial services. Previously, she was a senior data science engineer at Idibon, where she built tools for generating predictions on textual datasets. She loves working with open source projects and has contributed to Apache Spark and Apache Flume. Her writing has been featured in the AI section of O’Reilly Radar. Michelle holds a master’s degree from the University of Cambridge, focusing on NLP, speech recognition, speech synthesis, and machine translation.