Making Open Work
May 8–9, 2017: Training & Tutorials
May 10–11, 2017: Conference
Austin, TX

Global empire: Building for fun and profit

Michelle Casbon (Google)
1:45pm–2:25pm Wednesday, May 10, 2017
The Cutting Edge
Location: Meeting Room 18 A/B
Level: Intermediate
Average rating: 4.50 (2 ratings)

Who is this presentation for?

  • Engineers, architects, and those working on product

Prerequisite knowledge

  • Basic familiarity with distributed systems is expected
  • Knowledge of machine learning, natural language processing, or statistical machine translation (useful but not required)

What you'll learn

  • How Qordoba uses common open source tools in its globalization platform, which is centered on machine learning


To establish a global user base, a product needs to support a variety of locales. The challenge with supporting multiple locales is the maintenance and generation of localized strings. Michelle Casbon explains how open source tools like Scala, Apache Spark, Apache Kafka, and Apache PredictionIO (incubating) provide structure for a scalable localization platform with machine learning at its core.
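To make the maintenance problem concrete, here is a minimal sketch of looking up a localized string with locale fallback. It is an illustration only, not Qordoba's implementation: the `CATALOG` data and the `fallback_chain` and `localize` helpers are hypothetical names, and the sketch assumes simple hyphen-separated locale tags with English as the default.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory catalog: locale -> {string key -> localized text}.
# A real platform would keep these in a datastore, not a module constant.
CATALOG: Dict[str, Dict[str, str]] = {
    "en": {"greeting": "Hello", "farewell": "Goodbye"},
    "fr": {"greeting": "Bonjour"},
    "fr-CA": {"farewell": "Bye"},
}


def fallback_chain(locale: str, default: str = "en") -> List[str]:
    """Return the locales to try, most specific first (fr-CA, fr, en)."""
    parts = locale.split("-")
    chain = ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]
    if default not in chain:
        chain.append(default)
    return chain


def localize(key: str, locale: str,
             catalog: Dict[str, Dict[str, str]] = CATALOG) -> Optional[str]:
    """Return the first translation found along the fallback chain."""
    for candidate in fallback_chain(locale):
        text = catalog.get(candidate, {}).get(key)
        if text is not None:
            return text
    return None  # The key is missing in every locale tried.
```

Every string missing from a specific locale silently falls back to a more general one, which is why keeping catalogs complete and in sync becomes harder as the number of locales grows.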

Michelle explains how Qordoba has addressed these challenges by using highly scalable technologies and machine learning to automate the process, specifically by generating high-quality translations in many languages and making them available in real time across platforms (e.g., mobile, print, and web). Such a platform offers continuous deployment of localized strings, live syncing across platforms (mobile, web, Photoshop, Sketch, help desk, etc.), content generation for any locale, and emotional response.

Michelle also shares Qordoba’s architecture for handling billions of localized strings in many different languages, which uses:

  • Scala and Akka as an orchestration layer
  • Apache Cassandra and MariaDB as a storage layer
  • Apache Spark, Apache PredictionIO (incubating), Apache HBase, and Elasticsearch for natural language processing
  • Apache Kafka as a message bus for reporting, billing, and notifications
  • Docker, Marathon, and Apache Mesos for containerized deployment
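
The message bus role in the list above can be sketched with a toy in-memory publish/subscribe class. This is a stand-in for illustration, not the Kafka client API and not Qordoba's code: the `InMemoryBus` class, the `strings.translated` topic name, and the event fields are all assumptions.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class InMemoryBus:
    """Toy stand-in for a message bus; NOT the Kafka client API."""

    def __init__(self) -> None:
        self._log: DefaultDict[str, List[Event]] = defaultdict(list)
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a consumer callback for a topic."""
        self._handlers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> None:
        """Append the event to the topic's log, then fan it out."""
        self._log[topic].append(event)
        for handler in self._handlers[topic]:
            handler(event)


bus = InMemoryBus()
billed: List[str] = []
notified: List[str] = []

# Two independent consumers (hypothetical billing and notification
# services) react to the same event without knowing about each other.
bus.subscribe("strings.translated", lambda e: billed.append(e["string_id"]))
bus.subscribe(
    "strings.translated",
    lambda e: notified.append(f'{e["string_id"]}:{e["locale"]}'),
)

bus.publish("strings.translated", {"string_id": "greeting", "locale": "fr"})
```

The point of the pattern is decoupling: the producer publishes one event, and reporting, billing, and notification consumers can be added or removed without changing it.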

Michelle Casbon


Michelle Casbon is a senior engineer on the Google Cloud Platform developer relations team, where she focuses on open source contributions and community engagement for machine learning and big data tools. Michelle’s development experience spans more than a decade and has primarily focused on multilingual natural language processing, system architecture and integration, and continuous delivery pipelines for machine learning applications. Previously, she was a senior engineer and director of data science at several San Francisco-based startups, building and shipping machine learning products on distributed platforms using both AWS and GCP. She especially loves working with open source projects and is a contributor to Kubeflow. Michelle holds a master’s degree from the University of Cambridge.