Presented By O'Reilly and Cloudera
December 5–6, 2016: Training
December 6–8, 2016: Tutorials & Conference

Machine learning in practice with Spark MLlib: An intelligent data analyzer

Flavio Clesio (Movile), J.P. Eiti Kimura (Movile)
2:35pm–3:15pm Thursday, December 8, 2016
Chat, machine learning, and AI
Location: Summit 1 Level: Intermediate
Tags: telecom
Average rating: 3.50 (4 ratings)

Prerequisite Knowledge

  • An understanding of Java and basic Scala programming
  • Familiarity with Apache Spark and machine learning

What you'll learn

  • Learn how to train machine-learning models using Apache Spark MLlib
  • Understand how linear regression can be applied to create a platform able to analyze data, warn users, and reduce revenue loss


With the rapid development and evolution of applications, high data volumes, and user influx, there’s a need to develop intelligent systems that can assist in data analysis and decision making. Flavio Clesio and Eiti Kimura offer a practical demonstration of using machine learning to create an intelligent monitoring application that analyzes data from a distributed system using Apache Spark MLlib.

Monitoring distributed systems is usually a tricky task. Flavio and Eiti share their experience applying machine-learning techniques to build a data analysis application that monitors a distributed platform responsible for charging user subscriptions at mobile carriers in Brazil. The application, Watcher-AI, uses linear regression algorithms from Apache Spark MLlib to make a forecast and check whether the platform is experiencing any operational problems. Flavio and Eiti used Scala to process data and train the machine-learning models and then developed a Java application that uses these models to predict the expected outputs. Watcher-AI detects deviations in charging numbers and sends notifications that describe the problem, whether it lies in the numbers or in the platform, so that users can act quickly, reducing the time to action and avoiding serious problems that directly impact the company’s revenue.
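
The forecast-and-compare idea described above can be illustrated with a minimal sketch. This is not the Watcher-AI code and it does not use the Spark MLlib API: it is a plain-Java, single-feature least-squares fit, chosen so the example runs without a cluster. The class name `ChargeMonitorSketch`, the feature (charge attempts per hour), the 20% tolerance, and all the numbers are assumptions made up for illustration.

```java
/** Hypothetical sketch only: not Watcher-AI and not the Spark MLlib API. */
final class ChargeMonitorSketch {

    /** A fitted line: y = intercept + slope * x. */
    static final class Line {
        final double slope;
        final double intercept;

        Line(double slope, double intercept) {
            this.slope = slope;
            this.intercept = intercept;
        }

        /** Forecast the target value for a given feature value. */
        double predict(double x) {
            return intercept + slope * x;
        }
    }

    /** Ordinary least-squares fit for one feature (closed form). */
    static Line fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two paired samples");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0; // sum of (x - meanX) * (y - meanY)
        double sxx = 0.0; // sum of (x - meanX)^2
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("feature values are all identical");
        }
        double slope = sxy / sxx;
        return new Line(slope, meanY - slope * meanX);
    }

    /** True when the observed value is further from the forecast than the allowed fraction. */
    static boolean isDeviation(double expected, double observed, double tolerance) {
        return Math.abs(observed - expected) > tolerance * Math.abs(expected);
    }

    public static void main(String[] args) {
        // Made-up history: charge attempts per hour and successful charges per hour.
        double[] attempts = {1000, 2000, 3000, 4000};
        double[] successes = {100, 200, 300, 400};

        Line model = fit(attempts, successes);
        double expected = model.predict(5000); // forecast for the current hour
        double observed = 310;                 // hypothetical measured value

        if (isDeviation(expected, observed, 0.20)) {
            System.out.println("ALERT: successful charges deviate from the forecast");
        } else {
            System.out.println("OK: successful charges are within tolerance");
        }
    }
}
```

A fixed relative tolerance is the simplest possible alerting rule; a production monitor would more likely derive the threshold from the model's historical error, and would train on several features rather than one.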


Flavio Clesio


Flavio Clesio is a specialist in machine learning and revenue assurance at Movile, where he helps build core intelligent applications to exploit revenue opportunities and automation in decision making. Prior to Movile, Flavio was a business intelligence consultant in financial markets, specifically in nonperforming loans. He holds a master’s degree in computational intelligence applied in financial markets.


J.P. Eiti Kimura


Eiti Kimura is an IT coordinator and architect of distributed and high-performance platforms at Movile Brazil. Eiti has over 15 years of experience working with software development. He is an enthusiast of open technologies—he was an Apache Cassandra MVP from 2014 to 2015—and has vast experience with backend systems for carrier billing services, sending bulk text messages (SMS), and user action tracking. Eiti holds a master’s degree in electrical engineering with a specialization in software engineering.

Comments on this page are now closed.


Flavio Clesio
12/07/2016 8:44pm +08

Hi everyone! This is the main repository where our code is hosted:

You can fork it and use it with your own data, keeping the same architecture.


João Paulo Eiti Kimura
11/30/2016 2:44am +08

Hello Julius,

In fact, you don’t need your laptop. We’ll show you some code and examples of machine learning algorithms and a reference to a repository, so you can download and test the code yourself afterwards.

Julius Novan Cahyadi
11/29/2016 7:40pm +08


Should I bring my laptop? If yes, are there any prerequisite installations that I need to prepare?