Presented By O'Reilly and Cloudera
Make Data Work
31 May–1 June 2016: Training
1 June–3 June 2016: Conference
London, UK

Sightseeing, venues, and friends: Predictive analytics with Spark ML and Cassandra

14:05–14:45 Thursday, 2/06/2016
Data science & advanced analytics
Location: Capital Suite 15/16 Level: Intermediate
Average rating: ****.
(4.56, 9 ratings)

Which venues have similar visiting patterns? How can we detect when a user is on vacation? Can we predict which venues will be favorited by users by examining their friends’ preferences? Natalino Busa explains how these predictive analytics tasks can be accomplished by using Spark SQL, Spark ML, and just a few lines of Scala code.

Natalino presents a collection of machine-learning techniques to extract insights from location-based social networks such as Facebook, demonstrating how to combine a dataset of venues’ check-ins with the user social graph using Spark and how to use Cassandra as a storage layer for both events and models before sketching how to operationalize such predictive models and embed them as microservices.

Photo of Natalino Busa

Natalino Busa


Natalino Busa is the chief data architect at DBS, where he leads the definition, design, and implementation of big, fast data solutions for data-driven applications, such as predictive analytics, personalized marketing, and security event monitoring. Natalino is an all-around technology manager, product developer, and innovator with a 15+-year track record in research, development, and management of distributed architectures and scalable services and applications. Previously, he was the head of data science at Teradata, an enterprise data architect at ING, and a senior researcher at Philips Research Laboratories on the topics of system-on-a-chip architectures, distributed computing, and parallelizing compilers.

Comments on this page are now closed.


Picture of Natalino Busa
Natalino Busa
7/06/2016 9:03 BST

Slides available: