Presented By O'Reilly and Cloudera
Make Data Work
31 May–1 June 2016: Training
1 June–3 June 2016: Conference
London, UK

Demonstrating the art of the possible with Spark and Hadoop

11:15–11:55 Friday, 3/06/2016
Location: Capital Suite 2/3
Average rating: 2.00 (3 ratings)

Apache Spark is on fire. Over the past five years, more and more organizations have looked to leverage Spark to operationalize their teams and the delivery of analytics to their respective businesses. Adrian Houselander and Joy Spohn demonstrate two use cases of how Apache Spark and Apache Hadoop are being used to harness valuable insights from complex data across cloud and hybrid environments.

The first example showcases RedRock, an application that lets the user act on data-driven insights discovered from Twitter. Powered by IBM Analytics running on Spark and Hadoop, it finds patterns in user tweets to identify influential individuals, related topics of interest, and where in the world the conversation is taking place. RedRock uses two data science algorithms, Word2vec and k-means, to build the screens in the app. The Word2vec algorithm, based on neural networks, assigns a numerical vector to each word in the Twitter data. Once a feature matrix has been formed from these vectors, k-means is applied to cluster the words.
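The clustering step described above can be sketched in a few lines of Python. This is only an illustration, not RedRock's implementation: the two-dimensional word vectors are hypothetical stand-ins for real Word2vec output (which typically has a hundred or more dimensions), the `kmeans` helper is a plain version of Lloyd's algorithm written for this example, and a production system would more likely rely on the Word2Vec and KMeans implementations that ship with Spark MLlib.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Start from k distinct points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments


# Hypothetical word vectors, standing in for a Word2vec feature matrix.
word_vectors = {
    "spark": (0.90, 0.10),
    "hadoop": (0.80, 0.20),
    "hive": (0.85, 0.15),
    "football": (0.10, 0.90),
    "goal": (0.20, 0.80),
    "match": (0.15, 0.85),
}

words = list(word_vectors)
centroids, labels = kmeans([word_vectors[w] for w in words], k=2)

# Group the words by the cluster label k-means assigned to them.
clusters = {}
for word, label in zip(words, labels):
    clusters.setdefault(label, []).append(word)

for label, members in sorted(clusters.items()):
    print(label, members)
```

Words whose vectors lie close together end up in the same cluster, which is how related topics can be grouped without hand-written rules. The cluster numbers themselves are arbitrary; only the grouping is meaningful.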

The second example showcases a financial institution that derives cross-sell and up-sell insights targeted at specific clients in order to improve customer retention and loyalty. The institution wants to use business-owned, on-premises data held in DB2 for z/OS, IMS, and VSAM within its z Systems (mainframe) environment, augmented with insight from sentiment analysis of Twitter data and public S&P stock price data, which could be hosted in the cloud. Apache Spark running natively on z/OS provides flexibility, economic advantages, and governance, because federated analytics avoids unnecessary ETL.

This session is sponsored by IBM.

Joy Spohn

IBM


Adrian Houselander

IBM
