14–17 Oct 2019

Anomaly detection using deep learning to measure the quality of large datasets

Sridhar Alla (BlueWhale)
13:45–14:25 Thursday, 17 October 2019
Location: Blenheim Room - Palace Suite
Average rating: 1.67 (3 ratings)

Who is this presentation for?

  • Data engineers, scientists, and technical managers

Level

Advanced

Description

Every business, big or small, depends on analytics, whether the goal is revenue generation, churn reduction, sales, or marketing. Whatever the algorithm or technique, the result depends on the accuracy and consistency of the data being processed. This session looks at techniques for evaluating the quality of data and detecting anomalies in it.

Sridhar Alla walks you through deep learning neural networks and various techniques you can use to detect anomalies in data. Whatever ML algorithms and modeling techniques you use to derive value from data (predictive analytics, clustering, Bayesian belief networks, or regression models), the effectiveness of the models ultimately depends on the features used, which in turn depend on the input data sources. To address this, modules were implemented to define the properties of the data being consumed, detect and report anomalies in it, and enable stakeholders to discuss and take corrective action.
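
The define-detect-report loop described above can be illustrated with a small sketch. It is not the speaker's implementation: the function names `build_profile` and `find_anomalies`, the sample values, and the 3.5 threshold are all assumptions made for illustration, and a simple median/MAD rule stands in for whatever the actual modules do.

```python
from statistics import median

def build_profile(values):
    """Summarize a trusted reference column (hypothetical helper)."""
    present = [v for v in values if v is not None]
    center = median(present)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(v - center) for v in present)
    return {
        "median": center,
        "mad": mad,
        "null_rate": 1 - len(present) / len(values),
    }

def find_anomalies(values, profile, threshold=3.5):
    """Return (index, value, reason) for entries that violate the profile."""
    scale = profile["mad"] or 1e-9  # avoid dividing by zero on constant columns
    report = []
    for i, v in enumerate(values):
        if v is None:
            report.append((i, v, "missing value"))
            continue
        # Modified z-score; 0.6745 makes the MAD comparable to a standard deviation.
        score = 0.6745 * abs(v - profile["median"]) / scale
        if score > threshold:
            report.append((i, v, f"modified z-score {score:.1f}"))
    return report

# Made-up sample data: a clean reference batch and a new batch to check.
reference = [102, 98, 101, 99, 100, 103, 97]
incoming = [101, None, 99, 250, 100]

profile = build_profile(reference)
report = find_anomalies(incoming, profile)
for index, value, reason in report:
    print(f"row {index}: {value!r} flagged ({reason})")
```

The report is kept as plain tuples so it could be handed to whatever channel stakeholders already use; in a real pipeline the profile would likely cover many columns and more properties than these three.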

Sridhar showcases how NVIDIA GPUs, Keras, and TensorFlow with Python 3.6 have pushed the limits on the amount of data that can be profiled and checked for anomalies. Similar techniques were applied to time series data, particularly using LSTM networks. You’ll learn about deep learning-based autoencoders, unsupervised clustering, and density-based methods. Sridhar walks through code in a Jupyter notebook to show how you can implement a similar strategy in your organization.
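
As a rough illustration of the density-based idea mentioned above, the sketch below scores each point by its mean distance to its k nearest neighbors, so points in sparse regions get high scores. This is a generic technique written in plain Python, not the Keras or TensorFlow code from the talk; the function name `knn_scores`, the sample points, and `k=2` are assumptions.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def knn_scores(points, k=2):
    """Mean distance from each point to its k nearest neighbors.

    A large score means the point sits in a low-density region.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(euclidean(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# Made-up data: four points in a tight cluster and one far away.
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (8.0, 8.0)]
scores = knn_scores(points, k=2)

# Index of the point with the highest score, i.e. the most isolated one.
most_anomalous = max(range(len(points)), key=scores.__getitem__)
print(points[most_anomalous], round(scores[most_anomalous], 2))
```

This brute-force version compares every pair of points, so it only suits small samples; an autoencoder takes a different route to the same goal by flagging records whose reconstruction error is unusually high.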

Prerequisite knowledge

  • Familiarity with machine learning and Python

What you'll learn

  • Learn about the application of deep learning to the problem of ensuring data quality in a data processing and modeling pipeline

Sridhar Alla

BlueWhale

Sridhar Alla is cofounder and CTO at BlueWhale, which brings together the worlds of big data and artificial intelligence to provide comprehensive solutions to meet the business needs of organizations of all sizes. He and his team are cloud and tool agnostic and strive to embed themselves into the workstream to provide strategic and technical assistance, with solutions such as predictive modeling and analytics, capacity planning, forecasting, anomaly detection, advanced NLP, chatbot development, SAS to Python migration, and deep learning-based model building and operationalization. Sridhar is also the author of three books and an avid presenter at conferences including Strata, Hadoop World, Spark Summit and others.


Comments

Sridhar Alla | Cofounder and CTO
18/10/2019 13:56 BST

for slides and notebooks, https://github.com/blue-whale-one/strataAICon2019
