
Big Data Analysis 0-60 in 90 days

Chad Naber (Intel), David Elfi (Intel Corporation)

Do you know how long it could take your team to start producing value in Big Data and Machine Learning? This talk shares a real team's experience of going from scratch to a functional Big Data and Machine Learning platform using several open source tools, such as Apache Hadoop, Apache Hive, and the Python frameworks SciPy, NumPy, and scikit-learn.
Starting from a single business question, the presentation walks through the steps our team followed to get a useful introduction to data analysis concepts and their application in Hadoop.

The presentation follows a timeline covering each stage we went through:
1. Identifying the right business needs: distinguishing the how from the what
2. Data types: structured vs. unstructured
3. The Big Data concept and its applicability. Hadoop as the best choice
4. Is the data reliable and complete? Using Hive to profile your data
5. The process for answering the business needs: descriptive and prescriptive analysis. Research based on RapidMiner and implementation on the Python stack
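
Stage 4 asks whether the data is reliable and complete. As a rough illustration of what such profiling involves (this is not the speakers' Hive code, which the abstract does not show), the sketch below counts missing and distinct values per column over a small in-memory sample in plain Python. The `profile` helper, the column names, and the sample rows are all hypothetical.

```python
from collections import defaultdict


def profile(rows, missing=(None, "")):
    """Return per-column counts of total, missing, and distinct non-missing values."""
    stats = defaultdict(lambda: {"total": 0, "missing": 0, "values": set()})
    for row in rows:
        for column, value in row.items():
            s = stats[column]
            s["total"] += 1
            if value in missing:
                s["missing"] += 1       # treat None and empty string as missing
            else:
                s["values"].add(value)  # track distinct non-missing values
    return {
        column: {
            "total": s["total"],
            "missing": s["missing"],
            "distinct": len(s["values"]),
        }
        for column, s in stats.items()
    }


# Hypothetical sample records, invented for illustration only
sample_rows = [
    {"user_id": "u1", "country": "US"},
    {"user_id": "u2", "country": ""},
    {"user_id": "u2", "country": None},
]
report = profile(sample_rows)
```

In Hive, the same idea would typically be expressed with aggregate queries (for example `COUNT` and `COUNT(DISTINCT ...)`) run over the full table rather than an in-memory sample.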

Chad Naber


Chad is a data engineer with more than 12 years of progressive experience in leading, industry-recognized multinational service organizations such as Intel and Nike. He has experience with leading-edge technologies such as Redshift, SQL Server, Hadoop (using both the streaming API and direct Java), and Hive, with a baseline of experience in Agile data warehousing. Chad currently works as a big data architect at Intel.


David Elfi

Intel Corporation

David has gained experience across the different roles he has held at Intel since 2008. Working mainly on e-commerce products, led by the AppUp product, he has contributed to different product flavors in the areas of web services, consumer experience, and mobile application development.
Recently, he moved into managing data for business analysis, researching improvements based on data collected from the field.

Comments on this page are now closed.


David Elfi
07/25/2014 8:56am PDT

Hi Khanh,
You can download the repo and use the .slides.html file. Open it with any browser (it should work).
Thanks for asking

07/25/2014 6:37am PDT

Can we get the presentation? The link to GitHub is just source code.