Presented By O'Reilly and Cloudera
Make Data Work
Sept 29–Oct 1, 2015 • New York, NY

Practical data science on Hadoop

Brandon MacKenzie (IBM), John Rollins (IBM), Jacques Roy (IBM), Chris Fregly (Amazon Web Services), Mokhtar Kandil (IBM)
9:00am–5:00pm Tuesday, 09/29/2015
Location: 1B 03
Average rating: 2.50 (12 ratings)

Materials or downloads needed in advance

Hands-on lab environment will be provided by IBM. Participants may want to bring a small USB drive to save their files, if desired.


In this three-day course, you will:

  • Learn how to use machine learning, text analysis, and real-time analytics to solve frequently
    encountered, high-value business problems.
  • Understand data science methodology and the end-to-end workflow for solving a problem, including
    data preparation, model building and validation, and model deployment.
  • Use Apache Spark and other tools for analytics.


Day 1

  • Fundamental data science methodology
  • Overview of selected machine learning methods
  • Hands-on labs with Spark MLlib and SystemML libraries
  • Descriptive statistics
  • Feature transformations
  • Supervised and unsupervised methods
  • Diagnostics

Day 2

  • Text analytics concepts
  • Text analytics development, testing, and deployment
  • Continuous analytics (streaming)
  • Hands-on labs on text analytics and streaming

Day 3

  • Recommendation engines with hands-on lab
  • Using Apache Spark with IBM SPSS Modeler
  • What’s coming in data science
  • Spark and hardware accelerators
  • Machine learning pipelines with hands-on lab
  • Productization with Spark

Target Audience

Data scientists and business analysts.
Some knowledge of R and/or Python is preferable but not required.

Additional Information

Hands-on lab environment will be provided by IBM.




Brandon MacKenzie is the Data Science on Hadoop leader on IBM’s Worldwide Technical Sales team for Information Management Software. He is an expert on statistical processing in Hadoop and HPC environments. Brandon earned his master’s degree from The University of Edinburgh.


John Rollins


John B. Rollins, Ph.D., is a data scientist in the IBM Analytics division. His background spans data mining, engineering, and econometrics across many industries. He holds seven patents and has authored a best-selling engineering textbook and many technical papers. He holds doctoral degrees in economics and petroleum engineering from Texas A&M University.


Jacques Roy


Jacques Roy is a member of the IBM worldwide analytics platform technical team, specializing in big data streaming analytics. He has also worked in many technology areas including operating systems, databases, and application development. He is the author of multiple books, with the most recent being The Power of Now: Real-Time Analytics and IBM InfoSphere Streams. He is also a regular contributor to IBM Data magazine. Jacques has been a presenter at many conferences including IBM’s Information on Demand (IOD).


Chris Fregly

Amazon Web Services

Chris Fregly is a senior developer advocate focused on AI and machine learning at Amazon Web Services (AWS). Chris shares knowledge with fellow developers and data scientists through his Advanced Kubeflow AI Meetup and regularly speaks at AI and ML conferences across the globe. Previously, Chris was a founder at PipelineAI, where he worked with many startups and enterprises to deploy machine learning pipelines using many open source and AWS products including Kubeflow, Amazon EKS, and Amazon SageMaker.

Mokhtar Kandil


Comments on this page are now closed.


Kshitija Gokhale
09/28/2015 10:48am EDT

It says up top
“Hands-on lab environment will be provided by IBM. Participants may want to bring a small USB drive to save their files, if desired.”

Carlos Miron
09/26/2015 8:26am EDT

Do I need to bring my own laptop?

Tija Gokhale
09/24/2015 6:04pm EDT

For the hands-on training, will I need to bring my own laptop? Or will the computing platform be provided by IBM?

Armen Donigian
09/23/2015 5:11pm EDT

can u post a link to training materials or things we need to download/setup prior to arrival?

Ben Lorica
08/10/2015 6:55am EDT


“When I attend this 3-day training, is there still time for visiting sessions or it is 3days fulltime”
>> this training will coincide with the sessions, it probably won’t be possible to visit sessions while attending this training.

You can see this from the “daily grid” for Tue/Wed/Thu.

Alexander Bij
08/10/2015 5:29am EDT

When I attend this 3-day training, is there still time for visiting sessions or it is 3days fulltime.
Then I should buy a TrainingsTicket instead.

Pradipti Pal
06/03/2015 9:18am EDT

Would this be a one on one session including practicals?
Is there a verified certificate for this course?

Suresh Devanathan
06/01/2015 1:11am EDT

Can you please list any pre-requisites for this training?

Kathy Yu
05/22/2015 11:53am EDT

Hi Zhibo – this is a three-day course that runs from Tuesday-Thursday.

Zhibo Zheng
05/22/2015 10:17am EDT

Is this one-day, two-day, or three-day course?