Presented By O'Reilly and Cloudera
Make Data Work
December 1–3, 2015 • Singapore

Apache Hadoop operations for production systems

Kathleen Ting (Cloudera), Jonathan Hsieh (Cloudera, Inc), Philip Langdale (Cloudera), Kostas Sakellis (Cloudera)
1:30pm–5:00pm Tuesday, 12/01/2015
Production-ready Hadoop
Location: 321-322 Level: Intermediate
Average rating: 3.62 (8 ratings)

Prerequisite Knowledge

- Basic knowledge of Hadoop platform
- Knowledge of Linux command line
- Laptop with internet access, web browser, and ssh client

Description

Hadoop is emerging as the standard for big data processing and analytics. However, as usage of Hadoop clusters grows, so do the demands of managing and monitoring these systems.

In this tutorial, attendees will get an overview of all phases of successfully managing Hadoop clusters, with an emphasis on production systems — from installation to configuration management, service monitoring, troubleshooting, and support integration.

We will review tooling capabilities and highlight the ones that have been most helpful to users, and share some of the lessons learned and best practices from users who depend on Hadoop as a business-critical system.

Agenda Topics:

  • Installation (hardware considerations, OS prerequisites, sanity testing, security considerations)
  • Configuration (mechanics, key configurations, resource management)
  • Troubleshooting (managing, troubleshooting, and debugging Hadoop clusters and applications)
  • Enterprise considerations (scaling, logs, failure testing)

Kathleen Ting

Cloudera

Kathleen Ting is a technical account manager at Cloudera where she helps strategic customers deploy and use the Apache Hadoop ecosystem in production. She’s a frequent conference speaker, has contributed to several projects in the open source community, and is a committer and PMC member on Apache Sqoop. Kathleen is also a co-author of O’Reilly’s Apache Sqoop Cookbook.

Jonathan Hsieh

Cloudera, Inc

Jonathan Hsieh is a software engineer at Cloudera. He is an Apache HBase committer and the founder of Apache Flume.

Philip Langdale

Cloudera

Philip Langdale is the engineering lead for cloud at Cloudera. He joined the company as one of the first engineers building Cloudera Manager and served as an engineering lead for that project until moving to work on cloud products. Previously, Philip worked at VMware, developing various desktop virtualization technologies. Philip holds a bachelor’s degree with honors in electrical engineering from the University of Texas at Austin.

Kostas Sakellis

Cloudera

Kostas Sakellis is the lead and engineering manager of the Apache Spark team at Cloudera. Kostas holds a bachelor’s degree in computer science from the University of Waterloo, Canada.

Comments

Kathleen Ting
12/01/2015 10:29pm +08

@Igor – thanks for your questions, which we’ll answer during today’s tutorial.

Kathleen Ting
12/01/2015 10:27pm +08

Thanks for the interest. Our Strata Singapore slides can be found here: http://tiny.cloudera.com/hadoop-ops-singapore-slides

Igor Nikolaev
11/27/2015 1:45am +08

How to monitor and deal with filling disks?

Igor Nikolaev
11/27/2015 1:44am +08

How to deal with under-replicated and corrupted blocks?

Igor Nikolaev
11/27/2015 1:42am +08

What is a strategy for monitoring and replacing broken disks?

Igor Nikolaev
11/27/2015 1:32am +08

Is it a good idea to install HiveServer2 on all nodes?