7–9 November 2016: Conference & Tutorials
9–10 November 2016: Training
Amsterdam, The Netherlands

Elasticsearch for logs and metrics: A deep dive

Radu Gheorghe (Sematext Group), Rafał Kuć (Sematext Group)
14:40–15:20 Monday, 7/11/2016
Metrics/monitoring
Cloud, Databases
Auditorium (Ground + Balcony)
Audience level: Intermediate
Average rating: 3.00 (13 ratings)

Prerequisite knowledge

  • A basic understanding of Linux and Elasticsearch

What you'll learn

  • Understand best practices for running Elasticsearch in production for indexing metrics and logs
  • Learn how to scale the cluster, tune individual nodes for performance, and tune the pipeline that buffers and processes data on its way to Elasticsearch

Description

Doing a proof of concept with Elasticsearch and the Elastic stack is easy. Pushing the limits of its performance and scale is quite another thing. Radu Gheorghe and Rafał Kuć concentrate on the latter, discussing both the pitfalls and the best practices of using Elasticsearch for logs and metrics.

Radu and Rafał start by looking at how to scale Elasticsearch through a combination of time- and size-based indices and how to divide the cluster into tiers in order to handle potentially spiky load in real time. They'll focus largely on tuning individual nodes, covering everything from refreshes and flushes, buffers and caches, and merge policies and doc values to OS settings like the disk scheduler, SSD caching, and huge pages. Some of these settings differ between storing logs and storing metrics; Radu and Rafał explain how and why.
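As a rough illustration of combining time- and size-based indices, the sketch below picks a write index per day and starts a new one within the day once a size cap would be exceeded. The `IndexRoller` class, the naming pattern, and the 50 GB cap are assumptions made for this example, not something taken from the talk, and the sketch assumes documents arrive in day order.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class IndexRoller:
    """Hypothetical helper: picks a write index by day, rolling over on size."""
    prefix: str = "logs"
    max_bytes: int = 50 * 1024**3  # assumed size cap per index
    day: Optional[date] = None
    seq: int = 0
    size: int = 0

    def index_for(self, doc_day: date, doc_bytes: int) -> str:
        if doc_day != self.day:
            # Time-based rotation: a new day always starts a fresh index.
            self.day, self.seq, self.size = doc_day, 0, 0
        elif self.size + doc_bytes > self.max_bytes:
            # Size-based rotation: start another index for the same day.
            self.seq += 1
            self.size = 0
        self.size += doc_bytes
        return f"{self.prefix}-{self.day:%Y.%m.%d}-{self.seq:03d}"
```

With a cap like this, a quiet day produces a single index while a spiky day produces several, which keeps individual indices at a roughly predictable size.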

Radu and Rafał conclude with a look at the pipeline for getting logs to Elasticsearch and demonstrate how to make it fast and reliable: where buffers should live, which protocols to use, where heavy processing (such as parsing unstructured data) should be done, and which tools from the ecosystem can help.
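To make the buffering question concrete, here is a minimal sketch of a bounded in-memory buffer that drains into newline-delimited JSON batches in the shape the Elasticsearch bulk API expects (an action line followed by a source line). The `BulkBuffer` class, its capacity and batch size, and the drop-oldest policy are assumptions for illustration only. The action line omits `_type`, which older Elasticsearch versions require.

```python
import json
from collections import deque


class BulkBuffer:
    """Hypothetical bounded buffer that drains into bulk-style NDJSON batches."""

    def __init__(self, capacity=10000, batch_size=500):
        self.capacity = capacity
        self.batch_size = batch_size
        self.events = deque()
        self.dropped = 0

    def add(self, index, doc):
        # When full, drop the oldest event so a spike cannot exhaust memory.
        if len(self.events) >= self.capacity:
            self.events.popleft()
            self.dropped += 1
        self.events.append((index, doc))

    def next_batch(self):
        # Build one request body: an action line plus a source line per event.
        lines = []
        for _ in range(min(self.batch_size, len(self.events))):
            index, doc = self.events.popleft()
            lines.append(json.dumps({"index": {"_index": index}}))
            lines.append(json.dumps(doc))
        return "\n".join(lines) + "\n" if lines else ""
```

Dropping the oldest events trades reliability for bounded memory. A pipeline that cannot lose data would more likely block the producer or spill to disk instead.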

Radu Gheorghe

Sematext Group

Radu Gheorghe is a search consultant and software engineer at Sematext, working mainly on Elasticsearch- and logging-related projects. He is the coauthor of Elasticsearch in Action.

Rafał Kuć

Sematext Group

Rafał Kuć is a search consultant and software engineer at Sematext Group, Inc., focused mainly on Lucene, Solr, Elasticsearch, Hadoop, and Mahout. Rafał is the author of the Apache Solr Cookbook series and Elasticsearch Server. He is also a father and a cofounder of the blog solr.pl, where he tries to share his knowledge.