Last month, I wrote that DevOps is the struggle to bring harmony to diverse human systems and human-designed technical systems. Recently, I had an insightful conversation with Radu Gheorghe (@radu0gheorghe) and Rafał Kuć (@kucrafal) about scaling log systems that underscored this challenge of diverse systems.
Radu and Rafał are engineers at Sematext and will be speaking at Velocity New York about how to go "from zero to production hero" doing log analysis with Elasticsearch. One of the big challenges to becoming a production hero is scaling.
"Basically, Elasticsearch was built up as a distributive system… It scales very, very well," says Rafał.
But he warns, "When you have thousands of servers and devices around, and each produces logs every second, then it starts to get harder and harder. The amount of data there starts to be enormous and even hard for Elasticsearch to handle."
He adds that the problem gets even worse when you try to run sophisticated analysis on that data.
Radu explains that there's no single solution and the easy scalability of the cloud isn't a panacea. "These are different workloads, different hardware, requiring a different sort of configuration."
The diversity and scale of log and time-series data producers only exacerbate the problem.
"You end up getting almost as much hardware for shipping the logs as for storing the logs. That is not a good thing," Radu says. "You want shipping to be very, very light."
As with most diversity problems, finding commonality, especially in communication, can pay off out of proportion to the effort.
"Log in JSON if you can," Radu says. Even with services like the Apache web server that output plain text, "you can configure the log format to something that's basically in JSON. So you don't worry later on about parsing which field is which. It's about performance, because it's much easier to parse JSON than to use regular expressions."
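To make Radu's point concrete, here is a minimal Python sketch that parses the same request twice: once from a plain-text line with a regular expression, and once from a JSON line. The sample log lines, the JSON field names, and the helper functions `parse_plain` and `parse_json` are all assumptions made up for illustration; they do not come from the interview or from any particular Apache configuration.

```python
import json
import re

# Hypothetical plain-text access log line in the Common Log Format style.
PLAIN_LINE = (
    '127.0.0.1 - - [10/Oct/2015:13:55:36 -0700] '
    '"GET /index.html HTTP/1.1" 200 2326'
)

# The same event as one JSON object per line (field names are assumptions).
JSON_LINE = (
    '{"client": "127.0.0.1", "time": "10/Oct/2015:13:55:36 -0700", '
    '"method": "GET", "path": "/index.html", "status": 200, "bytes": 2326}'
)

# The regex has to encode the position and shape of every field.
CLF_PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_plain(line):
    """Parse a plain-text line; fields are found by position via the regex."""
    match = CLF_PATTERN.match(line)
    if match is None:
        raise ValueError("line does not match the expected format")
    fields = match.groupdict()
    # Every captured group is a string, so types must be restored by hand.
    fields["status"] = int(fields["status"])
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    return fields


def parse_json(line):
    """Parse a JSON line; field names and types travel with the data."""
    return json.loads(line)


if __name__ == "__main__":
    print(parse_plain(PLAIN_LINE))
    print(parse_json(JSON_LINE))
```

Both functions return the same dictionary here, but the regex version breaks as soon as the format changes, while the JSON version only needs the producer and consumer to agree on field names.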
But Rafał points out, "You don't always control how your logs are structured. In your organization you can have not only applications, but you can have hardware that outputs log in a certain way that you can't really control. You can only consume them."
However, the difficult integrations are where Rafał finds the joy: "Choose the right hardware. Run performance tests to see if everything will be working the way you would like it to. Then, roll to production and actually start playing with it because that's where the fun starts."
Join Radu Gheorghe and Rafał Kuć's tutorial at Velocity to learn more about the challenges of scaling log analysis with Elasticsearch and, more importantly, the strategies to solve them.
©2015, O'Reilly Media, Inc.