Sanoma has been running big data as a self-service platform for over five years, mainly so that business analysts can work directly on the source data. The road to getting business analysts to run their analyses directly on Hadoop was far from smooth. Sander Kieft explores Sanoma’s journey and shares some lessons learned along the way.
When Sanoma started with Hadoop, the company’s knowledge was limited. To kickstart the project, the decision was made to work together with a business intelligence company. But because of the consultant’s own limited experience and the lack of Hadoop support in the company’s ETL tool, the process was painfully slow. In the end, a new setup had to be created, which included switching to a real Hadoop distribution.
Not everyone is accustomed to accessing data by programming in Java or writing SQL queries, so to really make the data worthwhile, Sanoma introduced Hue. To get business analysts up to speed, the company had to design a training program consisting of two full days covering SQL basics and Hive specifics as well as an introduction to the dashboard.
The original project really gained momentum thanks to redundant hardware from a virtualization project, which allowed rapid growth with very limited investment. Of course, this came at a price: running so much end-of-support and end-of-life hardware is only possible in a colocation environment. Last year the decision was made to move to the cloud. On paper, it was an easy switch, but the reality was slightly more complicated.
Sander Kieft is the ICT architect at Sanoma Media, where he is responsible for the common services and performance-based titles within Sanoma. His team designs and builds (web) services for some of the largest websites and most popular mobile applications in the Netherlands, Belgium, and Finland. Sander has been working with large-scale data in media for 15 years and with Hadoop and big data platforms in production for nearly a decade. Previously, he was a developer, architect, and technology manager for some of the largest websites in the Netherlands.
©2017, O'Reilly Media, Inc.