As a data-driven enterprise, ING is investing heavily in big data, analytics, and stream processing. Like many other enterprises, ING deals with a large variety of data sources. Some support primary processes, while others are used to improve the quality of service and keep internal operations running smoothly. The amount of data that must be handled exceeds the computing capacity of a single machine, and vertical scalability is hardly an option.
An important building block in ING’s analytics architecture is a state-of-the-art data lake, built with Hadoop and Spark. The data lake replaces several enterprise data warehouses and is the central repository for all types of data, supporting the various kinds of queries our stakeholders demand: batch and real-time, on both large and small datasets. Key elements of ING’s data lake include RESTful APIs, secured and managed access to big data storage and processing, and real-time streaming analytics. More often than not, data is handled as streams, and ING works with Kafka and stream processing tools (Spark, Flume, and Flink) to provide faster, more reactive, and up-to-date user experiences and journeys. In addition, machine learning (MLlib, H2O.ai, Python, and R) complements traditional SQL analytics to provide better insight into operational excellence, business processes, marketing, and security applications.
Bas Geerdink shares three use cases at ING that have a streaming data source at their core—the “look ahead” feature for predicting account balances, the actionable insights engine, and the fraud detection system—and discusses their respective architectures and technology. All software is currently in production, running with modern tools such as Kafka, Cassandra, Spark, Flink, and H2O.ai.
Bas Geerdink is a programmer, scientist, and IT manager at ING, where he’s responsible for the fast data systems that process and analyze streaming data. Bas has a background in software development, design, and architecture with broad technical experience from C++ to Prolog to Scala. His academic background is in artificial intelligence and informatics. Bas’s research on reference architectures for big data solutions was published at the IEEE conference ICITST 2013. He occasionally teaches programming courses and is a regular speaker at conferences and informal meetings.
©2017, O’Reilly UK Ltd