Presented By O'Reilly and Cloudera
December 5–6, 2016: Training
December 6–8, 2016: Tutorials & Conference
Singapore

Support digital applications with a resilient, highly available, and NRT Hadoop backend

Jorge Pablo Fernandez (Isban UK, Santander Group), Nicolette Bullivant (Santander UK Technology)
4:15pm–4:55pm Wednesday, December 7, 2016
Production-ready Hadoop
Location: Summit 2
Level: Beginner
Average rating: 3.33 (3 ratings)

Prerequisite Knowledge

  • A basic understanding of big data, the differences between Spark Streaming and Flume interceptors, HBase, APIs, and Kerberos security

What you'll learn

  • Explore the architecture and implementation of Santander's Spendlytics app

Description

Jorge Pablo Fernandez and Nicolette Bullivant explore Santander Bank’s Spendlytics app, which helps customers track their spending by offering a transaction listing, transaction aggregations, and real-time enrichment that categorizes transactions by market and brand.

Jorge and Nicolette offer an overview of Spendlytics’s architecture, focusing on how it helped reduce the cost of consultative requests. They also cover how the system was implemented as a resilient, highly available service, including the solution put in place to meet the near-real-time (NRT) challenge, the use of a Lambda architecture in a real production system, the solutions used to keep the end-to-end (E2E) path responsive, and the potential to use Kudu in the future. Along the way, they share the challenges encountered and lessons learned while implementing the app, as well as the residual issues found after the implementation and the improvements Santander is making to the platform.

Topics include:

  • The NRT Lambda architecture, using Flafka for NRT enrichment
  • Using HBase for highly demanding Internet applications
  • Future system adaptations with Lily and Solr
  • How to distribute the clusters and racks in order to create a highly resilient environment
  • Access patterns and the APIs that access the cluster
  • Syncing data into HDFS
  • Problems encountered and the proposed solutions: gaps in resilience, security, and processing speed; HFiles versus puts; Flafka issues in high-resiliency environments; and coprocessors in HBase
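
For readers new to the Lambda architecture named in the topics above, the following is a minimal, generic sketch of its serving-side idea: a batch view computed from the full history is combined at query time with a speed view holding only recent transactions. It is not Santander's implementation. The `Transaction` type, the function names, and the in-memory dictionaries (standing in for HBase tables) are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    # Hypothetical record shape, for illustration only.
    customer_id: str
    category: str
    amount_pence: int


def aggregate(transactions):
    """Sum spend per (customer_id, category) pair."""
    totals = defaultdict(int)
    for tx in transactions:
        totals[(tx.customer_id, tx.category)] += tx.amount_pence
    return dict(totals)


def merge_views(batch_view, speed_view):
    """Combine the batch view with the speed view by adding totals per key."""
    merged = dict(batch_view)
    for key, amount in speed_view.items():
        merged[key] = merged.get(key, 0) + amount
    return merged


def spend_by_category(customer_id, batch_view, speed_view):
    """Answer a query for one customer from the merged views."""
    merged = merge_views(batch_view, speed_view)
    return {
        category: amount
        for (customer, category), amount in merged.items()
        if customer == customer_id
    }


# Batch layer: recomputed periodically from the full transaction history.
batch_view = aggregate([
    Transaction("c1", "groceries", 2500),
    Transaction("c1", "travel", 1200),
    Transaction("c2", "groceries", 900),
])

# Speed layer: only transactions that arrived since the last batch run.
speed_view = aggregate([
    Transaction("c1", "groceries", 450),
    Transaction("c1", "dining", 1800),
])

print(spend_by_category("c1", batch_view, speed_view))
```

In a design like this, the speed view is discarded each time a new batch run absorbs its transactions, which is what keeps query results both fresh and eventually corrected by the batch recomputation.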

Jorge Pablo Fernandez

Isban UK, Santander Group

Jorge Pablo is the head of data Hadoop applications on the Data Innovation team at Isban UK (Santander), where he is responsible for building the development team for Hadoop and bringing new technologies and methodologies to Santander UK.

Nicolette Bullivant

Santander UK Technology

Nicolette Bullivant is the head of data engineering at Santander UK Technology. A technical manager with 18 years’ experience in the IT services industry, she previously led large-scale multilocation change projects comprising data provision, managed MI and data warehouses, ETL, system integration, and IT alignment.