



 
In retail banking, product managers must regularly optimize their consumer portfolios across products, markets, customer segments, and other dimensions against a variety of objective functions. These range from maximizing total revenue over N months across the entire portfolio at the lowest interest expense to adjusting front- and back-book pricing to meet narrowly defined regional and product-level targets. In all use cases, the unit of optimization is the most granular pricing cell where rate is a variable, and the optimization scope can easily involve hundreds of thousands of such pricing cells across multiple geographies, products, and channels. Real-world constraints on those pricing cells (such as price ordering, lock-step behavior, and “frozen” cells) complicate the problem further by making the cells interdependent.
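To make the interdependence concrete, here is a minimal Python sketch of how the three named constraint types might be checked against a proposed set of rates. Everything in it is an illustrative assumption rather than the speakers' implementation: the cell names, the use of basis points, and in particular the reading of "lock-step" as "both cells move by the same amount."

```python
def constraint_violations(current, proposed, ordering, lockstep, frozen):
    """Return the constraints a proposed rate sheet breaks.

    Rates are integers in basis points, keyed by a hypothetical cell id.
    """
    problems = []
    # Price ordering: the first cell's rate may not exceed the second's.
    for low, high in ordering:
        if proposed[low] > proposed[high]:
            problems.append(("ordering", low, high))
    # Lock-step (assumed meaning): both cells must move by the same amount.
    for a, b in lockstep:
        if proposed[a] - current[a] != proposed[b] - current[b]:
            problems.append(("lockstep", a, b))
    # Frozen cells may not change at all.
    for cell in frozen:
        if proposed[cell] != current[cell]:
            problems.append(("frozen", cell))
    return problems


current = {"cd_6m": 100, "cd_12m": 150, "savings_a": 50, "savings_b": 60}
proposed = {"cd_6m": 160, "cd_12m": 150, "savings_a": 60, "savings_b": 70}

found = constraint_violations(
    current,
    proposed,
    ordering=[("cd_6m", "cd_12m")],        # 6-month CD must not out-price 12-month CD
    lockstep=[("savings_a", "savings_b")],
    frozen=["cd_12m"],
)
```

In this toy case, raising the 6-month rate above the frozen 12-month rate breaks the ordering rule, so a change to one cell is limited by a cell that cannot move.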
Kaushik Deka and Ted Gibson share a large-scale optimization architecture in Spark for a consumer product portfolio optimization use case in retail banking. The architecture combines a simulator that distributes computation of complex real-world scenarios and a constraint optimizer that uses business rules as constraints to meet growth targets.
The team faced three challenges in building this solution: finding or creating a declarative language framework that both the Spark-based simulator and the optimizer could interpret, allowing a feedback loop between them; building a model abstraction framework to enable fast optimization of real-world simulations; and designing a distributed architecture that integrates the simulator and optimizer within an application session.
To solve the first problem, they adopted a “rules” framework to express the varied set of inputs to both the simulator and the optimizer. For example, the portfolio simulator uses bank rate rules to represent direct pricing changes, competitor rules to modify the competitive landscape, macro-related rules to account for the changing rate environment, and more. Even the optimizer's constraints and targets were designed as rules, enabling a flexible expression of business goals.
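One way to picture such a framework is as plain declarative records that two different consumers filter and interpret. The Python sketch below is only an illustration under assumed names (`Rule`, `kind`, `scope`, `params`, and the specific rule kinds are hypothetical); the actual rule language used in the talk is not described here.

```python
from dataclasses import dataclass

SIMULATOR_KINDS = {"bank_rate", "competitor", "macro"}
OPTIMIZER_KINDS = {"constraint", "target"}


@dataclass(frozen=True)
class Rule:
    kind: str     # one of the hypothetical kinds listed above
    scope: dict   # attributes a pricing cell must have for the rule to apply
    params: dict  # rule-specific payload, interpreted by the consumer


def applies_to(rule, cell):
    """A rule applies when every scope attribute matches the cell."""
    return all(cell.get(key) == value for key, value in rule.scope.items())


def rules_for(rules, cell, kinds):
    """Select the rules of the given kinds that apply to one pricing cell."""
    return [r for r in rules if r.kind in kinds and applies_to(r, cell)]


rules = [
    Rule("bank_rate", {"product": "savings"}, {"change_bps": 10}),
    Rule("competitor", {"region": "NE"}, {"change_bps": 5}),
    Rule("macro", {}, {"fed_funds_change_bps": 25}),
    Rule("constraint", {"product": "savings"}, {"max_rate_bps": 200}),
    Rule("target", {"region": "SW"}, {"balance_growth_pct": 3}),
]

cell = {"region": "NE", "product": "savings", "channel": "online"}
sim_rules = rules_for(rules, cell, SIMULATOR_KINDS)
opt_rules = rules_for(rules, cell, OPTIMIZER_KINDS)
```

Because both consumers read the same record shape, the optimizer's output can in principle be written back as new rate rules and fed to the simulator, which is the kind of feedback loop the abstract mentions.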
To solve the second problem, they first divided the simulation space into partitions defined by a set of identifiers, which allows each partition to be computed independently and in parallel by the Spark-based simulator. Second, they built partition-level “approximation models” for the metrics of interest by training machine learning models on a dataset generated from a full simulation run. The optimizer can then explore the optimization space using these approximation models, avoiding expensive full simulation runs and vastly improving optimization runtime.
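The approximation-model idea can be sketched in a few lines of plain Python. The example below makes several assumptions for illustration: `full_simulation` is a hypothetical stand-in for an expensive simulator run, the partition keys are invented, and an ordinary least-squares line stands in for whatever machine learning model the team actually trained. In the real system the per-partition work would be distributed on Spark rather than run in a local loop.

```python
def full_simulation(partition, rate):
    """Hypothetical expensive simulator: projected balance for one partition."""
    base, sensitivity = {
        "NE|savings": (100.0, 40.0),
        "SW|cd": (80.0, 25.0),
    }[partition]
    return base + sensitivity * rate


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Step 1: one full simulation run produces a training set per partition.
training_rates = [0.5, 1.0, 1.5, 2.0]
partitions = ["NE|savings", "SW|cd"]
training = {
    p: [(rate, full_simulation(p, rate)) for rate in training_rates]
    for p in partitions
}

# Step 2: fit one cheap approximation model per partition.
models = {
    p: fit_line([rate for rate, _ in rows], [bal for _, bal in rows])
    for p, rows in training.items()
}


def approx_balance(partition, rate):
    """Cheap surrogate the optimizer can call instead of the simulator."""
    slope, intercept = models[partition]
    return intercept + slope * rate


estimate = approx_balance("NE|savings", 1.25)
```

The optimizer can call `approx_balance` many thousands of times at negligible cost, reserving the full simulator for generating training data and for confirming the final answer.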
To solve the third problem, they designed a job orchestration framework that creates a simulator-generated training dataset on the cluster, builds approximation models, optimizes, resimulates, and ultimately feeds the optimization results back to the end-user application via a real-time Kafka channel.
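The sequence of stages can be sketched as a single orchestration function. This is a toy illustration, not the team's framework: every callable is a hypothetical placeholder, the "model" is a trivial lookup table, and `publish` is a plain callback standing in for a Kafka producer so that the example has no external dependencies.

```python
def run_optimization_job(simulate, fit, optimize, publish, training_rates):
    """Run the stages in order: simulate, fit, optimize, resimulate, publish."""
    # 1. Generate the training dataset with the full simulator.
    training = [(rate, simulate(rate)) for rate in training_rates]
    # 2. Build the approximation model from that dataset.
    model = fit(training)
    # 3. Optimize against the cheap model, not the simulator.
    best_rate = optimize(model)
    # 4. Resimulate the chosen rate to confirm the model's prediction.
    result = {
        "rate": best_rate,
        "predicted": model(best_rate),
        "resimulated": simulate(best_rate),
    }
    # 5. Hand the result to the messaging layer (Kafka in the talk).
    publish(result)
    return result


def toy_simulate(rate):
    """Hypothetical revenue curve that peaks at a rate of 1.5."""
    return 10.0 - (rate - 1.5) ** 2


def fit_lookup(training):
    """Trivial stand-in model: look up values seen during training."""
    table = dict(training)
    return lambda rate: table[rate]


grid = [0.5, 1.0, 1.5, 2.0]
published = []  # stands in for messages sent to the end-user application

result = run_optimization_job(
    simulate=toy_simulate,
    fit=fit_lookup,
    optimize=lambda model: max(grid, key=model),
    publish=published.append,
    training_rates=grid,
)
```

Passing the stages in as callables keeps the ordering logic separate from the cluster and messaging details, which is one plausible reason to treat orchestration as its own framework.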
 
Kaushik Deka is a partner and CTO at Novantas, where he is responsible for the technology strategy and R&D roadmap of several cloud-based platforms. He has more than 15 years’ experience leading large engineering teams to develop scalable, high-performance analytics platforms. Kaushik holds an MS in computer science from the University of Missouri, an MS in engineering from the University of Pennsylvania, and an MS in computational finance from Carnegie Mellon University.
 
Ted Gibson is a product management principal at Novantas Solutions, where he is responsible for content product management for the PriceTek suite of products, focusing on business use cases, metrics, models, and calculations for innovative new development. In his more than eight years working on PriceTek, Ted has held various roles across product management, sales, client services, and engineering and has experience in pricing for consumer deposits, home equity, mortgage, auto, and unsecured lending. He holds a BA in applied mathematics from Yale University.
©2018, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com