Presented By O'Reilly and Cloudera
Make Data Work
December 1–3, 2015 • Singapore

Patterns and paradigms: Managing semi-structured data with high velocity change for large scale e-commerce

Utkarsh B (Flipkart), Vinod Venkatraman (Flipkart Internet Private Limited)
11:50am–12:30pm Thursday, 12/03/2015
Hadoop & Beyond
Location: 328-329
Level: Intermediate
Tags: commerce
Average rating: 4.00 (2 ratings)

Prerequisite Knowledge

Brief understanding of HBase and Elasticsearch.


In this talk, we share our experience of developing “Hoodoo”, an in-house solution at Flipkart for managing the enormous catalog of the marketplace. We’ll cover the paradigms and patterns that evolved over the lifecycle of the solution.

Hoodoo is a generic, distributed, and elastic data store abstraction for managing semi-structured data whose semantics and structural definition change at high velocity. It models and manages such functional data using the primitive concepts of entities and relationships (E-R modelling). Hoodoo unifies data access patterns in its APIs (id-based access, parametrized queries, search, and so on) and provides tuneable consistency levels for stored data.
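
To make the entity-relationship idea concrete, here is a minimal in-memory sketch of a store that exposes the three access patterns named above behind one interface. It is an illustration only, not Hoodoo's API: the names `Entity`, `EntityStore`, `put`, `relate`, `get`, `query`, `search`, and `related` are all hypothetical, and the sample catalog data is invented.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set, Tuple


@dataclass
class Entity:
    entity_id: str
    entity_type: str
    # Free-form properties: no fixed schema, so attributes can vary per entity.
    properties: Dict[str, Any] = field(default_factory=dict)


class EntityStore:
    """Hypothetical in-memory stand-in for an entity-relationship data store."""

    def __init__(self) -> None:
        self._entities: Dict[str, Entity] = {}
        # Relationships kept as (source_id, relation_name, target_id) triples.
        self._relations: Set[Tuple[str, str, str]] = set()

    def put(self, entity: Entity) -> None:
        self._entities[entity.entity_id] = entity

    def relate(self, source_id: str, relation: str, target_id: str) -> None:
        self._relations.add((source_id, relation, target_id))

    def get(self, entity_id: str) -> Entity:
        """Id-based access."""
        return self._entities[entity_id]

    def query(self, **criteria: Any) -> List[Entity]:
        """Parametrized query: exact match on every given property."""
        return [e for e in self._entities.values()
                if all(e.properties.get(k) == v for k, v in criteria.items())]

    def search(self, text: str) -> List[Entity]:
        """Naive search: case-insensitive substring match on string properties."""
        needle = text.lower()
        return [e for e in self._entities.values()
                if any(isinstance(v, str) and needle in v.lower()
                       for v in e.properties.values())]

    def related(self, source_id: str, relation: str) -> List[Entity]:
        """Follow one relationship type outward from an entity."""
        return [self._entities[t] for s, r, t in sorted(self._relations)
                if s == source_id and r == relation]


# Invented sample data: two products with different property sets, one listing.
store = EntityStore()
store.put(Entity("p1", "product", {"title": "Blue Cotton Shirt", "brand": "Acme"}))
store.put(Entity("p2", "product", {"title": "Red Cotton Shirt", "brand": "Acme", "sleeve": "full"}))
store.put(Entity("l1", "listing", {"seller": "s42", "price": 499}))
store.relate("l1", "lists", "p1")
```

Because properties live in a plain dictionary, `p2` can carry a `sleeve` attribute that `p1` lacks without any schema change, which is the kind of elasticity the abstract describes.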

Functional data can often be non-trivial to manage and serve, especially when it is constantly evolving. As an example, consider catalog data for a retail marketplace like Flipkart.

  • The metadata for a catalog entry is dynamic in nature (elasticity)
  • Catalog entries share meaningful associations that may be transient or static over time (flexibility)
  • The same data must be viewed through multiple lenses (semantic relevance)
  • The flux of change is large (variability)
  • All of this must be managed for a catalog whose data size is 3 billion and growing

Hoodoo uses the following patterns, techniques, and technologies:

  • HBase to store entities and Elasticsearch to index entity properties, enabling search as well as optimized id-based look-ups
  • Eventual consistency between the data stores, using techniques like write-ahead logs that are then applied reliably
  • Multi-tenancy and tuneable consistency schemes, while serving data with low latencies at scale
  • Timestamp-consistent data views of entities and their associations
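
The write-ahead-log and timestamped-view ideas in this list can be sketched in a few lines. The example below is a toy under stated assumptions, not Hoodoo's implementation: plain dictionaries stand in for HBase and Elasticsearch, the class `DualStore` and its methods are hypothetical, writes are assumed to arrive in increasing timestamp order, and property values are assumed to be hashable.

```python
from typing import Any, Dict, List, Optional, Set, Tuple


class DualStore:
    """Hypothetical sketch: dicts stand in for the entity store and the index."""

    def __init__(self) -> None:
        # Append-only log of (timestamp, entity_id, properties) records.
        self.wal: List[Tuple[int, str, Dict[str, Any]]] = []
        self.applied = 0  # number of log records already applied to the stores
        # entity_id -> list of (timestamp, properties) versions, oldest first.
        self.primary: Dict[str, List[Tuple[int, Dict[str, Any]]]] = {}
        # (property, value) -> ids of entities whose latest version matches.
        self.index: Dict[Tuple[str, Any], Set[str]] = {}

    def write(self, ts: int, entity_id: str, props: Dict[str, Any]) -> None:
        """Record the write in the log only; the stores catch up later."""
        self.wal.append((ts, entity_id, dict(props)))

    def apply_pending(self) -> int:
        """Replay unapplied log records, in order, into both stores."""
        count = 0
        for ts, entity_id, props in self.wal[self.applied:]:
            versions = self.primary.setdefault(entity_id, [])
            if versions:
                # Drop index entries that pointed at the previous version.
                for item in versions[-1][1].items():
                    self.index.get(item, set()).discard(entity_id)
            versions.append((ts, props))
            for item in props.items():
                self.index.setdefault(item, set()).add(entity_id)
            self.applied += 1
            count += 1
        return count

    def get_as_of(self, entity_id: str, ts: int) -> Optional[Dict[str, Any]]:
        """Timestamp-consistent read: latest version written at or before ts."""
        candidates = [p for t, p in self.primary.get(entity_id, []) if t <= ts]
        return candidates[-1] if candidates else None

    def find(self, prop: str, value: Any) -> List[str]:
        """Index lookup against the latest applied state."""
        return sorted(self.index.get((prop, value), set()))


store = DualStore()
store.write(100, "p1", {"brand": "Acme", "color": "blue"})
store.write(200, "p1", {"brand": "Acme", "color": "red"})
before = store.find("color", "red")  # log not applied yet, so the index lags
store.apply_pending()
after = store.find("color", "red")
```

Here `before` is empty and `after` contains `p1`: the index lags the log until the pending records are replayed, which is the window that eventual consistency allows. A read such as `store.get_as_of("p1", 150)` still returns the older blue version, showing how a timestamp can pin a consistent view. A production system would also need a durable log, retries, and idempotent application, none of which are modelled here.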

Utkarsh B


Utkarsh B. is the technology advisor to the CEO, a distinguished architect, and a senior principal architect at Flipkart. He’s been driving architectural blueprints and coherence across diverse platforms in Flipkart through multiple generations of their evolution and leveraging technology to solve for scale, resilience, business continuity, and disaster recovery. He has extensive experience (18+ years) in building platforms across a wide spectrum of technical and functional problem domains.

Vinod Venkatraman

Flipkart Internet Private Limited

Vinod Venkatraman is currently helping build the marketplace technology platform at Flipkart. Vinod’s specialties are Core Java, Java concurrency, SOA, JMS, web services, JTA, web development, Spring, Struts, JDBC, SQL, JavaScript, Ajax, Ext-JS, DB schema design, and performance optimization.