Presented By O'Reilly and Cloudera
Make Data Work
March 13–14, 2017: Training
March 14–16, 2017: Tutorials & Conference
San Jose, CA

Making architecture choices for small and big data problems

Nischal HP (Unnati Data Labs), Raghotham Sripadraj (Ericsson)
4:20pm–5:00pm Thursday, March 16, 2017
Data engineering and architecture
Location: LL20 A
Level: Intermediate
Secondary topics: Architecture
Average rating: 4.67 (3 ratings)

Who is this presentation for?

  • Data architects, data engineers, software engineers, and data scientists

Prerequisite knowledge

  • A basic understanding of data science, machine learning, and software engineering

What you'll learn

  • Learn lessons from building end-to-end data science systems
  • Understand the importance of adopting software engineering practices

Description

Enterprises want to be data driven from the very beginning or want to join the race for data supremacy. Being data driven requires the system to store and process every single transaction and interaction the customer makes with the product, thus enabling the business to make better decisions.

But storing, processing, and analyzing data comes with a cost. This cost is distributed across the choice of technology, infrastructure, and go-to-market strategy.

Nischal HP and Raghotham Sripadraj share their experience building data science platforms for various enterprises, with an emphasis on making the right architecture choices for things such as databases, queues, caching mechanisms, distribution of the workload, underlying technology for machine learning and predictive models, visualization, and prototyping. Nischal and Raghotham stress the importance of using distributed and fault-tolerant tools, which themselves come with the cost of managing the infrastructure (including, by implication, a dedicated team to monitor it). However, with small data, simple tools take you a long way.

Many things can go unnoticed in building an end-to-end data science system, like the importance of logging, building a data pipeline that sends notifications over the appropriate communication channel, exposing data science as a service via APIs, or A/B testing data-science-backed feature releases when required. Only when the data science solution is in production does it power the organization the right way.

When building data science products, you should live by the motto “fail fast.” Nischal and Raghotham themselves have failed fast when making these choices, but in time they came to understand that adopting the latest and coolest technology on the planet just for the sake of it is not the right thing to do.

Nischal HP

Unnati Data Labs

Nischal HP is a cofounder and data scientist at Unnati Data Labs, where he is building end-to-end data science systems in the fields of fintech, marketing analytics, and event management. Nischal is also a mentor for data science on Springboard. Previously, during his tenure at Redmart, he built from scratch various ecommerce systems for catalog management, recommendation engines, and sentiment analyzers. At SAP Labs, he built various data crawlers and intention mining systems and laid down initial work on an end-to-end text mining and analysis pipeline. The majority of his work, however, was centered on building gamification of technical indicators for algorithmic trading platforms. Nischal has conducted workshops in the field of deep learning across the world and has spoken at a number of data science conferences. He is a strong believer in open source and loves to architect big, fast, and reliable systems. In his free time, he enjoys music, traveling, and meeting new people.

Raghotham Sripadraj

Ericsson

Raghotham Sripadraj is a senior data scientist at Ericsson. Raghotham is also a mentor for data science on Springboard. Previously, he headed the data science team at Treebo Hotels and was cofounder and data scientist at Unnati Data Labs, where he built end-to-end data science systems in the fields of fintech, marketing analytics, and event management. Before that, at Touchpoints Inc., he single-handedly built a data analytics platform for a fitness wearable company, and at SAP Labs, he was a core part of what is currently SAP’s framework for building web and mobile products, as well as a part of multiple company-wide events helping to spread knowledge both internally and to customers. Drawing on his deep love for data science and neural networks and his passion for teaching, Raghotham has conducted workshops across the world and given talks at a number of data science conferences. Apart from getting his hands dirty with data, he loves traveling, Pink Floyd, and masala dosas.