Enterprises want either to be data driven from the very beginning or to join the race for data supremacy. Being data driven requires storing and processing every transaction and interaction a customer has with the product, which enables the business to make better decisions.
But storing, processing, and analyzing data comes with a cost. This cost is distributed across the choice of technology, infrastructure, and go-to-market strategy.
Nischal HP and Raghotham Sripadraj share their experience building data science platforms for various enterprises, with an emphasis on making the right architecture choices for things such as databases, queues, caching mechanisms, distribution of the workload, underlying technology for machine learning and predictive models, visualization, and prototyping. Nischal and Raghotham stress the importance of using distributed and fault-tolerant tools, which themselves come with the cost of managing the infrastructure (including, by implication, a dedicated team to monitor the infra). However, with small data, simple tools take you a long way.
Many things can go unnoticed in building an end-to-end data science system, like the importance of logging, building a data pipeline that sends notifications to the required medium of communication, exposing data science as a service via APIs, or A/B testing for data science-backed feature releases when required. Only when the data science solution is in production does it power the organization the right way.
When building data science products you should live by the motto “fail fast.” Nischal and Raghotham themselves have failed fast when making these choices, but in time they came to understand that adopting the latest and the coolest technology on the planet just for the sake of it is not the right thing to do.
Nischal HP is the VP of engineering and data science at omni:us, a Berlin-based AI startup focused on bringing AI into the insurance world for claims processing. Nischal is also a mentor for the data science career track on Springboard.
Previously, he had the privilege of working on ecommerce systems for catalog management and building data science systems in the domains of fintech, marketing analytics, and event management, as well as recommendation engines, algorithmic trading, and gamification of trading indicators.
Nischal has conducted workshops in the field of deep learning across the world and has spoken at a number of data science conferences. He is a strong believer in open source and loves to architect reliable systems. In his free time, he enjoys music, traveling, and spending time with his SO.
Raghotham Sripadraj is a senior data scientist at Ericsson. Raghotham is also a mentor for data science on Springboard. Previously, he headed the data science team at Treebo Hotels and was cofounder and data scientist at Unnati Data Labs, where he built end-to-end data science systems in the fields of fintech, marketing analytics, and event management. Before that, at Touchpoints Inc., he single-handedly built a data analytics platform for a fitness wearable company, and at SAP Labs, he was a core part of what is currently SAP’s framework for building web and mobile products, as well as a part of multiple company-wide events helping to spread knowledge both internally and to customers. Drawing on his deep love for data science and neural networks and his passion for teaching, Raghotham has conducted workshops across the world and given talks at a number of data science conferences. Apart from getting his hands dirty with data, he loves traveling, Pink Floyd, and masala dosas.
©2017, O'Reilly Media, Inc.