Causal inference 101: Answering the crucial ‘why’ in your analysis.
Who is this presentation for?
• Any practitioner of data science: data scientists, decision scientists, data analysts, and data science managers.
Causal questions are ubiquitous in data science. For example, questions such as whether changing a feature on a website led to more traffic, or whether digital ad exposure led to incremental purchases, are deeply rooted in causality.
Randomized tests are considered the gold standard for estimating causal effects. However, in many cases experiments are infeasible or unethical, and one has to rely on observational (non-experimental) data to derive causal insights. The crucial difference between the two is that in a randomized experiment, test subjects (e.g. customers) are randomly assigned to a treatment (e.g. digital advertisement exposure). Random assignment curbs the possibility that user response (e.g. clicking on a link in the ad and purchasing the product) differs between the treated and non-treated groups because of pre-existing differences in user characteristics (e.g. demographics, geo-location). In essence, we can then attribute divergences in key outcomes (e.g. purchase rate) observed after treatment to the causal impact of the treatment.
With observational data, however, this randomized treatment assignment mechanism, which is what makes causal attribution possible, is absent. Thankfully, there are statistical methods that help us work around this shortcoming and get to causal reads.
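The contrast between the two settings can be illustrated with a small simulation. This is a minimal sketch under invented assumptions, not data or code from the talk: the `loyal` flag, the exposure probabilities, and the 0.05 true effect are all hypothetical. It compares the naive treated-minus-untreated purchase rate gap when ad exposure is randomized against the same gap when loyal customers are more likely to be exposed.

```python
import random
from statistics import mean

TRUE_EFFECT = 0.05  # assumed lift in purchase probability from ad exposure


def naive_difference(n, randomized, seed=0):
    """Simulate n customers; return treated minus untreated purchase rate."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        # Hypothetical pre-existing characteristic that also drives purchases.
        loyal = rng.random() < 0.5
        if randomized:
            # Experiment: exposure is a coin flip, independent of loyalty.
            exposed = rng.random() < 0.5
        else:
            # Observational: loyal customers are more likely to see the ad.
            exposed = rng.random() < (0.8 if loyal else 0.2)
        p_buy = 0.10 + (0.20 if loyal else 0.0) + (TRUE_EFFECT if exposed else 0.0)
        bought = 1 if rng.random() < p_buy else 0
        (treated if exposed else untreated).append(bought)
    return mean(treated) - mean(untreated)


naive_rct = naive_difference(200_000, randomized=True)
naive_obs = naive_difference(200_000, randomized=False)
print(f"randomized:    {naive_rct:.3f}")
print(f"observational: {naive_obs:.3f}")
```

Under these assumptions, the randomized gap should land close to the assumed effect, while the observational gap should be noticeably larger because it also absorbs the loyalty difference between the two groups.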
The aim of this talk is to offer a practical overview of the above aspects of causal inference. Topics include:
− The fundamental tenets of causality and measuring causal effects.
− Challenges involved in measuring causal effects in real world situations.
− Distinguishing between randomized and observational approaches to measuring causal effects.
− An introduction to measuring causal effects from observational data using matching and its extension, propensity score matching, with a focus on a) the intuition and statistics behind it, b) tips from the trenches, based on the speaker's experience with these techniques, and c) practical limitations of such approaches.
− A walk-through of an example of how matching was applied to get causal insights into the effectiveness of a digital product at Walmart.
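As a rough illustration of the matching idea in the topics above, here is a minimal sketch on simulated data. It is not the speaker's method or the Walmart analysis: the engagement score `x`, the treatment and purchase probabilities, and the 0.05 true effect are all hypothetical. Each treated unit is paired with the untreated unit whose covariate value is closest (nearest-neighbor matching with replacement), and the outcome differences are averaged. With many covariates, propensity score matching would apply the same pairing to an estimated probability of treatment instead of to `x` directly.

```python
import bisect
import random
from statistics import mean

rng = random.Random(42)
TRUE_EFFECT = 0.05  # assumed treatment effect used to generate the data

# Simulate observational data: treatment is more likely at high engagement.
units = []
for _ in range(100_000):
    x = rng.random()                        # hypothetical engagement score
    treated = rng.random() < 0.2 + 0.6 * x  # non-random assignment
    p_buy = 0.10 + 0.30 * x + (TRUE_EFFECT if treated else 0.0)
    outcome = 1 if rng.random() < p_buy else 0
    units.append((x, treated, outcome))

treated_units = [(x, y) for x, t, y in units if t]
control_units = sorted((x, y) for x, t, y in units if not t)
control_x = [x for x, _ in control_units]


def nearest_control(x):
    """Return the (x, outcome) of the control unit closest to x."""
    i = bisect.bisect_left(control_x, x)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(control_x)]
    best = min(candidates, key=lambda j: abs(control_x[j] - x))
    return control_units[best]


# Naive comparison ignores that treated units have higher engagement.
naive = mean(y for _, y in treated_units) - mean(y for _, y in control_units)

# Matched estimate: effect on the treated, comparing like with like.
matched = mean(y - nearest_control(x)[1] for x, y in treated_units)

print(f"naive difference:   {naive:.3f}")
print(f"matched difference: {matched:.3f}")
```

The sketch relies on two things that real observational data may not offer: every treated unit has comparable untreated units (overlap), and the one observed covariate captures everything that drives both treatment and outcome. These are the kinds of practical limitations the last topic refers to.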
Prerequisite knowledge
• A basic understanding of machine statistics and data science.
What you'll learn
Subhasish Misra is currently a Data Scientist at Walmart Labs, where he leads efforts to create scalable machine learning solutions for Walmart’s customer base. He is also a member of the global data science board at i-com, a cross-industry global think tank on harnessing data & analytics for better marketing.
Subhasish has previously worked at Hewlett Packard Co, WPP & Aon, and has consulted for many Fortune 500 clients across multiple geographies over his 12-year career in advanced analytics.
His expertise spans a wide spectrum of marketing analytics, and his current data science interests center on modeling customer behavior & causal inference.
Subhasish holds an M.A. in Economics from the Delhi School of Economics, where econometrics was one of his focus areas.