Dealing with Uncertainty: What the Reverend Bayes can teach us.

Data Science
Location: King's Suite - Balmoral
Level: Intermediate
Average rating: 4.46 (13 ratings)

As data scientists and decision makers, we are surrounded by uncertainty: data is noisy, missing, wrong or inherently uncertain. Statistics offers a wide set of theories and tools to deal with this uncertainty, yet most people are unaware that a unifying theory of uncertainty exists. In this talk I want to introduce the audience to a branch of statistics called Bayesian reasoning, which is a unifying, consistent, logical and, most importantly, successful way of dealing with uncertainty.

Over the past two centuries there have been many proposals for dealing with uncertainty (e.g. frequentist probabilities, fuzzy logic, …). Under the influence of early 20th century statisticians, the Bayesian formalism was somewhat pushed into the background of the statistical scene. More recently, though, and partly to the credit of computer science, Bayesian thinking has seen a revival. So what, and how much, should a data scientist or decision maker know about Bayesian thinking?

My talk will consist of four parts. In the first part, I will explain the central dogma of Bayesian thinking: Bayes' rule. This simple equation (4 variables, one multiplication and one division!) describes how we should update our beliefs about the world in light of new data. I will discuss evidence from neuroscience and psychology that the brain uses Bayesian mechanisms to reason about the world. Unfortunately, the brain sometimes fails miserably at taking all the variables of Bayes' rule into account.
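
For reference, the rule can be written out as below. The symbols H (a hypothesis about the world) and D (the observed data) are notation chosen here for illustration, not taken from the talk. The four quantities are the posterior P(H | D), the likelihood P(D | H), the prior P(H) and the evidence P(D).

```latex
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}
```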

This leads to the second part of the talk, where I will illustrate Bayes' rule as a tool for decision makers to reason about uncertainty. The realistic example will be from a medical context. Imagine a test for an illness that is 99% accurate: in other words, 1 time in 100 it will fail to detect a sick patient, and 1 time in 100 it will mistakenly diagnose a healthy patient as having the illness. If the disease is very rare (e.g. 1 in 10,000), then Bayes' rule helps us calculate the chance that someone who tests positive is actually ill.
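
The arithmetic for this example can be sketched in a few lines of Python. The figures are the ones given above (the test misses 1 in 100 sick patients, wrongly flags 1 in 100 healthy patients, and the illness affects 1 in 10,000 people); the function and parameter names are made up for this illustration.

```python
def posterior_ill_given_positive(prevalence, sensitivity, false_positive_rate):
    """Bayes' rule for P(ill | positive test)."""
    # P(positive and ill): the test fires on someone who really is ill.
    true_positive = sensitivity * prevalence
    # P(positive and healthy): the test fires by mistake on a healthy person.
    false_positive = false_positive_rate * (1.0 - prevalence)
    # Divide by P(positive), the sum of the two ways to get a positive result.
    return true_positive / (true_positive + false_positive)


p = posterior_ill_given_positive(
    prevalence=1 / 10_000,     # 1 in 10,000 people have the illness
    sensitivity=0.99,          # the test detects 99 in 100 sick patients
    false_positive_rate=0.01,  # the test wrongly flags 1 in 100 healthy patients
)
print(f"P(ill | positive test) = {p:.4f}")  # prints 0.0098, i.e. about 1 in 102
```

Even with a positive result, the chance of being ill stays just under 1%, because healthy people vastly outnumber sick ones and so account for most of the positive tests.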

In the third part of the talk I will give an example of how we can build machine learning systems around Bayes' rule. The key idea here is that Bayes' rule allows us to keep track of uncertainty about the world. In this part I will illustrate one such Bayesian machine learning system in action.
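
The system shown in the talk is not described in this abstract, so the following is only a stand-in: a minimal sketch of a learner that tracks its own uncertainty, assuming a Beta prior over an unknown success rate and made-up 0/1 observations. All names and data are hypothetical.

```python
def update_beta(alpha, beta, observations):
    """Return the Beta posterior parameters after observing 0/1 outcomes."""
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures


def beta_mean_and_std(alpha, beta):
    """Mean and standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total ** 2 * (total + 1))
    return mean, variance ** 0.5


# Beta(1, 1) is a uniform prior: every success rate is equally plausible.
alpha, beta = 1.0, 1.0
outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]  # made-up data, 1 = success

history = []
for start in range(0, len(outcomes), 5):
    # Apply Bayes' rule to each new batch of five observations.
    alpha, beta = update_beta(alpha, beta, outcomes[start:start + 5])
    mean, std = beta_mean_and_std(alpha, beta)
    history.append((mean, std))
    print(f"after {start + 5} observations: rate = {mean:.3f} +/- {std:.3f}")
```

The estimate comes with a spread that narrows as data arrives, which is the sense in which the system keeps track of how uncertain it still is.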

In the final part of the talk I will introduce the concept of “Probabilistic Programming”. Probabilistic programming is a new, still embryonic programming paradigm that introduces “uncertain variables” as first-class citizens of a programming language and then uses Bayes' rule to execute the programs. Although it is very early days for the probabilistic programming community, there are many probabilistic programming languages to choose from: e.g. Church, Infer.NET, Stan, … I will give a very brief demonstration of how to write a probabilistic program.
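
To give a flavour of the idea without assuming the syntax of any of the languages named above, here is a minimal sketch in plain Python. It is not the demonstration from the talk: the coin model, the observation of 8 heads in 10 flips, and the use of simple rejection sampling as the inference method are all assumptions made for illustration.

```python
import random


def model(rng):
    """A tiny 'probabilistic program': flip a coin of unknown bias 10 times."""
    bias = rng.random()  # uncertain variable with a uniform prior on [0, 1)
    heads = sum(rng.random() < bias for _ in range(10))
    return bias, heads


def infer_bias(observed_heads, num_samples=50_000, seed=0):
    """Rejection sampling: run the program many times and keep only the
    runs whose simulated data matches what was actually observed."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(num_samples):
        bias, heads = model(rng)
        if heads == observed_heads:
            accepted.append(bias)
    return accepted


samples = infer_bias(observed_heads=8)
posterior_mean = sum(samples) / len(samples)
print(f"kept {len(samples)} runs, posterior mean bias = {posterior_mean:.3f}")
```

The kept values of `bias` approximate the posterior distribution. With a uniform prior and 8 heads in 10 flips, the exact posterior is Beta(9, 3) with mean 0.75, so the printed estimate should land close to that. Real probabilistic programming systems use far more efficient inference than this brute-force approach.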

At the main machine learning conferences of the last few years, the Bayesian framework has been prominent. In this talk I want to help the audience understand how the Bayesian framework can help them in their data mining and decision making processes. If people leave the talk thinking Bayes' rule is the E=mc^2 of data science, I will consider the presentation a success.

Jurgen Van Gael

Rangespan, Ltd

Jurgen Van Gael is the Data Science Director at Rangespan, an e-commerce and data services startup based in London. Previously, Jurgen held a position as an applied researcher at Microsoft Research. Jurgen holds a PhD in machine learning from the University of Cambridge, an MSc in Computer Science from the University of Wisconsin, Madison and a Ma. in Informatics from the University of Leuven.

Comments

Jurgen Van Gael
15/11/2013 13:36 GMT

The slides are now up on http://www.slideshare.net/OReillyStrata/final-28280777.

Enjoy!

Lorea Arrizabalaga
13/11/2013 10:39 GMT

Jurgen, thanks for an excellent presentation. Any chance you could share the slides? Many thanks Lorea
