{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Priors and Posteriors" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Import some helper functions (please ignore this!)\n", "from utils import *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Context:** If there's one thing we learned from the chapter on model selection and evaluation is that we should not blindly trust our models. Models are complicated and require a robust and diverse toolkit for responsible evaluation in their intended context. For safety-critical applications of ML, like the ones from the IHH, we must take additional precautions to ensure responsible use. We therefore adopt the following philosophy:\n", "1. **Finite information $\\rightarrow$ uncertainty.** We're often asked to make decisions without all the information necessary for certainty. We ask the same of our models: given a finite data set and an incomplete understanding of the phenomenon we're modeling, we ask models to make predictions for data they have never encountered. Therefore, for responsible use in safety-critical contexts, our models must have some way of quantifying the limits of their \"knowledge.\"\n", "2. **Not making choices $\\rightarrow$ a choice will be made for you.** If we avoid making explicit choices in the design of our model, a choice will still be made for us---and it might not be the choice we want. For example, without explicitly choosing what's important to us, we might get a model with the highest accuracy for a task for which minimizing false negatives is most important. *It's therefore better to make your choices explicitly.* Making assumptions explicit is especially important for uncertainty quantification. \n", "\n", "**Challenge:** To satisfy our new modeling philosophy, we need (1) a way to quantify uncertainty, and (2) a way to understand how uncertainty depends on our modeling choices. How can we do that with the tools we have? As we show here, we can't. We will then expand our DGM to create models that quantify uncertainty (Bayesian models) and introduce a new way of fitting ML models called Bayesian inference.\n", "\n", "**Outline:** \n", "* Motivate the need for uncertainty\n", "* Introduce a new modeling paradigm based on Bayes' rule\n", "* Provide intuition for this modeling paradigm\n", "* Implement this modeling paradigm in `NumPyro`\n", "* Gain intuition how different models have different uncertainty" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Data:** To help make the concepts concrete, we'll return to our regression data, in which we wanted to predict telekinetic ability from age. Let's load the data in:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | Age | \n", "Glow | \n", "Telekinetic-Ability | \n", "
---|---|---|---|
Patient ID | \n", "\n", " | \n", " | \n", " |
90 | \n", "30.607729 | \n", "0.604085 | \n", "-0.020933 | \n", "
254 | \n", "38.531357 | \n", "0.613645 | \n", "-0.070165 | \n", "
283 | \n", "21.879414 | \n", "0.829212 | \n", "0.140791 | \n", "
445 | \n", "2.949004 | \n", "0.981120 | \n", "0.261027 | \n", "
461 | \n", "30.237446 | \n", "0.688329 | \n", "-0.027250 | \n", "
15 | \n", "29.562483 | \n", "0.796853 | \n", "-0.033701 | \n", "
316 | \n", "15.283975 | \n", "0.839546 | \n", "0.344510 | \n", "
489 | \n", "2.688488 | \n", "0.929422 | \n", "0.268031 | \n", "
159 | \n", "4.129371 | \n", "0.893813 | \n", "0.422464 | \n", "
153 | \n", "15.194182 | \n", "0.832483 | \n", "0.375658 | \n", "
241 | \n", "33.391247 | \n", "0.676760 | \n", "-0.028127 | \n", "
250 | \n", "32.363740 | \n", "0.711121 | \n", "-0.078376 | \n", "
390 | \n", "20.699366 | \n", "0.683075 | \n", "0.176542 | \n", "
289 | \n", "51.370230 | \n", "0.472696 | \n", "-0.153246 | \n", "
171 | \n", "24.983784 | \n", "0.703657 | \n", "0.028212 | \n", "
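As a brief preview of the Bayes'-rule paradigm named in the outline above (the notation here is ours; the notebook introduces its own symbols later), Bayes' rule combines a prior $p(\theta)$ over model parameters $\theta$ with a likelihood $p(\mathcal{D} \mid \theta)$ of the observed data $\mathcal{D}$ to produce a posterior:

$$
p(\theta \mid \mathcal{D}) = \frac{p(\mathcal{D} \mid \theta) \, p(\theta)}{p(\mathcal{D})}.
$$

The prior is where we make our modeling choices explicit, and the spread of the posterior is what quantifies the model's remaining uncertainty after seeing the data.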