{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Model Selection & Evaluation" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Import some helper functions (please ignore this!)\n", "from utils import *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Context:** At this point, we have a general framework for developing probabilistic models, as well as one way of fitting them to data: maximum likelihood estimation (MLE). We then instantiated this framework to develop two types of predictive models: regression and classification. We further learned how to make these models more expressive using neural networks, a tool from deep learning. Are we finally ready to apply these models to real-life tasks? \n", "\n", "Unfortunately, there's one key piece we're still missing: so far, we've only used 1- and 2-dimensional input data and 1-dimensional output data. While in principle we already have the tools to implement predictive models for higher-dimensional data, we don't yet have the tools to *evaluate* them. We've purposefully worked with lower-dimensional data because it is easy to visualize, and therefore easy to evaluate qualitatively. But as data becomes higher-dimensional, it becomes much harder to build intuition from visualizations. As a result, we will have to rely on *metrics*. \n", "\n", "**Challenge:** There are many ways of measuring model performance. Which metrics should we use? What are the pros and cons of each metric? We will answer these questions here. Even though our motivation for developing evaluation metrics is our inability to visualize high-dimensional data, we will in fact focus on low-dimensional data again. This is because we need to gain intuition about each metric we introduce, and low-dimensional data lets us compare what a metric reports against what we can see in a plot. \n", "\n", "**Outline:**\n", "* What's underfitting/overfitting? 
How do we prevent it?\n", "* Introduce log-likelihood\n", "* Introduce metrics specific to regression\n", "* Introduce metrics specific to classification" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Data.** You've started a collaboration with doctors from IHH's Center for Rare Disorders. The doctors are interested in better understanding Antenna Inflammation, a condition in which a being's antennas sporadically inflame for weeks at a time. This causes them to malfunction and is quite painful. Because the disease is so rare, it's been difficult to gather enough information to develop a treatment. Currently, the only known treatment is to expose the inflamed antennas to high-energy space beams. While these beams do not cure the disease, they do alleviate the pain. The doctors want to understand what beam intensity to use; they suspect that a low-intensity beam might not help much, but that too high an intensity might itself contribute to pain. They have collected data for you to analyze. Let's take a look:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| Patient ID | Intensity | Comfort |\n", "
|---|---|---|\n", "
| 45 | 0.308006 | 0.977233 |\n", "
| 59 | 0.502982 | 1.359451 |\n", "
| 7 | 0.125683 | 0.405900 |\n", "
| 50 | 0.520344 | 1.185005 |\n", "
| 92 | 0.404359 | 0.955264 |\n", "
| 27 | 0.471006 | 1.138226 |\n", "
| 131 | 0.767512 | 1.025494 |\n", "
| 137 | 0.534193 | 0.828949 |\n", "
| 122 | 0.661706 | 1.179636 |\n", "
| 8 | 0.183754 | 0.742592 |\n", "
| 111 | 0.378024 | 0.670647 |\n", "
| 16 | 0.416132 | 0.664669 |\n", "
| 63 | 0.239045 | 0.871169 |\n", "
| 76 | 0.449077 | 0.933577 |\n", "
| 123 | 0.216425 | 0.722563 |\n", "