12. The Ethics of Learning from Data#
Context: In the past few chapters, we’ve developed the machinery necessary to fit ML models to data. We can then use these fitted models for a variety of tasks, such as answering scientific questions and making predictions about the future. With this new power comes a new responsibility: we must understand the limitations of what our models can learn from data. By this, we don’t mean what types of data sets a model can successfully capture; we mean what conclusions we can ethically and responsibly draw about people and society, and how those conclusions should inform social change.
Challenge: As before, nothing in the technical material we’ve presented so far can point us in the right direction. We must turn to colleagues who study the broader sociotechnical system in which we live to help us interrogate our practice.
Outline: There are three questions we will focus on.
What conclusions can ML draw from data?
Is science objective?
Are data-driven systems objective?
All three questions are fundamentally concerned with the ethics of applying insights derived from our models to the broader society.
12.1. What conclusions can ML draw from data?#
Exercise: Generalizability of ML Systems
Read The myth of generalisability in clinical research and machine learning in health care.
What is generalizability? And what are the different “levels” of generalizability?
Why might an ML system struggle to generalize? (Hint: see the green panel; a toy simulation of such a failure follows this list.)
In science, why do we value generalizability?
What’s the argument in favor of geographic generalizability?
What’s the argument against geographic generalizability?
Do you consider generalizability a component of fairness when evaluating ML systems? Why or why not?
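To make the failure to generalize concrete, here is a minimal synthetic sketch, assuming NumPy and scikit-learn are available; the two “hospitals” and the shift between them are invented for illustration. A model fit at one site performs well there but drops to near chance at a site whose data distribution differs, which is one way geographic generalizability can break down.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Hypothetical "Hospital A": the outcome flips around a feature value of 0.
x_a = rng.normal(loc=0.0, scale=1.0, size=(n, 1))
y_a = (x_a[:, 0] + rng.normal(scale=0.5, size=n) > 0).astype(int)

# Hypothetical "Hospital B": same feature, but the relationship is shifted,
# so the decision boundary learned at A no longer applies.
x_b = rng.normal(loc=2.0, scale=1.0, size=(n, 1))
y_b = (x_b[:, 0] - 2.0 + rng.normal(scale=0.5, size=n) > 0).astype(int)

model = LogisticRegression().fit(x_a, y_a)
print(f"Accuracy at Hospital A: {model.score(x_a, y_a):.2f}")  # high
print(f"Accuracy at Hospital B: {model.score(x_b, y_b):.2f}")  # near chance
```

Nothing about the fitting procedure is broken here; the data-generating process itself changed between sites, and no amount of optimization at Hospital A can anticipate that.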
12.2. Is science objective?#
Exercise: Objectivity of Science
Part 1: Read 5 things journalists should know about statistical significance in research.
What is statistical significance testing?
What are p-values?
What are the scientific criticisms of statistical significance testing? (The simulation sketch after this list illustrates one source of them.)
P-values are a human-made tool that has found its way into scientific standards, and like other standards, they are debated and renegotiated over time. How can such standards shape scientific discourse? Using p-values as an example, consider: What types of scientific discoveries gain public visibility? What kinds of scientific questions are more likely to be studied and funded? What kinds of scientists are rewarded with more power and influence?
How can we tell whether the standards used in ML promote good scientific practice? For example, should we use the MLE?
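To ground the questions above, here is a minimal simulation of p-values, assuming NumPy and SciPy are available. It repeatedly runs a two-sample t-test on groups drawn from the same distribution, so the null hypothesis is true in every experiment and any “significant” result is a false positive; by construction, about 5% of experiments clear the conventional p < 0.05 bar.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_experiments, n_per_group = 10_000, 30

false_positives = 0
for _ in range(n_experiments):
    # Both groups come from the same distribution: the null hypothesis holds.
    a = rng.normal(size=n_per_group)
    b = rng.normal(size=n_per_group)
    _, p_value = stats.ttest_ind(a, b)
    if p_value < 0.05:
        false_positives += 1

# Roughly 0.05: the threshold itself fixes this false-positive rate.
print(f"Fraction 'significant' under the null: {false_positives / n_experiments:.3f}")
```

If journals and funders reward only results that clear this threshold, the small fraction of null experiments that clear it by chance are exactly the ones that become visible, which is one mechanism behind the criticisms discussed in the reading.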
Part 2: Read Unpacking the flawed science cited in the Texas abortion pill ruling.
What was scientifically flawed about the arguments made by the plaintiff?
Science is built on individuals posing hypotheses they believe are likely to be true. In this article, for example, scientific arguments were made in support of both sides of a legal case. What does this reveal about our scientific ethos, the belief that science is a method for objectively investigating the world?
Reflecting on your own identity, values, and lived experience, where do you think your biases might creep into your scientific process?
If you wanted to use science to inform important social decisions, how could you do so responsibly?
12.3. Are data-driven systems objective?#
Exercise: Objectivity of ML Systems
Part 1: Read AI Myths: AI can be objective/unbiased.
What types of AI bias does the author describe? (A toy sketch after this list reproduces one of them.)
What are the barriers to “fixing” AI bias?
What are the criticisms of framing AI harms as “biases”?
What is “AI inevitability”? And what’s the author’s critique of it?
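One of the biases described in the reading is easy to reproduce in miniature. Below is a synthetic sketch, assuming NumPy and scikit-learn; the groups, their sizes, and their distributions are invented for illustration. A model fit on data dominated by one group can score well on that group while performing near chance on an underrepresented group, even though no step of the pipeline is explicitly unfair.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_group(n, center):
    """Synthetic group whose outcome flips around its own center."""
    x = rng.normal(loc=center, scale=1.0, size=(n, 1))
    y = (x[:, 0] - center + rng.normal(scale=0.5, size=n) > 0).astype(int)
    return x, y

x_min, y_min = make_group(n=100, center=-1.0)   # underrepresented group
x_maj, y_maj = make_group(n=5000, center=+1.0)  # majority group

# Fit one model on the pooled data; the majority group dominates the fit.
model = LogisticRegression().fit(np.vstack([x_min, x_maj]),
                                 np.concatenate([y_min, y_maj]))

print(f"Accuracy, majority group:         {model.score(x_maj, y_maj):.2f}")  # high
print(f"Accuracy, underrepresented group: {model.score(x_min, y_min):.2f}")  # near chance
```

Aggregate accuracy on the pooled data looks healthy here; only disaggregating by group reveals the failure, which is why bias audits report per-group metrics rather than a single overall score.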
Part 2:
Looking back at the readings on the objectivity of science and of data-driven systems, are the problems they identify technical, cultural, or both?
What are the pillars of responsible and ethical automated learning systems? What practices will set us up for success?