What are the implications of advances in brain science for the justice system? This was the topic of a panel discussion held Tuesday at MIT’s McGovern Institute and moderated by Alan Alda in conjunction with the premiere of his new PBS special, “Brains on Trial”. The discussion ranged from using fMRI for lie detection to using it to help determine the reliability of an eyewitness.
In general, the panel rightly pointed out practical limitations of these technologies. Panelist Nancy Kanwisher highlighted, for example, that research on lie-detection is done in a controlled, non-threatening environment from which we may be unable to generalize to criminal courts where the stakes are high.
While I was sympathetic to most of this discussion, I was puzzled by one point that the panel raised several times: the problematic nature of applying data based on a group of people to say something about an individual (e.g., this particular defendant). To present a simplified example: even if we could rigorously show a measurable difference in brain activity between a group of people who told a lie in the imager and a group of people who told the truth, we cannot conclude that an individual is lying just because he shows an activity pattern similar to the liars. Since the justice system makes decisions about individuals, the argument goes, the use of group data is problematic.
To me, this categorical objection to group data seems a bit odd, and this is why: I can’t see how group data is conceptually different from ordinary circumstantial evidence.
My understanding of circumstantial evidence (please correct me if I’m wrong – I am no lawyer) is that it is any evidence from which one must infer a conclusion. If a person is stabbed and a bloody knife is found at the defendant’s house, this is circumstantial evidence from which one might infer the defendant’s culpability.
Of course, my inference could be plain wrong. It might be that this particular knife was bloodied by killing a chicken or was planted by the real stabber. So I need to build a case to increase my confidence that my inference of culpability is correct. I seek other circumstantial evidence and update my confidence accordingly. If I learn that the defendant has a dead chicken in his house, I decrease my confidence in the inference; if someone saw the defendant exiting the site of the stabbing, I increase it. This is a familiar limitation of circumstantial evidence.
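This kind of sequential updating has a standard quantitative form: Bayesian updating in odds form, where each piece of evidence multiplies the current odds by that evidence's likelihood ratio. A minimal sketch, with entirely made-up likelihood ratios for the knife, chicken, and witness:

```python
def update_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * LR."""
    return prior_odds * likelihood_ratio

def odds_to_probability(odds):
    """Convert odds (p / (1 - p)) back to a probability."""
    return odds / (1 + odds)

# Illustrative numbers only -- each LR says how much more likely the
# evidence is under "guilty" than under "not guilty":
odds = 0.1                       # prior odds of culpability
odds = update_odds(odds, 8.0)    # bloody knife found at the house
odds = update_odds(odds, 0.3)    # dead chicken found (innocent explanation)
odds = update_odds(odds, 5.0)    # witness saw defendant leaving the scene

print(round(odds_to_probability(odds), 3))  # → 0.545
```

Evidence that favors an innocent explanation (LR below 1, like the chicken) pulls the odds down; incriminating evidence (LR above 1) pulls them up, which mirrors the informal updating described above.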
Furthermore, we do not ascribe equal weight to each piece of circumstantial evidence: some pieces are stronger than others. How do we make these strength-judgments?
We might ask something like “how often is it the case that ‘P’ holds and yet my inference is wrong?”, where “P” could be “a bloody knife is found at someone’s house”, or “a bloody knife is found at someone’s house AND that person had a motive”. If the answer is “almost never”, this is strong evidence. If the answer is “quite often”, this is weak evidence. That is, my judgment of strength is an intuitive assignment of the probability of my inference being true in this individual case, based on my mental aggregation of a number of hypothetical cases.
But doesn’t this reasoning seem very similar to what the panel was saying is problematic about applying group data to individuals in criminal courts? In fact, doesn’t it look even more problematic because I am assigning a strength to circumstantial evidence based on my folk intuitions about probabilities, many of which may be biased or at best of unknown reliability?
If we answer these questions in the affirmative, we might become equally troubled, if not more so, by the use of circumstantial evidence in courts as by the use of aggregate cognitive science results. But if we think that circumstantial evidence is acceptable, perhaps so are inferences about individuals from aggregate data – as long as we scrutinize their weight as we do that of circumstantial evidence. Neither is a magic bullet that pierces the heart of truth, but both can be (though need not be) informative.
But there is also a stronger conclusion: perhaps we should prefer inference from group data, because we can use statistical tools to get quantitative estimates of sensitivity and specificity and thus assign “strength” in individual cases in a more evidence-based way. For example, the likelihood ratio (LR) is used in the context of prenatal screening to estimate, based on aggregate frequency data, the likelihood that in this individual case there is a chromosomal abnormality such as trisomy 21 (Down syndrome).
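The arithmetic behind this is simple: the positive likelihood ratio is sensitivity divided by the false-positive rate, and it converts a prior probability (e.g., an age-based base rate) into a posterior probability for the individual case. A sketch with hypothetical screening figures (not real clinical numbers):

```python
def likelihood_ratio(sensitivity, specificity):
    """LR+ = P(positive test | condition) / P(positive test | no condition)."""
    return sensitivity / (1 - specificity)

def posterior_probability(prior_prob, lr):
    """Apply the LR to a prior probability via odds form."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * lr
    return posterior_odds / (1 + posterior_odds)

# Hypothetical test: 90% sensitivity, 95% specificity.
lr = likelihood_ratio(sensitivity=0.90, specificity=0.95)  # LR+ = 18
# With a 1% base rate, a positive result raises the probability
# to about 15% -- informative, but far from conclusive:
print(round(posterior_probability(0.01, lr), 3))
```

Note how even a strong test (LR+ of 18) leaves a modest posterior when the base rate is low – exactly the kind of explicit, quantified “strength” judgment that folk intuitions about circumstantial evidence leave implicit.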
What do you think?