
When Statistics Replaces Judgment

One might believe that mathematical logic stops at the courtroom door. In practice, however, logical fallacies are notorious in matters of conviction, especially when they stem from incorrect probabilistic figures. In other words, people have been wrongfully convicted because the prosecution persuaded the jury with flawed statistical evidence. In a modest but real way, Bayesian logic and probability are at play here.

Of course, it seems unreasonable to assign probabilities to events we know have or have not happened. If a person is guilty, the probability of guilt is 100%; if not, it is 0%, and a figure in between seems to have no business there. But we can still assign probabilities when we do not know, and Bayesianism helps us do so rationally. Bayesian probability’s central concept is the posterior probability, i.e., the updated probability of an event after new evidence or data are taken into account. The posterior is generally considered more reliable than the a priori probability precisely because it is responsive to new evidence, whereas the prior is fixed before any evidence arrives.
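The mechanics of updating a prior into a posterior can be sketched in a few lines. This is a minimal illustration with made-up numbers, not figures from any real case: a weak prior belief is repeatedly updated by independent pieces of evidence, each nine times more likely under the hypothesis than under its negation.

```python
# Bayes' theorem as an update rule: P(H|E) = P(E|H) * P(H) / P(E).
# All numbers below are illustrative assumptions, not real-case figures.

def posterior(prior, likelihood, likelihood_if_not):
    """Return P(H|E) given P(H), P(E|H), and P(E|not H)."""
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence

# Start from a weak prior belief in the hypothesis...
p = 0.01
# ...and update it as three independent pieces of evidence arrive.
for _ in range(3):
    p = posterior(p, likelihood=0.9, likelihood_if_not=0.1)

print(round(p, 4))  # → 0.8804
```

Note how three modest pieces of evidence move a 1% prior to roughly 88%: the posterior, unlike the prior, tracks what has been observed.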

A legal case involving the wrongful imprisonment of Sally Clark over a double cot death occurred in the late 1990s; the defence attributed the deaths to sudden infant death syndrome (SIDS). The prosecution’s expert testified that the chance of this happening by accident was one in 73 million, a figure coloured by his view that many cot deaths are really cases of Munchausen syndrome by proxy; whether that view is correct remains controversial. More importantly, the figure was obtained by squaring the probability of a single cot death, which assumes the two deaths are independent. They are not: a first cot death makes a second more likely, since genetic and environmental risk factors are shared within a family. The court should have weighed the probability that the mother was a double murderer against the alternative; instead, it asked how likely a randomly chosen family is to experience two cot deaths. The mathematician Ray Hill, performing that comparison, estimated the probability of Clark’s guilt at only 10–20%.
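Both mistakes can be made concrete with a small calculation. The numbers below are hypothetical stand-ins (the true rates are disputed); the point is the structure of the error, not the particular values.

```python
# Illustrative sketch of the two errors in the Sally Clark case.
# All rates below are assumptions chosen for illustration only.

# Error 1: treating the two deaths as independent.
p_first = 1 / 8500               # assumed chance of one cot death
independent = p_first ** 2       # the prosecution's implicit squaring
dependent = p_first * (1 / 100)  # assumed elevated risk after a first death

# Error 2: the court asked for P(two cot deaths | innocent) when it
# needed P(guilty | two deaths), which requires comparing against the
# rival explanation, double murder.
p_double_murder = 1 / 1_000_000  # assumed prior for double infanticide

p_guilty = p_double_murder / (p_double_murder + dependent)
print(f"independence assumption: 1 in {1 / independent:,.0f}")
print(f"P(guilty | two deaths) ≈ {p_guilty:.0%}")
```

Even with these rough assumptions, the posterior probability of guilt comes out below one half, nothing like the near-certainty the one-in-73-million figure suggested to the jury.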

Another case involved a nurse accused of four deaths and three attempted murders. The prosecution claimed that the chance of these being an accident was one in 342 million, a figure that appears to suffer from “double-dipping”: the same data were used both to generate the hypothesis and to test it, since the suspicious cluster of deaths was what drew attention to the nurse in the first place. Testing a hypothesis on the very data that suggested it inflates statistical significance, leading to overoptimistic results and invalid conclusions, because you are testing on data already “seen” and shaped by chance. The posterior probability then becomes a distorted version of the prior: what look like independent sources of evidence (the prior and the data) appear to agree, when in fact they are the same information counted twice. The figure is therefore not statistically significant as evidence of guilt.
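A short simulation shows why this matters. Under purely random incidents, some nurse on a large ward will almost always look suspicious; the parameters below are invented for illustration and do not model the actual case.

```python
# Sketch of "double-dipping": the most extreme-looking nurse is selected
# *because* of the data, then that same data is treated as evidence.
# All parameters are purely illustrative.
import random

random.seed(0)
NURSES, SHIFTS, P_INCIDENT = 30, 200, 0.02
THRESHOLD = 10  # incident count that looks "too extreme to be chance"

def worst_nurse_count():
    """Simulate one ward; return the highest incident count of any nurse."""
    counts = [sum(random.random() < P_INCIDENT for _ in range(SHIFTS))
              for _ in range(NURSES)]
    return max(counts)

trials = 1000
hits = sum(worst_nurse_count() >= THRESHOLD for _ in range(trials))
print(f"Some nurse reaches {THRESHOLD}+ incidents in "
      f"{hits / trials:.0%} of simulated wards")
```

For any single, pre-specified nurse, ten or more incidents is a well-under-1% event; but the chance that *some* nurse on the ward reaches that count by luck alone is around 20%. Selecting the worst case first and computing the probability afterward is exactly the double-dip.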

A review of Bayesian reasoning in legal cases, published in 2016, found that the use of statistics in legal proceedings had increased considerably over the years, yet concluded that its limited impact was due to persistent misconceptions regarding Bayes’ theorem. The review supported the Bayesian network technique, which represents most, if not all, factors surrounding a case as a directed graph whose nodes carry conditional probability tables. Such a graph is capable of modeling the causal context of the evidence.
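To make the idea concrete, here is a toy Bayesian network with two hypothetical evidence nodes hanging off a guilt node, with invented conditional probability tables; inference is done by brute-force enumeration of the joint distribution. Real legal networks are far larger and use dedicated tools, so this is only a sketch of the mechanism.

```python
# Toy Bayesian network: Guilty -> FiberMatch, Guilty -> Alibi.
# All conditional probability tables are hypothetical.

p_guilty = 0.01                     # assumed prior probability of guilt
p_fiber = {True: 0.8, False: 0.05}  # P(fiber match | guilty?)
p_alibi = {True: 0.2, False: 0.7}   # P(credible alibi | guilty?)

def joint(g, fiber, alibi):
    """P(Guilty=g, Fiber=fiber, Alibi=alibi) under the network."""
    p = p_guilty if g else 1 - p_guilty
    p *= p_fiber[g] if fiber else 1 - p_fiber[g]
    p *= p_alibi[g] if alibi else 1 - p_alibi[g]
    return p

# Observed evidence: the fiber matches AND the alibi is credible.
num = joint(True, fiber=True, alibi=True)
den = sum(joint(g, fiber=True, alibi=True) for g in (True, False))
print(f"P(guilty | evidence) = {num / den:.3f}")
```

The network keeps each piece of evidence conditioned on the hypothesis separately, so conflicting evidence (an incriminating fiber, an exonerating alibi) is weighed coherently instead of being squared together or double-counted.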

These examples demonstrate that probability, when detached from its assumptions and context, can become misleading rather than informative. The challenge for the legal system, therefore, is not whether to use Bayesian reasoning, but how to ensure it is applied with sufficient rigor to prevent erroneous convictions.
