When Statistics Replaces Judgment

One may believe that mathematical logic does not follow legal practitioners into the courtroom. Yet logical fallacies are notorious in matters of conviction, especially when they stem from incorrect probabilistic figures. In other words, people have been wrongfully convicted because the prosecution persuaded the jury with flawed statistical evidence. Bayesian logic and probability, though rarely made explicit, are at play here.

Of course, it seems unreasonable to assign probabilities to events you know have or have not happened. If a person is guilty, the probability of guilt is 100%; if he is not, it is 0%. A figure in the middle seems to have no business here. But when we do not know, assigning probabilities is precisely what we must do, and Bayesianism helps us do it rationally. The central concept of Bayesian probability is the posterior probability, i.e., the updated probability of an event after new evidence or data has been taken into account. The posterior is generally considered more informative than the prior probability, which is assigned before any such evidence is considered and is therefore revised as evidence arrives.
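The update rule itself is just Bayes' theorem. A minimal sketch, using invented numbers (a hypothetical piece of evidence that is likely if the hypothesis holds and unlikely otherwise, with a low prior):

```python
# Bayes' theorem: posterior = P(E|H) * P(H) / P(E).
# All figures below are illustrative assumptions, not from any real case.
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Updated probability of hypothesis H after observing evidence E."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

# Evidence seen 99% of the time if H is true, 5% of the time if not,
# with a 1% prior on H:
p = posterior(prior=0.01, p_e_given_h=0.99, p_e_given_not_h=0.05)
print(round(p, 3))  # → 0.167
```

Note how strong-looking evidence (99% vs. 5%) still leaves the posterior far below certainty when the prior is low; ignoring the prior is exactly the mistake the cases below turn on.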

A notorious example is the wrongful imprisonment of Sally Clark in the late 1990s over a double cot death, with sudden infant death syndrome (SIDS) the likely explanation. The prosecution's expert testified that the chance of this happening by accident was one in 73 million, a figure obtained by squaring the single-SIDS rate, and suggested the deaths were instead a case of Munchausen syndrome by proxy. Whether that figure was even roughly right remains controversial, but the calculation overlooks the fact that one cot death makes a second more likely on genetic and environmental grounds; the two events are not independent. Worse, the court asked the wrong question: it focused on the probability that a randomly chosen family experiences two cot deaths, when it should have weighed the probability that the mother was a double murderer, given that the deaths occurred. The mathematician Ray Hill later estimated from his statistical analysis that the probability of Clark being guilty was only 10–20%.
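The correct comparison can be sketched in a few lines. The figures below are illustrative assumptions in the spirit of Hill's analysis, not his published estimates: a dependence factor for the second death, and an assumed rarity for double infanticide.

```python
# All figures are illustrative assumptions, not the actual trial numbers.
p_single_sids = 1 / 8543          # per-family SIDS rate cited at trial
dependence = 10                   # assumed: a first SIDS death raises the odds of a second
p_double_sids = p_single_sids * (dependence * p_single_sids)

# The prosecution's naive independence assumption:
p_naive = p_single_sids ** 2      # ~ one in 73 million

p_double_murder = 2.4e-8          # assumed rarity of a double infanticide

# Posterior probability of guilt, comparing only the two hypotheses
# that explain the deaths (double murder vs. double SIDS):
p_guilt = p_double_murder / (p_double_murder + p_double_sids)
print(f"naive figure: 1 in {1 / p_naive:,.0f}")
print(f"P(guilt | two deaths) ≈ {p_guilt:.2f}")
```

However rare two cot deaths are, what matters is how rare they are *relative to* double murder; under these assumptions the posterior lands in the 10–20% range Hill reported, nowhere near proof of guilt.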

Another case involved a nurse accused of four murders and three attempted murders. The prosecution claimed that the chance of so many incidents occurring by accident on her shifts was one in 342 million. This appears to be a case of "double-dipping": using the same data both to generate a hypothesis and to test it, rather than separating the data into training and testing phases. Doing so inflates significance, leading to overoptimistic results and invalid conclusions, because the test is run on data that has already been "seen" and was shaped by chance. The posterior then becomes a distorted version of the prior: what look like two independent sources of support, the hypothesis and the data, are actually one and the same. Such a value is therefore not statistically significant as evidence of guilt.
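Double-dipping is easy to demonstrate by simulation. In this sketch (the setup and names are illustrative), we score many pure-noise "candidate rules" against a dataset, keep the best one, and then evaluate it both on the same data and on fresh held-out data:

```python
import random

random.seed(0)

def best_candidate(data, n_candidates=200):
    """Select the noise 'rule' that best agrees with the data (the double-dip)."""
    n = len(data)
    best_score, best_rule = -1.0, None
    for _ in range(n_candidates):
        rule = [random.randint(0, 1) for _ in range(n)]
        score = sum(r == d for r, d in zip(rule, data)) / n
        if score > best_score:
            best_score, best_rule = score, rule
    return best_score, best_rule

data = [random.randint(0, 1) for _ in range(50)]     # pure noise
train_score, rule = best_candidate(data)

fresh = [random.randint(0, 1) for _ in range(50)]    # held-out noise
test_score = sum(r == d for r, d in zip(rule, fresh)) / len(fresh)

# train_score is almost surely well above 0.5, purely by selection;
# test_score hovers near chance, exposing the inflation.
print(train_score, test_score)
```

The selected rule "explains" the training data impressively despite being noise; only the held-out evaluation reveals that nothing real was found. The one-in-342-million figure has the same structure: the suspicious shift pattern both suggested the hypothesis and served as its test.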

A review of Bayesian reasoning in legal cases, published in 2016, found that although the use of statistics in legal proceedings had increased considerably over the years, its impact remained limited owing to misconceptions regarding Bayes' theorem. The review supported the use of Bayesian networks, which represent most, if not all, factors surrounding a case in the form of a weighted directed graph capable of modeling the causal context of the evidence.
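A Bayesian network makes the structure of the evidence explicit. A minimal sketch, with invented probabilities: guilt influences two pieces of evidence (a forensic match and a witness report), assumed conditionally independent given guilt, and we infer the posterior by enumerating the joint distribution.

```python
# All probabilities are invented for illustration.
p_guilt = 0.01                          # prior P(guilt)
p_match = {True: 0.95, False: 0.001}    # P(forensic match | guilt)
p_witness = {True: 0.7, False: 0.1}     # P(witness report | guilt)

def joint(g, match, witness):
    """Joint probability of one configuration of the network."""
    pg = p_guilt if g else 1 - p_guilt
    pm = p_match[g] if match else 1 - p_match[g]
    pw = p_witness[g] if witness else 1 - p_witness[g]
    return pg * pm * pw                 # evidence independent given guilt

# Condition on both pieces of evidence being observed:
evidence = sum(joint(g, True, True) for g in (True, False))
posterior = joint(True, True, True) / evidence
print(round(posterior, 3))  # → 0.985
```

The graph structure forces each conditional probability to be stated and justified separately, which is precisely what was missing in the informal "one in 73 million"-style arguments.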

These examples demonstrate that probability, when detached from its assumptions and context, can become misleading rather than informative. The challenge for the legal system, therefore, is not whether to use Bayesian reasoning, but how to ensure it is applied with sufficient rigor to prevent erroneous convictions.
