7 Data Paradoxes You Should Be Aware Of

Data, statistics, math, numbers: we tend to think of these as exact things, but I have to disappoint you. This science is full of paradoxes, and we need to be aware of them to do our work well and to make better decisions in day-to-day life. The more of them you know, the easier it is to spot them in a modern world overwhelmed with information.

Here are 7 paradoxes from science, logic, statistics and math that show things are not always as clear as they seem.

1. Prosecutor’s fallacy

Let’s imagine we are in a future Barcelona where scientists have invented a machine that can compare DNA samples and say whether they match. We are in court, accused of a cruel murder, because DNA found on the gun matches ours and there is no other evidence, although we are pretty sure that we were at home baking cookies that day. The prosecutor claims that the probability of such a match is 1 in 1 million, and that with such a low probability we are definitely guilty: an innocent person’s DNA wouldn’t have been found at the crime scene. Seems like we are going to jail.

But there is a little “but”. What are the chances that the police got the wrong guy? With a match probability of 1/1,000,000 and a population of 5 million people in Barcelona, we should expect about 5 people in the city whose DNA matches, any of whom could be guilty. That means the probability of our innocence is actually more than 80%! The case is suspended, because it needs more evidence.

The situation described above is known as the prosecutor’s fallacy, a fallacy of statistical reasoning typically used by a prosecutor to exaggerate the likelihood of a criminal defendant’s guilt. The following claim demonstrates the fallacy in the context of a prosecutor questioning an expert witness: “the odds of finding this evidence on an innocent man are so small that the jury can safely disregard the possibility that this defendant is innocent”. The claim obscures that the likelihood of the defendant’s innocence, given the evidence found on him, in fact depends on the likely quite high prior odds of the defendant being a random innocent person – as well as the stated low odds of finding the evidence on such a random innocent person, not to mention the underlying high odds that the evidence is indeed indicative of guilt.
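The Barcelona arithmetic can be checked with Bayes’ rule. The sketch below is only an illustration under simplifying assumptions that are not in the story itself: exactly one culprit lives in the city, the culprit’s DNA always matches, and before the DNA test everyone is equally likely to be the culprit.

```python
# Figures from the Barcelona story above.
POPULATION = 5_000_000               # people who could have left the DNA
MATCH_PROBABILITY = 1 / 1_000_000    # chance that a random innocent person matches


def probability_of_innocence(population: int, match_probability: float) -> float:
    """P(innocent | DNA match).

    Assumptions (ours, for illustration): one culprit who always matches,
    and a uniform prior over the whole population.
    """
    prior_guilty = 1 / population
    prior_innocent = 1 - prior_guilty
    # Total probability of a match:
    # P(match | guilty) * P(guilty) + P(match | innocent) * P(innocent)
    p_match = 1.0 * prior_guilty + match_probability * prior_innocent
    p_guilty_given_match = prior_guilty / p_match
    return 1 - p_guilty_given_match


# How many innocent people in the city are expected to match by pure chance.
expected_innocent_matches = (POPULATION - 1) * MATCH_PROBABILITY
p_innocent = probability_of_innocence(POPULATION, MATCH_PROBABILITY)

print(f"Expected innocent matches in the city: {expected_innocent_matches:.2f}")
print(f"P(innocent | match): {p_innocent:.1%}")
```

Under these assumptions the probability of innocence comes out at roughly 83%, because the accused is one of about six matching people (five chance matches plus the culprit).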

As you can see, people don’t quite grasp the power of randomness or the power of big numbers. When we say, “Dude, the shit that happened to me today only happens once in 10 million times”, we don’t realize that the same shit has already happened to about 4 other people in Spain alone (the country’s population is approx. 45 million). So it would be much weirder if a 1-in-10-million event never happened at all.

2. Anscombe’s quartet

Data can be tricky. We often say that numbers don’t lie, that numbers are precise, that numbers tell the truth, etc. Yes, all of these statements are correct, but only if you interpret the numbers well, explore them well, and look at them from different angles.

Let me introduce you to Anscombe’s quartet. It comprises four data sets that have nearly identical simple descriptive statistics, yet have very different distributions and appear very different when graphed. Each dataset consists of eleven (x, y) points. They were constructed in 1973 by the statistician Francis Anscombe to demonstrate both the importance of graphing data before analyzing it and the effect of outliers and other influential observations on statistical properties. He described the article as being intended to counter the impression among statisticians that “numerical calculations are exact, but graphs are rough.”

Basic stats for these datasets are almost equal:
– Mean of x: 9, accuracy: exact
– Sample variance of x: 11, accuracy: exact
– Mean of y: 7.50, accuracy: to 2 decimal places
– Sample variance of y: 4.125, accuracy: ±0.003
– Correlation between x and y: 0.816, accuracy: to 3 decimal places
– Linear regression line: y = 3.00 + 0.500x, accuracy: to 2 and 3 decimal places respectively
– Coefficient of determination of the linear regression: 0.67, accuracy: to 2 decimal places

As you can see, numerically these datasets are nearly identical, but when you plot the data points you realize that there is something fishy there 😉
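If you want to verify the summary statistics yourself, here is a minimal sketch in plain Python. The data points are the commonly reproduced values of Anscombe’s quartet; the helper names (`mean`, `sample_variance`, `summarize`) are ours, chosen for illustration.

```python
from math import sqrt

# Anscombe's quartet; sets I-III share the same x values.
X_123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def summarize(xs, ys):
    """Return the descriptive statistics listed above for one dataset."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # least-squares slope
    intercept = my - slope * mx        # least-squares intercept
    r = sxy / sqrt(sxx * syy)          # Pearson correlation
    return {
        "mean_x": mx, "var_x": sample_variance(xs),
        "mean_y": my, "var_y": sample_variance(ys),
        "r": r, "slope": slope, "intercept": intercept, "r2": r ** 2,
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

The four printed rows should be almost indistinguishable, which is exactly the point: only a scatter plot of each dataset reveals how different they are.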

This is why during data analysis we have to take into account different variables, hypotheses, tools and approaches. If we look at something from only one side, we will keep seeing the same thing even in completely different data. Or, if you prefer: by stubbornly thinking that you are always correct and never considering another opinion, you will keep finding that everything is the same and never learn anything new. Seems boring to me.

3. Simpson’s paradox

Simpson’s paradox is a phenomenon in probability and statistics in which a trend appears in several different groups of data but disappears or reverses when these groups are combined. In the aggregated data we come to one conclusion, while if we split the data into groups by some criterion we end up with results that are the total opposite of the previous observations.

One of the best-known examples of Simpson’s paradox is a study of gender bias among graduate school admissions to the University of California, Berkeley. The admission figures for the fall of 1973 showed that men applying were more likely than women to be admitted, and the difference was so large that it was unlikely to be due to chance. The data showed that 44% of all men were admitted, while for women the rate was only 35%.

Imagine: this data is published, feminists start to revolt, the university is at the center of a scandal. Boom!

When examining the individual departments, it appeared that six out of 85 departments were significantly biased against men, whereas four were significantly biased against women. In fact, the pooled and corrected data showed a “small but statistically significant bias in favor of women”!!!

No way!! Men were discriminated against!!! (sarcasm and hyperbole)

Later on, the research paper by Bickel et al. concluded that women tended to apply to competitive departments with low rates of admission even among qualified applicants (such as in the English Department), whereas men tended to apply to less-competitive departments with high rates of admission among the qualified applicants (such as in engineering and chemistry).
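To see how this reversal can happen, here is a small sketch with invented numbers. These are NOT the Berkeley figures, just a hypothetical two-department university in which women mostly apply to the hard department and men mostly apply to the easy one.

```python
# Invented data for illustration only: (applicants, admitted) per department.
APPLICATIONS = {
    "men": {"easy department": (80, 48), "hard department": (20, 4)},
    "women": {"easy department": (20, 13), "hard department": (80, 20)},
}


def department_rate(group: str, department: str) -> float:
    """Admission rate of one group within one department."""
    applied, admitted = APPLICATIONS[group][department]
    return admitted / applied


def overall_rate(group: str) -> float:
    """Admission rate of one group with all departments pooled together."""
    applied = sum(a for a, _ in APPLICATIONS[group].values())
    admitted = sum(b for _, b in APPLICATIONS[group].values())
    return admitted / applied


for department in ("easy department", "hard department"):
    print(
        f"{department}: men {department_rate('men', department):.0%}, "
        f"women {department_rate('women', department):.0%}"
    )
print(f"overall: men {overall_rate('men'):.0%}, women {overall_rate('women'):.0%}")
```

In this toy data women have the higher admission rate in each department, yet men have the higher rate overall, simply because most women applied where almost nobody gets in.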

As you can see, men are just lazy XD.

This is why it is important to go beyond the data, understand the context, find out why the data is presented this way, etc.

Yes, our society has a lot of issues, but by thinking more rationally we can find the real ones and focus on fixing them instead of getting distracted by manipulative data.

4. Fallacy of composition

The fallacy of composition arises when one infers that something is true of the whole from the fact that it is true of some part of the whole (or even of every proper part). For example: “This tire is made of rubber, therefore the vehicle of which it is a part is also made of rubber.” This is fallacious, because vehicles are made with a variety of parts, most of which are not made of rubber.

This fallacy is often confused with the fallacy of hasty generalization, in which an unwarranted inference is made from a statement about a sample to a statement about the population from which it is drawn. So no, if you think your girlfriend/boyfriend is stupid and therefore all women/men are stupid, that is a hasty generalization, which in this particular case can also be called the “fallacy of the lonely fact”. We will discuss this later.

Examples:
– No atoms are alive. Therefore, nothing made of atoms is alive.
– Some people can become millionaires with the right business concept. Therefore, if everyone has the right business concept, everyone will become a millionaire.
– If a runner runs faster, he can win the race. Therefore, if all the runners run faster, they can all win the race.
– In economics: total saving may fall because of individuals’ attempts to increase their saving, and, broadly speaking, that increase in saving may be harmful to an economy (paradox of saving).

5. Berkson’s paradox

Berkson’s paradox, also known as Berkson’s bias or Berkson’s fallacy, is a result in conditional probability and statistics which is often found to be counterintuitive, and hence a veridical paradox. The most common example of Berkson’s paradox is a false observation of a negative correlation between two positive traits, i.e., that members of a population which have some positive trait tend to lack a second. Berkson’s paradox occurs when this observation appears true when in reality the two properties are unrelated – or even positively correlated – because members of the population where both are absent are not equally observed.

For me, the best description of this paradox comes from Jordan Ellenberg, author of the book “How Not To Be Wrong” (fabulous read, totally recommended).

Suppose you’re a person who dates men. You may have noticed that, among the men in your dating pool, the handsome ones tend not to be nice, and the nice ones tend not to be handsome. Is that because having a symmetrical face makes you cruel? Does it mean that being nice to people makes you ugly? Well, it could be. But it doesn’t have to be.

Now picture all these men as points in a square, with niceness along one side and handsomeness along the other, and let’s take as a working hypothesis that men are in fact equidistributed all over this square. In particular, there are nice handsome ones, nice ugly ones, mean handsome ones, and mean ugly ones, in roughly equal numbers.

But niceness and handsomeness have a common effect: they put these men in the group of people that you notice. Be honest – the mean uglies are the ones you never even consider. So inside the Great Square is a Smaller Triangle of Acceptable Men – you can see it in the picture.

Now the source of the paradox is clear: our source data is biased. The handsomest men in the triangle are, on average, about as nice as the average person in the entire population, which, let’s face it, is not that nice. The nicest men are only averagely handsome. The negative correlation between looks and personality in your dating pool is absolutely real, but only because you don’t see the entire dataset. And the relation isn’t causal.
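The square-and-triangle argument is easy to simulate. The sketch below is an illustration under assumptions of our own: niceness and handsomeness are independent uniform scores between 0 and 1, and you only notice a man when the two scores add up to more than 1 (the threshold and the sample size are arbitrary choices).

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


rng = random.Random(42)  # fixed seed so the run is repeatable

# The Great Square: (niceness, handsomeness), independent by construction.
men = [(rng.random(), rng.random()) for _ in range(20_000)]

# The Smaller Triangle: assumed selection rule, niceness + handsomeness > 1.
noticed = [(nice, handsome) for nice, handsome in men if nice + handsome > 1.0]

corr_all = pearson(*zip(*men))
corr_noticed = pearson(*zip(*noticed))

print(f"Correlation in the whole population: {corr_all:+.2f}")
print(f"Correlation among the men you notice: {corr_noticed:+.2f}")
```

The traits are unrelated in the full population, but among the noticed men the correlation is clearly negative. Nothing about the men changed, only which of them made it into the dataset.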

6. Hasty generalization

Generalization. We are all guilty of this one from time to time.

In logic and reasoning, a faulty generalization is a conclusion made about all or many instances of a phenomenon, that has been reached on the basis of one or a few instances of that phenomenon. It is an example of jumping to conclusions. For example, one may generalize about all people or all members of a group, based on what they know about just one or a few people:

– If one meets an angry person from a given country X, they may suspect that most people in country X are often angry.
– If one sees only white swans, they may suspect that all swans are white.

Faulty generalizations may lead to further incorrect conclusions. One may, for example, conclude that citizens of country X are genetically inferior, or that people of race Y have better sense of humour.

These are the things you have to take into consideration when analysing data: if a strategy worked in one particular situation, it doesn’t mean it will work in all of them. Yes, there is some probability that it will, but after one try you cannot be sure, and you certainly shouldn’t start selling a digital course that says this strategy works and is the best. The need to find enough evidence before claiming that something works seems so obvious, yet a lot of people still believe that all people from post-Soviet countries drink vodka every day and never get drunk, that all Spaniards take a siesta every day, and that all English people drink only tea with milk.

7. Will Rogers phenomenon

To describe this one I will quote a person who I think didn’t think much before saying these words (or actually thought a lot), Sir Rob Muldoon: “New Zealanders who emigrate to Australia raise the IQ of both countries”.

You might think: how the hell is that possible? Well, I will explain it now, and you will understand that the quote above is a bit insulting. Just a little bit 😏.

The Will Rogers phenomenon is obtained when moving an element from one set to another set raises the average values of both sets. It is based on the following quote, attributed (perhaps incorrectly) to comedian Will Rogers: “When the Okies left Oklahoma and moved to California, they raised the average intelligence level in both states”. You see where it goes, don’t you? 😉

The effect will occur when both of these conditions are met:
– The element being moved is below average for its current set. Removing it will, by definition, raise the average of the remaining elements.
– The element being moved is above the current average of the set it is entering. Adding it to the new set will, by definition, raise the average.

Still lost? Consider this illustrative example with two lists:
R = {1, 2}
S = {99, 10,000, 20,000}

Find the arithmetic mean of both lists. Now move 99 from S to R and find the means again. Got it?
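If you would rather let the computer do the arithmetic, here is a minimal sketch of the same exercise. The variable names are ours.

```python
def mean(values):
    return sum(values) / len(values)


R = [1, 2]
S = [99, 10_000, 20_000]
moved = 99  # below the average of S, above the average of R

before = (mean(R), mean(S))

# Move the element from S to R without modifying the original lists.
R_after = R + [moved]
S_after = [value for value in S if value != moved]

after = (mean(R_after), mean(S_after))

print(f"mean(R): {before[0]} -> {after[0]}")
print(f"mean(S): {before[1]} -> {after[1]}")
```

The mean of R goes from 1.5 to 34 and the mean of S goes from 10,033 to 15,000: both averages rise even though no number changed its value.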

Now rethink both quotes and have a great day 😁

Hope you enjoyed this list of paradoxes as much as I did when I first found out about them, and that you now understand how tricky data and our interpretation of it can be. So watch out!
