# Simpson’s Paradox: Explained in Simple Terms

**Simpson’s Paradox** occurs when a trend that appears in several groups of data reverses (or disappears) once the groups are combined.

A real-life example of this paradox is the **kidney stone treatment** study, which compared the success rates of two treatments for kidney stones.

Based on the **overall success rate**, *Treatment B* looks like the obvious choice since it has the higher success rate. Things get nasty when we segment the results by **stone size**: the comparison reverses, and *Treatment A* appears to be the better treatment in both segments.
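To make the reversal concrete, here is a minimal Python sketch. It assumes the figures commonly cited for the kidney stone study (successes and total cases per treatment and stone size); the variable and function names are illustrative only.

```python
# (successes, total cases) per treatment and stone size.
# Assumption: these are the figures commonly cited for the kidney stone study.
results = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def segment_rate(treatment, size):
    """Success rate of one treatment within one stone-size segment."""
    successes, total = results[treatment][size]
    return successes / total


def overall_rate(treatment):
    """Success rate of one treatment with both segments pooled together."""
    successes = sum(s for s, _ in results[treatment].values())
    total = sum(n for _, n in results[treatment].values())
    return successes / total


for size in ("small", "large"):
    for treatment in ("A", "B"):
        print(f"{size:>5} stones, Treatment {treatment}: {segment_rate(treatment, size):.1%}")

for treatment in ("A", "B"):
    print(f"overall, Treatment {treatment}: {overall_rate(treatment):.1%}")
```

With these figures, Treatment A has the higher rate within each stone size, while Treatment B has the higher rate once the segments are pooled.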

## Which treatment should we choose?

The paradox resolves once you look within each segment: choose **Treatment A** if you have a small stone, and **Treatment A** again if you have a large stone.

## When does this paradox happen?

- **Different sample sizes**. Due to the high number of cases in groups 2 and 3, the combined totals depend heavily on them.
- **Confounding variables**. The stone size is a confounding variable here. The success rate is influenced more by the severity of the case (stone size) than by the choice of treatment: success rates are higher for small stones.
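The sample-size effect can be seen by writing each treatment's overall rate as a weighted average of its segment rates, where the weights are the share of that treatment's cases in each segment. The sketch below assumes the figures commonly cited for the kidney stone study; the names are illustrative.

```python
# (successes, total cases) per treatment and stone size.
# Assumption: these are the figures commonly cited for the kidney stone study.
cases = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def segment_weights(treatment):
    """Share of a treatment's cases that fall in each stone-size segment."""
    total = sum(n for _, n in cases[treatment].values())
    return {size: n / total for size, (_, n) in cases[treatment].items()}


def weighted_overall(treatment):
    """Overall success rate, computed as a weighted average of segment rates."""
    weights = segment_weights(treatment)
    return sum(weights[size] * s / n for size, (s, n) in cases[treatment].items())


for treatment in ("A", "B"):
    weights = segment_weights(treatment)
    print(
        f"Treatment {treatment}: "
        f"{weights['small']:.0%} small-stone cases, "
        f"{weights['large']:.0%} large-stone cases, "
        f"overall success {weighted_overall(treatment):.1%}"
    )
```

Under these assumed figures, most of Treatment A's cases are large stones (the harder segment), while most of Treatment B's cases are small stones. The pooled rates therefore mostly compare A on hard cases against B on easy ones.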

### The next time you’re segmenting, look for:

- The numbers/sample sizes alongside the percentages (Avinash Kaushik’s mantra).
- Factors influencing the data that are not shown.
- Confounding variables (drawing a causal diagram can help identify them).
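As a rough aid for the first check, the sketch below flags a Simpson-style reversal: it compares the winner on pooled counts with the winner inside each segment. The helper names `winner` and `has_reversal` are hypothetical, and the sample input assumes the figures commonly cited for the kidney stone study.

```python
def winner(counts):
    """Return the option with the highest success rate.

    counts maps option name -> (successes, total cases).
    """
    return max(counts, key=lambda option: counts[option][0] / counts[option][1])


def has_reversal(segments):
    """True if the pooled winner loses in every individual segment.

    segments maps segment name -> {option name: (successes, total cases)}.
    """
    pooled = {}
    for counts in segments.values():
        for option, (successes, total) in counts.items():
            prev_s, prev_n = pooled.get(option, (0, 0))
            pooled[option] = (prev_s + successes, prev_n + total)
    pooled_winner = winner(pooled)
    return all(winner(counts) != pooled_winner for counts in segments.values())


# Assumption: commonly cited kidney stone figures, keyed by stone size.
segments = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}
print(has_reversal(segments))  # prints True
```

A check like this only detects the reversal; deciding whether the segmented or the pooled view is the right one still depends on whether the segmenting variable is a genuine confounder.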
