The Value of Displaying Data Well

Posted on September 1, 2009

Anscombe’s quartet: all four sets are identical when examined statistically, but vary considerably when graphed. Image via Wikipedia.

Anscombe’s quartet comprises four datasets that have identical simple statistical properties, yet are revealed to be very different when inspected graphically. Each dataset consists of eleven (x, y) points. They were constructed in 1973 by the statistician F.J. Anscombe to demonstrate both the importance of graphing data before analyzing it and the effect of outliers on the statistical properties of a dataset.

Of course, we also have to be careful not to draw incorrect conclusions from visual displays.

For all four datasets:

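Property                                    Value
Mean of each x variable                     9.0
Variance of each x variable                 10.0
Mean of each y variable                     7.5
Variance of each y variable                 3.75
Correlation between each x and y variable   0.816
Linear regression line                      y = 3 + 0.5x

The variances shown are population variances (dividing by n = 11). The corresponding sample variances (dividing by n - 1) are 11.0 for x and about 4.125 for y. The values for y, the correlation, and the regression coefficients agree across the four datasets only to the precision shown.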
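The figures above can be checked directly. The following is a minimal Python sketch, not part of the original post: the data values are the commonly published quartet from Anscombe's 1973 paper, and the names `QUARTET` and `summarize` are made up for this illustration. It uses population variance (dividing by n), which is what yields 10.0 and 3.75.

```python
from math import sqrt

# Commonly published values of Anscombe's quartet (1973).
# Datasets I-III share the same x values; dataset IV has its own.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return the simple summary statistics listed in the table (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # ordinary least-squares slope
    return {
        "mean_x": mean_x,
        "var_x": sxx / n,  # population variance (divide by n)
        "mean_y": mean_y,
        "var_y": syy / n,  # population variance (divide by n)
        "corr": sxy / sqrt(sxx * syy),  # Pearson correlation
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


if __name__ == "__main__":
    for name, (xs, ys) in QUARTET.items():
        s = summarize(xs, ys)
        print(name, {key: round(value, 3) for key, value in s.items()})
```

Each of the four rows printed should agree with the table to roughly two decimal places, even though a scatter plot of each dataset looks entirely different.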

On the first page of the first chapter of his book, The Visual Display of Quantitative Information, Edward Tufte uses the quartet to emphasize the importance of looking at one’s data before analyzing it.

Related: Edward Tufte’s: Beautiful Evidence – Simpson’s Paradox – Correlation is Not Causation – Seeing Patterns Where None Exists – Great Charts – Playing Dice and Children’s Numeracy – Theory of Knowledge

