Making Heads or Tails of Randomness
Which of the four graphs below has a collection of truly random dots?
From the ANZSTAT listserv:
If you choose an answer to this question at random,
what is the chance you will be correct?
Had enough? Okay, then, on with the show!
The 25 dots shown in each graph were randomly selected, but four different methods were employed:
- Completely Random: The coordinates of each dot were selected completely at random, independent of the other dots.
- Random-ish: The area was divided into a 5 × 5 grid of squares, and one point was then randomly selected within each square.
- Dartboard: I placed a 5″ × 5″ square on a dartboard and threw darts at it until I hit it 13 times. I then turned the square 90° and threw darts until I hit it 12 more times. (I’m a reasonable dart player, but yes, there were quite a few darts that landed outside the square.)
- Eli’s Choice: I gave my seven-year-old son a 5″ × 5″ square and asked him to draw 25 random points.
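The first two methods are easy to mimic on a computer. Here is a minimal sketch, assuming Python's built-in `random` module as a stand-in for whatever tool you prefer; the function names `completely_random` and `random_ish` are made up for illustration. (Dartboard and Eli's Choice, sadly, require a dartboard and a seven-year-old.)

```python
import random

SIDE = 5  # side length of the square
GRID = 5  # cells per side for the Random-ish method


def completely_random(n=25, rng=random):
    """n points whose coordinates are independent and uniform on [0, SIDE)."""
    return [(SIDE * rng.random(), SIDE * rng.random()) for _ in range(n)]


def random_ish(rng=random):
    """One uniformly random point inside each cell of a GRID x GRID grid."""
    cell = SIDE / GRID  # width of one grid cell
    return [
        ((col + rng.random()) * cell, (row + rng.random()) * cell)
        for row in range(GRID)
        for col in range(GRID)
    ]


if __name__ == "__main__":
    print(completely_random()[:3])
    print(random_ish()[:3])
```

Plotting the two lists side by side should show the difference: the Random-ish points can never pile up, because each grid cell gets exactly one.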
Perhaps those descriptions will help you better determine which graph contains truly random dots. Care to take another look?
Graph C was the most fun; it was Dartboard. I despise dumb-ass probability problems like, “What’s the probability that a randomly thrown dart will land inside the square and outside the circle?” What kind of person throws darts randomly? That’s just stupid, not to mention irresponsible. You’ll put someone’s eye out! Yet throw them randomly I did. And I’m happy to inform you that none of the walls around my dartboard were harmed in the making of this graph.
Graph A was Random-ish. Although there is some white space in the upper right and two dots are somewhat close in the bottom middle, generally the dots are well spaced.
Graph D was Eli’s Choice. As with Random-ish, the dots are well spaced: Eli was rather careful to make sure that none of them were too close to one another. I divided the square into 25 unit squares after he drew his dots, and there was a single dot in 23 of them. There are also some definite linear tendencies.
By process of elimination, you now know that Graph B was Completely Random. The function =5*RAND() was used to choose each coordinate for all 25 points in an Excel spreadsheet, and then they were graphed as a scatterplot.
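To put Eli's 23 single-dot squares in context, the same unit-square check can be run on completely random points. The sketch below is a rough simulation, not the spreadsheet itself: Python's `random` stands in for `RAND()`, and the function names are invented for illustration. For 25 independent uniform points, the expected number of unit squares holding exactly one dot is 25 × (24/25)^24, or about 9.4, which is nowhere near 23.

```python
import random
from collections import Counter


def single_dot_cells(points):
    """Count the unit squares that contain exactly one point."""
    counts = Counter((int(x), int(y)) for x, y in points)
    return sum(1 for c in counts.values() if c == 1)


def average_single_dot_cells(trials=10_000, n=25, side=5, seed=0):
    """Average single-dot count over many completely random layouts."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    total = 0
    for _ in range(trials):
        pts = [(side * rng.random(), side * rng.random()) for _ in range(n)]
        total += single_dot_cells(pts)
    return total / trials


if __name__ == "__main__":
    print("simulated average:", average_single_dot_cells())
    print("theoretical value:", 25 * (24 / 25) ** 24)
```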
So, what’s the point?
Graph D was generated by a seven-year-old, but it’s not terribly different from what would be generated by an average adult. What most people think of as random is usually anything but. Truly random data usually has clusters within it. For instance, flip a coin 200 times, and there is roughly an 80% chance of a run of at least 6 consecutive heads; the chance of a run of at least 6 consecutive heads or 6 consecutive tails is higher still, about 97%.
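The streak probabilities for 200 flips can be checked exactly with a small dynamic program rather than taken on faith. This is a sketch assuming a fair coin and independent flips, with function names of my own choosing. By this calculation, the chance comes out near 80% for a run of 6 or more heads specifically, and near 97% when a run of either face counts.

```python
def prob_run_heads(n_flips, run_len):
    """P(at least one run of run_len or more heads in n_flips fair flips)."""
    # alive[j] = probability that no such run has occurred yet and the
    # sequence currently ends in exactly j consecutive heads.
    alive = [0.0] * run_len
    alive[0] = 1.0
    for _ in range(n_flips):
        nxt = [0.0] * run_len
        nxt[0] = 0.5 * sum(alive)  # a tail resets the streak
        for j in range(run_len - 1):
            nxt[j + 1] += 0.5 * alive[j]  # a head extends the streak
        # Mass in alive[run_len - 1] followed by a head completes a run
        # and is dropped from the "no run yet" total.
        alive = nxt
    return 1.0 - sum(alive)


def prob_run_either(n_flips, run_len):
    """P(at least one run of run_len or more identical faces); run_len >= 2."""
    # A streak of run_len identical faces is a streak of run_len - 1
    # "same as the previous flip" events among the n_flips - 1 comparisons,
    # and each comparison is itself a fair, independent coin.
    return prob_run_heads(n_flips - 1, run_len - 1)


if __name__ == "__main__":
    print("6+ heads in 200 flips:      ", prob_run_heads(200, 6))
    print("6+ of either face in 200:   ", prob_run_either(200, 6))
```

The second function leans on a small trick: instead of tracking heads and tails separately, it tracks whether each flip matches the one before it, which turns the either-face question into a heads-only question that is one flip and one streak-length shorter.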
For Graph B, the clustering that happens in the upper left and with the two dots that touch in the lower right is completely typical when data is generated at random. As humans, we often try to ascribe some special meaning to these occurrences. For instance, when there are ten homicides in one week in a city that normally only has two homicides per week, headlines will declare “Murder On The Rise,” and the mayor will hold a press conference on what the police are doing to address the problem. But there’s nothing to be done. This kind of grouping is normal. A few weeks later, when there are zero homicides in one week, which is also typical, hardly anyone will notice.
If you didn’t correctly choose Graph B, then perhaps you’re random-averse.
Or, more likely, you’re just normal.