## Posts tagged ‘mean’

### Why Joe Doesn’t Think He’s Average

Today is an average day, with 182 days left in the year, and 182 days in the rear-view mirror.

I know a lot of average jokes:

When my stats teacher said that I was average, she was just being mean.

With my head in a fire

And my feet on some ice,

I’d say that, on average,

I feel rather nice.

Two men are sitting in a bar when Mark Zuckerberg walks in. One of the men says to his friend, “How awesome! On average, everyone in this bar is a billionaire!”

The last joke highlights the issue with using the arithmetic mean when the distribution would be more meaningful. Let’s assume that one person in the bar has a net worth less than $10,000, two people have net worth between $10,000 and $100,000, and nine people have net worth between $100,000 and $1,000,000. (These are reasonable estimates for the distribution of net worth in the U.S., by the way.) Then a hypothetical histogram with a logarithmic scale showing the net worth of all the people in the bar might look something like this:

The average net worth of all 13 people in the bar is over $1,000,000,000 — actually, it’s over $4,000,000,000, because Zuckerberg’s net worth is around $60 billion — but only one of them actually has that much money.
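To see how far the mean drifts from the typical patron, here is a minimal Python sketch. The individual net worths are hypothetical values chosen to fall inside the brackets described above, and the $60 billion figure is the rough estimate from the text; none of them are real data.

```python
from statistics import mean, median

# Hypothetical net worths (USD), picked to sit inside the brackets in the text:
# one person under $10,000, two between $10,000 and $100,000,
# nine between $100,000 and $1,000,000.
patrons = [5_000] + [50_000] * 2 + [500_000] * 9
zuckerberg = 60_000_000_000  # rough estimate quoted in the text

bar = patrons + [zuckerberg]

bar_mean = mean(bar)
bar_median = median(bar)

print(f"people in bar:    {len(bar)}")
print(f"mean net worth:   ${bar_mean:,.0f}")
print(f"median net worth: ${bar_median:,.0f}")
```

With these made-up values, the mean lands above $4 billion while the median stays at $500,000, which is much closer to what a typical person in the bar actually has.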

In general, describing data with its average is a terrible idea…

If you’re an “average” person, then you’re a 5’9” (10%) male (50%) with brown eyes (55%) and straight (55%), black hair (85%) who wears size 10.5 (US) shoes (20%). You have 25 teeth in your mouth (30%) — including 4 wisdom teeth (80%) — normal color vision (95%), and O+ blood (40%), but you don’t have dimples (75%). You don’t have hitchhiker’s thumb (75%) or a bent little finger (95%), either, but you can roll your tongue (80%). You have an innie belly button (90%), loops in your fingerprints (65%) instead of whorls or arches, and attached earlobes (65%), and when you interlace your fingers, your left thumb rests atop your right thumb (55%). You sleep 6.5 hours per night (15%), smoke 800 cigarettes per year (10%), and consume 2 alcoholic drinks per week (30%). You’re 29 years old (2%), eat 3 servings of fruits and vegetables a day (20%), get 70 minutes of cardio exercise per week (25%), and have a body mass index (BMI) of 25 (15%).

And thanks to the artistic styling of Paul Wrangles at Sparky Teaching, the average person might look a little something like this:

The numbers in parentheses represent the percent of the world population that has the given characteristic. Admittedly, they’re WAGs; I grabbed each statistic from a random location on the web, and I have absolutely no data to back up any of these claims. Moreover, they’re not very precise; I rounded each to the nearest five percent, because a greater level of precision might give the appearance that they’re somehow more accurate.

That said, I don’t think they’re horribly wrong, either, and even if they’re slightly off, they’ll still serve my point. Which is this: Though this description captures an “average” person, it’s pretty far from representing a typical person. The probability that such a person actually exists is only about 1 in 3,500,000,000.

So if you read the description above and thought, “Hey, that’s me!” then you should feel pretty special, indeed — there is likely only one other person in the world with those same characteristics.

The characteristics for the average person used above are sometimes the **mean** for the category (height, shoe size) and sometimes the **mode** (eye color, fingerprints). Both of these measures of central tendency are known as averages, as is the **median** (which I used at the start of this post when claiming that today is an average day).
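The “average day” claim can be checked with a short Python sketch. It assumes a non-leap year of 365 days; the year 2023 is used only as a convenient example.

```python
from datetime import date, timedelta
from statistics import median

# Assumption: a non-leap year, so the days are numbered 1 through 365.
days = list(range(1, 366))
middle_day = median(days)  # the median day number

# Convert the day number to a calendar date (2023 is just an example year).
middle_date = date(2023, 1, 1) + timedelta(days=middle_day - 1)
days_before = middle_day - 1
days_after = 365 - middle_day

print(middle_date.strftime("%B %d"), days_before, days_after)
```

The median day is the 183rd, which falls on July 2 and leaves 182 days on either side.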

Life expectancy is another one of those situations where the average provides misleading, or at least incomplete, information.

Today, the infant mortality rate worldwide is just under 5%, and life expectancy is 71 years. A very simplistic model for this data is to assume that 19 out of 20 people will live to age 75, but 1 out of 20 will die during their first year of life. This model is clearly wrong, but as George Box said, “All models are wrong, but some are useful.” Check out the math with this model:
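Treating every first-year death as a lifespan of 0 years (a simplification made for this sketch), the model’s average works out to:

$$
\frac{19 \times 75 + 1 \times 0}{20} = \frac{1425}{20} = 71.25 \text{ years}
$$

That is close to the quoted life expectancy of 71 years.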

This model is useful, because it shows that the 95% of people who survive infancy can expect to live to age 75.

Now compare that to the middle ages, when the infant mortality rate was a staggering 30%, and life expectancy was 35 years. Again using a simplistic model, 7 out of 10 people would live to age 50, while 3 out of 10 would die before they reached the age of 1. The math looks like this:
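Counting a death before age 1 as 0 years (again, an assumption of this sketch), the weighted average is:

$$
\frac{7 \times 50 + 3 \times 0}{10} = \frac{350}{10} = 35 \text{ years}
$$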

So, there’s a problem with using the average to talk about life expectancy, because the distribution in the middle ages was badly skewed by so many childhood deaths.

If we compare life expectancy now to the middle ages using the average of the entire population, it’s a distorted picture. But when we remove the deaths as a result of infant mortality, it’s a little less bleak: those living past age 1 today have a life expectancy of 75 years; those living past age 1 in the middle ages had a life expectancy of 50 years. The scales are still tipped heavily in our favor, but it doesn’t seem quite as drastic as a ratio of 71 to 35.

To put this in perspective, the life expectancy in 1950 was just under 50 years. Most of the increase in life expectancy has actually happened in the last century; during the last 70 years, longevity has increased by more than 20 years.

How typical are you? How long will you live? I have no idea, but I do know this: Half of the people you know are below average.

### More Number Picking

In a previous post, I mentioned the Pick-a-Number game that the folks at NPR’s Planet Money were running:

Pick a number between 0 and 100. The goal is to pick the number that’s closest to *half* the average of all guesses. For example, if the average of all guesses were 80, the winning number would be 40.

If everyone picked randomly, you would expect the mean to be approximately 50, in which case the winning number would be 25. So, you’d choose 25, right? But if everyone uses that same logic, then the mean would be 25, and the winning number would be 12.5. So, you’d choose 12.5, right? But if everyone used that same logic…

Well, you get the point.
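The chain of reasoning above can be sketched in a few lines of Python. The function name `guess_after` is made up for this illustration, and the starting mean of 50 is the random-guess assumption from the paragraph above.

```python
def guess_after(levels: int, start: float = 50.0) -> float:
    """Best guess after `levels` rounds of 'everyone else reasons like me'.

    Each round halves the previous expected mean.
    """
    return start * 0.5 ** levels


# Guesses after 1 through 7 rounds of second-guessing.
guesses = [guess_after(k) for k in range(1, 8)]
print(guesses)
```

Each round halves the guess, so the sequence heads toward 0. The practical question is how many rounds of reasoning the other players will actually go through.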

When making your choice, it starts to feel like a game against Vizzini, the Sicilian from *The Princess Bride*.

Only a great fool would reach for what he was given. I am not a great fool, so I can clearly not choose half the expected mean. But you must have known I was not a great fool, so I can clearly not choose half of half the expected mean…

Well, the results are in, and you can view them (and an explanation) here.

I take a minimal level of pride in receiving one of 772 honorable mentions for my guess of 12. (Don’t look for my name in the list, though. I used my son’s name as a pseudonym.)

Here’s a very simple pick-a-number game:

Pick a number between 12 and 5.

Make your pick before reading the next paragraph.

Did you pick 7? Most people do. My theory is that the magnitude and order of the numbers matter. Because the larger number is given first, and because the difference between the numbers falls within the appropriate range (12 – 5 = 7), it’s the “obvious” choice.

The trick would probably work equally well if the set-up were, “Pick a number between 19 and 6.” I suspect the most common choice would be 13.

Of course, this is just pop math psychology.

Speaking of “picking” and “numbers,” here’s a line a friend of mine used on an attractive waitress:

How can it be that I’ve memorized the first 100 digits of π, yet I don’t know the 7 digits in your phone number?

For the record, I condone neither hitting on a waitress nor using that line.

### What (Math) is in a Name?

One of my favorite online tools is the Mean and Median app from Illuminations. This tool allows you to create a data set with up to 15 elements, plot them on a number line, investigate the mean and median, and consider a box-and-whisker plot based on the data. Perhaps the coolest feature is that you can copy an entire set of data, make some changes, and compare the modified set to the original set. For example, the box-and-whisker plots below look very different, even though the mean and median of the two sets are the same.
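As a rough illustration of how two sets can share a mean and median yet look very different, here is a Python sketch with two made-up data sets. The values are hypothetical, not taken from the app, and are kept to integers from 1 to 100.

```python
from statistics import mean, median

# Two hypothetical data sets with the same center but different spread.
narrow = [40, 45, 50, 55, 60]
wide = [10, 30, 50, 70, 90]

for label, data in (("narrow", narrow), ("wide", wide)):
    spread = max(data) - min(data)  # the range of the data
    print(label, mean(data), median(data), spread)
```

Both sets have a mean and median of 50, but the range is 20 for one and 80 for the other, so their box-and-whisker plots would have very different widths.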

It’s a neat tool for learning about mean and median, and I plan to use this tool in an upcoming presentation.

**Exceptional, Free Online Resources for the Middle Grades Classroom**

*G. Patrick Vennebush*

Thursday, October 20, 12:30-2:00pm

Room 401 (Atlantic City Convention Center)

For classroom use, I like to use this app with real sets of data. However, the app requires all elements of a data set to be integers from 1-100. Can you think of a data set with a reasonable spread that has no (or at least few) elements greater than 100? If so, leave a comment.

Recently, and rather accidentally, I found a data set that works well. Do the following:

Assign each letter of the alphabet a value as follows: A = 1, B = 2, C = 3, and so on. Find the sum of the letters in your name; e.g., BOB → 2 + 15 + 2 = 19.

Now imagine that every student in a class finds the sum of the letters in their first name. For a typical class, what is the range of the data? What are the mean and median?

The name with the smallest sum that I could find?

ABE → 1 + 2 + 5 = 8

The name with the largest sum?

CHRISTOPHER → 3 + 8 + 18 + 9 + 19 + 20 + 15 + 16 + 8 + 5 + 18 = 139
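For a whole class list, the name sums are easy to compute with a short Python sketch. The function name `name_sum` is made up for this illustration, and it assumes names are written with the 26 unaccented English letters; any other character is ignored.

```python
def name_sum(name: str) -> int:
    """Sum the letter values of a name, with A = 1, B = 2, ..., Z = 26.

    Assumption: characters outside A-Z (spaces, hyphens, accented
    letters) are skipped rather than counted.
    """
    return sum(ord(ch) - ord("A") + 1 for ch in name.upper() if "A" <= ch <= "Z")


for name in ("Bob", "Abe", "Christopher"):
    print(name, name_sum(name))
```

Running it on the three names above reproduces the sums 19, 8, and 139.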

The Social Security Administration provides a nice resource for investigation, Popular Baby Names. Using a randomly selected set of 2,000 names and an Excel spreadsheet, I found the mean name sum to be 62.49, and 96% of the names had sums less than 100. Of the 80 names with sums greater than 100, many (such as Christopher, Timothy, Gwendolyn, Jacquelyn) have shortened forms (Chris, Tim, Gwen, Jackie) for which the sum is less than 100.

As it turns out, the frequency with which letters occur in first names differs from their frequency in common English words. The most common letter in English words is *e*, but the most common letter in names is *a*. The chart below shows the frequency with which letters occur in first names.

Because of this distribution, the average value of a letter within a first name is 10.54, which is noticeably less than the 13.50 you might expect if all 26 letters were equally likely. This is because letters at the beginning of the alphabet, which contribute smaller values to the name sum, occur more often in names than letters at the end of the alphabet.

The chart below shows the distribution for the number of letters within first names. The mean number of letters within first names is 5.92 letters, and the median is 6. (In the data set of 2,000 names from which this chart is derived, no name contained more than 11 letters.)
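The 13.50 baseline, and a rough cross-check of the 10.54 figure, can be sketched in Python. The cross-check uses only the rounded figures quoted in this post, so it should be read as approximate.

```python
# Average letter value if all 26 letters were equally likely.
uniform_average = sum(range(1, 27)) / 26

# Rough cross-check from the rounded figures quoted in this post:
# total letter value / total letter count = mean name sum / mean name length.
mean_name_sum = 62.49
mean_name_length = 5.92
implied_letter_average = mean_name_sum / mean_name_length

print(uniform_average)
print(round(implied_letter_average, 2))
```

The uniform average is exactly 13.5, and the implied per-letter average comes out near 10.5, in line with the 10.54 reported above once rounding is taken into account.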

Do you know a name that has more than 11 letters or has a name sum greater than 139 or less than 8? Let me know in the comments.

### Mean and Standard Deviation

As a follow-up to yesterday’s post, here’s a poem titled *Mean and SD* by Norman Chansky, professor emeritus at Temple University. Ostensibly, the poem first appeared in the *Journal of Irreproducible Results*, though I was unable to find an exact citation.

The mean is a measure of location,

The center of a population.

If at random a score you drew,

The mean’s the most likely score you’d view.

You can compute the mean in your slumber:

Sum the scores, and divide by the number.

At the mean, sample scores converge;

From the mean, these scores diverge.

Near the mean, the scores are many.

In the tails, there are hardly any.

But to measure a distribution’s variation,

From the mean, find each score’s deviation.

Each difference of *D* score, now you square.

Sum all *D* scores, all scores’ share.

Now this sum, divide by *N*.

That’s *V*, the variance, then.

The square root of *V* is called *SD*,

The gauge of a trait’s variability.

We’ve found two moments of a distribution,

Developed from each score’s contribution.

Picturing a universe, try to see:

Its center, the mean; its orbit, *SD*.
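The recipe in the poem translates almost line for line into Python. The scores below are hypothetical, chosen so the arithmetic comes out evenly. Note that the poem divides by *N*, which gives the population variance; a sample variance would divide by *N* − 1 instead.

```python
from math import sqrt
from statistics import pstdev

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

n = len(scores)
mean = sum(scores) / n                 # sum the scores, divide by the number
d_scores = [x - mean for x in scores]  # each score's deviation from the mean
variance = sum(d * d for d in d_scores) / n  # square, sum, divide by N
sd = sqrt(variance)                    # the square root of V is SD

print(mean, variance, sd)
print(pstdev(scores))  # the standard library's population SD, for comparison
```

For these scores the mean is 5, the variance is 4, and the standard deviation is 2, matching `statistics.pstdev`.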