## Posts tagged ‘median’

### Why Joe Doesn’t Think He’s Average

Today is an average day, with 182 days left in the year, and 182 days in the rear-view mirror.

I know a lot of average jokes:

When my stats teacher said that I was average, she was just being mean.

With my head in a fire
And my feet on some ice,
I’d say that, on average,
I feel rather nice.

Two men are sitting in a bar when Mark Zuckerberg walks in. One of the men says to his friend, “How awesome! On average, everyone in this bar is a billionaire!”

The last joke highlights the issue with using the arithmetic mean when the distribution would be more meaningful. Let’s assume that one person in the bar has a net worth less than $10,000, two people have net worth between $10,000 and $100,000, and nine people have net worth between $100,000 and $1,000,000. (These are reasonable estimates for the distribution of net worth in the U.S., by the way.) Then a hypothetical histogram with a logarithmic scale showing the net worth of all the people in the bar might look something like this:

The average net worth of all 13 people in the bar is over $1,000,000,000 (actually, it’s over $4,000,000,000, because Zuckerberg’s net worth is around $60 billion), but only one of them actually has that much money.
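If you’d like to see the gap between mean and median for yourself, here is a minimal Python sketch. The individual net worths are made-up values that simply fall inside the ranges described above, and the names `patrons` and `bar` are mine, so treat it as an illustration rather than real data.

```python
from statistics import mean, median

# Hypothetical net worths (USD), picked to fall inside the ranges in the post.
patrons = (
    [5_000]            # one person under $10,000
    + [50_000] * 2     # two people between $10,000 and $100,000
    + [500_000] * 9    # nine people between $100,000 and $1,000,000
)
zuckerberg = 60_000_000_000  # roughly $60 billion, as stated in the post

bar = patrons + [zuckerberg]

mean_worth = mean(bar)      # pulled far upward by the single huge value
median_worth = median(bar)  # the middle value of the 13 sorted net worths

print(f"people in bar: {len(bar)}")
print(f"mean:   ${mean_worth:,.0f}")
print(f"median: ${median_worth:,.0f}")
```

With these assumed values, the mean lands in the billions while the median stays at the net worth of an ordinary patron, which is the whole point of the joke.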

In general, describing data with its average is a terrible idea…

If you’re an “average” person, then you’re a 5’9” (10%) male (50%) with brown eyes (55%) and straight (55%), black hair (85%) who wears size 10.5 (US) shoes (20%). You have 25 teeth in your mouth (30%) — including 4 wisdom teeth (80%) — normal color vision (95%), and O+ blood (40%), but you don’t have dimples (75%). You don’t have hitchhiker’s thumb (75%) or a bent little finger (95%), either, but you can roll your tongue (80%). You have an innie belly button (90%), loops in your fingerprints (65%) instead of whorls or arches, and attached earlobes (65%), and when you interlace your fingers, your left thumb rests atop your right thumb (55%). You sleep 6.5 hours per night (15%), smoke 800 cigarettes per year (10%), and consume 2 alcoholic drinks per week (30%). You’re 29 years old (2%), eat 3 servings of fruits and vegetables a day (20%), get 70 minutes of cardio exercise per week (25%), and have a body mass index (BMI) of 25 (15%).

And thanks to the artistic styling of Paul Wrangles at Sparky Teaching, the average person might look a little something like this:

The numbers in parentheses represent the percent of the world population that has the given characteristic. Admittedly, they’re WAGs; I grabbed each statistic from a random location on the web, and I have absolutely no data to back up any of these claims. Moreover, they’re not very precise; I rounded each to the nearest five percent, because a greater level of precision might give the appearance that they’re somehow more accurate.

That said, I don’t think they’re horribly wrong, either, and even if they’re slightly off, they’ll still serve my point. Which is this: Though this description captures an “average” person, it’s pretty far from representing a typical person. The probability that a randomly chosen person has every one of these characteristics is only about 1 in 3,500,000,000.

So if you read the description above and thought, “Hey, that’s me!” then you should feel pretty special, indeed — there is likely only one other person in the world with those same characteristics.

The characteristics for the average person used above are sometimes the mean for the category (height, shoe size) and sometimes the mode (eye color, fingerprints). Both of these measures of central tendency are known as averages, as is the median (which I used at the start of this post when claiming that today is an average day).
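Here is a small Python sketch of two of those averages, using the standard `statistics` module. It assumes a non-leap year of 365 days, and the list of eye colors is an invented sample, not data from this post.

```python
from statistics import median, mode

# Median: the middle day of a non-leap year (an assumption about the calendar).
days_in_year = 365
middle_day = median(range(1, days_in_year + 1))
days_before = middle_day - 1
days_after = days_in_year - middle_day
print(middle_day, days_before, days_after)

# Mode: the most common value in a categorical sample (hypothetical data).
eye_colors = ["brown", "blue", "brown", "green", "brown", "hazel"]
most_common_color = mode(eye_colors)
print(most_common_color)
```

The median day has the same number of days on either side of it, which is why an “average day” in the median sense splits the year in half.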

Life expectancy is another one of those situations where the average provides misleading — or, at least incomplete — information.

Today, the infant mortality rate worldwide is just under 5%, and life expectancy is 71 years. A very simplistic model for this data is to assume that 19 out of 20 people will live to age 75, but 1 out of 20 will die during their first year of life. This model is clearly wrong, but as George Box said, “All models are wrong, but some are useful.” Check out the math with this model:

$\frac{19 \times 75 + 1 \times 0}{20} \approx 71$

This model is useful, because it shows that the 95% of people who survive infancy can expect to live to age 75.

Now compare that to the Middle Ages, when the infant mortality rate was a staggering 30%, and life expectancy was 35 years. Again using a simplistic model, 7 out of 10 people would live to age 50, while 3 out of 10 would die before they reached the age of 1. The math looks like this:

$\frac{7 \times 50 + 3 \times 0}{10} = 35$

So, there’s a problem with using the average to talk about life expectancy, because the distribution in the Middle Ages was badly skewed by so many childhood deaths.

If we compare life expectancy now to the Middle Ages using the average of the entire population, it’s a distorted picture. But when we remove the deaths as a result of infant mortality, it’s a little less bleak: those living past age 1 today have a life expectancy of 75 years; those living past age 1 in the Middle Ages had a life expectancy of 50 years. The scales are still tipped heavily in our favor, but it doesn’t seem quite as drastic as a ratio of 71 to 35.
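The two-group model is easy to play with in code. The sketch below is only the simplistic model from this post (everyone either dies at age 0 or lives to a single fixed age); the function name `overall_life_expectancy` and its parameters are my own invention.

```python
def overall_life_expectancy(infant_mortality, survivor_age, infant_age=0):
    """Weighted mean age at death in the simplistic two-group model.

    infant_mortality: fraction of people who die in infancy (0 to 1)
    survivor_age: age reached by everyone who survives infancy
    infant_age: age assigned to infant deaths (0 in the post's model)
    """
    return (1 - infant_mortality) * survivor_age + infant_mortality * infant_age


today = overall_life_expectancy(infant_mortality=0.05, survivor_age=75)
middle_ages = overall_life_expectancy(infant_mortality=0.30, survivor_age=50)

# Compare the ratio for the whole population with the ratio for survivors only.
whole_population_ratio = today / middle_ages
survivors_only_ratio = 75 / 50

print(f"today: {today:.2f}, Middle Ages: {middle_ages:.2f}")
print(f"ratio, everyone:       {whole_population_ratio:.2f}")
print(f"ratio, survivors only: {survivors_only_ratio:.2f}")
```

Under these assumptions the whole-population ratio is roughly 2 to 1, while the survivors-only ratio is 1.5 to 1.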

To put this in perspective, the life expectancy in 1950 was just under 50 years. Most of the increase in life expectancy has actually happened in the last century; during the last 70 years, longevity has increased by more than 20 years.

How typical are you? How long will you live? I have no idea, but I do know this: Half of the people you know are below average.

### What (Math) is in a Name?

One of my favorite online tools is the Mean and Median app from Illuminations. This tool allows you to create a data set with up to 15 elements, plot them on a number line, investigate the mean and median, and consider a box-and-whisker plot based on the data. Perhaps the coolest feature is that you can copy an entire set of data, make some changes, and compare the modified set to the original set. For example, the box-and-whisker plots below look very different, even though the mean and median of the two sets are the same.

It’s a neat tool for learning about mean and median, and I plan to use this tool in an upcoming presentation.

For classroom use, I like to use this app with real sets of data. However, the app requires all elements of a data set to be integers from 1 to 100. Can you think of a data set with a reasonable spread that has no (or at least few) elements greater than 100? If so, leave a comment.

Recently, and rather accidentally, I found a data set that works well. Do the following:

Assign each letter of the alphabet a value as follows: A = 1, B = 2, C = 3, and so on. Find the sum of the letters in your name; e.g., BOB → 2 + 15 + 2 = 19.

Now imagine that every student in a class finds the sum of the letters in their first name. For a typical class, what is the range of the data? What are the mean and median?

The name with the smallest sum that I could find?

ABE → 1 + 2 + 5 = 8

The name with the largest sum?

CHRISTOPHER → 3 + 8 + 18 + 9 + 19 + 20 + 15 + 16 + 8 + 5 + 18 = 139
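If you’d rather not add by hand, here is a short Python sketch of the A = 1, B = 2, C = 3 scheme. The function name `name_sum` is my own, and ignoring anything that isn’t one of the 26 letters A through Z (hyphens, apostrophes, accented letters) is a choice I made for the sketch, not a rule from the activity.

```python
def name_sum(name):
    """Sum the letter values of a name, with A = 1, B = 2, ..., Z = 26.

    Characters outside A-Z (after uppercasing) are skipped.
    """
    return sum(
        ord(ch) - ord("A") + 1
        for ch in name.upper()
        if "A" <= ch <= "Z"
    )


for name in ["Bob", "Abe", "Christopher"]:
    print(name, name_sum(name))
```

Students could feed in the whole class roster and then drop any sums from 1 to 100 straight into the Mean and Median app.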

The Social Security Administration provides a nice resource for investigation, Popular Baby Names. Using a randomly selected set of 2,000 names and an Excel spreadsheet, I found the mean name sum to be 62.49, and 96% of the names had sums less than 100. Of the 80 names with sums greater than 100, many (such as Christopher, Timothy, Gwendolyn, Jacquelyn) have shortened forms (Chris, Tim, Gwen, Jackie) for which the sum is less than 100.

As it turns out, the frequency with which letters occur in first names differs from their frequency in common English words. The most common letter in English words is e, but the most common letter in names is a. The chart below shows the frequency with which letters occur in first names.

Because of this distribution, the average value of a letter within a first name is 10.54, which is less than the 13.50 you might expect if all 26 letters were equally likely (13.50 is the mean of the values 1 through 26). This is because letters at the beginning of the alphabet, which contribute smaller values to the name sum, occur more often in names than letters at the end of the alphabet.
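To check the 13.50 baseline and try the comparison on your own list, here is a rough Python sketch. The five names are a tiny hypothetical sample (not the 2,000 names from the SSA data), so the sample mean it prints says nothing about the 10.54 figure above.

```python
from statistics import mean

# Baseline: if every letter A-Z were equally likely, the expected value
# of a letter is the mean of 1 through 26.
uniform_mean = mean(range(1, 27))

# Hypothetical sample; replace with a real class roster or name list.
names = ["Abe", "Bob", "Christopher", "Gwendolyn", "Tim"]

# Value of every letter in every name, with A = 1, ..., Z = 26.
letter_values = [
    ord(ch) - ord("A") + 1
    for name in names
    for ch in name.upper()
    if "A" <= ch <= "Z"
]
sample_mean = mean(letter_values)

print(f"uniform baseline: {uniform_mean}")
print(f"letters counted:  {len(letter_values)}")
print(f"sample mean:      {sample_mean:.2f}")
```

A larger, more representative list of names would be needed before comparing the result to the baseline in any meaningful way.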

The chart below shows the distribution for the number of letters within first names. The mean number of letters within first names is 5.92 letters, and the median is 6. (In the data set of 2,000 names from which this chart is derived, no name contained more than 11 letters.)

Do you know a name that has more than 11 letters or has a name sum greater than 139 or less than 8? Let me know in the comments.

Inquiring minds want to know, so here are answers to questions that you’ve surely been pondering.

Q: If one man can wash one stack of dishes in one hour, how many stacks of dishes can four men wash in four hours?
A: None. They’ll all sit down together to watch football.

Q: Why don’t members of the Ku Klux Klan study Calculus?
A: Because they don’t like to integrate.

Q: What did the circle say to the tangent line?
A: “Stop touching me!”

Q: Why did the statistician cross the interstate?
A: To analyze data on the other side of the median.

The Math Jokes 4 Mathy Folks blog is an online extension to the book Math Jokes 4 Mathy Folks. The blog contains jokes submitted by readers, new jokes discovered by the author, details about speaking appearances and workshops, and other random bits of information that might be interesting to the strange folks who like math jokes.
