What (Math) is in a Name?
October 3, 2011 at 12:12 pm
One of my favorite online tools is the Mean and Median app from Illuminations. This tool allows you to create a data set with up to 15 elements, plot them on a number line, investigate the mean and median, and consider a box-and-whisker plot based on the data. Perhaps the coolest feature is that you can copy an entire set of data, make some changes, and compare the modified set to the original set. For example, the box-and-whisker plots below look very different, even though the mean and median of the two sets are the same.
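To see how two sets can share a mean and median yet produce different box-and-whisker plots, here is a minimal Python sketch. The two data sets are made up for illustration (they are not the ones from the app), and the quartiles come from Python's `statistics.quantiles`, whose convention may differ slightly from the one the app uses.

```python
from statistics import mean, median, quantiles

# Two made-up data sets (assumptions, not taken from the app), each within
# the app's limits: at most 15 integers, all between 1 and 100.
narrow = [40, 45, 50, 55, 60]
wide = [10, 20, 50, 80, 90]

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max), the values a box plot draws."""
    q1, q2, q3 = quantiles(data, n=4)  # quartile conventions vary by tool
    return (min(data), q1, q2, q3, max(data))

for label, data in (("narrow", narrow), ("wide", wide)):
    print(label, "mean:", mean(data), "median:", median(data))
    print(label, "five-number summary:", five_number_summary(data))
```

Both sets have a mean and median of 50, but the box and whiskers for the second set stretch much farther in each direction.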
It’s a neat tool for learning about mean and median, and I plan to use it in an upcoming presentation.
- Exceptional, Free Online Resources for the Middle Grades Classroom
G. Patrick Vennebush
Thursday, October 20, 12:30-2:00pm
Room 401 (Atlantic City Convention Center)
For classroom use, I like to use this app with real sets of data. However, the app requires all elements of a data set to be integers from 1 to 100. Can you think of a data set with a reasonable spread that has no (or at least few) elements greater than 100? If so, leave a comment.
Recently, and rather accidentally, I found a data set that works well. Do the following:
Assign each letter of the alphabet a value as follows: A = 1, B = 2, C = 3, and so on. Find the sum of the letters in your name; e.g., BOB → 2 + 15 + 2 = 19.
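The rule above can be sketched in a few lines of Python. The function name `name_sum` is my own, and skipping anything that is not a plain letter A to Z (hyphens, spaces, accented letters) is an assumption; the rule itself does not say how to handle those.

```python
def name_sum(name):
    """Sum the letter values of a name, with A = 1, B = 2, ..., Z = 26.

    Case is ignored. Characters outside A-Z are skipped, which is an
    assumption made for this sketch.
    """
    return sum(
        ord(ch) - ord("A") + 1
        for ch in name.upper()
        if "A" <= ch <= "Z"
    )

print(name_sum("BOB"))  # 2 + 15 + 2 = 19
```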
Now imagine that every student in a class finds the sum of the letters in their first name. For a typical class, what is the range of the data? What are the mean and median?
The name with the smallest sum that I could find?
ABE → 1 + 2 + 5 = 8
The name with the largest sum?
CHRISTOPHER → 3 + 8 + 18 + 9 + 19 + 20 + 15 + 16 + 8 + 5 + 18 = 139
The Social Security Administration provides a nice resource for investigation, Popular Baby Names. Using a randomly selected set of 2,000 names and an Excel spreadsheet, I found the mean name sum to be 62.49, and 96% of the names had sums less than 100. Of the 80 names with sums greater than 100, many (such as Christopher, Timothy, Gwendolyn, Jacquelyn) have shortened forms (Chris, Tim, Gwen, Jackie) for which the sum is less than 100.
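I did this analysis in Excel, but the same checks are easy to sketch in Python. The example below uses only the eight names mentioned in the paragraph above, not the 2,000-name sample, so its mean and percentage will not match the figures quoted there. The helper names `name_sum` and `share_below` are hypothetical.

```python
from statistics import mean

def name_sum(name):
    """A = 1, B = 2, ..., Z = 26; non-letters are skipped (an assumption)."""
    return sum(ord(ch) - ord("A") + 1 for ch in name.upper() if "A" <= ch <= "Z")

def share_below(names, limit=100):
    """Fraction of names whose sum is strictly less than `limit`."""
    sums = [name_sum(n) for n in names]
    return sum(s < limit for s in sums) / len(sums)

# Full names and the shortened forms mentioned in the post.
short_forms = {
    "Christopher": "Chris",
    "Timothy": "Tim",
    "Gwendolyn": "Gwen",
    "Jacquelyn": "Jackie",
}

for full, short in short_forms.items():
    print(f"{full}: {name_sum(full)}   {short}: {name_sum(short)}")

# A tiny illustrative list, NOT the 2,000-name sample.
names = list(short_forms) + list(short_forms.values())
print("mean name sum:", mean(name_sum(n) for n in names))
print("share below 100:", share_below(names))
```

With a larger list of names (for instance, one per line in a text file), the same two functions would give the mean name sum and the share of names that fit within the app's limit of 100.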
As it turns out, the frequency with which letters occur in first names differs from their frequency in common English words. The most common letter in English words is e, but the most common letter in names is a. The chart below shows the frequency with which letters occur in first names.
Because of this distribution, the average value of a letter within a first name is 10.54, which is less than the 13.50 you would expect if all 26 letters were equally likely (the average of 1 through 26). This is because letters at the beginning of the alphabet, which contribute smaller values to the name sum, occur more often in names than letters at the end of the alphabet.
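Here is a rough Python sketch of that comparison. The 13.5 is simply the average of the values 1 through 26, which is what you would get if every letter were equally likely. The sample list is a handful of names from this post, used as a stand-in; it is far too small to reproduce the 10.54 figure, and the function names are hypothetical.

```python
from collections import Counter

def letter_counts(names):
    """Count how often each letter A-Z occurs across all the names."""
    return Counter(
        ch for name in names for ch in name.upper() if "A" <= ch <= "Z"
    )

def average_letter_value(names):
    """Mean letter value (A = 1, ..., Z = 26) over every letter in the list."""
    counts = letter_counts(names)
    total_letters = sum(counts.values())
    total_value = sum((ord(ch) - ord("A") + 1) * n for ch, n in counts.items())
    return total_value / total_letters

# If all 26 letters were equally likely, the average value would be:
uniform_average = sum(range(1, 27)) / 26
print(uniform_average)  # 13.5

# Stand-in sample (an assumption): names mentioned in this post.
sample = ["Bob", "Abe", "Christopher", "Timothy", "Gwendolyn", "Jacquelyn"]
print(letter_counts(sample).most_common(3))
print(round(average_letter_value(sample), 2))
```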
The chart below shows the distribution of the number of letters in first names. The mean number of letters in a first name is 5.92, and the median is 6. (In the data set of 2,000 names from which this chart is derived, no name contained more than 11 letters.)
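A length distribution like this one can be tallied with a short Python sketch. The sample below is a stand-in made of names from this post (an assumption, not the 2,000-name data set), so its mean and median will differ from the values above. The function name `length_distribution` is hypothetical.

```python
from collections import Counter
from statistics import mean, median

def length_distribution(names):
    """Map each name length to the number of names with that length."""
    return Counter(len(name) for name in names)

# Stand-in sample (an assumption): names mentioned in this post.
sample = [
    "Bob", "Abe", "Christopher", "Chris", "Tim",
    "Gwen", "Jackie", "Timothy", "Gwendolyn", "Jacquelyn",
]

distribution = length_distribution(sample)
for length in sorted(distribution):
    # One '#' per name gives a quick text histogram.
    print(f"{length:2d} {'#' * distribution[length]}")

lengths = [len(name) for name in sample]
print("mean:", mean(lengths), "median:", median(lengths), "longest:", max(lengths))
```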
Do you know a name that has more than 11 letters or has a name sum greater than 139 or less than 8? Let me know in the comments.
1.
capnc | October 3, 2011 at 8:46 pm
Cab, as in Cab Calloway, comes in at 6. I’ve heard of “Babs” as a nickname for Barbara, but I don’t think “Bab” cuts it…
2.
venneblock | October 4, 2011 at 6:20 am
His full name was Cabell, but I’d still accept Cab. Nice one!
3.
.mau. | October 4, 2011 at 3:46 am
I know a longer name, but it’s in Italian: Massimiliano (there are also compounded names like Pierfrancesco, but maybe they are ruled out 🙂 )
4.
venneblock | October 4, 2011 at 6:22 am
I was certainly being geocentric when thinking of names, and I pulled my data set from the U.S. Social Security Administration. No reason to restrict foreign names, though. And compound names seem okay to me, too, since names like Brookelynn and Annemarie were in the data set I used.