I’ll close this series on averages with a quick look at the mode. Unlike the other “averages”, the mode doesn’t always exist, and when it does, it is not always unique. In fact, as we’ll see, sometimes we can’t be sure whether there is no mode, or many modes. How do we handle these odd cases?
No mode, multiple modes
Our first question is from a 7th grade teacher in 1996, who raises two main points, both arising from the sort of question we started this series with (why we need different kinds of average):
Mean, Median, and Mode in Real Life I was teaching about mean, median and mode the other day, and everything was going well. The students seemed to understand what we were doing, although it was not clear why we needed to know three different methods for finding central tendencies. The question came up about how to determine the mode if there was more than one number in a list that repeated. My suggestion was that there could be more than one mode, and an answer in the teacher's guide also indicated the same thing. The other question asked how to figure out the mode if no numbers in the list repeated. Again, my answer (from the guide) was that there was no mode. Was I correct? Please explain why. If I was incorrect, please explain the correct responses.
Brian is right on all counts, but wants a deeper understanding. Doctor Ceeks answered:
Identifying "the" mode is an attempt to answer the question: "Which occurs most frequently?" Sometimes there's a clear answer, other times there are a few answers, and sometimes everything comes up the same amount and there's no point in asking or answering that question. So your answers are fine. You could also say that all the items in a list where no item is repeated are modes, but there's not much point. Anyway, the answer to "what is the mode" in such a non-repeating list is a technical matter, and it's more important to understand that the question itself becomes rather inapplicable in such situations.
These are points we’ll be digging deeper into as we proceed. The last statement will, in fact, be our bottom line: It matters more that we understand when the question shouldn’t be asked than that we always answer it correctly!
He went on to deal with the general question, about the need for three “averages”, which I’ll skip.
What if everything occurs twice? Take your pick
Next, one from 2000:
Mode of a Uniform Distribution I am the father of a 6th-grade student who uses the text _Everyday Mathematics_ from the University of Chicago School of Mathematics Project. My question is: what is the mode of a set of numbers, if none of the numbers repeat themselves? Here is the data set: 338, 324, 270, 229, 209, 193, 170, 168, 154, 140 What about a set like: 1, 1, 2, 2, 3, 3 How do I explain this so that it makes sense?
Here we have two similar, yet quite different, examples. In the first, every value occurs once; in the second, every value occurs twice. Do they have the same answer, or opposite answers?
Doctor TWE answered, focusing, I think, on the second example:
Hi David - thanks for writing to Dr. Math. Your question is a good one. After consulting with several colleagues and college professors (including my wife, who teaches graduate-level statistics courses), I find that the consensus is that there is no consensus. The problem with the definition of mode is that it doesn't explicitly say what to do in the case of a uniform discrete distribution. The definition says that the mode is "the most frequently occurring value in a sequence of numbers." By a strict interpretation, this means that in a uniform distribution (a sequence in which all values occur with equal frequency) such as your series, all values are modes, since there is no value that occurs more often. However, some references say that if all elements in a data set have the same frequency, then the data is said to be of no mode.
I suspect that most non-mathematicians might see no problem here: there is no “most” when everything is the same, so there is no mode. But we mathematicians tend to think of “most” as meaning “nothing is greater”, rather than “everything else is less”. Here, nothing is more frequent than a given value, so each value is a mode; but on the other hand it makes sense to say that there is no peak, and so no mode!
But whenever I see that professionals have no consensus, I conclude that it probably doesn’t matter, because nothing important rides on it. Both points of view make sense, so at best we’ll have an arbitrary declaration.
My wife says that she would accept either "all values are modes" or "there is no mode" as an answer for that problem. The software package she uses in her classes, "Adventures in Statistics," only accepts the answer "there is no mode." Personally, I am a stickler for exactness in definitions and definition interpretation, so I would say any uniform discrete distribution is multimodal with all values being a mode. (But that's just my opinion.)
So we have three different answers here, from wife, software, and husband!
You didn’t know how different mathematicians are from statisticians, did you? The importance of definitions is more prominent, perhaps, in a mathematician who focuses on proof; a statistician may tend to be more involved with real data and practical understanding (though “mathematical statistics” can be very theoretical).
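To make the competing answers concrete, here is a minimal Python sketch. The function name `modes` and its `uniform` switch are illustrative inventions, not part of any textbook or of the software mentioned above; the switch simply selects which convention to apply when every value occurs equally often.

```python
from collections import Counter

def modes(data, uniform="all"):
    """Return the most frequent value(s) in data.

    `uniform` is a hypothetical switch for the disputed case in which
    every value occurs equally often:
      "all"  -> every value is reported as a mode
      "none" -> an empty list is returned ("there is no mode")
    """
    if uniform not in ("all", "none"):
        raise ValueError("uniform must be 'all' or 'none'")
    counts = Counter(data)
    if not counts:                      # empty data: nothing to report
        return []
    top = max(counts.values())          # highest frequency observed
    winners = [v for v, n in counts.items() if n == top]
    is_uniform = len(winners) == len(counts)   # everybody tied
    if is_uniform and uniform == "none":
        return []
    return winners

data = [1, 1, 2, 2, 3, 3]
print(modes(data, uniform="all"))    # [1, 2, 3]
print(modes(data, uniform="none"))   # []
print(modes([1, 1, 2, 3]))           # [1]
```

The two conventions differ only in the uniform case; for any data set with a genuine peak they return the same list.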
A case on the edge
But what if almost all values are the same?
My wife pointed out another interesting situation. Consider the set produced by taking the absolute value of all of the integers. In this set, the values 1, 2, 3, ... occur twice each, but the value 0 occurs only once. Therefore, it is not a uniform distribution. Does this make it multimodal with all values except 0 being modes? If so, does removal of the 0 cause it to have no mode?
Properly speaking, he is not talking about a set, but about a “sequence”, as he stated before, or a “multiset”, in which the same element can occur more than once, as is typical of “data sets”. The set of integers contains, say, both 2 and -2, so the “set” of absolute values contains 2 twice – but it contains 0 only once. This example shows how close “no mode” and “many modes” can be, and therefore how unstable our answer is.
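The instability can be seen in a small Python sketch. Since we can’t list all the integers, this assumes a finite window from -n to n (my simplification, with n = 5 chosen arbitrarily); the helper names are illustrative.

```python
from collections import Counter

def tied_for_most(data):
    """Sorted list of the values that share the highest frequency."""
    counts = Counter(data)
    top = max(counts.values())
    return sorted(v for v, n in counts.items() if n == top)

def is_uniform(data):
    """True when every distinct value occurs the same number of times."""
    return len(set(Counter(data).values())) == 1

n = 5  # assumed window size; the original example uses all integers
abs_values = [abs(k) for k in range(-n, n + 1)]    # 0 once, 1..5 twice each
without_zero = [v for v in abs_values if v != 0]   # 1..5 twice each

print(tied_for_most(abs_values), is_uniform(abs_values))      # [1, 2, 3, 4, 5] False
print(tied_for_most(without_zero), is_uniform(without_zero))  # [1, 2, 3, 4, 5] True
```

The list of most frequent values is identical in both cases; only the uniformity flag changes, and with it the answer under a “uniform means no mode” convention.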
As to explaining it to a sixth-grader, I would stick with a simple explanation. Ask either "what value occurs more than the others?" (the answer would be "none"), or "which value or values occur the most often?" (the answer would be "all of them"). You might then ask "which answer - none or all - is more useful?" This is an opportunity to get him or her thinking more deeply about math as a tool for understanding other things.
This is the bottom line: What words we use are far less important than understanding, and definitions are not set in stone, but are intended to fill a need. Which answer is more useful? If the group doesn’t agree, then maybe it really doesn’t matter!
Is the book wrong?
Next, a very similar question from 2002:
More Than One Mode? I teach one section of statistics to advanced math students and we came upon an answer we did not agree with. We would like your help in determining if the answer is a misprint. And if it is not a misprint, we would like an explanation for the answer! By definition, a statistical mode is the value(s) that occur most frequently in the data set. What is the mode of the data set: 13, 13, 14, 14, 15, 15? We say 'none' because no number appears more frequently than any other number. The answer key said 13, 14 and 15. What do you say??
Multimodal data
Doctor Achilles answered with an anecdote, focusing on the general idea of multiple modes:
Hi Robin, Thanks for writing to Dr. Math. The other day, I was running a race against two other people. The other two people finished ahead of me, but at EXACTLY the same time. So they BOTH got first place and I got third place (there was no second place). By definition, the winner of a race is (are) the runner(s) who finish with the fastest time. If two runners tie for first, they both win. If three, four, or more runners all get the EXACT same time, then they all win. So the way to decide the winner of the race is to give each runner a time, like this: Runner A: 1min23sec Runner B: 1min25sec Runner C: 1min23sec The winner(s) is (are) the runner(s) who has (have) the fastest time. To figure that out, we find the fastest time: 1min23sec So EVERY runner with a time of 1min23sec is a winner.
In particular, corresponding to Robin’s question, if everyone had the same time, then everyone would be a winner!
The times in the race parallel frequencies in a data set:
In the data set: 1, 1, 1, 2, 2, 2, 3 1 and 2 both have the most occurrences, so they are both modes. Since they tie for first place, they both get it. 3 is not a mode because it occurs less frequently. So the way to find a mode is to give each number a score based on how many times it occurs. 1: occurs three times 2: occurs three times 3: occurs one time The mode(s) is (are) the value(s) that occur(s) the most. The most any value appears is: three times So every number that occurs three times is a mode.
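The scoring recipe just described translates almost line for line into Python. This is only a sketch: `modes_by_score` is an illustrative name, and it assumes the data set is not empty. Python’s standard library function `statistics.multimode` (available since Python 3.8) follows the same “everyone tied for first wins” rule, so it is shown for comparison.

```python
from collections import Counter
from statistics import multimode

def modes_by_score(data):
    """Score each value by its count, then keep everyone tied for first.

    Assumes data is non-empty.
    """
    scores = Counter(data)              # value -> number of occurrences
    best = max(scores.values())         # the most any value appears
    modes = [v for v, n in scores.items() if n == best]
    return scores, best, modes

scores, best, modes = modes_by_score([1, 1, 1, 2, 2, 2, 3])
print(dict(scores))                          # {1: 3, 2: 3, 3: 1}
print(best, modes)                           # 3 [1, 2]
print(multimode([1, 1, 1, 2, 2, 2, 3]))      # [1, 2]
```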
But what if everything occurs twice?
Robin replied, wanting a specific answer to the question:
Thanks for the quick reply to my question. I totally agree with your comments and completely understand them. But you didn't really address my question. If your data set is: 13, 13, 14, 14, 15, 15, is there a mode, and what is it?
Doctor Achilles responded, extending the analogy to the case where everyone is a winner:
Hi Robin, Thanks for writing back to Dr. Math. In that data set, the modes are: 13, 14, and 15. In the data set: 1, 13, 13, 14, 14, 15, 15 The modes are also: 13, 14, and 15. You can have multiple modes in a data set. Hope this helps. If you'd like to talk about this some more, please write back.
This probably didn’t quite satisfy Robin, as it didn’t deal with their argument for “none”: “no number appears more frequently than any other number”. When everyone is a winner, is there really any winner at all? Why not just say no one won? I might have referred to Doctor TWE’s 2000 answer, explaining that the answer really doesn’t matter. But Doctor Sarah added a response that provided specific answers:
Hi Robin - thanks for writing to Dr. Math. When no number occurs more than once in a data set, there is no mode. If each of two numbers occurs twice, we say the set is bimodal. Your set is trimodal. See: Statistics - edHelper.com http://www.edhelper.com/statistics.htm Mean, Median, and Mode Discussion - Shodor Education Foundation http://www.shodor.org/interactivate/discussions/sd1.html
Both links, a little surprisingly, are still active. The first says this:
The mode of a set of data values is the number in the set that appears most frequently. For example, the number 5 appears three times in 1, 2, 5, 5, 5, 8, 8, 9. Since the number 5 appears the most times, it is the mode. A set of numbers can have more than one mode, as long as the number appears more than once. In the data set 1, 2, 2, 3, 3, 4, 5. The mode is 2 and 3. We also can say that this data set is bimodal.
If no number appears more than once, then the data set has no mode.
So, just following “official” definitions, the multimodal answer seems right. “No mode” means no repetition. Robin was satisfied:
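The rule quoted from that page (a mode must appear more than once; otherwise there is no mode) can be sketched as follows. The function name and the labels are illustrative; “bimodal” and “trimodal” come from the answers above, while “unimodal” and the catch-all “multimodal” for four or more are labels I have added.

```python
from collections import Counter

# Labels for the number of modes; anything beyond three falls back
# to "multimodal" (an assumption, not stated in the quoted sources).
NAMES = {1: "unimodal", 2: "bimodal", 3: "trimodal"}

def describe(data):
    """Apply the 'no repetition means no mode' rule to non-empty data."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:                        # nothing repeats
        return "no mode", []
    found = sorted(v for v, n in counts.items() if n == top)
    return NAMES.get(len(found), "multimodal"), found

print(describe([1, 2, 5, 5, 5, 8, 8, 9]))     # ('unimodal', [5])
print(describe([1, 2, 2, 3, 3, 4, 5]))        # ('bimodal', [2, 3])
print(describe([13, 13, 14, 14, 15, 15]))     # ('trimodal', [13, 14, 15])
print(describe([2, 67, 39, 20, 15, 56]))      # ('no mode', [])
```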
Thank you for your answer! What a neat resource to have access to. Now that I have found it - I'm sure my students will take advantage of it and try to come up with interesting questions to try to stump you all.
Ummm … that’s not really our purpose! But we do like interesting questions!
Why it doesn’t really matter
Here’s a slightly different question from 2007, focusing on the “no mode” case:
Is There a Mode when All Data Points Occur Equally? If you have a set of data such as 2, 67, 39, 20, 15, and 56, what would the "mode" be?
Here, nothing repeats; so, as we know, there is no mode.
Doctor Rick answered, first referring to that last answer from Doctor Achilles:
Hi, TJ. Take a look at this answer in the Dr. Math Archive: More Than One Mode? http://mathforum.org/library/drmath/view/61375.html At the end you'll see a reference stating that a set in which no element occurs more than once has no mode. You could also argue that, in any set in which all the elements occur the same number of times (including once), every element is a mode. It really doesn't matter.
So although the usual answer is “no mode”, you could just as well say “everyone’s a winner”. We just don’t. But why doesn’t it matter, then?
In real life, we wouldn't bother calculating statistical measures for such a small set; those measures are meant to distill a *large* set of numbers into a small set of numbers (mean, median, mode(s), standard deviation, ...) that characterize the distribution. We study small sets like your example in order to see up close how statistical measures work, but we shouldn't focus on these toy examples too much, or on the issues that arise in dealing with them. The larger the set, the less likely it is that every number will appear exactly the same number of times. If the number of occurrences is *nearly* the same for all numbers, then the mode (or modes) is meaningless; what is significant is that the distribution of the data is nearly flat.
So, ultimately, the reason it doesn’t matter whether we say “no mode” or “all modes” is that this won’t happen in real life, except in cases where the apparent mode (a barely-more-common value) represents nothing more than a tiny fluctuation in frequency, or a random measurement error. Mode is most important in frequency distributions:
If each number appears only once, it suggests that you'd get more interesting results by binning the numbers to increase the number of occurrences in each bin. (For instance, total the counts for numbers 1 to 10 in one bin, the counts for 11 to 20 in the next bin, etc.) You may well find that a mode (or modes) will become evident when you do this. Thus there is no point in worrying about what to call the mode when each number appears only once; in practice you won't continue to work with the data in that form.
Histograms, based on binned (grouped) data, are better at showing significant patterns. This is a point that is probably missed most of the time when modes of data sets are discussed: The mode of the distribution might be reflected not in identical values, but in many values being close to one another, so that they would fall into the same bin. For example, here is a set of data that have a single mode that is rather meaningless:
Here I have binned them as described, and we see a much more meaningful modal class:
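The binning step can also be sketched in Python. The data below are made up for illustration (they are not the data shown in the figures), and the sketch assumes positive whole numbers grouped into bins 1 to 10, 11 to 20, and so on, as Doctor Rick suggests.

```python
from collections import Counter

def bin_label(x, width=10):
    """Return the (low, high) bin containing the positive integer x."""
    low = ((x - 1) // width) * width + 1
    return (low, low + width - 1)

# Hypothetical data: only 47 repeats, but most values cluster in the 20s.
data = [8, 15, 21, 22, 24, 25, 26, 27, 29, 33, 38, 47, 47, 56]

raw = Counter(data)
raw_mode = max(raw, key=raw.get)            # the only repeated value
binned = Counter(bin_label(x) for x in data)
modal_class = max(binned, key=binned.get)   # the fullest bin

print(raw_mode)      # 47
print(modal_class)   # (21, 30)

# A rough text histogram of the binned counts.
for (low, high), n in sorted(binned.items()):
    print(f"{low:>2}-{high:<2} {'*' * n}")
```

The raw mode, 47, says almost nothing about where the data lie, whereas the modal class 21 to 30 holds half of the values.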
Mode(s) of a distribution
Finally, here’s one more question, from 2000, about such binned data, as reflected in a frequency distribution or histogram:
Local Modes and Bimodal Distributions What does it mean if a frequency distribution shows two peaks? I know that if a distribution is sharp and peaked, then the standard deviation is small. So does that mean that the values tend to increase then decrease and increase again? Would this be the answer?
Doctor TWE answered:
Hi Laura - thanks for writing. In general, yes, it would indicate that the values tend to increase then decrease and increase again. We call the smaller of the peaks a "local mode." If the two peaks are the same height we say that the distribution is "bimodal" or "multimodal." But you have to be careful that you're not over-scrutinizing the data. A small increase in a generally decreasing pattern may just be a glitch, an aberration of how the data were sampled, and may not be significant. Here are some examples:
[Diagram: Single Modal Distribution]
[Diagram: Single Modal Distribution with Local Mode]
[Diagram: Bimodal Distribution]
[Diagram: Single Modal Distribution with a "glitch" (probably just a glitch)]
Clearly some of these distinctions will be subjective: How close to the same frequency do two peaks have to be before we stop calling one a mere local mode, and say that it is bimodal? How small does a local mode have to be before you decide it is a “glitch”, and has no meaning? In keeping with the theme of this post, the important thing is not what you call it, but its significance to the phenomenon you are studying, which requires deeper study.
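As a rough illustration of the terminology, here is a Python sketch that finds the peaks in a list of bin counts and labels the tallest as modes and the rest as local modes. The counts are hypothetical, the function names are my own, and the sketch deliberately makes no attempt to decide what is a “glitch”, since that is exactly the subjective judgment described above. It also ignores flat-topped peaks, which is a simplification.

```python
def peaks(counts):
    """Indices of bins strictly higher than both neighbours.

    Bins beyond either end are treated as having count 0.
    Plateaus (equal adjacent bins) are not reported as peaks.
    """
    padded = [0] + list(counts) + [0]
    return [i for i in range(len(counts))
            if padded[i] < padded[i + 1] > padded[i + 2]]

def classify(counts):
    """Label each peak 'mode' (tallest) or 'local mode' (shorter)."""
    found = peaks(counts)
    if not found:                       # e.g. a perfectly flat histogram
        return {}
    top = max(counts[i] for i in found)
    return {i: ("mode" if counts[i] == top else "local mode")
            for i in found}

# Hypothetical bin counts: one tall peak and one smaller bump.
with_local = [1, 2, 4, 7, 10, 7, 4, 5, 6, 4, 2, 1]
# Hypothetical bin counts: two peaks of equal height.
bimodal = [1, 3, 6, 9, 6, 3, 6, 9, 6, 3, 1]

print(classify(with_local))   # {4: 'mode', 8: 'local mode'}
print(classify(bimodal))      # {3: 'mode', 7: 'mode'}
```

Requiring exactly equal heights for “bimodal” is the strictest possible reading; with real data one would usually allow some tolerance, and choosing it is part of the judgment call.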