(An archive problem of the week)
A couple of weeks ago, in discussing the value of estimates, I included one example of a (very simple) Fermi problem: one in which it is necessary to invent the data as well as the method of solution. Today, I will examine one answer in which we dug deeper into a much more elaborate question, and see what it says about estimation.
The problem
Here is the initial question, from Kenneth in 2002:
Estimation and Fermi Questions

I'm currently learning about Estimation techniques similar to those used by the famous scientist Enrico Fermi, who proposed the question, "How many piano tuners are there in Chicago?" The question I invented was the following: If there are 12,000 students who attend a certain college, how many professors are employed by the college?
For the benefit of future readers, I gave several references, of which the only one that is still accessible is this nice introduction to Fermi Problems, by Austin Gleeson via Eric Smith, including the classic Piano Tuner Problem. (Today, of course, you can also just look it up in Wikipedia.) Quoting from that page,
Fermi was uncontestably one of the most important research physicists of this century, and a great many of the working tools of the modern physicist were invented by him. He was also for many years a professor at the University of Chicago, who had a reputation for asking his students outrageous and seemingly impossible questions, and then showing them that they had the necessary knowledge and tools to answer them, which is why this kind of problem came to be named Fermi problems.
The point of Fermi problems is in part to show how much use can be made of commonly available knowledge by the person willing to be resourceful and make approximate simple calculations, but it is more to illustrate the difference between estimation and guessing. The fabled canonical Fermi problem was the question “how many piano tuners are there in Chicago?”. Faced with such a question without warning in a physics lecture hall, one response would simply be to declare that you don’t know and cannot know, and if forced to produce an answer to simply guess at one that is “plausible”. The primary disadvantage of this guessing is not that it yields imprecise answers, because any question answered with incomplete or not-well measured information necessarily yields imprecise answers. And in fact, even guesses about common experiences are often plausible, precisely because they are constrained by an intuitive touch with reality. One would not even guess that the number of piano tuners in Chicago is something comparable to the number of people living in Chicago, because that would violate the experience that most people one meets are not piano tuners.
The problem with guessing is that one does not know how much confidence to place in the answer, because the constraints from which it follows have not been clearly identified, and so even if estimations have been made at an intuitive level, the degree of imprecision in the estimations has no way to be identified.
Kenneth went on to show his own work on the question, for which he guessed both what information would be useful to solve it, and what values would be reasonable:
I came up with the following estimation. Can you tell me if my reasoning is reasonable? Thanks very much!

1) The average professor teaches about 1 hour a day, so in one week (Monday - Friday) he teaches 5 hours.

2) Each class takes about an hour, so in one week he teaches 5 classes.

3) But he doesn't teach 5 different classes in one week; the same classes are held 2-3 times a week (either Monday-Wednesday-Friday classes or Tuesday-Thursday classes). To make it simpler, let's say a professor teaches each class 2 times a week (assume only Tuesday-Thursday classes exist). Therefore, he sees the same class 2 times a week, meaning every half week he sees the same students, but that also means every week he only sees the same students because the classes repeat.

4) If a professor teaches for 2.5 hours per half week (5 hours per week), where each class takes about an hour, and assuming that a typical class consists of 50 students, then he sees 2.5 classes x 50 students = 125 students per week.

5) Since there are 12,000 students for all professors to lecture, then there are probably 12,000 / 125 = close to 100 professors on campus.

From a scale of 1-10, how would you rate this estimation? Any suggestions or comments are greatly appreciated.
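To make the chain of assumptions easy to inspect, here is a minimal Python sketch of Kenneth's arithmetic exactly as he states it. The constant and function names are my own labels, not anything from the original exchange, and the sketch deliberately reproduces his reasoning as written without correcting it.

```python
# Kenneth's assumptions, taken directly from his five steps.
TOTAL_STUDENTS = 12_000
TEACHING_HOURS_PER_WEEK = 5      # 1 hour a day, Monday through Friday
MEETINGS_PER_CLASS_PER_WEEK = 2  # simplified to Tuesday-Thursday classes only
STUDENTS_PER_CLASS = 50


def kenneth_estimate() -> float:
    """Reproduce Kenneth's chain: distinct classes, students seen, professors."""
    # 5 one-hour meetings a week, each class meeting twice: 2.5 distinct classes.
    classes_per_professor = TEACHING_HOURS_PER_WEEK / MEETINGS_PER_CLASS_PER_WEEK
    # 2.5 classes of 50 students each: 125 students per professor.
    students_per_professor = classes_per_professor * STUDENTS_PER_CLASS
    return TOTAL_STUDENTS / students_per_professor


print(kenneth_estimate())  # 96.0, which Kenneth rounds to "close to 100"
```

Writing each assumption as a named constant makes it clear which inputs the result depends on, and which factors never appear at all.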
There are, of course, two ways to judge his work: the logic, and the data. At the time, I focused only on the former, which is appropriate in our context; today, having become an adjunct professor at a community college two years after this question, I have to laugh at the numbers he used. Of course, colleges differ, and he is imagining one at the other end of the spectrum from mine, probably a major university. Our full-time professors typically teach at least 5 courses (3-4 hours each), not just 5 hours, per week, because our emphasis is on teaching rather than research; we have around 12,000 students (depending on how you count them), with 295 full-time and 710 part-time instructors according to the website. Looking at the other end of the scale, I find that Yale University, near where I grew up, also has about 12,000 students, with 4,410 faculty members (not differentiated as full- or part-time, but including many who don’t teach). Clearly Kenneth has underestimated; we’ll see at the end if I got it any better!
Looking only at the reasoning, I immediately saw a gap:
It seems to me that you have left out one important factor: each student attends more than one class. This can be tricky to describe clearly: in your model each class has 50 students, and each student has, say, 5 classes. So is it 50 students per class, or 5 classes per student?

One way to avoid this trouble is to think of a specific name for the relation of a student to a class. One I've thought of is "seat." Each class has 50 seats (students in the class); each student has 5 seats (in different classes). You can diagram this:

    Student ------> Seat <------ Class <------ Prof
       1             5
                     50            1
                                  2.5          1
I’ll explain the source of this diagram later; I think the idea of “seats” is standard in describing a college, as “bed” is in a hospital. (The terms aren’t meant to dehumanize, but to make it possible to talk about such ideas!) But as I’ve said elsewhere, a key to problem-solving is to find a good representation so that you can see relationships, and this is it.
Kenneth asked for clarification:
Each student has 5 seats. There are 50 seats in one class. But does that mean each class holds only 10 students? I'm not sure I understand this concept. And I thought that the total number of students (12,000) attending the college would matter in the estimation. Can you explain this further?
To which I replied, showing the actual work:
Maybe you can get a better understanding if you try to tell me what role the number of classes each student takes should play in your estimate. It takes a bit of wrestling on your own before you can quite pin this idea down.

Obviously I didn't say that each class has only 10 students; but if each student took only one class of ten students, you would need the same number of professors, so in a sense it is equivalent to that. Or, instead of 12,000 students taking 5 courses each, you could have 60,000 students taking one course each; that would require the same number of professors, which is five times as many as you estimated. Does that help?

And I didn't say the number of students doesn't matter; my diagram (a variety of Entity Relationship Diagram) only shows relative numbers, not absolute numbers. It says that for each student there will be 5 seats (in different classes), each of which is 1/50 of a class, each of which needs 1/2.5 of a professor, so 12,000 students will need

                       5 seats    1 class     1 professor
    12,000 students * --------- * -------- * ----------- = 480 professors
                      1 student   50 seats   2.5 classes

There's a lot of useful math and logic in this question!
So there’s my answer (based on his assumptions). It’s actually not too far off the number for my own school; if I change the assumptions to 25 students per class and 5 classes per professor, according to my own experience, we get the very same result!
                       5 seats    1 class    1 professor
    12,000 students * --------- * -------- * ----------- = 480 professors
                      1 student   25 seats    5 classes
That’s a bit low if I consider an adjunct equivalent to a little less than half a full-timer, so that we have the equivalent of 650 faculty. But it’s not bad for a rough estimate.
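The unit-cancelling chain used in both calculations can be sketched as a small Python function. The name `professors_needed` and its parameter names are illustrative choices of mine, and the 0.5 weight for an adjunct is an assumed round figure standing in for "a little less than half," so the full-time-equivalent total it produces is an upper bound.

```python
def professors_needed(students, seats_per_student, seats_per_class, classes_per_professor):
    """Chain the ratios: students -> seats -> classes -> professors."""
    seats = students * seats_per_student      # one seat per student per class taken
    classes = seats / seats_per_class         # seats grouped into classes
    return classes / classes_per_professor    # classes shared out among professors


# Kenneth's assumptions: 5 classes per student, 50 seats per class, 2.5 classes each.
print(professors_needed(12_000, 5, 50, 2.5))  # 480.0

# Community-college assumptions: 25 seats per class, 5 classes per professor.
print(professors_needed(12_000, 5, 25, 5))    # 480.0

# Full-time equivalents from the website figures, with an ASSUMED adjunct weight of 0.5.
ADJUNCT_WEIGHT = 0.5
print(295 + ADJUNCT_WEIGHT * 710)             # 650.0
```

The two sets of assumptions agree because halving the class size doubles the number of classes, while doubling the teaching load halves the number of professors needed to cover them.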
On the other hand, the estimate is extremely low for Yale. I suspect that if we drop non-teaching faculty (listed as “research”) and count part-time faculty appropriately, the number may be more like 2000 rather than 4000, but it’s still a lot more than 480. To increase the number, we’d need smaller classes and fewer classes per professor; keeping 50 students per class but dropping the load to 1 class per professor, we get 1200, which is still low. I imagine some are counted as teaching faculty even though they only work with a couple of individual students. But that’s not the main point of our discussion!
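To see how sensitive the estimate is to these two assumptions, one can tabulate it over a small grid. This is only a sketch: the function name is mine, and the class sizes of 25 and 15 are hypothetical values chosen for illustration, not data about Yale or any other school.

```python
def professors_needed(students, seats_per_student, seats_per_class, classes_per_professor):
    """Students -> seats -> classes -> professors, as in the ratio chain."""
    return students * seats_per_student / seats_per_class / classes_per_professor


STUDENTS = 12_000
SEATS_PER_STUDENT = 5

# Hypothetical grid: class sizes and teaching loads are illustrative assumptions.
for seats_per_class in (50, 25, 15):
    for load in (2.5, 1):
        estimate = professors_needed(STUDENTS, SEATS_PER_STUDENT, seats_per_class, load)
        print(f"{seats_per_class:>2} per class, {load} classes each: {estimate:,.0f} professors")

# The case described in the text: 50 per class, 1 class per professor.
print(professors_needed(STUDENTS, SEATS_PER_STUDENT, 50, 1))  # 1200.0
```

Because the estimate is inversely proportional to both class size and teaching load, shrinking either one by some factor raises the professor count by that same factor.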
Kenneth had a last question: How does one learn to think this way?
Thanks very much, I understand your solution. One last general question on how I can sort of think the way you do, so to speak. What you said about how I missed a link between the students and the class, where we're not sure if each student has 5 classes or if each class has 50 students, makes me ponder about how you realized and picked that up. Have you worked with problems such as these repeatedly, so you know all the nitty gritty details? Or is it because I'm illogical or have less ability to reason that I totally neglected that fact. If so, is there a particular math course I can take to help open up my mind to incorporate more reason and logic into my mathematical thinking? I feel incompetent when I see other students being able to answer such questions while I'm struggling. Can you give some advice?
I had a little special knowledge, but mostly just experience:
An interesting question! I suspect a lot comes from experience - and that experience is probably what Fermi was trying to develop with his questions. Logical ability is not just something you are born with (though it may be that some of us naturally gravitate toward it, and therefore develop the skills without having to be forced into it); I think everyone has to develop it by practice.

In this case, as with the piano tuner problem, a lot of the thinking needed depends on specific knowledge of the subject matter. You have to picture a university and know that there are classes and professors and so on, or picture a piano tuner's job and see that he will do more than one piano a day, and they will be in different homes, and so on. In this case, I just thought about what factors would play a role, saw that each student would be in several classes, and expected to see that somewhere in your analysis. When I didn't, a red flag went up. No logic, just visualization. But I probably would have noticed it anyway, by going through your presentation in order and falling off the end when you didn't mention what each student does. Maybe you can call that "follow-through" - you can't stop your logic just because you've made contact with the goal, but have to keep thinking until you can't think any further. You did fine until then; you just didn't take it all the way.

Assuming that "problem-domain" knowledge, you have to be able to think through the problem, both in a straight line (each professor teaches N classes; in each class there are N students; ...), and also sometimes coming at it from all sides, just brainstorming to think of all the relevant factors. The former requires the ability to stay focused and think in an orderly way; the latter requires defocusing and letting wild ideas come in. Both have their place, and some of us are probably better at one than the other.
In this case, to find my own answer, I just went through it sequentially, starting at the student (since I knew the hard part was at that end). Now I happen to have an advantage over you in doing a Fermi problem, one that I haven't seen mentioned in discussions of them. I am a computer programmer, and part of my work involves designing relational databases, where you might have one table listing all the students, another listing all the classes, and so on. That's where my Entity Relationship Diagram came from - it's a tool used to see how these tables relate to one another, and design additional tables that capture the information needed for these relations. While I was thinking about how to explain the "5 classes per student and 50 students per class" problem, that method popped into my mind - I suppose that is an example of the non-linear type of thinking, pulling a tool out of my toolbox because the kind of thinking I was doing reminded me of it, not because it was the next thing to think of. That's not essential for this kind of problem, but it can always be handy.

Speaking of databases, if I had actually been trying to design one, I would have based my thinking on the paperwork involved in a university. Each of the 12,000 students has a course schedule, listing his or her 5 subjects. Each of those 60,000 subject lines corresponds to one seat in one class; 50 of them together form one of 1200 classes. Each professor has a schedule listing 2 or 3 courses, among which the 1200 classes will be found. In the database, you would add a table containing the information from all the students' schedules, which relates students to classes. (That's why it's called a "relational" database.) This image of actual pieces of paper can often make the abstract ideas of numerical ratios more concrete, and help in logical thinking.

Where can you build these skills? Many math classes will incorporate them implicitly, in geometric proofs or word problems, for example. Other fields need it too, from Fermi's physics to law school. There are all sorts of puzzles you can find in books or elsewhere.
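The database picture described in the reply can be made concrete with a small sketch using Python's built-in `sqlite3` module. Everything about the schema here is an assumption for illustration: the table names, the way students are grouped into classes, and the alternating 2-and-3 teaching loads are my own choices, made only so that the counts match the numbers in the reply.

```python
import sqlite3

STUDENTS = 12_000
COURSES_PER_STUDENT = 5
SEATS_PER_CLASS = 50

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE professors (id INTEGER PRIMARY KEY);
    CREATE TABLE classes (id INTEGER PRIMARY KEY,
                          professor_id INTEGER REFERENCES professors(id));
    CREATE TABLE students (id INTEGER PRIMARY KEY);
    -- One row per line on a student's schedule: the "seat" relating student to class.
    CREATE TABLE seats (student_id INTEGER REFERENCES students(id),
                        class_id INTEGER REFERENCES classes(id));
""")

# Assumed loads: professors alternate between 2 and 3 classes (2.5 on average),
# so each run of 5 consecutive class ids is shared by two professors.
n_classes = STUDENTS * COURSES_PER_STUDENT // SEATS_PER_CLASS
class_rows = [(c, (c // 5) * 2 + (0 if c % 5 < 2 else 1)) for c in range(n_classes)]
professor_ids = sorted({p for _, p in class_rows})

# Assumed grouping: in each of 5 schedule slots, students are split into blocks of 50,
# so every student gets 5 different classes and every class gets exactly 50 seats.
blocks = STUDENTS // SEATS_PER_CLASS
seat_rows = [(s, slot * blocks + s // SEATS_PER_CLASS)
             for s in range(STUDENTS) for slot in range(COURSES_PER_STUDENT)]

conn.executemany("INSERT INTO professors (id) VALUES (?)", ((p,) for p in professor_ids))
conn.executemany("INSERT INTO classes (id, professor_id) VALUES (?, ?)", class_rows)
conn.executemany("INSERT INTO students (id) VALUES (?)", ((s,) for s in range(STUDENTS)))
conn.executemany("INSERT INTO seats (student_id, class_id) VALUES (?, ?)", seat_rows)

counts = {table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
          for table in ("students", "seats", "classes", "professors")}
print(counts)  # {'students': 12000, 'seats': 60000, 'classes': 1200, 'professors': 480}
```

The `seats` table plays the role of the table built from all the students' schedules: it is the only place where students and classes meet, which is why counting its rows gives the 60,000 that the original estimate skipped over.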
The great thing about Fermi problems is that they are challenging puzzles that make you bring together all sorts of ideas. We have discussed a number of such questions over the years, not always using the term “Fermi problem”. Here are a few pages of interest:
Estimation in 3rd Grade (Doctor Tom, 1996, listing Fermi problems)
Estimating Seating Capacity (Doctor Jesse, 2000)
Billions Taste the Rainbow Thousands of Billions of Billions of Times (Doctor Ian, 2014)