Order of Operations: Why These Rules?

Last time we looked at some questions about why we need rules for Order of Operations at all, with some hints in the answers as to why the rules we use make sense. This time I want to survey some deeper explanations.

Why not just left to right?

First, an overview of reasons for the rules. This comes from 2008:

Why Do We Use Order of Operations?

Why is it necessary to use order of operations?  Why can't you just write a calculation left to right? 

It seems redundant to follow all those rules, when every equation can be re-written to work left-to-right.  The only useful part of the order of operations in my opinion is putting brackets first, but the rest seems useless.

For example:

5 + 2 ^ 3 x 2
= 5 + 8 x 2
= 5 + 16
= 21

Using the order of operations just makes this confusing, as I have to calculate all over the place.  Left-to-right would be so much easier, and could be written with the same numbers, just rearranged, like so:

2 ^ 3 x 2 + 5
= 8 x 2 + 5
= 16 + 5
= 21

This also makes it neater because all the calculations would be solved left-to-right, without any hassle.  Wouldn't that be much easier?

It’s not quite so simple.

I replied:

No, not really.

You're right that any expression can be written regardless of what rules we follow--but only if you use parentheses pretty heavily.  For example, what we write as

  2*3 + 4*5 = 26

would have to be

  2*3 + (4*5)

There is no way to write this using left-to-right evaluation without parentheses.

Martin’s example was one that could be rearranged to give the desired result without parentheses, because it didn’t involve any operations between two intermediate results (like my two multiplications). My example, on the other hand, is quite common, and PEMDAS makes it simpler.
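To see the difference concretely, here is a minimal sketch of a strict left-to-right evaluator in Python. The function name `eval_left_to_right` is illustrative and not part of the original exchange. It assumes well-formed input made of non-negative integers, the operators `+ - * ^` (with `*` standing in for Martin's "x"), and parentheses.

```python
import re


def eval_left_to_right(expr: str) -> int:
    """Evaluate +, -, *, ^ strictly left to right; only parentheses group."""
    tokens = re.findall(r"\d+|[-+*^()]", expr)
    pos = 0

    def operand() -> int:
        # An operand is either a number or a parenthesized sub-expression.
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = chain()
            pos += 1  # skip the closing ")"
            return value
        return int(tok)

    def chain() -> int:
        # Apply each operator to the running value as soon as it appears.
        nonlocal pos
        value = operand()
        while pos < len(tokens) and tokens[pos] != ")":
            op = tokens[pos]
            pos += 1
            rhs = operand()
            if op == "+":
                value += rhs
            elif op == "-":
                value -= rhs
            elif op == "*":
                value *= rhs
            else:  # "^"
                value **= rhs
        return value

    return chain()


print(eval_left_to_right("2 ^ 3 * 2 + 5"))  # Martin's rearranged form: 21
print(eval_left_to_right("2*3 + 4*5"))      # no parentheses: 50, not 26
print(eval_left_to_right("2*3 + (4*5)"))    # parentheses restore 26
```

Martin's rearranged expression comes out right, but the sum of two products needs the parentheses, just as the reply says.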

Making polynomials easy to write

Any “grammar” of algebra will need parentheses for some expressions; more important is the question of what sort of expression we most want to keep simple. Within algebra, polynomials loom large:

One big advantage of the real rules is that they let us write polynomials, which are very important in algebra, without parentheses:

  2x^2 - 3x + 5

in your form would have to be

  2(x^2) - (3x) + 5

since without the parentheses you would first multiply 2x, then square that, then subtract 3, then multiply by x, then add 5, like this:

  ((2x)^2 - 3)x + 5

The normal rules were probably invented largely to make it easy and natural to write polynomials.

To evaluate this polynomial, we have to do the squaring first, then multiply the result by 2; in pure left-to-right grammar, we could write that as $x^2\cdot 2$. But then we have to multiply x by 3 before subtracting, so there is no way to get around putting that in parentheses: $x^2\cdot 2 - (3\cdot x) + 5$. So we need at least one pair of parentheses. Well, on second thought …

[As an aside, to be honest, there actually IS a way to write polynomials without parentheses in your form:

  2x-3x+5

This is equivalent to what we would actually write as

  (2x - 3)x + 5 = 2x^2 - 3x + 5

This is a very efficient way to evaluate polynomials, commonly used on computers and closely related to synthetic division.  However, this notation hides such important things as the degree of the polynomial and the degree of each term, which makes it awkward for actual use, especially when there are missing terms.]

In this form, $2\cdot x\ -\ 3\cdot x + 5$, we would multiply 2 times x, then subtract 3 from the result, then multiply that by x, then add 5. This method of evaluation (very efficient because you never have to raise a number to a power, and you use the fewest multiplications possible) is sometimes called synthetic substitution, or Horner’s method (not to be confused with other meanings of the term). But as good as it is for evaluation, it is not good at all for displaying what the polynomial really is:

As a side-effect of this fact, our notation makes it possible for us to talk about "terms" and "coefficients" in an expression.  In either of the alternative forms above for the polynomial 2x^2 - 3x + 5, there is no such thing as terms; they are either obscured by required parentheses, or not present at all.  The very idea of terms and coefficients is built on the idea that a polynomial is a SUM of PRODUCTS, in which multiplication is done before addition.  A term is a product of coefficients and variables, like 2x^2; these are all first calculated and then added together.  Our rules are designed to make this easy to write.

This has gradually become my usual brief explanation of the rules: Everything is to be thought of as basically a sum of products of powers.

I imagine if the left-to-right rule had been used from the beginning, we'd have different ideas of what it means to simplify, and of what kinds of expressions are most interesting to work with; but the notation that was developed in the 1500's was what they needed in order to talk about what they already were interested in even before there were nice symbols (namely polynomials), and it's worked well ever since.

It is worth noting that algebra was done before modern symbols were invented, and trying to say everything in words was a mess!
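The aside above about Horner's method can be made concrete with a short Python sketch. The function names `horner` and `term_by_term` are illustrative, and the coefficient list is assumed to be non-empty and ordered from the highest power down, so that `[2, -3, 5]` stands for 2x^2 - 3x + 5.

```python
def horner(coeffs, x):
    """Evaluate as ((2x - 3)x + 5): one multiply and one add per coefficient."""
    result = coeffs[0]
    for c in coeffs[1:]:
        result = result * x + c
    return result


def term_by_term(coeffs, x):
    """Evaluate as a sum of terms, each a coefficient times a power of x."""
    degree = len(coeffs) - 1
    return sum(c * x ** (degree - i) for i, c in enumerate(coeffs))


coeffs = [2, -3, 5]  # 2x^2 - 3x + 5
print(horner(coeffs, 4))        # ((2*4 - 3)*4 + 5) = 25
print(term_by_term(coeffs, 4))  # 2*16 - 3*4 + 5 = 25
```

Both give the same value, but only the second mirrors the way the polynomial is written: each term, with its coefficient and degree, is visible in the computation.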

Visual aspects of the rules

The precedence of multiplication is also closely tied to the notation itself; I’m not sure which is the cause, and which is the effect:

It's also worth mentioning that our notation is partly designed to make the rules feel natural.  First, we write multiplication without a symbol, making "2x" feel like a single quantity and "2x + 3y" look like you should first multiply and then add.  [In your notation, you probably would not want to allow this way to indicate multiplication, but would require use of the symbol, since multiplication would not play a special role.]  Second, we write exponents as superscripts, setting them apart and imparting an asymmetry to the notation that exactly fits the non-commutativity of that operation; as long as you remember that the exponent is attached only to the item just before it, it is natural to see "2x^2" as the product of 2 and x^2.  This takes away some of the force of your objection that the rules are confusing and hard to follow.

I suspect we write multiplication like $2x$ in part because we want that to have top priority; and similarly an exponent, $2^x$, is thought of as something attached to a particular symbol, applied to that before anything else.

Another very important advantage is that the standard rules fit very well with the properties of numbers, making algebraic manipulations feel natural.  For example, addition is commutative, so we can swap the order of terms, like "2x + 3y" becoming "3y + 2x".  But in your notation, the former would have to be written as 2x+(3y), and when you commute the addition you would have to remember to add parentheses around the term you move to the end: 3y+(2x). It's no longer just a matter of moving symbols around; you'd have to think more!

This is an idea we’ll be delving into below.

One more example: What does the distributive property look like in your notation?  Here are two examples in standard notation:

  2(3 + 4) = 2*3 + 2*4      (2 + 3)4 = 2*4 + 3*4

In left-to-right notation, the first of these would have to be written as

  2(3 + 4) = 2*3 + (2*4)

You have to remember the parentheses on the right side.  But the second example is worse.  Here's how you'd write it:

  2 + 3 * 4 = 2*4 + (3*4)

You'd no longer have to use parentheses on the left side, so distribution would not be thought of as a way to eliminate parentheses, and would not consistently be applied across parentheses.

The very way we think of distribution involves removal of parentheses; that is because multiplying a sum requires parentheses in our notation. In a different notation, we could still distribute, but it wouldn’t tie in to the appearance of the expression.

Do you see now that the combination of notation and rules that we have is just about perfect for algebra?  You don't see it when you're just working with numbers, but as soon as you start using variables and moving things around, you find that the way we write expressions makes it really easy to do what algebra requires.

Now imagine going back before the 1500's, when algebra problems were written out in words (in Latin or Arabic)!  No wonder math and science made such huge strides in the 1600s, giving us calculus, the orbits of planets, and worldwide navigation.

Martin was humbled:

Thanks a lot.  I guess that things are smartest the way they are, especially with millions of people constantly looking over and criticizing them (cough, cough).  Keep up the good work!

I had to correct his conclusion that there is nothing better than the status quo:

Just to make sure I haven't overstated my case, not ALL notation in math is perfect yet!  There are several areas (such as logarithms) where our notation is still at least a little awkward, and you should not give up on asking whether they can be improved.  Sometimes nobody complains just because we're used to it and know that it would be next to impossible to get everyone to change; yet notation does keep changing, at least in little ways, and there's always a chance that some new idea will take over because it is clearly superior.

The commutative property (I)

Above, I mentioned that the order of operations “plays nicely with” the commutative and distributive properties. Let’s look at two further explorations of each of these. First, a question from 1995:

Why Order of Operations?

Recently some 7th-grade math students were introduced to the concept of "order of operations."  Several of the students had questions about WHY we use the order of operations at all... why not just set up problems to read from left-to-right?  (One student even came in with a demonstration of "reverse Polish notation"!)

Sound familiar? I’ll just quote part of one answer, from Doctor Ken:

Here's one reason we might think that our normal system is so neat.  You know that the operations of addition and multiplication are _commutative_, i.e. they give you the same result no matter what order you write them in: 2*3 = 3*2.  In conventional notation, that property is clearly reflected, and you get lots of options for how you want to write your expressions:  2*3+5 = 3*2+5 = 5+2*3 = 5+3*2 = 11.  In calculator-order notation, for example, you only get the first two, 2*3+5 and 3*2+5.  In calculator-order notation the third evaluates to 21, and the fourth evaluates to 16.  So the main reason for using conventional order of operations is the flexibility it gives you in writing down mathematical expressions.  It's important to remember, though, that it really is just a "conventional" system, i.e. it's a convention that we use it (albeit a pretty good one in my opinion).

By grouping operations as sums of products (that is, terms), PEMDAS allows us to commute terms without needing to add parentheses.
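Doctor Ken's four arrangements are easy to check. The sketch below assumes that "calculator order" means applying each operator to the running result as soon as it is read, and the helper name `calculator_order` is illustrative. Python's own arithmetic follows the conventional rules, so ordinary expressions serve as the comparison.

```python
import operator
import re

OPS = {"+": operator.add, "*": operator.mul}


def calculator_order(expr: str) -> int:
    """Apply each operator to the running result, strictly left to right."""
    tokens = re.findall(r"\d+|[+*]", expr)
    result = int(tokens[0])
    # Tokens alternate number, operator, number, ...; pair each operator
    # with the number that follows it.
    for op, num in zip(tokens[1::2], tokens[2::2]):
        result = OPS[op](result, int(num))
    return result


variants = ["2*3+5", "3*2+5", "5+2*3", "5+3*2"]
conventional = [2*3+5, 3*2+5, 5+2*3, 5+3*2]
left_to_right = [calculator_order(v) for v in variants]

print(conventional)   # [11, 11, 11, 11]
print(left_to_right)  # [11, 11, 21, 16]
```

Under the conventional rules all four arrangements agree; in calculator order, only the two that start with the product do.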

The commutative property (II)

Next, consider this question from 2006:

Why Do We Have the Order of Operations?

I have always hated the order of operations.  It seems like a stupid way to do things.  Why was it made, and who and when was it made?  

I understand that it is a convention, but why don't people just simply calculate from left to right?  It seems much more logical.  If I had to answer scientific formulas, could I use the order of operations?  I mean, does jumping around from number to number work in real life equations?

I answered with two main reasons related to formulas:

I can suggest a couple examples to illustrate why it seemed natural to the early developers of algebraic notation.

First, think about one of the better known formulas:

  A = pi r^2

for the area of a circle. What does that mean?  It says to multiply pi times the square of the radius.  Do you see how the order of operations is necessary to interpret even such a simple formula correctly?  If you just took it from left to right, it would tell you to first multiply pi times r, and then square the result, which is not what you need to do to get the area.  In order to make it read correctly, you would have to either be very careful about order, and write

  A = r^2 pi

or to use parentheses:

  A = pi (r^2)

So one of the very simplest formulas is much easier to write when E comes before M.

Now consider a common type of expression in algebra, the polynomial. This is a sum of terms, each of which can be a product of a number and some power of the variable:

  3x^2 - 4x + 2

If we just went left to right, we couldn't write it that simply, but would have to use parentheses something like this:

  3(x^2) - (4x) + 2

The standard order of operations is designed to make it easy to write these types of expressions!

We need to follow the natural hierarchy in order to write such expressions neatly.
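Python happens to use the same hierarchy for its arithmetic operators (`**` binds more tightly than `*`), so the circle formula above can be tried directly. This is only an illustrative sketch; the function names are made up for the comparison.

```python
import math


def area_conventional(r: float) -> float:
    # Read as pi * (r^2): the exponent applies to r alone.
    return math.pi * r ** 2


def area_left_to_right(r: float) -> float:
    # What a strict left-to-right reading would compute: (pi * r)^2.
    return (math.pi * r) ** 2


r = 3.0
print(area_conventional(r))   # 9*pi, the actual area
print(area_left_to_right(r))  # 9*pi^2, which is not an area at all
```

The first function needs no parentheses because the exponent is applied before the multiplication; the second has to add them to force the left-to-right reading.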

Now I brought in the commutative property:

Finally, think about the properties of operations, such as the commutative property of addition.  This lets us rewrite something of the form

  a + b

as

  b + a

if it helps.  For example, we can rewrite

  2x + 3y + 4x

as

  2x + 4x + 3y

so that we can combine the x terms and have

  6x + 3y

But without the order of operations that says to do each multiplication first, you just couldn't do that; you'd always have to be very careful about the order of things, unless you used parentheses all over the place.

The natural order of operations treats expressions as sums of chunks, which can therefore be moved around freely.

So the order of operations really does make algebra easier, not harder.  That's why it's used in ALL algebraic equations and formulas, from the familiar to the technical.  Without such a universal standard, we couldn't be sure what anything meant; and with a different one, the most useful expressions would be the hardest to read and write, rather than the easiest (as is true in reality).

The distributive property (I)

But why put addition last; why not make expressions behave like products of sums (which, as things are, require parentheses) rather than sums of products?

Here’s a question from 1998:

Ordering the Operations

In my math class we are studying PEMDAS. Why does the order of operations have to be in that order? Who made up PEMDAS?

I answered, starting with the “who”:

People generally say that the order of operations is nothing more than an arbitrary convention - that is, there had to be some rule so everyone would read an expression the same way, so they just chose a rule. I don't think any one person made the decision, but it just gradually developed as the modern symbols for algebra and arithmetic developed. But I think there is a good reason that the traditional order was agreed upon without any arguments.

I chose one feature to focus on.

That reason is the distributive rule, which we write as:

   a * (b + c) = a * b + a * c

If we reversed the order of operations, doing addition before multiplication, we would write it this way:

   a * b + c = (a * b) + (a * c)

Do you see the difference? In our usual form, we can say that the multiplication distributes over the terms in parentheses. The parentheses are required because the addition has to be done first. But in the reversed form, the parentheses aren't needed there, so the distribution isn't nearly as obvious.

There is a natural hierarchy, and requiring parentheses around a sum being multiplied exactly fits it.

For the same reasons, polynomials would be more awkward to write, since each term would require parentheses.

To put it more simply, we do multiplication before addition because multiplication distributes over addition; multiplication is in some sense "more powerful" by nature.

Similarly, exponentiation distributes over multiplication, so we do that first:

   (a * b)^c = a^c * b^c

would be written as:

   a * b ^ c = (a^c) * (b^c)

if we did multiplication before exponents, and that isn't as clear.

I often draw a diagram like this: $$E\\M-D\\A-S$$ Each layer distributes over the layer immediately below it. Also, when each operation is applied to a power, it is equivalent to applying the operation below it to the exponent: raising a power to a power, $\left(a^m\right)^n$, amounts to multiplying the exponents, $a^{mn}$; and multiplying two powers, $a^m\cdot a^n$, amounts to adding the exponents, $a^{m+n}$. Our notation just expresses this hierarchy.
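These relationships between the layers can be spot-checked numerically. The following Python sketch uses one arbitrary choice of small positive integers; the function name `check_hierarchy` is illustrative. A check like this illustrates the identities but does not prove them.

```python
def check_hierarchy(a, b, c, m, n):
    """Return which identities hold for one choice of positive integers."""
    return {
        # Each layer distributes over the layer immediately below it.
        "mult over add": a * (b + c) == a * b + a * c,
        "power over mult": (a * b) ** c == a ** c * b ** c,
        # Operating on powers moves the operation down a layer in the exponent.
        "power of power": (a ** m) ** n == a ** (m * n),
        "product of powers": a ** m * a ** n == a ** (m + n),
        # Skipping a layer, or going upward, fails in general.
        "power over add": (a + b) ** c == a ** c + b ** c,
        "add over mult": a + (b * c) == (a + b) * (a + c),
    }


for name, holds in check_hierarchy(2, 3, 4, 5, 6).items():
    print(f"{name}: {holds}")
```

For these values the first four identities hold and the last two fail, which matches the diagram: distribution works one layer down, not across two layers or upward.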

The distributive property (II)

Finally, here’s a question from 2004:

Why Does Order of Operations Work the Way It Does?

In order of operations, it is understandable that brackets are focused on initially since they represent individual terms.  However, why do exponents then come before the other possible operations?  And most importantly, why are multiplication and division done before addition and subtraction?  What would happen otherwise?

What are the reasons and logic for this, if any?  I imagine that mathematicians a long time ago didn't just decide on it for the sake of agreement.

I answered again, first referring to various pages we’ve looked at; then I summarized some of the ideas we’ve seen:

The basic idea, as I see it, is that exponentiation is in a sense more powerful than multiplication, in that it distributes over multiplication; and likewise multiplication is more powerful than addition.  Another aspect is that the order we use makes the most interesting expressions, such as polynomials, as easy as possible to write.  Finally, there is a good chance that the order just arose naturally out of the way human language tends to express arithmetic, as the symbolism developed gradually from abbreviations of Latin or other languages; for example, "two cats and three dogs" naturally means (2c) + (3d) rather than 2(c + 3)d!

In sum: The way we order operations (a) makes polynomials and other important expressions as easy as possible to write; (b) fits well with the commutative and distributive properties, making manipulations using them natural; and (c) meshes with our notation for multiplication and exponentiation. There is a natural hierarchy to the operations, and PEMDAS just reflects this.
