(An archive question of the week)
The indeterminate nature of 0/0, which we looked at last time, is an essential part of the derivative (in calculus): every derivative that exists is a limit of that form! So it is a good idea to think about how these ideas relate.
Here is a question from 2007:
Derivative Definition and Division by Zero Hi, I'm having a bit of trouble with the concept behind first principles. Even though one aims to cancel out the denominator, which is technically zero, proofs that show that 1 = 2 may perform the same act (canceling out a denominator that was 0) and these proofs are considered invalid. Why is First Principles considered valid then? My thoughts are that perhaps the solution lies in the fact that in First Principles, the denominator approaches 0 as opposed to actually being 0, although I figured these two things are more or less equivalent since usually at the end of a First Principles problem one substitutes in 'h' as 0, even though it is stated that h is only approaching 0. Any thoughts?
Adrian never actually stated what he was doing, probably because in his context, “first principles” has always been used in the phrase “finding a derivative from first principles”: that is, directly applying the definition of the derivative to differentiate a function. We have seen enough such questions to know what he meant; but “first principles” can apply to the fundamental concepts or basic definitions in any field, so it really should have been stated.
Adrian is comparing the process of finding the limit in a derivative to the sort of false proof explained here in our FAQ (which I should have a post about soon, because that is an interesting subject of its own):
False Proofs, Classic Fallacies
A typical false proof using division by zero looks like this (from the FAQ):
a = b
a^2 = ab
a^2 – b^2 = ab – b^2
(a-b)(a+b) = b(a-b)
a+b = b
b+b = b
2b = b
2 = 1
Here we started with the assumption that two variables a and b are equal, and derived from that the “fact” that 2 = 1; so something must be wrong. The puzzle is to figure out what. The error occurred on the fifth line, when we divided both sides by (a – b), which, on our assumption, is zero. Dividing both sides of an equation by 0 is invalid, because the result is undefined. What is actually being done is to remove a common factor from both sides; but we can’t really conclude from \(0x = 0y\) that \(x = y\).
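One way to locate the faulty step is to substitute actual numbers and test each line in turn. The sketch below does this with the arbitrarily chosen values a = b = 3; the list name `steps` and the labels are just for illustration.

```python
# Test every line of the false proof with concrete numbers.
# The values a = b = 3 are an arbitrary choice; any equal pair works.
a = b = 3

steps = [
    ("a = b",                 a == b),
    ("a^2 = ab",              a**2 == a * b),
    ("a^2 - b^2 = ab - b^2",  a**2 - b**2 == a * b - b**2),
    ("(a-b)(a+b) = b(a-b)",   (a - b) * (a + b) == b * (a - b)),
    ("a + b = b",             a + b == b),
    ("b + b = b",             b + b == b),
    ("2b = b",                2 * b == b),
    ("2 = 1",                 2 == 1),
]

for label, holds in steps:
    print(f"{label:24} {'true' if holds else 'FALSE'}")

# The first line that fails is the one produced by cancelling (a - b) = 0.
first_false = next(label for label, holds in steps if not holds)
print("first false line:", first_false)
```

The fourth line is true only because both sides are 0; the cancellation that produces the fifth line is where truth is lost.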
The key idea there is that if you divide by zero in a proof, the result can’t be trusted. So why can it, here?
Doctor Rick took the question:
In order to make sense of what you've written, I have to add some words. You're talking about CALCULATING A DERIVATIVE from first principles (that is, the definition of a derivative as a limit), aren't you? I just used a word that makes all the difference: the derivative is defined as a LIMIT. We never actually divide by zero; rather, we divide by a very small number, and we find the limit of that quotient as h approaches zero.
This is more or less what Adrian had in mind when he contrasted approaching zero to being zero; but because a typical last step in finding the limit is to actually replace a variable with zero, he is still unsure. So, what is the difference?
For example, let's find the derivative of f(x) = 2x^2:
\[\lim_{h\to 0}\frac{2(x+h)^2 - 2x^2}{h} = \lim_{h\to 0}\frac{4hx + 2h^2}{h} = \lim_{h\to 0}(4x + 2h) = 4x\]
The quantity whose limit we are taking is undefined at h = 0, for just the reason you say. But the LIMIT at h = 0 exists.
If we just plugged in 0 for h at the start, we would get 0/0, which does not have a single defined value; this is why the expression 0/0 is undefined, and we can’t actually let h equal zero in that expression. But here 0/0 is only a form: a shorthand notation for the fact that the function is a quotient of functions, each of which is approaching zero. In such a case, we have to find an alternative to evaluating it as it stands. As Doctor Rick has said, it’s a sign saying “bridge out — road closed ahead”, that forces us to take a detour to get to our goal.
In this case our detour is to simplify the expression, dividing numerator and denominator by h, resulting in an expression that can be evaluated at h = 0. What has happened is that we have found a new function that is defined at h = 0, but is equal to the original function for all other values, so that it has the same limit. But this function is continuous, so that its limit is its value at that point, and we can just substitute.
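To see the detour numerically, here is a small sketch for f(x) = 2x^2. The function names `quotient` and `simplified`, and the sample point x = 3, are my own choices for illustration. The original quotient fails at h = 0, while the simplified expression agrees with it for nonzero h and can also be evaluated at h = 0.

```python
def f(x):
    return 2 * x**2

def quotient(x, h):
    """Difference quotient of f; has the form 0/0 when h = 0."""
    return (f(x + h) - f(x)) / h

def simplified(x, h):
    """The same expression after cancelling h; defined for every h."""
    return 4 * x + 2 * h

x = 3  # arbitrary sample point
for h in (0.1, 0.01, 0.001, -0.001):
    # The two columns agree up to floating-point rounding.
    print(h, quotient(x, h), simplified(x, h))

try:
    quotient(x, 0)
except ZeroDivisionError:
    print("quotient is undefined at h = 0")

print("simplified at h = 0:", simplified(x, 0))
```

Python refuses to compute the quotient at h = 0, which is the "road closed" sign; the simplified function supplies the value that the quotient approaches.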
For a slightly more complex example of proving a derivative from the definition, see
Proof of Derivative for Function f(x) = ax^n
The key is that the simplification is valid as long as h is not zero; and the limit relates only to values of h other than zero:
Maybe it will help you if you go back to the first-principles definition of a limit: (this is from my old calculus textbook) Suppose f is a function defined for values of x near a. (The domain of f need not include a, although it may.) We say that L is the limit of f(x) as x approaches a, provided that, for any epsilon > 0 there corresponds a deleted neighborhood N of a such that L - epsilon < f(x) < L + epsilon whenever x is in N and in the domain of f. Notice that the domain of f need not include the point at which the limit is to be taken, and that the condition only needs to hold when x is in the domain of f. This shows that a function, such as our function of h, can have a limit at a point (h=0 in our case) where the function itself is undefined. There is no contradiction there.
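The definition can be tried out on our difference quotient. In the sketch below (with hypothetical names `q`, `delta_for`, and `holds_on_samples`, and the sample point x = 3), the deleted neighborhood is taken to be 0 < |h| < epsilon/2, which works because the quotient differs from 4x by exactly 2h. Checking sample points is only an illustration, not a proof, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

X = Fraction(3)   # arbitrary sample point
L = 4 * X         # the claimed limit, 4x

def q(h):
    """Difference quotient of 2x^2 at X; not defined at h = 0."""
    return (2 * (X + h)**2 - 2 * X**2) / h

def delta_for(eps):
    # Since q(h) - L = 2h for h != 0, a radius of eps/2 is enough.
    return eps / 2

def holds_on_samples(eps, n=1000):
    """Check L - eps < q(h) < L + eps at sample points with 0 < |h| < delta."""
    d = delta_for(eps)
    samples = [sign * d * Fraction(k, n + 1)
               for k in range(1, n + 1) for sign in (1, -1)]
    return all(L - eps < q(h) < L + eps for h in samples)

for eps in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**6)):
    print(eps, holds_on_samples(eps))
```

Note that h = 0 is never among the sample points: the neighborhood is deleted, so the quotient is never asked for a value it does not have.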
Adrian wanted to be sure he understood clearly, so he wrote back:
Thanks Doctor Rick, it is a bit clearer though I still have one small problem. Sticking with the example you gave, you eventually arrive at \(\lim_{h\to 0}(4x + 2h) = 4x\). What is confusing is that before reaching this step, we were unable to simply sub in h as 0 and I was under the impression that this was because h didn't really equal zero, but rather a number extremely close to zero. However, once we have eliminated the problem of getting 0/0 if we were to sub in h as zero, it seems like it's perfectly alright to treat h as if it is exactly zero, as we can sub it into the equation as zero in order to arrive at 4x. Is it possible to think of 'h' as being BOTH extremely close to zero and exactly zero, and therefore limits to be kind of a way of sidestepping the rule of not being able to divide by zero, or am I totally missing the concept?
Doctor Rick responded by saying something like what I said above (which is worth saying more than once!).
What's going on here is that we have replaced a function f(h) that has no value at h=0, with a CONTINUOUS function g(h) = 4x + 2h. The two functions f(h) and g(h) are equal for every h EXCEPT h=0; therefore they have the same limit at h=0. The limit at point a of a function g(x) that is CONTINUOUS at a is the value g(a). (That's the definition of a continuous function.) Therefore we easily find the limit of 4x + 2h as h approaches 0, namely 4x; and this must also be the limit of f(h) as h approaches 0.
The original function f had a “hole”, a point where it was not defined. The new function g is exactly the same function with the hole “filled in”. And since the limit is not affected by the hole, it is the same for both.
Does that help? We are not sidestepping the "rule" (that division by zero is undefined) in the sense of violating it in any way. We are finding the LIMIT of a function with h in the denominator as h approaches 0; the limit, by definition, never invokes division by EXACTLY zero. We evaluate the limit by finding a function that has the same values for every h NEAR zero (but not EXACTLY zero); this function does not involve division by h, so there is nothing wrong with evaluating it at exactly zero. You could say we are "getting around" division by zero in the literal sense that we are working in the immediate neighborhood of zero without going exactly there. It may be tricky, but it's perfectly legal.
A good case could be made that this “sidestepping” is exactly why limits were invented. In the beginning of calculus, one had to simultaneously think of our h as being zero and yet not being zero — very small, but still legal to divide by. Limits were the way to allow us to make sense when we talk about derivatives.
1. With the function y = x^2 consider both x+h and x-h
Then the derivative is {(x+h)^2 – (x-h)^2} / (2h) = 4xh / (2h) = 2x as the limit
Interestingly, with this function, whatever the value of ‘h’ (bar zero) the slope of the line is always 2x
2. Alternatively consider the result of x+h and x-h taken separately, giving derivatives of 2x+h and 2x-h. In the limit these approach 2x from both ‘above and below’, which I find satisfying.
If valid, of course!
The first comment is an application of an alternative definition for the derivative, discussed here:
The Symmetric Derivative
It isn’t always identical to the derivative, but it happens to be very convenient in this specific case, as you end up taking the limit of a constant function. It doesn’t, however, really bypass the issue under discussion, as you still had to cancel h, and before that you did have the form 0/0.
The symmetry you observe in the left- and right-hand difference quotients is equivalent to your observation about the constancy of the symmetric difference quotient, as the latter is the average of the former, which is always 2x.
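These relationships are easy to check with exact arithmetic. The sketch below uses hypothetical helper names `forward`, `backward`, and `symmetric` for the three difference quotients. It confirms that for x^2 the symmetric quotient is the average of the other two and equals 2x for any nonzero h, and then shows with |x| at 0 (a standard example) how the symmetric quotient can settle on a value even though the one-sided quotients disagree, so that the ordinary derivative does not exist.

```python
from fractions import Fraction

def forward(f, x, h):
    return (f(x + h) - f(x)) / h

def backward(f, x, h):
    return (f(x) - f(x - h)) / h

def symmetric(f, x, h):
    return (f(x + h) - f(x - h)) / (2 * h)

def square(t):
    return t * t

x = Fraction(5)  # arbitrary sample point
for h in (Fraction(1), Fraction(1, 10), Fraction(-1, 1000)):
    fw, bw, sym = forward(square, x, h), backward(square, x, h), symmetric(square, x, h)
    # For x^2: forward is 2x + h, backward is 2x - h, and their average is 2x.
    print(h, fw, bw, sym, sym == (fw + bw) / 2 == 2 * x)

# Contrast: |x| at 0. The one-sided quotients stay at +1 and -1 for h > 0,
# while the symmetric quotient is 0 for every nonzero h.
zero, h = Fraction(0), Fraction(1, 100)
print(forward(abs, zero, h), backward(abs, zero, h), symmetric(abs, zero, h))
```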