(A new question of the week)
A recent question from a student working beyond what he has learned led to an interesting discussion of alternative methods for solving a minimization problem, both with and without calculus.
The problem
The question came from Kurisada a couple months ago:
f(x, y) = x² – 4xy + 5y² – 4y + 3 has a min value. Find the value of x and y when f(x, y) is minimum.
This is the first time I met this question, and here is my way:
First I made f'(x) = 2x – 4y = 0. (I’m not sure if I can make it to f(x) while it is actually f(x, y).)
Therefore x = 2y.
Then I input x = 2y to f(x, y) = y² – 4y + 3 = (y – 2)² – 1
Thus min value = -1
y = 2
x = 4
Apparently this student has not done multivariable calculus, but has invented some of its basic concepts, particularly partial derivatives. The method is nonstandard, but correct. Now we need to show why!
Partial derivatives
Doctor Rick replied:
From your parenthetical comment, it appears that you may not have learned about partial derivatives; but what you have done is perfectly valid.
The partial derivative of f(x, y) with respect to x, ∂f/∂x, is what you get when you differentiate while interpreting y as a constant rather than a variable. The minimum of a function of two variables must occur at a point (x, y) such that each partial derivative (with respect to x, and with respect to y) is zero. (Of course there are other possibilities akin to those in calculus of one variable — if the derivative is not defined, etc. They don’t apply here.)
You found the locus of points on which ∂f/∂x = 0, then wrote a function in one variable representing the value of f(x, y) on that locus, and minimized that function. It worked — good job!
Kurisada just hoped that it would be valid to temporarily pretend that y was constant; in effect, this amounted to slicing the surface defined by function f with a plane \(y = k\), and finding the minimum point of the resulting curve. This is what a partial derivative does: it gives the slope of such a curve.
Here is a graph of the surface defined by f (light blue), showing the intersection with the (arbitrarily chosen) plane \(y = 3.5\) (red) and its minimum point, A. The slope of this curve at any point is \(\frac{\partial f}{\partial x}\). M is the absolute minimum we are seeking.
Kurisada had found that the ordered pairs for which \(\frac{\partial f}{\partial x} = 0\) satisfy the equation \(x = 2y\), which is the equation of a plane; so the low points on the y-slices lie on the intersection of this plane with the surface. This curve is the locus (set of points) Doctor Rick referred to. Replacing \(x\) with \(2y\) in the equation \(z = x^2 - 4xy + 5y^2 - 4y + 3\) yielded \(z = (2y)^2 - 4(2y)y + 5y^2 - 4y + 3 = y^2 - 4y + 3\), which is the equation of the projection (“shadow”) of the locus on the yz-plane.
Here I have added in the intersection of the surface with the plane \(x = 2y\) (blue), showing how our point A lies on this locus; M is the minimum point of the locus, and therefore the minimum point on the surface.
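The slicing picture can also be checked numerically. What follows is only a minimal sketch, not part of the original exchange: the helper names `slope_in_x` and `slice_minimum` are made up for illustration, and exact fractions are used so that no rounding gets in the way. It finds the low point A of the slice \(y = 3.5\), confirms that the slope there is zero, and confirms that the height of A agrees with \(y^2 - 4y + 3\).

```python
from fractions import Fraction

def f(x, y):
    """The surface z = x^2 - 4xy + 5y^2 - 4y + 3."""
    return x**2 - 4*x*y + 5*y**2 - 4*y + 3

def slope_in_x(x, y, h=Fraction(1, 100)):
    """Central-difference slope along a slice y = const (exact for a quadratic in x)."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def slice_minimum(k):
    """Low point of the slice y = k, using x = 2y from 2x - 4y = 0."""
    x = 2 * k
    return x, k, f(x, k)

k = Fraction(7, 2)                      # the plane y = 3.5 used in the graph
ax, ay, az = slice_minimum(k)
print(f"A = ({ax}, {ay}, {az})")        # A = (7, 7/2, 5/4)
print(slope_in_x(ax, ay))               # 0
print(az == k**2 - 4*k + 3)             # True
print(slice_minimum(2))                 # (4, 2, -1), the overall minimum M
```

Any other value of `k` gives another point on the same locus; only \(k = 2\) gives the lowest one.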
The usual calculus method
Kurisada had questions about the reference to each partial derivative:
Is it possible if I regard y as the variable and x as the constant?
Does it mean that actually I need to do both the partial derivatives with respect to x and with respect to y?
Or is it only done to check whether the answer is true?
Doctor Rick replied first to the suggestion to treat only y as the variable:
That should work also. That is, find the locus of points at which ∂f/∂y = 0, then minimize f(x, y) constrained to this locus.
Let’s try that, using Kurisada’s method with y rather than x. We have $$f(x, y) = x^2 - 4xy + 5y^2 - 4y + 3,$$ so $$\frac{\partial f}{\partial y} = -4x + 10y - 4,$$ which is zero when $$y = \frac{2x+2}{5}.$$ Putting this into \(f(x,y)\), we get $$f(x, y) = x^2 - 4x\left(\frac{2x+2}{5}\right) + 5\left(\frac{2x+2}{5}\right)^2 - 4\left(\frac{2x+2}{5}\right) + 3.$$ This simplifies to $$\frac{1}{5}(x-4)^2 - 1,$$ whose minimum again is -1, when \(x = 4\) and \(y = \frac{2(4)+2}{5} = 2.\)
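The simplification is the step where slips are most likely, so it is worth a quick check. This is just a sketch under my own naming (`y_on_locus` and `simplified` are illustrative helpers, not anything from the original discussion); it compares the substituted expression with \(\frac{1}{5}(x-4)^2 - 1\) at a range of integer values of \(x\), using exact fractions.

```python
from fractions import Fraction

def f(x, y):
    return x**2 - 4*x*y + 5*y**2 - 4*y + 3

def df_dy(x, y):
    """Partial derivative with respect to y: -4x + 10y - 4."""
    return -4*x + 10*y - 4

def y_on_locus(x):
    """The y that makes df/dy zero for a given x: y = (2x + 2)/5."""
    return Fraction(2*x + 2, 5)

def simplified(x):
    """The claimed simplification: (1/5)(x - 4)^2 - 1."""
    return Fraction(1, 5) * (x - 4)**2 - 1

samples = range(-10, 11)
# The locus really is where df/dy vanishes ...
assert all(df_dy(x, y_on_locus(x)) == 0 for x in samples)
# ... and f restricted to it matches the simplified form.
assert all(f(x, y_on_locus(x)) == simplified(x) for x in samples)
print(y_on_locus(4), simplified(4))     # 2 -1
```

Two quadratics in \(x\) that agree at more than two points are identical, so agreement on all 21 sample values settles the matter.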
Here we have the intersection of the plane \(x = 7\) (green) with the surface, showing its minimum (B), and the locus of these minima (purple), which again passes through M:
As to whether the method using both partials is needed:
What you did is sufficient. You minimized the function in two directions: the x direction, and along the “valley” you had identified.
My description says that you can also choose to minimize the function in the x and y directions. If you want to try solving the problem this way, use your equation that says ∂f/∂x = 0, and write another that says ∂f/∂y = 0, then solve these two equations simultaneously. This is no more difficult than what you did.
I like this description of the locus of what we might call “east-west” minima as a “valley”; the solution is the lowest point along this valley.
The usual method, as explained here, is to set both partial derivatives to zero, and combine the two resulting equations. The equations, as we’ve seen, are \(2x - 4y = 0\) and \(-4x + 10y - 4 = 0\). Multiplying the first by 2 and adding, we get \(2y - 4 = 0\), giving \(y = 2\); putting this into the first equation, we get \(x = 4\). So the minimum is at \((4, 2)\), and \(f(4, 2) = 4^2 - 4(4)(2) + 5(2)^2 - 4(2) + 3 = -1\) yet again. In effect, here we are finding the intersection of two “valleys” (curves along which we find minima in the east-west and north-south directions).
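For readers who like to see the system solved mechanically, here is a small sketch. Note that it uses Cramer's rule rather than the elimination carried out above, and that `solve_2x2` is a hypothetical helper written for this illustration only.

```python
from fractions import Fraction

def f(x, y):
    return x**2 - 4*x*y + 5*y**2 - 4*y + 3

def solve_2x2(a, b, e, c, d, g):
    """Solve a*x + b*y = e and c*x + d*y = g by Cramer's rule."""
    det = a*d - b*c
    if det == 0:
        raise ValueError("the system has no unique solution")
    return Fraction(e*d - b*g, det), Fraction(a*g - c*e, det)

# df/dx = 0:   2x -  4y = 0
# df/dy = 0:  -4x + 10y = 4   (constant moved to the right-hand side)
x, y = solve_2x2(2, -4, 0, -4, 10, 4)
print(x, y, f(x, y))    # 4 2 -1
```

The determinant here is \(2 \cdot 10 - (-4)(-4) = 4\), which is nonzero, so the two “valleys” cross at exactly one point.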
Here is a picture of this method, showing the two loci of minima intersecting at M:
Solving without calculus
Doctor Rick continued, referring to a previous question from Kurisada that specifically asked for multiple ways to solve a problem (an excellent idea!):
Now, in the spirit of your last question, I’ll point out that the problem can also be solved without calculus! We know that a square is minimum when the quantity being squared is zero, since a square can’t be negative. Thus, if you can rewrite the function as a sum of squared quantities and a constant, the minimum will be that constant, and will be attained when each of the squared quantities is zero. See what you can do with this idea!
Kurisada had already done this with \(y^2 – 4y + 3\), and now observed that completing the square on the first two terms “reduced the problem to one already solved”, as we say:
I changed it to (x – 2y)² + y² – 4y + 3
And because (x – 2y)² to make it minimum, it becomes y² – 4y + 3
It is the same to my result! (This way makes me understand more about something I don’t really understand before!)
Doctor Rick finished the work, putting everything together:
Thus far you have changed the function-defining equation
f(x, y) = x² – 4xy + 5y² – 4y + 3
to
f(x, y) = (x – 2y)² + y² – 4y + 3
Now we want to complete the square on the last three terms. But you have already done this! You said in your original posting on this thread that
y² – 4y + 3 = (y – 2)² – 1
That’s exactly what we need. Putting this into the function definition, we get our final result
f(x, y) = (x – 2y)² + (y – 2)² – 1
Now we can see immediately that the least possible value of f(x, y) is –1, attained when both squared quantities are zero, that is, when the following system of equations is satisfied:
x – 2y = 0
y – 2 = 0
The rest is easy.
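As a sanity check on the algebra, the sum-of-squares form can be compared with the original function on a grid of integer points. This is only a sketch (the name `sum_of_squares` is mine), followed by the “easy” back-substitution that finishes the problem.

```python
def f(x, y):
    return x**2 - 4*x*y + 5*y**2 - 4*y + 3

def sum_of_squares(x, y):
    """The rewritten form (x - 2y)^2 + (y - 2)^2 - 1."""
    return (x - 2*y)**2 + (y - 2)**2 - 1

grid = range(-10, 11)
# The two forms agree at every grid point.
assert all(f(x, y) == sum_of_squares(x, y) for x in grid for y in grid)

# Both squares vanish when y - 2 = 0 and x - 2y = 0.
y = 2           # from y - 2 = 0
x = 2 * y       # from x - 2y = 0
print(x, y, f(x, y))    # 4 2 -1
```

Since neither square can be negative, no point on the grid (or anywhere else) can give a value below -1.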
We have often recommended trying multiple methods as a way to learn math more deeply. As Kurisada mentioned, the non-calculus method helped to see the problem and its answer from a new perspective. I hope my addition of the graphs (made with GeoGebra) adds yet another dimension to your understanding.
I like this technique.
FYI: There is a small error in this line (which did not copy well)
\(f(x,y) = x^2 - 4x\left(\frac{2x+2}{5}\right) + 5\left(\frac{2x+2}{5}\right)y^2 - 4\left(\frac{2x+2}{5}\right) + 3.\)
The y snuck in
Thanks for the correction! Let me know if you find any more.
What if x and y are related to each other by some function? (and not just a polynomial)
Hi, Narges.
That changes the problem to a constrained optimization problem, looking for the greatest or least value of the function f(x, y) given that x and y satisfy another equation, say g(x, y) = 0. This amounts to finding the minimum value of f along a curve in the xy plane.
Sometimes you can do this by solving the equation for y as a function of x, substituting that into function f, and minimizing the resulting function of one variable. (In fact, this is the standard approach in introductory calculus, where we may want to minimize, say, the area of a field subject to a constraint involving the amount of fencing. Here is a lesson on this sort of problem.)
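To make the substitution approach concrete, here is a small sketch using the function from this post together with a constraint I have made up purely for illustration, \(x + y = 3\) (it is not from any actual question). Solving the constraint for \(y\) turns \(f\) into a quadratic in \(x\) alone, whose vertex gives the constrained minimum; the helper names are hypothetical.

```python
from fractions import Fraction

def f(x, y):
    return x**2 - 4*x*y + 5*y**2 - 4*y + 3

def y_from_constraint(x):
    """Made-up constraint x + y = 3, solved for y."""
    return 3 - x

def g(x):
    """f restricted to the constraint line: a quadratic a*x^2 + b*x + c."""
    return f(x, y_from_constraint(x))

# Recover the coefficients of the quadratic from three of its values.
c = g(0)
a = Fraction(g(1) + g(-1), 2) - c
b = Fraction(g(1) - g(-1), 2)

x_min = -b / (2 * a)                # vertex; a > 0, so this is a minimum
y_min = y_from_constraint(x_min)
print(a, b, c)                      # 10 -38 36
print(x_min, y_min, g(x_min))       # 19/10 11/10 -1/10
```

The constrained minimum, -1/10, is higher than the unconstrained minimum of -1, as it must be, because the point (4, 2) does not lie on the line \(x + y = 3\).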
Alternatively, an excellent method is Lagrange multipliers. This works well as long as the functions are suitably differentiable. Here is a lesson on the topic, which we have referred students to.
We have not yet discussed either method in the blog; but I may make posts about them in the near future, now that you’ve reminded me! Or you may have some questions of this sort to ask at our Ask a Question page. When you show us your own attempt, we will be able to determine what method is appropriate for you.