Least Squares - Math 414 Spring 2023 (Narcowich)

Least Squares Problems

Least squares problems and inner products. One of the standard problems in 3D geometry is to find the distance from a point to a plane. Many problems encountered in applications can be put into a similar "geometric" form, with the point replaced by a vector v and the plane by a subspace W of an inner product space V. Specifically, what we wish to discuss here is called the least-squares problem. The object is to find both the distance from v to W, which is precisely the minimum of || v - w || over all vectors w in W, and a minimizer w₀ in W. The key to solving the problem is the following theorem.

Theorem. Let V be a vector space with an inner product < u, v >, and let V₀ be a subspace of V. A vector v₀ in V₀ minimizes the distance || v - w || over all w in V₀ if and only if v₀ satisfies the equation,
(∗) < v - v₀, w > = 0,
which holds for all w in V₀. In addition, v₀ is unique.
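
The theorem is stated without proof. The "if" direction and the uniqueness claim follow from a short Pythagorean argument, sketched here in LaTeX under the assumption that the inner product is real (in the complex case the cross term becomes twice a real part, which also vanishes). For any w in V₀ the vector v₀ - w lies in V₀, so (∗) gives < v - v₀, v₀ - w > = 0, and therefore

```latex
\begin{align*}
\|v - w\|^2 &= \|(v - v_0) + (v_0 - w)\|^2 \\
            &= \|v - v_0\|^2 + 2\langle v - v_0,\, v_0 - w\rangle + \|v_0 - w\|^2 \\
            &= \|v - v_0\|^2 + \|v_0 - w\|^2 \;\ge\; \|v - v_0\|^2 .
\end{align*}
```

Equality holds only when || v₀ - w || = 0, that is, when w = v₀, which is why the minimizer is unique.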

The significance of this theorem is that it provides a way to actually calculate the minimizer. If V₀ has an orthonormal basis E = {e₁,..., e_n}, then, since v₀ is in V₀, we recall that v₀ = α₁e₁ + ... + α_ne_n, where α_k = < v₀, e_k >, k = 1,..., n. By (∗), we have < v - v₀, e_k > = 0. Thus, < v, e_k > = < v₀, e_k >. From this it follows that α_k = < v, e_k >, and that the minimizer has the explicit form
v₀ = ∑_k <v, e_k > e_k.
The important feature of this formula is that we can calculate α_k = < v₀, e_k > = < v, e_k > directly from v, without knowing what v₀ is. Here are two examples.
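
Before the examples, here is a minimal numerical sketch of the projection formula, assuming V = R³ with the ordinary dot product. The function name `project` and the vectors `e1`, `e2`, `v` are hypothetical choices made for illustration; they do not come from the notes.

```python
import numpy as np

def project(v, basis):
    """Return sum_k <v, e_k> e_k, assuming `basis` is an orthonormal list."""
    return sum(np.dot(v, e) * e for e in basis)

# Hypothetical data: V = R^3 with the dot product, V0 = span{e1, e2}.
e1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
e2 = np.array([0.0, 0.0, 1.0])
v = np.array([3.0, 1.0, 2.0])

v0 = project(v, [e1, e2])            # candidate minimizer in V0
residual = v - v0                    # should be orthogonal to V0, as in (*)
distance = np.linalg.norm(residual)  # distance from v to V0

print("v0 =", v0)
print("<v - v0, e1> =", np.dot(residual, e1))
print("<v - v0, e2> =", np.dot(residual, e2))
print("distance =", distance)
```

For this choice of data the minimizer works out to v₀ = (2, 2, 2), the residual v - v₀ = (1, -1, 0) is orthogonal to both basis vectors, and the distance is √2.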

Least squares fitting of a function. We want to find the quadratic polynomial that gives the best least squares fit to the function f(x) = e^(2x) on the interval [-1,1]. In this case, the inner product and norm are
< f , g > = ∫₋₁¹ f(x)g(x)dx and ||f|| = (∫₋₁¹ f(x)²dx)^½.
Since we want to use quadratics, we will take W = P₃. The basis that we will use is E = {p₀(x), p₁(x), p₂(x)}, where

p₀(x) = 2^−½, p₁(x) = (3/2)^½ x, and p₂(x) = (5/8)^½ (3x² − 1).

These polynomials are called normalized Legendre polynomials, and they are orthonormal: < p_i, p_j > = δ_ij. In this basis, the quadratic polynomial that is the best least squares fit to e^(2x) has the form

p(x) = α₁p₀(x) + α₂p₁(x) + α₃p₂(x).

From our discussion above, we have

α₁ = < f, p₀ > = ∫₋₁¹ e^(2x) p₀(x) dx = 8^−½ (e² − e⁻²)
α₂ = < f, p₁ > = ∫₋₁¹ e^(2x) p₁(x) dx = (3/32)^½ (e² + 3e⁻²)
α₃ = < f, p₂ > = ∫₋₁¹ e^(2x) p₂(x) dx = (5/128)^½ (e² − 13e⁻²)

The quadratic polynomial that is the best least squares fit to e^(2x) is
p(x) = (1/4)(e² − e⁻²) + (3/8)(e² + 3e⁻²)x + (5/32)(e² − 13e⁻²)(3x² − 1).
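
The orthonormality of the basis and the three coefficients can be checked numerically. The sketch below approximates the inner product with Gauss-Legendre quadrature from NumPy; the 30-point rule, the helper name `inner`, and the variable names are assumptions made for illustration, not part of the notes.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

# Gauss-Legendre nodes and weights on [-1, 1]; 30 points is an arbitrary choice.
nodes, weights = leggauss(30)

def inner(f, g):
    """Approximate <f, g> = integral of f(x) g(x) over [-1, 1] by quadrature."""
    return np.sum(weights * f(nodes) * g(nodes))

f = lambda x: np.exp(2.0 * x)
p0 = lambda x: np.full_like(x, 2.0 ** -0.5)
p1 = lambda x: np.sqrt(3.0 / 2.0) * x
p2 = lambda x: np.sqrt(5.0 / 8.0) * (3.0 * x**2 - 1.0)
basis = [p0, p1, p2]

# Gram matrix <p_i, p_j>; it should be close to the 3x3 identity.
gram = np.array([[inner(p, q) for q in basis] for p in basis])

# Coefficients alpha_k = <f, p_k>, computed without knowing the minimizer.
alphas = np.array([inner(f, p) for p in basis])

# Closed-form values from the text, for comparison.
e2, em2 = np.exp(2.0), np.exp(-2.0)
closed_form = np.array([
    8.0 ** -0.5 * (e2 - em2),
    np.sqrt(3.0 / 32.0) * (e2 + 3.0 * em2),
    np.sqrt(5.0 / 128.0) * (e2 - 13.0 * em2),
])

print("max |gram - I|        =", np.max(np.abs(gram - np.eye(3))))
print("max |alphas - closed| =", np.max(np.abs(alphas - closed_form)))
```

Both printed differences should be at the level of floating-point rounding, since the quadrature rule is exact for the polynomial products and highly accurate for the smooth integrands involving e^(2x).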

Both the function and quadratic least squares fit are plotted below.

Least-squares data fitting. Problem: The table below contains data obtained by measuring the concentration of a drug in a person's blood. Find and sketch the straight line that best fits the data in the (discrete) least squares sense.

Log of Concentration
t       0      1      2      3      4
ln(C)   −0.1   −0.4   −0.8   −1.1   −1.5

Solution. We want to find coefficients a₁ and a₂ such that y = a₁ + a₂t is the best least-squares straight-line fit to the data. This means that we choose the two constants a₁ and a₂ to minimize the sum S = (y₀ − a₁ − a₂·0)² + (y₁ − a₁ − a₂·1)² + ... + (y₄ − a₁ − a₂·4)², where y_j denotes the measured value of ln(C) at t = j. If we let
w₁ = [1 1 1 1 1]^T, w₂ = [0 1 2 3 4]^T, and y_d = [-0.1 -0.4 -0.8 -1.1 -1.5]^T,
then we can rewrite the sum above in terms of the inner product and norm for R⁵:
S = || y_d − a₁w₁ − a₂w₂ ||².
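
As a small check that the sum of squared residuals and the norm form agree, the sketch below evaluates S both ways. The function names `S_sum` and `S_norm` and the trial coefficients are hypothetical; the trial values are arbitrary and are not the least-squares solution.

```python
import numpy as np

t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_d = np.array([-0.1, -0.4, -0.8, -1.1, -1.5])
w1 = np.ones(5)   # [1 1 1 1 1]^T
w2 = t.copy()     # [0 1 2 3 4]^T

def S_sum(a1, a2):
    """Sum of squared residuals (y_j - a1 - a2*t_j)^2, term by term."""
    return sum((yj - a1 - a2 * tj) ** 2 for yj, tj in zip(y_d, t))

def S_norm(a1, a2):
    """The same quantity written as || y_d - a1*w1 - a2*w2 ||^2 in R^5."""
    return np.linalg.norm(y_d - a1 * w1 - a2 * w2) ** 2

a1_trial, a2_trial = 0.0, -0.4   # arbitrary trial values, for illustration only
print(S_sum(a1_trial, a2_trial), S_norm(a1_trial, a2_trial))
```

The two functions return the same number for any choice of a₁ and a₂, up to rounding.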
Next, let V₀ = span{w₁, w₂}. The minimization problem now can be put in the form discussed earlier:

Find v₀ in V₀ such that || y_d − v₀ || = min_{w ∈ V₀} || y_d − w ||.

It is easy to show that if e₁ = 5^−½ [1 1 1 1 1]^T and e₂ = 10^−½ [−2 −1 0 1 2]^T, then E = {e₁, e₂} is an orthonormal basis for V₀. The idea here is to first find v₀ in terms of the E basis, and then change basis to F = {w₁, w₂}. The reason for doing this is that a₁ and a₂ are just the coordinates of v₀ relative to F. In the E basis, v₀ = < y_d, e₁ > e₁ + < y_d, e₂ > e₂ = −(1.7441 e₁ + 1.1068 e₂). With a little work, we see that e₁ = 5^−½ w₁ and e₂ = 10^−½ (w₂ − 2w₁). Substituting these in the expression for v₀ yields
v₀ = −0.08w₁ − 0.35w₂.
From this, we get a₁ = −0.08 and a₂ = −0.35. The line we want is y = −0.08 − 0.35t. The data and the line are plotted below.
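
The steps of this solution can be reproduced numerically. The sketch below follows the same route (project onto the orthonormal basis E, then change to the basis F) and cross-checks the result against NumPy's general least-squares solver, `np.linalg.lstsq`. The solver is a swapped-in technique used only for comparison, not the method of the notes, and the variable names are assumptions.

```python
import numpy as np

t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_d = np.array([-0.1, -0.4, -0.8, -1.1, -1.5])

w1 = np.ones(5)
w2 = t

# Orthonormal basis E = {e1, e2} for V0 = span{w1, w2}.
e1 = w1 / np.sqrt(5.0)
e2 = (w2 - 2.0 * w1) / np.sqrt(10.0)

# Coordinates of the minimizer in the E basis: <y_d, e1> and <y_d, e2>.
c1, c2 = y_d @ e1, y_d @ e2
v0 = c1 * e1 + c2 * e2

# Change basis to F = {w1, w2} using e1 = w1/sqrt(5), e2 = (w2 - 2*w1)/sqrt(10).
a1 = c1 / np.sqrt(5.0) - 2.0 * c2 / np.sqrt(10.0)
a2 = c2 / np.sqrt(10.0)

# Cross-check with a general-purpose solver (not the method used in the text).
A = np.column_stack([w1, w2])
coef, *_ = np.linalg.lstsq(A, y_d, rcond=None)

print("E-basis coordinates:", c1, c2)
print("a1, a2 from projection:", a1, a2)
print("a1, a2 from lstsq:     ", coef)
```

Both routes should give a₁ = −0.08 and a₂ = −0.35 up to rounding, and the residual y_d − v₀ should be orthogonal to both w₁ and w₂.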