Linear equations, Least squares solutions and Linear regression

Andrew Thangaraj

Aug-Nov 2020

Recap

  • Vector space \(V\) over a scalar field \(F\)
    • \(F\): real field \(\mathbb{R}\) or complex field \(\mathbb{C}\) in this course
  • \(m\times n\) matrix \(A\) represents a linear map \(T:F^n\to F^m\)
    • dim null \(T\) + dim range \(T\) = dim \(F^n\) = \(n\)
    • Solutions to \(Ax=b\) (if any exist): \(u+\) null\((A)\), where \(u\) is any particular solution
  • Four fundamental subspaces of a matrix
    • Column space, row space, null space, left null space
  • Eigenvalue \(\lambda\) and Eigenvector \(v\): \(Tv=\lambda v\)
    • Some linear maps are diagonalizable
  • Inner products, norms, orthogonality and orthonormal basis
    • Upper triangular matrix for a linear map over an orthonormal basis
    • \(V=U\oplus U^{\perp}\) for any subspace \(U\)
    • Orthogonal projection gives closest vector in the subspace

Linear equations

\(A\): \(m\times n\) matrix and \(b\): \(m\times1\) vector

System of linear equations: Find \(n\times1\) vector \(x\) s.t. \[Ax=b\]

A solution exists if and only if \(b\in\) range \(A\)

When a solution exists

a solution \(x\) achieves \(\lVert Ax-b\rVert=0\), the smallest possible value of the norm

So, the solutions to \(Ax=b\) are exactly \(\arg\min_{x}\lVert Ax-b\rVert^2\)

What if \(b\notin\) range \(A\) and there is no solution to \(Ax=b\)?
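
As a concrete illustration, here is a small numpy sketch with a made-up \(3\times2\) matrix \(A\) whose column space does not contain the chosen \(b\), so \(Ax=b\) has no solution (the matrix and vector are invented for demonstration only).

```python
import numpy as np

# A made-up overdetermined system: 3 equations, 2 unknowns
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 3.0])   # not in range(A): third entry would have to be 1 + 1 = 2

# b is in range(A) exactly when rank([A | b]) equals rank(A)
print(np.linalg.matrix_rank(A))                        # 2
print(np.linalg.matrix_rank(np.column_stack([A, b])))  # 3, so Ax = b has no solution
```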

Least squares solution to linear equations

Least squares solution to \(Ax=b\)

\[\arg\min_{x}\lVert Ax-b\rVert^2\]

Properties

  • If \(b\in\) range \(A\), the least squares solution coincides with the usual solution

  • If \(b\notin\) range \(A\), the solution is still well-defined. Why?
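
A quick numerical check of these properties (a numpy sketch reusing the made-up \(A\) from the earlier example):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

# Case 1: b in range(A) by construction -> least squares recovers the exact solution
x_exact = np.array([2.0, -1.0])
b_in = A @ x_exact
x_ls, *_ = np.linalg.lstsq(A, b_in, rcond=None)
print(np.allclose(x_ls, x_exact))                 # True

# Case 2: b not in range(A) -> the minimizer is still well-defined (nonzero residual)
b_out = np.array([1.0, 1.0, 3.0])
x_ls2, *_ = np.linalg.lstsq(A, b_out, rcond=None)
print(x_ls2, np.linalg.norm(A @ x_ls2 - b_out))
```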

Least squares solution and orthogonal projection

\(A\): \(m\times n\) matrix, \(b\): \(m\times1\) vector

Let \(U=\) range \(A\). Then, \(Ax\in U\) for every \(x\), while \(b\) is some vector in \(F^m\) that may lie outside \(U\)

\[\arg\min_{x}\lVert Ax-b\rVert^2\leftrightarrow \arg\min_{u\in U}\lVert u-b\rVert^2\]

Least squares solution is essentially the same as closest vector in a subspace!

Let \(P_U\): orthogonal projection operator onto \(U\)

Least squares solution to \(Ax=b\) is the same as solving

\[Ax=P_Ub\]
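
A small numerical sketch of this equivalence (numpy, same made-up example as above): compute \(P_Ub\) from an orthonormal basis of range \(A\) and check that the least squares solution satisfies \(Ax=P_Ub\).

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 3.0])

# Orthonormal basis for U = range(A) via (reduced) QR, then P_U b = Q Q^T b
Q, _ = np.linalg.qr(A)
Pb = Q @ (Q.T @ b)

# Least squares solution to Ax = b
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(A @ x_ls, Pb))   # True: A x_ls is the projection of b onto range(A)
```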

Linear regression with IPL data
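
The lecture demo fits a regression line to IPL (cricket) data, which is not reproduced here. As a stand-in, the following numpy sketch uses hypothetical numbers but follows the same workflow: build \(A\) from an all-ones column and the predictor column, then solve the least squares problem for the line \(y\approx c_0+c_1t\).

```python
import numpy as np

# Synthetic stand-in for the lecture's IPL data (hypothetical numbers)
t = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # e.g. a match/season index
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])   # e.g. the quantity being predicted

# Fit a line y ≈ c0 + c1*t: columns of A are the all-ones vector and t
A = np.column_stack([np.ones_like(t), t])
(c0, c1), *_ = np.linalg.lstsq(A, y, rcond=None)

print(c0, c1)   # intercept and slope of the least squares line
```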

Orthogonal complement and projection

\(U\): subspace of \(V\), \(P_U\): orthogonal projection

For \(v\in V\), suppose \(u=P_Uv\). Then,

\[u-v\in U^{\perp}\]

\(u\in U\) is the closest vector to \(v\) iff \(u-v\in U^{\perp}\)

Spanning set for \(U\): \(\{u_1,\ldots,u_m\}\)

Then, \(u-v\in U^{\perp}\) iff \(\langle u_i,u-v\rangle=0\) for all \(i=1,\ldots,m\)

In other words, \(u=P_Uv\) is the unique vector in \(U\) satisfying

\(\langle u_i,u\rangle = \langle u_i,v\rangle\) for \(i=1,\ldots,m\)
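
As an illustration (a numpy sketch with the standard inner product on \(\mathbb{R}^3\) and made-up vectors): writing \(u=c_1u_1+\cdots+c_mu_m\), the conditions above become the Gram system \(Gc=g\) with \(G_{ij}=\langle u_i,u_j\rangle\) and \(g_i=\langle u_i,v\rangle\).

```python
import numpy as np

# Columns of B are a spanning set {u1, u2} for a plane U in R^3 (made-up vectors)
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
v = np.array([1.0, 1.0, 3.0])

# Write u = c1*u1 + c2*u2; the conditions <u_i, u> = <u_i, v> become G c = g
G = B.T @ B                 # Gram matrix, G[i, j] = <u_i, u_j>
g = B.T @ v                 # g[i] = <u_i, v>
c = np.linalg.solve(G, g)   # fine here: {u1, u2} is independent, so G is invertible
u = B @ c                   # u = P_U v

print(u)
print(B.T @ (u - v))        # ≈ 0, i.e. u - v is orthogonal to U
```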

Solving linear regression

\(A\): \(m\times n\) real matrix, \(b\): \(m\times1\) real vector

Let \(U=\) range \(A\). Let \(a_i\) be the \(i\)-th column of \(A\).

Spanning set for \(U\): \(\{a_1,\ldots,a_n\}\)

Closest \(Ax\) to \(b\) satisfies \(Ax-b\in U^{\perp}\)

or

\(\langle a_i,Ax-b\rangle=0\) for \(i=1,\ldots,n\)

In other words,

\[A^T(Ax-b)=0\]

or

\[A^TAx=A^Tb\]

A solution always exists for the above equation!

Why? By orthogonal projection: \(P_Ub\in\) range \(A\), so some \(x\) satisfies \(Ax=P_Ub\), and any such \(x\) satisfies \(A^TAx=A^Tb\)
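
As a final check, a numpy sketch (same made-up example as above) that solves the normal equations directly and compares with the built-in least squares routine:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 3.0])

# Normal equations A^T A x = A^T b; here A has full column rank, so A^T A is invertible
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Compare with the least squares routine
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_normal, x_ls))   # True
print(A.T @ (A @ x_normal - b))      # ≈ 0: residual is orthogonal to the columns of A
```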