# Quadratic Forms, Matrix Norms and Optimization

Aug-Nov 2020

## Recap

• Vector space $V$ over a scalar field $F= \mathbb{R}$ or $\mathbb{C}$
• $m\times n$ matrix A represents a linear map $T:F^n\to F^m$
• dim null $T+$ dim range $T=$ dim $V$
• Solution to $Ax=b$ (if it exists): $u+$ null$(A)$
• Four fundamental subspaces of a matrix
• Column space, row space, null space, left null space
• Eigenvalue $\lambda$ and Eigenvector $v$: $Tv=\lambda v$
• There is a basis w.r.t. which a linear map is upper-triangular (over $\mathbb{C}$)
• If there is a basis of eigenvectors, linear map is diagonal w.r.t. it
• Inner products, norms, orthogonality and orthonormal basis
• There is an orthonormal basis w.r.t. which a linear map is upper-triangular (over $\mathbb{C}$)
• Orthogonal projection: distance from a subspace
• Adjoint of a linear map: $\langle Tv,w\rangle=\langle v,T^*w\rangle$
• null $T=$ $($range $T^*)^{\perp}$
• Self-adjoint: $T=T^*$, Normal: $TT^*=T^*T$
• For a self-adjoint (or normal) map, eigenvectors corresponding to different eigenvalues are orthogonal
• Complex/real spectral theorem: $T$ is normal/self-adjoint $\leftrightarrow$ orthonormal basis of eigenvectors

Consider real vector spaces for this lecture ($F=\mathbb{R}$)

$A$: $n\times n$ real symmetric matrix

Quadratic form: $x^TAx$ for $x\in\mathbb{R}^n$

Example: $A=\begin{bmatrix}1&4\\4&1\end{bmatrix}$

$x^TAx=x_1^2+8x_1x_2+x_2^2$

Can $x_1x_2$ term be eliminated by changing variables?

Example: Let $x_1=s_1+s_2$, $x_2=s_1-s_2$

$x^TAx=10s_1^2-6s_2^2$

How does the above generalise?
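The two-variable example above can be checked numerically. A small NumPy sketch (the test point $s_1=0.7$, $s_2=-1.3$ is arbitrary, chosen only for illustration):

```python
import numpy as np

# The example above: x^T A x = x1^2 + 8*x1*x2 + x2^2
A = np.array([[1.0, 4.0],
              [4.0, 1.0]])

# Change of variables x1 = s1 + s2, x2 = s1 - s2, at an arbitrary test point
s1, s2 = 0.7, -1.3
x = np.array([s1 + s2, s1 - s2])

lhs = x @ A @ x                  # quadratic form in the original variables
rhs = 10 * s1**2 - 6 * s2**2     # diagonalised form: the cross term is gone
```

The two values agree, confirming that this change of variables eliminates the $x_1x_2$ term.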

## Using spectral theorem to simplify quadratic forms

$A$: $n\times n$ real symmetric matrix

$\{e_1,\ldots,e_n\}$: orthonormal eigenvector basis

$Ae_i=\lambda_ie_i$

$S=\begin{bmatrix} \vdots&&\vdots\\ e_1&\cdots&e_n\\ \vdots&&\vdots \end{bmatrix}$, $S^{-1}=S^T=\begin{bmatrix} \cdots&e^T_1&\cdots\\ &\vdots&\\ \cdots&e^T_n&\cdots \end{bmatrix}$

$A=SDS^T$, where $D=\begin{bmatrix} \lambda_1&&0\\ &\ddots&\\ 0&&\lambda_n \end{bmatrix}$

$x^TAx = x^TSDS^Tx = s^TDs$, where $s=S^Tx$

$x^TAx = \lambda_1s_1^2+\cdots+\lambda_ns_n^2$
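The identity above can be verified numerically with `np.linalg.eigh`, which returns exactly the factorisation $A=SDS^T$ for a symmetric matrix. The random matrix and vector here are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))
A = (B + B.T) / 2            # random symmetric matrix

lam, S = np.linalg.eigh(A)   # A = S diag(lam) S^T, columns of S orthonormal
x = rng.standard_normal(n)
s = S.T @ x                  # coordinates of x in the eigenvector basis

lhs = x @ A @ x
rhs = np.sum(lam * s**2)     # lambda_1 s_1^2 + ... + lambda_n s_n^2
```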

## Constrained maximisation/minimisation of quadratic forms

$\max\limits_{\lVert x\rVert=1}x^TAx,\quad \min\limits_{\lVert x\rVert=1}x^TAx$

• Without the norm constraint, the max is $\infty$ if $A$ has a positive eigenvalue and $0$ otherwise; the min is $-\infty$ if $A$ has a negative eigenvalue and $0$ otherwise

• $\max\limits_{x\ne0}\dfrac{x^TAx}{x^Tx}=\max\limits_{\lVert x\rVert=1}x^TAx$

Example

$\max\limits_{x^2+y^2=1}4xy$

• Rectangle of largest area inscribed within a circle

• Various methods to solve it

• How to generalise?
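The generalisation is the spectral approach of the next section; as a preview, a short NumPy check of this example. Writing $4xy$ as a quadratic form gives the matrix $\begin{bmatrix}0&2\\2&0\end{bmatrix}$, whose eigenvalues are $\pm2$:

```python
import numpy as np

# 4xy = [x y] [[0, 2], [2, 0]] [x, y]^T
A = np.array([[0.0, 2.0],
              [2.0, 0.0]])

lam, E = np.linalg.eigh(A)   # eigenvalues in ascending order: -2, 2
v = E[:, -1]                 # unit eigenvector for the largest eigenvalue
max_value = lam[-1]          # constrained maximum of 4xy on the unit circle
value_at_v = 4 * v[0] * v[1] # the form evaluated at the maximiser
```

The maximum is $2$, attained at $\pm(1/\sqrt2,1/\sqrt2)$: the inscribed rectangle of largest area is a square.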

## Spectral theorem and maximisation/minimisation

$A$: $n\times n$ real symmetric matrix

$\{e_1,\ldots,e_n\}$: orthonormal eigenvector basis

$A=\lambda_1e_1e^T_1+\cdots+\lambda_ne_ne^T_n$

$\lambda_1\ge \cdots \ge \lambda_n$

Let $x=x_1e_1+\cdots+x_ne_n$, $\lVert x\rVert^2=x^2_1+\cdots+x^2_n$

$x^TAx=\lambda_1x^2_1+\cdots+\lambda_nx^2_n$

Constrained maximum

$\max\limits_{\lVert x\rVert=1}x^TAx=\lambda_1$ at $x=e_1$

Constrained minimum

$\min\limits_{\lVert x\rVert=1}x^TAx=\lambda_n$ at $x=e_n$
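A numerical sanity check of both bounds, on an illustrative random symmetric matrix: every unit vector yields a value of $x^TAx$ in $[\lambda_n,\lambda_1]$, and the extreme eigenvectors attain the endpoints.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                   # random symmetric matrix

lam, E = np.linalg.eigh(A)          # ascending: lam[-1] = lambda_1, lam[0] = lambda_n

# Sample random unit vectors; all values of x^T A x lie in [lambda_n, lambda_1]
X = rng.standard_normal((1000, n))
X /= np.linalg.norm(X, axis=1, keepdims=True)
vals = np.einsum('ij,jk,ik->i', X, A, X)

at_e1 = E[:, -1] @ A @ E[:, -1]     # attains lambda_1
at_en = E[:, 0] @ A @ E[:, 0]       # attains lambda_n
```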

## Orthogonality constraints

$A$: $n\times n$ real symmetric matrix

$\{e_1,\ldots,e_n\}$: orthonormal eigenvector basis

$A=\lambda_1e_1e^T_1+\cdots+\lambda_ne_ne^T_n$

$\lambda_1\ge \cdots \ge \lambda_n$

Additional constraint: $\langle x,e_1\rangle=0$

$\max\limits_{\lVert x\rVert=1,\langle x,e_1\rangle=0}x^TAx$

Let $x=x_2e_2+\cdots+x_ne_n$, $\lVert x\rVert^2=x^2_2+\cdots+x^2_n$

$x^TAx=\lambda_2x^2_2+\cdots+\lambda_nx^2_n$

Constrained maximum

$\max\limits_{\lVert x\rVert=1,\langle x,e_1\rangle=0}x^TAx=\lambda_2$ at $x=e_2$
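The same sampling check works under the orthogonality constraint: projecting random vectors onto the orthogonal complement of $e_1$ before normalising. The random matrix is again illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                   # random symmetric matrix

lam, E = np.linalg.eigh(A)          # ascending order
e1 = E[:, -1]                       # eigenvector for lambda_1
lambda2 = lam[-2]

# Random unit vectors with <x, e1> = 0
X = rng.standard_normal((1000, n))
X -= np.outer(X @ e1, e1)           # remove the e1 component from every row
X /= np.linalg.norm(X, axis=1, keepdims=True)
vals = np.einsum('ij,jk,ik->i', X, A, X)

e2 = E[:, -2]
at_e2 = e2 @ A @ e2                 # attains lambda_2
```

Iterating this argument (constraining against $e_1,\ldots,e_{k-1}$) recovers each $\lambda_k$ in turn.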

## Matrix norm

$A$: $m\times n$ matrix

Norm of $A$, denoted $\lVert A\rVert$, is defined as $\lVert A\rVert=\max\limits_{\lVert x\rVert=1}\lVert Ax\rVert$

• $\lVert A\rVert=\max\limits_{x\ne0}\dfrac{\lVert Ax\rVert}{\lVert x\rVert}$

• The norm is the largest factor by which $A$ can stretch a vector's norm

• Also interesting: $\min\limits_{x\ne0}\dfrac{\lVert Ax\rVert}{\lVert x\rVert}$

## Matrix norm and quadratic forms

$\lVert A\rVert^2=\max\limits_{\lVert x\rVert=1}\lVert Ax\rVert^2$

$\lVert Ax\rVert^2=\langle Ax,Ax\rangle=x^TA^TAx$

Quadratic form defined by $A^TA$

$A^TA$: symmetric and positive semidefinite, since $x^TA^TAx=\lVert Ax\rVert^2\ge0$

Largest eigenvalue: $\lambda_{\max}(A^TA)\ge0$

$\lVert A\rVert = \sqrt{\lambda_{\max}(A^TA)}$
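This formula can be checked against NumPy's built-in spectral norm (`np.linalg.norm(A, 2)`, the largest singular value); the rectangular test matrix is illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 5))      # rectangular is fine: A^T A is 5x5 symmetric

lam = np.linalg.eigvalsh(A.T @ A)    # eigenvalues of A^T A, ascending, all >= 0
norm_from_eig = np.sqrt(lam[-1])     # sqrt of the largest eigenvalue of A^T A
norm_builtin = np.linalg.norm(A, 2)  # spectral norm computed by NumPy
```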

## Optimising smooth multivariate functions

Consider $f(x_1,\ldots,x_n)$, $x_i\in\mathbb{R}$

$f$: has continuous first and second partial derivatives

Critical points

$(x_1,\ldots,x_n)$ s.t. $\dfrac{\partial f}{\partial x_i}=0$ for all $i$

• At critical points, the function might be a local maximum, local minimum or a saddle point

Hessian matrix

$H=\begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2}&\cdots&\dfrac{\partial^2 f}{\partial x_1\partial x_n}\\ \vdots&\ddots&\vdots\\ \dfrac{\partial^2 f}{\partial x_n\partial x_1}&\cdots&\dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix}$ (symmetric)

• Local maximum at a critical point if Hessian is negative definite

• Local minimum at a critical point if Hessian is positive definite

• Saddle point if the Hessian is indefinite (has both positive and negative eigenvalues); if the Hessian is only semidefinite, the test is inconclusive
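The three cases can be decided from the Hessian's eigenvalues. A minimal sketch (the helper name `classify_critical_point` and the tolerance are illustrative, not from the notes):

```python
import numpy as np

def classify_critical_point(H, tol=1e-10):
    """Classify a critical point from the eigenvalues of its symmetric Hessian."""
    lam = np.linalg.eigvalsh(H)          # ascending eigenvalues
    if lam[0] > tol:
        return "local minimum"           # positive definite
    if lam[-1] < -tol:
        return "local maximum"           # negative definite
    if lam[0] < -tol and lam[-1] > tol:
        return "saddle point"            # indefinite
    return "inconclusive"                # some eigenvalue is (near) zero

# f(x, y) = x^2 + y^2 has Hessian 2I at its critical point (0, 0)
min_case = classify_critical_point(np.array([[2.0, 0.0], [0.0, 2.0]]))
# f(x, y) = x^2 - y^2 has Hessian diag(2, -2) at (0, 0)
saddle_case = classify_critical_point(np.array([[2.0, 0.0], [0.0, -2.0]]))
```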