### 31.3 Inner product spaces

Key points

• An inner product (31.117) is a special function that induces a rich geometry on a vector space, including length (31.161), distance (31.168) and angle (31.177).
• An inner product is used to define the symmetry (31.138)-(31.129) and positive (semi)definiteness (31.158)-(31.140) of a linear operator and thus the corresponding matrix.
• The inner product leads to the notion of orthogonality (31.183), which in turn leads to orthogonal projection or best prediction (31.206).

In this section we review basic notions of geometry that hold for inner product spaces [W] and we introduce some fundamental properties of matrices that will play a central role throughout the section.

An inner product space is a vector space (31.1)-(31.2) with an associated inner product ⟨⋅,⋅⟩. An inner product is a binary operation [W], that is, a function that takes as input any two elements of a vector space and outputs a real number, i.e.

 ⟨⋅,⋅⟩ : (v,w) ∈ L×L ↦ ⟨v,w⟩ ∈ ℝ. (31.117)

To be an inner product, (31.117) must display the following properties for any vectors v, u, w ∈ L and any scalar c ∈ ℝ:

1.
Symmetry
 ⟨v,w⟩=⟨w,v⟩; (31.118)
2.
Linearity
 ⟨c×v,w⟩=c×⟨v,w⟩; (31.119)
 ⟨v+u,w⟩=⟨v,w⟩+⟨u,w⟩; (31.120)
3.
Positive definiteness
 ⟨v,v⟩≥0 and ⟨v,v⟩=0⇔v=0. (31.121)

A commonly used inner product on the vector space ℝ^{n̄} is the dot product [W], which is defined as

 v⋅w ≡ ⟨v,w⟩_2 ≡ v'w = ∑_{n=1}^{n̄} v_n w_n, (31.122)

where vectors in ℝ^{n̄} are commonly denoted using bold notation. We want to stress that the dot product (31.122) is one specific inner product on ℝ^{n̄}, but there are in fact infinitely many inner products which can be defined on ℝ^{n̄}, and more generally on any vector space; see Section 34.3.4 for another example of an inner product. The only requirement for an inner product is that it satisfies the defining properties (31.118)-(31.119)-(31.120)-(31.121).
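The defining properties (31.118)-(31.121) are easy to spot-check numerically. The following plain-Python sketch (an illustration on sample vectors, not part of the formal development) verifies them for the dot product (31.122):

```python
# Dot product on R^n as in (31.122), in plain Python.
def dot(v, w):
    return sum(vn * wn for vn, wn in zip(v, w))

v, u, w, c = [1.0, -1.0, 0.0], [2.0, 0.0, 1.0], [1.0, -2.0, 1.0], 3.0

# Symmetry (31.118)
assert dot(v, w) == dot(w, v)
# Linearity in the first argument (31.119)-(31.120)
assert dot([c * x for x in v], w) == c * dot(v, w)
assert dot([a + b for a, b in zip(v, u)], w) == dot(v, w) + dot(u, w)
# Positive definiteness (31.121): <v, v> >= 0
assert dot(v, v) >= 0
```

Of course a finite set of sample vectors cannot prove the properties; here they follow from the algebra of the sum in (31.122).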

Example 31.23. Dot product
We continue from Example 31.12. Consider the vectors h^{spr}_{1,2} and h^{bly} in (31.29)-(31.30). Their dot product (31.122) is calculated as

 ⟨h^{spr}_{1,2}, h^{bly}⟩_2 = ∑_{n=1}^{3} [h^{spr}_{1,2}]_n × [h^{bly}]_n = 1×1 + (−1)×(−2) + 0×1 = 3. (31.123)

Note also that while we have restricted our attention to vector spaces over the real numbers for simplicity, the concepts in this section can be easily extended to vector spaces over arbitrary fields [W], for example the field of complex numbers. This means that the scalars are elements from the chosen field, rather than the real numbers.

#### 31.3.1 Symmetry

Symmetric matrices arise naturally in the context of quadratic forms and inner products. Thanks to their powerful properties, they are widely used in many branches of mathematics, see for example Section 31.5.

The transpose of an m̄×n̄ matrix a (31.67) is the n̄×m̄ matrix a' obtained by switching the rows and the columns, obtaining

 a' ≡ ⎛ a_{1,1}  a_{2,1}  ⋯  a_{m̄,1} ⎞
      ⎜ a_{1,2}  a_{2,2}  ⋯  a_{m̄,2} ⎟
      ⎜    ⋮        ⋮     ⋱     ⋮    ⎟
      ⎝ a_{1,n̄}  a_{2,n̄}  ⋯  a_{m̄,n̄} ⎠. (31.124)

In compact notation, each entry of the transpose matrix a' is determined as

 [a']_{n,m} = [a]_{m,n}, (31.125)

for all n = 1, …, n̄ and m = 1, …, m̄.

Example 31.24. Transpose
We continue from Example 31.22, where we considered the matrix a in (31.75). Applying the definition (31.125), we see that the transpose is given by

 a' = ⎛ 1  −1  0 ⎞
      ⎜ 1   1  1 ⎟
      ⎝ 1  −2  1 ⎠. (31.126)

Similar to (31.104), if a and b are two matrices such that the product ba is defined, then their product satisfies

 (ba)'=a'b'. (31.127)

Moreover, if q is an invertible matrix (31.101), then the operations of transposition and inversion commute, namely

 (q')−1=(q−1)'. (31.128)
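Properties (31.127)-(31.128) can be verified numerically on small matrices. The plain-Python sketch below (2×2 case only; the helper names are illustrative) checks both identities:

```python
# Transpose via (31.125), and numerical checks of (31.127)-(31.128).
def transpose(a):
    # rows of the transpose are the columns of a
    return [[a[m][n] for m in range(len(a))] for n in range(len(a[0]))]

def matmul(b, a):
    return [[sum(b[i][k] * a[k][j] for k in range(len(a)))
             for j in range(len(a[0]))] for i in range(len(b))]

def inv2(q):
    # inverse of an invertible 2x2 matrix via the adjugate formula
    d = q[0][0] * q[1][1] - q[0][1] * q[1][0]
    return [[q[1][1] / d, -q[0][1] / d], [-q[1][0] / d, q[0][0] / d]]

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[0.0, 1.0], [1.0, 1.0]]
q = [[2.0, 1.0], [0.0, 1.0]]

# (ba)' = a'b'   (31.127)
assert transpose(matmul(b, a)) == matmul(transpose(a), transpose(b))
# (q')^{-1} = (q^{-1})'   (31.128)
assert inv2(transpose(q)) == transpose(inv2(q))
```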

A symmetric matrix, denoted by s, is any square matrix (31.69) which is equal to its own transpose

 s=s', (31.129)

where s' denotes the transpose of s (31.124). Therefore, the entries of a symmetric matrix (31.129) are symmetric with respect to the main diagonal.

Example 31.25. Symmetric matrix
Consider the matrix

 s ≡ ⎛ 3  3 ⎞
     ⎝ 3  2 ⎠. (31.130)

The matrix s is symmetric (31.129), since

 [s]1,2=[s]2,1=3. (31.131)

We generalize the notion of the transpose of a matrix (31.124) to the field of complex numbers. The conjugate transpose of an m̄×n̄ matrix a [W] is the n̄×m̄ matrix obtained by switching the rows and the columns and taking the complex conjugate [W] of each entry, obtaining

 a^H ≡ ⎛ ā_{1,1}  ā_{2,1}  ⋯  ā_{m̄,1} ⎞
       ⎜ ā_{1,2}  ā_{2,2}  ⋯  ā_{m̄,2} ⎟
       ⎜    ⋮        ⋮     ⋱     ⋮    ⎟
       ⎝ ā_{1,n̄}  ā_{2,n̄}  ⋯  ā_{m̄,n̄} ⎠. (31.132)

In compact notation, each entry of the conjugate transpose matrix a^H is the complex conjugate of the mirrored entry of a,

 [a^H]_{n,m} = ā_{m,n}. (31.133)

Then generalizing the notion of symmetric matrix (31.129), a Hermitian matrix is any square matrix (31.69) which is equal to its own conjugate transpose (31.132)

 s=sH. (31.134)

Hermitian matrices (31.134) can be understood as the complex extension of real symmetric matrices (31.129).

Example 31.26. Hermitian matrix
Consider the matrix

 s ≡ ⎛ 3  −i ⎞
     ⎝ i   2 ⎠. (31.135)

The matrix s is Hermitian (31.134), since

 [s]_{1,2} = s̄_{2,1} = −i. (31.136)
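The Hermitian check (31.134) is mechanical with Python's built-in complex numbers; the following sketch reproduces Example 31.26:

```python
# Conjugate transpose (31.132)-(31.133) and the Hermitian check (31.134).
def conj_transpose(a):
    # entry (n, m) of a^H is the conjugate of entry (m, n) of a
    return [[a[m][n].conjugate() for m in range(len(a))]
            for n in range(len(a[0]))]

s = [[3 + 0j, -1j], [1j, 2 + 0j]]   # the matrix in (31.135)
assert conj_transpose(s) == s        # s is Hermitian (31.134)
```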

A linear transformation S : L → L (31.49)-(31.50), where L is a vector space (31.17) with inner product ⟨⋅,⋅⟩ (31.117), is symmetric if

 ⟨Sv, w⟩ = ⟨v, Sw⟩, (31.137)

for any vectors v, w ∈ L.

This definition (31.137) can be equivalently restated in terms of matrices. We know that every linear transformation between finite-dimensional vector spaces (31.17) can be represented by a suitable matrix (31.67), and that any finite-dimensional vector space (31.17) is isomorphic to ℝ^{n̄} for some n̄ (31.18). If the linear transformation S is symmetric (31.137), then, after fixing an orthonormal basis (31.199) on the domain, this property translates into the structure of the matrix that represents S. Indeed, we have

 S symmetric⇔s symmetric, (31.138)

where the matrix s represents the linear transformation S (31.67).
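The equivalence (31.138) can be illustrated numerically: with the dot product as the inner product, a symmetric matrix satisfies (31.137). The plain-Python spot check below (sample vectors only, not a proof) uses the symmetric matrix of Example 31.25:

```python
# Spot check of <Sv, w> = <v, Sw>  (31.137) for a symmetric matrix (31.138),
# with the dot product (31.122) as the inner product.
def dot(v, w):
    return sum(x * y for x, y in zip(v, w))

def apply(s, v):
    # matrix-vector product: coordinates of Sv
    return [dot(row, v) for row in s]

s = [[3.0, 3.0], [3.0, 2.0]]        # symmetric, as in (31.130)
v, w = [1.0, 2.0], [-1.0, 4.0]
assert dot(apply(s, v), w) == dot(v, apply(s, w))
```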

#### 31.3.2 Positivity

Together with symmetric matrices (31.129), positive (semi)definite matrices [W] are the cornerstones when dealing with quadratic forms and inner products. They also play a crucial role in the spectral theorem, which provides us with a geometric interpretation of the linear transformations represented by these matrices, see Section 31.5. We encounter them often in statistical applications too, since covariance matrices (2b.29) are symmetric and positive (semi)definite.

A square, symmetric matrix s² (31.129) is a positive definite matrix, denoted s² ≻ 0, if

 x's²x > 0, (31.139)

for any non-zero vector x ∈ ℝ^{n̄}.

A square, symmetric matrix s² is a positive semidefinite matrix, denoted s² ≽ 0, if it satisfies the looser condition

 x's²x ≥ 0, (31.140)

for any vector x ∈ ℝ^{n̄}. Note that we use the squared notation s² to indicate that the matrix is positive (semi)definite, because any positive (semi)definite matrix can be written as the square of a matrix, as we shall see in (31.453).

In the case where n̄ = 2, we can write any positive semidefinite matrix in the form

 s² = ⎛ s₁²    ϱs₁s₂ ⎞
      ⎝ ϱs₁s₂  s₂²   ⎠, (31.141)

where s₁, s₂ ≥ 0 and −1 ≤ ϱ ≤ 1.

A symmetric, positive (semi)definite matrix can also be characterized in terms of its eigenvalues (31.342), see Section 31.5.1 for more details.

For further properties of positive (semi)definite matrices, see 56.8, 56.9 and 56.10.

Example 31.27. Positive (semi)definite matrix
We continue from Example 31.25. The matrix s in (31.130) is not positive semidefinite (31.140), since for x ≡ (1, −1)' we have

 x'sx=−1<0. (31.142)

Let us change the off-diagonal entries of s (31.130) to define a new matrix as follows

 s² ≡ ⎛ 3   √2 ⎞
      ⎝ √2  2  ⎠. (31.143)

The matrix s² in (31.143) is symmetric (31.129) and positive semidefinite (31.140). To show that it is positive semidefinite, we consider a generic 2-dimensional vector x ≡ (x₁, x₂)' and see that (cf. 56.7)

 x's²x = 3x₁² + 2√2 x₁x₂ + 2x₂² = 3(x₁ + (√2/3)x₂)² + (4/3)x₂² ≥ 0, (31.144)

for any vector x, as required by our definition. Note that s² is in fact strictly positive definite (31.139). Indeed, equality holds in (31.144) if and only if x₂ = 0 and x₁ + (√2/3)x₂ = 0, i.e. x = 0, as required by the definition (31.139).
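The calculations of Example 31.27 can be reproduced numerically. The sketch below evaluates the quadratic forms at sample points (a spot check, not a proof of definiteness):

```python
# Numerical companion to Example 31.27: evaluate x' s x at sample vectors.
from math import sqrt

def quad_form(s, x):
    # double sum over entries, as in the quadratic form x' s x
    return sum(s[n][m] * x[n] * x[m] for n in range(2) for m in range(2))

s = [[3.0, 3.0], [3.0, 2.0]]                # (31.130): not positive semidefinite
assert quad_form(s, [1.0, -1.0]) == -1.0    # reproduces (31.142)

s2 = [[3.0, sqrt(2.0)], [sqrt(2.0), 2.0]]   # (31.143): positive definite
x = [1.0, -1.0]
lhs = quad_form(s2, x)
# completed-square identity in (31.144), checked at the sample point
rhs = 3.0 * (x[0] + sqrt(2.0) / 3.0 * x[1]) ** 2 + 4.0 / 3.0 * x[1] ** 2
assert abs(lhs - rhs) < 1e-12 and lhs > 0.0
```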

The set of positive semidefinite matrices (31.140) is a cone (33.105)-(33.106)-(33.107), denoted by S^{n̄}_+ (33.92). To support the notation s², we show in Section 31.6.6 that every positive semidefinite matrix can be written as s² = ss (31.452), where s is a symmetric matrix

 s² ∈ S^{n̄}_+ ⇔ { s² = ss and s' = s }. (31.145)

Therefore, the set of positive semidefinite matrices has dimension

 dim(S^{n̄}_+) = n̄(n̄+1)/2. (31.146)
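The count in (31.146) follows because a symmetric matrix is pinned down by its diagonal plus one triangle of off-diagonal entries. A one-line sanity check (illustrative only):

```python
# Dimension count (31.146): free entries of a symmetric n x n matrix.
def sym_dim(n):
    return n * (n + 1) // 2

# count upper-triangular positions (including the diagonal) directly
assert sym_dim(3) == sum(1 for i in range(3) for j in range(i, 3))
assert sym_dim(2) == 3
```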

We say that a square, symmetric matrix q (31.129) is a negative definite matrix, denoted q ≺ 0, if its opposite is positive definite (31.139), i.e.

 q ≺ 0 ⇔ s² ≡ −q ≻ 0. (31.147)

Similarly, we say that q is a negative semidefinite matrix, denoted q ≼ 0, if its opposite is positive semidefinite (31.140), i.e.

 q≼0⇔s2≡−q≽0. (31.148)

A negative (semi)definite matrix can be characterized in terms of its eigenvalues in a similar way to a positive (semi)definite matrix, see Section 31.5.3 for more details.

The dot product (31.122) can be generalized using a symmetric (31.129), positive definite matrix (31.139) as a scaling factor. Indeed, the Mahalanobis inner product is defined as

 ⟨v,w⟩_{s²} ≡ v'(s²)⁻¹w = (s⁻¹v)'(s⁻¹w), (31.149)

for v, w ∈ ℝ^{n̄}, where s² is a symmetric (31.129), positive definite matrix (31.139) and s is a full-rank (31.74) matrix such that ss' = s²; for more details see Section 31.6.2. In fact, on finite-dimensional inner product spaces, every inner product can be expressed in terms of a Mahalanobis inner product (31.149) on the vector coordinates, see Section 31.6 for details.

Example 31.28. Mahalanobis inner product
We continue from Example 31.23, where we considered the vectors (31.29)-(31.30). Consider the symmetric (31.129), positive definite matrix (31.139)

 s² = ⎛ 1/9  0  0 ⎞
      ⎜ 0    1  0 ⎟
      ⎝ 0    0  2 ⎠, (31.150)

which has inverse

 (s²)⁻¹ = ⎛ 9  0  0   ⎞
          ⎜ 0  1  0   ⎟
          ⎝ 0  0  1/2 ⎠. (31.151)

The Mahalanobis inner product (31.149) of h^{spr}_{1,2} and h^{bly} with scaling matrix (31.150) is calculated as

 ⟨h^{spr}_{1,2}, h^{bly}⟩_{s²} = ∑_{n=1}^{3} [h^{spr}_{1,2}]_n × [(s²)⁻¹]_{n,n} × [h^{bly}]_n = 1×9×1 + (−1)×1×(−2) + 0×(1/2)×1 = 11. (31.152)
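Since the scaling matrix (31.150) is diagonal, its inverse (31.151) is also diagonal and the Mahalanobis inner product (31.149) reduces to a weighted sum of entrywise products. A minimal plain-Python sketch of the computation in Example 31.28:

```python
# Mahalanobis inner product (31.149) with the diagonal scaling matrix (31.150).
v = [1.0, -1.0, 0.0]           # coordinates of h^spr_{1,2}
w = [1.0, -2.0, 1.0]           # coordinates of h^bly
s2_inv_diag = [9.0, 1.0, 0.5]  # diagonal of (s^2)^{-1}, see (31.151)

# v' (s^2)^{-1} w collapses to a weighted sum for a diagonal scaling matrix
maha = sum(vn * d * wn for vn, d, wn in zip(v, s2_inv_diag, w))
assert maha == 11.0            # reproduces (31.152)
```

For a non-diagonal scaling matrix the full bilinear form v'(s²)⁻¹w must be computed instead.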

The quadratic form of a positive definite matrix s² (31.139) is defined as

 f(x) ≡ x's²x = ∑_{n,m=1}^{n̄} [s²]_{n,m} x_n x_m, (31.153)

for x ∈ ℝ^{n̄}. Note that a quadratic form can be defined for any square matrix (31.69), even though for our applications we mostly focus on quadratic forms of positive definite matrices (31.153).

The quadratic form (31.153) defines a paraboloid with a unique global minimum (33.1). This is the basis of numerous applications: with slight modifications it yields the pdf of the multivariate normal distribution (7c.4) and of all elliptical distributions (7c.35), and it underlies quadratic programming (33.72), mean-variance allocation (46a.9), linear regression (8.72), and more.

Example 31.29. Quadratic form of a positive definite matrix

We continue from Example 31.27, where we showed that the symmetric matrix (31.143) is positive definite (31.139). The left plot in Figure 31.5 displays the surface defined by the quadratic form (31.153) associated with (31.144). Note that the surface is a paraboloid with a unique local minimum (33.20).

The iso-contours [W] of the paraboloid (31.153) for a positive definite matrix (31.139)

 Is2(γ)≡{x:x's