### 14.3 Inner product spaces

Key points

• An inner product (14.117) is a special function that induces a rich geometry on a vector space, including length (14.160), distance (14.167) and angle (14.176).
• An inner product is used to define the symmetry (14.138)-(14.129) and positive (semi)definiteness (14.157)-(14.140) of a linear operator and thus the corresponding matrix.
• The inner product leads to the notion of orthogonality (14.182), which in turn leads to orthogonal projection or best prediction (14.205).

In this section we review basic notions of geometry that hold for inner product spaces [W] and we introduce some fundamental properties of matrices that will play a central role throughout the section.

An inner product space is a vector space (14.1)-(14.2) with an associated inner product ⟨·,·⟩. An inner product is a binary operation [W], that is, a function that takes as input any two elements of a vector space L and outputs a real number, i.e.

 ⟨⋅,⋅⟩:{v,w}∈L×L↦⟨v,w⟩∈R. (14.117)

To be an inner product, (14.117) must satisfy the following properties for any vectors v, u, w ∈ L and any scalar c ∈ ℝ:

1.
Symmetry
 ⟨v,w⟩=⟨w,v⟩; (14.118)
2.
Linearity
 ⟨c×v,w⟩=c×⟨v,w⟩; (14.119)
 ⟨v+u,w⟩=⟨v,w⟩+⟨u,w⟩; (14.120)
3.
Positive definiteness
 ⟨v,v⟩≥0 and ⟨v,v⟩=0⇔v=0. (14.121)

A commonly used inner product on the vector space ℝ^{n̄} is the dot product [W], which is defined as

 v⋅w ≡ ⟨v,w⟩₂ ≡ v'w = \sum_{n=1}^{\bar{n}} v_n w_n, (14.122)

where vectors in ℝ^{n̄} are commonly denoted using the bold notation v. We want to stress that the dot product (14.122) is one specific inner product on ℝ^{n̄}, but there are in fact infinitely many inner products which can be defined on ℝ^{n̄}, and more generally on any vector space; see Section 17.3.4 for a further example of an inner product. The only requirement for an inner product is that it must satisfy the defining properties (14.118)-(14.119)-(14.120)-(14.121).

Example 14.23. Dot product
We continue from Example 14.12. Consider the vectors h^{spr}_{1,2} and h^{bly} in (14.29)-(14.30). Their dot product (14.122) is calculated as

 ⟨h^{spr}_{1,2}, h^{bly}⟩₂ = \sum_{n=1}^{3} [h^{spr}_{1,2}]_n × [h^{bly}]_n = 1×1 + (−1)×(−2) + 0×1 = 3. (14.123)
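The computation in (14.123) can be checked numerically; a minimal sketch with numpy, where the array names are our own shorthand for the vectors of the example:

```python
import numpy as np

# Vectors from Example 14.12, (14.29)-(14.30) (names are illustrative).
h_spr = np.array([1.0, -1.0, 0.0])   # h^{spr}_{1,2}
h_bly = np.array([1.0, -2.0, 1.0])   # h^{bly}

# Dot product (14.122): sum of the element-wise products.
dot = h_spr @ h_bly                  # 1*1 + (-1)*(-2) + 0*1 = 3.0
print(dot)
```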

Note also that while we have restricted our attention to vector spaces over the real numbers for simplicity, the concepts in this section can be easily extended to vector spaces over arbitrary fields [W], for example the field of complex numbers. This means that the scalars are elements from the chosen field, rather than the real numbers.

#### 14.3.1 Symmetry

Symmetric matrices arise naturally in the context of quadratic forms and inner products. Thanks to their powerful properties, they are widely used in many branches of mathematics, see for example Section 14.5.

The transpose of an m̄ × n̄ matrix a (14.67) is the n̄ × m̄ matrix a' obtained by switching the rows and the columns, obtaining

 a' ≡ \begin{pmatrix} a_{1,1} & a_{2,1} & \cdots & a_{\bar{m},1} \\ a_{1,2} & a_{2,2} & \cdots & a_{\bar{m},2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1,\bar{n}} & a_{2,\bar{n}} & \cdots & a_{\bar{m},\bar{n}} \end{pmatrix}. (14.124)

In compact notation, each entry of the transpose matrix a' is determined as

 [a']_{n,m} = [a]_{m,n}, (14.125)

for all n = 1, …, n̄ and m = 1, …, m̄.

Example 14.24. Transpose
We continue from Example 14.22, where we considered the matrix a (14.75). Applying the definition (14.125), we see that the transpose is given by

 a' = \begin{pmatrix} 1 & -1 & 0 \\ 1 & 1 & 1 \\ 1 & -2 & 1 \end{pmatrix}. (14.126)

Similar to (14.104), if b and a are two conformable matrices, then their product satisfies

 (ba)' = a'b'. (14.127)

Moreover, if q is an invertible matrix (14.101), then the operations of transposition and inversion commute, namely

 (q')^{−1} = (q^{−1})'. (14.128)
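Both identities (14.127)-(14.128) can be verified numerically on random matrices; a quick sketch with numpy, with dimensions chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(0)
b = rng.standard_normal((3, 4))   # b is 3x4 and a is 4x2, so ba is defined
a = rng.standard_normal((4, 2))
q = rng.standard_normal((3, 3)) + 3.0 * np.eye(3)   # square and (generically) invertible

# (ba)' = a'b'  (14.127): transposition reverses the order of the product.
ok_product = np.allclose((b @ a).T, a.T @ b.T)

# (q')^{-1} = (q^{-1})'  (14.128): transposition and inversion commute.
ok_inverse = np.allclose(np.linalg.inv(q.T), np.linalg.inv(q).T)
```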

A symmetric matrix, denoted by s, is any square matrix (14.69) which is equal to its own transpose

 s = s', (14.129)

where s' denotes the transpose (14.124) of s. Therefore, the entries of a symmetric matrix (14.129) are symmetric with respect to the main diagonal.

Example 14.25. Symmetric matrix
Consider the matrix

 s ≡ \begin{pmatrix} 3 & 3 \\ 3 & 2 \end{pmatrix}. (14.130)

The matrix s is symmetric (14.129), since

 [s]_{1,2} = [s]_{2,1} = 3. (14.131)

We now generalize the notion of the transpose of a matrix (14.124) to the field of complex numbers. The conjugate transpose of an m̄ × n̄ matrix a [W] is the n̄ × m̄ matrix a^H obtained by switching the rows and the columns and taking the conjugate [W] of each entry, obtaining

 a^H ≡ \begin{pmatrix} \bar{a}_{1,1} & \bar{a}_{2,1} & \cdots & \bar{a}_{\bar{m},1} \\ \bar{a}_{1,2} & \bar{a}_{2,2} & \cdots & \bar{a}_{\bar{m},2} \\ \vdots & \vdots & \ddots & \vdots \\ \bar{a}_{1,\bar{n}} & \bar{a}_{2,\bar{n}} & \cdots & \bar{a}_{\bar{m},\bar{n}} \end{pmatrix}. (14.132)

In compact notation, each entry of the conjugate transpose matrix a^H is determined as

 [a^H]_{n,m} = \overline{[a]_{m,n}}. (14.133)

Then generalizing the notion of symmetric matrix (14.129), a Hermitian matrix is any square matrix (14.69) which is equal to its own conjugate transpose (14.132)

 s=sH. (14.134)

Hermitian matrices (14.134) can be understood as the complex extension of real symmetric matrices (14.129).

Example 14.26. Hermitian matrix
Consider the matrix

 s ≡ \begin{pmatrix} 3 & -i \\ i & 2 \end{pmatrix}. (14.135)

The matrix s is Hermitian (14.134), since

 [s]_{1,2} = \overline{[s]_{2,1}} = −i. (14.136)
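The Hermitian property (14.134) of the matrix (14.135) can be checked directly with numpy's complex arrays; a minimal sketch:

```python
import numpy as np

# The matrix s of (14.135).
s = np.array([[3, -1j],
              [1j, 2]])

# Hermitian (14.134): s equals its own conjugate transpose s^H.
is_hermitian = np.allclose(s, s.conj().T)

# A classical consequence: the eigenvalues of a Hermitian matrix are real.
eigvals = np.linalg.eigvalsh(s)
```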

A linear transformation S (14.49)-(14.50), where L is a vector space (14.17) with inner product ⟨·,·⟩ (14.117), is symmetric if

 ⟨Sv, w⟩ = ⟨v, Sw⟩, (14.137)

for any vectors v, w ∈ L.

This definition (14.137) can be equivalently restated in terms of matrices. We know that every linear transformation between finite-dimensional vector spaces (14.17) can be represented by a suitable matrix (14.67) and that any finite-dimensional vector space (14.17) is isomorphic to ℝ^{n̄} for some n̄ (14.18). If the linear transformation S is symmetric (14.137), then after fixing an orthonormal basis (14.198) on the domain L, this property translates into the structure of the matrix s that represents S. Indeed, we have

 S symmetric ⇔ s symmetric, (14.138)

where the matrix s represents the linear transformation S (14.67).
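The equivalence (14.138) can be illustrated numerically with the dot product; a sketch on a randomly generated symmetric matrix, where the seed and dimension are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal((4, 4))
s = (a + a.T) / 2.0                  # symmetrize, so that s = s' (14.129)

v = rng.standard_normal(4)
w = rng.standard_normal(4)

# With the dot product, <Sv, w> = <v, Sw> reduces to (sv)'w = v'(sw),
# which holds precisely because s = s' (14.138).
symmetric_ok = np.isclose((s @ v) @ w, v @ (s @ w))
```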

#### 14.3.2 Positivity

Together with symmetric matrices (14.129), positive (semi)definite matrices [W] are the cornerstones when dealing with quadratic forms and inner products. They also play a crucial role in the spectral theorem, which provides us with a geometric interpretation of the linear transformations represented by these matrices, see Section 14.5. We encounter them often in statistical applications too, since covariance matrices (23.32) are symmetric and positive (semi)definite.

A square, symmetric matrix s² (14.129) is a positive definite matrix, denoted s² ≻ 0, if

 x's²x > 0, (14.139)

for any non-zero vector x ∈ ℝ^{n̄}.

A square, symmetric matrix s² is a positive semidefinite matrix, denoted s² ≽ 0, if it satisfies the looser condition

 x's²x ≥ 0, (14.140)

for any vector x ∈ ℝ^{n̄}. Note that we use the squared notation s² to indicate that the matrix is positive (semi)definite, because any positive (semi)definite matrix can be written as the square of a matrix, as we shall see in (14.452).

A symmetric, positive (semi)definite matrix can also be characterized in terms of its eigenvalues (14.341), see Section 14.5.1 for more details.

For further properties of positive (semi)definite matrices, see E.14.8, E.14.9 and E.14.10.

Example 14.27. Positive (semi)definite matrix
We continue from Example 14.25. The matrix s (14.130) is not positive semidefinite (14.140), since for x ≡ (1, −1)' we have

 x'sx = −1 < 0. (14.141)

Let us change the off-diagonal entries of s (14.130) to define a new matrix s² as follows

 s² ≡ \begin{pmatrix} 3 & \sqrt{2} \\ \sqrt{2} & 2 \end{pmatrix}. (14.142)

The matrix s² (14.142) is symmetric (14.129) and positive semidefinite (14.140). To show that it is positive semidefinite (14.140), we consider a generic 2-dimensional vector x ≡ (x₁, x₂)' and see that (E.14.7)

 x's²x = (x_1, x_2) \begin{pmatrix} 3 & \sqrt{2} \\ \sqrt{2} & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 3\left(x_1 + \frac{\sqrt{2}}{3}x_2\right)^2 + \frac{4}{3}x_2^2 ≥ 0, (14.143)

for any vector x, as required by our definition. Note that s² is in fact strictly positive definite (14.139). Indeed, equality holds in (14.143) if and only if x₂ = 0 and x₁ = 0, i.e. x = 0, as required by the definition (14.139).
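Both claims of this example can be checked numerically; a sketch using the standard eigenvalue criterion for definiteness (all eigenvalues of a symmetric matrix positive), which the text develops in Section 14.5.1:

```python
import numpy as np

s = np.array([[3.0, 3.0],
              [3.0, 2.0]])                 # matrix (14.130)
s2 = np.array([[3.0, np.sqrt(2.0)],
               [np.sqrt(2.0), 2.0]])       # matrix (14.142)

# s is not positive semidefinite: the quadratic form is negative at x = (1, -1)'.
x = np.array([1.0, -1.0])
qf = x @ s @ x                             # 3 - 3 - 3 + 2 = -1

# s2 is positive definite: all eigenvalues of the symmetric matrix are positive.
pos_def = bool(np.all(np.linalg.eigvalsh(s2) > 0))
```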

The set of positive semidefinite matrices (14.140) is a cone (16.105)-(16.106)-(16.107), denoted by S^{n̄}_+ (16.92). To support the notation s², we show in Section 14.6.6 that every positive semidefinite matrix can be written as s² = ss (14.451), where s is a symmetric matrix

 s² ∈ S^{n̄}_+ ⇔ s² = ss with s' = s. (14.144)

Therefore the set of positive semidefinite matrices has dimension

 dim(S^{n̄}_+) = n̄(n̄+1)/2. (14.145)
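The dimension count (14.145) simply reflects the free entries of a symmetric matrix: the diagonal plus one triangle. A small sketch, with the helper name `dim_sym` being our own:

```python
def dim_sym(n):
    # A symmetric n x n matrix is pinned down by its upper triangle,
    # diagonal included: n + (n-1) + ... + 1 = n(n+1)/2 free entries (14.145).
    return n * (n + 1) // 2

print(dim_sym(2), dim_sym(3))  # 3 6
```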

We say that a square, symmetric matrix q (14.129) is a negative definite matrix, denoted q ≺ 0, if its opposite is positive definite (14.139), i.e.

 q ≺ 0 ⇔ s² ≡ −q ≻ 0. (14.146)

Similarly, we say that q is a negative semidefinite matrix, denoted q ≼ 0, if its opposite is positive semidefinite (14.140), i.e.

 q ≼ 0 ⇔ s² ≡ −q ≽ 0. (14.147)

A negative (semi)definite matrix can be characterized in terms of its eigenvalues in a similar way to a positive (semi)definite matrix, see Section 14.5.3 for more details.

The dot product (14.122) can be generalized using a symmetric (14.129), positive definite matrix (14.139) as a scaling factor. Indeed, the Mahalanobis inner product is defined as

 ⟨v,w⟩_{s²} ≡ v'(s²)^{−1}w = (s^{−1}v)'(s^{−1}w), (14.148)

for v, w ∈ ℝ^{n̄}, where s² is a symmetric (14.129), positive definite matrix (14.139) and s is a full-rank (14.74) matrix such that ss' = s²; for more details see Section 14.6.2. In fact, on finite-dimensional inner product spaces, every inner product can be expressed in terms of a Mahalanobis inner product (14.148) on the vector coordinates, see Section 14.6 for details.

Example 14.28. Mahalanobis inner product
We continue from Example 14.23, where we considered the vectors (14.29)-(14.30). Consider the symmetric (14.129), positive definite matrix (14.139)

 s² = \begin{pmatrix} 1/9 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}, (14.149)

which has inverse

 (s²)^{−1} = \begin{pmatrix} 9 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1/2 \end{pmatrix}. (14.150)

The Mahalanobis inner product (14.148) of h^{spr}_{1,2} and h^{bly} with scaling matrix s² (14.149) is calculated as

 ⟨h^{spr}_{1,2}, h^{bly}⟩_{s²} = \sum_{n=1}^{3} [h^{spr}_{1,2}]_n × [(s²)^{−1}]_{n,n} × [h^{bly}]_n = 1×9×1 + (−1)×1×(−2) + 0×(1/2)×1 = 11. (14.151)
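The computation (14.151) can be replicated with numpy; a minimal sketch, with the same illustrative array names as before:

```python
import numpy as np

# Vectors from (14.29)-(14.30) (names are illustrative).
h_spr = np.array([1.0, -1.0, 0.0])   # h^{spr}_{1,2}
h_bly = np.array([1.0, -2.0, 1.0])   # h^{bly}

s2 = np.diag([1.0 / 9.0, 1.0, 2.0])  # scaling matrix (14.149)

# Mahalanobis inner product (14.148): v'(s^2)^{-1} w.
mahal = h_spr @ np.linalg.inv(s2) @ h_bly   # 1*9*1 + (-1)*1*(-2) + 0*(1/2)*1 = 11
```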

The quadratic form of a positive definite matrix (14.139) is defined as

 f(x) ≡ x's²x = \sum_{n,m=1}^{\bar{n}} [s²]_{n,m} x_n x_m, (14.152)

for x ∈ ℝ^{n̄}. Note that a quadratic form can be defined for any square matrix (14.69), even though for our applications we mostly focus on quadratic forms of positive definite matrices (14.152).

The quadratic form (14.152) defines a paraboloid with a unique global minimum (16.1). This is the basis of numerous applications: with slight modifications it yields the pdf of the multivariate normal distribution (20.96) and of all elliptical distributions (20.247); it also underlies quadratic programming (16.72), mean-variance allocation (9a.9), linear regression (28.56), and more.

Example 14.29. Quadratic form of a positive definite matrix

We continue from Example 14.27, where we showed that the symmetric matrix (14.142) is positive definite (14.139). The left plot in Figure 14.5 displays the surface defined by the quadratic form (14.152) associated with (14.143). Note that the surface is a paraboloid with a unique local minimum (16.20).

The iso-contours [W] of the paraboloid (14.152) for a positive definite matrix (14.139)

 I_{s²}(γ) ≡ {x : x's²x = γ}, (14.153)

for any γ > 0, are ellipsoids. We explain why this is the case in Section 23.2.4, and in Example 23.15 we illustrate how the properties of the positive definite matrix, in particular its eigenvectors and eigenvalues (14.279), correspond to the properties of the associated ellipsoid.

Example 14.30. Iso-contours of the quadratic form of a positive definite matrix
We continue from Example 14.29. The right plot of Figure 14.5 displays the iso-contours (14.153) of the quadratic form associated with the positive definite (14.139) matrix (14.143) for several values of γ. Note that the iso-contours are indeed ellipsoids.
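Points on such an iso-contour can be generated explicitly; a sketch that parametrizes the ellipse through a Cholesky factorization (one standard construction, not the only one) and verifies that every generated point satisfies (14.153):

```python
import numpy as np

s2 = np.array([[3.0, np.sqrt(2.0)],
               [np.sqrt(2.0), 2.0]])        # positive definite matrix (14.142)
gamma = 1.0

# With the Cholesky factorization s2 = L L', the points
# x = sqrt(gamma) * (L')^{-1} u, for unit vectors u, satisfy
# x' s2 x = gamma * u' L^{-1} (L L') (L')^{-1} u = gamma * u'u = gamma.
L = np.linalg.cholesky(s2)
thetas = np.linspace(0.0, 2.0 * np.pi, 200)
us = np.column_stack([np.cos(thetas), np.sin(thetas)])   # unit vectors
xs = np.sqrt(gamma) * np.linalg.solve(L.T, us.T).T

qf = np.einsum('ij,jk,ik->i', xs, s2, xs)   # x's2x for every point
on_contour = bool(np.allclose(qf, gamma))
```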

Example 14.31. Chebyshev’s inequality
In Example 23.25, Figure 23.6 displays in red the location-dispersion ellipsoid (23.73), with center and shape determined respectively by the expectation and covariance (which is positive (semi)definite) of the bivariate normal random variable (23.129). This visual representation allows us to quickly notice the features of the distribution, and fits well with the simulated realizations, which are displayed as gray dots. Notice how the location-dispersion ellipsoid (23.73) is defined using a quadratic form (14.152) of the (inverse) covariance matrix.

Let us now consider a symmetric linear transformation S² (14.137), where L is a finite-dimensional vector space (14.17), and let us denote by ⟨·,·⟩ the inner product (14.117). The linear transformation S² is positive definite if

 ⟨S²v, v⟩ > 0, (14.154)

for any non-zero vector v ∈ L.

Similarly, the symmetric linear transformation S² is positive semidefinite if it satisfies the looser condition

 ⟨S²v, v⟩ ≥ 0, (14.155)

for any vector v ∈ L. These two definitions (14.154)-(14.155) can also be equivalently restated in terms of matrices. Bearing in mind what was done for symmetric linear transformations (14.137) in Section 14.3.1, we have

 S² positive definite ⇔ s² positive definite (14.156)

and

 S² positive semidefinite ⇔ s² positive semidefinite, (14.157)

where the matrix s² represents the linear transformation S² (14.67).

In an inner product space L, any linear operator (14.49)-(14.50) which has a one-dimensional range can be represented uniquely by a vector using the inner product ⟨·,·⟩, a result known as the musical isomorphism [W].

Indeed, for any fixed vector b ∈ L, consider the real-valued “flat” linear operator b♭ induced by the inner product

 v↦b♭[v]≡