### 14.3 Inner product spaces

Key points

• An inner product (14.116) is a special function that induces a rich geometry on a vector space (14.1)-(14.2), including length (14.159), distance (14.166) and angle (14.175).
• An inner product is used to define the symmetry (14.137)-(14.128) and positive (semi)definiteness (14.156)-(14.139) of a linear operator and thus the corresponding matrix.
• The inner product leads to the notion of orthogonality (14.181), which in turn leads to orthogonal projection or best prediction (14.204).

In this section we review basic notions of geometry that hold for inner product spaces [W] and we introduce some fundamental properties of matrices that will play a central role throughout the section.

An inner product space is a vector space (14.1)-(14.2) with an associated inner product ⟨⋅,⋅⟩. An inner product is any function that takes as input two elements of the vector space and outputs a real number, i.e.

 ⟨⋅,⋅⟩:(v,w)∈V×V↦⟨v,w⟩∈R. (14.116)

To be an inner product, (14.116) must display the following properties for any vectors v,w,u∈V and any scalar c∈R:

1. Symmetry
 ⟨v,w⟩=⟨w,v⟩; (14.117)
2. Linearity
 ⟨c×v,w⟩=c×⟨v,w⟩; (14.118)
 ⟨v+u,w⟩=⟨v,w⟩+⟨u,w⟩; (14.119)
3. Positive definiteness
 ⟨v,v⟩≥0 and ⟨v,v⟩=0⇔v=0. (14.120)
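As a quick sanity check, the three defining properties (14.117)-(14.120) can be verified numerically for a concrete inner product such as the dot product; a minimal sketch using NumPy, where the sample vectors, the scalar and the seed are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
v, w, u = rng.normal(size=(3, 4))  # three arbitrary vectors in R^4
c = 2.5                            # an arbitrary scalar

def dot(x, y):
    return float(x @ y)

# Symmetry (14.117)
assert np.isclose(dot(v, w), dot(w, v))
# Linearity (14.118)-(14.119)
assert np.isclose(dot(c * v, w), c * dot(v, w))
assert np.isclose(dot(v + u, w), dot(v, w) + dot(u, w))
# Positive definiteness (14.120)
assert dot(v, v) > 0 and dot(np.zeros(4), np.zeros(4)) == 0.0
```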

A commonly used inner product on the vector space R^n̄ is the dot product [W], which is defined as

 v⋅w ≡ ⟨v,w⟩₂ ≡ v'w = ∑_{n=1}^{n̄} v_n w_n, (14.121)

where recall that vectors in R^n̄ are commonly denoted using bold notation. We want to stress that the dot product (14.121) is one specific inner product on R^n̄, but there are in fact infinitely many inner products which can be defined on R^n̄, and more generally on any vector space; see Section 16.3.5 for an example of an inner product on a different space. The only requirement for an inner product is that it needs to satisfy (14.117)-(14.118)-(14.119)-(14.120).

Example 14.23. Dot product.
We continue from Example 14.12. Consider the vectors (14.25)-(14.26). The dot product (14.121) of h^spr_{1,2} and h^bly is calculated as

 ⟨h^spr_{1,2},h^bly⟩₂ = ∑_{n=1}^{3}([h^spr_{1,2}]_n×[h^bly]_n) = 1×1+(−1)×(−2)+0×1 = 3. (14.122)
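The worked calculation above can be reproduced numerically; the entries below are the vector values used in the example, and NumPy's `@` operator implements the dot product (14.121):

```python
import numpy as np

# entries of h^spr_{1,2} and h^bly as used in the worked example
h_spr = np.array([1.0, -1.0, 0.0])
h_bly = np.array([1.0, -2.0, 1.0])

dot = float(h_spr @ h_bly)  # 1*1 + (-1)*(-2) + 0*1
assert dot == 3.0
```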

Note also that while we have restricted our attention to vector spaces over the real numbers for simplicity, the concepts in this section can be easily extended to vector spaces over arbitrary fields [W], for example the field of complex numbers. This means that the scalars are elements from the chosen field, rather than the real numbers.

#### 14.3.1 Symmetry

Symmetric matrices arise naturally in the context of quadratic forms and inner products. Thanks to their powerful properties, they are widely used in many branches of mathematics, see for example Section 14.5.

The transpose a' of an m̄×n̄ matrix a (14.67) is the n̄×m̄ matrix obtained by switching the rows and the columns, obtaining

 a' ≡ ⎛ a_{1,1}  a_{2,1}  ⋯  a_{m̄,1} ⎞
      ⎜ a_{1,2}  a_{2,2}  ⋯  a_{m̄,2} ⎟
      ⎜    ⋮        ⋮     ⋱     ⋮    ⎟
      ⎝ a_{1,n̄}  a_{2,n̄}  ⋯  a_{m̄,n̄} ⎠. (14.123)

In compact notation, each entry of the transpose matrix a' is determined as

 [a']_{n,m} = [a]_{m,n}, (14.124)

for all n=1,…,n̄ and m=1,…,m̄.

Example 14.24. Transpose.
We continue from Example 14.22, where we considered the matrix a (14.74). Applying the definition (14.124), we see that the transpose a' is given by

 a' = ⎛ 1 −1 0 ⎞
      ⎜ 1  1 1 ⎟
      ⎝ 1 −2 1 ⎠. (14.125)

Similar to (14.103), if b and a are two conformable matrices, then their product satisfies

 (ba)'=a'b'. (14.126)

Moreover, if a is an invertible matrix (14.100), then the operations of transposition and inversion commute, namely

 (a')−1=(a−1)'. (14.127)
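Properties (14.126)-(14.127) can be spot-checked numerically; a sketch using NumPy, where the matrices are arbitrary random samples (a Gaussian random matrix is invertible with probability one):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=(3, 3))  # arbitrary square matrices
b = rng.normal(size=(3, 3))

# (ba)' = a'b'  (14.126)
assert np.allclose((b @ a).T, a.T @ b.T)
# (a')^{-1} = (a^{-1})'  (14.127), assuming a is invertible
assert np.allclose(np.linalg.inv(a.T), np.linalg.inv(a).T)
```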

A symmetric matrix is any square matrix (14.67) which is equal to its own transpose

 a = a', (14.128)

where a' denotes the transpose of a (14.123).

Example 14.25. Symmetric matrix.
Consider the matrix

 a ≡ ⎛ 3 3 ⎞
     ⎝ 3 2 ⎠. (14.129)

The matrix a is symmetric (14.128), since

 [a]1,2=[a]2,1=3. (14.130)

Generalizing (14.123), the conjugate transpose a^H of an m̄×n̄ matrix a [W] is the n̄×m̄ matrix obtained by switching the rows and the columns and taking the complex conjugate [W] of each entry, obtaining

 a^H ≡ ⎛ ā_{1,1}  ā_{2,1}  ⋯  ā_{m̄,1} ⎞
       ⎜ ā_{1,2}  ā_{2,2}  ⋯  ā_{m̄,2} ⎟
       ⎜    ⋮        ⋮     ⋱     ⋮    ⎟
       ⎝ ā_{1,n̄}  ā_{2,n̄}  ⋯  ā_{m̄,n̄} ⎠. (14.131)

In compact notation, each entry of the conjugate transpose matrix a^H is determined as

 [a^H]_{n,m} = ā_{m,n}, (14.132)

where ā_{m,n} denotes the complex conjugate of [a]_{m,n}.

Then generalizing (14.128), a Hermitian matrix is any square matrix (14.67) which is equal to its own conjugate transpose

 a = a^H, (14.133)

where a^H denotes the conjugate transpose of a (14.131).

Example 14.26. Hermitian matrix.
Consider the matrix

 b ≡ ⎛ 3 −i ⎞
     ⎝ i  2 ⎠. (14.134)

The matrix b is Hermitian (14.133), since

 [b]_{1,2} = b̄_{2,1} = −i. (14.135)
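Both example matrices can be checked programmatically: symmetry compares a with its transpose, while the Hermitian property compares b with its conjugate transpose (a sketch using NumPy):

```python
import numpy as np

a = np.array([[3.0, 3.0],
              [3.0, 2.0]])        # the symmetric matrix (14.129)
b = np.array([[3.0, -1j],
              [1j, 2.0]])         # the Hermitian matrix (14.134)

assert np.array_equal(a, a.T)          # a = a'   (14.128)
assert np.array_equal(b, b.conj().T)   # b = b^H  (14.133)
```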

Let us now consider a linear transformation A:V→V (14.49)-(14.50), where V is a finite-dimensional vector space (14.17), and let us denote with ⟨⋅,⋅⟩ the inner product (14.116). The linear transformation A is symmetric if

 ⟨Av,w⟩=⟨v,Aw⟩, (14.136)

for any vectors v,w∈V. Property (14.136) can be equivalently restated in terms of matrices. We know that every linear transformation between finite-dimensional vector spaces (14.17) can be represented by a suitable matrix (14.67) and that any finite-dimensional vector space (14.17) is isomorphic to R^n̄ for some n̄ (14.18). If the linear transformation A is symmetric (14.136), then after fixing an orthonormal basis (14.199) on the domain V, this property translates to the structure of the matrix a that represents A. Indeed, we have

 A symmetric⇔a symmetric, (14.137)

where the matrix a represents the linear transformation A (14.67).

#### 14.3.2 Positivity

Together with symmetric matrices (14.128), positive (semi)definite matrices [W] are the cornerstones when dealing with quadratic forms and inner products. They also play a crucial role in the spectral theorem, which provides us with a geometric interpretation of the linear transformations represented by these matrices, see Section 14.5. We encounter them often in statistical applications too, since covariance matrices (21.33) are symmetric and positive (semi)definite.

A square, symmetric matrix σ² (14.128) is a positive definite matrix, denoted σ²≻0, if

 x'σ²x > 0, (14.138)

for any non-zero vector x∈R^n̄.

A square, symmetric matrix σ² is a positive semidefinite matrix, denoted σ²≽0, if it satisfies the looser condition

 x'σ²x ≥ 0, (14.139)

for any vector x∈R^n̄. Note that we use the squared notation σ² to indicate that the matrix is positive (semi)definite, because any positive (semi)definite matrix can be written as the square of a matrix, as we shall see in (14.440).

A symmetric, positive (semi)definite matrix can also be characterized in terms of its eigenvalues (14.272)-(14.273), see Section 14.5.1 for more details.

For further properties of positive (semi)definite matrices, see E.14.8 , E.14.9 and E.14.10 .

The set of positive semidefinite matrices (14.139) is a cone (17.52)-(17.53)-(17.54) denoted by S^n̄_+ (17.60). To support the notation S^n̄_+, we show in (14.439) that every positive semidefinite matrix σ² can be written as σ² = σσ, where σ is a symmetric matrix

 σ² ∈ S^n̄_+ ⇔ σ² = σσ with σ' = σ. (14.140)

Therefore the set of positive semidefinite matrices, which is a cone denoted by S^n̄_+, has dimension equal to the number of free entries of a symmetric matrix, namely

 dim(S^n̄_+) = n̄(n̄+1)/2. (14.141)
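The dimension count (14.141) reflects the fact that a symmetric matrix is pinned down by its entries on and above the diagonal; a quick numerical check for an arbitrary sample dimension n̄ = 4:

```python
import numpy as np

n = 4  # arbitrary sample dimension
# a symmetric n x n matrix is determined by its upper triangle (incl. diagonal)
free_entries = len(np.triu_indices(n)[0])
assert free_entries == n * (n + 1) // 2  # = 10 for n = 4
```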

We say that a square, symmetric matrix a (14.128) is a negative definite matrix, denoted a≺0, if its opposite is positive definite (14.138), i.e.

 a≺0 ⇔ σ² ≡ −a ≻ 0. (14.142)

Similarly, we say that a is a negative semidefinite matrix, denoted a≼0, if its opposite is positive semidefinite (14.139), i.e.

 a≼0 ⇔ σ² ≡ −a ≽ 0. (14.143)

A negative (semi)definite matrix can be characterized in terms of its eigenvalues in a similar way to a positive (semi)definite matrix, see Section 14.5.1 for more details.

Example 14.27. Positive (semi)definite matrix.
We continue from Example 14.25. The matrix a (14.129) is not positive semidefinite (14.139), since for x ≡ (1,−1)' we have

 x'ax = −1 < 0. (14.144)

Let us change the off-diagonal entries of (14.129) to define a new matrix σ² as follows

 σ² ≡ ⎛ 3  √2 ⎞
      ⎝ √2  2 ⎠. (14.145)

The matrix σ² (14.145) is symmetric (14.128) and positive semidefinite (14.139). To show that it is positive semidefinite (14.139), we consider a generic 2-dimensional vector x ≡ (x₁,x₂)' and see that E.14.7

 x'σ²x = 3x₁² + 2√2x₁x₂ + 2x₂² = 3(x₁+(√2/3)x₂)² + (4/3)x₂² ≥ 0, (14.146)

for any vector x, as required by our definition. Note that σ² is in fact strictly positive definite (14.138). Indeed, the equality in (14.146) holds if and only if x₁ = 0 and x₂ = 0, i.e. x = 0, as required by the definition (14.138).
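Both conclusions of this example can be verified numerically: the indefiniteness of (14.129) via the witness vector x = (1,−1)', and the positive definiteness of (14.145) via its eigenvalues, anticipating the eigenvalue characterization of Section 14.5.1 (a sketch using NumPy):

```python
import numpy as np

a = np.array([[3.0, 3.0],
              [3.0, 2.0]])        # (14.129)
x = np.array([1.0, -1.0])
assert x @ a @ x == -1.0          # (14.144): a is not positive semidefinite

sigma2 = np.array([[3.0, np.sqrt(2.0)],
                   [np.sqrt(2.0), 2.0]])   # (14.145)
# all eigenvalues strictly positive <=> positive definite (Section 14.5.1)
assert np.all(np.linalg.eigvalsh(sigma2) > 0)
```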

The dot product (14.121) can be generalized using a symmetric (14.128), positive definite matrix (14.138) as a scaling factor. Indeed, the Mahalanobis inner product is defined as

 ⟨v,w⟩_{s²} ≡ v'(s²)⁻¹w = (s⁻¹v)'(s⁻¹w), (14.147)

for v,w∈R^n̄, where s² is a symmetric (14.128), positive definite matrix (14.138) and s is a full-rank (14.73) matrix such that s² = ss'; for more details see Section 14.6.2. In fact, on finite-dimensional inner product spaces, every inner product can be expressed in terms of a Mahalanobis inner product (14.147) on the vector coordinates, see Section 14.6 for details.

Example 14.28. Mahalanobis inner product.
We continue from Example 14.23, where we considered the vectors (14.25)-(14.26). Consider the symmetric (14.128), positive definite matrix (14.138)

 s² = ⎛ 1/9 0 0 ⎞
      ⎜  0  1 0 ⎟
      ⎝  0  0 2 ⎠, (14.148)

which has inverse

 (s²)⁻¹ = ⎛ 9 0  0  ⎞
          ⎜ 0 1  0  ⎟
          ⎝ 0 0 1/2 ⎠. (14.149)

The Mahalanobis inner product (14.147) of h^spr_{1,2} and h^bly with scaling matrix s² (14.148) is calculated as

 ⟨h^spr_{1,2},h^bly⟩_{s²} = ∑_{n=1}^{3}([h^spr_{1,2}]_n×[(s²)⁻¹]_{n,n}×[h^bly]_n) = 1×9×1+(−1)×1×(−2)+0×(1/2)×1 = 11. (14.150)
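The same result follows from the general formula v'(s²)⁻¹w in (14.147), without exploiting the diagonal structure of the scaling matrix; a sketch using NumPy with the example values:

```python
import numpy as np

h_spr = np.array([1.0, -1.0, 0.0])
h_bly = np.array([1.0, -2.0, 1.0])
s2 = np.diag([1.0 / 9.0, 1.0, 2.0])   # the scaling matrix (14.148)

maha = float(h_spr @ np.linalg.inv(s2) @ h_bly)  # v'(s^2)^{-1} w  (14.147)
assert np.isclose(maha, 11.0)
```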

For a square, positive definite matrix σ² (14.138), we can define a quadratic form

 f(x) ≡ x'σ²x = ∑_{n,m=1}^{n̄} [σ²]_{n,m} x_n x_m, (14.151)

for x∈R^n̄. The quadratic form (14.151) defines a paraboloid with a unique global minimum value (17.8). This is the basis of numerous applications: the pdf of the multivariate normal distribution (18.95) and of all elliptical distributions (18.242), quadratic programming (17.72), mean-variance allocation (9a.9), linear regression (25.47), and more.
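A small numerical illustration: with the positive definite matrix (14.145), the quadratic form attains its unique global minimum f(0) = 0 and is strictly positive everywhere else. The random sample points below are an arbitrary sketch, not a proof:

```python
import numpy as np

sigma2 = np.array([[3.0, np.sqrt(2.0)],
                   [np.sqrt(2.0), 2.0]])   # positive definite (14.145)

def f(x):
    return float(x @ sigma2 @ x)           # quadratic form (14.151)

rng = np.random.default_rng(2)
samples = rng.normal(size=(1000, 2))
assert f(np.zeros(2)) == 0.0               # the unique global minimum
assert all(f(x) > 0 for x in samples)      # strictly positive elsewhere
```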

Example 14.29. Quadratic form of a positive definite matrix.

We continue from Example 14.27, where we showed that the symmetric matrix (14.145) is positive definite (14.138). The left plot in Figure 14.5 displays the surface defined by the quadratic form (14.151) associated with (14.146). Note that the surface is a paraboloid with a unique local minimum (17.6).

The iso-contours [W] of the paraboloid (14.151) for a positive definite matrix σ² (14.138)

 I_{σ²}(γ) ≡ {x: x'σ²x = γ}, (14.152)

for γ > 0, are ellipsoids. We explain why this is the case in Section 21.2.4, and in Example 21.12 we illustrate how the properties of the positive definite matrix, in particular its eigenvectors and eigenvalues (14.269), correspond to the properties of the associated ellipsoid.

Example 14.30. Iso-contours of the quadratic form of a positive definite matrix.
We continue from Example 14.29. The right plot of Figure 14.5 displays the iso-contours (14.152) of the quadratic form associated with the positive definite (14.138) matrix (14.145) for several values of γ. Note that the iso-contours are indeed ellipsoids.

Example 14.31. Location-dispersion ellipsoid.
In Example 21.22, Figure 21.6 displays in red the location-dispersion ellipsoid (21.74) of radius with center and shape determined respectively by the expectation and covariance (which is positive (semi)definite) of the bivariate normal random variable (21.130). This visual representation allows us to quickly notice the features of the distribution, and fits well with the simulated realizations which are displayed as gray dots. Notice how the location-dispersion ellipsoid (21.74) is defined using a quadratic form (14.151) of the (inverse) covariance matrix.

Let us now consider a linear symmetric transformation A:V→V (14.136), where V is a finite-dimensional vector space (14.17), and let us denote with ⟨⋅,⋅⟩ the inner product (14.116). The linear transformation A is positive definite if

 ⟨Av,v⟩>0, (14.153)

for any non-zero vector v∈V.

Similarly, the linear symmetric transformation A is positive semidefinite if it satisfies the looser condition

 ⟨Av,v⟩≥0 (14.154)

for any vector v∈V. Properties (14.153)-(14.154) can also be equivalently restated in terms of matrices. Bearing in mind what has been done for symmetric linear transformations (14.136) in Section 14.3.1, we have

 A positive definite⇔a positive definite (14.155)

where the matrix a represents the linear transformation A (14.67), and

 A positive semidefinite⇔a positive semidefinite, (14.156)

where, again, the matrix a represents the linear transformation A (14.67).

In an inner product space V, any linear operator (14.49)-(14.50) which has a one-dimensional range can be represented uniquely by a vector using the inner product ⟨⋅,⋅⟩, a result known as the musical isomorphism [W].

Indeed, for any fixed vector a∈V, consider the real-valued “flat” linear operator a♭ induced by the inner product

 a♭[v]≡⟨a,v⟩, (14.157)

for all v∈V. The symmetry (14.117) and linearity of the inner product (14.118)-(14.119) imply that the flat operator (14.157) is indeed a linear operator.

The reverse is also true: for any linear operator A (14.49)-(14.50) from an inner product space V to the real numbers R, there exists a unique “sharp” vector A♯∈V such that the following identity holds E.14.11

 A[v]≡⟨A♯,v⟩, (14.158)

for all v∈V.
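In R^n with the dot product, the sharp vector can be read off by applying the operator to the canonical basis vectors. The operator below is a hypothetical example, chosen only to illustrate the identity (14.158):

```python
import numpy as np

# a hypothetical linear operator A: R^3 -> R
def A(v):
    return 2.0 * v[0] - v[1] + 0.5 * v[2]

# the sharp vector collects the images of the canonical basis vectors
A_sharp = np.array([A(e) for e in np.eye(3)])   # (2, -1, 0.5)

rng = np.random.default_rng(3)
v = rng.normal(size=3)
assert np.isclose(A(v), A_sharp @ v)            # A[v] = <A_sharp, v>  (14.158)
```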

The identification (14.158) between linear operators which map to the real line and vectors generalizes to the Riesz representation theorem (16.88), which is discussed in more detail in Section 16.3.3. This result lies at the foundation of linear pricing theory (0b.32), which is covered in depth in Chapter 0b.

#### 14.3.3 Length, distance and angle

Given an inner product (14.116), we can define the associated length, or norm, of a generic vector v∈V, which we introduce in more generality later in (14.226), as

 ∥v∥≡√⟨v,v⟩. (14.159)

In our simple example of the real vector space R^n̄ with the dot product (14.121), this length corresponds to the standard Euclidean norm

 ∥v∥₂ ≡ √(∑_{n=1}^{n̄} v_n²). (14.160)
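In code, the norm (14.159) induced by the dot product coincides with the library Euclidean norm (14.160); a sketch using NumPy with an arbitrary sample vector:

```python
import numpy as np

v = np.array([1.0, -1.0, 0.0])
norm = np.sqrt(v @ v)                        # (14.159) with the dot product
assert np.isclose(norm, np.linalg.norm(v))   # matches the Euclidean norm (14.160)
```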

Example 14.32. Standard Euclidean norm.
Continuing from Example 14.28, the standard Euclidean norm (14.160