- An inner product (14.116) is a special function that induces a rich geometry on a vector space (14.1)-(14.2), including length (14.159), distance (14.166) and angle (14.175).
- An inner product is used to define the symmetry (14.137)-(14.128) and positive (semi)definiteness (14.156)-(14.139) of a linear operator and thus the corresponding matrix.
- The inner product leads to the notion of orthogonality (14.181), which in turn leads to orthogonal projection or best prediction (14.204).
In this section we review basic notions of geometry that hold for inner product spaces [W] and we introduce some fundamental properties of matrices that will play a central role throughout the section.
An inner product space is a vector space (14.1)-(14.2) with an associated inner product $\langle\cdot,\cdot\rangle$. An inner product is any function that takes as input two elements of the vector space and outputs a real number, i.e.
$$\langle\cdot,\cdot\rangle \colon V \times V \to \mathbb{R}. \qquad (14.116)$$
To be an inner product, (14.116) must display the following properties for any vectors $u, v, w \in V$ and any scalar $\lambda \in \mathbb{R}$:
$$\langle u, v\rangle = \langle v, u\rangle \qquad (14.117)$$
$$\langle u + w, v\rangle = \langle u, v\rangle + \langle w, v\rangle \qquad (14.118)$$
$$\langle \lambda u, v\rangle = \lambda\,\langle u, v\rangle \qquad (14.119)$$
$$\langle u, u\rangle \geq 0, \text{ with equality if and only if } u = 0. \qquad (14.120)$$
A commonly used inner product on the vector space $\mathbb{R}^{n}$ is the dot product [W], which is defined as
$$\boldsymbol{u} \cdot \boldsymbol{v} \equiv \boldsymbol{u}^{T}\boldsymbol{v} = \sum_{i=1}^{n} u_{i} v_{i}, \qquad (14.121)$$
where recall vectors in $\mathbb{R}^{n}$ are commonly denoted using the bold notation $\boldsymbol{v}$. We want to stress that the dot product (14.121) is one specific inner product on $\mathbb{R}^{n}$, but there are in fact infinitely many inner products which can be defined on $\mathbb{R}^{n}$, and more generally on any vector space; see Section 16.3.5 for a further example of an inner product. The only requirement for an inner product is that it satisfies (14.117)-(14.118)-(14.119)-(14.120).
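As a quick numerical illustration (not part of the text), the sketch below checks the defining properties (14.117)-(14.120) for the dot product, and then for a second inner product of the form $\boldsymbol{u}^{T} m\, \boldsymbol{v}$ built from an arbitrary positive definite matrix $m$; the names `dot`, `ip` and the matrix `m` are illustrative choices, not notation from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
u, v, w = rng.standard_normal((3, 4))  # three vectors in R^4
lam = 2.5

# the dot product on R^4
dot = lambda x, y: float(np.dot(x, y))

assert np.isclose(dot(u, v), dot(v, u))                            # symmetry
assert np.isclose(dot(u + w, v), dot(u, v) + dot(w, v))            # additivity
assert np.isclose(dot(lam * u, v), lam * dot(u, v))                # homogeneity
assert dot(u, u) > 0                                               # positivity

# a different inner product on R^4: <u, v> = u^T m v, with m symmetric
# positive definite (here m = b b^T + I, an arbitrary construction)
b = rng.standard_normal((4, 4))
m = b @ b.T + np.eye(4)
ip = lambda x, y: float(x @ m @ y)
assert np.isclose(ip(u, v), ip(v, u)) and ip(u, u) > 0
```

Any symmetric positive definite matrix `m` yields a valid inner product this way, which is one way to see that infinitely many inner products exist on $\mathbb{R}^{n}$.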
Note also that while we have restricted our attention to vector spaces over the real numbers for simplicity, the concepts in this section can be easily extended to vector spaces over arbitrary fields [W], for example the field of complex numbers. This means that the scalars are elements from the chosen field, rather than the real numbers.
Symmetric matrices arise naturally in the context of quadratic forms and inner products. Thanks to their powerful properties, they are widely used in many branches of mathematics, see for example Section 14.5.
The transpose of an $m \times n$ matrix $a$ (14.67) is the $n \times m$ matrix $a^{T}$ obtained by switching the rows and the columns, obtaining
$$a^{T} = \begin{pmatrix} a_{1,1} & a_{2,1} & \cdots & a_{m,1} \\ \vdots & \vdots & & \vdots \\ a_{1,n} & a_{2,n} & \cdots & a_{m,n} \end{pmatrix}. \qquad (14.123)$$
In compact notation, each entry of the transpose matrix is determined as
$$(a^{T})_{j,i} = a_{i,j}$$
for all $i = 1, \ldots, m$ and $j = 1, \ldots, n$.
Similar to (14.103), if $a$ and $b$ are two matrices such that the product $a\,b$ is defined, then their product satisfies
$$(a\,b)^{T} = b^{T} a^{T}.$$
Moreover, if $a$ is an invertible matrix (14.100), then the operations of transposition and inversion commute, namely
$$(a^{T})^{-1} = (a^{-1})^{T}.$$
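The two transpose identities above can be verified numerically; the sketch below is an illustration, with the matrices chosen arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal((3, 4))
b = rng.standard_normal((4, 2))

# the transpose of a product is the product of the transposes, in reverse order
assert np.allclose((a @ b).T, b.T @ a.T)

# transposition and inversion commute for an invertible square matrix
c = rng.standard_normal((3, 3)) + 3 * np.eye(3)  # diagonal boost keeps c invertible
assert np.allclose(np.linalg.inv(c.T), np.linalg.inv(c).T)
```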
A symmetric matrix is any square matrix (14.67) which is equal to its own transpose
$$s = s^{T}, \qquad (14.128)$$
where $s^{T}$ denotes the transpose of $s$ (14.123).
The matrix defined in (14.129) is symmetric (14.128), since it coincides with its own transpose.
In compact notation, each entry of the conjugate transpose matrix is determined as
$$(a^{H})_{j,i} = \overline{a_{i,j}}$$
for all $i$ and $j$, where the bar denotes complex conjugation. A Hermitian matrix is any square matrix which is equal to its own conjugate transpose
$$h = h^{H}, \qquad (14.133)$$
where $h^{H}$ denotes the conjugate transpose of $h$ (14.131).
The matrix is Hermitian (14.133), since it coincides with its own conjugate transpose.
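The two defining conditions can be checked directly in code; the matrices below are illustrative examples, not the ones from the text.

```python
import numpy as np

# a symmetric matrix equals its own transpose
s = np.array([[2.0, 1.0],
              [1.0, 3.0]])
assert np.array_equal(s, s.T)

# a Hermitian matrix equals its own conjugate transpose
h = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
assert np.array_equal(h, h.conj().T)
```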
Let us now consider a linear transformation $f \colon V \to V$ (14.49)-(14.50), where $V$ is a finite-dimensional vector space (14.17), and let us denote with $\langle\cdot,\cdot\rangle$ the inner product (14.116). The linear transformation $f$ is symmetric if
$$\langle f(u), v\rangle = \langle u, f(v)\rangle \qquad (14.136)$$
for any vectors $u, v \in V$. Property (14.136) can be equivalently restated in terms of matrices. We know that every linear transformation between finite-dimensional vector spaces (14.17) can be represented by a suitable matrix (14.67) and that any finite-dimensional vector space (14.17) is isomorphic to $\mathbb{R}^{n}$ for some $n$ (14.18). If the linear transformation $f$ is symmetric (14.136), then after fixing an orthonormal basis (14.199) on the domain $V$, this property translates to the structure of the matrix that represents $f$. Indeed, we have
$$s = s^{T},$$
where the matrix $s$ represents the linear transformation $f$ (14.67).
Together with symmetric matrices (14.128), positive (semi)definite matrices [W] are the cornerstones when dealing with quadratic forms and inner products. They also play a crucial role in the spectral theorem, which provides us with a geometric interpretation of the linear transformations represented by these matrices, see Section 14.5. We encounter them often in statistical applications too, since covariance matrices (21.33) are symmetric and positive (semi)definite.
A square, symmetric matrix $s^{2}$ (14.128) is a positive definite matrix, denoted $s^{2} \succ 0$, if
$$\boldsymbol{v}^{T} s^{2}\, \boldsymbol{v} > 0 \qquad (14.138)$$
for any non-zero vector $\boldsymbol{v} \in \mathbb{R}^{n}$.
Similarly, a square, symmetric matrix $s^{2}$ is a positive semidefinite matrix, denoted $s^{2} \succeq 0$, if
$$\boldsymbol{v}^{T} s^{2}\, \boldsymbol{v} \geq 0 \qquad (14.139)$$
for any vector $\boldsymbol{v} \in \mathbb{R}^{n}$. Note that we use the squared notation $s^{2}$ to indicate that the matrix is positive (semi)definite, because any positive (semi)definite matrix can be written as the square of a matrix, as we shall see in (14.440).
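Positive semidefiniteness can be checked either through the quadratic form (14.139) or, equivalently, through the eigenvalues (as discussed in Section 14.5.1). The sketch below illustrates both, using the fact (a standard construction, not the specific (14.439) argument) that $b\,b^{T}$ is always positive semidefinite.

```python
import numpy as np

rng = np.random.default_rng(2)
b = rng.standard_normal((4, 4))
s2 = b @ b.T  # b b^T is symmetric positive semidefinite by construction

# the quadratic form is non-negative for random probe vectors:
# v^T (b b^T) v = ||b^T v||^2 >= 0
for _ in range(1000):
    v = rng.standard_normal(4)
    assert v @ s2 @ v >= -1e-10  # tolerance for floating-point rounding

# equivalent test: all eigenvalues are non-negative
assert np.all(np.linalg.eigvalsh(s2) >= -1e-10)
```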
The set of positive semidefinite matrices (14.139) is a cone (17.52)-(17.53)-(17.54), denoted as in (17.60). To support the notation $s^{2}$, we show in (14.439) that every positive semidefinite matrix can be written as
$$s^{2} = s\,s,$$
where $s$ is a symmetric matrix.
Therefore the set of $n \times n$ positive semidefinite matrices, which is a cone (17.60), has dimension
$$\frac{n(n+1)}{2}.$$
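The dimension count $n(n+1)/2$ comes from the fact that a symmetric matrix is pinned down by its upper triangle, diagonal included; a minimal sketch of the count:

```python
import numpy as np

n = 5
# number of free entries of a symmetric n x n matrix:
# the upper triangle including the diagonal
free_entries = np.triu_indices(n)[0].size
assert free_entries == n * (n + 1) // 2  # n(n+1)/2 free entries
```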
Similarly, we say that a square, symmetric matrix $a$ is a negative semidefinite matrix, denoted $a \preceq 0$, if its opposite $-a$ is positive semidefinite (14.139), i.e.
$$\boldsymbol{v}^{T} a\, \boldsymbol{v} \leq 0$$
for any vector $\boldsymbol{v} \in \mathbb{R}^{n}$.
A negative (semi)definite matrix can be characterized in terms of its eigenvalues in a similar way to a positive (semi)definite matrix, see Section 14.5.1 for more details.
Let us change the off-diagonal entries of (14.129) to define a new matrix as follows
for any vector $\boldsymbol{v}$, as required by our definition. Note that the matrix is strictly positive definite (14.138). Indeed, we can see that the equality holds in (14.146) if and only if $v_{1} = 0$ and $v_{2} = 0$, i.e. $\boldsymbol{v} = \boldsymbol{0}$, as required by the definition (14.138).
for $\boldsymbol{u}, \boldsymbol{v} \in \mathbb{R}^{n}$, where $s^{2}$ is a symmetric (14.128), positive definite matrix (14.138) and $b$ is a full-rank (14.73) matrix such that $s^{2} = b\,b^{T}$; for more details see Section 14.6.2. In fact, on finite-dimensional inner product spaces, every inner product can be expressed in terms of a Mahalanobis inner product (14.147) on the vector coordinates, see Section 14.6 for details.
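As an illustration, the sketch below implements a Mahalanobis-type inner product of the form $\langle \boldsymbol{u}, \boldsymbol{v}\rangle = \boldsymbol{u}^{T} (s^{2})^{-1} \boldsymbol{v}$ and checks the inner product axioms numerically; note that the exact form of (14.147) is defined in the text, and the inverse here is an assumption made for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
b = rng.standard_normal((3, 3)) + 3 * np.eye(3)   # full-rank by construction
s2 = b @ b.T                                      # symmetric positive definite

def mahalanobis_ip(u, v, s2):
    """Illustrative Mahalanobis-type inner product <u, v> = u^T inv(s2) v."""
    # solve s2 x = v instead of forming the inverse explicitly
    return float(u @ np.linalg.solve(s2, v))

u, v = rng.standard_normal((2, 3))
# symmetry and positivity, as required of any inner product
assert np.isclose(mahalanobis_ip(u, v, s2), mahalanobis_ip(v, u, s2))
assert mahalanobis_ip(u, u, s2) > 0
```

Solving the linear system rather than inverting `s2` is the numerically preferred way to apply $(s^{2})^{-1}$.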
which has inverse
For a square, positive definite matrix $s^{2}$ (14.138), we can define a quadratic form
$$\boldsymbol{v} \mapsto \boldsymbol{v}^{T} s^{2}\, \boldsymbol{v} \qquad (14.151)$$
for $\boldsymbol{v} \in \mathbb{R}^{n}$. The quadratic form (14.151) defines a paraboloid with a unique global minimum value (17.8). This is the basis of numerous applications: the pdf of the multivariate normal distribution (18.95) and of all elliptical distributions (18.242), quadratic programming (17.72), mean-variance allocation (9a.9), linear regression (25.47), and more.
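The minimum property is easy to probe numerically: for a positive definite matrix the quadratic form vanishes at the origin and is strictly positive everywhere else. A minimal sketch, with an arbitrary positive definite matrix:

```python
import numpy as np

s2 = np.array([[3.0, 1.0],
               [1.0, 2.0]])  # symmetric, positive definite
q = lambda v: float(v @ s2 @ v)

rng = np.random.default_rng(4)
# the unique global minimum of the paraboloid is at the origin
assert q(np.zeros(2)) == 0.0
assert all(q(rng.standard_normal(2)) > 0 for _ in range(1000))
```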
We continue from Example 14.27, where we showed that the symmetric matrix (14.145) is positive definite (14.138). The left plot in Figure 14.5 displays the surface defined by the quadratic form (14.151) associated with (14.146). Note that the surface is a paraboloid with a unique local minimum (17.6).
The iso-contours of the quadratic form (14.151), i.e. the sets of vectors $\boldsymbol{v}$ such that
$$\boldsymbol{v}^{T} s^{2}\, \boldsymbol{v} = c \qquad (14.152)$$
for $c > 0$, are ellipsoids. We explain why this is the case in Section 21.2.4, and in Example 21.12 we illustrate how the properties of the positive definite matrix, in particular its eigenvectors and eigenvalues (14.269), correspond to the properties of the associated ellipsoid.
Example 14.30. Iso-contours of the quadratic form of a positive definite matrix.
We continue from Example 14.29. The right plot of Figure 14.5 displays the iso-contours (14.152) of the quadratic form associated with the positive definite (14.138) matrix (14.146) for values of . Note that the iso-contours are indeed ellipsoids.
Example 14.31. Location-dispersion ellipsoid.
In Example 21.22, Figure 21.6 displays in red the location-dispersion ellipsoid (21.74) of radius with center and shape determined respectively by the expectation and covariance (which is positive (semi)definite) of the bivariate normal random variable (21.130). This visual representation allows us to quickly notice the features of the distribution, and fits well with the simulated realizations which are displayed as gray dots. Notice how the location-dispersion ellipsoid (21.74) is defined using a quadratic form (14.151) of the (inverse) covariance matrix.
Let us now consider a linear symmetric transformation $f \colon V \to V$ (14.136), where $V$ is a finite-dimensional vector space (14.17), and let us denote with $\langle\cdot,\cdot\rangle$ the inner product (14.116). The linear transformation $f$ is positive definite if
$$\langle v, f(v)\rangle > 0 \qquad (14.153)$$
for any non-zero vector $v \in V$.
Similarly, $f$ is positive semidefinite if $\langle v, f(v)\rangle \geq 0$ for any vector $v \in V$. Properties (14.153)-(14.154) can also be equivalently restated in terms of matrices. Bearing in mind what has been done for symmetric linear transformations (14.136) in Section 14.3.1, we have that $f$ is positive definite if and only if
$$\boldsymbol{v}^{T} s^{2}\, \boldsymbol{v} > 0$$
for any non-zero vector $\boldsymbol{v}$, where the matrix $s^{2}$ represents the linear transformation $f$ (14.67), and that $f$ is positive semidefinite if and only if
$$\boldsymbol{v}^{T} s^{2}\, \boldsymbol{v} \geq 0$$
for any vector $\boldsymbol{v}$, where, again, the matrix $s^{2}$ represents the linear transformation $f$ (14.67).
In an inner product space $V$, any linear operator $f \colon V \to \mathbb{R}$ (14.49)-(14.50) which has one-dimensional range can be represented uniquely by a vector $u \in V$ using the inner product $\langle\cdot,\cdot\rangle$, a result known as the musical isomorphism [W]. More precisely, there exists a unique $u \in V$ such that
$$f(v) = \langle u, v\rangle \qquad (14.158)$$
for all $v \in V$.
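In $\mathbb{R}^{n}$ with the dot product, the representing vector can be recovered explicitly by applying the operator to the canonical basis vectors; the sketch below illustrates this, with the operator `f` built from an arbitrary hidden coefficient vector.

```python
import numpy as np

rng = np.random.default_rng(5)

# an arbitrary linear operator f: R^3 -> R (its coefficients are "hidden" in c)
c = rng.standard_normal(3)
f = lambda v: float(c @ v)

# the representing vector u collects the values of f on the canonical basis
u = np.array([f(e) for e in np.eye(3)])

# then f(v) = <u, v> for every v
v = rng.standard_normal(3)
assert np.isclose(f(v), np.dot(u, v))
```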
The identification (14.158) between linear operators which map to the real line and vectors generalizes to the Riesz representation theorem (16.88), which is discussed in more detail in Section 16.3.3. This result lies at the foundation of linear pricing theory (0b.32), which is covered in depth in Chapter 0b.
In our simple example of the real vector space $\mathbb{R}^{n}$ with the dot product (14.121), this length corresponds to the standard Euclidean norm
$$\|\boldsymbol{v}\| = \sqrt{\boldsymbol{v}^{T}\boldsymbol{v}} = \sqrt{v_{1}^{2} + \cdots + v_{n}^{2}}. \qquad (14.159)$$
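A one-line numerical check of the norm induced by the dot product, on the classic 3-4-5 example:

```python
import numpy as np

v = np.array([3.0, 4.0])
norm = np.sqrt(np.dot(v, v))   # ||v|| = sqrt(<v, v>)
assert norm == 5.0             # the 3-4-5 right triangle
assert norm == np.linalg.norm(v)
```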