- An inner product (31.117) is a special function that induces a rich geometry on a vector space, including length (31.161), distance (31.168) and angle (31.177).
- An inner product is used to define the symmetry (31.138)-(31.129) and positive (semi)definiteness (31.158)-(31.140) of a linear operator and thus the corresponding matrix.
- The inner product leads to the notion of orthogonality (31.183), which in turn leads to orthogonal projection or best prediction (31.206).
In this section we review basic notions of geometry that hold in inner product spaces [W], and we introduce some fundamental properties of matrices that will play a central role throughout the section.
An inner product space is a vector space (31.1)-(31.2) equipped with an associated inner product $\langle \cdot, \cdot \rangle$. An inner product is a binary operation [W], that is, a function that takes as input any two elements of a vector space $V$ and outputs a real number, i.e.

$$\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}. \qquad (31.117)$$
To be an inner product, (31.117) must satisfy the following properties for any vectors $u, v, w \in V$ and any scalar $\alpha \in \mathbb{R}$:

$$\langle u, v \rangle = \langle v, u \rangle \qquad (31.118)$$

$$\langle u + w, v \rangle = \langle u, v \rangle + \langle w, v \rangle \qquad (31.119)$$

$$\langle \alpha u, v \rangle = \alpha \langle u, v \rangle \qquad (31.120)$$

$$\langle u, u \rangle \geq 0, \quad \text{with } \langle u, u \rangle = 0 \text{ if and only if } u = 0. \qquad (31.121)$$
A commonly used inner product on the vector space $\mathbb{R}^n$ is the dot product [W], which is defined as

$$\mathbf{u} \cdot \mathbf{v} \equiv \mathbf{u}^\top \mathbf{v} = \sum_{i=1}^{n} u_i v_i, \qquad (31.122)$$
where vectors in $\mathbb{R}^n$ are commonly denoted using the bold notation $\mathbf{u}, \mathbf{v}$. We want to stress that the dot product (31.122) is one specific inner product on $\mathbb{R}^n$: there are in fact infinitely many inner products that can be defined on $\mathbb{R}^n$, and more generally on any vector space; see Section 34.3.4 for an example of an inner product on another vector space. The only requirement for an inner product is that it satisfies the defining properties (31.118)-(31.119)-(31.120)-(31.121).
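As a concrete check, the defining properties can be verified numerically. The sketch below tests both the dot product and a weighted inner product $\langle u, v \rangle_W = u^\top W v$; the weight matrix `W` is a hypothetical example (any symmetric positive definite matrix works), not one taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
u, v, w = rng.standard_normal((3, n))
alpha = 0.7

# The standard dot product on R^n.
def dot(u, v):
    return float(u @ v)

# A different (hypothetical) inner product on R^n: <u, v>_W = u^T W v,
# where W is a symmetric positive definite weight matrix.
W = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 3.0]])
def weighted(u, v):
    return float(u @ W @ v)

for ip in (dot, weighted):
    assert np.isclose(ip(u, v), ip(v, u))                 # symmetry
    assert np.isclose(ip(u + w, v), ip(u, v) + ip(w, v))  # additivity
    assert np.isclose(ip(alpha * u, v), alpha * ip(u, v)) # homogeneity
    assert ip(u, u) > 0                                   # positivity (u != 0)
print("both candidate inner products satisfy the defining properties")
```

Any symmetric positive definite weight matrix yields a valid inner product in this way, which is one way to see that $\mathbb{R}^n$ carries infinitely many inner products.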
Note also that while we have restricted our attention to vector spaces over the real numbers for simplicity, the concepts in this section can be easily extended to vector spaces over arbitrary fields [W], for example the field of complex numbers. This means that the scalars are elements from the chosen field, rather than the real numbers.
Symmetric matrices arise naturally in the context of quadratic forms and inner products. Thanks to their powerful properties, they are widely used in many branches of mathematics, see for example Section 31.5.
The transpose of an $m \times n$ matrix $A$ (31.67) is the $n \times m$ matrix $A^\top$ obtained by switching the rows and the columns.
In compact notation, each entry of the transpose matrix is determined as

$$(A^\top)_{ij} = A_{ji}$$

for all $i = 1, \ldots, n$ and $j = 1, \ldots, m$.
Similar to (31.104), if $A$ and $B$ are two conformable matrices, then their product satisfies

$$(AB)^\top = B^\top A^\top.$$
Moreover, if $A$ is an invertible matrix (31.101), then the operations of transposition and inversion commute, namely

$$(A^\top)^{-1} = (A^{-1})^\top.$$
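These identities are easy to verify numerically; a minimal sketch with arbitrary example matrices (chosen here so that `A` is invertible):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 1.0]])   # invertible (determinant 7)
B = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 4.0],
              [1.0, 0.0, 1.0]])

# Entrywise definition of the transpose: (A^T)_{ij} = A_{ji}.
T = np.array([[A[j, i] for j in range(3)] for i in range(3)])
assert np.array_equal(A.T, T)

# The transpose of a product reverses the factors: (AB)^T = B^T A^T.
assert np.allclose((A @ B).T, B.T @ A.T)

# Transposition and inversion commute for invertible matrices.
assert np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T)
print("transpose identities verified")
```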
A symmetric matrix is any square matrix (31.69) which is equal to its own transpose,

$$A = A^\top. \qquad (31.129)$$
The matrix is symmetric (31.129), since
We generalize the notion of the transpose of a matrix (31.124) to the field of complex numbers. The conjugate transpose of an $m \times n$ matrix $A$ [W] is the $n \times m$ matrix $A^H$ obtained by switching the rows and the columns and taking the complex conjugate [W] of each entry.
In compact notation, each entry of the conjugate transpose matrix is determined as

$$(A^H)_{ij} = \overline{A_{ji}}.$$
The matrix is Hermitian (31.134), since
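A quick numeric illustration of the conjugate transpose and the Hermitian property; the matrices here are hypothetical examples, not the one from the text.

```python
import numpy as np

# A Hermitian matrix: equal to its own conjugate transpose.
H = np.array([[2.0 + 0j, 1.0 - 1j],
              [1.0 + 1j, 3.0 + 0j]])
assert np.array_equal(H, H.conj().T)          # H = H^H

# Entrywise definition: (A^H)_{ij} = conj(A_{ji}).
A = np.array([[1 + 2j, 3 - 1j],
              [0 + 1j, 2 + 0j]])
AH = A.conj().T
assert AH[0, 1] == np.conj(A[1, 0])

# The diagonal entries of a Hermitian matrix are necessarily real,
# since H_{ii} = conj(H_{ii}).
assert np.all(np.isreal(np.diag(H)))
print("conjugate transpose checks passed")
```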
for any vectors $u, v \in V$.
This definition (31.137) can be equivalently restated in terms of matrices. We know that every linear transformation between finite-dimensional vector spaces (31.17) can be represented by a suitable matrix (31.67), and that any finite-dimensional vector space (31.17) is isomorphic to $\mathbb{R}^n$ for some $n$ (31.18). If the linear transformation is symmetric (31.137), then, after fixing an orthonormal basis (31.199) on the domain, this property translates to the structure of the matrix that represents it. Indeed, we have
where the matrix represents the linear transformation (31.67).
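A small numeric illustration of this correspondence: with the dot product, a symmetric matrix satisfies $\langle Ax, y \rangle = \langle x, Ay \rangle$, which is exactly the symmetry of the associated linear transformation. The example matrix is hypothetical.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])       # symmetric: A = A^T

rng = np.random.default_rng(6)
x, y = rng.standard_normal((2, 2))

# Symmetry of the transformation x -> A x under the dot product:
# <A x, y> = <x, A y>.
assert np.isclose((A @ x) @ y, x @ (A @ y))
print("symmetric matrix represents a symmetric linear transformation")
```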
Together with symmetric matrices (31.129), positive (semi)definite matrices [W] are the cornerstones when dealing with quadratic forms and inner products. They also play a crucial role in the spectral theorem, which provides us with a geometric interpretation of the linear transformations represented by these matrices, see Section 31.5. We encounter them often in statistical applications too, since covariance matrices (2b.29) are symmetric and positive (semi)definite.
A square, symmetric matrix $A$ (31.129) is a positive definite matrix if

$$x^\top A x > 0$$

for any non-zero vector $x$.
Similarly, a square, symmetric matrix $A$ is a positive semidefinite matrix if

$$x^\top A x \geq 0$$

for any vector $x$. Note that we use the squared notation to indicate that a matrix is positive (semi)definite, because any positive (semi)definite matrix can be written as the square of a matrix, as we shall see in (31.453).
In the case where , we can write any positive semidefinite matrix in the form
where and .
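Whatever the precise factorization used in the text, the key fact that any matrix of the form $B^\top B$ is symmetric positive semidefinite is easy to check numerically; a sketch with a random $B$:

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 3))
A = B.T @ B          # any matrix of this form is symmetric positive semidefinite

assert np.allclose(A, A.T)                        # symmetric

# x^T A x = x^T B^T B x = ||B x||^2 >= 0 for every x.
for _ in range(100):
    x = rng.standard_normal(3)
    assert x @ A @ x >= -1e-12                    # tolerance for round-off

# Equivalently, all eigenvalues of A are non-negative (up to round-off).
assert np.min(np.linalg.eigvalsh(A)) >= -1e-12
print("B^T B is positive semidefinite")
```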
Let us change the off-diagonal entries of (31.130) to define a new matrix as follows
for any vector $x$, as required by our definition. Note that the new matrix is strictly positive definite (31.139). Indeed, we can see that equality holds in (31.144) if and only if both coordinates vanish, i.e. $x = 0$, as required by the definition (31.139).
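Since the entries of the book's matrix (31.144) are not reproduced here, the following sketch uses a stand-in symmetric matrix to illustrate how strict positive definiteness can be verified, both via eigenvalues and via the quadratic form:

```python
import numpy as np

# Hypothetical stand-in for a 2x2 symmetric matrix like (31.144);
# the book's actual entries are not reproduced here.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Strict positive definiteness: all eigenvalues of the symmetric matrix
# are strictly positive.
eigvals = np.linalg.eigvalsh(A)
assert np.all(eigvals > 0)

# Equivalently, x^T A x > 0 for every non-zero x; spot-check random directions.
rng = np.random.default_rng(3)
for _ in range(100):
    x = rng.standard_normal(2)
    if np.linalg.norm(x) > 1e-12:
        assert x @ A @ x > 0
print("stand-in matrix is strictly positive definite")
```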
The set of positive semidefinite matrices (31.140) is a cone (33.105)-(33.106)-(33.107) denoted by (33.92). To support this notation, we show in Section 31.6.6 that every positive semidefinite matrix can be written as the square (31.452) of a symmetric matrix.
Therefore the set of $n \times n$ positive semidefinite matrices has dimension

$$\frac{n(n+1)}{2},$$

the number of free entries of an $n \times n$ symmetric matrix.
Similarly, we say that a square, symmetric matrix $A$ is a negative semidefinite matrix if its opposite $-A$ is positive semidefinite (31.140), i.e.

$$x^\top A x \leq 0$$

for any vector $x$.
A negative (semi)definite matrix can be characterized in terms of its eigenvalues in a similar way to a positive (semi)definite matrix, see Section 31.5.3 for more details.
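A minimal numeric illustration of this characterization: a matrix is negative semidefinite exactly when its opposite is positive semidefinite, i.e. when all its eigenvalues are non-positive. The example matrix is hypothetical.

```python
import numpy as np

# A hypothetical symmetric matrix to test (eigenvalues -1 and -3).
A = np.array([[-2.0,  1.0],
              [ 1.0, -2.0]])

# A is negative semidefinite iff -A is positive semidefinite ...
assert np.all(np.linalg.eigvalsh(-A) >= 0)
# ... equivalently, iff every eigenvalue of A is <= 0.
assert np.all(np.linalg.eigvalsh(A) <= 0)
print("A is negative semidefinite")
```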
for vectors in $\mathbb{R}^n$, where $S$ is a symmetric (31.129), positive definite matrix (31.139) and $B$ is a full-rank (31.74) matrix; for more details see Section 31.6.2. In fact, on finite-dimensional inner product spaces, every inner product can be expressed in terms of a Mahalanobis inner product (31.149) on the vector coordinates; see Section 31.6 for details.
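As an illustration, ASSUMING the common form $\langle u, v \rangle_S = u^\top S^{-1} v$ of a Mahalanobis-type inner product (the exact form used in the text may differ; see Section 31.6.2), one can verify the inner product properties numerically:

```python
import numpy as np

# ASSUMED form of a Mahalanobis-type inner product: <u, v>_S = u^T S^{-1} v,
# with S symmetric positive definite; see Section 31.6.2 for the book's
# exact definition.
S = np.array([[2.0, 0.5],
              [0.5, 1.0]])
S_inv = np.linalg.inv(S)

def mahalanobis_ip(u, v):
    return float(u @ S_inv @ v)

rng = np.random.default_rng(4)
u, v = rng.standard_normal((2, 2))

# Symmetric, because S^{-1} is symmetric; positive, because S^{-1} is
# positive definite.
assert np.isclose(mahalanobis_ip(u, v), mahalanobis_ip(v, u))
assert mahalanobis_ip(u, u) > 0
print("Mahalanobis-type form behaves like an inner product")
```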
which has inverse
The quadratic form of a positive definite matrix $A$ (31.139) is defined as

$$q(x) \equiv x^\top A x. \qquad (31.153)$$
The quadratic form (31.153) defines a paraboloid with a unique global minimum (33.1). This is the basis of numerous applications: the pdf of the multivariate normal distribution (7c.4), and of all elliptical distributions (7c.35), follows from it with slight changes and considerations; so do quadratic programming (33.72), mean-variance allocation (46a.9), linear regression (8.72), and more.
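Without reproducing the figure, the paraboloid's unique global minimum at the origin can be checked numerically for a hypothetical positive definite matrix:

```python
import numpy as np

# A hypothetical symmetric positive definite matrix.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
assert np.all(np.linalg.eigvalsh(A) > 0)

def q(x):
    return float(x @ A @ x)      # the quadratic form x^T A x

# The paraboloid q attains its unique global minimum q(0) = 0 at the
# origin: q(x) > q(0) for every non-zero x.
rng = np.random.default_rng(5)
for _ in range(200):
    x = rng.standard_normal(2)
    if np.linalg.norm(x) > 1e-12:
        assert q(x) > q(np.zeros(2))
print("unique global minimum of the quadratic form at the origin")
```

The minimizer is unique because the gradient $2Ax$ vanishes only at $x = 0$ when $A$ is invertible, which a positive definite matrix always is.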
We continue from Example 31.27, where we showed that the symmetric matrix (31.143) is positive definite (31.139). The left plot in Figure 31.5 displays the surface defined by the quadratic form (31.153) associated with (31.144). Note that the surface is a paraboloid with a unique local minimum (33.20).