- An inner product (27.123) is a special function that induces a rich geometry on a vector space (27.3), including length (27.131), distance (27.136) and angle (27.144).
- The inner product leads to the notion of orthogonality (27.150), which in turn leads to orthogonal projection, or best prediction (27.173).
In this section we review basic notions of geometry that hold for inner product spaces [W].
An inner product space is a vector space (27.3) $V$ with an associated inner product $\langle \cdot, \cdot \rangle$. An inner product takes as input any two elements of the vector space and outputs a real number, i.e.
$$\langle \cdot, \cdot \rangle : V \times V \rightarrow \mathbb{R}. \tag{27.123}$$
To be an inner product, (27.123) must satisfy the following properties for any vectors $u, v, w \in V$ and any scalar $\alpha \in \mathbb{R}$:
- Symmetry: $\langle u, v \rangle = \langle v, u \rangle$.
- Linearity: $\langle \alpha u + w, v \rangle = \alpha \langle u, v \rangle + \langle w, v \rangle$.
- Positive definiteness: $\langle u, u \rangle \geq 0$, with equality if and only if $u = 0$.
A commonly used inner product on the vector space $\mathbb{R}^n$ is the dot product [W], which is defined as
$$\langle \mathbf{u}, \mathbf{v} \rangle \equiv \mathbf{u}^{\top} \mathbf{v} = \sum_{i=1}^{n} u_i v_i, \tag{27.128}$$
where we recall that vectors in $\mathbb{R}^n$ are commonly denoted using the bold notation $\mathbf{v}$.
We can generalize the dot product (27.128) to the Mahalanobis inner product by introducing a full-rank (27.65) scaling matrix $S \in \mathbb{R}^{n \times n}$, so that
$$\langle \mathbf{u}, \mathbf{v} \rangle_{S} \equiv (S\mathbf{u})^{\top}(S\mathbf{v}) = \mathbf{u}^{\top} S^{\top} S \, \mathbf{v}, \tag{27.130}$$
for $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$. In fact, on finite-dimensional inner product spaces, every inner product can be expressed in terms of a Mahalanobis inner product (27.130) on the vector coordinates; see Section 27.6 for details.
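For illustration, here is a minimal numpy sketch of the dot product (27.128) and the Mahalanobis inner product (27.130); the vectors and the full-rank scaling matrix are arbitrary choices made for this example.

```python
import numpy as np

# Arbitrary illustrative vectors in R^2
u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])

# Dot product (27.128): <u, v> = u'v
dot_uv = u @ v                      # 1.0

# Mahalanobis inner product (27.130) with an arbitrary full-rank
# scaling matrix s: <u, v>_s = (s u)'(s v)
s = np.array([[2.0, 0.5],
              [0.0, 1.0]])
mah_uv = (s @ u) @ (s @ v)          # 14.5
```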
Note also that while we have restricted our attention to vector spaces over the real numbers for simplicity, the concepts in this section can be easily extended to vector spaces over arbitrary fields [W], for example the field of complex numbers. This means that the scalars are elements from the chosen field, rather than the real numbers.
In our simple example of a real vector space with the dot product (27.128), this length corresponds to the standard Euclidean norm
$$\|\mathbf{v}\| \equiv \sqrt{\mathbf{v}^{\top}\mathbf{v}} = \sqrt{\textstyle\sum_{i=1}^{n} v_i^2}, \tag{27.132}$$
for $\mathbf{v} \in \mathbb{R}^n$. Similarly, the Mahalanobis inner product (27.130) induces the Mahalanobis norm
$$\|\mathbf{v}\|_{S} \equiv \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle_{S}} = \|S\mathbf{v}\|, \tag{27.135}$$
for $\mathbf{v} \in \mathbb{R}^n$, where $\|\cdot\|$ is the standard Euclidean norm (27.132) and $S$ the full-rank scaling matrix. On finite-dimensional inner product spaces, any norm induced by an inner product (27.131) can be expressed as the Mahalanobis norm (27.135) of the vector coordinates; see Section 27.6 for details.
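A short sketch of the two norms, reusing the illustrative vector and scaling matrix from above; note that the Mahalanobis norm (27.135) is simply the Euclidean norm of the scaled vector.

```python
import numpy as np

v = np.array([3.0, -1.0])            # arbitrary illustrative vector
s = np.array([[2.0, 0.5],            # arbitrary full-rank scaling matrix
              [0.0, 1.0]])

eucl_norm = np.sqrt(v @ v)           # Euclidean norm (27.132)
mah_norm = np.linalg.norm(s @ v)     # Mahalanobis norm (27.135): ||s v||

assert np.isclose(eucl_norm, np.linalg.norm(v))
```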
The norm (27.131) in turn induces a notion of distance between two vectors,
$$d(u, v) \equiv \|u - v\|, \tag{27.136}$$
for $u, v \in V$. In the example of a real vector space with the dot product, this distance (27.136) corresponds to the standard Euclidean distance
$$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \sqrt{\textstyle\sum_{i=1}^{n} (u_i - v_i)^2}, \tag{27.137}$$
and the Mahalanobis norm (27.135) induces the Mahalanobis distance
$$d_{S}(\mathbf{u}, \mathbf{v}) \equiv \|S(\mathbf{u} - \mathbf{v})\|, \tag{27.139}$$
for $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$, where $\|\cdot\|$ is the standard Euclidean norm (27.132) and $S$ the full-rank scaling matrix. Note that the Mahalanobis distance (27.139) reduces to the standard Euclidean distance (27.137) when $S$ is the identity matrix. The Mahalanobis distance (27.139) allows us to generalize the absolute z-score (35.40) to the multivariate framework of an $n$-dimensional random variable, yielding the multivariate absolute z-score (35.41), as discussed in Section 35.2.
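The distances admit an equally short sketch; the check at the end confirms that the Mahalanobis distance (27.139) collapses to the Euclidean distance (27.137) when the scaling matrix is the identity.

```python
import numpy as np

u = np.array([1.0, 2.0])                 # arbitrary illustrative vectors
v = np.array([3.0, -1.0])
s = np.array([[2.0, 0.5],                # arbitrary full-rank scaling matrix
              [0.0, 1.0]])

eucl_dist = np.linalg.norm(u - v)        # Euclidean distance (27.137)
mah_dist = np.linalg.norm(s @ (u - v))   # Mahalanobis distance (27.139)

# With the identity as scaling matrix, the two distances coincide
assert np.isclose(np.linalg.norm(np.eye(2) @ (u - v)), eucl_dist)
```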
Example 27.30. Continuing from Example 27.29, we can verify the polarization identity (27.140) for the pair of vectors (27.24)-(27.25). Indeed, substituting our previous calculations of the norms (27.133) and (27.134) and the distance (27.138) into the right-hand side of the polarization identity (27.140) recovers our previous calculation of the inner product (27.129).
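Since the vectors (27.24)-(27.25) are defined elsewhere, the sketch below verifies the polarization identity numerically on arbitrary stand-in vectors, assuming the form of (27.140) consistent with the quantities cited in the example (the two norms and the distance).

```python
import numpy as np

u = np.array([1.0, 2.0])   # arbitrary stand-ins for (27.24)-(27.25)
v = np.array([3.0, -1.0])

# Polarization identity (27.140), assumed form:
# <u, v> = (||u||^2 + ||v||^2 - ||u - v||^2) / 2
lhs = u @ v
rhs = (np.linalg.norm(u)**2 + np.linalg.norm(v)**2
       - np.linalg.norm(u - v)**2) / 2
assert np.isclose(lhs, rhs)
```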
A fundamental consequence of the inner product properties is the Cauchy-Schwarz inequality [W],
$$|\langle u, v \rangle| \leq \|u\| \, \|v\|, \tag{27.142}$$
where equality holds if and only if the vectors are linearly dependent.
Example 27.31. Continuing from Example 27.30, we can verify the Cauchy-Schwarz inequality (27.142) for the pair of vectors (27.24)-(27.25). Indeed, from our previous calculations of the norms (27.133) and (27.134) and the inner product (27.129), we see that the absolute value of the inner product does not exceed the product of the two norms, as the Cauchy-Schwarz inequality (27.142) states.
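The inequality is also easy to probe numerically; the sketch below checks (27.142) on random vector pairs and confirms equality for a linearly dependent pair.

```python
import numpy as np

rng = np.random.default_rng(0)

# Cauchy-Schwarz (27.142): |<u, v>| <= ||u|| ||v|| on random pairs
for _ in range(1000):
    u, v = rng.standard_normal(3), rng.standard_normal(3)
    assert abs(u @ v) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12

# Equality holds for linearly dependent vectors, e.g. v = 2u
u = rng.standard_normal(3)
assert np.isclose(abs(u @ (2 * u)), np.linalg.norm(u) * np.linalg.norm(2 * u))
```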
The inner product also induces a notion of angle between two vectors,
$$\theta(u, v) \equiv \arccos \frac{\langle u, v \rangle}{\|u\| \, \|v\|}, \tag{27.144}$$
which is well-defined since the Cauchy-Schwarz inequality (27.142) guarantees that $\frac{\langle u, v \rangle}{\|u\| \, \|v\|} \in [-1, 1]$.
Example 27.32. Continuing from Example 27.31, we can calculate the standard Euclidean angle between the pair of vectors (27.24)-(27.25). Using the previously calculated inner product (27.129) and lengths (27.133) and (27.134), the angle between the two vectors follows by substituting these values into (27.144).
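In code, the angle (27.144) is a one-liner up to a numerical safeguard; the vectors below are arbitrary illustrative choices.

```python
import numpy as np

def angle(u, v):
    """Angle (27.144) between two vectors under the dot product."""
    cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip guards against round-off pushing the ratio outside [-1, 1]
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

u = np.array([1.0, 0.0])           # arbitrary illustrative vectors
v = np.array([1.0, 1.0])
print(np.degrees(angle(u, v)))     # 45.0
```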
Care must be taken when determining the orthogonality of vectors using their coordinates; see Section 27.6.1 for details.
Then, for a vector $w$ to be orthogonal to a linear subspace $U$, it must be perpendicular to the surface defined by $U$, i.e.
$$\langle w, u \rangle = 0 \quad \text{for all } u \in U. \tag{27.150}$$
From the definition of the span (27.8), we can write any vector $u$ in the subspace as a linear combination of the two spanning vectors (27.24) and (27.25).
Then, by linearity, the inner product of the vector with any vector $u$ in the subspace reduces to a weighted sum of its inner products with the two spanning vectors, both of which vanish. Therefore the vector is orthogonal to the subspace (27.150).
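The linearity argument above can be checked numerically: it suffices to verify that the candidate vector has zero inner product with each spanning vector. The vectors below are arbitrary stand-ins for the spanning vectors and the candidate vector of the example.

```python
import numpy as np

# Arbitrary stand-ins: two spanning vectors and a candidate vector in R^3
u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 0.0])
w = np.array([0.0, 0.0, 5.0])

# w is orthogonal to each spanning vector ...
assert np.isclose(w @ u1, 0.0) and np.isclose(w @ u2, 0.0)

# ... hence, by linearity, to every vector a1*u1 + a2*u2 in the span (27.150)
a1, a2 = 2.5, -1.3
assert np.isclose(w @ (a1 * u1 + a2 * u2), 0.0)
```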
The notion of orthogonality to a space (27.150) naturally yields the definition of the orthogonal projection of a vector $v$ onto the linear space $U$ as the only vector $P_{U} v$ in $U$ that gives rise to an orthogonal residual,
$$\langle v - P_{U} v, u \rangle = 0, \tag{27.156}$$
for all $u \in U$.
In particular, if $U$ is spanned by vectors $u_1, \ldots, u_k$, the projection equation (27.156) implies that the coefficients $\alpha_1, \ldots, \alpha_k$ of the projection $P_{U} v = \sum_{j=1}^{k} \alpha_j u_j$ solve the normal equations
$$\sum_{j=1}^{k} \langle u_i, u_j \rangle \, \alpha_j = \langle u_i, v \rangle, \quad i = 1, \ldots, k. \tag{27.159}$$
Example 27.35. Continuing from Example 27.34, we consider the linear subspace (27.152) spanned by the linearly independent vectors (27.24) and (27.25). We can compute the orthogonal projection of the vector (27.18) onto this subspace using the formula (27.159). Applying it to our problem, we first obtain
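As a sketch of the computation, the normal equations (27.159) can be solved directly for the dot product; the spanning vectors (columns of B) and the projected vector below are arbitrary stand-ins for (27.24)-(27.25) and (27.18).

```python
import numpy as np

# Arbitrary stand-ins: spanning vectors as columns of B, target vector v
B = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
v = np.array([1.0, 2.0, 3.0])

# Normal equations (27.159) under the dot product:
# G a = b with G[i, j] = <u_i, u_j> and b[i] = <u_i, v>
G = B.T @ B
b = B.T @ v
a = np.linalg.solve(G, b)      # coefficients of the projection
p = B @ a                      # orthogonal projection of v onto span(B)

# The residual v - p is orthogonal to the subspace (27.156)
assert np.allclose(B.T @ (v - p), 0.0)
```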