
27.3 Inner product spaces

Key points

  • An inner product (27.123) is a special function that induces a rich geometry on a vector space (27.3), including length (27.131), distance (27.136) and angle (27.144).
  • The inner product leads to the notion of orthogonality (27.150), which in turn leads to orthogonal projection or best prediction (27.173).

In this section we review basic notions of geometry that hold for inner product spaces [W].

An inner product space is a vector space (27.3) $V$ with an associated inner product $\langle \cdot\,,\cdot \rangle$. An inner product takes as input any two elements of the vector space and outputs a real number, i.e.

$\langle \cdot\,,\cdot \rangle : V \times V \to \mathbb{R}$  (27.123)

To be an inner product, (27.123) must satisfy the following properties for any vectors $u, v, w \in V$ and any scalar $c \in \mathbb{R}$:

1.
Symmetry
$\langle v, w \rangle = \langle w, v \rangle$  (27.124)
2.
Linearity
$\langle c\,v, w \rangle = c\,\langle v, w \rangle$  (27.125)
$\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle$  (27.126)
3.
Positive definiteness
$\langle v, v \rangle \ge 0$, with $\langle v, v \rangle = 0$ if and only if $v = 0$  (27.127)

A commonly used inner product on the vector space $\mathbb{R}^n$ is the dot product [W], which is defined as

$\boldsymbol{v} \cdot \boldsymbol{w} \equiv \boldsymbol{v}^{\top}\boldsymbol{w} = \sum_{i=1}^{n} v_i w_i$  (27.128)

where we recall that vectors in $\mathbb{R}^n$ are commonly denoted using the bold notation $\boldsymbol{v}$.
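
As a quick numerical sketch, the dot product (27.128) can be computed with NumPy as below; the vectors here are hypothetical placeholders, not the specific vectors (27.24)-(27.25) used in the examples.

```python
import numpy as np

# Placeholder vectors; the specific entries of (27.24)-(27.25) are not used here.
v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 0.0, -1.0])

# Dot product (27.128): sum of element-wise products.
dot_vw = v @ w  # equivalently np.dot(v, w) or (v * w).sum()
print(dot_vw)   # 1*4 + 2*0 + 3*(-1) = 1.0
```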

Example 27.27. We continue from Example 27.11. Consider the vectors $\boldsymbol{v}$ and $\boldsymbol{w}$ in (27.24)-(27.25). Their dot product (27.128) is calculated as

(27.129)

We can generalize the dot product to the Mahalanobis inner product by introducing a full-rank (27.65) scaling matrix $\boldsymbol{\sigma}$ so that

$\langle \boldsymbol{v}, \boldsymbol{w} \rangle_{\boldsymbol{\sigma}} \equiv (\boldsymbol{\sigma}^{-1}\boldsymbol{v})^{\top}(\boldsymbol{\sigma}^{-1}\boldsymbol{w}) = \boldsymbol{v}^{\top}(\boldsymbol{\sigma}\boldsymbol{\sigma}^{\top})^{-1}\boldsymbol{w}$  (27.130)

for $\boldsymbol{v}, \boldsymbol{w} \in \mathbb{R}^n$. In fact, on finite-dimensional inner product spaces, every inner product can be expressed in terms of a Mahalanobis inner product (27.130) on the vector coordinates; see Section 27.6 for details.
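
Below is a minimal sketch of the Mahalanobis inner product (27.130), assuming a hypothetical full-rank scaling matrix sigma; with the identity scaling it reduces to the plain dot product (27.128).

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])  # placeholder vectors, as above
w = np.array([4.0, 0.0, -1.0])

# A hypothetical full-rank (invertible) scaling matrix sigma.
sigma = np.array([[2.0, 0.0, 0.0],
                  [0.5, 1.0, 0.0],
                  [0.0, 0.0, 3.0]])

# Mahalanobis inner product (27.130): dot product of the rescaled vectors.
v_tilde = np.linalg.solve(sigma, v)  # sigma^{-1} v, without forming the inverse
w_tilde = np.linalg.solve(sigma, w)
print(v_tilde @ w_tilde)

# Sanity check: with sigma = identity we recover the dot product (27.128).
assert np.isclose(np.linalg.solve(np.eye(3), v) @ np.linalg.solve(np.eye(3), w),
                  v @ w)
```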

Note also that, while we have restricted our attention to vector spaces over the real numbers for simplicity, the concepts in this section extend easily to vector spaces over arbitrary fields [W], for example the field of complex numbers. In that case the scalars are elements of the chosen field, rather than the real numbers.

27.3.1 Length, distance and angle

Given an inner product (27.123), we can define the associated length, or norm, of a generic vector $v \in V$ (a notion we introduce in more generality later in (27.194)) as

$\|v\| \equiv \sqrt{\langle v, v \rangle}$  (27.131)

In our simple example of the real vector space $\mathbb{R}^n$ with the dot product (27.128), this length corresponds to the standard Euclidean norm

$\|\boldsymbol{v}\| = \sqrt{\boldsymbol{v}^{\top}\boldsymbol{v}} = \sqrt{\sum_{i=1}^{n} v_i^2}$  (27.132)
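
The sketch below checks numerically that the norm induced by the dot product via (27.131) coincides with NumPy's built-in Euclidean norm (27.132); the vector is again a placeholder.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])  # placeholder vector

# Norm induced by the inner product (27.131): ||v|| = sqrt(<v, v>).
norm_from_ip = np.sqrt(v @ v)

# Standard Euclidean norm (27.132), computed directly by NumPy.
assert np.isclose(norm_from_ip, np.linalg.norm(v))
```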

Example 27.28. Continuing from Example 27.27, we can calculate the standard Euclidean norm (27.132) of each of the vectors (27.24)-(27.25). The calculation for $\boldsymbol{v}$ is

(27.133)

and similarly for $\boldsymbol{w}$

(27.134)

We can similarly define the more general Mahalanobis norm on $\mathbb{R}^n$, induced by the Mahalanobis inner product (27.130) as in (27.131), giving

$\|\boldsymbol{v}\|_{\boldsymbol{\sigma}} \equiv \sqrt{\langle \boldsymbol{v}, \boldsymbol{v} \rangle_{\boldsymbol{\sigma}}} = \|\boldsymbol{\sigma}^{-1}\boldsymbol{v}\|$  (27.135)

for $\boldsymbol{v} \in \mathbb{R}^n$, where $\|\cdot\|$ is the standard Euclidean norm (27.132) and $\boldsymbol{\sigma}$ the full-rank scaling matrix. On finite-dimensional inner product spaces, any norm induced by an inner product (27.131) can be expressed as the Mahalanobis norm (27.135) of the vector coordinates; see Section 27.6 for details.
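
A sketch of the Mahalanobis norm (27.135), reusing the hypothetical scaling matrix sigma from above: it is simply the Euclidean norm of the rescaled vector.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
sigma = np.array([[2.0, 0.0, 0.0],
                  [0.5, 1.0, 0.0],
                  [0.0, 0.0, 3.0]])

# Mahalanobis norm (27.135): Euclidean norm (27.132) of sigma^{-1} v.
norm_sigma = np.linalg.norm(np.linalg.solve(sigma, v))

# Consistency with the Mahalanobis inner product (27.130) via (27.131).
v_tilde = np.linalg.solve(sigma, v)
assert np.isclose(norm_sigma, np.sqrt(v_tilde @ v_tilde))
```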

An inner product also induces a distance (a notion we introduce in more generality later in (27.211)) between two generic vectors via the length (27.131) of their difference, explicitly

$d(v, w) \equiv \|v - w\|$  (27.136)

for $v, w \in V$. In the example of the real vector space $\mathbb{R}^n$ with the dot product (27.128), this distance (27.136) corresponds to the standard Euclidean distance

$d(\boldsymbol{v}, \boldsymbol{w}) = \|\boldsymbol{v} - \boldsymbol{w}\| = \sqrt{\sum_{i=1}^{n} (v_i - w_i)^2}$  (27.137)
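
Numerically, the induced distance (27.136)-(27.137) is just the norm of the difference, as the sketch below shows for two placeholder vectors.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 0.0, -1.0])

# Distance induced by the norm (27.136): length of the difference vector.
dist = np.linalg.norm(v - w)

# Agrees with the coordinate formula for the Euclidean distance (27.137).
assert np.isclose(dist, np.sqrt(((v - w) ** 2).sum()))
```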

Example 27.29. Continuing from Example 27.28, we can calculate the standard Euclidean distance (27.137) between the vectors (27.24)-(27.25). The distance between $\boldsymbol{v}$ and $\boldsymbol{w}$ is calculated as

(27.138)

We can generalize the Euclidean distance (27.137) to the Mahalanobis distance on $\mathbb{R}^n$, induced by the Mahalanobis inner product (27.130) as in (27.136), giving

$d_{\boldsymbol{\sigma}}(\boldsymbol{v}, \boldsymbol{w}) \equiv \|\boldsymbol{v} - \boldsymbol{w}\|_{\boldsymbol{\sigma}} = \|\boldsymbol{\sigma}^{-1}(\boldsymbol{v} - \boldsymbol{w})\|$  (27.139)

for $\boldsymbol{v}, \boldsymbol{w} \in \mathbb{R}^n$, where $\|\cdot\|$ is the standard Euclidean norm (27.132) and $\boldsymbol{\sigma}$ the full-rank scaling matrix. Note that the Mahalanobis distance (27.139) reduces to the standard Euclidean distance (27.137) when $\boldsymbol{\sigma}$ is the identity matrix. The Mahalanobis distance (27.139) allows us to generalize the absolute z-score (35.40) to the multivariate framework of an $n$-dimensional random variable, yielding the multivariate absolute z-score (35.41), as discussed in Section 35.2.

On finite-dimensional inner product spaces, any distance induced by an inner product (27.136) can be expressed as a Mahalanobis distance (27.139) on the coordinates; see Section 27.6 for details.
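
A sketch of the Mahalanobis distance (27.139) with the same hypothetical sigma; setting sigma to the identity recovers the Euclidean distance (27.137).

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 0.0, -1.0])
sigma = np.array([[2.0, 0.0, 0.0],
                  [0.5, 1.0, 0.0],
                  [0.0, 0.0, 3.0]])

# Mahalanobis distance (27.139): Euclidean length of sigma^{-1} (v - w).
d_sigma = np.linalg.norm(np.linalg.solve(sigma, v - w))
print(d_sigma)

# With the identity scaling, (27.139) reduces to the Euclidean distance (27.137).
assert np.isclose(np.linalg.norm(np.linalg.solve(np.eye(3), v - w)),
                  np.linalg.norm(v - w))
```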

From the relationship between length (27.131) and inner product (27.123) we have the polarization identity [W] E.32.7

$\langle v, w \rangle = \tfrac{1}{2}\left(\|v\|^2 + \|w\|^2 - \|v - w\|^2\right)$  (27.140)
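
The polarization identity (27.140) is easy to verify numerically: the sketch below recovers the dot product of two random vectors from norms alone.

```python
import numpy as np

rng = np.random.default_rng(0)
v, w = rng.normal(size=3), rng.normal(size=3)

# Polarization identity (27.140): inner product recovered from norms alone.
lhs = v @ w
rhs = 0.5 * (np.linalg.norm(v) ** 2 + np.linalg.norm(w) ** 2
             - np.linalg.norm(v - w) ** 2)
assert np.isclose(lhs, rhs)
```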

Example 27.30. Continuing from Example 27.29, we can verify the polarization identity (27.140) for the pair of vectors (27.24)-(27.25). Indeed, from our previous calculations of the norms (27.133) and (27.134) and the distance (27.138)

(27.141)

which agrees with our previous calculation of the inner product (27.129).

The inner product (27.123) and length (27.131) also satisfy the Cauchy-Schwarz inequality [W]

$|\langle v, w \rangle| \le \|v\|\,\|w\|$  (27.142)

where equality holds if and only if the vectors are linearly dependent.
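
The sketch below checks the Cauchy-Schwarz inequality (27.142) on random vectors, and the equality case on a linearly dependent pair.

```python
import numpy as np

rng = np.random.default_rng(1)
v, w = rng.normal(size=3), rng.normal(size=3)

# Cauchy-Schwarz inequality (27.142): |<v, w>| <= ||v|| ||w||.
assert abs(v @ w) <= np.linalg.norm(v) * np.linalg.norm(w)

# Equality holds for linearly dependent vectors, e.g. w = 2 v.
assert np.isclose(abs(v @ (2.0 * v)),
                  np.linalg.norm(v) * np.linalg.norm(2.0 * v))
```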

Example 27.31. Continuing from Example 27.30, we can verify the Cauchy-Schwarz inequality (27.142) for the pair of vectors (27.24)-(27.25). Indeed, from our previous calculations of the norms (27.133) and (27.134) and the inner product (27.129), we see that

(27.143)

as the Cauchy-Schwarz inequality (27.142) states.

From the inner product (27.123) and the length (27.131) we can define the angle [W] between two generic non-zero vectors $v, w \in V$ as

$\theta(v, w) \equiv \arccos\!\left(\frac{\langle v, w \rangle}{\|v\|\,\|w\|}\right)$  (27.144)

This is well-defined, since the Cauchy-Schwarz inequality (27.142) guarantees that $-1 \le \frac{\langle v, w \rangle}{\|v\|\,\|w\|} \le 1$.
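
A minimal sketch of the angle formula (27.144); the clip guards against the argument of arccos drifting just outside $[-1, 1]$ due to floating-point round-off.

```python
import numpy as np

v = np.array([1.0, 0.0])
w = np.array([1.0, 1.0])

# Angle (27.144): arccos of the normalized inner product.
cos_theta = (v @ w) / (np.linalg.norm(v) * np.linalg.norm(w))
theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
print(np.degrees(theta))  # 45.0 degrees between these two placeholder vectors
```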

Example 27.32. Continuing from Example 27.31, we can calculate the standard Euclidean angle between the pair of vectors (27.24)-(27.25). Using the previously calculated inner product (27.129) and lengths (27.133) and (27.134), the angle between $\boldsymbol{v}$ and $\boldsymbol{w}$ is calculated via (27.144) as

(27.145)

The notion of angle (27.144) between two vectors naturally yields the notion of orthogonality: two vectors $v, w \in V$ are orthogonal with respect to the inner product (27.123) if their inner product is zero

$v \perp w \;\Leftrightarrow\; \langle v, w \rangle = 0$  (27.146)

Care must be taken when determining the orthogonality of vectors using their coordinates, for details see Section 27.6.1.

Example 27.33. Continuing from Example 27.32, we recall that the inner product of the vectors (27.24)-(27.25) is the non-zero value (27.129). Therefore $\boldsymbol{v}$ and $\boldsymbol{w}$ are not orthogonal (27.146).
Consider now the vector $\boldsymbol{u}$ defined in (27.19). Then

(27.147)

therefore these two vectors are orthogonal (27.146). Note that the zero inner product (27.147) implies that the angle between them is $\pi/2$, i.e. $90°$.
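
Orthogonality (27.146) is likewise a one-line check numerically; the two placeholder vectors below have zero dot product, hence an angle of $\pi/2$.

```python
import numpy as np

# Two placeholder vectors with zero dot product are orthogonal (27.146).
u = np.array([1.0, -1.0, 0.0])
v = np.array([1.0, 1.0, 5.0])
assert np.isclose(u @ v, 0.0)

# Their angle (27.144) is therefore pi/2.
theta = np.arccos((u @ v) / (np.linalg.norm(u) * np.linalg.norm(v)))
assert np.isclose(theta, np.pi / 2)
```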

In an inner product space, we can define an orthonormal set $\{e^{(1)}, \dots, e^{(k)}\}$ as a set of vectors which are orthogonal (27.146) to each other and have length (27.131) equal to $1$, or more succinctly

$\langle e^{(i)}, e^{(j)} \rangle = \delta^{(ij)}$  (27.148)

where $\delta^{(ij)}$ is the Kronecker delta, equal to $1$ if $i = j$ and $0$ otherwise.

Note that an orthonormal set (27.148) is also linearly independent (27.9); in fact, in more generality E.32.9

$v^{(i)} \ne 0,\ v^{(i)} \perp v^{(j)}\ (i \ne j) \;\Rightarrow\; \{v^{(1)}, \dots, v^{(k)}\} \text{ linearly independent}$  (27.149)
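
One convenient way to produce an orthonormal set (27.148) numerically is the QR decomposition: the columns of Q below are orthonormal, so their matrix of pairwise inner products is the identity.

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.normal(size=(4, 3))  # random matrix, full column rank almost surely

# The columns of q form an orthonormal set (27.148).
q, _ = np.linalg.qr(a)

# Pairwise inner products equal the Kronecker delta: q^T q = identity.
assert np.allclose(q.T @ q, np.eye(3))
```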

27.3.2 Orthogonal projection

A vector $v \in V$ is orthogonal to a linear subspace $\mathcal{S} \subseteq V$ (27.32) [W] if it is orthogonal to every vector in that subspace, i.e.

$v \perp \mathcal{S} \;\Leftrightarrow\; \langle v, s \rangle = 0 \ \text{for all } s \in \mathcal{S}$  (27.150)

Note that, by the definition of the angle (27.144), the orthogonality (27.146) of two vectors is equivalent to the vectors being perpendicular to each other

$v \perp w \;\Leftrightarrow\; \theta(v, w) = \pi/2$  (27.151)

Then, for a vector to be orthogonal to a linear subspace $\mathcal{S}$, it must be perpendicular to the surface defined by $\mathcal{S}$.

Example 27.34. Continuing from Example 27.33, we consider the vectors (27.24)-(27.25)-(27.19). We can define a linear subspace (27.32) of $\mathbb{R}^n$ as

$\mathcal{S} \equiv \mathrm{span}\{\boldsymbol{v}, \boldsymbol{w}\}$  (27.152)

From the definition of the span (27.8), we can write any vector $\boldsymbol{s} \in \mathcal{S}$ as a linear combination of $\boldsymbol{v}$ and $\boldsymbol{w}$,

$\boldsymbol{s} = c_1 \boldsymbol{v} + c_2 \boldsymbol{w}$  (27.153)

Then the inner product of the vector $\boldsymbol{u}$ with any vector $\boldsymbol{s} \in \mathcal{S}$ is

$\langle \boldsymbol{u}, \boldsymbol{s} \rangle = c_1 \langle \boldsymbol{u}, \boldsymbol{v} \rangle + c_2 \langle \boldsymbol{u}, \boldsymbol{w} \rangle = 0$  (27.154)

Therefore the vector $\boldsymbol{u}$ is orthogonal to the subspace $\mathcal{S}$ (27.150).

The notion of orthogonality to a space (27.150) naturally yields the definition of the orthogonal projection of a vector $v \in V$ onto the linear subspace $\mathcal{S}$ as the only vector in $\mathcal{S}$ that gives rise to an orthogonal residual

$P_{\mathcal{S}}\, v \in \mathcal{S} \quad \text{such that} \quad (v - P_{\mathcal{S}}\, v) \perp \mathcal{S}$  (27.155)

To write the orthogonal projection (27.155) of a vector onto a linear subspace (27.32) explicitly, we observe that the definition (27.155) is equivalent to the projection equation below

$\langle v - P_{\mathcal{S}}\, v,\, s \rangle = 0$  (27.156)

for all $s \in \mathcal{S}$.

In particular, if $\mathcal{S}$ is spanned by the vectors $s^{(1)}, \dots, s^{(K)}$, the projection equation (27.156) implies that

$P_{\mathcal{S}}\, v = \sum_{k=1}^{K} \beta_k\, s^{(k)}$  (27.157)

where the coefficients $\beta_1, \dots, \beta_K$, also known as projection coefficients, solve the beta conditions

$\sum_{l=1}^{K} \langle s^{(k)}, s^{(l)} \rangle\, \beta_l = \langle s^{(k)}, v \rangle, \qquad k = 1, \dots, K$  (27.158)

Note that the spanning vectors do not need to be linearly independent (27.9). If they are, then (27.158) has a unique solution, obtained via the inverse Gramian

$\boldsymbol{\beta} = \boldsymbol{g}^{-1} \boldsymbol{c}, \qquad g_{kl} \equiv \langle s^{(k)}, s^{(l)} \rangle, \quad c_k \equiv \langle s^{(k)}, v \rangle$  (27.159)

If the spanning vectors are not linearly independent, then (27.158) has infinitely many solutions, but they all identify the same vector (27.157), because the orthogonal projection (27.155) is unique.
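
The sketch below implements the beta conditions (27.158)-(27.159) for a hypothetical subspace spanned by the columns of a matrix S, and checks that the residual is orthogonal to every spanning vector, as (27.155)-(27.156) require.

```python
import numpy as np

rng = np.random.default_rng(3)
S = rng.normal(size=(5, 2))   # columns span the subspace (placeholder data)
v = rng.normal(size=5)        # vector to project

# Beta conditions (27.158): Gramian g and right-hand side c.
g = S.T @ S                   # Gramian of the spanning vectors
c = S.T @ v
beta = np.linalg.solve(g, c)  # unique solution (27.159): columns independent

# Orthogonal projection (27.157) as a linear combination of spanning vectors.
p = S @ beta

# Residual v - p is orthogonal to the subspace, per (27.155)-(27.156).
assert np.allclose(S.T @ (v - p), 0.0)
```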

Example 27.35. Continuing from Example 27.34, we consider the linear subspace (27.152) spanned by the linearly independent vectors (27.24) and (27.25). We can compute the orthogonal projection of the vector (27.18) onto $\mathcal{S}$ using the formula (27.159). Applying it to our problem, we first obtain

(27.160)

Then, we use the previously calculated results (27.129)-(27.133)-(27.134) to calculate the inverse Gramian