28.3 Inner product spaces

Key points

  • An inner product (28.133) is a special function that induces a rich geometry on a vector space (28.1)-(28.2), including length (28.148), distance (28.153) and angle (28.161).
  • The inner product leads to the notion of orthogonality (28.167), which in turn leads to orthogonal projection, or best prediction (28.190).

In this section we review basic notions of geometry that hold for inner product spaces [W].

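An inner product space is a vector space $\mathcal{V}$ (28.1)-(28.2) with an associated inner product $\langle \cdot, \cdot \rangle$. An inner product takes as input any two elements of the vector space and outputs a real number, i.e.

$$\langle \cdot, \cdot \rangle : \mathcal{V} \times \mathcal{V} \to \mathbb{R} \tag{28.133}$$

To be an inner product, (28.133) must satisfy the following properties for any vectors $u, v, w \in \mathcal{V}$ and any scalar $c \in \mathbb{R}$:

1. Symmetry
$$\langle v, w \rangle = \langle w, v \rangle \tag{28.134}$$
2. Linearity
$$\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle \tag{28.135}$$
$$\langle c v, w \rangle = c \langle v, w \rangle \tag{28.136}$$
3. Positive definiteness
$$\langle v, v \rangle \geq 0, \quad \text{with } \langle v, v \rangle = 0 \text{ if and only if } v = 0 \tag{28.137}$$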

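A commonly used inner product on the vector space $\mathbb{R}^n$ is the dot product [W], which is defined as

$$\langle \mathbf{v}, \mathbf{w} \rangle \equiv \mathbf{v}' \mathbf{w} = \sum_{i=1}^{n} v_i w_i \tag{28.138}$$

where recall vectors in $\mathbb{R}^n$ are commonly denoted using the bold notation $\mathbf{v}$.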
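As a minimal sketch, the dot product (28.138) and the defining properties (28.134)-(28.137) can be checked numerically. The vectors `u`, `v`, `w` and the scalar `c` below are hypothetical values chosen only for illustration; they are not the vectors (28.27)-(28.28).

```python
def dot(v, w):
    """Dot product of two equal-length real vectors given as lists."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same length")
    return sum(vi * wi for vi, wi in zip(v, w))


# Hypothetical vectors and scalar, chosen only for illustration.
u, v, w = [1.0, 2.0, -1.0], [0.0, 3.0, 4.0], [2.0, -1.0, 5.0]
c = 2.5

# Symmetry: <v, w> equals <w, v>.
symmetric = dot(v, w) == dot(w, v)

# Linearity in the first argument: additivity and homogeneity.
u_plus_v = [ui + vi for ui, vi in zip(u, v)]
additive = dot(u_plus_v, w) == dot(u, w) + dot(v, w)
homogeneous = dot([c * vi for vi in v], w) == c * dot(v, w)

# Positive definiteness: <v, v> > 0 for v != 0, and <0, 0> = 0.
zero = [0.0, 0.0, 0.0]
positive = dot(v, v) > 0 and dot(zero, zero) == 0
```

The exact equality checks work here only because the chosen numbers are exactly representable in floating point; for general inputs a tolerance such as `math.isclose` would be the safer choice.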

Example 28.28. Dot product.
We continue from Example 28.11. Consider the vectors (28.27)-(28.28). The dot product (28.138) of these two vectors is calculated as

(28.139)

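We can generalize the dot product to the Mahalanobis inner product by introducing a full-rank (28.75) scaling matrix $\mathbf{s}$ so that

$$\langle \mathbf{v}, \mathbf{w} \rangle_{\mathbf{s}} \equiv \mathbf{v}' (\mathbf{s}\mathbf{s}')^{-1} \mathbf{w} \tag{28.140}$$

for $\mathbf{v}, \mathbf{w} \in \mathbb{R}^n$. In fact, on finite-dimensional inner product spaces, every inner product can be expressed in terms of a Mahalanobis inner product (28.140) on the vector coordinates; see Section 28.6 for details.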
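The following sketch illustrates a Mahalanobis inner product under two stated assumptions: the convention $\mathbf{v}'(\mathbf{s}\mathbf{s}')^{-1}\mathbf{w}$ for the scaling matrix $\mathbf{s}$, and a diagonal $\mathbf{s}$, so that the inverse reduces to elementwise division. The function name `mahalanobis_inner` and the numbers are hypothetical.

```python
def mahalanobis_inner(v, w, s_diag):
    """Compute v' (s s')^{-1} w for a diagonal scaling matrix s = diag(s_diag).

    Assumption: s is diagonal, so (s s')^{-1} is diagonal with entries 1 / s_i^2.
    """
    if not (len(v) == len(w) == len(s_diag)):
        raise ValueError("v, w and s_diag must have the same length")
    if any(si == 0 for si in s_diag):
        raise ValueError("a full-rank diagonal matrix has no zero diagonal entry")
    return sum(vi * wi / (si * si) for vi, wi, si in zip(v, w, s_diag))


# Hypothetical inputs, chosen only for illustration.
v, w = [2.0, 4.0], [1.0, 2.0]
scaled = mahalanobis_inner(v, w, [2.0, 4.0])   # 2*1/4 + 4*2/16
plain = mahalanobis_inner(v, w, [1.0, 1.0])    # identity scaling: the dot product
```

With the identity as scaling matrix the result coincides with the dot product, which is a quick sanity check on the convention used.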

Example 28.29. Mahalanobis inner product.
We continue from Example 28.28, where we considered the vectors (28.27)-(28.28). Consider the scaling matrix

(28.141)

Using the formula for the determinant of a diagonal matrix (28.109), we see that

(28.142)

Thus the scaling matrix (28.141) is invertible (28.110) and full-rank (28.75). By matrix multiplication (28.89) we obtain

(28.143)

which has inverse

(28.144)

The Mahalanobis inner product (28.140) of the vectors (28.27)-(28.28) with the scaling matrix (28.141) is calculated as

(28.145)

Note also that, while we have restricted our attention to vector spaces over the real numbers for simplicity, the concepts in this section extend to vector spaces over other fields [W], most notably the field of complex numbers. In that case the scalars are elements of the chosen field rather than real numbers, and over the complex numbers the symmetry property (28.134) is replaced by conjugate symmetry.

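In an inner product space $\mathcal{V}$, any linear operator (28.51)-(28.52) that maps $\mathcal{V}$ to $\mathbb{R}$ can be expressed uniquely using the inner product $\langle \cdot, \cdot \rangle$. First note that the symmetry (28.134) and linearity (28.135)-(28.136) of the inner product imply that, for any fixed vector $w \in \mathcal{V}$, the inner product $\langle w, v \rangle$ is a linear function of $v$, i.e.

$$\langle w, c_1 v_1 + c_2 v_2 \rangle = c_1 \langle w, v_1 \rangle + c_2 \langle w, v_2 \rangle \tag{28.146}$$

for $v_1, v_2 \in \mathcal{V}$ and $c_1, c_2 \in \mathbb{R}$. This suggests how we may use the inner product to represent a linear operator that maps $\mathcal{V}$ to $\mathbb{R}$.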

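Indeed, for any linear operator $\mathcal{A}$ (28.51)-(28.52) from an inner product space $\mathcal{V}$ to the real numbers $\mathbb{R}$, there exists a unique vector $w_{\mathcal{A}} \in \mathcal{V}$ such that (see E.32.7)

$$\mathcal{A}[v] = \langle w_{\mathcal{A}}, v \rangle \tag{28.147}$$

for all $v \in \mathcal{V}$.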
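In the special case of $\mathbb{R}^n$ with the dot product, the representing vector can be read off by applying the operator to the standard basis vectors. The sketch below assumes this finite-dimensional setting; `riesz_vector` and `functional` are hypothetical names, and the functional itself is an arbitrary illustration.

```python
def dot(v, w):
    """Dot product of two equal-length real vectors given as lists."""
    return sum(vi * wi for vi, wi in zip(v, w))


def riesz_vector(functional, n):
    """Return the vector w such that functional(v) == dot(w, v) on R^n.

    Assumption: `functional` is linear; its i-th coordinate in w is
    the value of the functional on the i-th standard basis vector.
    """
    basis = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return [functional(e) for e in basis]


def functional(v):
    """A hypothetical linear map from R^3 to R, used only as an example."""
    return 3.0 * v[0] - v[1] + 0.5 * v[2]


w_repr = riesz_vector(functional, 3)
v = [2.0, 4.0, 6.0]
direct = functional(v)        # value computed from the operator itself
via_inner = dot(w_repr, v)    # same value computed from the inner product
```

Uniqueness is visible in the construction: any vector that represents the functional must agree with it on each basis vector, which pins down every coordinate.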

The identification between inner products and linear operators which map to the real line (28.147) generalizes to the Riesz representation theorem (30.59), which is discussed in more detail in Section 30.3.3. This result lies at the foundation of linear pricing theory (21a.26), which is covered in depth in Chapter 21.

28.3.1 Length, distance and angle

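Given an inner product (28.133), we can define the associated length, or norm, of a generic vector $v \in \mathcal{V}$, which we introduce in more generality later in (28.211), as

$$\| v \| \equiv \sqrt{\langle v, v \rangle} \tag{28.148}$$

In our simple example of the real vector space $\mathbb{R}^n$ with the dot product (28.138), this length corresponds to the standard Euclidean norm

$$\| \mathbf{v} \| = \sqrt{\mathbf{v}' \mathbf{v}} = \sqrt{\sum_{i=1}^{n} v_i^2} \tag{28.149}$$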

Example 28.30. Standard Euclidean norm.
Continuing from Example 28.28, the standard Euclidean norm (28.149) of the vector (28.27) is

(28.150)

Similarly, for (28.28) we obtain

(28.151)

We can similarly define the more general Mahalanobis norm on $\mathbb{R}^n$, induced by the Mahalanobis inner product (28.140) as in (28.148), giving

$$\| \mathbf{v} \|_{\mathbf{s}} \equiv \sqrt{\mathbf{v}' (\mathbf{s}\mathbf{s}')^{-1} \mathbf{v}} = \| \mathbf{s}^{-1} \mathbf{v} \| \tag{28.152}$$

for $\mathbf{v} \in \mathbb{R}^n$, where $\| \cdot \|$ is the standard Euclidean norm (28.149) and $\mathbf{s}$ is the full-rank scaling matrix. On finite-dimensional inner product spaces, any norm induced by an inner product (28.148) can be expressed as the Mahalanobis norm (28.152) of the vector coordinates; see Section 28.6 for details.
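A short sketch of the two norms follows. It assumes a diagonal scaling matrix and the convention that the Mahalanobis norm is the Euclidean norm of the rescaled vector $\mathbf{s}^{-1}\mathbf{v}$; the function names and the numbers are hypothetical.

```python
import math


def euclidean_norm(v):
    """Length induced by the dot product: the square root of <v, v>."""
    return math.sqrt(sum(vi * vi for vi in v))


def mahalanobis_norm(v, s_diag):
    """Euclidean norm of s^{-1} v, assuming s = diag(s_diag) with nonzero entries."""
    if any(si == 0 for si in s_diag):
        raise ValueError("a full-rank diagonal matrix has no zero diagonal entry")
    return euclidean_norm([vi / si for vi, si in zip(v, s_diag)])


# Hypothetical vector and scaling, chosen only for illustration.
v = [3.0, 4.0]
plain = euclidean_norm(v)
scaled = mahalanobis_norm(v, [3.0, 2.0])       # norm of [1.0, 2.0]
unscaled = mahalanobis_norm(v, [1.0, 1.0])     # identity scaling
```

Rescaling first and then taking the Euclidean norm avoids forming and inverting the product of the scaling matrix with its transpose.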

Example 28.31. Mahalanobis norm.
Continuing from Example 28.29, the Mahalanobis norm (28.152) of the vector (28.27) is

Similarly, for (28.28) we obtain

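An inner product also induces a distance between two generic vectors, which we introduce in more generality later in (28.228), via the length (28.148) of their difference, explicitly

$$d(v, w) \equiv \| v - w \| \tag{28.153}$$

for $v, w \in \mathcal{V}$. In the example of the real vector space $\mathbb{R}^n$ with the dot product, this distance (28.153) corresponds to the standard Euclidean distance

$$d(\mathbf{v}, \mathbf{w}) = \| \mathbf{v} - \mathbf{w} \| = \sqrt{\sum_{i=1}^{n} (v_i - w_i)^2} \tag{28.154}$$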

Example 28.32. Standard Euclidean distance.
Continuing from Example 28.30, we calculate the standard Euclidean distance (28.154) between the vectors (28.27)-(28.28) as

(28.155)

We can generalize the Euclidean distance (28.154) to the Mahalanobis distance on $\mathbb{R}^n$, induced by the Mahalanobis inner product (28.140) as in (28.153), giving

$$d_{\mathbf{s}}(\mathbf{v}, \mathbf{w}) \equiv \| \mathbf{v} - \mathbf{w} \|_{\mathbf{s}} = \| \mathbf{s}^{-1} (\mathbf{v} - \mathbf{w}) \| \tag{28.156}$$

for $\mathbf{v}, \mathbf{w} \in \mathbb{R}^n$, where $\| \cdot \|$ is the standard Euclidean norm (28.149) and $\mathbf{s}$ is the full-rank scaling matrix. Note that the Mahalanobis distance (28.156) reduces to the standard Euclidean distance (28.154) when $\mathbf{s}$ is the identity matrix. The Mahalanobis distance (28.156) allows us to generalize the absolute z-score (36.41) to the multivariate framework of an $n$-dimensional random variable, yielding the multivariate absolute z-score (36.42), as discussed in Section 36.2.

On finite-dimensional inner product spaces, any distance induced by an inner product (28.153) can be expressed as a Mahalanobis distance (28.156) on the coordinates; see Section 28.6 for details.
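The sketch below computes a Mahalanobis distance as the Euclidean norm of the rescaled difference, again assuming a diagonal scaling matrix. The one-dimensional case illustrates the link with the absolute z-score, with the scaling entry playing the role of a standard deviation; all names and numbers are hypothetical.

```python
import math


def mahalanobis_distance(v, w, s_diag):
    """Euclidean norm of s^{-1} (v - w), assuming s = diag(s_diag), nonzero entries."""
    if any(si == 0 for si in s_diag):
        raise ValueError("a full-rank diagonal matrix has no zero diagonal entry")
    return math.sqrt(sum(((vi - wi) / si) ** 2 for vi, wi, si in zip(v, w, s_diag)))


# Hypothetical point and reference location, chosen only for illustration.
x, mu = [7.0, 1.0], [1.0, 9.0]
euclid = mahalanobis_distance(x, mu, [1.0, 1.0])   # identity scaling
scaled = mahalanobis_distance(x, mu, [2.0, 4.0])   # sqrt(3^2 + (-2)^2)

# Univariate case: |x - mu| / sigma, i.e. an absolute z-score.
z_abs = mahalanobis_distance([7.0], [1.0], [2.0])
```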

Example 28.33. Mahalanobis distance.
Continuing from Example 28.31, we can calculate the Mahalanobis distance (28.156) between the vectors (28.27)-(28.28) as

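From the relationship between length (28.148) and inner product (28.133) we have the polarization identity [W] (see E.32.8)

$$\langle v, w \rangle = \frac{1}{2} \left( \| v \|^2 + \| w \|^2 - \| v - w \|^2 \right) \tag{28.157}$$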
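As a quick numerical check, the sketch below verifies the polarization identity in the form $\langle v, w \rangle = \frac{1}{2}(\|v\|^2 + \|w\|^2 - \|v - w\|^2)$ for the dot product. The vectors are hypothetical and are not the vectors (28.27)-(28.28).

```python
def dot(v, w):
    """Dot product of two equal-length real vectors given as lists."""
    return sum(vi * wi for vi, wi in zip(v, w))


def norm_squared(v):
    """Squared length <v, v>; no square root is needed for the identity."""
    return dot(v, v)


# Hypothetical vectors, chosen only for illustration.
v, w = [1.0, 2.0, 2.0], [4.0, 0.0, 3.0]
difference = [vi - wi for vi, wi in zip(v, w)]

inner_direct = dot(v, w)
inner_from_lengths = 0.5 * (norm_squared(v) + norm_squared(w) - norm_squared(difference))
```

The identity shows that the inner product can be recovered from lengths alone, which is why a norm induced by an inner product carries the full geometry.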

Example 28.34. Polarization identity.
Continuing from Example 28.32, we can verify the polarization identity (28.157) for the pair of vectors (28.27)-(28.28). Indeed, from our previous calculations of the norms (28.150) and (28.151) and the distance (28.155)

(28.158)

which agrees with our previous calculation of the inner product (28.139).

The inner product (28.133) and length (28.148) also satisfy the Cauchy-Schwarz inequality [W]

$$| \langle v, w \rangle | \leq \| v \| \, \| w \| \tag{28.159}$$

where equality holds if and only if the vectors are linearly dependent.
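The sketch below illustrates the Cauchy-Schwarz inequality for the dot product by computing the gap $\|v\| \|w\| - |\langle v, w \rangle|$, which is non-negative and vanishes for linearly dependent vectors. The function name `cauchy_schwarz_gap` and the vectors are hypothetical.

```python
import math


def dot(v, w):
    """Dot product of two equal-length real vectors given as lists."""
    return sum(vi * wi for vi, wi in zip(v, w))


def cauchy_schwarz_gap(v, w):
    """Return ||v|| ||w|| - |<v, w>|, which the inequality says is >= 0."""
    return math.sqrt(dot(v, v)) * math.sqrt(dot(w, w)) - abs(dot(v, w))


# Hypothetical vectors, chosen only for illustration.
v = [1.0, 2.0, 2.0]
gap_independent = cauchy_schwarz_gap(v, [4.0, 0.0, 3.0])     # strict inequality
gap_dependent = cauchy_schwarz_gap(v, [-2.0, -4.0, -4.0])    # second vector is -2 v
```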

Example 28.35. Cauchy-Schwarz inequality.
Continuing from Example 28.34, we can verify the Cauchy-Schwarz inequality (28.159) for the pair of vectors (28.27)-(28.28). Indeed, from our previous calculations of the norms (28.150) and (28.151) and the inner product (28.139), we see that

(28.160)

as the Cauchy-Schwarz inequality (28.159) states.

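From the inner product (28.133) and the length (28.148) we can define the angle [W] between two generic nonzero vectors $v, w \in \mathcal{V}$ as follows

$$\theta \equiv \arccos \left( \frac{\langle v, w \rangle}{\| v \| \, \| w \|} \right) \tag{28.161}$$

This is well-defined since the Cauchy-Schwarz inequality (28.159) guarantees that $-1 \leq \frac{\langle v, w \rangle}{\| v \| \, \| w \|} \leq 1$.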
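A minimal sketch of the angle computation for the dot product is given below. The clamping step is an implementation choice, not part of the definition: rounding can push the ratio slightly outside $[-1, 1]$, where the arccosine is undefined. The function name and test vectors are hypothetical.

```python
import math


def dot(v, w):
    """Dot product of two equal-length real vectors given as lists."""
    return sum(vi * wi for vi, wi in zip(v, w))


def angle(v, w):
    """Angle in radians between two nonzero vectors, from inner product and lengths."""
    norm_v, norm_w = math.sqrt(dot(v, v)), math.sqrt(dot(w, w))
    if norm_v == 0.0 or norm_w == 0.0:
        raise ValueError("the angle is undefined for the zero vector")
    cosine = dot(v, w) / (norm_v * norm_w)
    # Guard against rounding errors that leave the interval [-1, 1].
    cosine = max(-1.0, min(1.0, cosine))
    return math.acos(cosine)


# Hypothetical vectors, chosen only for illustration.
right_angle = angle([1.0, 0.0], [0.0, 1.0])
diagonal_angle = angle([1.0, 0.0], [1.0, 1.0])
```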

Example 28.36. Angle.
Continuing from Example 28.35, we can calculate the standard Euclidean angle between the pair of vectors (28.27)-(28.28). Using the previously calculated inner product (28.139) and lengths (28.150) and (28.151), the angle (28.161) between the two vectors is calculated as

(28.162)

The notion of angle (28.161) between two vectors naturally yields the notion of orthogonality: two vectors $v, w \in \mathcal{V}$ are orthogonal with respect to the inner product (28.133) if their inner product is zero

$$\langle v, w \rangle = 0 \tag{28.163}$$

Care must be taken when determining the orthogonality of vectors using their coordinates; for details see Section 28.6.1.

Example 28.37. Orthogonality.
Continuing from Example 28.36, we recall that the inner product of (28.27)-(28.28) is (28.139), which is nonzero. Therefore these two vectors are not orthogonal (28.163).
Consider the vector (28.22). Then

(28.164)

therefore these two vectors are orthogonal (28.163). Note that their zero inner product (28.164) implies that the angle (28.161) between these vectors is $\pi/2$.

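In an inner product space, we can define an orthonormal set $\{ e_1, \ldots, e_k \}$ as a set of vectors which are orthogonal (28.163) to each other and have length (28.148) equal to $1$, or more succinctly

$$\langle e_i, e_j \rangle = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \tag{28.165}$$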

Note that an orthonormal set (28.165) is also linearly independent (28.10); in fact, more generally (see E.32.10)

(28.166)
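The orthonormality condition can be checked by comparing every pairwise inner product with 1 on the diagonal and 0 off the diagonal. The sketch below does this for the dot product with a small tolerance for rounding; `is_orthonormal`, the tolerance value, and the example sets are hypothetical.

```python
import math


def dot(v, w):
    """Dot product of two equal-length real vectors given as lists."""
    return sum(vi * wi for vi, wi in zip(v, w))


def is_orthonormal(vectors, tol=1e-12):
    """True if <e_i, e_j> is 1 for i == j and 0 otherwise, up to `tol`."""
    for i, e_i in enumerate(vectors):
        for j, e_j in enumerate(vectors):
            target = 1.0 if i == j else 0.0
            if abs(dot(e_i, e_j) - target) > tol:
                return False
    return True


# Hypothetical sets, chosen only for illustration.
r = 1.0 / math.sqrt(2.0)
rotated_basis = [[r, r], [r, -r]]            # orthogonal, unit length
skewed_basis = [[1.0, 0.0], [1.0, 1.0]]      # independent but not orthonormal

rotated_ok = is_orthonormal(rotated_basis)
skewed_ok = is_orthonormal(skewed_basis)
```

The second set shows that the converse of the note above fails: linear independence does not imply orthonormality.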

28.3.2 Orthogonal projection

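A vector $v \in \mathcal{V}$ is orthogonal to a linear subspace $\mathcal{U}$ of $\mathcal{V}$ (28.35) [W] if it is orthogonal to every vector in that subspace, i.e.

$$v \perp \mathcal{U} \iff \langle v, u \rangle = 0 \text{ for all } u \in \mathcal{U} \tag{28.167}$$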
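By linearity of the inner product, a vector is orthogonal to a subspace as soon as it is orthogonal to every vector of a spanning set, so only finitely many inner products need to be checked. The sketch below assumes the subspace is given by such a spanning set and uses the dot product; the function name, tolerance, and vectors are hypothetical.

```python
def dot(v, w):
    """Dot product of two equal-length real vectors given as lists."""
    return sum(vi * wi for vi, wi in zip(v, w))


def is_orthogonal_to_span(v, spanning_vectors, tol=1e-12):
    """True if v is orthogonal to each spanning vector, hence to their whole span."""
    return all(abs(dot(v, u)) <= tol for u in spanning_vectors)


# Hypothetical subspace of R^3: the plane spanned by two vectors.
plane = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
normal = [0.0, 0.0, 2.0]
tilted = [1.0, 0.0, 1.0]

normal_ok = is_orthogonal_to_span(normal, plane)
tilted_ok = is_orthogonal_to_span(tilted, plane)

# Any linear combination of the spanning vectors is then orthogonal as well.
combination = [3.0 * a - 2.0 * b for a, b in zip(plane[0], plane[1])]
combination_inner = dot(normal, combination)
```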

Note that, by the definition of the angle (28.161), the orthogonality (28.163) of two vectors is equivalent to the vectors being perpendicular to each other