
28.2 Linear least squares regression models

Key points

The first of the three main classes of dominant-residual LFM’s (28.23) is the class of regression LFM’s. Regression LFM’s are also known as “macroeconomic” LFM’s, because in some applications the factors are macroeconomic variables, such as interest rates, stock market returns, etc. The purpose of regression models is explanatory: we want to explain a given large or small (even one) number () of target variables , such as some S&P 500 stock returns, as much as possible in terms of a large or small number () of observable factors , such as one or more total return indexes. See [Rao and Toutenburg, 1995].

Regression models can be useful either to perform dimension reduction (), e.g. to draw Monte Carlo scenarios of large-dimensional variables (see Section 20.7.6), or to perform risk attribution (), e.g. to compute the optimal hedge or attribute the risk in the portfolio P&L () to a few key drivers () (see Step 8a).

Beyond the financial industry, regression models are extensively implemented in many mathematical fields, most notably supervised learning (Chapters 33-34), as shown in Figure 31.1, and they became very popular in the statistical literature because of their appealing mathematical features, see Section 28.6.13.

For example, it is common practice to use regression models for prediction and forecasting (Section 32.2), say to understand simple linear causal relationships between the target variables (or dependent variables) , such as tomorrow’s temperature in London, and the factors (or independent variables) , such as some current measurements in England (average temperature, atmospheric pressure, humidity, etc.) [W].

Example 28.8. Regression model for forecasting
Let us suppose that we are a portfolio manager and we want to forecast the return of a portfolio of two stocks, Aon PLC (AON) and Cablevision Systems Corp (CVC), using the S&P 500 index, similar to Intuition 146.
In this case, the target variable is the portfolio return (6.3) between today and tomorrow

(28.46)

where the portfolio weights are defined in (6.4); and the factor (8a.1) is the same return on the S&P 500 index (28.2), but between yesterday and today

(28.47)

Then we consider a regression linear factor model (28.54) as our predictive model

(28.48)

where the model parameters  are obtained via r-squared maximization (28.56).
Suppose that we also have estimates of their joint expectation vector (23.26) and covariance matrix (23.32)

(28.49)

We will use the estimates (28.49) as our proxy for the true, unknown parameters.
Then the optimal loadings (28.67) read

(28.50)

and the optimal shift (28.70) reads

(28.51)

Finally, we can use the estimated model (28.48) to perform forecasting: given today’s realization of the S&P 500 index return (28.47), we forecast the outcome of the portfolio return tomorrow (28.46)

(28.52)

Then we can compute the r-squared (28.77) achieved by the regression model (28.48)

(28.53)

which is not large, as expected in all market prediction applications. Hence, the forecast (28.52) is imprecise.
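
The computations of this example can be sketched in Python. The snippet below is a minimal illustration that uses hypothetical placeholder values for the expectation and covariance estimates (28.49), which are not reproduced here; it implements the moment-based loadings (28.67), shift (28.70), forecast (28.52) and r-squared (28.77) for a univariate target and a single factor. With the actual estimates (28.49) in place of the placeholders, the same code yields the quantities in (28.50)-(28.53).

import numpy as np

# hypothetical joint moments of (X, Z): portfolio return X and lagged S&P 500 return Z
# (placeholders for the estimates (28.49), which are not reproduced here)
mu = np.array([0.0005, 0.0004])            # [E[X], E[Z]]
sig2 = np.array([[1.6e-4, 0.3e-4],
                 [0.3e-4, 1.0e-4]])        # joint covariance of (X, Z)

mu_x, mu_z = mu
cv_xx, cv_xz, cv_zz = sig2[0, 0], sig2[0, 1], sig2[1, 1]

beta = cv_xz / cv_zz                       # optimal loading (28.67)
alpha = mu_x - beta * mu_z                 # optimal shift (28.70)

z_today = 0.01                             # hypothetical realization of today's S&P 500 return
x_forecast = alpha + beta * z_today        # regression forecast, as in (28.52)

r2 = beta ** 2 * cv_zz / cv_xx             # r-squared (28.77); = corr(X, Z)^2 with one factor

print(f"beta = {beta:.4f}, alpha = {alpha:.6f}")
print(f"forecast = {x_forecast:.6f}, r-squared = {r2:.4f}")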

Regression models can be suitable or not depending on the (hidden) co-dependence relationships among the variables involved. In particular, regression models, as all LFM’s in general, are uniquely identified by the mean-covariance equivalence class of the variables involved (28.13), meaning that only the first two moments of those variables are necessary for the fit.

The regression prediction (28.74) has a dual geometrical interpretation as best linear prediction (28.106) and orthogonal projection (28.107) of the target variables onto the linear span of the factors, see Figure 28.7.
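
This dual interpretation can be checked numerically. The following minimal sketch uses illustrative parameters not taken from the text: it simulates a bivariate normal target-factor pair, fits the least squares line from sample moments, and verifies that (i) any other line yields a larger mean squared error (best linear prediction) and (ii) the residual is uncorrelated with the factor (orthogonal projection).

import numpy as np

rng = np.random.default_rng(0)

# hypothetical bivariate normal target X and factor Z (illustrative parameters)
mu = np.array([0.1, 0.05])
sig2 = np.array([[0.040, 0.015],
                 [0.015, 0.020]])
x, z = rng.multivariate_normal(mu, sig2, size=100_000).T

# least squares line fitted from sample moments
cv_xz = np.mean((x - x.mean()) * (z - z.mean()))
cv_z = np.mean((z - z.mean()) ** 2)
beta = cv_xz / cv_z
alpha = x.mean() - beta * z.mean()
x_pred = alpha + beta * z            # prediction
u = x - x_pred                       # residual

# (i) best linear prediction: any other line gives a larger mean squared error
mse_opt = np.mean(u ** 2)
mse_other = np.mean((x - (alpha + 0.01) - (beta + 0.05) * z) ** 2)
assert mse_opt <= mse_other

# (ii) orthogonal projection: the residual is uncorrelated with the factor
print("corr(U, Z) =", np.corrcoef(u, z)[0, 1])   # numerically zero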

Beyond the ordinary least squares approach (Section 28.2.7), a proper estimation of regression models, as of all linear factor models (Figure 28.4), requires many considerations that we expand on in Chapter 47. Furthermore, estimation can (and must) be further improved by embedding in the model fit (28.54)-(28.56) constraints (Section 28.6.17) or penalizations, as in factor selection (Section 46.4) or any other regularization technique (Section 44.10).

The remainder of this section is organized as follows.

In Section 28.2.1 we introduce regression models as solution of an r-squared maximization with specific constraints.

In Section 28.2.2 we show the analytical expression of the optimal regression loadings and shifts.

In Section 28.2.3 we show the regression prediction and the maximal r-squared it achieves.

In Section 28.2.4 we verify the systematic and idiosyncratic features of the regression residuals.

In Section 28.2.5 we show the natural scale matrix defining the r-squared in regression models.

In Section 28.2.6 we give a geometrical interpretation of the linear regression prediction via orthogonal projections.

In Section 28.2.7 we discuss the estimation of regression models.

28.2.1 Definition

A regression, or macroeconomic, linear factor model (LFM) for an -dimensional target variable is a dominant-residual decomposition (28.23)

(28.54)

or in compact matrix notation

(28.55)

In (28.54) the target variables and factors are observable. In particular, we assume that the mean-covariance equivalence class of their joint distribution (28.13) is known, and thus so are their joint expectations and covariances. Then the loadings matrix is constructed in such a way as to maximize the r-squared of the given factors.

More precisely, let us start with:

i) a symmetric and positive-definite matrix that defines the r-squared objective (28.21);

ii) and a number of observable factors.

Then a regression LFM is a dominant-residual LFM (28.24)

where is the Riccati root of (14.452), and the constraints are:

i) the factors are given exogenously for a suitable vector ;

ii) the residuals have zero expectation , which, together with i), implies .

Therefore, the constraints become

(28.57)

Then the matrix is optimized in (28.56) to yield , which maximizes the r-squared.

In the estimation context the dominant-residual regression framework (28.56) becomes the ordinary least squares optimization (28.132).
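
The connection can be illustrated with a short sketch on synthetic data (the simulated panel below is purely illustrative): the moment-based loadings Cov[X, Z] Cov[Z]^{-1} and shift E[X] - loadings E[Z], evaluated on sample moments, coincide with the coefficients of an ordinary least squares fit that includes an intercept column.

import numpy as np

rng = np.random.default_rng(1)

# synthetic panel of t_ = 1000 observations: n_ = 2 targets and k_ = 3 observable factors
t_, n_, k_ = 1000, 2, 3
z = rng.standard_normal((t_, k_))
b_true = rng.standard_normal((n_, k_))
x = z @ b_true.T + 0.1 * rng.standard_normal((t_, n_)) + 0.5   # targets with noise and shift

# 1) moment-based solution: loadings = Cov[X, Z] Cov[Z]^{-1}, shift = E[X] - loadings E[Z]
s2 = np.cov(np.hstack([x, z]).T)
s_xz, s_z = s2[:n_, n_:], s2[n_:, n_:]
beta_mom = s_xz @ np.linalg.inv(s_z)
alpha_mom = x.mean(axis=0) - beta_mom @ z.mean(axis=0)

# 2) ordinary least squares fit with an intercept column
z1 = np.hstack([np.ones((t_, 1)), z])
coef, *_ = np.linalg.lstsq(z1, x, rcond=None)                  # (k_ + 1) x n_
alpha_ols, beta_ols = coef[0], coef[1:].T

print(np.allclose(beta_mom, beta_ols), np.allclose(alpha_mom, alpha_ols))   # True True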

Regression LFM’s are arguably the most widely implemented models for supervised learning because of their intuitive features (Section 28.6.13). Furthermore, regression LFM’s can be mapped from the affine format (28.54) to the equivalent linear format , by considering the constant as an additional factor, see Section 33.1.4.
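
To make the affine-to-linear mapping concrete, here is a minimal sketch with hypothetical shift and loadings: appending the constant 1 to the factor vector, and the shift as an extra column of the loadings, reproduces the affine prediction.

import numpy as np

# hypothetical affine regression x ≈ alpha + beta z rewritten in linear format b z_tilde
alpha = np.array([0.02, -0.01])                 # shift (n_ = 2 targets)
beta = np.array([[0.8, 0.1],
                 [0.3, 0.5]])                   # loadings (n_ x k_, k_ = 2 factors)

z = np.array([0.03, -0.02])                     # a factor realization
z_tilde = np.append(1.0, z)                     # augmented factor (1, z)
b = np.hstack([alpha[:, None], beta])           # augmented loadings (alpha | beta)

assert np.allclose(alpha + beta @ z, b @ z_tilde)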

Example 28.9. Regression model for forecasting
Let us suppose that we are a portfolio manager and we want to forecast the return of a portfolio of two stocks, Aon PLC (AON) and Cablevision Systems Corp (CVC), using the S&P 500 index, similar to Intuition 146.
In this case, the target variable is the portfolio return (6.3) between today and tomorrow

(28.58)

where the portfolio weights are defined in (6.4); and the factor (8a.1) is the same return on the S&P 500 index (28.2), but between yesterday and today

(28.59)

Then we consider a regression linear factor model (28.54) as our predictive model

(28.60)

where the model parameters  are obtained via r-squared maximization (28.56).
Suppose that we also have estimates of their joint expectation vector (23.26) and covariance matrix (23.32)

(28.61)

We will use the estimates (28.61) as our proxy for the true, unknown parameters.

Example 28.10. Regression loadings as ordinary least squares

Figure 28.5: Video

In Figure 28.5 we show the intuition and the pitfalls behind linear ordinary least squares regression.
Consider a univariate target variable , such as one stock’s compounded return, and one observable factor , such as the compounded return of a different stock. To illustrate, suppose that the variables are jointly bivariate normal (20.95)

(28.62)

We show a large number of joint scenarios and the respective marginal histograms. We also draw the least squares regression line

(28.63)

together with an arbitrary line . The regression line (28.63) is the one that best fits the scenarios among all the possible lines .
Indeed, the area corresponding to the squared errors (yellow) is smaller than the area corresponding to (blue).
Once we have fitted the regression line to the distribution (28.62), the prediction (28.15) is a random variable that takes values on the line (28.63). We can then postulate, or observe, a value for the factor and use the prediction to infer the predicted mean of the target variable.
We can then change the distribution (28.62), for instance by varying the correlation . As the distribution changes, the regression line (28.63) adapts, and maintains the best least squares fit (28.56).
To better analyse the fit, we focus on the joint distribution of the prediction and the target variable , and the respective r-squared (28.21). As the joint distribution of the target and the factor varies, the prediction is always positively correlated with the target variable.
To better understand the residual , we show the joint distribution of the prediction and the residual . As the fit, or r-squared (28.21), increases, the residual decreases. Regardless, the prediction is always uncorrelated with the residual, and therefore the mean-covariance ellipse (23.70) has its principal axes parallel to the reference axes (23.84).
These key features are preserved regardless of the distribution of the target and the factor . For instance, consider the case where and are the prices of the above two stocks, and thus they are jointly lognormal (20.199)

(28.64)

Then again the least squares regression line (28.63) is the one that, among all the possible lines, achieves the best fit in terms of the squared residuals (28.56).
As the first two moments of the distribution (28.64) change, the prediction (28.15) provides a linear approximation to a possibly nonlinear relationship between target and factor.
However, the prediction remains positively correlated with the target , and it remains uncorrelated with the residual , and the fit improves as the residual decreases.
Linear least squares regression also works with discrete variables. To illustrate, consider the mixture case (20.519), where the target is binary: if the stock goes up and if it goes down (20.383)

(28.65)

and the factor is conditionally normal (20.524)

(28.66)

The regression line (28.63) allows us to predict the mean of the target variable, which is the probability of the stock going up. Again, the prediction (28.15) is positively correlated with the target , and uncorrelated with the residual .
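
The discrete case of this example can be reproduced with the following sketch, which assumes hypothetical mixture parameters in the spirit of (28.65)-(28.66): the least squares line regresses the binary up/down indicator on the factor, so that the prediction approximates the probability of an up move, and the usual correlation properties of prediction and residual hold.

import numpy as np

rng = np.random.default_rng(2)

# hypothetical mixture parameters in the spirit of (28.65)-(28.66): binary target X
# (1 = stock up, 0 = stock down) and a factor Z that is normal conditional on X
p_up = 0.6                                    # P(X = 1)
mu_up, mu_dn, sig = 0.01, -0.01, 0.02         # conditional parameters of Z given X

j_ = 100_000
x = (rng.random(j_) < p_up).astype(float)     # binary target scenarios
z = np.where(x == 1.0,
             rng.normal(mu_up, sig, j_),
             rng.normal(mu_dn, sig, j_))      # conditionally normal factor scenarios

# least squares regression line of X on Z
beta = np.mean((x - x.mean()) * (z - z.mean())) / np.mean((z - z.mean()) ** 2)
alpha = x.mean() - beta * z.mean()
x_pred = alpha + beta * z                     # linear prediction of P(stock up | Z)
u = x - x_pred                                # residual

print("corr(prediction, target):  ", np.corrcoef(x_pred, x)[0, 1])   # positive
print("corr(prediction, residual):", np.corrcoef(x_pred, u)[0, 1])   # numerically zero
print("predicted prob. of up move at z = 0.01:", alpha + beta * 0.01)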

28.2.2 Solution: factor loadings

Given the constraints (28.57), the key variables to solve for in the regression LFM optimization (28.56) are the loadings, which can be computed analytically (E.26.1)

(28.67)

or in compact matrix notation

(28.68)

Note that the optimal loadings do not depend on the scale matrix specifying the r-squared (28.21).
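
This invariance can be verified numerically. The sketch below assumes the trace-based form of the r-squared objective in (28.21), builds a hypothetical joint covariance of targets and factors, and maximizes the r-squared numerically for two different scale matrices: both runs recover the analytical loadings (28.67)-(28.68).

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)

# hypothetical joint covariance of n_ = 2 targets X and k_ = 2 factors Z
n_, k_ = 2, 2
a = rng.standard_normal((n_ + k_, n_ + k_))
cv = a @ a.T + (n_ + k_) * np.eye(n_ + k_)     # well-conditioned covariance of (X, Z)
cv_x, cv_xz, cv_z = cv[:n_, :n_], cv[:n_, n_:], cv[n_:, n_:]

def cv_u(beta):
    # covariance of the residual U = X - alpha - beta Z (the shift alpha does not affect it)
    return cv_x - beta @ cv_xz.T - cv_xz @ beta.T + beta @ cv_z @ beta.T

def one_minus_r2(beta_flat, sig2):
    # 1 - r-squared, assuming the trace-based objective tr(sig2^-1 Cv[U]) / tr(sig2^-1 Cv[X])
    beta = beta_flat.reshape(n_, k_)
    sig2_inv = np.linalg.inv(sig2)
    return np.trace(sig2_inv @ cv_u(beta)) / np.trace(sig2_inv @ cv_x)

# analytical optimal loadings (28.67)-(28.68)
beta_star = cv_xz @ np.linalg.inv(cv_z)

# numerical r-squared maximization for two different scale matrices sig2
for sig2 in [np.eye(n_), np.diag([1.0, 25.0])]:
    res = minimize(one_minus_r2, np.zeros(n_ * k_), args=(sig2,), method="BFGS")
    print(np.allclose(res.x.reshape(n_, k_), beta_star, atol=1e-3))   # True in both cases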

Given the optimal loadings (28.67), we obtain the optimal shift from the zero expectation constraint (28.57)

(28.69)

or in compact matrix notation