Quant View -- Investing by the Numbers -- The Quantitative Approach

 


"Statistics are like lampposts: they are good to lean on, but they don't shed much light."
-- Robert Storm-Petersen

 

YOU'RE PROBABLY ALREADY FAMILIAR with the distinction between fundamental and technical analysis. Both can be used as a basis for investment decisions despite being distinctly different from one another.

In essence, investors who look at company fundamentals focus on certain characteristics and ratios such as growth rates or price-to-earnings ratios. These serve as a means of comparison and a basis for investment decisions.

Technical analysis ignores these factors. Instead, it's based on supply and demand in the market itself. Chartists -- as technical analysts are often known -- look for recurring patterns in stock price movements. Certain ones signify buying opportunities while others show weakening trends.

Quantitative analysis -- the basis of our approach -- is a cross between these two methods. Essentially we use statistical means (similar to technical analysis) to determine which particular fundamental factors are most predictive of future stock price performance. We then apply these factors to help select stocks (as in fundamental analysis) and orient portfolios.

Quantitative Theory

OK, so what does all that mean? Like chartists, quants believe relations observed in the past will hold in the future. But unlike chartists who only look at price and market patterns, quants are more interested in relations that can be statistically quantified.

And that's the point; that's why quantitative analysis can help at the fundamental level. It provides an indication as to what fundamental factors are important when screening stocks or gauging the market.

Underlying the process is the belief that some factors are better predictors of future returns than others. If we look back at each factor, we can compare its behavior to the return of an individual stock or market sector. Those that are consistently and strongly related to returns ("correlated," in the statistician's terms) and that pass tests of statistical significance are the ones we should pay attention to. We can feel justified in ignoring others that aren't statistically significant.

For example, it shouldn't surprise you that certain fundamental values and ratios are more meaningful for some equity sectors than others. More specifically, we've found that price-to-cash flow is the only statistically significant ratio in evaluating firms in the basic materials sector. You might have thought debt-to-capital or return on capital might have been important, but they aren't. They're important when evaluating other sectors, but not basic materials.

Quantitative Methods

This probably sounds more complicated than it really is. In comparing returns to fundamental values, all we're doing is determining how the return changes in relation to the fundamental factor. This is nothing more than the stuff you thought you'd never use from 8th grade algebra.

The fundamental factor we're considering is the independent variable since it can be independently measured. Returns represent the dependent variable since they are (theoretically) related to the changes in the fundamental factor. By analyzing a set of data points for each, we can generate an equation expressing their relation. This is known as simple regression analysis.

As an example, consider again the relationship between price-to-cash flow and stock price movements in the basic materials sector of the S&P 500. Our work indicates it can be expressed as follows:

12 Mo. Pct. Stock Price Change = -0.5877 + 0.0547(Price-to-Cash Flow)

For each one-unit increase in price-to-cash flow, the predicted 12-month percentage price change for the stock rises by 0.0547%. The 0.0547 is called the coefficient since it is multiplied by the independent variable. The term -0.5877 is a constant that is added to the result to arrive at the value of the dependent variable.
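
To make the roles of the coefficient and the constant concrete, here is a minimal Python sketch that plugs values into the equation above. The function name and the sample price-to-cash flow values are illustrative assumptions, not figures from our models.

```python
# Constant and coefficient copied from the basic materials equation above.
CONSTANT = -0.5877
COEFFICIENT = 0.0547


def predicted_price_change(price_to_cash_flow: float) -> float:
    """Return the 12-month percentage price change the equation predicts."""
    return CONSTANT + COEFFICIENT * price_to_cash_flow


# Hypothetical price-to-cash flow values, chosen only for illustration.
for pcf in (10.0, 11.0, 20.0):
    print(f"P/CF {pcf:5.1f} -> predicted change {predicted_price_change(pcf):+.4f}")

# Moving from 10 to 11 (one unit) shifts the prediction by exactly the coefficient.
step = predicted_price_change(11.0) - predicted_price_change(10.0)
print(f"Effect of a one-unit rise: {step:+.4f}")
```

Whatever starting value you pick, a one-unit rise always moves the prediction by the coefficient; the constant only shifts the whole line up or down.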

In general, this relationship between two variables can be expressed as:

y = a + bx

where:
y = dependent variable
a = constant
b = coefficient
x = independent variable

If other fundamental factors are related to a stock's return, they can also be added to the equation in what's known as a multiple regression analysis. The result takes the same form except now each additional factor is an independent variable and each has its own coefficient. The general equation looks like

y = a + bx₁ + cx₂ + … + zxₙ

where:
y = dependent variable
a = constant
b … z = coefficients
x₁ … xₙ = n independent variables
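
For readers who want to see how such an equation might be estimated, the sketch below fits the constant and coefficients by ordinary least squares using only the Python standard library. It is one common way to do it, not a description of our own software; the function name and the sample data are made up for illustration.

```python
def fit_multiple_regression(rows, ys):
    """Least-squares fit of y = a + b*x1 + c*x2 + ... via the normal equations.

    rows: one list of factor values per observation
    ys:   the observed value of the dependent variable for each observation
    Returns [a, b, c, ...] with the constant first.
    """
    # Prepend 1.0 to each row so the constant is estimated like any coefficient.
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])

    # Build the augmented system (X'X | X'y).
    system = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(design, ys))]
        for i in range(k)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(system[r][col]))
        if abs(system[pivot][col]) < 1e-12:
            raise ValueError("factors are collinear; the system cannot be solved")
        system[col], system[pivot] = system[pivot], system[col]
        for row in range(k):
            if row != col:
                factor = system[row][col] / system[col][col]
                system[row] = [v - factor * p for v, p in zip(system[row], system[col])]

    return [system[i][k] / system[i][i] for i in range(k)]


# Made-up observations: two factors per stock and the return that followed.
factors = [[8.0, 0.30], [12.0, 0.45], [10.0, 0.20], [15.0, 0.50], [9.0, 0.35]]
returns = [0.4, 1.1, 0.3, 1.6, 0.6]

a, b, c = fit_multiple_regression(factors, returns)
print(f"y = {a:.4f} + {b:.4f}(x1) + {c:.4f}(x2)")
```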

Once you have the regression equation, you can use the results to find the dependent variables with the highest values given the current or projected values of the independent variables. To do this, you simply plug in the latter and let the equation calculate the former.
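
As a rough illustration of this plug-in step, the sketch below applies the basic materials equation to a handful of stocks and sorts them by predicted return. The tickers and price-to-cash flow values are hypothetical placeholders, and the function name is our own invention for this example.

```python
# Constant and coefficient from the basic materials equation.
CONSTANT = -0.5877
COEFFICIENT = 0.0547

# Hypothetical tickers and price-to-cash flow values, for illustration only.
current_pcf = {"AAA": 8.2, "BBB": 14.5, "CCC": 11.0}


def rank_by_predicted_change(pcf_by_ticker):
    """Return (ticker, predicted change) pairs, highest prediction first."""
    predictions = {
        ticker: CONSTANT + COEFFICIENT * pcf
        for ticker, pcf in pcf_by_ticker.items()
    }
    return sorted(predictions.items(), key=lambda item: item[1], reverse=True)


ranking = rank_by_predicted_change(current_pcf)
for ticker, predicted in ranking:
    print(f"{ticker}: {predicted:+.4f}")
```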

Sounds easy enough, doesn't it? Of course there's a little more to it than that.

Correlation

The ancient Greek philosophers believed mathematics was beautiful in its simplicity and elegance. Pythagoras (of a² + b² = c² fame) believed everything could be explained in terms of mathematics. (He's the patron saint of quants.) Unfortunately, while the statistical formulas may work, the results often fail to explain the movement of the dependent variable.

While you can always get a regression relation, the connection between the two variables may not be particularly strong or meaningful. It's important to check this out.

Here's how: When you're creating a simple linear regression, you plot the values of the dependent variable on the y-axis (the vertical one) and the associated values of the independent variable on the x-axis (the horizontal one). We've done this on Chart 1, where we've plotted IBM's closing price vs. the day of the week for each trading day in December 2003.

The regression equation comes from the line that minimizes the sum of the squared vertical distances between the actual y-values and the line itself. In other words, this line represents the "best fit" to the data points among all possible straight lines. It's given by the red line on Chart 1. (There's a really ugly equation used to calculate it, but that's not important here.)
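
If you're curious about the calculation anyway, the standard least-squares formulas for a single factor fit in a few lines of Python. This is a sketch of the textbook method with made-up data points, not the Chart 1 data; the function names are our own.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y move together, divided by how much x itself varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # The fitted line always passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


def sum_squared_errors(xs, ys, a, b):
    """Total squared vertical distance between the points and the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data points, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
best = sum_squared_errors(xs, ys, a, b)
# Nudging the line away from the fitted values always makes the fit worse.
worse = sum_squared_errors(xs, ys, a + 0.1, b)
print(f"y = {a:.4f} + {b:.4f}x, squared error {best:.4f} (nudged: {worse:.4f})")
```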

For a strong relation, most of the plotted points fall on or near the regression line. The more dispersed the points, the weaker the relation.

The strength of the relation between the two variables can be measured statistically. To do this, you must find the coefficient of correlation, usually referred to as r. Its value ranges from +1 (perfect positive correlation -- the two variables rise and fall together, with every point falling exactly on an upward-sloping line) to -1 (perfect negative correlation -- the two move in opposite directions, with every point falling exactly on a downward-sloping line). A value of 0 shows no linear relationship between the two, so a movement in the independent variable tells you nothing about movement in the dependent variable. As a rule of thumb, strong positive correlation occurs at values between +0.5 and +1.0, and strong negative correlation lies in the range of -0.5 to -1.0.
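
Here is a small Python sketch of how r is usually calculated (the Pearson formula). The data are made up to show the two extremes; note that a perfect correlation only requires the points to fall on a straight line, whatever its steepness.

```python
import math


def correlation(xs, ys):
    """Pearson coefficient of correlation, r, between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up series, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
rising = [10.0, 20.0, 30.0, 40.0]   # on an upward-sloping line
falling = [40.0, 30.0, 20.0, 10.0]  # on a downward-sloping line

print(f"rising:  r = {correlation(xs, rising):+.2f}")
print(f"falling: r = {correlation(xs, falling):+.2f}")
```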

[Chart 1: No Real Relation -- IBM Closing Price vs. Day of the Week, December 2003. Source: Quantview]

By squaring the r value, you arrive at the coefficient of determination, commonly referred to as R². This represents the proportion of the total variation in the dependent variable (y) explained by the variation in the independent variable (x).
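
The two descriptions of the coefficient of determination (the square of r, and the share of variation explained) give the same number for a one-factor regression. The sketch below checks this on made-up data; the function name is hypothetical.

```python
def r_squared_two_ways(xs, ys):
    """Return R-squared computed by squaring r and as the explained share."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)  # total variation in y

    # Route 1: square the coefficient of correlation.
    r = sxy / (sxx * syy) ** 0.5
    by_squaring_r = r ** 2

    # Route 2: fit the line, then measure the variation it leaves unexplained.
    b = sxy / sxx
    a = mean_y - b * mean_x
    unexplained = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    by_explained_share = 1.0 - unexplained / syy

    return by_squaring_r, by_explained_share


# Made-up data points, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 2.9, 4.5, 4.1, 6.2, 6.0]

squared_r, explained_share = r_squared_two_ways(xs, ys)
print(f"r squared: {squared_r:.4f}, explained share: {explained_share:.4f}")
```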

Consider again the relationship between IBM's closing price and the day of the week. When we plotted the values in Chart 1, we got a regression equation -- you can always get one. But look how the plotted points are widely dispersed about the best-fit line. That certainly doesn't indicate there's much of a relation here.

This is confirmed by the R², which is almost zero (0.0007). As you probably already suspected, there is no correlation between IBM's daily closing price and the day of the week. Interpreting the R², only 0.07% of the variance in IBM's price is explained by the variance in the day of the week. Just because you can get a regression equation doesn't mean there's necessarily a relation between the two variables.

[Chart 2: Positive Relation -- IBM Closing Price vs. Intel Closing Price, December 2003. Source: Quantview]

But IBM does trade like other stocks, especially those in its market sector. Here the relationship is stronger.

To illustrate this point, consider Chart 2, where we've plotted the daily closing price of both IBM and Intel. Notice how much more tightly the points fall about the regression line.

As before, we got a regression equation using IBM's closing price as the dependent variable, but this time the R² was considerably stronger, 0.4159. The best-fit line is upward sloping, indicating the relationship is positive, meaning IBM and Intel tend to trade in the same direction. That's essentially what you'd expect from two tech stocks.

On the other hand, stocks from other sectors trade differently. This is illustrated by Chart 3, which compares IBM's closing price to that of Merck, a pharmaceutical company. Historically the technology and healthcare sectors haven't been highly correlated -- techs tend to be cyclical while healthcare stocks are more defensive in nature.

[Chart 3: Negative Relation -- IBM Closing Price vs. Merck Closing Price, December 2003. Source: Quantview]

This time the R² is a little weaker (0.3175), just what you'd expect from stocks of different sectors. More important, however, is the fact that the two are negatively related, as you can see from the downward-sloping regression line and the negative coefficient in the regression equation. In essence, this is statistical support for what you already know -- cyclical and defensive stocks tend to trade in opposite directions.
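
Because the coefficient of determination is the square of r, it always comes out positive; the direction of the relation has to come from the slope of the regression line. The sketch below recovers r from the two values quoted for Intel and Merck. The function name is our own, and the slope arguments are placeholders whose sign is all that matters, since the fitted slopes themselves aren't given here.

```python
import math


def signed_r(r_squared: float, slope: float) -> float:
    """Recover r from R-squared, taking its sign from the regression slope."""
    return math.copysign(math.sqrt(r_squared), slope)


# R-squared values as quoted for Charts 2 and 3; slopes are sign-only placeholders.
intel_r = signed_r(0.4159, +1.0)   # upward-sloping line
merck_r = signed_r(0.3175, -1.0)   # downward-sloping line

print(f"IBM vs. Intel: r = {intel_r:+.2f}")
print(f"IBM vs. Merck: r = {merck_r:+.2f}")
```

Both values lie beyond 0.5 in magnitude, so by the rule of thumb given earlier each relation counts as strong, one positive and one negative.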

Statistical Significance

But even if the correlation is strong, the relation may still not be statistically significant. You've heard this before in regard to public opinion polls. They always tell you they're accurate down to a specific percentage. That's what statistical significance is all about.

Once you know your regression equation shows a relatively strong relation between the dependent and independent variables, you still have to make sure the equation as a whole and each variable are statistically significant. If it fails this test, the best thing to do is to throw the whole thing out and start over again. If particular variables fail, you remove them and test the equation again.

The interestingly named F-test applies to the whole equation, while the so-called t-test checks each independent variable. The good thing about this is you don't have to do it yourself. In fact, you don't have to do any of this statistical stuff; we'll do it for you. We'll occasionally refer to these terms, though, so it's good to know what they're all about.
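
For the curious, here is a rough Python sketch of what these tests look like for a one-factor regression, where the F statistic is simply the square of the t statistic. It rests on two assumptions that are ours, not part of the analysis above: that each December 2003 regression used 22 daily observations, and that a 5% significance level is the cutoff. The function name is hypothetical.

```python
import math


def simple_regression_tests(r_squared: float, n: int):
    """Return (t, F) statistics for a one-factor regression on n observations."""
    # n points minus the two estimated values (constant and coefficient).
    degrees_of_freedom = n - 2
    f_stat = r_squared / (1.0 - r_squared) * degrees_of_freedom
    # Magnitude of t only; with a single factor, F equals t squared.
    t_stat = math.sqrt(f_stat)
    return t_stat, f_stat


# ASSUMPTION: 22 trading days in the sample; the text does not state the count.
N_OBSERVATIONS = 22
# 5% critical value of F with (1, 20) degrees of freedom, from standard F tables.
F_CRITICAL = 4.35

for label, r2 in (("Day of week", 0.0007), ("Intel", 0.4159), ("Merck", 0.3175)):
    t_stat, f_stat = simple_regression_tests(r2, N_OBSERVATIONS)
    verdict = "significant" if f_stat > F_CRITICAL else "not significant"
    print(f"{label:12s} t = {t_stat:5.2f}  F = {f_stat:6.2f}  -> {verdict}")
```

With more than one factor, the F-test and the individual t-tests no longer reduce to each other, which is why both are reported.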

Don't be fooled by the math here: Although quantitative analysis is mathematically based, it doesn't follow that it's more reliable than other means of stock selection. After all, "garbage in, garbage out" still holds. Testing relationships for strength and significance is an attempt to minimize spurious relationships, but there's still no guarantee.

Quantitatively determined relationships are not stable. Remember they are based on market and company data that can change and evolve over time. That's why these relationships have to periodically be re-verified. There's still a good dose of art involved with this science.
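
One simple way to keep an eye on a relation, sketched below in Python, is to recompute the correlation over successive windows of data and watch whether it holds up. This is only an illustration of the idea, not our monitoring procedure; the function names, window length, and data are all made up.

```python
import math


def correlation(xs, ys):
    """Pearson coefficient of correlation, r, between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def rolling_correlation(xs, ys, window):
    """Correlation of xs and ys over each consecutive window of observations."""
    return [
        correlation(xs[start:start + window], ys[start:start + window])
        for start in range(len(xs) - window + 1)
    ]


# Made-up series in which the relation reverses partway through.
factor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
returns = [2.0, 4.0, 6.0, 8.0, 7.0, 5.0, 3.0, 1.0]

rolling = rolling_correlation(factor, returns, window=4)
for start, r in enumerate(rolling):
    print(f"observations {start + 1}-{start + 4}: r = {r:+.2f}")
```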

Applying the Results

Once you're comfortable with the results of your regression analysis, you can use them to construct real-world models. Essentially this comes down to using the quantitatively selected fundamental factors to screen for stocks that should have the best potential. Obviously "best potential" is however you define it, but regardless, it's the goal of the modeling process.

We've applied these techniques to create Portfolios 3 - 6. Portfolios 3 and 4 are large cap growth equity models, Portfolio 5 is a style-based equity model, and Portfolio 6 is a balanced model utilizing both stocks and bonds. You can see how quantitative modeling procedures were used in creating them by clicking on the links above, and you can follow their returns both here and on the Home Page.

Finally, the quantitative process is an ongoing one. Models have to be monitored, tested, and refined. All of our portfolios undergo these processes, and their progress is documented in Work in Progress.

Mathematics can help us understand relations in the world around us, yet like everything else pertaining to the financial markets, even a highly quantitative approach is at best an imprecise science. Nevertheless, there's still much that can be learned even when quantitatively derived models fail to behave precisely as predicted.


 
