Quant View -- Investing by the Numbers -- Archives: January '10 Work in Progress

 


January 2010
Old and New
A Funny Thing Happened When Revising P3 and P4's Algorithms -- Only Half Changed

“What we anticipate seldom occurs; what we least expected generally happens.”
-- Benjamin Disraeli (1804 - 1881)

 

TEN YEARS AGO WE WERE preparing to release the details of two new quantitative models, Portfolios 3 and 4. Using ten years of data from the decade of the 1990s, we tested nine different independent variables to derive regression equations that (we hoped) were predictive of future performance. The results can be found here. We didn't finish testing the models until the spring of 2000, so the official launch didn't occur until July 1, 2000. The goal was to leave the underlying algorithms unchanged for at least ten years before making any revisions. This would allow the models to be tested over the proverbial "long term".

Now the long term is up. It's time to again consider the predictive powers of factors and their resulting algorithms. It's time to make changes before waiting yet another ten years.
Chart 1
REGRESSION COEFFICIENTS FOR P3 AND THE SECTORS OF P4
Effective December 14, 2009
Graph -- Regression Coefficients for P3 and the Sectors of P4, Effective December 14, 2009

Once again we used ten years' (actually nine years and ten months') worth of data and again ran regression analyses on the stocks' annualized one-year returns. The only difference this time was that we added a tenth factor, one-year price return. This was included as a measure of short-term momentum, and that's why we used price return (capital appreciation) rather than total return (capital appreciation plus current income from dividends). Momentum is reflected in price, but not dividends.

Again we used "stepwise" regressions whereby all variables start out in the analysis, but those that are not statistically significant are removed, one by one. That led to the first surprise: Four of P4's ten sector models could not generate an algorithm because they had no statistically significant independent variables. In other words, none of the ten factors demonstrated sufficient connection with the stocks' returns to be considered significant. A fifth sector, Telecom, failed to have enough stocks to support a regression analysis with ten independent variables.
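The one-by-one removal described above can be sketched in a few lines of Python. This is a minimal illustration only, not the code behind P3 or P4: the factor names, the synthetic data, and the cutoff of |t| >= 2 (a rough stand-in for significance at about the 5% level in large samples) are all assumptions made for the example.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """Return (coefficients, t-statistics); X must include an intercept column."""
    n, k = len(X), len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(c * v for c, v in zip(beta, row)) for row, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    t_stats = []
    for i in range(k):
        unit = [1.0 if j == i else 0.0 for j in range(k)]
        var_i = sigma2 * solve(xtx, unit)[i]  # i-th diagonal of (X'X)^-1
        t_stats.append(beta[i] / math.sqrt(var_i))
    return beta, t_stats

def backward_stepwise(factors, y, t_min=2.0):
    """Start with every factor, drop the weakest until all have |t| >= t_min.

    Returns {factor: coefficient}; an empty dict means nothing survived.
    The t_min cutoff is an assumption, not a value taken from the article.
    """
    kept = list(factors)
    while kept:
        X = [[1.0] + [factors[f][r] for f in kept] for r in range(len(y))]
        beta, t = ols(X, y)
        weakest = min(range(len(kept)), key=lambda i: abs(t[i + 1]))
        if abs(t[weakest + 1]) >= t_min:
            return dict(zip(kept, beta[1:]))
        kept.pop(weakest)
    return {}

# Synthetic data: only "momentum" is built to drive the returns.
rng = random.Random(0)
n_stocks = 200
factors = {
    "momentum": [rng.uniform(-1, 1) for _ in range(n_stocks)],
    "noise_a": [rng.uniform(-1, 1) for _ in range(n_stocks)],
    "noise_b": [rng.uniform(-1, 1) for _ in range(n_stocks)],
}
returns = [0.5 * m + rng.gauss(0, 0.1) for m in factors["momentum"]]
selected = backward_stepwise(factors, returns)
print(selected)
```

An empty result from `backward_stepwise` corresponds to the situation described above, where every factor is eliminated and no algorithm can be generated.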

These were not the expected results and left us with some tough decisions. Do we completely jettison the models for those sectors lacking statistical significance and start over with some new ones? Does this apply across the board or only to those sectors where all factors were eliminated? On the other hand, we could always force a result by using all ten factors and not going through the stepwise procedure. That would guarantee a formula, but would it be meaningful in any regard? And what to do about Telecom? Should we arbitrarily cut down the number of factors so a regression could be run, or was there another alternative? Going into this process we didn't expect any of these issues, but we got them all.

 

Portfolio 3 -- Almost Totally Different Factors
In hindsight, P3 was the calm before the storm. The new regression analysis ran smoothly enough, but the resulting algorithm showed virtually no overlap with its predecessor. The top line of Chart 1 shows the factor weights of the regression equation. To use this chart, simply take the dependent variable appearing at the far left (in this case P3) and place it on the left of an "=" sign. The numbers on the rest of that line are the coefficients for the associated factors, with those in black being added and those in red being subtracted. P3's new algorithm becomes:

P3 = .3930(1-Year Price Return) - .4500(Debt/Capital) + .0164(Price/Book) - 2.1452(Long-Term Earnings Growth Rate)
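As a rough illustration of how this equation turns a stock's factors into a score, here is a small Python sketch. The coefficients come from the equation above; the sample stock, the dictionary keys, and the assumption that returns, growth rates, and debt ratios are entered as decimals rather than percentages are hypothetical.

```python
# Coefficients from the P3 equation above.
P3_COEFFICIENTS = {
    "one_year_price_return": 0.3930,
    "debt_to_capital": -0.4500,
    "price_to_book": 0.0164,
    "long_term_earnings_growth": -2.1452,
}

def p3_score(stock):
    """Weighted sum of a stock's factor values using the P3 coefficients."""
    return sum(coef * stock[name] for name, coef in P3_COEFFICIENTS.items())

# Hypothetical stock; units (decimals, not percent) are an assumption.
example_stock = {
    "one_year_price_return": 0.10,
    "debt_to_capital": 0.30,
    "price_to_book": 2.0,
    "long_term_earnings_growth": 0.08,
}

print(round(p3_score(example_stock), 6))
```

Stocks would then be ranked by this score from highest to lowest.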

There are at least three things that are odd about this result. First, it has only one factor in common with the original algorithm, Price/Book. The new regression is compared with the old on the top two lines of Chart 2. Could things have really changed that much in the past ten years? Maybe so, given that the data underlying the original regression analysis came from the bullish 1990s, which were dominated by large-cap growth, while the data for the new one came from a period with two bear markets and a value influence. P3 has certainly traded like a growth model, so maybe now, with new data, it will trade more like a value portfolio.

Then again, the r-square is a lot lower for this regression (.0576) than for the original (.6574). R-square measures the "fit" of the regression. Essentially, it can be read as the percentage of the movement in the stocks' returns that is explained (predicted) by the independent factors. In this case, the original regression explained over 65% of the movement while the new one explains less than 6%, roughly one-eleventh as much. Doesn't give you a lot of confidence, does it?
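For readers who want to see the arithmetic, the following Python sketch shows the standard textbook formulas for r-square and adjusted r-square and compares the two figures quoted above. The function names are ours, and whether the quoted figures are plain or adjusted r-squares is not stated in this paragraph, so the comparison simply uses them as given.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` explained by `predicted`."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_observations, n_factors):
    """Penalize r-square for the number of factors in the regression."""
    return 1.0 - (1.0 - r2) * (n_observations - 1) / (n_observations - n_factors - 1)

# Figures quoted in the text for the 2000 and 2010 P3 regressions.
old_r2, new_r2 = 0.6574, 0.0576
ratio = old_r2 / new_r2
print(f"The old fit explains about {ratio:.1f} times as much as the new one")
```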
OUR QUANT MODELS
Portfolio 3
  • Top 30 Stocks Based on Stepwise Regression Across All Stocks of the S&P 500
  • No Attempt is Made to Sector-Weight this Portfolio
  • Rebalanced Every 60 Days
  • Stocks Remain in the Portfolio Until Falling Below the Top 100
  • The Highest Rated Stocks Not Already in the Portfolio are Added When Existing Constituents are Removed
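The hold-and-replace rule in these bullets can be sketched as follows. This is one reading of the rules, assuming a best-first ranked list of tickers is already available; the function name, parameter names, and tiny sample universe are hypothetical.

```python
def rebalance_p3(ranked_tickers, current_holdings, size=30, hold_cutoff=100):
    """Keep holdings still ranked within `hold_cutoff`, then refill to `size`.

    ranked_tickers: tickers ordered best-first by regression score.
    """
    rank = {t: i + 1 for i, t in enumerate(ranked_tickers)}
    # A stock stays until it falls below the cutoff rank (or leaves the list).
    kept = [t for t in current_holdings if rank.get(t, hold_cutoff + 1) <= hold_cutoff]
    held = set(kept)
    # Fill open slots with the highest-rated stocks not already held.
    for t in ranked_tickers:
        if len(kept) >= size:
            break
        if t not in held:
            kept.append(t)
            held.add(t)
    return kept

# Scaled-down example: a 3-stock portfolio with a top-5 hold cutoff.
universe = [f"S{i}" for i in range(1, 11)]  # S1 is the best-ranked
new_holdings = rebalance_p3(universe, ["S2", "S5", "S9"], size=3, hold_cutoff=5)
print(new_holdings)
```

In the scaled-down example, S9 has fallen below the cutoff and is replaced by S1, the highest-rated stock not already held.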


Portfolio 4
  • Top Stocks of Each Sector Based on Stepwise Regression of Each Individual Sector of the S&P 500
  • Number of Stocks Selected in Each Sector Determined by Current Sector-Weightings of the S&P 500
  • Rebalanced Every June and December
  • Stocks Remain in the Portfolio for 6 Months Unless Deleted for Special Circumstance e.g. Acquisition
  • Stocks Removed for Mergers and Acquisitions are Replaced by the Next Highest Rated Stocks in Their Specific Sector
  • Benchmark: S&P 500


Portfolio 5
  • Dynamic asset allocation model based on 9 different Growth/Value/Blend and Large/Mid/Small Cap styles as defined by Morningstar's "Stylebox"
  • Index SPDRs and iShares used to represent each component of the Stylebox
  • Stylebox sectors and weightings optimized using Ibbotson's Building Block methodology
  • Reallocated mid-first month of each calendar quarter
  • Benchmark: S&P 500


Portfolio 6
  • Dynamic asset allocation model based on 5 different stock and bond asset classes
  • Index SPDRs and iShares used to represent asset class
  • Classes are rebalanced using a mean-variance optimizing model
  • Reallocated mid-first month of each calendar quarter
  • Benchmarks: (1) Static asset allocation model: 25% Domestic Bonds, 48% Domestic Large Cap Stocks, 21% Domestic Small Cap Stocks, 6% Foreign Stocks, rebalanced quarterly
    (2) Buy-and-Hold model with same asset mix as (1), but no rebalancing.

But perhaps most troubling is that odd coefficient for the Long-Term Earnings Growth Rate, -2.1452. What's up with that? Essentially this says the greater the company's Long-Term Growth Rate, the lower it scores -- by a factor of 2. Not only is this counterintuitive, the relatively high coefficient makes this the most heavily weighted factor in the algorithm.

One might be tempted to believe this is an effect of the recent bear market where shares that fell the hardest also seemed to be the ones that recovered the fastest. That could be correct, but then again, the sample period included both bull and bear markets. The fact that the deep bear market was in the recent past shouldn't outweigh the effects of the longer-running bull market that extended from late 2002 through 2007. This factor can't be so easily explained away.
Chart 2
COMPARISON OF 2010 AND 2000 REGRESSION COEFFICIENTS AND ADJUSTED R-SQUARES
P3 and New Sector Regressions for P4
Graph -- Comparison of 2010 and 2000 Regression Coefficients and Adjusted R-Squares

It's always troubling to run an analysis like this only to find statistics that simply don't make sense. The Long-Term Earnings Growth Rate in P3 appears to be such an instance. Results like this aren't so surprising when the regression's adjusted r-square is as low as it is for the new version of P3, confirming that the results may not be that reliable. Over the next ten years, we'll find out.

 

Portfolio 4 -- The 50% Solution
The S&P 500 has ten sectors grouping stocks of firms in like industries. There's little reason to believe the same factors are equally influential to performance across all ten sectors. Consider the Price to Book (P/B) Ratio, which divides a firm's share price by its book value per share (essentially the assets remaining after all debts have been netted out). This is an important factor when assessing financial firms because it provides a good measure of their core strength. But it's a terrible metric for the Technology sector because tech companies have very few assets other than their employees. Obviously P/B doesn't carry the same weight for Tech stocks as it does for Financials.

P4 acknowledges this by creating separate algorithms for each of the ten sectors. Like P3, each is based on a regression of annualized returns against the same ten independent factors, but this time one stepwise regression is run for each of the ten sectors. P4 then comprises the top stocks in each sector as selected by their particular algorithm. The number chosen for each sector is set in accordance with the sector weightings of the benchmark S&P 500 at the time of the evaluation. As a result, P4's excess return (whether positive or negative) comes expressly from the stocks themselves and not their relative weighting versus the benchmark.
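One way to turn sector weightings into a whole number of stocks per sector is sketched below. The article does not say how the rounding is done or how many stocks P4 holds in total, so the largest-remainder rounding, the 50-stock total, and the sector weights shown are all assumptions for illustration, not actual index data.

```python
def sector_slots(weights, total_slots):
    """Allocate `total_slots` across sectors in proportion to `weights`.

    Uses largest-remainder rounding (an assumed method) so the counts
    always add up to `total_slots`.
    """
    total_weight = sum(weights.values())
    raw = {s: w * total_slots / total_weight for s, w in weights.items()}
    slots = {s: int(r) for s, r in raw.items()}
    leftover = total_slots - sum(slots.values())
    by_remainder = sorted(raw, key=lambda s: raw[s] - slots[s], reverse=True)
    for s in by_remainder[:leftover]:
        slots[s] += 1
    return slots

# Illustrative weights in percent; NOT actual S&P 500 sector weights.
hypothetical_weights = {
    "Technology": 19,
    "Financials": 15,
    "Healthcare": 13,
    "Energy": 11,
    "Other sectors combined": 42,
}
allocation = sector_slots(hypothetical_weights, total_slots=50)
print(allocation)
```

Because the counts mirror the benchmark's sector mix, any difference in return versus the benchmark comes from which stocks are picked within each sector.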

When P4 was originally created in 2000, the S&P 500 actually had eleven sectors before transportation stocks were combined with industrials in mid-2001. The fallout of the tech bubble sent a lot of companies (and their stocks) to their death, and many constituents have come and gone from the index in the past ten years. The only major change that had a bearing on the 2009 regression analysis came from the Telecommunications sector which now only houses eight stocks, too few to run a regression with ten independent variables. It was, therefore, impossible to come up with a new algorithm for the Telecom sector.
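The reason eight stocks cannot support ten factors comes down to degrees of freedom. The short Python sketch below assumes one observation per stock and an intercept term in the regression, neither of which is spelled out in the text; the function names are ours.

```python
def residual_degrees_of_freedom(n_stocks, n_factors, intercept=True):
    """Observations left over after estimating every coefficient."""
    return n_stocks - n_factors - (1 if intercept else 0)

def can_estimate(n_stocks, n_factors, intercept=True):
    """A regression needs at least one leftover degree of freedom."""
    return residual_degrees_of_freedom(n_stocks, n_factors, intercept) > 0

telecom_df = residual_degrees_of_freedom(n_stocks=8, n_factors=10)
print(telecom_df, can_estimate(8, 10))
```

With fewer observations than coefficients, the equations have no unique solution and no error term can be estimated, so significance cannot be tested.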

This was an unexpected result and left little choice but to maintain the 2000 algorithm. While not based on today's components, it was at least derived from stocks in the sector with similar characteristics. This was not the ideal outcome, but was the best -- if not only -- alternative available.
Chart 3
FACTOR FREQUENCY IN 2010 P4 REGRESSIONS
Graph -- Factor Frequency in 2010 P4 Regressions

Even more surprisingly, we encountered similar problems in four other sectors. While all nine remaining sectors had far more than the minimum number of stocks needed for the regressions, the Consumer Discretionary, Energy, Healthcare, and Industrials sectors had no statistically significant independent factors. In other words, when we ran their stepwise regressions, each of the ten independent factors was removed from the result because each lacked a significant relation with the stocks' corresponding annualized return. The math was saying there was no real connection.

Again, what to do? We considered using proxies for the individual stocks from these sectors. That would leave P4 with individual stocks from five sectors where we could run regressions, the 2000 algorithm for the Telecom sector, and four ETFs each representing one of the four regression-less sectors. That was an easy solution, but would certainly water down what P4 was created to do -- beat the unmanaged index -- and put all the weight on the six non-ETF sectors. Not an acceptable solution.

Instead, we decided to employ the same approach as for the Telecom sector -- stick with the algorithms from 2000. As with Telecom, these were at least regressions based on stocks from these sectors and not simply sector-tracking vehicles like ETFs. The final column in Chart 1 shows the date of the regressions now being used in P4.

None of the sectors failed to have statistically significant independent factors in 2000, despite the fact that we used nine factors rather than ten. So why did the 2009 regressions fail in these instances? Certainly there were a lot of changes over the past ten years, but were they significant enough to render previously related factors totally inert? That seems quite unlikely.

One explanation is that they always were unrelated and we just got spurious results in 2000. One of the problems with regression analyses arises from the possibility of finding apparent connections between completely unrelated factors. A classic example is the relatively high correlation between the population of Canada and the number of television sets sold in the U.S. These are two completely unrelated series of data, yet they appear to be highly related. This is a spurious relation and perhaps that's what the 2000 regressions produced in regard to the four sectors in question.
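A spurious relation of this kind is easy to reproduce. The Python sketch below uses two made-up series that share nothing but an upward trend; the numbers are invented for illustration and are not actual population or sales figures. Comparing year-over-year changes instead of levels is one common sanity check, added here as our own illustration rather than anything done in the regressions described in this article.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def changes(series):
    """Year-over-year differences, which strip out a shared trend."""
    return [b - a for a, b in zip(series, series[1:])]

years = range(20)
# Invented series: each rises over time with its own unrelated wiggle.
population = [30.0 + 0.3 * t + (0.2 if t % 2 else -0.2) for t in years]
tv_sales = [20.0 + 1.1 * t + (1.5 if t % 3 == 0 else -0.7) for t in years]

level_corr = pearson(population, tv_sales)
change_corr = pearson(changes(population), changes(tv_sales))
print(f"levels: {level_corr:.2f}, changes: {change_corr:.2f}")
```

The levels appear strongly related only because both series trend upward; once the trend is removed, the apparent connection largely disappears.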

That's definitely a possibility, but it seems rather unlikely. If it occurred for one sector, it would be more believable, but not for four. Just looking at the independent factors used in the regressions, it's difficult to believe that not even one would have some relation with the sector's long-term results.

Frankly, we're at a loss to explain this. That's why we're not uncomfortable using the 2000 regressions for these four sectors. Again, as with the Telecom sector, time will tell how well this choice plays out.

 

Factors of Choice
Chart 2 shows the factors that were statistically significant in the 2010 regressions, and compares them to those from the 2000 set. The actual coefficients are displayed along with the adjusted r-squares. In each instance, the adjusted r-square was higher in the older models than in the new. (This seems to lend some credence to the argument that the four sectors in which the regressions did not work may have simply fallen even further, yet it's still hard to believe that they could have fallen to such an extent that no one factor was at least marginally significant.)

Once again, as with P3, it's remarkable how little overlap there is between the factors for each sector in the 2000 and 2009 regressions. In almost every instance, those now used weren't used before. That's not what you'd expect, but may simply be a reflection of the changing equity environment.

Chart 3 shows the frequency of use for each independent factor in the 2010 version of P4. Each was used at least twice and none were used more than four times. Three of the most used (Debt to Capital, Price to Book, and Price to Cash Flow) are traditional value metrics, a definite change from the earlier regressions that displayed such a growth orientation. This is not surprising given that the 1990s were a decade of growth leadership while value led the way in the 2000s.

Among the new algorithms, Consumer Staples has that same odd large negative coefficient for Long-Term Earnings Growth Rate as P3. In fact, the Consumer Staples coefficient is even larger, -14.7538. That makes absolutely no sense. Not only that, the coefficient is so large that it clearly dominates the two other independent factors in the algorithm. Judging from this, this sector should have some interesting results. If it's any consolation, the Consumer Discretionary sector has had a high negative coefficient for this factor since 2000 and the results weren't too out of line.

On the other hand, the generally negative coefficients for Debt to Capital make a lot of sense -- the lower the leverage, the more revenue reaches the bottom line. The positive coefficients for the 1-Year Price Return are also reasonable. Finally, the large positive coefficients for Long-Term Earnings Growth Rate for Energy and Utilities are much more in line with expectations than the large negative ones noted above.

There's really very little else of note in the new regressions. Time will tell how they will work out with the first indications coming in 2010 as the full year gets underway with the new models. The current mix was selected in mid-December using closing prices and statistics as of December 11. Just looking at the factors in the algorithms, we would expect the new models to be more of a blend of value and growth than their predecessors. In a turbulent market, this may well prove to be the proper mix.


 
