Thursday, May 14, 2009

Monte Carlo Simulation: The Problem is Not Skinny Tails

Monte Carlo simulation is a convenient statistical tool by which an analyst can make inferences about the probability of rare events. Basically, the method takes a model with its estimated parameters and simulates possible sample paths under some distributional assumptions on the errors. The technique is particularly useful for understanding the tail risk of distributions. The method has been widely adopted by banks, investors, and financial advisors. And, because the models failed so horribly in the current downturn, the method is now under attack.

A lot of discussion over the failure of these models has centered on the use of normally distributed errors. The Normal distribution is the mainstay of statistical theory, but the distribution places relatively little weight on tail events. In other words, the distribution treats rare events as rare.

The proposed solution is to work with fat-tailed distributions, distributions that put more weight on tail events. These distributions assign higher probability to extreme outcomes, so the simulation produces rare events more often and reports greater risk.

Monte Carlo simulation faces a far more severe problem than its distributional assumptions. To run a Monte Carlo simulation, the parameters of the distribution must be known. With a normality assumption, we need to know only the mean and the covariance matrix. Yet, even assuming the model is correctly specified, we can never know these parameters. We have at best a good guess.

This problem of parameter uncertainty swamps any distributional assumptions. Rare events are rare. They are not observed in any reasonably sized sample. Every sample is going to produce slightly different estimates of the model parameters, and these slightly different parameters have huge implications for the probability of tail events.
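
To see how much this matters in the tail, here is a minimal sketch with made-up numbers (the means and standard deviations below are purely illustrative, not estimates from any data): two parameter pairs that a short sample could easily fail to distinguish imply very different probabilities for a large monthly decline.

    from scipy.stats import norm

    # Two parameter estimates that different finite samples could plausibly produce.
    # The numbers are purely illustrative, not estimates from IP data.
    params = {"sample A": (0.25, 0.60),   # monthly mean growth (%), std dev (%)
              "sample B": (0.20, 0.80)}

    for label, (mu, sigma) in params.items():
        # Probability of a monthly decline of 2 percent or more under normality
        p = norm.cdf(-2.0, loc=mu, scale=sigma)
        print(f"{label}: P(growth < -2%) = {p:.4%}")

The two parameter pairs differ by amounts well within ordinary sampling error, yet the implied tail probability differs by more than an order of magnitude.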

I think the easiest way to demonstrate the problem is with an example. I will run a Monte Carlo simulation to find the probability of Industrial Production recessions. I assume the growth rate of IP is an i.i.d. normally distributed process.

But, I estimate the mean and variance of the process over different periods. First, I choose the Great Moderation era, 1984 to 2006. Many statistical models of the economy restrict themselves to this period. This is a reasonable sample if you believe there was a regime shift in the early 1980s. Second, I extend the sample through March 2009. Third, I use the post-war period and, finally, the full sample from 1921.
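
Here is a minimal sketch of the exercise, assuming monthly i.i.d. normal log growth (the parameter values below are placeholders; in the actual exercise they would be the mean and standard deviation of monthly IP growth estimated over each sample period, and a "recession" is read as IP dipping below its starting level at some point in a five-year window):

    import numpy as np

    def recession_probability(mu, sigma, months=60, n_sims=100_000, seed=0):
        """Simulate i.i.d. normal monthly log growth and return the fraction of
        five-year paths in which the IP index falls below its starting level."""
        rng = np.random.default_rng(seed)
        growth = rng.normal(mu, sigma, size=(n_sims, months))  # monthly log growth
        log_level = np.cumsum(growth, axis=1)                   # log IP relative to start
        dipped = log_level.min(axis=1) < 0                      # fell below start at some point
        return dipped.mean()

    # Placeholder (mu, sigma) for each sample period; these are NOT the estimates
    # used in the post and should be re-estimated from the IP data.
    samples = {"1984-2006": (0.0025, 0.005),
               "1984-2009": (0.0020, 0.007),
               "post-war":  (0.0028, 0.009),
               "1921-2009": (0.0030, 0.015)}

    for label, (mu, sigma) in samples.items():
        p = recession_probability(mu, sigma)
        print(f"{label}: P(recession in any 5-year window) = {p:.2f}")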

The bars in the chart below show the probability that IP will fall below its starting value over any five-year period. Essentially, the probability is the odds of a recession occurring over any five-year interval. The number above each bar gives the average interval between recessions.

Between 1984 and 2006, there is a fifty percent chance of a recession in any five-year period, with a recession occurring every 9.8 years on average. Extending the sample two additional years changes the probability significantly: using data from 1984 to 2009, a recession occurs once every 7.4 years. That figure is strikingly similar to the result for the entire post-war era, the third bar. Finally, using the entire IP sample, a recession occurs every 6.3 years on average and the odds of a recession in any five-year period are over 80 percent.
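
The two numbers on each bar appear to be two views of the same quantity: if five-year windows are treated as roughly independent, the expected interval between recessions is about the window length divided by the per-window probability, so a roughly fifty percent chance per five years implies an interval of about ten years, and an eighty percent chance implies roughly six years.
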
The average size of recession (shown in the next chart) also changes across the different samples. The ordering is preserved, with the Great Moderation having the mildest downturns and the full sample, which includes the Great Depression, having the harshest recessions.
Again using the Great Moderation sample, the tail risk is also small. The following chart shows the average decline in output in a five-percent recession, that is, a recession severe enough that it occurs only five percent of the time. The tail fall in output during the Great Moderation is less than 2 percent.
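
In terms of the simulation sketch above, this tail figure would presumably correspond to something like the 5th percentile of each path's deepest decline, the fall exceeded in only one of every twenty simulated five-year windows.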


Takeaway: Monte Carlo simulation suffers far more from parameter uncertainty than from shortcomings in the choice of distribution. Slight changes in the sample used yield drastically different simulations. And, the correct sample is unknowable.

By the way, the problem of parameter uncertainty is endemic in forecasting models. We want to use history to guide our judgment of the future, but history is a fickle guide.


1 comment:

Anonymous said...

An equally bad problem comes from the fact that any model we make is just that: a model. Generally, a Monte Carlo model is a low-dimensional stochastic model and reality is a very high-dimensional stochastic model. Unmodeled variables can be very significant. Models are not very useful for predicting the exact future, but they are essential for understanding phenomena. I think all of science uses models; it is just that they are not all of a mathematical type.