## Saturday, September 8, 2012

### Modeling Gold Returns: A Case Study

In my previous post, I considered the problem of induction. Looking at emeralds and noting that they are green (or grue) is supposed to be a simple example since making color judgements is presumably a simple task. The example could be further complicated if we factor in the vagueness of making color judgements. After all, it's likely the case that the distribution of electromagnetic frequencies differs between one emerald and the next. (And it would differ further depending on the distribution of EM frequencies in the "white light" shone upon it.)

In most real world scientific inquiries, these uncertainties are present and need to be dealt with. So instead of focusing on the sorts of examples used in "simple" philosophical thought experiments, I thought I would provide a more detailed example.

The motivation for looking at this came from an interesting paper entitled The Golden Dilemma. I will make frequent reference to Exhibits from this paper. [1]

Part of the goal here will be to consider the way that induction and abduction interplay in a scientific inquiry. Furthermore, I will be examining the theory-laden aspect of scientific inquiry.

I will start with Exhibit 5. I've reproduced a similar chart in Figure 1:

Figure 1 [2]

One thing that needs to be considered is how these quantities were measured. For example, there are multiple series of CPI. I chose CPI-U less food and energy. I also looked at CPI-U all items, but CPI-U less food and energy gives a slightly better correlation. Furthermore, CPI-U less food and energy is the basis for policy measures by the Fed.

One could easily criticize my choice or even the choice to use any CPI to measure "inflation". For instance, the equation of exchange has a "price level" quantity which is technically a vector. If that's the case, why should we suppose that the price level is a specific quantity (e.g. a "scalar" in mathematical language) when it's not something like that at all? [3]

Which is "Green" and which is "Grue"?

I ask this question only because the suggestion, at least, is that the process I follow here should (why?) resemble the considerations about emeralds that are green (grue, grellow, etc.). That question should sit in the back of our minds as we proceed.

The graph looks as though I could draw a function through it and all of the points would be quite close to that function. The question is: which function ought I to choose?

With most data sets, it will be unlikely that any function will fit (exactly) all of the points. So one goal is to find a function that is a close approximation to the data points. There are different techniques to do so. I will restrict myself to least squares. [4]

As it turns out, it will be convenient to look at CPI to Gold ratio versus 10 year real returns. The reason will be more clear with the following graph:

Figure 2

Obviously the fit is not perfect but it does a pretty decent job. The coefficient of determination is pretty high (0.8475).
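For readers who want to see what the fit involves, here is a minimal sketch of a least squares line and its coefficient of determination in plain Python. The function name `fit_line` and the sample points are my own illustrative assumptions; the numbers are made up and are not the FRED series behind Figure 2.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination: 1 - residual SS / total SS.
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Made-up illustrative points (hypothetical, NOT the data used in this post).
cpi_to_gold = [0.2, 0.3, 0.4, 0.5, 0.6]
real_returns = [-0.07, -0.03, 0.01, 0.03, 0.08]

slope, intercept, r2 = fit_line(cpi_to_gold, real_returns)
print(f"Returns = {slope:.3f} * CPI/Gold + {intercept:.3f}, R^2 = {r2:.4f}")
```

A spreadsheet trendline should give the same slope, intercept and R² for the same points, since it solves the same minimization.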

But my goal is not merely to find a function which fits this particular data set but rather will do a good job of fitting any similar data set. What reason do I have for believing that it will fit another data series of CPI to Gold and 10 year real gold returns? Or to give it a more practical nature, what reason do I have for believing it will hold true in the future?

As I noted in The Problem of Induction, there are two avenues to examine this issue. The first is to apply the same formula to a different data set. Unfortunately I don't have a different data set. But I can split up this data set into two parts. I have chosen to separate it into 1970-1985 and 1985-2002. This gives me three different models:

Table 1

| Data Set | Model | Breakeven Gold/CPI |
|---|---|---|
| 1970-2002 | Returns = 0.351 CPI/Gold - 0.137 | 2.562 |
| 1970-1985 | Returns = 0.303 CPI/Gold - 0.110 | 2.758 |
| 1985-2002 | Returns = 0.577 CPI/Gold - 0.245 | 2.355 |
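The breakeven column can be recovered from each model, assuming "breakeven" means the Gold/CPI ratio at which the fitted line predicts a zero real return. Setting Returns = 0 gives CPI/Gold = -intercept / slope, so Gold/CPI = -slope / intercept. The helper name below is my own, and the coefficients are the rounded ones from Table 1.

```python
def breakeven_gold_to_cpi(slope, intercept):
    """Gold/CPI ratio at which returns = slope * (CPI/Gold) + intercept is zero."""
    # Zero return: CPI/Gold = -intercept / slope, so Gold/CPI is the reciprocal.
    return -slope / intercept


# (slope, intercept) pairs as printed in Table 1, rounded to three decimals.
models = {
    "1970-2002": (0.351, -0.137),
    "1970-1985": (0.303, -0.110),
    "1985-2002": (0.577, -0.245),
}

for period, (slope, intercept) in models.items():
    print(period, round(breakeven_gold_to_cpi(slope, intercept), 3))
```

With these rounded coefficients the 1970-1985 row comes out near 2.755 rather than the tabulated 2.758. My guess is that the table was computed from unrounded coefficients, but that is an assumption.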

Notice that when looking at the 1985-2002 data set, the slope and intercept are quite different compared to the other two data sets. The other data sets are much closer.

How close do they have to be? This is a question of uncertainty. For example, if I am to judge "all emeralds are green" I have to know how wide my boundaries are on what I classify as "green". That discussion will have to be left for later.

In any event, I'm not sure that I have any inductive evidence that any of these formulas will work well in the future. They have done decently in the past. Consider the formula derived from the 1970-2002 data set:

Figure 3

As Table 1 suggested, the period from 1985 to 2002 turned out differently. As a result, this model didn't do as good a job of predicting returns as it did in the previous period. What about now? Right now the model is predicting returns of about -8%.
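To make the current prediction concrete, here is a small sketch that plugs a Gold/CPI ratio into the 1970-2002 model. I use a ratio of 7, the approximate figure quoted later in this post; the exact current ratio is an assumption, and the function name is mine.

```python
def predicted_return(gold_to_cpi, slope, intercept):
    """Evaluate returns = slope * (CPI/Gold) + intercept at a given Gold/CPI."""
    return slope * (1.0 / gold_to_cpi) + intercept


gold_to_cpi_now = 7.0  # "about 7"; assumed, not an exact reading
r = predicted_return(gold_to_cpi_now, 0.351, -0.137)
print(f"Predicted 10 year real return: {r:.1%}")
```

At a ratio of exactly 7 this gives roughly -8.7%, in line with the "about -8%" figure; the result is sensitive to the ratio used.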

There is more that can be done with the inductive approach. For instance, we might examine different models. Furthermore, we might consider the possibility that there are other variables at play here. Perhaps there is some factor that we have not considered that takes effect around 1985. We'll leave those questions aside for the moment.

The other avenue of inquiry is an abductive approach. Do we have theoretical reasons for preferring one model over any other? In The Golden Dilemma, the authors suggest a sort of "mean reversion" model. This would suggest that the Gold to CPI ratio ought to settle at some "mean" value. If the ratio is higher than that value, then statistically it will revert to that mean and returns will be (statistically) lower. Likewise, if it is lower than the mean then returns will be (statistically) higher.

This theory, however, does not suggest any particular model. It would not enable me to decide which (if any) of the above models will work in the future. It makes rather vague predictions. Nonetheless, it would tell me that since the Gold/CPI ratio (which is about 7) is high compared to historical levels (see Table 2), real returns for gold will be comparatively low.

Table 2

| Data Set | Mean Gold/CPI | STD |
|---|---|---|
| 1970-1985 | 3.187 | 1.731 |
| 1985-2002 | 2.437 | 0.689 |
| 1970-2012 | 3.034 | 1.531 |
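One way to put a number on "high compared to historical levels" is to ask how many standard deviations the current ratio sits above each period's mean. This is only a sketch: it takes the ratio to be 7 (the post says "about 7"), and the z-score framing is my own addition rather than anything from The Golden Dilemma.

```python
def z_score(value, mean, std):
    """Number of standard deviations by which value exceeds mean."""
    return (value - mean) / std


# (mean Gold/CPI, standard deviation) per period, copied from Table 2.
table_2 = {
    "1970-1985": (3.187, 1.731),
    "1985-2002": (2.437, 0.689),
    "1970-2012": (3.034, 1.531),
}

gold_to_cpi_now = 7.0  # assumed value for "about 7"
z_scores = {
    period: z_score(gold_to_cpi_now, mean, std)
    for period, (mean, std) in table_2.items()
}
for period, z in z_scores.items():
    print(f"{period}: {z:.2f} standard deviations above the mean")
```

Against the full 1970-2012 sample the ratio is roughly 2.6 standard deviations above the mean, which is the sense in which mean reversion points to low returns without saying how low.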

Concluding Remarks

So how does this "real world" example match up with our grue emeralds? I believe that the various possible functions that relate the variables in our data set represent different "inductive generalizations" like "green", "grue", etc. There are potentially an infinite number of such functions which "fit" the data. As in our emerald example, looking at one data set may not be sufficient to choose which function, if any, will apply in the future.

The approach of splitting up the data set showed us how we might rule out (or at least be skeptical of) particular relationships. For example, the large differences in the slopes of the models presented in Table 1 show that we probably do not have the "correct" relationship.

It's also possible that there are other variables that are not present. This would compare to an emerald example in which some emeralds were green while others were blue. We might need an additional variable (other than the item being an "emerald") that would enable us to differentiate the two. Likewise, we might need variables other than Gold/CPI if we are to explain future gold returns.

Although we did not present any specific theory that might choose one model over another, we do have a more general theory that the Gold/CPI data ought to mean revert. This theory would suggest that the general relationship of high (low) Gold/CPI will result in lower (higher) returns.

In summary, this example provides us with a closer look at the problem of induction and what avenues an inquiring mind has to resolve such issues. The approaches frequently require more research (collecting more data sets, developing theories, etc).

Notes:

[1] All references to "Exhibits" will be to the exhibits from The Golden Dilemma. All references to "Figures" will refer to the specific tables and charts inserted into this blog.
[2] All data, unless otherwise noted, comes from FRED.
[3] Of course one can be fancy and measure the length (the square root of the transpose dotted with the vector) of P but I'm not sure how meaningful that would be since each price is "weighted" by a given quantity.
[4] There are a few reasons for doing so. First, it's the technique I'm most familiar with. Secondly, Excel has some great features that permit me to use it quite easily. Lastly, since many functions can be approximated by a Taylor expansion, one can use least squares linear regression to model a wide variety of functions.