Multicollinearity is usually considered a data deficiency and therefore transformations are generally performed prior to estimation of a model. First differencing in SHAZAM is easy. See the GENR command and the LAG function. To create the first difference of a variable called 'price', first do:

genr lagprice = lag(price)


to create a variable called lagprice.

Then subtract this from price as follows:

genr dprice = price - lagprice


to create the differenced price variable dprice. You may print all 3 variables using:

print price lagprice dprice


Note that SHAZAM will automatically detect the sample size when loading a dataset, but you can set the sample at the start, where the syntax is sample beg end, e.g.

sample 1 200


The first difference form is usually expressed as follows:

$$Y_t - Y_{t-1} = \beta_2 (X_{2,t}-X_{2,t-1}) + \beta_3 (X_{3,t}-X_{3,t-1}) + ... + \beta_K (X_{K,t}-X_{K,t-1}) + \varepsilon_t$$
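As a sketch of where this form comes from, assume the levels model has an intercept $\beta_1$ and an error written here as $u_t$ (both are assumptions for illustration, not stated above). Subtracting the lagged equation from the current one gives:

$$\begin{aligned}
Y_t &= \beta_1 + \beta_2 X_{2,t} + \beta_3 X_{3,t} + ... + \beta_K X_{K,t} + u_t \\
Y_{t-1} &= \beta_1 + \beta_2 X_{2,t-1} + \beta_3 X_{3,t-1} + ... + \beta_K X_{K,t-1} + u_{t-1} \\
Y_t - Y_{t-1} &= \beta_2 (X_{2,t}-X_{2,t-1}) + \beta_3 (X_{3,t}-X_{3,t-1}) + ... + \beta_K (X_{K,t}-X_{K,t-1}) + (u_t - u_{t-1})
\end{aligned}$$

Under these assumptions the intercept cancels, and the error of the differenced equation is $\varepsilon_t = u_t - u_{t-1}$.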

The definition of multicollinearity as an exact linear relationship between explanatory variables is nowadays broadened to include cases where the X variables are "intercorrelated", i.e. the relationship is less exact. Algebraically, the exact case says that there is a linear relationship between the variables, typically expressed as (with the $\alpha$'s not all zero):

$$\alpha_1X_1+\alpha_2X_2+...+\alpha_KX_K=0$$

In this case, one approach is to run a regression with a transformation of the data. For example, if it is believed that $\beta_3=0.4\beta_2$ then $\beta_2X_{2,t}+\beta_3X_{3,t}=\beta_2(X_{2,t}+0.4X_{3,t})$, so create $X_{z,t}=X_{2,t}+0.4X_{3,t}$, regress $Y$ on $X_z$, then calculate $\beta_2,\beta_3$ from the postulated relation using the estimate of $\beta_z$ from the regression (that is, $\beta_2=\beta_z$ and $\beta_3=0.4\beta_z$).
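A minimal sketch of this restricted regression in SHAZAM might look as follows. The variable names y, x2, x3 and xz are hypothetical, the data are assumed to be loaded already, and the 200 observations simply reuse the sample example above.

```shazam
* Hypothetical variables y, x2 and x3 are assumed to be loaded already
sample 1 200
* Combine the two regressors under the assumed restriction beta3 = 0.4*beta2
genr xz = x2 + 0.4*x3
* The coefficient on xz estimates beta2; beta3 is then 0.4 times that estimate
ols y xz
```

The value 0.4 is only the postulated ratio from the example; the estimates are only as good as that prior belief.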

The Ratio Transformation Method can also be used to reduce collinearity in the original variables by dividing through by one of the variables, i.e.

$$Y_t/X_{3,t} = \beta_1(1/X_{3,t}) + \beta_2(X_{2,t}/X_{3,t}) + \beta_3 + \varepsilon_t/X_{3,t}$$
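To see how the ratio form arises, assume the original model is the three-variable levels regression below and that $X_{3,t}$ is never zero (both are assumptions made for this sketch). Dividing every term by $X_{3,t}$ gives:

$$\begin{aligned}
Y_t &= \beta_1 + \beta_2 X_{2,t} + \beta_3 X_{3,t} + \varepsilon_t \\
\frac{Y_t}{X_{3,t}} &= \beta_1 \frac{1}{X_{3,t}} + \beta_2 \frac{X_{2,t}}{X_{3,t}} + \beta_3 + \frac{\varepsilon_t}{X_{3,t}}
\end{aligned}$$

So $\beta_3$ becomes the intercept of the transformed regression and $\beta_1$ becomes the coefficient on $1/X_{3,t}$. If $\varepsilon_t$ has constant variance $\sigma^2$ and $X_{3,t}$ is treated as fixed, the transformed error has variance $\sigma^2/X_{3,t}^2$, which is no longer constant.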

Using the same GENR command approach above it is possible to create each of these transformations before using the OLS command to estimate the regression.
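For instance, a first-difference regression could be sketched as below. The variable names y, x2 and x3 are hypothetical, and the sample range of 200 observations is carried over from the example above; this is an illustration under those assumptions rather than a definitive recipe.

```shazam
* Hypothetical variables y, x2 and x3 are assumed to be loaded already
sample 1 200
* First differences, built with GENR and LAG as for price above
genr dy = y - lag(y)
genr dx2 = x2 - lag(x2)
genr dx3 = x3 - lag(x3)
* Observation 1 has no lag, so start the estimation sample at observation 2
sample 2 200
* The first difference form shown above has no intercept
ols dy dx2 dx3 / noconstant
```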

The way to generate a constant in SHAZAM for the active sample length is

genr myconst = dum(1)


and then for a variable called $X_3$ transform myconst as follows

genr myconst = myconst / X3


then include it in the regression as the regressor $1/X_{3,t}$. Note that SHAZAM includes a constant by default, which in the ratio form above estimates $\beta_3$; use the NOCONSTANT option when specifying the OLS command whenever you supply your own constant term instead.
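Putting the pieces together, the ratio regression might be set up as in the sketch below. The names y, x2, x3, yr and x2r are hypothetical, and the data are assumed to be loaded with 200 observations and no zero values of x3.

```shazam
* Hypothetical variables y, x2 and x3 are assumed to be loaded already
sample 1 200
* Build the 1/X3 regressor from a generated constant
genr myconst = dum(1)
genr myconst = myconst / x3
* Divide the dependent variable and X2 through by X3
genr yr = y / x3
genr x2r = x2 / x3
* SHAZAM's default constant plays the role of beta3 in the ratio form
ols yr myconst x2r
```

In this sketch the coefficient on myconst corresponds to $\beta_1$ and the coefficient on x2r to $\beta_2$.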

Naturally all transformations come with a cost. Differencing loses one observation (there is no lag of the first observation), and any transformation also changes the residual: does transforming it make it serially correlated, or break other assumptions of the CLR model?