Chapter 18 – ARIMA Models

johnkane

The econometric models discussed in previous chapters were all based, to a greater or lesser extent, on economic theory. Each of the equations discussed in earlier chapters was either a structural or a reduced-form equation corresponding to some economic model. These models are well suited to hypothesis tests involving model parameters. As noted earlier, they may also be used for forecasting, and some of the large macroeconomic models designed for forecasting consist of hundreds of equations. Economists have discovered, however, that a simpler model specification, known as an autoregressive integrated moving average (ARIMA) model, often forecasts as well as or better than many of these elaborate structural models. This chapter provides an introduction to ARIMA models.
\section{Overview}
As developed by Box and Jenkins (1976), an ARIMA model is based on the assumption that future values of economic time series can often be well explained by current and past values of the series and the error terms in the series. As noted above, the full name for this type of model is: \textbf{autoregressive integrated moving average (ARIMA) model}. The autoregressive models discussed in Chapter~\ref{auto.chap} are special cases of this more general form of time-series model.\footnote{In this earlier discussion, autoregressive models were applied to the residual in a regression model. In the discussion that follows, however, these models will be directly applied to a time series variable. The techniques developed in this chapter, however, may also be used to explain the time path of a regression residual.}
Most economic time-series variables exhibit a substantial amount of autocorrelation. ARIMA modelling involves an attempt to use this pattern of correlation to predict future outcomes from current and past information.
Initially, the discussion will focus on \textbf{stationary time-series processes}. As noted in Chapter~\ref{unitroots.chap}, a time-series variable, $Y_t$, is said to be stationary if: \begin{itemize}
\item the mean of the series is constant across time, \item the variance of the series is constant across time, and \item the covariance between $Y_t$ and $Y_{t-s}$ depends only on $s$ and is unaffected by $t$.
\end{itemize}
If a time-series variable is stationary, each outcome can be thought of as a draw from a random process that is the same in all time periods. An extensive literature exists dealing with the analysis of stationary time-series processes.
Four classes of stationary time-series models are discussed in this chapter: \begin{itemize}
\item white-noise error processes,
\item autoregressive models,
\item moving-average models, and
\item autoregressive moving average models.
\end{itemize}
Before discussing each model in detail, it will be helpful to examine the definition of each of these models.
\subsection{White-noise error processes}
A white-noise error process, $\epsilon _t$, is a random variable that has a mean of zero, has a constant variance, and is uncorrelated across time. In mathematical terms, this means that:
\begin{equation*}
\begin{array}{l}
E(\epsilon _t)=0 \\
E(\epsilon _t^2)=\sigma _\epsilon ^2 \\
E(\epsilon _t\epsilon _s)=0\text{ for }t\neq s
\end{array}
\end{equation*}
If an error process consists of white noise, past error terms contain no information about the current error term. As noted in earlier chapters, the assumptions of the classical regression process require that the error terms in a regression model be white noise.
More generally, if a variable follows a white-noise process, current and past values of the series cannot be used to generate useful forecasts of future outcomes. The best forecast of a white-noise process is equal to zero (the expected value of the process); this forecast is the same for all possible past realizations of the process.
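The three white-noise conditions can be illustrated with a small simulation. The following is a minimal sketch, assuming Gaussian draws with unit variance (an assumption that is stronger than white noise requires); the seed and sample size are arbitrary illustrative choices.

```python
import random

random.seed(42)
n = 10_000
eps = [random.gauss(0.0, 1.0) for _ in range(n)]  # independent N(0, 1) draws

mean = sum(eps) / n
variance = sum((e - mean) ** 2 for e in eps) / n
# Sample correlation between eps_t and eps_{t-1}
lag1 = sum((eps[t] - mean) * (eps[t - 1] - mean) for t in range(1, n)) / (n * variance)

print(f"mean={mean:.3f}  variance={variance:.3f}  lag-1 correlation={lag1:.3f}")
```

In a sample of this size, the sample mean and the lag-1 correlation should both be close to zero, which is consistent with the claim that past values of a white-noise series carry no information about future values.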
\subsection{Autoregressive models — AR($p$)} In an \textbf{autoregressive model}, the current value of some time series variable is assumed to be a function of past values of the series. For example, suppose that the variable $Y_t$ can be described by a $p$th order autoregressive process. In this case, the variable $Y_t$ can be expressed as:
\begin{equation}
Y_t=\phi _o+\phi _1Y_{t-1}+\phi _2Y_{t-2}+\cdots +\phi _pY_{t-p}+\epsilon _t \label{ar.p.arima}
\end{equation}
A model of this form is referred to as an AR($p$) process.
The error terms in equation \ref{ar.p.arima} are assumed to be a white-noise error process. In an autoregressive process, the current value of a time series is determined solely by past values of the series ($Y_{t-i}$) and the current period’s error term ($\epsilon _t$).
\subsection{Moving average models — MA($q$)} In a moving average process, the current value of a time series variable is assumed to be a function of current and past random shocks. A $q$th-order moving average process may be stated as:
\begin{equation}
Y_t=\epsilon _t-\theta _1\epsilon _{t-1}-\theta _2\epsilon _{t-2}-\cdots -\theta _q\epsilon _{t-q} \label{ma.q.arima} \end{equation}
As equation \ref{ma.q.arima} indicates, in a moving average model, the current value of the dependent variable is a weighted sum of current and past error terms ($\epsilon _t$). These error terms are assumed to satisfy the conditions for a white-noise error process.
\subsection{Autoregressive moving average models — ARMA($p,q$)} An ARMA($p,q$) model involves a combination of the AR($p$) and MA($q$) models described above. Under this specification, the current value of the dependent variable is assumed to be affected by $p$ lagged values of the dependent variable and the current and $q$ lagged values of a white-noise error term. In mathematical terms, this model may be expressed as: \begin{equation}
Y_t=\phi _o+\phi _1Y_{t-1}+\phi _2Y_{t-2}+\cdots +\phi _pY_{t-p} \label{arma.pq.arima}
\end{equation}
\begin{equation*}
+\epsilon _t-\theta _1\epsilon _{t-1}-\theta _2\epsilon _{t-2}-\cdots -\theta _q\epsilon _{t-q}
\end{equation*}
Note that the AR($p$) and MA($q$) models are special cases of this more general specification. In particular, the AR($p$) model can also be stated as an ARMA($p,0$) model, while the MA($q$) model can also be described as an ARMA($0,q$) model.
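The nesting of the AR($p$) and MA($q$) models inside the ARMA($p,q$) specification can be sketched in code. The function name \texttt{simulate\_arma}, the parameter values, the Gaussian shocks, and the treatment of pre-sample values as zero are all assumptions made for illustration; the sign convention on the $\theta$ terms follows equation \ref{arma.pq.arima}.

```python
import random

def simulate_arma(phi0, phi, theta, shocks):
    """Y_t = phi0 + sum_i phi[i-1]*Y_{t-i} + e_t - sum_j theta[j-1]*e_{t-j}.

    Pre-sample values of Y and e are taken to be zero (an assumption
    of this sketch).
    """
    y = []
    for t, e_t in enumerate(shocks):
        value = phi0 + e_t
        for i, phi_i in enumerate(phi, start=1):      # autoregressive part
            if t - i >= 0:
                value += phi_i * y[t - i]
        for j, theta_j in enumerate(theta, start=1):  # moving-average part
            if t - j >= 0:
                value -= theta_j * shocks[t - j]
        y.append(value)
    return y

random.seed(0)
shocks = [random.gauss(0.0, 1.0) for _ in range(200)]  # white-noise draws

ar1 = simulate_arma(0.0, [0.7], [], shocks)        # ARMA(1,0), i.e. AR(1)
ma1 = simulate_arma(0.0, [], [0.5], shocks)        # ARMA(0,1), i.e. MA(1)
arma11 = simulate_arma(0.0, [0.7], [0.5], shocks)  # ARMA(1,1)
```

Passing an empty list for \texttt{theta} or \texttt{phi} reduces the general recursion to the pure AR or pure MA case, which mirrors the special cases noted above.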
\section{The ACF\ and PACF\ functions}
In practice, econometricians generally do not know whether a time-series random variable follows an AR($p$), MA($q$), or ARMA($p,q$) process. Each of these processes, however, can be described by its autocorrelation function (ACF) and its partial autocorrelation function (PACF). AR, MA, and ARMA\ processes generate ACF and PACF\ functions with different characteristic features. Under the Box-Jenkins modelling procedure, the selection of a model is guided, in part, by a comparison of sample estimates of these functions with the theoretical ACF\ and PACF functions for alternative processes.
Let’s examine each of these functions.
\subsection{The autocorrelation function (ACF)} The \textbf{autocorrelation function (ACF)} is an important tool in ARIMA modelling. The autocorrelation function associated with a time-series variable $Y_t$ is defined as:
\begin{equation*}
\rho (k)=\frac{E(y_ty_{t-k})}{E(y_t^2)}
\end{equation*}
\begin{equation*}
\text{where: }y_t=Y_t-E(Y_t)
\end{equation*}
Thus, the autocorrelation function simply provides the theoretical correlations that exist between the current and lagged values of the series for each possible lag, $k$. As will be discussed below, AR($p$), MA($q$) and ARMA($p$, $q$) models will generate ACF\ functions with different characteristics.
The sample ACF is defined as:
\begin{equation*}
\hat{\rho}(k)=\frac{\sum\limits_{t=k+1}^{N}\left( Y_t-\overline{Y}\right) \left( Y_{t-k}-\overline{Y}\right) }{\sum\limits_{t=1}^{N}\left( Y_t-\overline{Y}\right) ^2}
\end{equation*}
As will be shown below, a comparison of the sample ACF function with its theoretical counterparts will help to select an appropriate model specification.
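The sample ACF formula translates directly into code. The sketch below is one possible implementation; the function name \texttt{sample\_acf} and the short example series are hypothetical and chosen only so that the arithmetic can be checked by hand.

```python
def sample_acf(y, k):
    """Sample autocorrelation of the series y at lag k."""
    n = len(y)
    y_bar = sum(y) / n
    # Numerator: sum over t = k+1, ..., N (indices k, ..., n-1 when counting from zero)
    num = sum((y[t] - y_bar) * (y[t - k] - y_bar) for t in range(k, n))
    # Denominator: sum of squared deviations over the full sample
    den = sum((value - y_bar) ** 2 for value in y)
    return num / den

y = [1.0, 2.0, 3.0, 4.0, 5.0]
for lag in range(3):
    print(lag, sample_acf(y, lag))
```

For this series the deviations from the mean are $-2,-1,0,1,2$, so the denominator is 10 and the lag-1 numerator is 4.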
\subsection{The partial autocorrelation function (PACF)} A second function, known as the \textbf{partial autocorrelation function (PACF)}, is also needed to identify the form of an ARIMA\ model. The partial autocorrelation of a time series variable, $y_t$, at lag $k$ is defined to be the correlation between $y_t$ and $y_{t-k}$ after controlling for the effect of $y_{t-1},y_{t-2},\ldots ,y_{t-(k-1)}$. In other words, the partial autocorrelation at lag $k$ is a measure of the effect of $y_{t-k}$ on $y_t$ after controlling for the effect of more recent lags of $y_t$. Consider the following set of equations:
\begin{equation}
y_t=\alpha _{11}y_{t-1}+u_{1t} \label{yw.arima} \end{equation}
\begin{equation*}
y_t=\alpha _{21}y_{t-1}+\alpha _{22}y_{t-2}+u_{2t} \end{equation*}
\begin{equation*}
\vdots
\end{equation*}
\begin{equation*}
y_t=\alpha _{m1}y_{t-1}+\alpha _{m2}y_{t-2}+\cdots +\alpha _{mm}y_{t-m}+u_{mt}
\end{equation*}
This equation system consists of a sequence of autoregressive models for the variable $y_t$. The first equation in \ref{yw.arima} is a first-order autoregressive model; the second equation is a second-order autoregressive model. In general, the $i$th equation in this system is an autoregressive process of order $i$.
Suppose that each of the equations appearing in equation system \ref{yw.arima} is estimated using a least-squares estimation procedure. In each equation, the estimated parameter $\hat{\alpha}_{ii}$ represents the portion of $y_t$ that is explained by $y_{t-i}$ after controlling for the effect of more recent lagged values of $y_t$. When these equations are estimated: \begin{itemize}
\item $\hat{\alpha}_{11}$ is a measure of the effect of $y_{t-1}$ on the level of $y_t$ (since there are no “more recent” lags available),
\item $\hat{\alpha}_{22}$ is a measure of the effect of $y_{t-2}$ on $y_t$ after controlling for the effect of $y_{t-1}$,
\item $\hat{\alpha}_{33}$ is a measure of the effect of $y_{t-3}$ on $y_t$ after controlling for the effect of $y_{t-1}$ and $y_{t-2}$, $\ldots $, and
\item $\hat{\alpha}_{mm}$ is a measure of the effect of $y_{t-m}$ on $y_t$ after controlling for the effect of $y_{t-1}$, $y_{t-2}$, $\ldots $, $y_{t-(m-1)}$.
\end{itemize}
Thus, each of the estimated $\hat{\alpha}_{ii}$ coefficients serves as an estimate of the PACF at lag $i$.\footnote{For a more detailed discussion of the estimation of the partial autocorrelation function, see Box and Jenkins (1976), p. 65.} The first $m$ lags of the estimated PACF are therefore given by: \begin{equation*}
\text{PACF(1) = }\hat{\alpha}_{11}
\end{equation*}
\begin{equation*}
\text{PACF(2) = }\hat{\alpha}_{22}
\end{equation*}
\begin{equation*}
\vdots
\end{equation*}
\begin{equation*}
\text{PACF(}m\text{) = }\hat{\alpha}_{mm} \end{equation*}
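The sequence of $\alpha _{ii}$ coefficients can also be obtained from the autocorrelations themselves. The sketch below does not run the $m$ least-squares regressions described above; it substitutes the Durbin-Levinson recursion, which solves the corresponding Yule-Walker equations and, in finite samples, gives values that are close to but not identical to the least-squares estimates. The function name \texttt{pacf\_from\_acf} and the example values are assumptions made for illustration.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations at lags 1..m from autocorrelations.

    rho[k-1] holds the autocorrelation at lag k. Uses the Durbin-Levinson
    recursion (a swapped-in technique, not the regressions in the text).
    """
    pacf = []
    prev = []  # alpha_{k-1,1}, ..., alpha_{k-1,k-1} from the previous step
    for k in range(1, len(rho) + 1):
        if k == 1:
            a_kk = rho[0]
        else:
            num = rho[k - 1] - sum(prev[j - 1] * rho[k - j - 1] for j in range(1, k))
            den = 1.0 - sum(prev[j - 1] * rho[j - 1] for j in range(1, k))
            a_kk = num / den
        # Update the lower-order coefficients, then append alpha_kk
        prev = [prev[j - 1] - a_kk * prev[k - j - 1] for j in range(1, k)] + [a_kk]
        pacf.append(a_kk)
    return pacf

# Autocorrelations 0.6, 0.36, 0.216, 0.1296 (a geometric decay)
print(pacf_from_acf([0.6 ** k for k in range(1, 5)]))
```

With geometrically declining autocorrelations, the first partial autocorrelation equals the first autocorrelation and the remaining values are zero up to rounding error.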
\section{Identification\label{ident_section_arima}} Under the Box-Jenkins procedure, the specific form of ARMA model is selected by an examination of the estimated ACF and PACF. Each type of ARMA model generates a particular form of the autocorrelation and partial autocorrelation functions. The process of selecting a model that is most consistent with the observed ACF\ and PACF functions is known as the \textbf{identification process}. (Note that this concept of “identification” is quite different from the identification problem associated with simultaneous equation models.) Let’s examine the criteria used for this purpose.
\subsection{Autoregressive processes}
\subsubsection{AR(1) models}
Before considering the general case of a $p$th-order autoregressive process, it will be helpful to consider the simpler case of an AR(1) model. This model may be expressed as:
\begin{equation}
y_t=\rho y_{t-1}+\epsilon _t \label{ar1.arima} \end{equation}
In period $t-1$ this relationship may be expressed as: \begin{equation}
y_{t-1}=\rho y_{t-2}+\epsilon _{t-1} \label{lag.ar1.arima} \end{equation}
Substituting equation \ref{lag.ar1.arima} into equation \ref{ar1.arima} results in:
\begin{equation*}
y_t=\rho ^2y_{t-2}+\rho \epsilon _{t-1}+\epsilon _t \end{equation*}
Repeating this procedure for $k$ successive lags of $y_t$, this becomes: \begin{equation*}
y_t=\rho ^{k+1}y_{t-(k+1)}+\rho ^k\epsilon _{t-k}+\rho ^{k-1}\epsilon _{t-(k-1)}+\cdots +\rho ^2\epsilon _{t-2}+\rho \epsilon _{t-1}+\epsilon _t \end{equation*}
Allowing for an infinite time horizon, this can be expressed as: \begin{equation}
y_t=\sum_{i=0}^\infty \rho ^i\epsilon _{t-i} \label{invert.arima} \end{equation}
Thus, as equation \ref{invert.arima} indicates, an AR(1) process with $\left| \rho \right| <1$ can be written in an equivalent form as an MA($\infty $) model (the leading term $\rho ^{k+1}y_{t-(k+1)}$ vanishes as $k$ increases only in this case).
An AR(1) model is stationary if $\left| \rho \right| <1$. As equation \ref{invert.arima} indicates, in a stationary AR(1) process the effect of past random shocks gradually becomes smaller as time passes (since $\rho ^i$ tends to zero as $i$ increases if $\left| \rho \right| <1$).
Using equation \ref{invert.arima}, the variance of an AR(1) process can be expressed as:
\begin{equation}
var(y_{t})=E\left[ (\sum_{i=0}^{\infty }\rho ^{i}\epsilon _{t-i})(\sum_{j=0}^{\infty }\rho ^{j}\epsilon _{t-j})\right] \label{var.ar1.arima}
\end{equation}%
\begin{equation*}
=\sum_{i=0}^{\infty }\sum_{j=0}^{\infty }\rho ^{i}\rho ^{j}E(\epsilon _{t-i}\epsilon _{t-j})
\end{equation*}%
But:
\begin{equation*}
E(\epsilon _{t-i}\epsilon _{t-j})=0\text{ for }i\neq j \end{equation*}%
and
\begin{equation*}
E(\epsilon _{t-i}\epsilon _{t-j})=\sigma _{\epsilon }^{2}\text{ for }i=j \end{equation*}%
Thus, the variance of $y_{t}$ can be stated as:\footnote{Note that the variance in equation \ref{paq.arima} would be infinite if $\left\vert \rho \right\vert \geq 1$, since the terms in the summation would not shrink as $i$ increases. (This is why $\left\vert \rho \right\vert $ must be less than one for the series to be stationary.)} \begin{equation}
var(y_{t})=\sigma _{\epsilon }^{2}\sum_{i=0}^{\infty }\left( \rho ^{2}\right) ^{i} \label{paq.arima}
\end{equation}%
As long as $\left\vert \rho \right\vert <1$, this can be simplified to:\footnote{%
The sum appearing in equation \ref{paq.arima} can be written as:
\begin{equation*}
S=1+\rho ^{2}+\rho ^{4}+\rho ^{6}+\rho ^{8}+\cdots \hspace{1in}(1)
\end{equation*}%
Multiplying both sides of this equation by $-\rho ^{2}$ results in:
\begin{equation*}
-\rho ^{2}S=-\rho ^{2}-\rho ^{4}-\rho ^{6}-\rho ^{8}-\rho ^{10}-\cdots \hspace{1in}(2)
\end{equation*}%
Adding equations (1) and (2) results in:
\begin{equation*}
S-\rho ^{2}S=1
\end{equation*}%
Simplifying:
\begin{equation*}
S=\frac{1}{1-\rho ^{2}}
\end{equation*}%
}
\begin{equation*}
var(y_{t})=\frac{\sigma _{\epsilon }^{2}}{1-\rho ^{2}} \end{equation*}
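The convergence of the infinite sum to the closed-form variance can be checked numerically. The following sketch uses illustrative values $\rho =0.8$ and $\sigma _{\epsilon }^{2}=1$ (assumptions, not values from the text) and a hypothetical helper named \texttt{ar1\_variance\_partial}.

```python
def ar1_variance_partial(rho, sigma2, terms):
    """Partial sum: sigma2 * sum_{i=0}^{terms-1} (rho^2)^i."""
    return sigma2 * sum((rho ** 2) ** i for i in range(terms))

rho, sigma2 = 0.8, 1.0
closed_form = sigma2 / (1.0 - rho ** 2)  # sigma2 / (1 - rho^2)

for terms in (5, 20, 100):
    print(terms, ar1_variance_partial(rho, sigma2, terms), closed_form)

# With |rho| > 1 the partial sums keep growing instead of settling down
print(ar1_variance_partial(1.1, sigma2, 10), ar1_variance_partial(1.1, sigma2, 50))
```

The partial sums approach the closed-form value when $\left\vert \rho \right\vert <1$, while the last line shows them growing without bound for an explosive value of $\rho $.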
Let’s examine the ACF for an AR(1) process. For $k=1$, this is defined as: \begin{equation} \label{acf.ar1.1.arima} ACF(1)=\frac{E(y_ty_{t-1})}{E(y_t^2)}
\end{equation}
The numerator of this expression can be restated as: \begin{equation} \label{abc.arima}
E(y_ty_{t-1})=E\left[ \left( \rho y_{t-1}+\epsilon _t\right) y_{t-1}\right] \end{equation}
\begin{equation*}
=\rho E(y_{t-1}^2)+E(\epsilon _ty_{t-1})
\end{equation*}
Since $y_{t-1}$ depends only on error terms from period $t-1$ and earlier, it is uncorrelated with the current error term in an AR model, so $E(\epsilon _ty_{t-1})=0$. Thus, equation \ref{abc.arima} reduces to:
\begin{equation} \label{cov.ar1.arima}
E(y_ty_{t-1})=\rho E(y_{t-1}^2)
\end{equation}
Substituting equation \ref{cov.ar1.arima} into equation \ref{acf.ar1.1.arima} results in:
\begin{equation*}
ACF(1)=\frac{\rho E(y_{t-1}^2)}{E(y_t^2)} \end{equation*}
Since it is assumed that the variance of this process is constant across time, $E(y_{t-1}^2)=E(y_t^2)$. Thus, at lag 1, the ACF for an AR(1) process is:
\begin{equation*}
ACF(1)=\rho
\end{equation*}
Using the same procedure for each lag, it can be easily shown that the ACF for an AR(1) process is given by:
\begin{equation*}
ACF(k)=\rho ^{k}
\end{equation*}%
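One way to fill in the omitted step is to work through lag 2. Substituting equation \ref{ar1.arima} for $y_t$, and using the fact that $\epsilon _t$ is uncorrelated with $y_{t-2}$:
\begin{equation*}
E(y_ty_{t-2})=E\left[ \left( \rho y_{t-1}+\epsilon _t\right) y_{t-2}\right] =\rho E(y_{t-1}y_{t-2})
\end{equation*}
Applying the lag-1 result one period earlier gives $E(y_{t-1}y_{t-2})=\rho E(y_{t-2}^2)$, so that (with a constant variance):
\begin{equation*}
ACF(2)=\frac{\rho ^2E(y_{t-2}^2)}{E(y_t^2)}=\rho ^2
\end{equation*}
The same substitution at any lag $k\geq 1$ yields the recursion $ACF(k)=\rho \,ACF(k-1)$, which, starting from $ACF(0)=1$, produces $\rho ^k$.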
Thus, for a stationary AR(1) process, the value of the ACF asymptotically approaches zero as the lag length increases. If $\rho $ is positive the ACF will always be positive; the ACF will alternate in sign, however, if $\rho $ is negative. Figure~\ref{acf_ar1_g_arima} illustrates this relationship.
\begin{center}
\FRAME{ftbpFU}{4.9078in}{4.9502in}{0pt}{\Qcb{ACF\ for an AR(1) process}}{\Qlb{acf_ar1_g_arima}}{fig18-1.gif}{\special{language "Scientific Word";type "GRAPHIC";maintain-aspect-ratio TRUE;display "USEDEF";valid_file "F";width 4.9078in;height 4.9502in;depth 0pt;original-width 4.8542in;original-height 4.8957in;cropleft "0";croptop "1";cropright "1";cropbottom "0";filename 'GRAPHS/Fig18-1.gif';file-properties "XNPEU";}} \end{center}
Consider the PACF for an AR(1) process. As noted above, the PACF is a measure of the correlation between $Y_{t}$ and $Y_{t-k}$ after controlling for the effect of $Y_{t-i}$ (for $i<k$). For $k=1$, the PACF will, by definition, always equal the value of the ACF. After controlling for the effect of $Y_{t-1}$, however, the partial correlation between $Y_{t-k}$ and $Y_{t}$ equals zero for $k>1$. Thus, the PACF equals zero for $k>1$.
Figure~\ref{pacfar1_g_arima} contains a graph of the PACF for an AR(1) model.
\begin{center}
\FRAME{ftbpFU}{4.8974in}{5.0652in}{0pt}{\Qcb{PACF\ for an AR(1) process}}{\Qlb{pacfar1_g_arima}}{fig18-2.gif}{\special{language "Scientific Word";type "GRAPHIC";maintain-aspect-ratio TRUE;display "USEDEF";valid_file "F";width 4.8974in;height 5.0652in;depth 0pt;original-width 4.8438in;original-height 5.0107in;cropleft "0";croptop "1";cropright "1";cropbottom "0";filename 'GRAPHS/Fig18-2.gif';file-properties "XNPEU";}} \end{center}
\subsubsection{AR($p$) model}
The results for higher-order autoregressive processes are quite similar to those for an AR(1) model.\footnote{%
A derivation of these results is beyond the scope of the current text. A more detailed discussion may be found in Box and Jenkins (1976), Granger (1980), Vandaele (1983), or Harvey (1993).} Under a stationary AR($p$) model, the ACF gradually approaches zero. For lags less than or equal to $p$, the PACF will generally be nonzero. If the lag is greater than $p$, however, the value of the PACF will equal zero (since only the first $p$ lags of $Y_{t}$ have a direct effect on the current value of $Y_{t}$).
Thus, for a $p$th-order autoregressive process: \begin{itemize}
\item the ACF function will gradually taper off; and \item the PACF equals zero for lags greater than $p$.
\end{itemize}
\subsection{Moving average processes}
\subsubsection{MA(1)}
Before considering the general case of an MA($q$) model, it will be useful to consider the simpler case of an MA(1) process. In this case, the model is given by:
\begin{equation}
Y_t=\epsilon _t-\theta \epsilon _{t-1} \label{ma.1.arima} \end{equation}
\begin{equation*}
\text{where: }\epsilon _t\text{ is a white-noise error process} \end{equation*}
Since $E(Y_t)=0$, the variance of $Y_t$ is equal to: \begin{equation*}
var(Y_t)=E\left[ (\epsilon _t-\theta \epsilon _{t-1})^2\right] \end{equation*}
\begin{equation*}
=E(\epsilon _t^2-2\theta \epsilon _t\epsilon _{t-1}+\theta ^2\epsilon _{t-1}^2)
\end{equation*}
\begin{equation*}
=E(\epsilon _t^2)-2\theta E(\epsilon _t\epsilon _{t-1})+\theta ^2E(\epsilon _{t-1}^2)
\end{equation*}
Since $\epsilon _t$ is a white-noise process, $E(\epsilon _t\epsilon _{t-k})=0$ (for $k\neq 0$) and $E(\epsilon _t^2)=E(\epsilon _{t-1}^2)=\sigma ^2$. Thus, the variance of an MA(1) process is given by: \begin{equation*}
var(Y_t)=\sigma ^2(1+\theta ^2)
\end{equation*}
Let’s examine the autocorrelation function for an MA(1) process. The autocorrelation at lag $k$ is given by:
\begin{equation*}
ACF(k)=\frac{E(Y_tY_{t-k})}{E(Y_t^2)}
\end{equation*}
\begin{equation*}
=\frac{E\left[ \left( \epsilon _t-\theta \epsilon _{t-1}\right) \left( \epsilon _{t-k}-\theta \epsilon _{t-k-1}\right) \right] }{\sigma ^2(1+\theta ^2)}
\end{equation*}
Thus,
\begin{equation} \label{xvz.arima}
ACF(k)=\frac{E(\epsilon _t\epsilon _{t-k})-\theta E(\epsilon _t\epsilon _{t-k-1})-\theta E(\epsilon _{t-1}\epsilon _{t-k})+\theta ^2E(\epsilon _{t-1}\epsilon _{t-k-1})}{\sigma ^2(1+\theta ^2)} \end{equation}
If $k=1$, equation \ref{xvz.arima} reduces to: \begin{equation*}
ACF(1)=\frac{-\theta E(\epsilon _{t-1}^2)}{\sigma ^2(1+\theta ^2)} \end{equation*}
\begin{equation*}
=\frac{-\theta }{1+\theta ^2}
\end{equation*}
It is interesting to note that for $k=1$, the sign of the moving average parameter is the opposite of the sign of the ACF function. For $k>1$, however, the ACF equals zero. (The proof of these propositions is left to the reader as an exercise.) Thus, for an MA(1) process, the ACF is nonzero for $k=1$, but equals zero for $k>1$.
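The cutoff in the ACF can be verified numerically by computing autocovariances directly from the weights on the error terms. The sketch below assumes the sign convention of equation \ref{ma.q.arima}; the function name \texttt{ma\_acf} and the value $\theta =0.5$ are illustrative assumptions.

```python
def ma_acf(theta, k):
    """Theoretical ACF at lag k for Y_t = e_t - theta_1 e_{t-1} - ... - theta_q e_{t-q}."""
    psi = [1.0] + [-th for th in theta]  # weights on e_t, e_{t-1}, ..., e_{t-q}
    # Only products of weights on the same shock survive; none exist when k > q
    gamma_k = sum(psi[i] * psi[i + k] for i in range(len(psi) - k))
    gamma_0 = sum(w * w for w in psi)    # variance divided by sigma^2
    return gamma_k / gamma_0

theta = 0.5
print(ma_acf([theta], 1))   # equals -theta / (1 + theta^2)
print(ma_acf([theta], 2))   # zero: no shock is shared at lag 2
print(ma_acf([-theta], 1))  # sign flips with the sign of theta
```

Because the variance $\sigma ^2$ appears in both the numerator and the denominator, it cancels and does not need to be supplied.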
To investigate the PACF for an MA(1) model, it will be helpful to transform the MA(1) process into an equivalent autoregressive process. As stated in equation \ref{ma.1.arima}, an MA(1) process can be stated as: \begin{equation}
Y_{t}=\epsilon _{t}-\theta \epsilon _{t-1} \label{ma.1.1.arima} \end{equation}%
Since this result is assumed to hold in each time period: \begin{equation}
Y_{t-1}=\epsilon _{t-1}-\theta \epsilon _{t-2} \label{ma.1.2.arima} \end{equation}%
\begin{equation}
Y_{t-2}=\epsilon _{t-2}-\theta \epsilon _{t-3} \label{ma.1.3.arima} \end{equation}%
\begin{equation*}
\vdots
\end{equation*}%
\begin{equation}
Y_{t-k}=\epsilon _{t-k}-\theta \epsilon _{t-(k+1)} \label{ma.1.k.arima} \end{equation}%
Equation \ref{ma.1.2.arima} may be restated as: \begin{equation}
\epsilon _{t-1}=Y_{t-1}+\theta \epsilon _{t-2} \label{ma.2.arima} \end{equation}%
Substituting equation \ref{ma.2.arima} into equation \ref{ma.1.1.arima} results in:
\begin{equation*}
Y_{t}=\epsilon _{t}-\theta \left( Y_{t-1}+\theta \epsilon _{t-2}\right) \end{equation*}%
or:
\begin{equation}
Y_{t}=\epsilon _{t}-\theta Y_{t-1}-\theta ^{2}\epsilon _{t-2} \label{ma.2.2.arima}
\end{equation}%
Solving equations \ref{ma.1.3.arima} through \ref{ma.1.k.arima} for $\epsilon _{t-2},\epsilon _{t-3},\ldots ,\epsilon _{t-k}$ and progressively substituting these values into equation \ref{ma.2.2.arima} results in: \begin{equation*}
Y_{t}=\epsilon _{t}-\theta Y_{t-1}-\theta ^{2}Y_{t-2}-\theta ^{3}Y_{t-3}-\cdots -\theta ^{k}Y_{t-k}-\theta ^{k+1}\epsilon _{t-(k+1)} \end{equation*}%
Allowing for an infinite time horizon results in: \begin{equation}
Y_{t}=\epsilon _{t}-\theta Y_{t-1}-\theta ^{2}Y_{t-2}-\theta ^{3}Y_{t-3}-\cdots -\theta ^{k}Y_{t-k}-\cdots \label{ma.2.inf.arima} \end{equation}
As equation \ref{ma.2.inf.arima} indicates, an MA(1) model with $\left| \theta \right| <1$ can be expressed in the form of an equivalent AR($\infty $) model; this condition ensures that the remainder term $\theta ^{k+1}\epsilon _{t-(k+1)}$ vanishes as $k$ increases and that the resulting autoregressive model is stationary. A moving average model that may be transformed into a stationary autoregressive process is said to be \textbf{invertible}. Thus, an invertible moving average process may be expressed in an equivalent form as a stationary autoregressive process of infinite order.\footnote{Since the effect of past values of $Y_{t}$ declines under a stationary autoregressive process, however, an invertible moving average process can generally be adequately represented using an AR process of finite order.} In this model, $Y_{t-k}$ will be correlated with the current level of $Y_{t}$ even after the effect of $Y_{t-i}$ ($i<k$) has been taken into account.
Thus, the PACF function will be nonzero for all values of $k$. In an invertible model, the PACF will decline in absolute value as the lag length increases. Figure~\ref{acf_pacf_ma1_g_arima} contains a graph of the ACF and PACF for an invertible MA(1) model.
\begin{center}
\FRAME{ftbpFU}{5.757in}{4.9286in}{0pt}{\Qcb{ACF and PACF for an MA(1) process}}{\Qlb{acf_pacf_ma1_g_arima}}{fig18-3.gif}{\special{language "Scientific Word";type "GRAPHIC";maintain-aspect-ratio TRUE;display "USEDEF";valid_file "F";width 5.757in;height 4.9286in;depth 0pt;original-width 5.6982in;original-height 4.875in;cropleft "0";croptop "1";cropright "1";cropbottom "0";filename 'GRAPHS/Fig18-3.gif';file-properties "XNPEU";}} \end{center}
In summary, an MA(1) model is characterized by: \begin{itemize}
\item an ACF that is nonzero for $k=1$, but equals zero for $k>1;$ \item a PACF that gradually tapers off and approaches zero as $k$ increases.
\end{itemize}
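The inversion of the MA(1) process into an autoregressive form can be checked numerically. The sketch below builds an MA(1) series from simulated shocks and compares $Y_t$ with its truncated autoregressive representation; the values $\theta =0.6$ and $k=8$, the Gaussian shocks, and the zero pre-sample shock are assumptions made for illustration.

```python
import random

theta, k, n = 0.6, 8, 60
random.seed(1)
eps = [random.gauss(0.0, 1.0) for _ in range(n)]

# MA(1): Y_t = eps_t - theta * eps_{t-1}; the pre-sample shock is set to zero
y = [eps[t] - theta * (eps[t - 1] if t > 0 else 0.0) for t in range(n)]

def ar_truncation_error(t):
    """Y_t minus its AR(k) approximation eps_t - sum_{j=1..k} theta^j * Y_{t-j}."""
    approx = eps[t] - sum(theta ** j * y[t - j] for j in range(1, k + 1))
    return y[t] - approx

t = n - 1
remainder = -(theta ** (k + 1)) * eps[t - (k + 1)]  # omitted term from the derivation
print(ar_truncation_error(t), remainder)            # the two values agree
print(theta ** (k + 1))                             # weight on the omitted shock, about 0.01
```

The approximation error equals the omitted term $-\theta ^{k+1}\epsilon _{t-(k+1)}$, whose weight shrinks geometrically with $k$ when $\left| \theta \right| <1$. This is why a finite-order autoregression can serve as a reasonable stand-in for an invertible MA(1) process.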
\subsubsection{MA($q$)}
The results for an MA($q$) model are a direct extension of the results for an MA(1) model.\footnote{%
A derivation of these results is beyond the scope of this text and may be found in Box and Jenkins (1976), pp. 67-73.} In an MA($q$) model \begin{itemize}
\item the ACF is nonzero for the first $q$ lags, but equals zero for lags greater than $q$; and
\item the PACF gradually approaches zero as $k$ increases.
\end{itemize}
\subsection{Autoregressive moving average (ARMA) processes} An ARMA($p,q$) model involves a combination of the AR($p$) and MA($q$) models discussed above. A full discussion of this model is beyond the scope of this text. For our purposes, it is sufficient to note that under an ARMA model, both the ACF and PACF functions gradually taper off.
\subsection{Overview of AR, MA, and ARMA processes} In summary, the following rules are useful in the identification process: \begin{itemize}
\item In an AR($p$) process, the ACF gradually tapers off and the PACF abruptly “cuts off” (equals zero) for lags greater than $p$.
\item In an MA($q$) model, the PACF gradually tapers off and the ACF abruptly “cuts off” for lags greater than $q$.
\item In an ARMA($p,q$) model, the ACF and PACF both gradually taper off.
\end{itemize}
Figure~\ref{arma_g_arima} illustrates the ACF and PACF for several alternative model specifications.
\begin{center}
\FRAME{ftbpFU}{2.9118in}{4.7271in}{0pt}{\Qcb{ACF\ and PACF for AR, MA, and ARMA models}}{\Qlb{arma_g_arima}}{fig18-4.gif}{\special{language "Scientific Word";type "GRAPHIC";maintain-aspect-ratio TRUE;display "USEDEF";valid_file "F";width 2.9118in;height 4.7271in;depth 0pt;original-width 5.9067in;original-height 9.6254in;cropleft "0";croptop "1";cropright "1";cropbottom "0";filename 'GRAPHS/Fig18-4.gif';file-properties "XNPEU";}} \end{center}
\section{Stationarity and differencing\label{station.arima}} As noted above, an economic time-series variable is said to be stationary if the distribution is constant across time. In particular, this stationarity requirement guarantees that both the mean and the variance of the distribution are constant across time. The AR, MA, and ARMA models discussed above are all assumed to be stationary processes.\footnote{The precise conditions for stationarity are discussed in Box and Jenkins (1976), pp. 26-30. In the case of AR, MA, and ARMA processes, the stationarity requirement guarantees that the effect of past error terms must become smaller as the length of the time lag increases. In other words, in a stationary ARMA model, the more recent past will generate a larger effect on the present than the more distant past.} Most actual economic time series, however, are characterized by a significant trend component. In this case, the mean of the distribution is not constant over time. Let’s examine how a nonstationary process might be transformed into a stationary form.
\subsection{Elimination of a linear trend by differencing} \begin{center}
\FRAME{ftbpFU}{4.5351in}{3.4091in}{0pt}{\Qcb{Time-series variable exhibiting a linear trend}}{\Qlb{Fig18_5}}{fig18_5.png}{\special{language "Scientific Word";type "GRAPHIC";maintain-aspect-ratio TRUE;display "USEDEF";valid_file "F";width 4.5351in;height 3.4091in;depth 0pt;original-width 8.3333in;original-height 6.9444in;cropleft "0";croptop "1";cropright "1";cropbottom "0";filename 'GRAPHS/fig18_5.png';file-properties "XNPEU";}} \end{center}
Consider the model given by:
\begin{equation}
Y_{t}=\alpha +\beta \text{\textit{Year}}_{t}+u_{t} \label{lin.trend.arima} \end{equation}%
\begin{equation*}
\text{where: }u_{t}\text{ is a white-noise error process} \end{equation*}%
A graph of this relationship appears in Figure~\ref{Fig18_5}. Since the mean value of $Y_{t}$ is a function of the time variable (assuming that $\beta \neq 0$), $Y_{t}$ is obviously not a stationary process. Suppose, however, that we compute a first-differenced version of equation \ref{lin.trend.arima}. In order to difference this equation, it is necessary to lag equation \ref{lin.trend.arima} to form:
\begin{equation}
Y_{t-1}=\alpha +\beta \text{\textit{Year}}_{t-1}+u_{t-1} \label{lin.lag.arima}
\end{equation}%
To perform a first-difference operation, equation \ref{lin.lag.arima} is subtracted from equation \ref{lin.trend.arima} to form: \begin{equation}
Y_{t}-Y_{t-1}=\beta (\text{\textit{Year}}_{t}-\text{\textit{Year}}_{t-1})+u_{t}-u_{t-1} \label{dif.1.lin.trend} \end{equation}%
Since:
\begin{equation*}
\text{\textit{Year}}_{t}-\text{\textit{Year}}_{t-1}=1 \end{equation*}%
equation \ref{dif.1.lin.trend} simplifies to: \begin{equation}
\Delta Y_{t}=\beta +\Delta u_{t} \label{dif.lin.trend} \end{equation}
Figure~\ref{fig18_6} contains a graph of equation \ref{dif.lin.trend}. As this diagram indicates, the transformed process is stationary. The transformed variable $\Delta Y_{t}$ has a mean of:
\begin{equation*}
E(\Delta Y_{t})=\beta
\end{equation*}%
and a variance of:
\begin{equation*}
var(\Delta Y_{t})=var(\Delta u_{t})
\end{equation*}%
\begin{equation*}
=2\sigma _{u}^{2}
\end{equation*}%
\begin{equation*}
\text{(where }\sigma _{u}^{2}\,\text{ is the variance of }u_{t}\text{)} \end{equation*}
\begin{center}
\FRAME{ftbpFU}{5.7406in}{4.3128in}{0pt}{\Qcb{First difference of a time-series variable exhibiting a linear trend.}}{\Qlb{fig18_6}}{fig18_6.png}{\special{language "Scientific Word";type "GRAPHIC";maintain-aspect-ratio TRUE;display "USEDEF";valid_file "F";width 5.7406in;height 4.3128in;depth 0pt;original-width 9.0139in;original-height 6.7602in;cropleft "0";croptop "1";cropright "1";cropbottom "0";filename 'GRAPHS/fig18_6.png';file-properties "XNPEU";}} \end{center}
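The mean and variance of the first-differenced series can be checked by simulation. The following is a minimal sketch in which the parameter values ($\alpha =10$, $\beta =2$, $\sigma _{u}=1$), the Gaussian errors, the seed, and the sample length are all illustrative assumptions.

```python
import random

alpha, beta, sigma_u = 10.0, 2.0, 1.0  # illustrative parameter values
random.seed(7)
years = list(range(1, 5001))
u = [random.gauss(0.0, sigma_u) for _ in years]           # white-noise errors
y = [alpha + beta * yr + u_t for yr, u_t in zip(years, u)]

dy = [y[t] - y[t - 1] for t in range(1, len(y))]          # first difference of Y

mean_dy = sum(dy) / len(dy)
var_dy = sum((d - mean_dy) ** 2 for d in dy) / len(dy)
print(f"mean of first difference: {mean_dy:.3f} (beta = {beta})")
print(f"variance of first difference: {var_dy:.3f} (2*sigma_u^2 = {2 * sigma_u ** 2})")
```

The sample mean of the differenced series should be close to $\beta $ and its sample variance close to $2\sigma _{u}^{2}$, in line with the expressions derived above.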
In general, whenever a time-series variable contains a pure linear trend, the first difference of the variable will possess a constant mean. This suggests that, in at least some cases, first differencing may be used to convert a nonstationary process into one that is stationary. The practical application of this method of inducing stationarity is discussed in Section~\ref{id.arima}.
\subsection{Elimination of a quadratic trend by differencing} In many cases, however, an economic time series variable exhibits a nonlinear trend component. A quadratic trend model is often used to represent this type of effect. A simple quadratic trend model is given by: \begin{equation} \label{quad.trend.arima} Y_t=\alpha +\beta Year_t+\gamma Year_t^2+u_t \end{equation}
A first-differenced version of equation \ref{quad.trend.arima} is given by: \begin{equation} \label{quad.2.arima}
\Delta Y_t=\beta +\gamma (2Year_t-1)+\Delta u_t \end{equation}
(The proof is left to the reader as an exercise.) Equation \ref{quad.2.arima} can be expressed in the simpler form:
\begin{equation} \label{quad.3.arima}
\Delta Y_t=\left( \beta -\gamma \right) +2\gamma Year_t+\Delta u_t \end{equation}
Defining:
\begin{equation*}
\eta _o=\beta -\gamma
\end{equation*}
and
\begin{equation*}
\eta _1=2\gamma
\end{equation*}
equation \ref{quad.3.arima} can be expressed in the simple form: \begin{equation} \label{quad.4.arima}
\Delta Y_t=\eta _o+\eta _1Year_t+\Delta u_t \end{equation}
An inspection of equation \ref{quad.4.arima} indicates that taking the first difference of a variable that exhibits a quadratic trend creates a new variable that exhibits a linear trend. As noted above, a linear trend can be removed by differencing. Here, this involves differencing a variable that has already been first-differenced. The transformed model is given by:
\begin{equation*}
\Delta (\Delta Y_t)=\eta _1+\Delta (\Delta u_t) \end{equation*}
This is generally expressed using the simpler notation: \begin{equation} \label{quad.5.arima}
\Delta ^2Y_t=\eta _1+\Delta ^2u_t
\end{equation}
An inspection of equation \ref{quad.5.arima} indicates that the second difference of the original variable has a constant mean ($\eta _1=2\gamma $) and, provided that $u_t$ is itself stationary, will be stationary.
Figure \ref{fig18_7} provides a graph of the original series, the first difference of the series, and the second difference for a time-series variable that exhibits a quadratic trend. As this diagram indicates, the original series exhibits a nonlinear trend. The first-differenced series exhibits what appears to be a linear trend, while the second-differenced series appears to be stationary.
\begin{center}
\FRAME{ftbpFU}{2.9741in}{6.8338in}{0pt}{\Qcb{Plot of original, first-differenced, and second-differenced series for a time-series variable exhibiting a quadratic trend.}}{\Qlb{fig18_7}}{fig18_7.png}{\special% {language “Scientific Word”;type “GRAPHIC”;maintain-aspect-ratio TRUE;display “USEDEF”;valid_file “F”;width 2.9741in;height 6.8338in;depth 0pt;original-width 8.8885in;original-height 6.6668in;cropleft “0”;croptop “1”;cropright “1”;cropbottom “0”;filename ‘GRAPHS/fig18_7.png’;file-properties “XNPEU”;}} \end{center}
This discussion suggests that second-differencing may be used to convert a time-series exhibiting a quadratic trend into a stationary series.
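The algebra above can be checked numerically. The following is a minimal sketch in Python (not part of the original SAS-based analysis); the parameter values and the helper name \texttt{difference} are hypothetical, and the error term is omitted so that only the trend component is differenced.

```python
def difference(series, d=1):
    """Apply the first-difference operator d times to a list of numbers."""
    for _ in range(d):
        series = [series[i] - series[i - 1] for i in range(1, len(series))]
    return series

# Hypothetical quadratic trend without noise: Y = alpha + beta*Year + gamma*Year^2
alpha, beta, gamma = 5.0, 2.0, 0.5
years = list(range(1, 11))
y = [alpha + beta * t + gamma * t ** 2 for t in years]

first = difference(y, d=1)   # linear in Year: (beta - gamma) + 2*gamma*Year
second = difference(y, d=2)  # constant, equal to 2*gamma

print(first[:3])
print(second[:3])
```

In this noise-free case every element of \texttt{second} equals $2\gamma $, which matches the constant $\eta _1$ in equation \ref{quad.5.arima}.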
\subsection{Higher-order differencing}
It is quite possible that an economic time series might exhibit a trend relationship that appears to be approximately a cubic (or higher-order) polynomial function of time. In this case, third (or higher level) differencing would be required to induce stationarity. If the variable is a $k$-th order polynomial function of time, a first difference operation would create a transformed variable that is a $(k-1)$-th order polynomial function of time. In this case, $k$ successive differencing operations would be required to induce stationarity.
Box and Jenkins (and numerous subsequent researchers) have found, however, that in practice, first- or second-order differencing is usually sufficient to transform a nonstationary series into one that appears to exhibit stationarity.
\subsection{Other transformations}
The differencing procedures discussed above work well as long as the variable exhibits nonstationarity only in the mean. In many applications, however, the variance of the time-series variable increases with the level of the series. The differencing procedure discussed above will remove the trend component in the level of the series, but will not correct for the nonconstant variance.
When econometricians find that a series exhibits nonstationarity in both the mean and the variance they may apply either a log transformation or a square root transformation to the original series. These transformations are used to:\footnote{%
A more complete discussion may be found in Vandaele (1983), pp. 18-20.} \begin{itemize}
\item convert a nonlinear trend to one that is approximately linear, and
\item transform the series so that the variance is approximately constant over the entire range of the series.
\end{itemize}
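To see why a log transformation can achieve both objectives, consider a simple sketch. It assumes (this specification is an illustration, not a model estimated in this chapter) that the series grows at a constant rate $g$ and is subject to a multiplicative error:
\begin{equation*}
Y_t=Ae^{g\,Year_t}e^{u_t}
\end{equation*}
Taking the natural log of both sides gives:
\begin{equation*}
\ln Y_t=\ln A+g\,Year_t+u_t
\end{equation*}
The transformed series has a linear trend and an additive error whose variance does not depend on the level of the series. A first difference of the logged series then has a constant mean:
\begin{equation*}
\Delta \ln Y_t=g+\Delta u_t
\end{equation*}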
\section{Detection of nonstationarity\label{nonstation.sec}} In order to determine whether differencing or some other type of transformation is needed to convert a nonstationary time series into a stationary series, two major techniques are used. The simplest procedure is to plot the time series as a function of time. If a noticeable trend is present in the level of the series, then it is likely that the series is nonstationary. If differencing is used to induce stationarity, the differenced series is also plotted. If this differenced time series appears to have a constant mean and variance, then it is likely that no further differencing is needed. On the other hand, if a noticeable trend is still present, then a second difference operation may be in order.
As noted above, however, if the series appears to be growing at a constant rate at each point in time and the variance increases with the level of the series, then a log or square root transformation may be in order. The choice between a log or square root transformation can be based on a visual inspection of a time plot of the transformed data. If the transformation has worked appropriately, a nonlinear trend will be converted into a linear trend and the variance will appear to be constant over time.
The second technique used to detect nonstationarity involves an inspection of the ACF and PACF for the original series. If the process is nonstationary, the sample ACF will remain relatively large at all lags and the sample PACF will be close to one for $k=1$, and close to zero for all subsequent lags.
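The behavior of the sample ACF described above can be illustrated with a short simulation. The following Python sketch is only an illustration under stated assumptions: the function name \texttt{sample\_acf}, the simulated random walk, and the seed are all hypothetical choices, not part of the original analysis.

```python
import random

def sample_acf(series, max_lag):
    """Sample autocorrelations rho_hat(0), ..., rho_hat(max_lag) of a list."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

# Simulated random walk (a nonstationary series); the seed is arbitrary
rng = random.Random(0)
walk = [0.0]
for _ in range(499):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))

# First difference of the random walk (white noise by construction)
changes = [walk[t] - walk[t - 1] for t in range(1, len(walk))]

acf_level = sample_acf(walk, 10)
acf_change = sample_acf(changes, 10)
print([round(r, 2) for r in acf_level[1:6]])
print([round(r, 2) for r in acf_change[1:6]])
```

For the level of the simulated series the sample autocorrelations should remain large over the first several lags, while those of the differenced series should be close to zero.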
It should also be noted that the use of differencing to induce stationarity is often criticized by econometricians. A first-difference operation is appropriate only if the time-series variable possesses a unit root. A first-order autoregressive process with a root close to one might easily be mistaken for a nonstationary process. Many econometricians would prefer the use of a formal test for the presence of a unit root (such as the Dickey-Fuller test discussed in the appendix to Chapter \ref{unitroots.chap}) to the less formal method suggested by Box and Jenkins.
\section{ARIMA($p,d,q$)}
As noted above, most economic time-series variables are nonstationary.
Because of this, it is generally inappropriate to directly attempt to fit an ARMA model to an observed time-series variable (since an ARMA\ model is a stationary process). The ARIMA($p,d,q$) model was proposed by Box and Jenkins to deal with this situation. Under this model, the original series is differenced until the resultant series appears to be stationary. The \textquotedblleft $d$\textquotedblright\ in the ARIMA($p,d,q$) specification indicates the level of differencing needed to induce stationarity. Once this is determined, an ARMA($p,q$) model is fit to the $d$th difference of the original series. Thus, an ARIMA($p,d,q$) specification simply indicates that the $d$th difference of the original time-series variable is a stationary ARMA($p,q$) model. For example, in an ARIMA(0,1,3) model, the first difference of the time-series variable is assumed to be an ARMA(0,3) model.
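As an illustration, the ARIMA(0,1,3) example can be written out explicitly. The sketch below assumes the sign convention used for moving-average terms in this chapter ($Y_t=\epsilon _t-\theta \epsilon _{t-1}$ for an MA(1) model) and omits a constant term:
\begin{equation*}
\Delta Y_t=\epsilon _t-\theta _1\epsilon _{t-1}-\theta _2\epsilon _{t-2}-\theta _3\epsilon _{t-3}
\end{equation*}
Similarly, an ARIMA(1,1,0) model states that the first difference follows an AR(1) process:
\begin{equation*}
\Delta Y_t=\phi _o+\phi _1\Delta Y_{t-1}+\epsilon _t
\end{equation*}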
\section{Box-Jenkins Methodology}
Under the Box-Jenkins methodology the selection of the form of an ARIMA model consists of three stages:
\begin{itemize}
\item identification;
\item estimation; and
\item diagnostic checking.
\end{itemize}
Let’s examine the implementation of this procedure.
\subsection{Identification\label{id.arima}} The Box-Jenkins identification process involves the following steps: \begin{enumerate}
\item[Step 1:] Compute the ACF and PACF for the original series. If the ACF does not taper off very rapidly and the PACF has a spike at $k=1$, then at least first differencing is required to induce stationarity. (This will usually be the case.)
\item[Step 2:] Compute the ACF and PACF for the first-differenced series. If the ACF still exhibits the pattern described in Step~1, then second differencing is required. Difference the series until the ACF either tapers off or abruptly drops toward zero (becomes statistically insignificant) within a small number of lags (4 or 5). Experience has shown that first differencing will be sufficient for this purpose in most applications. In those cases in which a first-difference operation is not sufficient to induce stationarity, a second differencing will usually achieve this objective.
\item[Step 3:] Once the level of differencing is determined, the ACF\ and the PACF for the differenced series are compared to the theoretical ACF and PACF functions. The model selection rule is: \end{enumerate}
\begin{description}
\item
\begin{itemize}
\item choose an AR($p$) model if the ACF gradually tapers off and the PACF abruptly cuts off for lags greater than $p$;
\item choose an MA($q$) model if the PACF gradually tapers off and the ACF abruptly cuts off for lags greater than $q$; or
\item choose an ARMA($p,q$) model if the ACF and PACF both gradually taper off. Since it is difficult to determine the order of the AR and MA processes from an examination of the ACF and PACF, several versions of this model are often estimated by econometricians. (The choice among alternative models is discussed in more detail below).
\end{itemize}
\end{description}
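The contrast between the AR and MA patterns in the selection rule can be reproduced from theoretical autocorrelations. The following Python sketch converts an ACF into a PACF using the Durbin-Levinson recursion (a standard technique, swapped in here for illustration; it is not described in this chapter). The function name \texttt{pacf\_from\_acf} and the parameter values ($\phi _1=0.6$, $\theta =0.5$) are hypothetical.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations from autocorrelations rho[0..K] (rho[0] = 1),
    computed with the Durbin-Levinson recursion."""
    max_lag = len(rho) - 1
    pacf = [1.0]
    prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append phi_{k,k}
        cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf

# Theoretical ACFs (illustrative parameter values)
rho_ar1 = [0.6 ** k for k in range(6)]     # AR(1) with phi_1 = 0.6
rho_ma1 = [1.0, -0.4, 0.0, 0.0, 0.0, 0.0]  # MA(1), theta = 0.5: rho_1 = -theta/(1+theta^2)

pacf_ar = pacf_from_acf(rho_ar1)
pacf_ma = pacf_from_acf(rho_ma1)
print([round(v, 3) for v in pacf_ar])
print([round(v, 3) for v in pacf_ma])
```

For the AR(1) case the ACF tapers off while the PACF cuts off after lag 1; for the MA(1) case the ACF cuts off after lag 1 while the PACF tapers off gradually.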
This identification process results in the tentative specification of an ARIMA model.
\subsection{Estimation}
ARIMA models are generally estimated using a maximum likelihood (or nonlinear least squares) estimation procedure. A full discussion of the estimation of ARIMA models is beyond the scope of this text.\footnote{% Mathematically sophisticated readers may find a good discussion of the estimation of ARIMA models in Harvey (1993), pp. 48-73.} Most of the major econometric software packages contain estimators for ARIMA models. These software packages also allow the user to request estimates of the ACF and PACF for a time series variable using relatively simple commands.
\subsection{Diagnostic testing}
A relatively simple procedure can be used to assess the performance of an ARIMA model. Suppose an economist fits an ARIMA model to an economic time series variable. If the model is correctly specified, the residual term is a white-noise error process. In the case of a white-noise error process, the theoretical ACF (and PACF) is zero for $k>0$. This property can be used to construct a test for model specification based upon the sample autocorrelation function for the sample residual in an ARIMA\ model.
To test whether a particular ARIMA\ model is appropriate, the following procedure may be used:
\begin{enumerate}
\item[Step 1:] Estimate the parameters of the ARIMA\ model and store the sample residuals from this model.
\item[Step 2:] Compute either the Box-Pierce $Q$ statistic or the Ljung-Box $Q^{\prime }$ statistic for the sample residual series. As noted in Chapter \ref{auto.chap}, these statistics are computed as: \begin{equation*}
\text{Box-Pierce statistic: }Q=N\sum_{i=1}^k\hat{\rho}_i^2 \end{equation*}
\begin{equation*}
\text{Ljung-Box statistic: }Q^{\prime }=N(N+2)\sum_{i=1}^k\left( \frac{\hat{\rho}_i^2}{N-i}\right)
\end{equation*}
(As noted in Chapter \ref{auto.chap}, many econometricians prefer the Ljung-Box statistic on the grounds that it tends to perform somewhat better in small samples.) The null hypothesis in this case is: \begin{equation*}
\text{H}_o\text{: }\rho _1=\rho _2=\cdots =\rho _k=0 \end{equation*}
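Under this null hypothesis each of these statistics is asymptotically distributed as a $\chi ^2(k)$. (When the statistic is computed from the residuals of an estimated ARMA($p,q$) model, the degrees of freedom are usually reduced to $k-p-q$ to account for the estimated parameters.) It is common practice to compute this statistic for several lag lengths (\textit{e.g.,} $k=5$ and $k=10$).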
\item[Step 3:] Reject the null hypothesis if the estimated value of the statistic in Step 2 exceeds the critical value at a preselected significance level. The rejection of the null hypothesis indicates that the tentatively specified ARIMA model should be rejected. In this case, an alternative model (consistent with the observed ACF and PACF for the suitably differenced series) should be specified. This process continues until a suitable model is selected.

\end{enumerate}
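The computation in Step 2 can be sketched in a few lines of Python. This is a minimal illustration using the standard definitions of the two statistics, in which the sample autocorrelations enter in squared form; the function name \texttt{q\_statistics} and the toy residual series are hypothetical, and in practice an econometrics package would report these statistics directly.

```python
def q_statistics(residuals, k):
    """Return (Box-Pierce Q, Ljung-Box Q') from the first k sample autocorrelations."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)
    # Sample autocorrelations at lags 1, ..., k
    rho = [sum(dev[t] * dev[t - i] for t in range(i, n)) / c0
           for i in range(1, k + 1)]
    box_pierce = n * sum(r ** 2 for r in rho)
    ljung_box = n * (n + 2) * sum(r ** 2 / (n - i)
                                  for i, r in enumerate(rho, start=1))
    return box_pierce, ljung_box

# Toy residual series (made up for illustration): rho_hat(1) = -0.75
toy_residuals = [1.0, -1.0, 1.0, -1.0]
q, q_prime = q_statistics(toy_residuals, k=1)
print(q, q_prime)  # 2.25 4.5
```

The Ljung-Box statistic is always at least as large as the Box-Pierce statistic, since each squared autocorrelation is weighted by $(N+2)/(N-i)>1$.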
In addition to the use of a Box-Pierce (or Ljung-Box) statistic, econometricians will also examine the significance of the individual sample autocorrelations for the error terms. The error terms represent that portion of the variation in the series that is not accounted for by the ARIMA model.
Under an appropriately specified ARIMA\ model, the error terms should be a white-noise error process (since the model has already “extracted” all relevant past information about the series).
If a time series is generated by a white-noise error process, the estimated values of the ACF are asymptotically normally distributed. Under this assumption, each estimated autocorrelation has a sample variance approximately equal to:\footnote{%
This result is derived in Bartlett (1946).} \begin{equation*}
\widehat{var}(\hat{\rho}(k))=\frac 1N\left( 1+2\sum_{i=1}^q\hat{\rho}(i)^2\right) \text{ for }k>q
\end{equation*}
\begin{equation*}
\text{where: }N\text{ = number of observations in the series} \end{equation*}
The approximate standard error for the estimated autocorrelation is given by:
\begin{equation*}
\text{s.e.(}\hat{\rho}(k))=\sqrt{\widehat{var}(\hat{\rho}(k))} \end{equation*}
Thus, under the null hypothesis that states that the error terms are a white-noise error process, an approximate 95\% confidence interval for each estimated autocorrelation is given by:
\begin{equation*}
\hat{\rho}(k)\pm 1.96\text{s.e.(}\hat{\rho}(k)) \end{equation*}
Most econometric software packages will construct a graph of the sample ACF that contains this approximate 95\% confidence interval.
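As a worked illustration, consider the white-noise case, in which the autocorrelations inside the summation are all treated as zero ($q=0$). The variance then reduces to $1/N$, so that the approximate 95\% band around zero is:
\begin{equation*}
\pm 1.96\sqrt{\frac 1N}
\end{equation*}
For a hypothetical series with $N=100$ observations, this band is $\pm 1.96/10=\pm 0.196$. A sample autocorrelation that falls outside this band would be judged significantly different from zero at (approximately) a 5\% significance level.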
\subsubsection{Parsimonious parameterization} Box and Jenkins suggested that the model selection criterion should be based on the principle of parsimony.\footnote{%
See the discussion in Box and Jenkins (1976), pp. 17-18.} This principle suggests that, in general, it is best to select the simplest possible model that adequately explains the time series under consideration. This was particularly important during the mid 1970s when more complex models required a fairly large quantity of relatively expensive computer time.
Faster computer hardware and improved software algorithms have made the argument for parsimony somewhat weaker today. In general, however, most time-series analysts prefer to use simpler models as long as these perform as well as more complex specifications.
\section{Example: Interest yield on 10-year U.S. Treasury bonds} Let’s examine the application of the Box-Jenkins methodology. Figure~\ref{fig18_8} contains a graph of the nominal interest yield on 10-year Treasury bonds from 1982 through 2003.\footnote{%
The data used to estimate this model appear in the file \textquotedblleft int.dat\textquotedblright\ (described in Table \ref{int.dat} in Appendix \ref{data.appendix}).} An examination of this graph suggests that this series exhibits a downward trend throughout most of this period. Figure~\ref{fig18_9} contains SAS output containing the sample ACF and PACF for this series. The gradual tapering off of the ACF and the spike of the PACF at $k=1$ suggest that this interest-rate series is nonstationary. Under the Box-Jenkins methodology, this indicates a need for at least a first difference of this data.\footnote{Of course, the unit root test described in Chapter \ref{unitroots.chap} can also be used to test for the presence of a nonstationary process. The computation of the Dickey-Fuller statistic for this series is left to the reader as an exercise.}
\begin{center}
\FRAME{ftbpFU}{5.7424in}{4.3145in}{0pt}{\Qcb{Nominal yield on 10-year constant maturity U.S. Treasury maturities – monthly data \ 1/82-9/03}}{\Qlb{% fig18_8}}{fig18_8.png}{\special{language “Scientific Word”;type “GRAPHIC”;maintain-aspect-ratio TRUE;display “USEDEF”;valid_file “F”;width 5.7424in;height 4.3145in;depth 0pt;original-width 8.8885in;original-height 6.6668in;cropleft “0”;croptop “1”;cropright “0.9998”;cropbottom “0”;filename ‘GRAPHS/fig18_8.png’;file-properties “XNPEU”;}} \FRAME{ftbpFU}{5.0609in}{6.3347in}{0pt}{\Qcb{Selected SAS output – ACF and PACF functions for int10y.}}{\Qlb{fig18_9}}{fig18_9.png}{\special{language “Scientific Word”;type “GRAPHIC”;maintain-aspect-ratio TRUE;display “USEDEF”;valid_file “F”;width 5.0609in;height 6.3347in;depth 0pt;original-width 7.625in;original-height 9.5553in;cropleft “0”;croptop “1”;cropright “1”;cropbottom “0”;filename
‘GRAPHS/fig18_9.png’;file-properties “XNPEU”;}} \end{center}
Figure~\ref{fig18_10} provides a graph of the first difference of the interest rate series. As this diagram indicates, the first difference of this series appears to be stationary. This is confirmed by the estimated ACF and PACF contained in Figure~\ref{fig18_11}. An examination of the estimated ACF and PACF for the first-differenced interest-rate series indicates that the ACF is large at only the first lag while the PACF\ appears to taper off.
This suggests that an MA(1) model is an appropriate specification for the differenced series.\footnote{%
This interpretation is somewhat ambiguous since both the ACF\ and PACF taper off fairly quickly. To check the model selection, an ARIMA(1,1,1) model was also estimated. There was no substantial improvement in the results and the AR(1)\ term was not statistically significant at a 5\% level. Following the principle of parsimonious parameterization, the simpler ARIMA(0,1,1) model was selected to represent this series.} In the notation discussed above this model is called an ARIMA(0,1,1) model.
\begin{center}
\FRAME{ftbpFU}{5.7424in}{4.3145in}{0pt}{\Qcb{First difference of interest rate variable.}}{\Qlb{fig18_10}}{fig18_10.png}{\special{language “Scientific Word”;type “GRAPHIC”;maintain-aspect-ratio TRUE;display “USEDEF”;valid_file “F”;width 5.7424in;height 4.3145in;depth 0pt;original-width 8.8885in;original-height 6.6668in;cropleft “0”;croptop “1”;cropright “1”;cropbottom “0”;filename ‘GRAPHS/fig18_10.png’;file-properties “XNPEU”;}} \FRAME{ftbpFU}{5.1283in}{7.318in}{0pt}{\Qcb{SAS output of ACF and PACF functions for first-differenced interest rate series.}}{\Qlb{fig18_11}}{% fig18_11.png}{\special{language “Scientific Word”;type “GRAPHIC”;maintain-aspect-ratio TRUE;display “USEDEF”;valid_file “F”;width 5.1283in;height 7.318in;depth 0pt;original-width 9.7222in;original-height 14.7917in;cropleft “0”;croptop “1”;cropright “1.0645”;cropbottom “0”;filename ‘GRAPHS/fig18_11.png’;file-properties “XNPEU”;}} \end{center}
The estimated parameters for this model appear in Table \ref{arima011.tab}.
%TCIMACRO{%
%\TeXButton{tabular}{\begin {table}
%\begin{center}
%\begin{tabular}{|ccrr|} \hline
%\bf {Coefficient} & \bf {lag length} & \bf {Estimate} & \bf {$t$-ratio} \\ \hline %constant & 0 & -0.0396 & -1.63 \\ %MA & 1 & -0.423 & -7.49 \\ \hline %\end{tabular}
%\caption{Estimated ARIMA(0,1,1) model of the interest yield on 10-year Treasury bonds \label{arima011.tab}} %\end{center}
%\end{table}} }%
%BeginExpansion
\begin {table}
\begin{center}
\begin{tabular}{|ccrr|} \hline
\bf {Coefficient} & \bf {lag length} & \bf {Estimate} & \bf {$t$-ratio} \\ \hline constant & 0 & -0.0396 & -1.63 \\ MA & 1 & -0.423 & -7.49 \\ \hline \end{tabular}
\caption{Estimated ARIMA(0,1,1) model of the interest yield on 10-year Treasury bonds \label{arima011.tab}} \end{center}
\end{table}
%EndExpansion
This model can also be expressed in equation form as: \begin{equation}
\Delta \widehat{\text{INT10Y}}_{t}=-0.0396+\hat{\epsilon}_{t}-0.423\hat{\epsilon}_{t-1} \label{ARIMA011.bj}
\end{equation}%
To evaluate the fit of this model, it is helpful to examine the estimated ACF and PACF for the estimated residuals from this model. A plot of these estimated functions is contained in Figure~\ref{fig18_12}. As this figure indicates, each estimated autocorrelation and partial autocorrelation is insignificant at a 5\% significance level. The estimated Ljung-Box statistics are 1.11, 6.87 and 14.43 at lags of 6, 12, and 18, respectively.
Each of these estimated $\chi ^{2}$ statistics is less than the critical value at a 1\% significance level. Thus, it is not possible to reject the null hypothesis that the residuals are a white-noise error process. This suggests that the ARIMA(0,1,1) model provides an adequate representation of this interest rate series.
\begin{center}
\FRAME{ftbpFU}{5.0194in}{5.4172in}{0pt}{\Qcb{SAS output of ACF and PACF functions for residuals for interest rate series.}}{\Qlb{fig18_12}}{% fig18_12.png}{\special{language “Scientific Word”;type “GRAPHIC”;maintain-aspect-ratio TRUE;display “USEDEF”;valid_file “F”;width 5.0194in;height 5.4172in;depth 0pt;original-width 10.792in;original-height 11.6525in;cropleft “0”;croptop “1”;cropright “1”;cropbottom “0”;filename ‘GRAPHS/fig18_12.png’;file-properties “XNPEU”;}} \end{center}
The fit of this ARIMA\ model may also be evaluated by examining the time plot of the estimated residuals appearing in Figure~\ref{fig18_13}. A visual inspection of this plot shows no obvious trend in the mean or variance of the series. This is consistent with the error terms following a white-noise error process. As Figure~\ref{fig18_14} indicates, a comparison of the actual and fitted values of this series also suggests that this model represents the observed data quite well in the sample period.
\begin{center}
\FRAME{ftbpFU}{4.4719in}{3.3615in}{0pt}{\Qcb{Time-series plot of the residuals from the interest-rate series.}}{\Qlb{fig18_13}}{fig18_13.png}{% \special{language “Scientific Word”;type “GRAPHIC”;maintain-aspect-ratio TRUE;display “USEDEF”;valid_file “F”;width 4.4719in;height 3.3615in;depth 0pt;original-width 8.8885in;original-height 6.6668in;cropleft “0”;croptop “1”;cropright “1”;cropbottom “0”;filename
‘GRAPHS/fig18_13.png’;file-properties “XNPEU”;}} \FRAME{ftbpFU}{4.6942in}{3.5276in}{0pt}{\Qcb{Time-series plot of actual (Int10Y) and predicted (Ihat) interest rate series.}}{\Qlb{fig18_14}}{% fig18_14.png}{\special{language “Scientific Word”;type “GRAPHIC”;maintain-aspect-ratio TRUE;display “USEDEF”;valid_file “F”;width 4.6942in;height 3.5276in;depth 0pt;original-width 6.6668in;original-height 5.0004in;cropleft “0”;croptop “1”;cropright “1”;cropbottom “0”;filename ‘GRAPHS/fig18_14.png’;file-properties “XNPEU”;}} \end{center}
\section{Forecasting}
Since ARIMA models are often used for forecasting purposes, it will be helpful to briefly examine how such forecasts may be constructed.\footnote{A more detailed discussion of the use of ARIMA models for forecasting purposes may be found in Granger (1980), pp. 41-58. Granger also provides a good discussion of the computation of variance for the forecasts.} In each case, the forecast of the future value of the time series is based solely upon the information that is available at time $t$. To simplify the exposition, the discussion below focuses solely on AR, MA, and ARMA models.\footnote{This discussion can be easily generalized to ARIMA models. For example, suppose that a first-difference operation is used to induce stationarity. In this case, the forecast value of the differenced series provides an estimate of the change that occurs in the original series. As long as a starting value is known, the forecast of the change in the level of the series makes it possible to readily construct a forecast for the level of the series.} In each of the cases discussed below, it is assumed that the population parameters are known for the AR, MA, or ARMA model under discussion. Of course, in practice, these parameters are not known \textit{a priori} and must be estimated by the econometrician. To generate forecasts using estimated AR, MA, or ARMA models each of the population parameters is replaced by its estimated value.
\subsection{AR models}
As defined above, an AR($p$) model may be stated as: \begin{equation} \label{for.ar.p.arima}
Y_t=\phi _o+\phi _1Y_{t-1}+\phi _2Y_{t-2}+\cdots +\phi _pY_{t-p}+\epsilon _t \end{equation}
\begin{equation*}
\text{where: }\epsilon _t\text{ is a white-noise error process} \end{equation*}
In period $t+1$, equation \ref{for.ar.p.arima} can be stated as: \begin{equation} \label{lead.ar.arima}
Y_{t+1}=\phi _o+\phi _1Y_t+\phi _2Y_{t-1}+\cdots +\phi _pY_{t-p+1}+\epsilon _{t+1}
\end{equation}
Each of the terms on the right-hand side of equation \ref{lead.ar.arima} is assumed to be known at time $t$ with the exception of $\epsilon _{t+1}$. A forecast value of $Y_{t+1}$ can be created by replacing the unknown future error term with its mathematical expectation (= 0): \begin{equation} \label{poi.arima}
\hat Y_{t+1}=\phi _o+\phi _1Y_t+\phi _2Y_{t-1}+\cdots +\phi _pY_{t-p+1} \end{equation}
\begin{equation*}
\text{where: }\hat Y_{t+1}\text{ is the one-step ahead forecast of }Y_{t+1} \end{equation*}
It is also possible to generate a forecast for the value of $Y_t$ that will occur two or more periods in the future. Note that: \begin{equation} \label{lead.ar.2.arima}
Y_{t+2}=\phi _o+\phi _1Y_{t+1}+\phi _2Y_t+\cdots +\phi _pY_{t-p+2}+\epsilon _{t+2}
\end{equation}
In equation \ref{lead.ar.2.arima}, there are two right-hand side values that are not known at period $t$: $Y_{t+1}$ and $\epsilon _{t+2}$. Once again, however, it is possible to generate a forecast by replacing the unknown values with the best forecast of these values (given the information available at time $t$). Thus, the two-step ahead forecast is given by: \begin{equation} \label{poi.2.arima}
\hat Y_{t+2}=\phi _o+\phi _1\hat Y_{t+1}+\phi _2Y_t+\cdots +\phi _pY_{t-p+2} \end{equation}
where $\hat Y_{t+1}$ is computed using equation \ref{poi.arima}. Since equation \ref{poi.2.arima} contains a predicted right-hand side variable, additional “noise” is introduced into the forecast. Therefore the variance of the two-step ahead forecast is greater than the variance of the one-step ahead forecast given by equation \ref{poi.arima}.
By continuing this process, it is possible to generate forecasts three or more periods ahead. Short-term forecasts, however, will be more reliable than longer-term forecasts.
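The recursive procedure described above can be sketched in Python as follows. This is a minimal illustration that treats the parameters as known; the function name \texttt{ar\_forecasts} and the numerical values are hypothetical.

```python
def ar_forecasts(history, phi0, phis, horizon):
    """Recursive forecasts from an AR(p) model.

    history: observed values, most recent last (at least p values)
    phis:    [phi_1, ..., phi_p]
    """
    values = list(history)
    forecasts = []
    for _ in range(horizon):
        # phi_1 multiplies the most recent value, phi_2 the one before it, ...
        y_hat = phi0 + sum(phi * values[-lag]
                           for lag, phi in enumerate(phis, start=1))
        forecasts.append(y_hat)
        values.append(y_hat)  # unknown future values are replaced by forecasts
    return forecasts

# AR(1) with phi_0 = 1, phi_1 = 0.5 and last observed value Y_t = 4
print(ar_forecasts([4.0], phi0=1.0, phis=[0.5], horizon=3))  # [3.0, 2.5, 2.25]
```

In this example the forecasts move toward the mean of the process, $\phi _o/(1-\phi _1)=2$, as the forecast horizon lengthens.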
\subsection{MA models}
To understand the process of generating forecasts from moving average models, it is helpful to consider a simple MA(1) model. As noted above, this model is given by:
\begin{equation} \label{for.ma1.arima}
Y_t=\epsilon _t-\theta \epsilon _{t-1}
\end{equation}
\begin{equation*}
\text{where }\epsilon _t\text{ is a white-noise error process} \end{equation*}
To form a one-step ahead forecast, it is convenient to note that equation \ref{for.ma1.arima} implies:
\begin{equation} \label{for.ma1a.arima}
Y_{t+1}=\epsilon _{t+1}-\theta \epsilon _t \end{equation}
Since $\epsilon _{t+1}$ is not known at time $t$, it is set to its expected value (= 0). Thus, a forecast value of $Y_{t+1}$ can be given by: \begin{equation} \label{for.ma1b.arima}
\hat Y_{t+1}=-\theta \epsilon _t
\end{equation}
Equation \ref{for.ma1b.arima} may be used to generate a forecast of $Y_{t+1}$ as long as $\epsilon _t$ is known. While $\epsilon _t$ is not directly observed, it can be easily estimated. Rearranging equation \ref{for.ma1.arima}, $\epsilon _t$ may be expressed as: \begin{equation*}
\epsilon _t=Y_t+\theta \epsilon _{t-1}
\end{equation*}
Suppose that there are $T$ observations for $Y_t$. The error terms may be stated as:
\begin{equation}
\epsilon _1=Y_1+\theta \epsilon _o \label{errors.lag.arima} \end{equation}
\begin{equation*}
\epsilon _2=Y_2+\theta \epsilon _1
\end{equation*}
\begin{equation*}
\vdots
\end{equation*}
\begin{equation*}
\epsilon _T=Y_T+\theta \epsilon _{T-1}
\end{equation*}
This set of equations may be used to construct estimates of $\epsilon _1,\epsilon _2,\ldots ,\epsilon _T$ using a simple procedure: \begin{enumerate}
\item[Step 1:] Set $\epsilon _o$ equal to its expected value (= 0) and define $\hat{\epsilon}_1=Y_1$ (using the first equation in equation system \ref{errors.lag.arima}).
\item[Step 2:] Substitute $\hat{\epsilon}_1$ into the second equation in equation system \ref{errors.lag.arima} to construct the estimate $\hat{\epsilon}_2$.
\item[Step 3:] Continue this process until all of the $\epsilon _t$’s are estimated.
\end{enumerate}
As long as $\left| \theta \right| <1$, the effect of replacing $\epsilon _o$ with its expected value will be trivial for any reasonable value of $T$.
Thus, in a MA(1) model, a one-step ahead forecast may be given by: \begin{equation} \label{for.ma1c.arima}
\hat Y_{t+1}=-\theta \hat \epsilon _t
\end{equation}
Suppose that an econometrician wished to construct a two-step ahead forecast for this model. In this case, equation \ref{for.ma1.arima} becomes: \begin{equation*}
Y_{t+2}=\epsilon _{t+2}-\theta \epsilon _{t+1} \end{equation*}
Unfortunately, however, it is not possible to construct a useful estimate of either $\epsilon _{t+2}$ or $\epsilon _{t+1}$ from the information available at time $t$. Thus, the best estimate for a two-step ahead forecast for an MA(1) model is formed by replacing these variables with their expected values. Since $E(\epsilon _{t+1})=E(\epsilon _{t+2})=0$, the best forecast is given by:
\begin{equation*}
\hat{Y}_{t+2}=0
\end{equation*}
In general, under an MA(1) model the current error term only contains information about the next period’s outcome. Thus, knowledge of $Y_t$ makes it possible to generate a nonzero forecast for $Y_{t+1}$, but does not improve our ability to forecast $Y_{t+i}$ (for $i>1$).\footnote{This discussion is based on the assumption that the mean of $Y_t$ is zero. If the mean of $Y_t$ is nonzero (as in the interest rate example discussed above) the expected value of the forecasts equals the mean value of $Y_t$ for those outcomes involving only future random shocks.} A similar method is used to generate forecasts under an MA($q$) model.
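The MA(1) forecasting procedure can be sketched in Python. The recursion used below, $\hat{\epsilon}_t=Y_t+\theta \hat{\epsilon}_{t-1}$, follows from rearranging the definition $Y_t=\epsilon _t-\theta \epsilon _{t-1}$ in equation \ref{for.ma1.arima}. The sketch assumes a zero-mean series and a known value of $\theta $; the function names and the data are hypothetical.

```python
def ma1_residuals(y, theta):
    """Estimate the errors of Y_t = e_t - theta*e_{t-1}, starting from e_0 = 0."""
    residuals = []
    prev = 0.0  # e_0 is replaced by its expected value
    for value in y:
        prev = value + theta * prev  # e_t = Y_t + theta * e_{t-1}
        residuals.append(prev)
    return residuals

def ma1_forecasts(y, theta, horizon):
    """Forecasts for a zero-mean MA(1): -theta*e_T one step ahead, 0 afterwards."""
    last = ma1_residuals(y, theta)[-1]
    return [-theta * last] + [0.0] * (horizon - 1)

y_obs = [1.0, 0.5, -0.25]  # made-up observations
print(ma1_residuals(y_obs, theta=0.5))  # [1.0, 1.0, 0.25]
print(ma1_forecasts(y_obs, theta=0.5, horizon=3))  # [-0.125, 0.0, 0.0]
```

Only the one-step ahead forecast differs from zero, which reflects the fact that the current error term carries information about the next period only.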
\subsection{ARMA models}
The basic strategy for generating forecasts under an ARMA($p,q$) model is the same as under the AR and MA processes described above. A simple example can be used to illustrate this process. Consider the ARMA(1,1) model given by:
\begin{equation} \label{for.arma.arima}
Y_t=\phi _o+\phi _1Y_{t-1}+\epsilon _t-\theta \epsilon _{t-1} \end{equation}
In period $t+1$, equation \ref{for.arma.arima} may be stated as: \begin{equation} \label{for.arma1.arima}
Y_{t+1}=\phi _o+\phi _1Y_t+\epsilon _{t+1}-\theta \epsilon _t \end{equation}
In this equation, $\epsilon _{t+1}$ is the only variable that cannot be observed (or estimated) in period $t$. Thus, the forecast value of $Y_{t+1}$ is given by:
\begin{equation*}
\hat Y_{t+1}=\phi _o+\phi _1Y_t-\theta \epsilon _t \end{equation*}
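The same logic extends to a two-step ahead forecast. As a brief sketch, in period $t+2$ the ARMA(1,1) model may be stated as:
\begin{equation*}
Y_{t+2}=\phi _o+\phi _1Y_{t+1}+\epsilon _{t+2}-\theta \epsilon _{t+1}
\end{equation*}
Neither $\epsilon _{t+2}$ nor $\epsilon _{t+1}$ can be estimated at time $t$, so both are replaced by their expected value (= 0), and $Y_{t+1}$ is replaced by its forecast:
\begin{equation*}
\hat Y_{t+2}=\phi _o+\phi _1\hat Y_{t+1}
\end{equation*}
Thus the moving-average term affects only the one-step ahead forecast directly; forecasts for more distant periods are generated by the autoregressive recursion $\hat Y_{t+k}=\phi _o+\phi _1\hat Y_{t+k-1}$ for $k\geq 2$.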
A similar technique is used to generate forecasts in more complex ARIMA specifications.
In practice, however, virtually all econometrics packages that estimate ARIMA\ models will also construct forecasts (at the user’s request). Thus, econometricians do not generally need to directly perform the tedious calculations necessary to generate forecasts under complex ARMA (and ARIMA) models.
\subsection{Example: Forecasts of the interest rate on 10-year Treasury bonds}
Let’s examine how the estimated ARIMA\ model discussed above can be used to generate forecasts of the future value of the interest rate series. The ARIMA(0,1,1) model estimated above may be expressed as: \begin{equation}
\widehat{\Delta \text{INT10Y}}_{t}=-0.03964+\epsilon _{t}-0.42324\epsilon _{t-1} \label{ARIMA0112.bj}
\end{equation}%
The predicted change in the interest rate in period $t+1$ is given by: \begin{equation*}
\widehat{\Delta \text{INT10Y}_{t+1}}=-0.0396+\hat{\epsilon}_{t+1}-0.423\hat{\epsilon}_{t}
\end{equation*}%
Since the error term is assumed to follow a white-noise error process, $\hat{\epsilon}_{t+1}$ is uncorrelated with current error terms and current values of INT10Y$_{t}$. Because of this, current information on either $\hat{\epsilon}_{t}$ or INT10Y$_{t}$ cannot be used to predict the level of $\hat{\epsilon}_{t+1}$.
Therefore, the best available forecast of the future value $\epsilon _{t+1}$ is its expected value: $\hat{\epsilon}_{t+1}=0$. An estimated value of $\hat{\epsilon}_{t}$ is constructed by the econometric software package when the ARIMA model is estimated. Thus, in this example, the best forecast of the change in the interest rate one period ahead is given by: \begin{equation*}
\widehat{\Delta \text{INT10Y}_{t+1}}=-0.0396+0-0.423\hat{\epsilon}_{t} \end{equation*}%
The forecast of the change in the interest rate $k$ periods ahead, for $k\geq 2$, is based on:
\begin{equation*}
\widehat{\Delta \text{INT10Y}_{t+k}}=-0.0396+\hat{\epsilon}_{t+k}-0.423\hat{\epsilon}_{t+(k-1)}
\end{equation*}%
Neither $\hat{\epsilon}_{t+k}$ nor $\hat{\epsilon}_{t+(k-1)}$ can be estimated from the information available at time $t$ when $k$ is greater than or equal to 2, so both are replaced by their expected value of zero. Thus, the forecast becomes:
\begin{equation*}
\widehat{\Delta \text{INT10Y}_{t+k}}=-0.0396 \end{equation*}%
Since this is the predicted change in the interest rate, the predicted level of the interest rate decreases by 0.0396 units in each period beyond the first. As noted above, however, the error in this forecast increases for forecasts in the more distant future.
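The conversion of forecast changes into forecast levels can be sketched in Python. The constant and the moving-average coefficient below are taken from the estimated ARIMA(0,1,1) model; the last observed level (4.0) and the last estimated residual (0.1) are made-up values used only for illustration, and the function name is hypothetical.

```python
def arima011_level_forecasts(last_level, last_residual, const, theta, horizon):
    """Level forecasts from an ARIMA(0,1,1) model written as
    Delta Y_t = const + e_t - theta * e_{t-1}."""
    forecasts = []
    level = last_level
    for step in range(1, horizon + 1):
        # Only the one-step ahead change uses the last estimated residual;
        # later changes equal the constant because future errors are set to 0
        change = const - theta * last_residual if step == 1 else const
        level += change
        forecasts.append(level)
    return forecasts

# Hypothetical starting values: last level 4.0, last residual 0.1
path = arima011_level_forecasts(4.0, 0.1, const=-0.0396, theta=0.423, horizon=3)
print([round(v, 4) for v in path])
```

After the first step, each successive level forecast differs from the previous one by the constant term alone.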
In practice, the computation of forecasts using ARIMA\ models can be somewhat tedious. Fortunately, however, most econometric software packages that estimate ARIMA models also contain procedures to automatically generate forecasts and confidence intervals for those forecasts.
\begin{center}
\FRAME{ftbpFU}{5.354in}{3.8787in}{0pt}{\Qcb{SAS generated forecasts of future interest rates with confidence intervals using ARIMA(0,1,1) model.}}{\Qlb{fig18_15}}{fig18_15.png}{\special{language "Scientific Word";type "GRAPHIC";maintain-aspect-ratio TRUE;display "USEDEF";valid_file "F";width 5.354in;height 3.8787in;depth 0pt;original-width 5.2978in;original-height 3.8303in;cropleft "0";croptop "1";cropright "1";cropbottom "0";filename 'GRAPHS/fig18_15.png';file-properties "XNPEU";}}
\FRAME{ftbpFU}{5.7424in}{4.3145in}{0pt}{\Qcb{Forecast interest rate series with 95\% confidence interval.}}{\Qlb{fig18_16}}{fig18_16.png}{\special{language "Scientific Word";type "GRAPHIC";maintain-aspect-ratio TRUE;display "USEDEF";valid_file "F";width 5.7424in;height 4.3145in;depth 0pt;original-width 8.8885in;original-height 6.6668in;cropleft "0";croptop "1";cropright "1";cropbottom "0";filename 'GRAPHS/fig18_16.png';file-properties "XNPEU";}}
\end{center}
Figure~\ref{fig18_15} contains a copy of the SAS output for the forecasts of the next 24 values of the interest rate for the ARIMA(0,1,1) model considered above. A graph of these forecasts and their confidence intervals appears in Figure~\ref{fig18_16}. Notice that the width of the confidence intervals increases quite substantially as the forecast horizon lengthens.
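The widening of the intervals can be sketched algebraically. Write an ARIMA(0,1,1) model as $\Delta Y_{t}=c+\epsilon _{t}+\theta \epsilon _{t-1}$, where $Y_{t}$, $c$, $\theta$, and $\sigma ^{2}=\mathrm{Var}(\epsilon _{t})$ are notation assumed here for illustration, and treat the estimated coefficients as if they were known. Summing the changes from period $t+1$ to period $t+k$ gives:
\begin{equation*}
Y_{t+k}=Y_{t}+kc+\theta \epsilon _{t}+(1+\theta )\sum_{j=1}^{k-1}\epsilon _{t+j}+\epsilon _{t+k}
\end{equation*}
The first three terms make up the forecast $\hat{Y}_{t+k}$, so the forecast error consists of the remaining terms and its variance is:
\begin{equation*}
\mathrm{Var}(Y_{t+k}-\hat{Y}_{t+k})=\sigma ^{2}\left[ 1+(k-1)(1+\theta )^{2}\right]
\end{equation*}
With $\theta =-0.423$, $(1+\theta )^{2}\approx 0.333$, so the forecast error variance grows roughly linearly in $k$. At $k=24$ the term in brackets is about 8.66, which implies an approximate 95\% interval ($\pm 1.96$ standard errors) that is nearly three times as wide as the one-step-ahead interval. Because this sketch ignores sampling error in the estimated coefficients, intervals reported by software may differ somewhat.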
\section{ARIMA models vs. structural models}
One of the major criticisms of ARIMA models is that they are constructed using \textit{ad hoc} identification procedures instead of being based on economic theory. Advocates of the Box-Jenkins approach, however, argue that time-series models should be judged on the basis of their forecasting accuracy. On these grounds, ARIMA models tend to perform quite well. They require much less information, can be easily computed using widely available econometric software programs, and often generate forecasts that are as reliable as those generated using structural models.
It should be noted, however, that the Box-Jenkins methodology does, in fact, rely on a somewhat \textit{ad hoc} identification procedure. Two econometricians attempting to fit an ARIMA model to the same series may end up selecting different model specifications.
Of course, if an econometrician is interested in testing the predictions or assumptions of an economic model, the estimation of a structural or reduced-form model is most appropriate. If, however, the goal is to provide forecasts of the future value of a time series variable, the relative success of structural and ARIMA models cannot be determined \textit{a priori}. Even those who advocate the use of structural models recognize that ARIMA models often work surprisingly well. Many economists use ARIMA models as a benchmark that can be used to compare the accuracy of alternative forecasting models.
\section{Extensions of ARIMA models}
\subsection{Seasonal ARIMA models}
Monthly and quarterly time series variables often exhibit a seasonal pattern that recurs each year. For example, retail sales often exhibit a substantial increase during the holiday period. The consumption of heating oil exhibits substantial increases during the winter months. Vacation travel generally increases during the summer months. As a result of these patterns, monthly data often exhibit a pattern in which the observation for a particular month is highly correlated with the observations for the same month in previous years. In a similar manner, quarterly data often exhibit a pattern in which the observations for a particular quarter are highly correlated with the observations for the same quarter in prior years. As a result of these steady seasonal patterns, the ACF and PACF have \textquotedblleft spikes\textquotedblright\ at seasonal intervals.
Seasonal ARIMA models are designed to take seasonal effects such as these into account. In a seasonal ARIMA model, seasonal and/or regular differencing (of the sort described above) is used to induce stationarity.
The ACF and PACF of the suitably differenced variable are then used to specify a seasonal ARIMA model that may include seasonal autoregressive and moving average parameters (as well as the regular autoregressive and moving average parameters described above). For example, a simple first-order seasonal autoregressive process for quarterly data is given by: \begin{equation*}
y_t=\phi _0+\phi _1y_{t-4}+\epsilon _t
\end{equation*}
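To see why a process of this kind produces spikes at seasonal lags, assume that $\left| \phi _1\right| <1$ so that the process is stationary. A short derivation, offered here as a sketch, shows that the theoretical autocorrelations are:
\begin{equation*}
\rho _k=\begin{cases} \phi _1^{k/4} & \text{if } k=4,8,12,\ldots \\ 0 & \text{for all other } k>0 \end{cases}
\end{equation*}
For instance, with an illustrative value of $\phi _1=0.8$, the autocorrelations at lags 4, 8, and 12 are 0.8, 0.64, and 0.512, while those at all other positive lags equal zero. The PACF of this process has a single spike at lag 4.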
The rules for identifying seasonal ARIMA models are a direct extension of the procedure discussed above.\footnote{%
A good discussion may be found in Vandaele (1983), pp.~91--106.}
\subsection{Transfer function models}
One of the basic shortcomings of the ARIMA approach discussed above is that it is based on a univariate model. This limitation, however, may be partly overcome through the use of a \textbf{transfer function model}. A transfer function model is one in which a time series variable is assumed to be a function of current and past levels of one or more other variables in addition to the autoregressive and moving average effects described above.
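As a minimal illustration, consider a single explanatory series $X_t$ that affects $Y_t$ in the current period and with a one-period lag. The symbols $X_t$, $c$, $\omega _0$, $\omega _1$, and $N_t$, and the choice of an ARMA(1,1) noise term, are assumptions made here for exposition. A transfer function model of this kind might be written as:
\begin{equation*}
Y_t=c+\omega _0X_t+\omega _1X_{t-1}+N_t,\qquad N_t=\phi _1N_{t-1}+\epsilon _t+\theta _1\epsilon _{t-1}
\end{equation*}
The first equation captures the effect of current and lagged values of $X$ on $Y$, while the second specifies the autoregressive and moving average structure of the remaining variation in $Y$.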
Box and Tiao (1975) proposed an interesting variation of the transfer function model that may be used to investigate whether the time path of an economic variable is altered by an external event. In this model, a dummy variable representing the external effect is used as an independent variable in the transfer function model.
The use of alternative functional forms makes it possible to investigate whether the event generates a temporary or a permanent effect on the level (or growth rate) of the dependent variable. This form of transfer function analysis is known as \textbf{intervention analysis}.\footnote{%
For a good discussion of intervention analysis, see Vandaele (1983), Chapter 14.}
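Two common choices for the dummy variable can be sketched as follows, where $T$ denotes the period in which the event occurs and $S_t$, $P_t$, $Z_t$, $\omega$, $\delta$, and $N_t$ are notation assumed here for illustration:
\begin{equation*}
S_t=\begin{cases} 0 & \text{if } t<T \\ 1 & \text{if } t\geq T \end{cases} \qquad P_t=\begin{cases} 1 & \text{if } t=T \\ 0 & \text{if } t\neq T \end{cases}
\end{equation*}
A specification such as $Y_t=c+\omega S_t+N_t$, in which $N_t$ follows an ARMA process, shifts the level of the series permanently by $\omega$ from period $T$ onward. A temporary effect can instead be represented by $Y_t=c+Z_t+N_t$ with $Z_t=\delta Z_{t-1}+\omega P_t$, $0<\delta <1$, and $Z_t=0$ before period $T$. In this case the effect equals $\omega$ in period $T$, $\omega \delta$ in period $T+1$, $\omega \delta ^2$ in period $T+2$, and so on, and it dies out gradually.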
\exbox{Intervention analysis and welfare reform}{ In 1981, the Reagan administration introduced substantial changes in the AFDC program that were designed to reduce total caseloads and expenditures. Englander and Kane (1992) used intervention analysis to investigate whether this policy intervention had a significant effect on the time path of AFDC caseloads and expenditures.
It was found that the Reagan reforms significantly reduced welfare caseloads, but had no significant long-term effect on the level of program expenditures. Thus, it appears that fewer people received AFDC payments, but those who remained eligible received more total payments (perhaps as a result of labor supply disincentives introduced under the revised program).
}%
Model identification, however, is particularly difficult under a transfer function model specification. In addition to determining the appropriate degree of differencing, and the appropriate ARMA specification, it is also necessary to determine the time structure of the relationship between each of the independent variables and the dependent variable.
\subsection{Vector autoregressions}
In the past two decades, greater emphasis has been placed on attempts to model the joint relationship between a set of current variables and past observations on these variables. The \textbf{vector autoregression (or VAR)} model is a particularly popular generalization of univariate ARIMA models. A vector autoregression model is based on the assumption that each variable in a set of time series variables is affected by its own past values and by past values of each of the other variables in this set. For example, consider a simple vector autoregression model that attempts to explain the money supply ($MS_t$) and nominal GDP ($Y_t$). The vector autoregression model is given by:
\begin{equation} \label{var.1.arima}
\begin{split}
MS_t={}&\alpha _0+\alpha _1MS_{t-1}+\alpha _2MS_{t-2}+\cdots +\alpha _pMS_{t-p} \\
&+\beta _1Y_{t-1}+\beta _2Y_{t-2}+\cdots +\beta _pY_{t-p}+u_t
\end{split}
\end{equation}
and:
\begin{equation*}
\begin{split}
Y_t={}&\gamma _1MS_{t-1}+\gamma _2MS_{t-2}+\cdots +\gamma _pMS_{t-p} \\
&+\eta _1Y_{t-1}+\eta _2Y_{t-2}+\cdots +\eta _pY_{t-p}+v_t
\end{split}
\end{equation*}
There is no distinction between exogenous and endogenous variables in the basic VAR specification. This is an advantage over simultaneous equation models in which there are often identification problems caused by a lack of a sufficient number of exogenous variables.
One practical problem with vector autoregressions is that the number of variables and/or the length of the lag structure must often be limited due to the limited number of observations available for many economic time series.
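The arithmetic behind this constraint is straightforward. In a VAR with $n$ variables and $p$ lags, each equation contains $np$ slope coefficients plus an intercept (if one is included), so the full system contains $n(np+1)$ coefficients. Using illustrative values of $n$ and $p$:
\begin{equation*}
n=2,\ p=4:\quad 2(2\cdot 4+1)=18 \qquad \qquad n=6,\ p=4:\quad 6(6\cdot 4+1)=150
\end{equation*}
As a hypothetical example, 40 years of quarterly data provide 160 observations on each variable, of which 156 remain after allowing for four lags. Each equation of the six-variable system would then estimate 25 coefficients from 156 usable observations, and adding further variables or lags quickly exhausts the available degrees of freedom.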
\section{Summary}
In this chapter, the Box-Jenkins procedure for estimating univariate ARIMA models was introduced. It was shown that first or second differencing may be used to remove a linear or quadratic trend, respectively. Once the level of differencing has been determined, the sample ACF and PACF for the differenced series are compared with the theoretical ACF and PACF for alternative AR, MA, and ARMA models to determine a tentative model identification. An examination of the residuals from this model is used to assess the fit of the model. If the ARIMA model is appropriately specified, the residuals from the fitted model should approximate a white-noise error process.
It was also noted that many economic theorists and econometricians criticize the \textit{ad hoc} nature of the Box-Jenkins methodology. Several extensions of the Box-Jenkins procedure were also introduced.
\section{Key Concepts}
ARIMA model
stationary time-series process
white-noise error process
AR($p$)
MA($q$)
ARMA($p,q$)
autocorrelation function (ACF)
partial autocorrelation function (PACF)
identification process
invertible model
stationarity
Box-Pierce statistic
Ljung-Box statistic
parsimonious parameterization
seasonal ARIMA model
transfer function model
intervention analysis
vector autoregression
\newpage\
\section{Exercises and problems}
\begin{enumerate}
\item Using equation \ref{invert.arima}, explain what would happen if $\left| \rho \right| $ were greater than one. Why would such a process be nonstationary?
\item Consider the model:
\begin{equation}
Y_t=10-1.5Y_{t-1}+\epsilon _t \label{ques.1.arima} \end{equation}
\begin{enumerate}
\item Is this model stationary?
\item Explain your answer to part (a). What are the implications of the stationarity (or nonstationarity) of the model given in equation \ref{ques.1.arima}?
\item Why might econometricians wish to consider only stationary models?
\end{enumerate}
\item Consider the two models:
\begin{equation*}
\text{Model A: }Y_t=.5Y_{t-1}+\epsilon _t
\end{equation*}
\begin{equation*}
\text{Model B: }Y_t=.9Y_{t-1}+\epsilon _t
\end{equation*}
\begin{enumerate}
\item In which case will the effect of past random shocks die out more rapidly? Explain.
\item Suppose that $Y_t=100$. Generate forecasts of the next five years using Models A and B.
\end{enumerate}
\item Use equation \ref{xvz.arima} to show that the ACF for an MA(1) process is nonzero for $k=1$ and equals zero for $k$ greater than $1$. Explain your work.
\item
\begin{enumerate}
\item At a 5\% significance level, test for the presence of a unit root in the interest rate on 30-year Treasury bonds using a Dickey-Fuller test based on the specification:
\begin{equation*}
Y_t=\rho Y_{t-1}+\epsilon _t
\end{equation*}
\item Test for the presence of a unit root in the first-difference of this series.
\item Verify the results for the ARIMA(0,1,1) model presented in this chapter.
\item Estimate the parameters of an ARIMA(0,1,2) model. Is this model preferred to the one in part (c)? Explain.
\item Estimate the parameters of an ARIMA(2,1,0) model. How does this model compare with the one in part (c)?
\end{enumerate}
\item Show that equation \ref{quad.2.arima} can be derived from equation \ref{quad.trend.arima}.
\item The file \textquotedblleft unemploy.dat\textquotedblright\ (described in Table~\ref{unemployment.dat} in Appendix \ref{data.appendix}) contains data on the civilian unemployment rate for the U.S.
\begin{enumerate}
\item Use a computer software package to compute the ACF and PACF for this series. Does this series appear to be stationary?
\item Compute the ACF and PACF for the first difference of the unemployment rate series. Does this series appear to be stationary? Is higher-order differencing necessary?
\end{enumerate}
\item Use the data on interest yields on 3-month Treasury bills contained in the file \textquotedblleft int.dat\textquotedblright\ (described in Table~\ref{int.dat} on p.~\pageref{int.dat}) to estimate the parameters of an appropriate ARIMA model using the following steps: \begin{enumerate}
\item Estimate the ACF and PACF for the original series.
\item Estimate the ACF and PACF for the first difference of the series.
\item Use the results from (a) and (b) to specify a tentative ARIMA model.
Estimate the parameters of this model.
\item Estimate the ACF and PACF for the residuals from the model selected in part (c). Does your model work well?
\item Generate forecasts for the next 5 values of the interest yield rate on these securities using the ARIMA model that you have selected.
\end{enumerate}
\item Use the GNP data contained in the file \textquotedblleft okun.dat\textquotedblright\ (this file is described in Table~\ref{okuns.law.dat} on p.~\pageref{okuns.law.dat} in Appendix~\ref{data.appendix}) to estimate the parameters of an appropriate ARIMA model for the GNP series using the following steps:
\begin{enumerate}
\item Estimate the ACF and PACF for the original series.
\item Estimate the ACF and PACF for the first difference of the series.
\item Use the results from (a) and (b) to specify a tentative ARIMA model.
Estimate the parameters of this model.
\item Estimate the ACF and PACF for the residuals from the model selected in part (c). Does your model work well?
\item Generate forecasts for the next 5 values of GNP using the ARIMA model that you have selected.
\end{enumerate}
\item Use the quarterly unemployment rate data contained in the file \textquotedblleft okun.dat\textquotedblright\ (this file is described in Table~\ref{okuns.law.dat} on p.~\pageref{okuns.law.dat} in Appendix~\ref{data.appendix}) to estimate the parameters of an appropriate ARIMA model for the unemployment rate series using the following steps: \begin{enumerate}
\item Estimate the ACF and PACF for the original series.
\item Estimate the ACF and PACF for the first difference of the series.
\item Use the results from (a) and (b) to specify a tentative ARIMA model.
Estimate the parameters of this model.
\item Estimate the ACF and PACF for the residuals from the model selected in part (c). Does your model work well?
\item Generate forecasts for the next 5 values of the unemployment rate using the ARIMA model that you have selected.
\end{enumerate}
\item Table \ref{deficit.dat} in Appendix \ref{data.appendix} (and the file \textquotedblleft deficit.dat\textquotedblright ) contains data on the Federal budget deficit.
\begin{enumerate}
\item Compute the ACF and PACF for the original series.
\item Construct a time plot of the observations. Does it appear that differencing is needed to induce stationarity?
\item If a first-difference operation is used, examine the ACF and PACF for the differenced series to determine whether a second-difference operation is necessary.
\item Use the ACF and PACF for the appropriately differenced series to specify an ARIMA model. Estimate the parameters of this model.
\item Use the Ljung-Box statistic to determine whether one or more of the first six estimated residual autocorrelations are significantly different from zero at a 5\% significance level. What does the outcome of this test indicate about the fit of your model?
\end{enumerate}
\item Table \ref{imports.dat} in Appendix \ref{data.appendix} (and the file \textquotedblleft imports.dat\textquotedblright ) contains data on U.S. imports.
\begin{enumerate}
\item Compute the ACF and PACF for the original series.
\item Construct a time plot of the observations. Does it appear that differencing is needed to induce stationarity?
\item If a first-difference operation is used, examine the ACF and PACF for the differenced series to determine whether a second-difference operation is necessary.
\item Use the ACF and PACF for the appropriately differenced series to specify an ARIMA model. Estimate the parameters of this model.
\item Use the Ljung-Box statistic to determine whether one or more of the first six estimated residual autocorrelations are significantly different from zero at a 5\% significance level. What does the outcome of this test indicate about the fit of your model?
\end{enumerate}
\item The \textquotedblleft gdp.dat\textquotedblright\ file (described in Table \ref{gdp.dat} in Appendix \ref{data.appendix}) contains data on real consumption expenditures.
\begin{enumerate}
\item Compute the ACF and PACF for the original consumption series.
\item Construct a time plot of the observations. Does it appear that differencing is needed to induce stationarity? Verify this result using a Dickey-Fuller test.
\item Construct a first difference of the series. Examine the ACF and PACF for the differenced series to determine whether a second-difference operation is necessary. Perform a Dickey-Fuller test on the differenced series. Do these procedures provide similar results?
\item Use the ACF and PACF for the appropriately differenced series to specify an ARIMA model. Estimate the parameters of this model.
\item Use the Ljung-Box statistic to determine whether one or more of the first six estimated residual autocorrelations are significantly different from zero at a 5\% significance level. What does the outcome of this test indicate about the fit of your model?
\end{enumerate}
\item Consider the moving average model given by: \begin{equation*}
Y_{t}=\epsilon _{t}+.2\epsilon _{t-1}-.1\epsilon _{t-2} \end{equation*}
\begin{enumerate}
\item Can a two-step ahead forecast be constructed in this case? If so, state the formula for this estimator.
\item Is it possible to estimate a three-step ahead forecast in this case?
If so, what will this estimator equal?
\end{enumerate}
\item Consider the ARIMA model given by:
\begin{equation*}
\Delta Y_{t}=0.2+0.7Y_{t-1}+.1Y_{t-2}
\end{equation*}%
Suppose that the value of $Y_{t}$ equals 50 in 1999 and 52 in 2000. Predict the values of this series for the years 2001 and 2002.
\end{enumerate}
