{Yt} is weakly stationary if its mean, variance and autocovariances do not depend on t
{Xt} and {Yt} are jointly weakly stationary if each series is weakly stationary and cov(Yt, Xt-h) does not depend on t for all h
the population autocorrelation function measures the persistence of a time series and is consistently estimated by the sample autocorrelation function
stationary processes are commonly only weakly persistent
strong persistence (slowly decaying autocorrelation) is often suggestive of non-stationarity
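A minimal sketch of the sample autocorrelation function, contrasting a stationary AR(1) with a random walk. The function name `sample_acf`, the AR coefficient 0.5, the sample size and the seed are illustrative assumptions, not part of the notes.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations at lags 0..max_lag (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    d = y - y.mean()                      # deviations from the sample mean
    denom = np.sum(d ** 2)                # T times the sample variance
    return np.array([np.sum(d[h:] * d[:T - h]) / denom
                     for h in range(max_lag + 1)])

rng = np.random.default_rng(0)
T = 500
eps = rng.standard_normal(T)

# Stationary AR(1) with coefficient 0.5 (assumed value)
ar = np.zeros(T)
for t in range(1, T):
    ar[t] = 0.5 * ar[t - 1] + eps[t]

# Random walk: cumulative sum of the same shocks (non-stationary)
rw = np.cumsum(eps)

acf_ar = sample_acf(ar, 10)
acf_rw = sample_acf(rw, 10)
print("AR(1) ACF:      ", np.round(acf_ar, 2))
print("random walk ACF:", np.round(acf_rw, 2))
```

The AR(1) autocorrelations should die out within a few lags, while those of the random walk stay high across all ten lags.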
{Yt} is strongly (strictly) stationary if the joint distribution of (Yt+1, ..., Yt+k) does not depend on t, for every k
trends are usually dealt with by transforming the data to obtain a stationary process
to detrend a series we may need to take logarithms before differencing, particularly where exponential growth/decay is plausible
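As a rough illustration of taking logs before differencing, the sketch below simulates a series with roughly 2% growth per period. The growth rate, noise scale and starting level are assumed values chosen only for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200
g = 0.02                                   # assumed growth rate per period

# Log-level follows a random walk with drift g, so the level grows exponentially
log_y = np.cumsum(g + 0.01 * rng.standard_normal(T))
y = 100.0 * np.exp(log_y)

diff_y = np.diff(y)                        # raw differences scale with the level
dlog_y = np.diff(np.log(y))                # log differences approximate growth rates

print("mean of first/last 50 raw differences:",
      diff_y[:50].mean(), diff_y[-50:].mean())
print("mean log difference:", dlog_y.mean())
```

The raw differences keep growing with the level of the series, whereas the log differences fluctuate around the constant growth rate.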
the AR(1) model is intended as a descriptive model, not a causal one
the mean squared forecast error (MSFE) minimising forecast of Yt+1 is its expectation conditional on current and past values of Y (Yt, Yt-1, ...)
forecast errors generally have two components: estimation error and the unforecastable component of the model
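For the AR(1) case, the two components of the forecast error can be written out explicitly. The symbols below (intercept \beta_0, slope \beta_1, error u_t, hats for OLS estimates, T for the last observation) are assumed notation, not taken from the notes.

```latex
\begin{align*}
Y_{T+1} &= \beta_0 + \beta_1 Y_T + u_{T+1},
\qquad
\hat{Y}_{T+1|T} = \hat{\beta}_0 + \hat{\beta}_1 Y_T, \\
Y_{T+1} - \hat{Y}_{T+1|T}
&= \underbrace{u_{T+1}}_{\text{unforecastable component}}
 \;-\; \underbrace{\bigl[(\hat{\beta}_0 - \beta_0) + (\hat{\beta}_1 - \beta_1)\,Y_T\bigr]}_{\text{estimation error}}.
\end{align*}
```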
advantages of choosing a larger p: more flexible model, potentially better description of the dynamics of Yt, more parameters over which to optimise the forecast rule, less 'bias' in the approximation of the optimal forecast
disadvantages of larger p: more parameters to estimate using the same amount of data, more 'variance' in the estimation of the model parameters
an AR(p) model has m=p+1 parameters
BIC penalises larger models more heavily than AIC does (whenever ln T > 2, i.e. T >= 8)
ARDL(p,q) model has m=p+q+1 parameters
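A sketch of lag selection for an AR(p) using one common textbook form of the criteria, AIC = ln(SSR/n) + m(2/n) and BIC = ln(SSR/n) + m(ln n / n), with m = p + 1. The helper `ar_ssr`, the simulated AR(2) coefficients and `p_max = 6` are assumptions for illustration. The same formulas would apply to an ARDL(p,q) with m = p + q + 1.

```python
import numpy as np

def ar_ssr(y, p, p_max):
    """OLS fit of an AR(p) with intercept; returns (SSR, n).

    All models use the same estimation sample (dropping the first p_max
    observations) so that their SSRs are comparable. Hypothetical helper.
    """
    y = np.asarray(y, dtype=float)
    T = len(y)
    target = y[p_max:]
    cols = [np.ones(T - p_max)]                                  # intercept
    cols += [y[p_max - j:T - j] for j in range(1, p + 1)]        # lags 1..p
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    return float(resid @ resid), len(target)

# Simulated AR(2) data (assumed coefficients 0.5 and 0.3)
rng = np.random.default_rng(0)
T = 400
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + rng.standard_normal()

p_max = 6
aic, bic = [], []
for p in range(p_max + 1):
    ssr, n = ar_ssr(y, p, p_max)
    m = p + 1                                   # intercept plus p lag coefficients
    aic.append(np.log(ssr / n) + m * 2 / n)
    bic.append(np.log(ssr / n) + m * np.log(n) / n)

p_aic = int(np.argmin(aic))
p_bic = int(np.argmin(bic))
print("lag length chosen by AIC:", p_aic, " by BIC:", p_bic)
```

Because the BIC penalty per parameter is larger here, BIC never selects a longer lag length than AIC on the same sample.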
we say Xt Granger causes Yt if lags of Xt improve forecasts of Yt made on the basis of its own lags (i.e. reduce the optimal forecast's MSFE)
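In practice this is often checked with an F-test that the coefficients on all lags of Xt are zero in an ARDL regression. The sketch below computes a homoskedasticity-only F-statistic from restricted and unrestricted sums of squared residuals. The function names, the simulated data-generating process and the lag choices are assumptions, and no p-value is computed.

```python
import numpy as np

def ssr_ols(X, y):
    """Sum of squared OLS residuals (hypothetical helper)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def granger_f(y, x, p=1, q=1):
    """F-statistic for H0: all q lags of x have zero coefficients."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    T = len(y)
    s = max(p, q)                                   # observations lost to lags
    target = y[s:]
    const = [np.ones(T - s)]
    y_lags = [y[s - j:T - j] for j in range(1, p + 1)]
    x_lags = [x[s - j:T - j] for j in range(1, q + 1)]
    X_r = np.column_stack(const + y_lags)           # restricted: own lags only
    X_u = np.column_stack(const + y_lags + x_lags)  # unrestricted: adds lags of x
    ssr_r, ssr_u = ssr_ols(X_r, target), ssr_ols(X_u, target)
    n, k = X_u.shape
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - k))

# Simulated example in which lagged x genuinely helps predict y
rng = np.random.default_rng(3)
T = 300
x = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + rng.standard_normal()

f_stat = granger_f(y, x, p=1, q=1)
print("Granger F-statistic:", round(f_stat, 1))
```

A large F-statistic is evidence that lags of Xt have predictive content for Yt beyond Yt's own lags; it says nothing about causation in the structural sense.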
a break is an abrupt change in model parameters on (or very near) a particular date; modelled using breakpoint dummies
breaks are a leading cause of forecast failure, particularly those near the end of a sample
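One common way to write an AR(1) with a possible break at a known date \tau uses a breakpoint dummy D_t(\tau). The notation below is assumed for illustration.

```latex
\begin{align*}
D_t(\tau) &=
\begin{cases}
0, & t < \tau, \\
1, & t \ge \tau,
\end{cases} \\
Y_t &= \beta_0 + \beta_1 Y_{t-1}
      + \gamma_0 D_t(\tau) + \gamma_1 \bigl[D_t(\tau)\, Y_{t-1}\bigr] + u_t .
\end{align*}
```

Under this formulation, no break corresponds to \gamma_0 = \gamma_1 = 0, which can be tested with an F-test when \tau is known.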
if delta Yt follows an AR(p-1) model, Yt follows an AR(p) model with coefficients summing to 1
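The case p = 2 shows why the coefficients sum to 1. The symbols \beta_0, \gamma_1 and u_t are assumed notation.

```latex
\begin{align*}
\Delta Y_t &= \beta_0 + \gamma_1 \Delta Y_{t-1} + u_t \\
Y_t - Y_{t-1} &= \beta_0 + \gamma_1 (Y_{t-1} - Y_{t-2}) + u_t \\
Y_t &= \beta_0 + (1 + \gamma_1)\, Y_{t-1} - \gamma_1\, Y_{t-2} + u_t ,
\qquad (1 + \gamma_1) + (-\gamma_1) = 1 .
\end{align*}
```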
deterministic (linear) trends generate linear growth
stochastic trends generate and are synonymous with 'random wandering' behaviour
unit root AR processes can decompose into the sum of a deterministic trend, a stochastic trend, and an initial value
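For the simplest unit root process, a random walk with drift, the decomposition follows from repeated substitution. The symbols \beta_0 (drift) and u_t (shock) are assumed notation.

```latex
\begin{align*}
Y_t &= \beta_0 + Y_{t-1} + u_t \\
    &= \underbrace{Y_0}_{\text{initial value}}
     + \underbrace{\beta_0\, t}_{\text{deterministic trend}}
     + \underbrace{\sum_{s=1}^{t} u_s}_{\text{stochastic trend}} .
\end{align*}
```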
with a unit root, OLS estimates of the AR coefficients are biased and the usual CLTs do not apply, so inference is non-standard
Dickey-Fuller test is used to test for unit roots
ADF test may give misleading conclusions when applied to a series with a linear trend unless a trend term is included in the test regression
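A minimal sketch of the Dickey-Fuller regression without augmentation lags: regress the first difference on a constant and the lagged level (optionally a time trend) and take the t-statistic on the lagged level. The function `df_tstat` is a hypothetical helper, the standard errors are homoskedasticity-only, and the large-sample 5% critical values quoted in the comments (-2.86 with intercept only, -3.41 with intercept and trend) are the commonly tabulated ones.

```python
import numpy as np

def df_tstat(y, trend=False):
    """t-statistic on delta in: dY_t = b0 + delta*Y_{t-1} (+ a*t) + u_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    cols = [np.ones(len(dy)), y[:-1]]               # intercept, lagged level
    if trend:
        cols.append(np.arange(1, len(dy) + 1, dtype=float))  # linear time trend
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    n, k = X.shape
    s2 = resid @ resid / (n - k)                    # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)               # homoskedasticity-only covariance
    return float(beta[1] / np.sqrt(cov[1, 1]))

rng = np.random.default_rng(5)
T = 500
eps = rng.standard_normal(T)

stationary = np.zeros(T)                            # AR(1) with coefficient 0.5
for t in range(1, T):
    stationary[t] = 0.5 * stationary[t - 1] + eps[t]
random_walk = np.cumsum(eps)                        # unit root process

# Compare with Dickey-Fuller critical values, not normal ones:
# roughly -2.86 (intercept only) or -3.41 (intercept and trend) at 5%.
t_stat_stationary = df_tstat(stationary)
t_stat_rw = df_tstat(random_walk)
print("DF statistic, stationary AR(1):", round(t_stat_stationary, 2))
print("DF statistic, random walk:     ", round(t_stat_rw, 2))
```

Passing `trend=True` adds the time trend to the test regression, which is the variant suited to series that may be stationary around a linear trend.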
order of integration is the smallest number of differences required to make a sequence stationary
spurious regression is the systematic tendency to find statistically significant regression relationships between unrelated I(1) series
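The tendency can be seen in a small Monte Carlo sketch: regress one random walk on another, independent random walk many times and count how often the usual t-test rejects a zero slope at the nominal 5% level. The sample size, number of replications, seed and helper name are assumptions.

```python
import numpy as np

def slope_tstat(y, x):
    """Homoskedasticity-only t-statistic on the slope in y = a + b*x + e."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return float(beta[1] / np.sqrt(cov[1, 1]))

rng = np.random.default_rng(42)
T, reps = 100, 500
rejections = 0
for _ in range(reps):
    # Two independent random walks: unrelated I(1) series
    x = np.cumsum(rng.standard_normal(T))
    y = np.cumsum(rng.standard_normal(T))
    if abs(slope_tstat(y, x)) > 1.96:               # nominal 5% two-sided test
        rejections += 1

rejection_rate = rejections / reps
print("share of 'significant' slopes:", rejection_rate)
```

If the test behaved as usual, the rejection share would be near 0.05; with I(1) series it is typically far higher.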
Xt and Yt (each I(1)) are cointegrated if there exists a cointegrating coefficient theta such that Yt - theta*Xt ~ I(0)
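A simulated sketch of the idea: Xt is a random walk, Yt equals theta*Xt plus stationary noise, so both series are strongly persistent while Yt - theta*Xt is not. The value theta = 2, the seed and the helper `lag1_autocorr` are assumptions, and theta is treated as known rather than estimated.

```python
import numpy as np

def lag1_autocorr(z):
    """First-order sample autocorrelation (hypothetical helper)."""
    z = np.asarray(z, dtype=float)
    d = z - z.mean()
    return float(np.sum(d[1:] * d[:-1]) / np.sum(d ** 2))

rng = np.random.default_rng(7)
T = 500
theta = 2.0                                   # assumed cointegrating coefficient

x = np.cumsum(rng.standard_normal(T))         # I(1): random walk
y = theta * x + rng.standard_normal(T)        # shares the stochastic trend of x

rho_y = lag1_autocorr(y)                      # persistence of the level series
rho_z = lag1_autocorr(y - theta * x)          # persistence of the combination

print("lag-1 autocorrelation of Y:          ", round(rho_y, 3))
print("lag-1 autocorrelation of Y - theta*X:", round(rho_z, 3))
```

When theta has to be estimated, a unit root test on the residuals needs different critical values from the standard Dickey-Fuller ones.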