This article reproduces the Phillips-curve forecasting illustration
of Clark and McCracken (2001, Section 5) on the bundled
cm2001 dataset. The benchmark forecasts US unemployment
from its own lags (univariate AR); the alternative adds lagged
inflation. enc_new() tests whether the AR forecasts
encompass those of the augmented ARX model — i.e., whether
inflation adds any predictive content beyond what lagged unemployment
already contains.
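In equation form, the two nested one-step-ahead models can be sketched as follows. The notation is an assumption made for this illustration and does not come from the package: $u_t$ is the unemployment rate, $\mathit{infl}_t$ is inflation, and $p$ is the common lag order.

$$
\begin{aligned}
\text{AR:}\quad  u_{t+1} &= \alpha_0 + \sum_{k=1}^{p} \alpha_k\, u_{t+1-k} + e_{1,t+1}, \\
\text{ARX:}\quad u_{t+1} &= \alpha_0 + \sum_{k=1}^{p} \alpha_k\, u_{t+1-k}
                           + \sum_{k=1}^{p} \beta_k\, \mathit{infl}_{t+1-k} + e_{2,t+1}.
\end{aligned}
$$

Under the encompassing null, $\beta_1 = \dots = \beta_p = 0$, so the ARX model carries $p$ extra regressors relative to the benchmark.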
library(forecastdom)
data(cm2001)
# CM (2001) sample: 1957:01-1997:08.
cm <- subset(cm2001,
             date >= as.Date("1957-01-01") &
               date <= as.Date("1997-08-01"))
nrow(cm)
#> [1] 488

Recursive forecasts
Both models are estimated by OLS on an expanding window with
p lags of each variable. At each origin we form a
one-step-ahead forecast, then roll forward by one month.
# Column k of the result holds z lagged by k periods (NA-padded at the top).
make_lags <- function(z, p) {
  n <- length(z)
  sapply(seq_len(p), function(k) c(rep(NA, k), z[seq_len(n - k)]))
}

recursive_arx <- function(y, x, p, R) {
  n <- length(y)
  YL <- make_lags(y, p)
  XL <- make_lags(x, p)
  # Drop the first p rows, whose lags are incomplete.
  target <- y[(p + 1):n]
  YL <- YL[(p + 1):n, , drop = FALSE]
  XL <- XL[(p + 1):n, , drop = FALSE]
  P <- length(target) - R
  e_ar <- e_arx <- numeric(P)
  for (j in seq_len(P)) {
    # Estimate on rows 1..(R + j - 1), then forecast row R + j.
    idx <- seq_len(R + j - 1)
    fit_ar <- lm.fit(cbind(1, YL[idx, ]), target[idx])
    fit_arx <- lm.fit(cbind(1, YL[idx, ], XL[idx, ]), target[idx])
    pred_ar <- sum(coef(fit_ar) * c(1, YL[R + j, ]))
    pred_arx <- sum(coef(fit_arx) * c(1, YL[R + j, ], XL[R + j, ]))
    e_ar[j] <- target[R + j] - pred_ar
    e_arx[j] <- target[R + j] - pred_arx
  }
  list(e_ar = e_ar, e_arx = e_arx)
}

ENC-NEW across lag orders
CM (2001) consider lag orders p = 1, 3, 6, 12. The
initial estimation window is R = 120 months (10 years); the
remaining observations form the out-of-sample period.
R <- 120L
rows <- lapply(c(1L, 3L, 6L, 12L), function(p) {
  fc <- recursive_arx(cm$unrate, cm$infl, p = p, R = R)
  enc <- enc_new(fc$e_ar, fc$e_arx)
  msfe1 <- mean(fc$e_ar ^ 2)
  msfe2 <- mean(fc$e_arx ^ 2)
  data.frame(p = p,
             n_oos = length(fc$e_ar),
             pi_ratio = round(length(fc$e_ar) / R, 2),
             MSFE_AR = msfe1,
             MSFE_ARX = msfe2,
             R2OS_pct = 100 * (1 - msfe2 / msfe1),
             ENC_NEW = unname(enc$statistic))
})
tab <- do.call(rbind, rows)
knitr::kable(
  tab, digits = 3, row.names = FALSE,
  col.names = c("$p$", "$T_{OOS}$", "$\\pi = P/R$",
                "$MSFE_{AR}$", "$MSFE_{ARX}$",
                "$R^2_{OS}$ (%)", "ENC-NEW"))

| $p$ | $T_{OOS}$ | $\pi = P/R$ | $MSFE_{AR}$ | $MSFE_{ARX}$ | $R^2_{OS}$ (%) | ENC-NEW |
|---|---|---|---|---|---|---|
| 1 | 367 | 3.06 | 0.035 | 0.034 | 2.968 | 13.930 |
| 3 | 365 | 3.04 | 0.033 | 0.032 | 2.184 | 11.412 |
| 6 | 362 | 3.02 | 0.031 | 0.031 | 1.833 | 14.754 |
| 12 | 356 | 2.97 | 0.032 | 0.033 | -2.459 | 20.984 |
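
The two bookkeeping columns of the table can be checked by hand. With p lags, the first p observations are lost to lag construction and the next R seed the first regression, so the number of out-of-sample forecasts is n - p - R. A minimal sketch, taking n = 488 from the nrow(cm) output earlier in the article:

```r
# Sample size after subsetting and the initial estimation window.
n <- 488L
R <- 120L
p <- c(1L, 3L, 6L, 12L)

# p rows are lost to lagging; R rows are used for the first fit.
n_oos    <- n - p - R
pi_ratio <- round(n_oos / R, 2)

data.frame(p = p, n_oos = n_oos, pi_ratio = pi_ratio)
```

The result matches the second and third columns of the table, which confirms that π is close to 3 in every specification.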
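For p = 1, 3 and 6 the ARX model achieves a lower MSFE than the pure
AR (out-of-sample R² between 1.8% and 3.0%), so knowing past inflation
reduces out-of-sample unemployment-forecast error at those lag orders.
At p = 12 the ARX MSFE is slightly higher (out-of-sample R² of -2.5%),
plausibly because twelve extra coefficients add estimation noise.
ENC-NEW values are nonetheless large relative to the
Clark-McCracken (2001) Table 2 asymptotic 5% critical values, which are
tabulated by k2 (number of extra regressors =
p) and π = P/R. For k2 = 6,
π ≈ 2, the 5% CV is ≈ 8.6; for k2 = 12,
π ≈ 2, the 5% CV is ≈ 12.0. The realised π here is close to 3, so
these π ≈ 2 values serve only as approximate benchmarks. The statistics
for p = 6 (14.754) and p = 12 (20.984) exceed them by a wide margin,
and those for p = 1 (13.930) and p = 3 (11.412) exceed even the
k2 = 6 value. The encompassing null is therefore rejected in every
specification: the AR forecasts do not encompass the ARX, and inflation
carries real predictive content for unemployment.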
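
For reference, the ENC-NEW statistic of Clark and McCracken (2001) is P times the sample mean of e1 (e1 - e2), divided by the MSFE of the larger model. The sketch below computes it directly. enc_new_by_hand is a hypothetical helper written for this illustration and is not part of forecastdom; enc_new() may differ in details such as its return structure.

```r
# Hypothetical helper (not part of forecastdom): ENC-NEW from the
# benchmark errors e1 and the alternative-model errors e2.
enc_new_by_hand <- function(e1, e2) {
  stopifnot(length(e1) == length(e2))
  P <- length(e1)
  P * mean(e1 * (e1 - e2)) / mean(e2^2)
}

# Simulated errors: e1 contains a component that e2 has removed.
set.seed(1)
common <- rnorm(200)
signal <- rnorm(200, sd = 0.3)
e1 <- common + signal
e2 <- common
enc_new_by_hand(e1, e2)

# Deterministic check: sum(e1 * (e1 - e2)) = 3 and mean(e2^2) = 0.5.
toy <- enc_new_by_hand(c(1, -1, 2), c(0.5, -0.5, 1))
toy  # 6
```

Large positive values count against the encompassing null. The statistic has a non-standard limiting distribution, which is why it is compared with tabulated critical values rather than normal quantiles.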
Comparison with the Clark-West MSFE-adjusted test
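enc_new() answers an encompassing question;
cw_test() answers the closely related “equal MSFE” question
with a t-statistic that is approximately standard normal under the
null (Clark and West, 2007). The statistic adjusts the larger model's
MSFE for the noise introduced by estimating its extra parameters.
The two tests typically point the same way; reporting both is
informative.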
rows2 <- lapply(c(1L, 3L, 6L, 12L), function(p) {
  fc <- recursive_arx(cm$unrate, cm$infl, p = p, R = R)
  f1 <- cm$unrate[(p + 1 + R):nrow(cm)] - fc$e_ar
  f2 <- cm$unrate[(p + 1 + R):nrow(cm)] - fc$e_arx
  cw <- cw_test(fc$e_ar, fc$e_arx, f1, f2)
  data.frame(p = p,
             CW_stat = unname(cw$statistic),
             CW_pvalue = unname(cw$pvalue))
})
knitr::kable(
  do.call(rbind, rows2), digits = 3, row.names = FALSE,
  col.names = c("$p$", "CW stat", "CW $p$-value"))

| $p$ | CW stat | CW $p$-value |
|---|---|---|
| 1 | 2.531 | 0.006 |
| 3 | 2.295 | 0.011 |
| 6 | 2.522 | 0.006 |
| 12 | 2.489 | 0.006 |
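CW p-values are well below 5% for every lag order, agreeing with the ENC-NEW decision: the null of equal population-level MSFE is rejected in favour of the ARX model. This holds even at p = 12, where the ARX model's unadjusted sample MSFE is slightly higher than the AR's.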
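
The MSFE-adjusted statistic of Clark and West (2007) can be sketched in a few lines. cw_by_hand is a hypothetical helper written for this illustration and is not part of forecastdom. It uses the plain sample standard deviation, which suits one-step-ahead forecasts, whereas cw_test() may use a different variance estimator.

```r
# Hypothetical helper (not part of forecastdom): Clark-West (2007)
# MSFE-adjusted statistic for one-step-ahead forecasts.
cw_by_hand <- function(e1, e2, f1, f2) {
  P <- length(e1)
  # Adjusted loss differential: the (f1 - f2)^2 term offsets the noise
  # that the extra estimated parameters add to the larger model's MSFE.
  f_adj <- e1^2 - (e2^2 - (f1 - f2)^2)
  stat <- sqrt(P) * mean(f_adj) / sd(f_adj)
  list(statistic = stat,
       p_value = pnorm(stat, lower.tail = FALSE))  # one-sided
}

# Deterministic toy inputs: realised values y and two forecast series.
y  <- c(1, -1, 2, 0)
f1 <- c(0, 0, 0, 0)
f2 <- c(0.5, -0.5, 1, -0.5)
out <- cw_by_hand(y - f1, y - f2, f1, f2)
out$statistic  # sqrt(3), since f_adj = c(1, 1, 4, 0)
```

The test is one-sided: only large positive values of the statistic count as evidence for the larger model, which is why the p-value uses the upper tail.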
References
- Clark, T. E. and McCracken, M. W. (2001). Tests of equal forecast accuracy and encompassing for nested models. Journal of Econometrics, 105(1), 85-110.
- Clark, T. E. and West, K. D. (2007). Approximately normal tests for equal predictive accuracy in nested models. Journal of Econometrics, 138(1), 291-311.
