This article applies gw_test(), the Giacomini and White
(2006) test of conditional equal predictive ability (CEPA), to the Survey
of Professional Forecasters’ (SPF) mean CPI inflation forecasts. The benchmark
is a naive random walk in inflation (next quarter’s forecast = this
quarter’s realised inflation); the alternative is the SPF mean at
horizons $h = 0, 1, \dots, 4$
quarters. The bundled gw2006 dataset extends GW’s quarterly
sample to the present using the same Philadelphia Fed SPF source.
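As a small illustration of how the random-walk benchmark is formed, the sketch below uses the first five infl values of the dataset, rounded as they are printed by str(). The variable names rw_forecast and e_rw are chosen for this example only.

```r
# First five quarterly inflation values (rounded) from the dataset
infl <- c(11.61, 6.66, 3.60, 5.91, 7.13)

# Random-walk forecast for quarter t is realised inflation in quarter t - 1
rw_forecast <- c(NA, head(infl, -1))

# Forecast error = realisation minus forecast; the first quarter has no forecast
e_rw <- infl - rw_forecast

cbind(infl, rw_forecast, e_rw)
```

The rw_forecast column reproduces the leading values of infl_lag in the bundled data, which is why the article can use infl_lag directly as the benchmark forecast.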
library(forecastdom)
data(gw2006)
str(gw2006)
#> 'data.frame': 180 obs. of 10 variables:
#> $ date : Date, format: "1981-09-01" "1981-12-01" ...
#> $ year : int 1981 1981 1982 1982 1982 1982 1983 1983 1983 1983 ...
#> $ quarter : int 3 4 1 2 3 4 1 2 3 4 ...
#> $ infl : num 11.61 6.66 3.6 5.91 7.13 ...
#> $ infl_lag: num NA 11.61 6.66 3.6 5.91 ...
#> $ spf_h0 : num 7.71 10.73 6.36 3.54 5.33 ...
#> $ spf_h1 : num NA 9.22 8.95 5.51 3.69 ...
#> $ spf_h2 : num NA NA 7.93 7.76 6.08 ...
#> $ spf_h3 : num NA NA NA 7.76 7.56 ...
#> $ spf_h4 : num NA NA NA NA 7.61 ...
CEPA across forecast horizons
run_h <- function(h) {
  spf_col <- paste0("spf_h", h)
  ok <- complete.cases(gw2006[, c("infl", spf_col, "infl_lag")])
  d <- gw2006[ok, ]
  e_spf <- d$infl - d[[spf_col]]   # SPF errors
  e_rw <- d$infl - d$infl_lag      # random-walk errors
  r <- gw_test(e_spf, e_rw)        # default: constant + lagged loss diff
  data.frame(h = h, n = nrow(d),
             mse_spf = mean(e_spf^2),
             mse_rw = mean(e_rw^2),
             wald = unname(r$statistic),
             pvalue = unname(r$pvalue),
             reject = unname(r$pvalue) < 0.05)
}
tab <- do.call(rbind, lapply(0:4, run_h))
knitr::kable(
  tab, digits = 3, row.names = FALSE,
  col.names = c("$h$", "$n$",
                "$MSE_{SPF}$", "$MSE_{RW}$",
                "Wald", "$p$-value", "Reject"))
| $h$ | $n$ | $MSE_{SPF}$ | $MSE_{RW}$ | Wald | $p$-value | Reject |
|---|---|---|---|---|---|---|
| 0 | 176 | 5.657 | 5.565 | 4.510 | 0.105 | FALSE |
| 1 | 176 | 4.906 | 5.565 | 1.206 | 0.547 | FALSE |
| 2 | 175 | 4.631 | 5.456 | 4.177 | 0.124 | FALSE |
| 3 | 174 | 4.708 | 5.434 | 3.581 | 0.167 | FALSE |
| 4 | 173 | 4.884 | 5.435 | 2.908 | 0.234 | FALSE |
At horizons 1 to 4 the SPF has a lower mean squared error than the random-walk benchmark (at $h = 0$ it is marginally higher, 5.657 versus 5.565), yet the GW test fails to reject conditional equal predictive ability at the 5% level at any horizon. The pattern is consistent with Atkeson and Ohanian (2001): once the autocorrelation of the loss differential is accounted for, sophisticated inflation forecasts are hard to distinguish from a “no-change” benchmark in a formal test.
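For intuition about the Wald column, the following is a minimal base-R sketch of a one-step-ahead statistic of the Giacomini-White form, n * zbar' Omega^{-1} zbar with Z_{t+1} = h_t * dL_{t+1}, using a constant and the lagged loss differential as instruments. The function name gw_wald_sketch is hypothetical and is not part of forecastdom. Whether gw_test() uses this exact variance estimator is an assumption not confirmed here, in particular at horizons above one, where a HAC correction is typically needed. Its output therefore need not match the table.

```r
# Hypothetical helper (NOT the forecastdom implementation): one-step-ahead
# Wald statistic with instruments h_t = (1, lagged loss differential).
gw_wald_sketch <- function(e1, e2) {
  dl <- e1^2 - e2^2                  # squared-error loss differential
  n  <- length(dl) - 1               # one observation is lost to the lag
  H  <- cbind(1, dl[seq_len(n)])     # instruments known at time t
  Z  <- H * dl[-1]                   # row t: h_t * dL_{t+1}
  zbar  <- colMeans(Z)
  omega <- crossprod(Z) / n          # sample second moment, no HAC correction
  stat  <- n * sum(zbar * solve(omega, zbar))
  c(statistic = stat,
    df = ncol(H),
    pvalue = pchisq(stat, df = ncol(H), lower.tail = FALSE))
}

# Usage on simulated errors with equal predictive ability by construction
set.seed(1)
e1 <- rnorm(200)
e2 <- rnorm(200)
res <- gw_wald_sketch(e1, e2)
res
```

The statistic is compared with a chi-squared distribution whose degrees of freedom equal the number of instruments, which is why the tables in this article report df = 2.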
Choice of instruments
gw_test() defaults to two instruments: a constant and
the lagged loss differential. Different instruments target different
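features of the conditioning information set. Below we replace the lagged
loss differential with the lagged inflation level, a regime indicator that
captures high/low inflation environments, at horizon $h = 1$. As a separate
check, we also rerun the default instruments under absolute-error loss.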
h <- 1L
ok <- complete.cases(gw2006[, c("infl", paste0("spf_h", h), "infl_lag")])
d <- gw2006[ok, ]
e_spf <- d$infl - d[[paste0("spf_h", h)]]
e_rw <- d$infl - d$infl_lag
# Default: constant + lagged loss differential
r_default <- gw_test(e_spf, e_rw)
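# Custom: constant + lagged inflation level. d$infl_lag is shifted down one
# further row and padded with NA in row 1 so that W_infl has one row per
# forecast error; the NA row is dropped via `keep` before calling gw_test().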
W_infl <- cbind(1, c(NA, head(d$infl_lag, -1)))
keep <- !is.na(W_infl[, 2])
r_infl <- gw_test(e_spf[keep], e_rw[keep],
instruments = W_infl[keep, , drop = FALSE])
# Absolute-error loss instead of squared-error
r_ae <- gw_test(e_spf, e_rw, loss = "AE")
tab2 <- data.frame(
spec = c("Default (const + lagged Δloss)",
"Const + lagged inflation",
"Absolute-error loss"),
wald = c(r_default$statistic, r_infl$statistic, r_ae$statistic),
df = c(r_default$df, r_infl$df, r_ae$df),
pvalue = c(r_default$pvalue, r_infl$pvalue, r_ae$pvalue)
)
knitr::kable(
  tab2, digits = 3, row.names = FALSE,
  col.names = c("Specification", "Wald", "df", "$p$-value"))
| Specification | Wald | df | $p$-value |
|---|---|---|---|
| Default (const + lagged Δloss) | 1.206 | 2 | 0.547 |
| Const + lagged inflation | 10.738 | 2 | 0.005 |
| Absolute-error loss | 1.957 | 2 | 0.376 |
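The decision is not robust across the three specifications. With the default instruments, under either squared-error or absolute-error loss, SPF and the random walk cannot be statistically distinguished at the 5% level ($p$ = 0.547 and 0.376). Replacing the lagged loss differential with lagged inflation changes the outcome: the test rejects conditional equal predictive ability (Wald = 10.738, $p$ = 0.005), which suggests that the relative performance of the two forecasts depends on the level of inflation.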
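The $p$-values in the table can be recovered from the reported Wald statistics, assuming a chi-squared reference distribution with the df shown. The sketch below copies the three statistics from the table and makes the 5% decision for each row explicit. The vector names are illustrative only.

```r
# Wald statistics and degrees of freedom copied from the table above
wald <- c(default = 1.206, lagged_infl = 10.738, abs_error = 1.957)
df   <- 2

# Upper-tail chi-squared probability for each statistic
pval <- pchisq(wald, df = df, lower.tail = FALSE)

data.frame(wald = wald,
           pvalue = round(pval, 3),
           reject_5pct = pval < 0.05)
```

The recomputed values agree with the table to three decimals, and only the lagged-inflation specification falls below 0.05.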
Note on the original paper
Giacomini and White (2006, Section 4) compare SPF nowcasts to the Greenbook — the Federal Reserve staff’s internal inflation forecast — using a richer instrument set. They find that the test does reject equal conditional predictive ability between the two sophisticated forecasts in some conditioning states. Replicating that result requires Greenbook data with its five-year embargo; this article instead demonstrates the test mechanics against the simpler random-walk benchmark, which is freely available.
