Implements scaled principal component analysis (sPCA): predictors are first standardized, then each standardized predictor is scaled by its univariate predictive slope on the target, and finally principal components are extracted from the scaled predictors.
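The three steps above can be sketched in base R as follows. This is an illustrative sketch only, not the package's implementation: the helper name `spca_sketch`, the use of `lm()` with an intercept for the univariate slopes, and the SVD-based factor normalization are all assumptions made for the example.

```r
# Minimal sPCA sketch in base R (hypothetical helper, not the package's code).
spca_sketch <- function(target, X, nfac) {
  X <- as.matrix(X)
  # Step 1: standardize each predictor to mean 0 and standard deviation 1.
  Xs <- scale(X)
  # Step 2: univariate slope of the target on each standardized predictor
  # (assumed here to come from a regression with an intercept).
  beta <- apply(Xs, 2, function(x) unname(coef(lm(target ~ x))[2]))
  # Scale each standardized column by its own slope.
  scaleXs <- sweep(Xs, 2, beta, `*`)
  # Step 3: principal components of the scaled predictors via SVD.
  # The normalization of factors is an assumption of this sketch.
  sv <- svd(scaleXs, nu = nfac, nv = nfac)
  factors <- sv$u %*% diag(sv$d[1:nfac], nrow = nfac)
  list(factors = factors, beta = beta, loadings = sv$v)
}

set.seed(1)
X <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)
y <- 0.5 * X[, 1] + rnorm(100)
fit <- spca_sketch(y, X, nfac = 2)
dim(fit$factors)  # 100 rows, 2 columns
```

Predictors with larger slopes receive more weight before the decomposition, so the leading components tilt toward the predictors most related to the target.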
Usage
spca_est(target, X, nfac, winsorize = FALSE, winsor_probs = c(0, 99))
Arguments
- target
A numeric vector of length T_reg (T_reg <= T).
- X
A numeric matrix or data frame with T rows and N columns. When length(target) < nrow(X), the first length(target) rows of the standardized X are used for the scaling regression while all T rows are used for standardization and factor extraction. This matches the out-of-sample workflow in Huang et al. (2022), where the predictive regression y_{t+1} ~ X_t uses fewer rows than the full training window.
- nfac
A positive integer giving the number of factors to extract.
- winsorize
Logical; if TRUE, winsorize absolute slope estimates before scaling predictors.
- winsor_probs
Numeric vector of length 2 giving winsorization percentiles. Used only when winsorize = TRUE.
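The row alignment and winsorization described in the arguments can be illustrated with a short base R sketch. It is not the package's code: dividing `winsor_probs` by 100 before calling `quantile()`, clipping with `pmin()`/`pmax()`, and using the winsorized absolute slopes as the scaling coefficients are assumptions made for illustration.

```r
set.seed(42)
T_full <- 120
N <- 6
X <- matrix(rnorm(T_full * N), nrow = T_full, ncol = N)
target <- rnorm(T_full - 1)   # one fewer observation than rows of X
T_reg <- length(target)

# Standardize using all T rows.
Xs <- scale(X)

# Slopes are estimated on the first T_reg rows of the standardized X only.
Xs_reg <- Xs[seq_len(T_reg), , drop = FALSE]
beta <- apply(Xs_reg, 2, function(x) unname(coef(lm(target ~ x))[2]))

# Winsorize absolute slopes (assumption: percentiles are on a 0-100 scale).
winsor_probs <- c(0, 99)
abs_beta <- abs(beta)
bounds <- unname(quantile(abs_beta, probs = winsor_probs / 100))
beta_scaled <- pmin(pmax(abs_beta, bounds[1]), bounds[2])

# All T rows are scaled, so factors could be extracted for every row.
scaleXs <- sweep(Xs, 2, beta_scaled, `*`)
dim(scaleXs)  # 120 rows, 6 columns
```

With the default percentiles only the upper tail is clipped, which limits the influence of a single predictor with an unusually large estimated slope.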
Value
An object of class "sdim_spca" with components:
- factors
A T x nfac matrix of extracted sPCA factors.
- beta
A numeric vector of predictor-specific predictive slopes.
- beta_scaled
A numeric vector of scaling coefficients actually used.
- col_means
Column means of X (used by predict).
- col_sds
Column standard deviations of X (used by predict).
- Xs
The standardized predictor matrix.
- scaleXs
The scaled standardized predictor matrix.
- lambda
The estimated loading matrix.
- residuals
Residual matrix from the PCA reconstruction step.
- ve2
Average squared residual by row.
- eigvals
Singular values from the decomposition of scaleXs %*% t(scaleXs).
- call
The matched function call.
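To make the relationship between factors, loadings, residuals, and the row-wise average squared residual concrete, here is a generic rank-nfac reconstruction in base R. The matrix `Z` is a stand-in for a scaled predictor matrix, and the normalization of factors and loadings is an assumption of this sketch; the package may normalize them differently.

```r
set.seed(7)
Z <- scale(matrix(rnorm(50 * 4), nrow = 50, ncol = 4))  # stand-in for scaleXs
nfac <- 2

# Rank-nfac approximation via SVD (normalization is an assumption).
sv <- svd(Z, nu = nfac, nv = nfac)
factors <- sv$u %*% diag(sv$d[1:nfac], nrow = nfac)  # T x nfac
lambda <- sv$v                                       # N x nfac loadings

# Residuals are what the nfac factors fail to reconstruct.
residuals <- Z - factors %*% t(lambda)

# Average squared residual for each row (one value per time period).
ve2 <- rowMeans(residuals^2)
length(ve2)  # 50
```

In this sketch the total squared residual equals the sum of the squared singular values that were discarded, which is a convenient sanity check on the reconstruction.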
References
Huang, D., Jiang, F., Li, K., Tong, G., and Zhou, G. (2022). Scaled PCA: A New Approach to Dimension Reduction. Management Science, 68(3), 1678–1695. doi:10.1287/mnsc.2021.4020
Examples
set.seed(123)
X <- matrix(rnorm(200 * 10), nrow = 200, ncol = 10)
y <- rnorm(200)
fit <- spca_est(target = y, X = X, nfac = 3)
dim(fit$factors)
#> [1] 200 3
head(fit$beta)
#> [1] 0.0007544186 0.0307199352 0.0074016978 -0.0217043884 0.0892871951
#> [6] 0.0260282689
# Predictive alignment: target has fewer rows than X
fit2 <- spca_est(target = y[1:199], X = X, nfac = 3)
dim(fit2$factors) # 200 x 3 (factors for all T rows)
#> [1] 200 3
