Overview
sdim implements five factor extraction methods for asset pricing and macroeconomic forecasting:
| Function | Method | Reference |
|---|---|---|
| pca_est() | Principal Component Analysis (PCA) | He et al. (2023, MS) |
| pls_est() | Partial Least Squares (PLS) | He et al. (2023, MS) |
| rra_est() | Reduced-Rank Approach (RRA) | He et al. (2023, MS) |
| spca_est() | Scaled PCA (sPCA) | Huang et al. (2022, MS) |
| ipca_est() | Instrumented PCA (IPCA) | Kelly, Pruitt & Su (2019, JFE) |
PCA, PLS, and RRA take a multivariate target (a T×N returns matrix) and a matrix of factor proxies. sPCA takes a univariate target and scales each proxy by its OLS slope on the target before extracting principal components. IPCA extracts latent factors from panel data, using time-varying characteristics as instruments, and is estimated by alternating least squares (ALS). Use eval_factors() to evaluate the extracted factors.
The package ships with seven he2023_* datasets (factor proxies and portfolio returns) from the He et al. (2023, MS) replication package.
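To make the sPCA description above concrete, the following base-R sketch scales each standardized proxy by the slope from a univariate regression of the target on that proxy, then runs ordinary PCA on the scaled matrix. It is an illustration under simplifying assumptions (simulated data, a contemporaneous rather than predictive regression, and hypothetical names such as X_scaled), not the code inside spca_est(), whose internals may differ.

```r
# Illustrative sPCA scaling step in base R (not the sdim implementation).
set.seed(1)
n_obs  <- 200
n_pred <- 20
X <- matrix(rnorm(n_obs * n_pred), n_obs, n_pred)  # T x L factor proxies
y <- X[, 1] + rnorm(n_obs)                         # target driven by proxy 1 only

# 1. Standardize each proxy to mean 0 and variance 1.
Xs <- scale(X)

# 2. Slope from regressing the target on each standardized proxy, one at a time.
beta <- vapply(
  seq_len(n_pred),
  function(j) unname(coef(lm(y ~ Xs[, j]))[2]),
  numeric(1)
)

# 3. Scale each proxy by its slope, so informative proxies get more weight.
X_scaled <- sweep(Xs, 2, beta, `*`)

# 4. Ordinary PCA on the scaled proxies; keep the first three components.
pc <- prcomp(X_scaled, center = TRUE, scale. = FALSE)
factors <- pc$x[, 1:3]
```

Because proxies with larger slopes receive more weight, the first component in this simulation is dominated by the single proxy that actually drives y.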
Installation
# Install from GitHub (not yet on CRAN)
# install.packages("pak")
pak::pak("GabboCg/sdim")
Usage
Quick start
library(sdim)
set.seed(42)
X <- matrix(rnorm(200 * 20), 200, 20) # T x L factor proxies
ret <- matrix(rnorm(200 * 30) / 100, 200, 30) # T x N returns (target)
# Fit each method
fit_pca <- pca_est(target = ret, X = X, nfac = 3)
fit_pls <- pls_est(target = ret, X = X, nfac = 3)
fit_rra <- rra_est(target = ret, X = X, nfac = 3)
print(fit_rra)
#> <sdim_fit [rra]>
#> Observations : 200
#> Predictors : 20
#> Factors : 3
# Evaluate factor quality (RMSPE and total adj-R² from He et al. 2023, §2.4)
eval_factors(ret = ret, factors = fit_rra$factors)
#> Factor Evaluation
#> ----------------------------------------
#> Portfolios 30
#> Factors 3
#>
#> Performance (He et al., 2023, §2.4)
#> ----------------------------------------
#> RMSPE 0.9875 (%)
#> Total adj-R² 2.9593 (%)
#> SR 0.0522
#> A2R 0.9443
IPCA (panel with time-varying characteristics)
# Simulate panel
set.seed(99)
TT <- 120
K <- 50
n_chars <- 6
ret <- matrix(rnorm(TT * K) / 100, TT, K)
Z <- array(rnorm(TT * K * n_chars), dim = c(TT, K, n_chars))
fit_ipca <- ipca_est(ret, Z, nfac = 3)
print(fit_ipca)
#> <sdim_fit [ipca]>
#> Observations : 120
#> Characteristics : 6
#> Factors : 3
#> Factor mean : zero
summary(fit_ipca)
#> Instrumented Principal Components Analysis (IPCA)
#> ----------------------------------------
#> Call: ipca_est(ret = ret, Z = Z, nfac = 3)
#>
#> Dimensions
#> ----------------------------------------
#> Observations 120
#> Characteristics 6
#> Factors 3
#> Factor mean zero
#>
#> Eigenvalues
#> ----------------------------------------
#> F1 F2 F3
#> Eigenvalue 0.8952 0.9336 0.8652
#> Var. expl. (%) 33.2300 34.6600 32.1200
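The fit above is described as being estimated by alternating least squares. The base-R sketch below shows one way such an iteration can look for the model r_t = Z_t Γ f_t + e_t: given Γ, each f_t is a cross-sectional regression; given the factors, vec(Γ) is a pooled regression. Everything here is an assumption made for illustration (the simulated data, the SVD starting value, the QR normalization, the contemporaneous timing, and names such as Gamma and Fhat); it is not the code inside ipca_est().

```r
# Illustrative ALS for r_t = Z_t %*% Gamma %*% f_t + e_t (not the sdim code).
set.seed(7)
n_time <- 60; n_asset <- 40; n_char <- 5; n_fac <- 2

# Simulated panel with a known loading matrix Gamma_true.
Z <- array(rnorm(n_time * n_asset * n_char), dim = c(n_time, n_asset, n_char))
Gamma_true <- qr.Q(qr(matrix(rnorm(n_char * n_fac), n_char, n_fac)))
F_true <- matrix(rnorm(n_time * n_fac), n_time, n_fac)
ret <- matrix(0, n_time, n_asset)
for (tt in seq_len(n_time)) {
  ret[tt, ] <- Z[tt, , ] %*% Gamma_true %*% F_true[tt, ] + rnorm(n_asset, sd = 0.1)
}

# Starting value: leading left singular vectors of x_t = Z_t' r_t / N.
X_mp <- sapply(seq_len(n_time),
               function(tt) crossprod(Z[tt, , ], ret[tt, ]) / n_asset)
Gamma <- svd(X_mp)$u[, seq_len(n_fac), drop = FALSE]
Fhat  <- matrix(0, n_time, n_fac)

for (iter in 1:100) {
  Gamma_old <- Gamma

  # Step 1: given Gamma, f_t is the OLS coefficient of r_t on Z_t %*% Gamma.
  for (tt in seq_len(n_time)) {
    ZG <- Z[tt, , ] %*% Gamma
    Fhat[tt, ] <- solve(crossprod(ZG), crossprod(ZG, ret[tt, ]))
  }

  # Step 2: given the factors, vec(Gamma) is a pooled OLS on kron(f_t', Z_t).
  A <- matrix(0, n_char * n_fac, n_char * n_fac)
  b <- numeric(n_char * n_fac)
  for (tt in seq_len(n_time)) {
    W <- kronecker(t(Fhat[tt, ]), Z[tt, , ])
    A <- A + crossprod(W)
    b <- b + crossprod(W, ret[tt, ])
  }
  Gamma <- matrix(solve(A, b), n_char, n_fac)

  # Normalize to orthonormal columns and rotate the factors to match,
  # which leaves the fitted values Z_t %*% Gamma %*% f_t unchanged.
  qrG   <- qr(Gamma)
  Gamma <- qr.Q(qrG)
  Fhat  <- Fhat %*% t(qr.R(qrG))

  # Compare absolute values so a column sign flip from the QR step
  # does not prevent the loop from stopping.
  if (max(abs(abs(Gamma) - abs(Gamma_old))) < 1e-8) break
}

# Uncentered total R-squared of the fitted panel.
fitted <- t(sapply(seq_len(n_time),
                   function(tt) Z[tt, , ] %*% Gamma %*% Fhat[tt, ]))
r2 <- 1 - sum((ret - fitted)^2) / sum(ret^2)
```

Γ is identified only up to rotation, so any comparison with Gamma_true should be made on column spaces rather than entry by entry. The QR step is just one convenient normalization.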
# With factor mean specifications
fit_const <- ipca_est(ret, Z, nfac = 3, factor_mean = "constant")
fit_const$mu # time-series mean of each factor
fit_var <- ipca_est(ret, Z, nfac = 3, factor_mean = "VAR1")
fit_var$var_coef # K x K VAR(1) coefficient matrix
sPCA (univariate target)
y <- rnorm(200) # univariate return series
fit_spca <- spca_est(target = y, X = X, nfac = 3)
summary(fit_spca)
#> Scaled PCA (sPCA)
#> ----------------------------------------
#> Call: spca_est(target = y, X = X, nfac = 3)
#>
#> Dimensions
#> ----------------------------------------
#> Observations 200
#> Predictors 20
#> Factors 3
#>
#> Eigenvalues
#> ----------------------------------------
#> F1 F2 F3
#> Eigenvalue 12.3456 8.7654 5.4321
#> Var. expl. (%) 46.73 33.20 20.57
#>
#> OLS slope summary (beta)
#> ----------------------------------------
#> 0% 25% 50% 75% 100%
#> -0.1234 -0.0512 0.0103 0.0634 0.1521
Getting help
If you encounter a bug, please file an issue with a minimal reproducible example on GitHub. For questions, email gabriel.cabreraguzman@postgrad.manchester.ac.uk.
References
He, J., Huang, J., Li, F., and Zhou, G. (2023). “Shrinking Factor Dimension: A Reduced-Rank Approach.” Management Science, 69(9). doi:10.1287/mnsc.2022.4563
Huang, J., Jiang, J., Li, F., Tong, G., and Zhou, G. (2022). “Scaled PCA: A New Approach to Dimension Reduction.” Management Science, 68(3). doi:10.1287/mnsc.2021.4020
Kelly, B. T., Pruitt, S., and Su, Y. (2019). “Characteristics are Covariances: A Unified Model of Risk and Return.” Journal of Financial Economics, 134(3). doi:10.1016/j.jfineco.2019.05.001
