Lecture Note — Chen, Roussanov & Wang (2023)

Semiparametric Conditional Factor Models: Estimation and Inference

Kei Matsumae · 2026-05-15 · v0.2 (bilingual)

What this paper does

Introduces regressed-PCA: a two-step estimator for a conditional factor model where pricing errors $\alpha(\cdot)$ and loadings $\beta(\cdot)$ are unknown nonparametric functions of stock characteristics and factors $f_t$ are latent.
Step 1 is a cross-sectional regression on a basis of characteristics (Fama-MacBeth with sieves). Step 2 is PCA on the resulting time series of slope vectors.
Needs large $N$ but not large $T$ — rolling sub-samples are allowed, so the factor structure can drift over time.
Empirically: only 1–2 latent factors; pure-$\alpha$ portfolios with Sharpe > 3; mispricing declines over time.
Direct competitor to IPCA (Kelly–Pruitt–Su 2019): same panel, different objective, different conclusions.

1. Why this paper exists

Empirical asset pricing has a factor zoo: 300+ characteristics predict cross-sectional returns. Since Fama-MacBeth (1973) the central question is whether each predictive characteristic is:

RISK capturing time-varying exposures to genuine systematic factors, or
MISPRICING earning returns not explained by any factor.

Three challenges block a clean answer:

Factors $f_t$ are latent — we don't observe them.
The mapping from characteristics to loadings ($\beta$) and pricing errors ($\alpha$) is an unknown function that need not be linear; the literature typically imposes linearity for tractability.
$N \approx 12{,}000$ stocks but only $T \approx 600$ months: standard latent-factor asymptotics need $T\to\infty$ as well as $N\to\infty$, which fits poorly when $N \gg T$ and rules out short rolling windows.

Chen, Roussanov & Wang (CRW) tackle all three jointly.

Linear factor models (Fama-French, etc.): assume $\alpha, \beta$ constants. Can't handle conditional, characteristic-driven loadings.
IPCA (Kelly–Pruitt–Su 2019): assumes $\alpha(z) = z'\Gamma_\alpha$, $\beta(z) = z'\Gamma_\beta$ — linear in characteristics. Maximises joint TS+XS fit.
CRW (2023, this paper): allows $\alpha(\cdot), \beta(\cdot)$ to be arbitrary nonlinear functions. Factors capture time-series comovement first; characteristics explain cross-section of average returns second.

2. The model

For stock $i$ in month $t$:

$$ y_{it} \;=\; \alpha(z_{it}) \;+\; \beta(z_{it})' f_t \;+\; \varepsilon_{it}, \qquad i=1,\dots,N,\;\; t=1,\dots,T. $$

Symbol	Meaning	Observed?
$y_{it}$	excess return	✅
$z_{it}$	$M$-vector of stock characteristics (lagged)	✅
$\alpha(\cdot)$	scalar pricing-error function	❌ unknown
$\beta(\cdot)$	$K$-vector loading function	❌ unknown
$f_t$	$K$-vector of latent factors	❌ unknown
$\varepsilon_{it}$	idiosyncratic shock	❌

The asset-pricing question is whether $\alpha(\cdot) \equiv 0$. If yes, characteristics matter only via $\beta(\cdot)$ — only via risk exposures. If no, characteristics carry mispricing too.

3. The key trick — sieve approximation

You can't estimate $\alpha(z)$ and $\beta(z)$ directly because they're infinite-dimensional. Sieve approximation replaces each by a finite linear combination of basis functions $\phi(z)$ (polynomials, B-splines, etc.):

$$ \alpha(z) \;\approx\; \phi(z)' a, \qquad \beta_k(z) \;\approx\; \phi(z)' b_k. $$

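Collect the loading coefficients in $B = [b_1, \dots, b_K] \in \mathbb{R}^{J\times K}$, so that $\beta(z) \approx B'\phi(z)$ and $\beta(z)'f_t \approx \phi(z)'B f_t$. Together with $a$ this gives a $J\times(K+1)$ coefficient matrix $[a, B]$, and the model can be rewritten as: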

$$ y_{it} \;\approx\; \phi(z_{it})'\; \underbrace{\bigl(a + B f_t\bigr)}_{\displaystyle =: \Gamma_t \in \mathbb{R}^J} \;+\; \varepsilon_{it}. $$

So at each $t$ the return is approximately linear in the basis $\phi(z_{it})$, with time-varying coefficient vector $\Gamma_t$.

The basis $\phi(z)$ creates lots of "synthetic" linear factors out of nonlinear functions of characteristics. The conditional model with $K$ latent factors becomes a linear panel with $J \gg K$ "managed portfolios", and the $K$ true factors live inside the time series of slopes.
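To make the sieve step concrete, here is a minimal sketch of a basis for a single characteristic. The helper name `linear_spline_basis` and the knot placement are illustrative assumptions, not CRW's construction: the truncated-power form used here spans the same continuous piecewise-linear functions as a linear B-spline basis with the same knots, but it is not their code.

```python
import numpy as np

def linear_spline_basis(z, knots=(0.0,)):
    """Return the N x J matrix [1, z, (z - k)_+ for each knot k]."""
    z = np.asarray(z, dtype=float)
    cols = [np.ones_like(z), z]
    cols += [np.maximum(z - k, 0.0) for k in knots]
    return np.column_stack(cols)

# A kinked function alpha(z) = |z| on [-0.5, 0.5] is not linear in z ...
z = np.linspace(-0.5, 0.5, 201)
alpha_true = np.abs(z)

# ... but it is exactly linear in the basis with one knot at 0,
# because |z| = -z + 2 * max(z, 0), i.e. a = (0, -1, 2).
phi = linear_spline_basis(z, knots=(0.0,))
a_hat, *_ = np.linalg.lstsq(phi, alpha_true, rcond=None)
max_err = np.max(np.abs(phi @ a_hat - alpha_true))
```

The point of the toy example is that the unknown function enters only through the coefficient vector `a_hat`, so estimating $\alpha(\cdot)$ reduces to estimating $J$ numbers.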

4. The estimator — regressed-PCA in two steps

Step 1 — Cross-sectional regression each month

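For each month $t$, run OLS of returns on the basis, with $Y_t \in \mathbb{R}^{N}$ the vector of returns $y_{it}$ and $\Phi_t \in \mathbb{R}^{N\times J}$ the matrix whose rows are $\phi(z_{it})'$: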

$$ \hat{\Gamma}_t \;=\; \bigl(\Phi_t' \Phi_t\bigr)^{-1} \Phi_t' Y_t. $$

This is Fama-MacBeth (1973) with basis functions. The slopes $\hat{\Gamma}_t \in \mathbb{R}^J$ are returns on $J$ managed portfolios — each has unit exposure to one basis function and zero exposure to the others.

Collect them: $\hat{\Gamma} = [\hat{\Gamma}_1, \dots, \hat{\Gamma}_T] \in \mathbb{R}^{J\times T}$.
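A minimal sketch of Step 1 on simulated data follows. The sizes, the quadratic basis and the helper name `basis` are assumptions made for illustration. The last lines check the managed-portfolio interpretation numerically: the weight matrix $(\Phi_t'\Phi_t)^{-1}\Phi_t'$ has unit exposure to its own basis function and zero exposure to the others.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, J = 500, 12, 3            # toy sizes, chosen for illustration

def basis(z):
    """Quadratic polynomial sieve for one characteristic: phi(z) = (1, z, z^2)."""
    return np.column_stack([np.ones_like(z), z, z**2])

Z = rng.uniform(-0.5, 0.5, size=(N, T))      # one characteristic per stock-month
Gamma_true = rng.normal(size=(J, T))         # stands in for a + B f_t
Y = np.empty((N, T))
for t in range(T):
    Y[:, t] = basis(Z[:, t]) @ Gamma_true[:, t] + 0.1 * rng.normal(size=N)

Gamma_hat = np.empty((J, T))
for t in range(T):
    Phi_t = basis(Z[:, t])                                   # N x J
    # lstsq solves the least-squares problem without forming an explicit inverse
    Gamma_hat[:, t], *_ = np.linalg.lstsq(Phi_t, Y[:, t], rcond=None)

# Managed-portfolio view (using the last month's Phi_t): row j of W_t holds
# portfolio weights with exposure one to basis function j and zero to the rest.
W_t = np.linalg.solve(Phi_t.T @ Phi_t, Phi_t.T)              # J x N
exposures = W_t @ Phi_t                                      # equals I_J up to rounding
```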

Step 2 — PCA on the slope matrix

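Since $\Gamma_t = a + B f_t$, the variation in $\Gamma_t$ across $t$ is driven by the $K$ latent factors $f_t$. Write $\bar{\Gamma} = T^{-1}\sum_t \hat{\Gamma}_t$ for the time-series average of the slopes and apply PCA to the demeaned slope matrix $\hat{\Gamma} - \bar{\Gamma} 1_T'$:

Top-$K$ eigenvectors of the $J\times J$ sample covariance of the slopes → $\hat{B}$ (basis coefficients of loading functions), normalised so that $\hat{B}'\hat{B} = I_K$.
Projection of the slopes on $\hat{B}$ → $\hat{f}_t = \hat{B}'\hat{\Gamma}_t$. The slopes are not demeaned here, so $\hat{f}_t$ keeps its mean, the factor risk premium.
Part of $\bar{\Gamma}$ orthogonal to $\hat{B}$ → $\hat{a} = (I_J - \hat{B}\hat{B}')\bar{\Gamma}$ (basis coefficients of $\alpha$). The raw time-series mean $\bar{\Gamma}$ estimates $a + B\,\mathbb{E}[f_t]$ and so mixes mispricing with risk premia; $a$ is separated from $B$ by the normalisation $a'B = 0$.
Plug back: $\hat{\alpha}(z) = \phi(z)'\hat{a}$, $\hat{\beta}(z) = \phi(z)'\hat{B}$.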

flowchart TB
    A["Raw panel<br/>y_it , z_it<br/>N stocks × T months"] --> B["Choose basis<br/>φ(z): linear / B-spline / …"]
    B --> C["STEP 1<br/>For each month t:<br/>OLS Y_t on Φ_t<br/>→ Γ̂_t ∈ ℝ^J"]
    C --> D["Stack:<br/>Γ̂ = (Γ̂_1, …, Γ̂_T)<br/>J × T matrix"]
    D --> E["STEP 2: PCA on Γ̂, demeaned over time"]
    E --> F1["â = (I − B̂B̂')·Γ̄, Γ̄ = time-average of Γ̂_t<br/>→ α̂(z) = φ(z)·â"]
    E --> F2["B̂ = top-K eigvecs<br/>→ β̂(z) = φ(z)·B̂"]
    E --> F3["f̂_t = B̂'Γ̂_t"]
    E --> F4["K̂ = eigenvalue-ratio<br/>selector"]
    style C fill:#eef4f8,stroke:#1f5d8a
    style E fill:#ecf3ee,stroke:#2e6e3e
    style F4 fill:#fff6e3,stroke:#b8651e

Because $\Gamma_t$ is affine in $f_t$, the leading principal components of the slope matrix recover the latent factors up to an invertible rotation (exactly in the noiseless model, approximately once $\hat{\Gamma}_t$ carries estimation error). The cross-sectional regression projects the high-dimensional return panel onto a $J$-dimensional managed-portfolio space, and PCA on that smaller object is well-conditioned even when $T$ is small. That is the core idea of the paper.
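The two steps can be sketched end to end on simulated data. Everything here is an assumption made for illustration: the function name `regressed_pca`, the quadratic basis, the sizes and the parameter values. The sketch separates $a$ from $B$ with the orthogonality normalisation $a'B = 0$ and keeps the factor means inside $\hat{f}_t$; the exact normalisations should be checked against the paper.

```python
import numpy as np

def basis(z):
    """Quadratic sieve for one characteristic: phi(z) = (1, z, z^2)."""
    return np.column_stack([np.ones_like(z), z, z**2])

def regressed_pca(Y, Phi, K):
    """Y: (N, T) returns; Phi: (T, N, J) basis values; K: number of factors.
    Returns a_hat (J,), B_hat (J, K), F_hat (K, T)."""
    N, T = Y.shape
    # Step 1: one cross-sectional OLS per period, stacked into a J x T matrix
    Gamma = np.column_stack([
        np.linalg.lstsq(Phi[t], Y[:, t], rcond=None)[0] for t in range(T)
    ])
    # Step 2: PCA (via SVD) of the slopes after demeaning over time
    Gamma_bar = Gamma.mean(axis=1)
    U, s, Vt = np.linalg.svd(Gamma - Gamma_bar[:, None], full_matrices=False)
    B_hat = U[:, :K]                                    # orthonormal columns
    F_hat = B_hat.T @ Gamma                             # factors keep their means
    a_hat = Gamma_bar - B_hat @ (B_hat.T @ Gamma_bar)   # enforces a'B = 0
    return a_hat, B_hat, F_hat

# Toy data from the model with K = 1 (all numbers are made up)
rng = np.random.default_rng(0)
N, T, K = 2000, 24, 1
a_true = np.array([0.4, -0.3, 0.25])         # orthogonal to B_true by construction
B_true = np.array([[0.6], [0.8], [0.0]])     # unit length
f = 0.5 + rng.normal(size=(K, T))            # factor with a nonzero mean
Z = rng.uniform(-0.5, 0.5, size=(N, T))
Phi = np.stack([basis(Z[:, t]) for t in range(T)])           # T x N x J
Gamma_true = a_true[:, None] + B_true @ f                    # J x T
Y = np.einsum("tnj,jt->nt", Phi, Gamma_true) + 0.05 * rng.normal(size=(N, T))

a_hat, B_hat, F_hat = regressed_pca(Y, Phi, K)
```

Note that `B_hat` is identified only up to sign (more generally, rotation), so comparisons with `B_true` should use `abs(B_hat.T @ B_true)` or the fitted function values rather than raw coefficients.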

5. Comparison with IPCA — what is actually different

	IPCA (Kelly–Pruitt–Su 2019)	Regressed-PCA (CRW 2023)
Model	$y_{it} = z_{it}'\Gamma_\alpha + z_{it}'\Gamma_\beta f_t + \varepsilon$	$y_{it} = \alpha(z_{it}) + \beta(z_{it})'f_t + \varepsilon$
Functional form	linear in $z$	nonparametric $\alpha, \beta$
Estimation	minimise joint TS+XS squared error, iteratively	one-shot: regress, then PCA
Implicit objective	fit cross-section of average returns	fit time-series comovement (APT-flavoured)
Asymptotics	large $N$ and large $T$	large $N$, fixed $T$ OK
Empirical $K$	5 advocated	1 (linear) or 2 (nonlinear), data-selected
Take-away	characteristics ≈ loadings; small $\alpha$	characteristics carry both loadings and non-zero $\alpha$

Conceptual divergence: IPCA fits everything jointly so factors absorb whatever cross-sectional pattern characteristics suggest. CRW extracts factors that explain comovement first, then asks whether characteristics still predict average returns conditional on those factors. The answer is yes.

6. Selecting the number of factors $K$

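CRW propose an eigenvalue-ratio estimator that consistently selects $K$ as $N\to\infty$ for fixed $T$. Let $\tilde{\Gamma} = \hat{\Gamma}\,(I_T - 1_T 1_T'/T)$ be the slope matrix demeaned over time, the same matrix whose SVD is taken in the replication recipe of Section 10:

$$ \hat{K} \;=\; \arg\max_{1 \le k \le k_{\max}} \frac{\lambda_k(\tilde{\Gamma}\tilde{\Gamma}')}{\lambda_{k+1}(\tilde{\Gamma}\tilde{\Gamma}')}. $$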

The ratio spikes at $k = K$ because the $(K{+}1)$-th eigenvalue is "noise-sized". No need to pre-commit to "5 factors" (Fama-French) or "1 factor" (CAPM) — the data choose.
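A minimal sketch of the selector, under the assumption that the slope matrix is demeaned over time first (as the recipe in Section 10 does before its SVD). The function name `eigenvalue_ratio_k` and the simulated slopes are illustrative only.

```python
import numpy as np

def eigenvalue_ratio_k(Gamma, k_max):
    """Pick K as the argmax of adjacent eigenvalue ratios of the demeaned slopes."""
    Gamma_tilde = Gamma - Gamma.mean(axis=1, keepdims=True)   # demean over time
    # Squared singular values of Gamma_tilde are the eigenvalues of
    # Gamma_tilde @ Gamma_tilde.T, returned in descending order.
    lam = np.linalg.svd(Gamma_tilde, compute_uv=False) ** 2
    ratios = lam[:k_max] / lam[1:k_max + 1]
    return int(np.argmax(ratios)) + 1, ratios

rng = np.random.default_rng(1)
J, T, K_true = 8, 60, 2
B = np.linalg.qr(rng.normal(size=(J, K_true)))[0]             # orthonormal loadings
F = rng.normal(size=(K_true, T)) * np.array([[3.0], [2.0]])   # two strong factors
Gamma = 0.1 + B @ F + 0.05 * rng.normal(size=(J, T))          # a + B f_t + noise
K_hat, ratios = eigenvalue_ratio_k(Gamma, k_max=4)
```

`k_max` must stay below `min(J, T) - 1` so that the $(k_{\max}+1)$-th eigenvalue exists and is not numerically zero.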

7. Inference — weighted bootstrap

The asymptotic distribution of $(\hat{a}, \hat{B})$ depends on a data-dependent rotation matrix $H$. A naive bootstrap re-runs PCA in each replication, so every draw carries its own rotation and the bootstrap distribution no longer mimics that of the original estimator.

CRW's fix: enforce the same factor estimator $\hat{F}$ from the original sample in every bootstrap replication, so the rotation is held fixed.

Two tests follow:

Wald-type test of $\alpha(\cdot)\equiv 0$. Quadratic form in $\hat{a}$; critical value from the weighted bootstrap. Rejection ⟹ characteristics carry mispricing.
LR-type test of linearity of $\alpha(\cdot)$ or $\beta(\cdot)$. Compare restricted (linear) vs. unrestricted (B-spline) estimators. Critical trick: use the unrestricted $\hat{F}$ when computing the restricted estimator, so the rotation matches under null and alternative.

Weight distribution: $w_i \sim \text{Exp}(1)$ i.i.d.
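The bootstrap idea can be sketched as follows. This is a simplified illustration under stated assumptions, not CRW's exact procedure: holding $\hat{F}$ fixed is implemented here as a time-series regression of the reweighted slopes on the original $\hat{F}$ (which reproduces the original $\hat{a}$ and $\hat{B}$ when all weights equal one), and the test statistic is an unstudentised quadratic form $N\hat{a}'\hat{a}$ chosen for simplicity. All function names and numbers are hypothetical.

```python
import numpy as np

def basis(z):
    """Quadratic sieve for one characteristic: phi(z) = (1, z, z^2)."""
    return np.column_stack([np.ones_like(z), z, z**2])

def slopes(Y, Phi, w=None):
    """Step 1 as (weighted) least squares per period; w=None is plain OLS."""
    N, T = Y.shape
    sw = np.ones(N) if w is None else np.sqrt(w)
    return np.column_stack([
        np.linalg.lstsq(Phi[t] * sw[:, None], Y[:, t] * sw, rcond=None)[0]
        for t in range(T)
    ])                                                    # J x T

def fit_given_factors(Gamma, F):
    """Regress slopes on (1, F) over time with F held fixed; returns (a, B)."""
    Fc = F - F.mean(axis=1, keepdims=True)
    Gc = Gamma - Gamma.mean(axis=1, keepdims=True)
    B = Gc @ Fc.T @ np.linalg.inv(Fc @ Fc.T)
    a = Gamma.mean(axis=1) - B @ F.mean(axis=1)
    return a, B

# Toy data with a nonzero alpha (all numbers are made up)
rng = np.random.default_rng(0)
N, T, K = 400, 24, 1
a_true = np.array([0.4, -0.3, 0.25])
B_true = np.array([[0.6], [0.8], [0.0]])
f = 0.5 + rng.normal(size=(K, T))
Z = rng.uniform(-0.5, 0.5, size=(N, T))
Phi = np.stack([basis(Z[:, t]) for t in range(T)])        # T x N x J
Y = np.einsum("tnj,jt->nt", Phi, a_true[:, None] + B_true @ f)
Y += 0.2 * rng.normal(size=(N, T))

# Original-sample estimates
Gamma = slopes(Y, Phi)
U = np.linalg.svd(Gamma - Gamma.mean(axis=1, keepdims=True), full_matrices=False)[0]
B_hat = U[:, :K]
F_hat = B_hat.T @ Gamma
a_hat, B_check = fit_given_factors(Gamma, F_hat)   # B_check equals B_hat

# Weighted bootstrap: reweight stocks, redo Step 1, keep F_hat fixed (no new PCA)
n_boot = 200
stat = N * (a_hat @ a_hat)
stat_star = np.empty(n_boot)
for b in range(n_boot):
    w = rng.exponential(1.0, size=N)               # Exp(1) weights, one per stock
    a_b, _ = fit_given_factors(slopes(Y, Phi, w), F_hat)
    stat_star[b] = N * ((a_b - a_hat) @ (a_b - a_hat))
p_value = float(np.mean(stat_star >= stat))
```

Because the same `F_hat` enters every replication, the bootstrap draws `a_b` live in the same rotation as `a_hat`, which is the property the text above relies on.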

8. Empirical findings (US, 1968–2014)

Sample: the Kelly–Pruitt–Su (2019) panel, i.e. the Freyberger–Neuhierl–Weber (2020) data: 12,813 stocks × 36 characteristics, monthly, Sep 1968 to May 2014.

Specification	$K$ selected	Total $R^2$	Out-of-sample $R^2_O$	Pure-$\alpha$ Sharpe
Linear $\alpha,\beta$	1	comparable to IPCA-1	~0.54%	> 3
B-spline (1 internal knot)	2	comparable to IPCA-2	~0.59%	> 3
B-spline (2 internal knots)	2	comparable to IPCA-2	~0.57%	> 3
IPCA-5 (reference)	5 imposed	0.60%	0.60%	smaller

Four headline claims:

Few factors suffice for comovement. 1–2 latent factors capture as much time-series variation as IPCA's 5.
Mispricing is real and large. The $\alpha(z)\equiv 0$ test is rejected; Sharpe ratios above 3 on $\alpha$-portfolios.
Mispricing has declined. Rolling sub-samples show pure-$\alpha$ Sharpe drifting down over time.
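Nonlinearity matters. Specification tests reject linearity of both $\alpha(\cdot)$ and $\beta(\cdot)$ for most characteristics. Some characteristics that are insignificant under the linear specification become significant under the nonparametric one, and vice versa.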

9. Why this matters for practitioners

CLAIM 1 The factor zoo is mostly compressible — two latent factors plus flexible $\beta(\cdot)$ recover most of the cross-sectional risk story.
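CLAIM 2 Pure-$\alpha$ portfolios exist on paper: Sharpe ratios stay high even after accounting for the most plausible latent risk factors. Whether they are exploitable at institutional scale depends on transaction costs, which remain an open question (see Section 11).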
CLAIM 3 Specification testing is now a first-class tool. You can test, not assume, whether a linear factor model is enough.
CLAIM 4 The estimator is light. Step 1 is OLS; Step 2 is PCA. No iterative optimisation, no joint MLE. Runs in seconds on a 12k × 600 panel.

For Japan: to our knowledge, nobody has yet run this estimator on a full-universe JP panel. It is an open question whether you'd find 1, 2, or more factors, and whether $\alpha$-portfolios survive transaction costs in the JP market.

10. Replication recipe

INPUT:
  Y  ∈ ℝ^{N×T}     excess returns
  Z  ∈ ℝ^{N×T×M}   characteristics (lagged, rank-transformed to [-0.5, 0.5])
  φ  ∶ ℝ^M → ℝ^J   basis (start with (1, z); upgrade to B-splines for nonlinear)

STEP 1  (cross-sectional OLS each month)
  for t = 1..T:
      Φ_t  =  φ(Z_{·,t})                       # N × J
      Γ̂_t  = (Φ_t'Φ_t)^{-1} Φ_t' Y_{·,t}       # J
  Γ̂     = [Γ̂_1, …, Γ̂_T]                        # J × T

STEP 2  (PCA)
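  Γ̄     = mean of columns of Γ̂ (average over t)  # J
  Γ̃     = Γ̂ − Γ̄ · 1_T'                          # demeaned over time
  SVD: Γ̃ = U Σ V'                               # take top-K̂
  B̂     = U_{:,1:K̂}                             # J × K̂, B̂'B̂ = I
  F̂     = B̂' Γ̂                                  # K̂ × T, keeps factor means
  â     = (I_J − B̂B̂') Γ̄                         # J, imposes â'B̂ = 0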

OUTPUTS:
  α̂(z)  = φ(z)' â
  β̂(z)  = φ(z)' B̂
  f̂_t   = F̂_{·,t}

INFERENCE:
  For b = 1..B:
      w_i ∼ Exp(1) i.i.d.
      repeat Step 1 as weighted least squares with weights w_i; re-estimate (â, B̂) holding the original F̂ fixed (no new PCA)
      collect (â^{(b)}, B̂^{(b)})
  Build Wald stat for H_0: a = 0 → bootstrap p-value
  Compare nested specs for linearity test

About 200 lines of Python with numpy and scipy.sparse.linalg.eigsh.
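The rank transformation in the INPUT block can be sketched as below. The exact convention is an assumption: here ranks run from 0 to n−1 and are divided by n−1, so the smallest value maps to −0.5 and the largest to 0.5; other papers divide by n+1 instead. The function name `rank_transform` and the sample values are hypothetical.

```python
import numpy as np

def rank_transform(x):
    """Map one cross-section to ranks in [-0.5, 0.5]; NaNs stay NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    ok = ~np.isnan(x)
    n = int(ok.sum())
    if n > 1:
        # argsort of argsort gives 0-based ranks; ties are broken by position
        ranks = np.argsort(np.argsort(x[ok], kind="stable"), kind="stable")
        out[ok] = ranks / (n - 1) - 0.5
    elif n == 1:
        out[ok] = 0.0
    return out

bm = np.array([0.8, np.nan, 0.2, 1.5, 0.4])   # hypothetical book-to-market values
z = rank_transform(bm)
# z is [1/6, nan, -0.5, 0.5, -1/6]
```

The transform is applied separately to each characteristic in each month, so every cross-section has the same bounded support regardless of the raw scale.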

11. Open questions / extensions

What basis $\phi$ is best? CRW use $(1, z)$ and linear B-splines. Neural nets, kernel ridge, wavelets all candidates.
Time-varying $K$. Rolling sub-samples allow $K$ to change — a formal test for structural breaks in $K$ would be interesting.
Transaction costs. Sharpe-3 arbitrage portfolios churn a lot. Post-cost story is the obvious follow-up.
International evidence. US-only. Replicating on Japan, Europe, EM would test external validity. ← this is exactly what we are setting up to do.
Macro / state-variable extensions. The model allows $z_{it}$ to include macro state variables, not just firm characteristics. Mostly unexplored.

12. Reading order if you want to go deeper

Connor & Linton (2007) — original semiparametric factor model in finance.
Kelly, Pruitt & Su (2019) — IPCA; the immediate benchmark.
Freyberger, Neuhierl & Weber (2020) — nonparametric characteristic selection; source of the 36-char panel.
CRW (2023) — this paper.
Kim, Korajczyk & Neuhierl (2020) — alternative semiparametric estimator; triangulation.
Bai (2003), Bai & Ng (2002) — large-$N$, large-$T$ approximate factor model foundations.
Onatski (2010) — factor-number estimator based on differences of adjacent eigenvalues (CRW's eigenvalue-ratio selector is a cousin).
