Research Layer
Last updated: 2026-05-19
Overview
The research layer sits above the gold layer and answers: “which signals actually work, and how should I combine them into a strategy?” It consists of three components, deployed incrementally:
    Gold signals ──┐
                   ├──► Signal Validation (walk-forward) ──► Signal Scorecard      [IMPLEMENTED]
                   ├──► Meta-Model (ML combination)      ──► Combined Alpha Score  [PLANNED #1]
                   └──► Backtest Engine                  ──► Strategy PnL + Risk   [PLANNED #2]

1. Walk-Forward Signal Validation (Implemented)
Purpose
Continuously validate whether gold signals have predictive power over forward equity returns. Detects signal decay, regime sensitivity, and overfitting — so you know which signals to trust before trading on them.
Methodology
Expanding window approach:
    |------------ train (>= 252 days) ------------|-- test (63 days) --|
                                                  ^ window slides forward by 21 days

- Start with a minimum of 252 trading days of history
- Test the next 63 days (one quarter)
- Slide the train/test boundary forward by 21 days, so the training window grows and the test window moves with it
- Repeat until end of available data
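The steps above can be sketched as a small generator of index ranges. This is a minimal illustration, not the code in the ETL job: the `Window` type and `expanding_windows` function are hypothetical names, and the choice to emit only full 63-day test blocks is an assumption.

```python
from typing import Iterator, NamedTuple


class Window(NamedTuple):
    train_start: int  # inclusive index of the first training day
    train_end: int    # exclusive; also the index of the first test day
    test_end: int     # exclusive index of the end of the test block


def expanding_windows(
    n_days: int,
    min_train: int = 252,
    test_len: int = 63,
    step: int = 21,
) -> Iterator[Window]:
    """Yield expanding-window splits over n_days trading days."""
    train_end = min_train
    # Assumption: a window is emitted only if a full test block fits.
    while train_end + test_len <= n_days:
        # The training start stays fixed at 0, so the training set grows.
        yield Window(0, train_end, train_end + test_len)
        train_end += step


windows = list(expanding_windows(n_days=400))
for w in windows:
    print(w)
```

Because the step (21 days) is shorter than the test block (63 days), consecutive test blocks overlap, so per-window metrics are not independent observations.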
Metrics per window:
| Metric | Definition | Threshold |
|---|---|---|
| IC (Information Coefficient) | Spearman rank correlation: signal vs. forward return | \|IC\| > 0.03 is meaningful |
| Hit Rate | % of days where signal direction matched return direction | > 55% is meaningful |
| t-statistic | Statistical significance of IC | \|t\| > 1.96 (95% confidence) |
| Regime IC | IC computed within each regime subset (hiking/cutting/stress/etc.) | Same thresholds |
Signal health classification:
| Health | Criteria | Action |
|---|---|---|
| Active | Latest IC > 0.03 AND statistically significant | Use in strategies |
| Weakening | Historical avg IC > 0.02 but latest not significant | Monitor, reduce weight |
| Dead | No evidence of predictive power | Do not trade on this |
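The classification table can be read as a simple decision rule. The sketch below is one possible reading, with hypothetical function and parameter names. It assumes "statistically significant" means |t| > 1.96 and that anything matching neither the Active nor the Weakening criteria is Dead. Labels are lowercase to match the `signal_health != 'dead'` filter used in the scorecard query.

```python
def classify_signal(
    latest_ic: float,
    latest_t: float,
    hist_avg_ic: float,
    ic_min: float = 0.03,
    hist_min: float = 0.02,
    t_crit: float = 1.96,
) -> str:
    """Map the latest and historical IC figures to a health label."""
    significant = abs(latest_t) > t_crit
    if latest_ic > ic_min and significant:
        return "active"
    if hist_avg_ic > hist_min and not significant:
        return "weakening"
    # Assumption: every remaining case counts as no evidence of edge.
    return "dead"


print(classify_signal(latest_ic=0.05, latest_t=2.4, hist_avg_ic=0.04))
print(classify_signal(latest_ic=0.01, latest_t=0.8, hist_avg_ic=0.03))
print(classify_signal(latest_ic=0.00, latest_t=0.1, hist_avg_ic=0.00))
```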
What Gets Validated
All gold signal columns across three horizons (5d, 20d, 60d forward SPY return):
Yield Curve signal:
- DGS30, DGS10, DGS2, T10Y2Y levels
- dgs10_dgs2_spread, dgs30_dgs10_spread
- t10y2y_ma20, t10y2y_ma60
- dgs10_momentum (20d, 60d), dgs30_momentum (20d, 60d)
- Regime: fed_cycle (hiking/cutting/hold)
Risk Composite signal:
- risk_score, risk_score_smoothed
- VIXCLS_zscore, DTWEXBGS_zscore
- spy_momentum_20d_zscore, hyg_spy_ratio_zscore
- Regime: regime (risk_on/neutral/risk_off)
Credit Leverage signal:
- credit_stress_score, credit_stress_smoothed
- BAMLH0A0HYM2, BAMLC0A4CBBB, hy_bbb_spread_diff
- hy_spread_momentum_20d, bbb_spread_momentum_20d
- margin_debt_pct_chg_3m
- Regime: credit_regime (stress/elevated/neutral/benign)
Infrastructure
- ETL: `etl/research/walk_forward_validation.py`
- Compute: AWS Glue (3x G.1X workers)
- Schedule: Weekly, Sunday 08:00 UTC
- Output: `s3://{research_bucket}/validation/signal={name}/` + `validation/scorecard/`
- Trigger: EventBridge emits `SignalValidationComplete` when the scorecard is ready
How to Read the Scorecard
Query via Athena:
    SELECT signal_name, horizon_days, latest_ic, latest_hit_rate, signal_health
    FROM stratum_research_{env}.signal_scorecard
    WHERE signal_health != 'dead'
    ORDER BY latest_ic DESC;

Key questions it answers:
- “Which signals are currently predictive at the 20-day horizon?”
- “Has credit_stress_score lost its edge since the regime shifted to ‘hold’?”
- “Which signals work in a hiking cycle but die in a hold?”
2. ML Signal Combination — Meta-Model (Planned)
GitHub: Issue #1
Concept
Individual signals are noisy and fire at different times. The meta-model learns how to weight and time them across all gold signals, producing a single daily probability score.
Architecture
- Model: LightGBM (interpretable, handles mixed frequencies)
- Features: All gold signal columns aligned on date
- Target: Forward SPY returns at multiple horizons, or binary drawdown indicator (> 5% in next 20 days)
- Training: Walk-forward (expanding window) — never uses future data
- Output: Daily `combined_alpha` score + feature importances
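To make the binary target concrete, here is a minimal sketch of how a drawdown label could be built from a close-price series. The function name is hypothetical, and the definition is an assumption: the label is 1 when the lowest close in the next 20 trading days is more than 5% below today's close. The source does not specify whether the drawdown is measured from today's close or from a running peak.

```python
from typing import Optional, Sequence


def drawdown_labels(
    close: Sequence[float],
    horizon: int = 20,
    threshold: float = 0.05,
) -> list[Optional[int]]:
    """Label each day 1 if price falls more than `threshold` within `horizon` days."""
    labels: list[Optional[int]] = []
    for t in range(len(close)):
        future = close[t + 1 : t + 1 + horizon]
        if len(future) < horizon:
            # Not enough future data to know the outcome; leave unlabeled.
            labels.append(None)
            continue
        worst = min(future) / close[t] - 1.0
        labels.append(1 if worst < -threshold else 0)
    return labels


# Toy prices and a short horizon, for illustration only.
prices = [100.0, 99.0, 94.0, 97.0, 101.0, 102.0]
labels = drawdown_labels(prices, horizon=2)
print(labels)
```

Each label looks up to `horizon` days ahead, so in a walk-forward setup the rows whose label window reaches into the test period should be dropped from training; otherwise the training set carries information from the future.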
Why Not a Simple Average?
- Signals have different optimal horizons (credit stress leads by 2-4 weeks; VIX is contemporaneous)
- Signal importance changes by regime (rates momentum matters more during hiking)
- Non-linear interactions (credit stress + rising rates = multiplicative risk)
3. Strategy Backtesting Framework (Planned)
GitHub: Issue #2
Concept
Translate validated signals into simulated portfolio decisions with realistic constraints.
Architecture
- Engine: Event-driven loop (iterates day by day, no lookahead)
- Inputs: Gold signals + meta-model + Yahoo price histories
- Assets: SPY, TLT, GLD, cash (expandable)
- Constraints: Transaction costs, slippage, position limits
- Output: Equity curve, drawdown, Sharpe, Calmar, turnover
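Since the backtester is still planned, the following is only a sketch of what a day-by-day loop without lookahead might look like. All names are hypothetical. It makes several simplifying assumptions: inputs are daily returns rather than raw prices, cash earns zero, the portfolio is rebalanced to the target weights every day, and costs are a flat charge in basis points on turnover.

```python
from typing import Callable, Mapping, Sequence

Strategy = Callable[[Mapping[str, float]], Mapping[str, float]]


def run_backtest(
    returns: Sequence[Mapping[str, float]],
    signals: Sequence[Mapping[str, float]],
    strategy: Strategy,
    cost_bps: float = 5.0,
) -> list[float]:
    """Return the equity curve, starting at 1.0 and fully in cash."""
    equity = [1.0]
    weights: Mapping[str, float] = {}
    for t in range(1, len(returns)):
        # Decide with signals known at the close of day t-1 only.
        target = strategy(signals[t - 1])
        assets = set(weights) | set(target)
        turnover = sum(abs(target.get(a, 0.0) - weights.get(a, 0.0)) for a in assets)
        cost = turnover * cost_bps / 1e4
        # Any weight not allocated to an asset is implicit cash earning 0.
        gross = sum(w * returns[t].get(a, 0.0) for a, w in target.items())
        equity.append(equity[-1] * (1.0 - cost) * (1.0 + gross))
        weights = target
    return equity


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough decline, as a negative fraction."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


def half_cash_on_stress(row: Mapping[str, float]) -> Mapping[str, float]:
    # Assumption: the baseline position is 100% SPY.
    return {"SPY": 0.5} if row["credit_stress_score"] > 1.0 else {"SPY": 1.0}


# Toy inputs for illustration only.
toy_returns = [{"SPY": 0.0}, {"SPY": 0.01}, {"SPY": -0.02}]
toy_signals = [
    {"credit_stress_score": 0.2},
    {"credit_stress_score": 1.5},
    {"credit_stress_score": 0.3},
]
curve = run_backtest(toy_returns, toy_signals, half_cash_on_stress)
print(curve, max_drawdown(curve))
```

Indexing the signals at `t - 1` while applying returns at `t` is what keeps the loop free of lookahead: a position is never sized using data from the day it earns its return.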
Example Strategies to Test
- Risk-off on credit stress: Go 50% cash when `credit_stress_score > 1.0`
- Rates + leverage squeeze: Short equities when `dgs30_momentum_60d > 50bps` AND `credit_regime == "stress"`
- Combined alpha threshold: Size positions proportional to meta-model `combined_alpha`
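The first two rules could be written as functions that map one day's signal row to target portfolio weights. This is a sketch with hypothetical names and several assumptions the source does not settle: the baseline position is 100% SPY, the short is sized at -100% SPY, and `dgs30_momentum_60d` is stored in percentage points so that 50 bps corresponds to 0.50.

```python
from typing import Mapping, Union

Row = Mapping[str, Union[float, str]]


def risk_off_on_credit_stress(row: Row, threshold: float = 1.0) -> dict[str, float]:
    """Hold 50% SPY / 50% cash when credit stress is elevated."""
    if float(row["credit_stress_score"]) > threshold:
        return {"SPY": 0.5}  # the unallocated 50% is implicit cash
    return {"SPY": 1.0}      # assumption: fully invested otherwise


def rates_leverage_squeeze(row: Row, momentum_min_pct: float = 0.50) -> dict[str, float]:
    """Short equities when long rates are rising and credit is in stress."""
    # Assumption: momentum is in percentage points (0.50 = 50 bps).
    rising = float(row["dgs30_momentum_60d"]) > momentum_min_pct
    stressed = row["credit_regime"] == "stress"
    if rising and stressed:
        return {"SPY": -1.0}  # assumption: short sized at 100%
    return {"SPY": 1.0}


calm = {"credit_stress_score": 0.4, "dgs30_momentum_60d": 0.10, "credit_regime": "neutral"}
squeeze = {"credit_stress_score": 1.8, "dgs30_momentum_60d": 0.65, "credit_regime": "stress"}
print(risk_off_on_credit_stress(calm), risk_off_on_credit_stress(squeeze))
print(rates_leverage_squeeze(calm), rates_leverage_squeeze(squeeze))
```

Keeping each rule as a pure function of a single row makes it hard to introduce lookahead by accident, since the rule never sees more than the day it is evaluated on.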
Dependency Chain
    Walk-forward validation (which signals to trust)
            │
            ▼
    Meta-model (how to combine trusted signals)
            │
            ▼
    Backtester (what would this combination have returned)

Each layer feeds the next. Walk-forward tells you which inputs to give the meta-model. The meta-model produces the score the backtester trades on.