SOPHIE Daddy Quant Blog - Stock & Options Analysis

1. Evolution of Statistical Arbitrage

Statistical arbitrage (stat arb) is a heavily quantitative framework that exploits temporary pricing inefficiencies across diversified portfolios. Originating in the 1980s with pairs trading, it relies on isolating idiosyncratic components of asset returns by neutralizing market and factor risks.

Once isolated, these residual prices often exhibit mean-reverting characteristics—drifting away from, and eventually returning to, a long-term historical equilibrium.

The Past: Distance Metrics

Early strategies ranked candidate pairs by the sum of squared (Euclidean) distances between their normalized historical prices (e.g., Gatev et al., 2006). These generated large early returns but suffered severe alpha decay as markets became more efficient and the trades more crowded, culminating in the August 2007 "quant quake."
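As a rough illustration of the distance approach, the sketch below normalizes each price series to start at 1.0 and ranks pairs by the sum of squared differences. The tickers and prices are made up, and the function names are my own, not taken from any paper or library.

```python
from itertools import combinations

def normalize(prices):
    """Rescale a price series so it starts at 1.0."""
    base = prices[0]
    return [p / base for p in prices]

def ssd(prices_a, prices_b):
    """Sum of squared differences between two normalized price paths."""
    a, b = normalize(prices_a), normalize(prices_b)
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical closing prices, for illustration only.
prices = {
    "AAA": [10.0, 10.2, 10.1, 10.4],
    "BBB": [20.0, 20.4, 20.2, 20.8],  # moves in lockstep with AAA
    "CCC": [5.0, 4.9, 5.3, 5.0],
}

# Rank all candidate pairs from closest to farthest.
ranked = sorted(combinations(prices, 2),
                key=lambda pair: ssd(prices[pair[0]], prices[pair[1]]))
best_pair = ranked[0]
print(best_pair)
```

The closest pair becomes the trading candidate; the spread between its two normalized prices is what the strategy bets will converge.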

The Present: Advanced Models

Modern stat arb relies on highly sophisticated, multi-asset factor models utilizing deep learning architectures, strict transaction cost constraints, and rigorous validation to prevent overfitting.

2. Major Factors in Quant Trading

Factor investing targets quantifiable traits that explain cross-sectional variation in expected returns, shifting away from discretionary stock picking. It evolved from the single-factor CAPM (Market Risk) to the Fama-French Three-Factor and eventually Five-Factor models.

R_i,t - R_f,t = α_i + β₁(R_m,t - R_f,t) + β₂·SMB_t + β₃·HML_t + β₄·RMW_t + β₅·CMA_t + ε_i,t

The Fama-French Five-Factor Time-Series Regression Equation

Factor	Acronym	Economic Rationale
Market Risk	Rm - Rf	Baseline compensation for bearing general equity market risk.
Size	SMB	Small Minus Big. Smaller firms are less liquid and carry higher distress risk, demanding a premium.
Value	HML	High Minus Low (book-to-market). Cheap, high book-to-market stocks tend to re-rate upward as overly pessimistic sentiment mean-reverts.
Profitability	RMW	Robust Minus Weak. Highly profitable firms with stable earnings are less susceptible to shocks.
Investment	CMA	Conservative Minus Aggressive. Firms that overinvest tend to misallocate capital.
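The time-series regression above can be sketched with plain ordinary least squares. This is a minimal illustration on simulated data, assuming the factor returns are already available as rows of numbers; the helper names (`solve`, `factor_regression`) and every number are hypothetical.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def factor_regression(excess_returns, factors):
    """OLS of excess returns on factor returns via the normal equations.
    factors holds one row of factor returns per period.
    Returns (alpha, betas, residuals)."""
    X = [[1.0] + list(row) for row in factors]  # prepend the intercept column
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, excess_returns)) for i in range(k)]
    coef = solve(xtx, xty)
    resid = [y - sum(c * v for c, v in zip(coef, r)) for r, y in zip(X, excess_returns)]
    return coef[0], coef[1:], resid

# Simulated data: five factor series (Mkt-Rf, SMB, HML, RMW, CMA) and one asset.
rng = random.Random(0)
true_alpha, true_betas = 0.001, [1.1, 0.4, -0.2, 0.3, 0.1]
factors = [[rng.gauss(0, 0.03) for _ in true_betas] for _ in range(120)]
excess = [true_alpha + sum(b * f for b, f in zip(true_betas, row)) + rng.gauss(0, 0.001)
          for row in factors]

alpha_hat, betas_hat, resid = factor_regression(excess, factors)
print(alpha_hat, betas_hat)
```

The residual series is the idiosyncratic component ε_i,t. Its cumulative sum is the kind of factor-neutral series that stat arb strategies go on to test for mean reversion.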

Key Takeaway: Beyond Fama-French, modern funds heavily blend Momentum (winning assets keep winning, acting as a counterbalance to mean reversion) and Low Volatility / Quality factors to maximize risk-adjusted returns across shifting economic regimes.

3. Advanced Extraction Models

Traditional extraction uses static PCA to decompose returns into systematic and idiosyncratic (residual) components. The residual portfolio is built to have zero beta to the selected factors, which insulates it from shocks that act through those factors and leaves a component that often mean-reverts. However, static loadings ignore the fact that a firm's risk exposures change over time.

IPCA (Instrumented PCA)

Introduces observable firm characteristics as instrumental variables to estimate time-varying factor loadings. This lets the model attribute a characteristic's return predictability to either risk-factor exposure (beta) or an anomaly intercept (alpha).

Deep Learning & Attention

Bypasses the traditional two-step process. Attention Factor Models use CNNs and transformers to jointly learn tradable factors and portfolio policies in a single step, explicitly maximizing out-of-sample Sharpe ratios after transaction costs.

4. The Ornstein-Uhlenbeck Framework

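To systematically trade the extracted factor-neutral residual, quants model the cumulative residual as an Ornstein-Uhlenbeck (OU) process. It combines a deterministic drift that pulls the process toward its mean with a continuous random shock that keeps it from ever settling there. In the equation below, κ is the speed of mean reversion, μ is the long-run mean, σ is the volatility of the shocks, and W_t is a standard Brownian motion.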

dX_t = κ(μ - X_t)dt + σ dW_t

Continuous-Time Stochastic Differential Equation (SDE)
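To make the SDE concrete, here is a minimal sketch that simulates an OU path with its exact discretization and then recovers the parameters from an AR(1) regression of X at step t+1 on X at step t. The parameter values, time step, and function names are assumptions chosen for illustration.

```python
import math
import random

def simulate_ou(kappa, mu, sigma, x0, dt, n_steps, seed=0):
    """Simulate an OU path using the exact discrete-time transition."""
    rng = random.Random(seed)
    decay = math.exp(-kappa * dt)
    step_sd = sigma * math.sqrt((1 - decay ** 2) / (2 * kappa))
    path = [x0]
    for _ in range(n_steps):
        path.append(mu + (path[-1] - mu) * decay + step_sd * rng.gauss(0, 1))
    return path

def fit_ou(path, dt):
    """Fit X[t+1] = a + b X[t] + noise by OLS and map back to OU parameters.
    Assumes the fitted slope lies strictly between 0 and 1."""
    x, y = path[:-1], path[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    resid_var = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    kappa = -math.log(b) / dt                      # mean-reversion speed
    mu = a / (1 - b)                               # long-run mean
    sigma_eq = math.sqrt(resid_var / (1 - b * b))  # equilibrium standard deviation
    return kappa, mu, sigma_eq

dt = 1 / 252  # one trading day, in years
path = simulate_ou(kappa=20.0, mu=0.0, sigma=0.3, x0=0.0, dt=dt, n_steps=5000)
kappa_hat, mu_hat, sigma_eq_hat = fit_ou(path, dt)
half_life_days = math.log(2) / kappa_hat / dt
print(kappa_hat, mu_hat, sigma_eq_hat, half_life_days)
```

The half-life, ln(2)/κ, is often the more intuitive number: it says how many periods the residual needs, on average, to close half of its gap to the mean.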

The s-score (Avellaneda-Lee Framework)

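Standardizes trading signals across assets by measuring the distance of the residual from its equilibrium mean, scaled by the equilibrium standard deviation σ_eq = σ / √(2κ). The modified score below also subtracts a drift term, α / (κσ_eq), to account for the stock's expected excess return relative to its factors.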

s_mod,i = (X_i,t - μ_i) / σ_eq,i - α_i / (κ_iσ_eq,i)

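Entry: Open a long when the s-score falls below -1.25; open a short when it rises above +1.25.
Exit: Close a short when the s-score falls back below +0.75; close a long when it rises back above -0.50.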
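The entry and exit rules above amount to a small state machine. The sketch below is one way to write it, assuming a position code of +1 (long the residual), -1 (short), or 0 (flat); the function names and the sample scores are hypothetical.

```python
def s_score(x, mu, sigma_eq, alpha=0.0, kappa=1.0):
    """Modified s-score: standardized distance from the mean, less the drift term."""
    return (x - mu) / sigma_eq - alpha / (kappa * sigma_eq)

def next_position(position, s, open_at=1.25, close_short_at=0.75, close_long_at=-0.50):
    """Return the new position (+1 long, -1 short, 0 flat) given the current s-score."""
    if position == 0:
        if s < -open_at:
            return 1    # residual is cheap: go long
        if s > open_at:
            return -1   # residual is rich: go short
        return 0
    if position == 1 and s > close_long_at:
        return 0        # long has mostly converged
    if position == -1 and s < close_short_at:
        return 0        # short has mostly converged
    return position

# Walk a made-up sequence of s-scores through the rules.
scores = [0.3, 1.4, 1.0, 0.6, -1.3, -0.8, -0.4]
positions = []
pos = 0
for s in scores:
    pos = next_position(pos, s)
    positions.append(pos)
print(positions)  # → [0, -1, -1, 0, 1, 1, 0]
```

Note the asymmetry: entries wait for a large deviation, while exits trigger before the score reaches zero, so the strategy does not wait for full convergence.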

5. The Marriott-Pope Effect

Empirical estimation of the mean-reversion speed via Ordinary Least Squares (OLS) in finite samples has a severe flaw. For a positively autocorrelated series, the OLS estimate of the autoregressive coefficient b is biased downward, and the bias grows as the sample shrinks.

The mean-reversion speed is recovered from that coefficient as κ = -ln(b) / Δt, so a coefficient that is too low produces a speed that is too high. The model "hallucinates" that the residual will revert much faster than it actually will.

This causes algorithms to misclassify slow-reverting residuals as fast, highly profitable opportunities. Positions then hit their time stops before the spread has converged, and the losses are realized. Advanced practitioners use non-linear corrections or bootstrap methods to debias estimators.
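The bias is easy to see in a small Monte Carlo experiment. The sketch below simulates many short AR(1) samples with a known coefficient and averages the OLS estimates; the sample size, trial count, and function names are assumptions. It also prints the well-known first-order approximation of the bias, -(1 + 3b)/n, for comparison only.

```python
import math
import random

def ar1_ols_slope(x):
    """OLS slope from regressing x[t+1] on x[t] with an intercept."""
    lag, lead = x[:-1], x[1:]
    n = len(lag)
    m_lag, m_lead = sum(lag) / n, sum(lead) / n
    cov = sum((a - m_lag) * (b - m_lead) for a, b in zip(lag, lead))
    var = sum((a - m_lag) ** 2 for a in lag)
    return cov / var

def mean_ols_slope(true_b, n_obs, n_trials, seed=0):
    """Average OLS slope across many simulated AR(1) samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        # Start each sample from the stationary distribution.
        x = [rng.gauss(0, 1) / math.sqrt(1 - true_b ** 2)]
        for _ in range(n_obs - 1):
            x.append(true_b * x[-1] + rng.gauss(0, 1))
        total += ar1_ols_slope(x)
    return total / n_trials

true_b, n_obs = 0.9, 60
avg_b = mean_ols_slope(true_b, n_obs, n_trials=2000)
first_order = true_b - (1 + 3 * true_b) / n_obs  # rough analytical benchmark

true_speed = -math.log(true_b)     # per-period mean-reversion speed
implied_speed = -math.log(avg_b)   # speed implied by the biased estimate
print(avg_b, first_order, true_speed, implied_speed)
```

With only 60 observations the average estimate lands visibly below the true 0.9, and the implied reversion speed is correspondingly too fast.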

6. Execution Dynamics

Mean-reversion alpha is fragile. Transaction costs—slippage and market impact—can easily destroy a profitable backtest. When an algorithm executes a large order, it consumes liquidity and moves the price against itself.

The Square-Root Law of Market Impact

The expected price impact of an order is proportional to the asset's volatility and to the square root of the order size, normalized by typical traded volume.

Δp = Y · σ · √(Q / V)

Where Δp = expected price impact, Y = an empirical constant of order one, σ = daily volatility, Q = total order quantity, V = average daily volume
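A quick worked example shows how the formula behaves as orders grow. In this sketch Y is set to 1.0 purely as an assumption, the impact is read as a fraction of price, and the total cost treats Δp as the average slippage paid on every share, which is a simplification. All inputs are made up.

```python
import math

def sqrt_impact(order_qty, adv, daily_vol, y=1.0):
    """Expected impact as a fraction of price: Y * sigma * sqrt(Q / V)."""
    return y * daily_vol * math.sqrt(order_qty / adv)

def impact_cost(order_qty, adv, daily_vol, price, y=1.0):
    """Rough total cost in currency, charging the impact on every share."""
    return order_qty * price * sqrt_impact(order_qty, adv, daily_vol, y)

adv, vol, price = 1_000_000, 0.02, 50.0

small = sqrt_impact(10_000, adv, vol)   # 1% of daily volume
large = sqrt_impact(40_000, adv, vol)   # 4% of daily volume

cost_small = impact_cost(10_000, adv, vol, price)
cost_large = impact_cost(40_000, adv, vol, price)
print(small, large, cost_large / cost_small)
```

Quadrupling the order only doubles the per-share impact, but the total cost rises roughly eightfold, because cost scales with Q × √Q = Q^1.5.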

Key Takeaway: The square-root function is concave, so per-share impact grows more slowly than order size. Total execution cost is order size times impact, however, so it grows roughly with Q^1.5, faster than the capital deployed. Scaling AUM therefore erodes net alpha and imposes a practical capacity ceiling. Algorithms must balance temporary impact (immediate liquidity costs) against timing risk (mispricing correcting before order completion).

7. Rigorous Research Practices

The most pervasive failure point in quantitative finance is backtest overfitting—fine-tuning parameters to historical noise.

Winsorization & Outliers

Returns have fat tails. Winsorization mitigates outliers by capping them at specific percentiles (e.g., 5th and 95th) rather than deleting them, preserving time-series continuity while dampening black-swan distortions.
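A minimal winsorization sketch, assuming linear-interpolation percentiles and the 5th/95th cutoffs mentioned above. The helper names and the sample returns are hypothetical.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile of a sorted list, with q in [0, 100]."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def winsorize(values, lower_pct=5.0, upper_pct=95.0):
    """Cap values at the given percentiles; order and length are preserved."""
    s = sorted(values)
    lo, hi = percentile(s, lower_pct), percentile(s, upper_pct)
    return [min(max(v, lo), hi) for v in values]

# Made-up daily returns with one extreme loss and one extreme gain.
returns = [-0.50] + [i / 1000 for i in range(-9, 10)] + [0.60]
clipped = winsorize(returns)
print(min(clipped), max(clipped), len(clipped))
```

Both extreme observations stay in the series at capped values, so downstream rolling windows and lagged regressions keep their alignment.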

Combinatorial Purged Cross-Validation (CPCV)

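Standard cross-validation leaks information in financial time series because labels overlap in time and observations are serially correlated. CPCV addresses this via Purging (dropping training observations whose labels overlap the test set) and Embargoing (leaving a dead zone after each test set). It also evaluates every combination of test groups, producing a distribution of out-of-sample results rather than a single backtest path.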
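The index bookkeeping behind CPCV can be sketched in a few lines. This is a simplified illustration, not a full implementation: it purges a fixed number of observations on each side of a test block instead of using actual label spans, and the function name and arguments are my own.

```python
from itertools import combinations

def cpcv_splits(n_obs, n_groups=6, n_test_groups=2, purge=0, embargo=0):
    """Yield (train_idx, test_idx) for every combination of test groups.
    purge:   observations dropped from training on each side of a test block.
    embargo: extra observations dropped from training right after a test block."""
    bounds = [(g * n_obs // n_groups, (g + 1) * n_obs // n_groups)
              for g in range(n_groups)]
    for test_groups in combinations(range(n_groups), n_test_groups):
        test, blocked = set(), set()
        for g in test_groups:
            start, stop = bounds[g]
            test.update(range(start, stop))
            # Block the test block itself plus the purge and embargo zones.
            blocked.update(range(max(0, start - purge),
                                 min(n_obs, stop + purge + embargo)))
        train = [i for i in range(n_obs) if i not in blocked]
        yield train, sorted(test)

splits = list(cpcv_splits(120, n_groups=6, n_test_groups=2, purge=3, embargo=2))
first_train, first_test = splits[0]
print(len(splits), len(first_train), len(first_test))
```

With 6 groups and 2 test groups per split there are 15 train/test combinations, which can be recombined into 5 full backtest paths. Each model is fit on the training indices only and scored on the matching test indices.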

Deflated Sharpe Ratio (DSR)

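Corrects the traditional Sharpe Ratio for non-normality (skewness/kurtosis), sample length, and selection bias (multiple testing). The DSR is expressed as a probability: the confidence that the observed Sharpe ratio exceeds what the best of many unskilled trials would produce by chance. If it falls below a chosen confidence level such as 95%, the strategy is rejected as a statistical illusion.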
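Here is a hedged sketch of the DSR calculation, following the formula published by Bailey and López de Prado as I understand it. The Sharpe ratio is per period (not annualized), kurtosis is the raw value (3 for a normal distribution), and the number of trials and the variance of Sharpe ratios across trials are inputs the researcher must supply. The function names and every input value are assumptions.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329

def expected_max_sharpe(n_trials, sr_variance):
    """Approximate expected maximum Sharpe ratio among n_trials unskilled
    strategies. Requires n_trials >= 2."""
    z = NormalDist()
    return math.sqrt(sr_variance) * (
        (1 - EULER_GAMMA) * z.inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * z.inv_cdf(1 - 1 / (n_trials * math.e))
    )

def deflated_sharpe(sr, n_obs, skew, kurt, n_trials, sr_variance):
    """Probability that the observed Sharpe ratio beats the expected maximum
    of the trials, adjusted for skewness, kurtosis, and sample length."""
    sr0 = expected_max_sharpe(n_trials, sr_variance)
    denom = math.sqrt(1 - skew * sr + (kurt - 1) / 4 * sr ** 2)
    return NormalDist().cdf((sr - sr0) * math.sqrt(n_obs - 1) / denom)

# Same backtest, judged after 2 trials versus after 100 trials.
common = dict(sr=0.10, n_obs=1250, skew=-0.5, kurt=5.0, sr_variance=0.0005)
dsr_few = deflated_sharpe(n_trials=2, **common)
dsr_many = deflated_sharpe(n_trials=100, **common)
print(dsr_few, dsr_many)
```

The same backtest clears a 95% bar when it came from two attempts but not when it was the best of a hundred, which is the whole point of deflating.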

The Pinnacle of Alpha Generation

Elite quantitative practitioners must navigate advanced econometrics, transaction constraints, and the gauntlet of overfitting prevention to extract true market-neutral alpha.

Based on academic research and institutional quantitative frameworks.

Quantitative Trading of Mean Reversion