Cointegration identifies pairs of assets that share a stable long-run relationship — they may diverge in the short term, but revert to a stable spread over time. It is the mathematical foundation of pairs trading and statistical arbitrage, distinguishing temporary mispricing from permanent divergence.
Section 1: Core Mechanics
A cointegrated pair of assets has a property that correlation alone cannot confirm: their price spread is mean-reverting. You can subtract one from the other (with the right hedge ratio) and get a stationary series — one with a stable mean and variance over time. This is what makes pairs trading viable as a systematic strategy.
The critical distinction between correlation and cointegration: Two assets can be highly correlated (both rise in bull markets) but not cointegrated (their spread widens without reverting). Cointegrated assets share a common stochastic trend — they are mathematically linked in a way that prevents indefinite divergence.
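A small simulation can make the distinction concrete. The sketch below uses NumPy only; the seed, sample size, and noise scales are arbitrary assumptions chosen for illustration. It builds one pair whose returns are highly correlated but whose spread is a random walk, and one pair that shares a single stochastic trend.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 2000

# Pair 1: returns share a large common component, but each price also
# accumulates its own independent shocks, so the spread is a random walk.
common = rng.normal(0.0, 1.0, n)
a1 = 100 + np.cumsum(common + rng.normal(0.0, 0.3, n))
b1 = 100 + np.cumsum(common + rng.normal(0.0, 0.3, n))

# Pair 2: both prices ride one shared stochastic trend and deviate from it
# only by stationary noise, so the spread is stationary.
trend = 100 + np.cumsum(rng.normal(0.0, 1.0, n))
a2 = trend + rng.normal(0.0, 1.0, n)
b2 = trend + rng.normal(0.0, 1.0, n)

def return_corr(x, y):
    """Pearson correlation of first differences (simple price changes)."""
    return np.corrcoef(np.diff(x), np.diff(y))[0, 1]

corr1, corr2 = return_corr(a1, b1), return_corr(a2, b2)
spread1, spread2 = a1 - b1, a2 - b2
print(f"pair 1: return corr {corr1:.2f}, spread std {spread1.std():.2f}")
print(f"pair 2: return corr {corr2:.2f}, spread std {spread2.std():.2f}")
```

By construction, the population return correlation is about 0.92 for pair 1 and about 0.33 for pair 2, yet only pair 2 has a spread that stays in a fixed band. The sample numbers will vary with the seed.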
Formula
The Johansen Test checks for cointegration among N time series. The two test variants are:
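Trace Statistic:
$$\lambda_{\text{trace}}(r) = -T \sum_{i=r+1}^{N} \ln\left(1 - \hat{\lambda}_i\right)$$
Maximum Eigenvalue Statistic:
$$\lambda_{\max}(r, r+1) = -T \ln\left(1 - \hat{\lambda}_{r+1}\right)$$
Where $T$ is the sample size, $\hat{\lambda}_i$ are the ordered eigenvalues from the reduced-rank regression, and $r$ is the number of cointegrating vectors being tested (null hypothesis: r = 0 means no cointegration).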
If the test statistic exceeds the critical value at the 95% confidence level, reject the null hypothesis of no cointegration.
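As a worked illustration of the two statistics, the sketch below computes them from a set of eigenvalues: the trace statistic sums -T·ln(1 - λ) over all eigenvalues after the first r, and the maximum eigenvalue statistic uses only the (r+1)-th. The function name and the eigenvalues are hypothetical, not taken from real data.

```python
import math

def johansen_stats(eigenvalues, T, r):
    """Trace and max-eigenvalue statistics for the null of r cointegrating vectors.

    eigenvalues: sorted in descending order, each in [0, 1).
    T: sample size.
    """
    trace = -T * sum(math.log(1.0 - lam) for lam in eigenvalues[r:])
    max_eig = -T * math.log(1.0 - eigenvalues[r])
    return trace, max_eig

# Hypothetical eigenvalues for a two-asset system (illustrative only)
eigs = [0.08, 0.01]
trace0, max0 = johansen_stats(eigs, T=252, r=0)
trace1, max1 = johansen_stats(eigs, T=252, r=1)
print(f"r=0: trace={trace0:.3f}, max-eig={max0:.3f}")
print(f"r=1: trace={trace1:.3f}, max-eig={max1:.3f}")
```

Each statistic is then compared with its tabulated critical value for the chosen confidence level and deterministic-term specification. For the last testable rank, the two statistics coincide because only one eigenvalue remains in the sum.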
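For a pair of two assets, the Engle-Granger two-step method is simpler: run an OLS regression of Price_A on Price_B, then test the residuals for stationarity with an ADF unit root test. If the residuals are stationary, the pair is cointegrated. Because the residuals come from an estimated regression, compare the test statistic against Engle-Granger (MacKinnon) critical values, which are more negative than the standard ADF ones.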
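The two steps can be sketched with NumPy alone. This is a minimal illustration, not a replacement for a library routine: it uses a Dickey-Fuller regression with no lagged differences, the function names are hypothetical, and the data is simulated so that the true hedge ratio is 1.5. The 5% critical value of roughly -3.34 for two series with a constant is quoted from MacKinnon's tables as an approximation.

```python
import numpy as np

def ols(y, X):
    """Least-squares coefficients and residuals for y = X @ beta + e."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def engle_granger_tstat(price_a, price_b):
    """Step 1: regress A on B. Step 2: Dickey-Fuller t-statistic on the residuals."""
    a = np.asarray(price_a, dtype=float)
    b = np.asarray(price_b, dtype=float)
    X = np.column_stack([np.ones_like(b), b])
    (alpha, hedge_ratio), resid = ols(a, X)

    # Dickey-Fuller regression without lags: d_resid[t] = rho * resid[t-1] + error
    lagged = resid[:-1]
    delta = np.diff(resid)
    rho = (lagged @ delta) / (lagged @ lagged)
    err = delta - rho * lagged
    se = np.sqrt((err @ err) / (len(delta) - 1) / (lagged @ lagged))
    return hedge_ratio, rho / se

# Simulated cointegrated pair: A = 5 + 1.5 * B + stationary noise
rng = np.random.default_rng(1)
price_b = 50 + np.cumsum(rng.normal(0.0, 1.0, 1000))
price_a = 5 + 1.5 * price_b + rng.normal(0.0, 1.0, 1000)

hedge_ratio, t_stat = engle_granger_tstat(price_a, price_b)
APPROX_CRIT_5PCT = -3.34  # approximate Engle-Granger 5% value, two series, constant
print(f"hedge ratio {hedge_ratio:.3f}, t-stat {t_stat:.1f}, "
      f"cointegrated: {t_stat < APPROX_CRIT_5PCT}")
```

On real prices the residuals are usually autocorrelated, so an augmented version with lagged differences (as in statsmodels' `adfuller` or `coint`) is the safer choice.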
Inputs
- Two (or more) price series: Must be non-stationary I(1) processes — confirm with ADF unit root test first. If the series are stationary (I(0)), cointegration is not applicable.
- Lookback window: Minimum 252 days; 2–5 years preferred for stable estimates
- Lag order for VECM: Selected using AIC/BIC criterion in the Vector Error Correction Model
Parameters
| Parameter | Default | Range | Impact |
|---|---|---|---|
| Lookback period | 252 days | 252–1260 days | Longer = more statistically reliable but more likely to span a structural break; shorter = adapts faster but gives a less reliable test |
| Confidence level | 95% | 90%, 95%, 99% | Higher confidence = fewer pairs identified; greater certainty for those found |
| Lag order (k) | AIC-selected | 1–10 lags | Incorrect lag order invalidates the test — always use information criterion |
Output
- Test decision: Reject the null (r = 0) = cointegration exists; fail to reject = no evidence of cointegration
- Hedge ratio: The slope coefficient from OLS regression of Price_A on Price_B — how many units of B to sell for each unit of A held long
- Spread series: Price_A minus hedge_ratio times Price_B — should be stationary if cointegrated
- Z-score: Standardized spread — (spread - mean) / standard deviation of spread
Visual Behavior
Plot the Z-score of the spread as a separate panel. A stationary Z-score oscillates around 0 within a stable band. Entry zones appear when the Z-score breaches ±2.0. Exit signals appear when the Z-score returns to the ±0.5 band around 0.
Section 2: Interpretation & Signals
Correlation vs. Cointegration — Side by Side
| Property | Correlation | Cointegration |
|---|---|---|
| What it measures | Direction of co-movement of returns | Stability of long-run price relationship |
| Input series | Returns (log or simple) | Price levels (non-stationary I(1)) |
| Can be high with no mean-reversion | Yes — two trending assets both look correlated | No — cointegration requires mean-reversion of spread |
| Pairs trading sufficient? | No — correlation alone can't confirm spread reversion | Yes — stationary spread = tradeable mean reversion |
| Test used | Pearson r statistic | Johansen trace or Engle-Granger ADF |
Step-by-Step Pairs Trade Execution
Step 1 - Screen: Find candidate pairs with Pearson r above 0.7 over the past 252 days. This is a pre-filter that narrows the search; it is not sufficient, because high correlation says nothing about whether the spread reverts.
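Step 2 - Verify I(1): Run the ADF unit root test on each price series. Both must be non-stationary in levels (ADF fails to reject the null = series has a unit root) and stationary after first differencing (ADF rejects the null on the differences); together these indicate I(1). If either series is already stationary in levels, cointegration testing is invalid.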
Step 3 — Test cointegration: Run Johansen test (or Engle-Granger) at 95% confidence. Confirm the test statistic exceeds the critical value.
Step 4 — Calculate hedge ratio: Use OLS regression: Price_A = alpha + beta times Price_B + residual. The slope beta is the hedge ratio.
Step 5 — Build the spread: Spread = Price_A minus beta times Price_B. Calculate rolling mean and rolling standard deviation of the spread.
Step 6 — Z-score: Z = (Spread minus rolling_mean) / rolling_std.
Step 7 — Trade signals:
- Z below -2.0: Long Asset A, short (beta units of) Asset B — spread expected to revert upward
- Z above +2.0: Short Asset A, long (beta units of) Asset B — spread expected to revert downward
- Z between -0.5 and +0.5: Exit position — spread has reverted to mean
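
Steps 4 to 6 above can be sketched with NumPy as follows. The function names are hypothetical, the data is simulated with a true hedge ratio of 1.5, and the Z-score uses a trailing 60-bar window so that each value depends only on data available at that bar.

```python
import numpy as np

def build_spread(price_a, price_b):
    """OLS of A on B; returns (beta, spread) with spread = A - beta * B."""
    a = np.asarray(price_a, dtype=float)
    b = np.asarray(price_b, dtype=float)
    beta, alpha = np.polyfit(b, a, 1)  # polyfit returns the slope first
    return beta, a - beta * b

def rolling_zscore(spread, window=60):
    """Z-score of each point against the trailing `window` values (inclusive).

    The first window-1 entries are NaN because the window is not yet full.
    """
    spread = np.asarray(spread, dtype=float)
    z = np.full(spread.shape, np.nan)
    for t in range(window - 1, len(spread)):
        chunk = spread[t - window + 1 : t + 1]
        sd = chunk.std(ddof=1)
        if sd > 0:
            z[t] = (spread[t] - chunk.mean()) / sd
    return z

# Simulated cointegrated pair (illustrative only)
rng = np.random.default_rng(2)
price_b = 50 + np.cumsum(rng.normal(0.0, 1.0, 500))
price_a = 5 + 1.5 * price_b + rng.normal(0.0, 1.0, 500)

beta, spread = build_spread(price_a, price_b)
z = rolling_zscore(spread, window=60)
print(f"beta {beta:.3f}, latest z {z[-1]:.2f}")
```

Note that `build_spread` fits beta on the full sample, which is fine for a live snapshot but introduces look-ahead in a backtest; there, the regression should be refit on data up to each decision date.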
Z-Score Signal Zones
| Z-Score | Signal |
|---|---|
| Below -2.0 | Enter long spread (long A, short B) |
| -2.0 to -0.5 | Hold long spread position |
| -0.5 to +0.5 | Exit — spread at mean |
| +0.5 to +2.0 | Hold short spread position |
| Above +2.0 | Enter short spread (short A, long B) |
| Beyond ±3.0 | Stop-loss consideration: spread may not revert |
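
One way to turn the zone table into rules is a small state machine that tracks the current position. The sketch below is one possible reading of the table, with assumptions labeled: a long spread exits once Z rises back to -0.5 or above, a short spread exits once Z falls to +0.5 or below, and a breach of the stop level flattens the position. The function name and the handling of overshoots are hypothetical.

```python
def next_position(z, position, entry=2.0, exit_band=0.5, stop=3.0):
    """Return the new position given the latest Z-score.

    position: +1 = long spread (long A, short B), -1 = short spread, 0 = flat.
    """
    if position == 0:
        if -stop <= z < -entry:
            return 1           # enter long spread
        if entry < z <= stop:
            return -1          # enter short spread
        return 0               # no signal, or already beyond the stop level
    if position == 1:
        if z >= -exit_band or z < -stop:
            return 0           # reverted to the mean band, or stopped out
        return 1
    # position == -1
    if z <= exit_band or z > stop:
        return 0
    return -1

# Walk a hypothetical Z-score path through the rules
path = [0.0, -2.1, -1.5, -0.6, -0.4, 2.3, 1.0, 3.2, 0.1]
positions = []
pos = 0
for z in path:
    pos = next_position(z, pos)
    positions.append(pos)
print(positions)  # → [0, 1, 1, 1, 0, -1, -1, 0, 0]
```

A production version would likely add a cooldown after a stop-out, since this sketch can re-enter as soon as Z drifts back inside the stop level.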
[Chart: XOM vs CVX spread Z-score with pairs trade signals, 2023]
Section 3: Past vs. Live - Real-Time Reliability
The Johansen test result is fixed once calculated on the historical window. The Z-score of the spread updates daily as new prices arrive. There is no repaint. The key real-time risk is relationship breakdown — not a calculation artifact.
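The no-repaint property can be checked directly: a trailing-window Z-score computed when only part of the history existed should match the same bars after new data arrives. The sketch below assumes a fixed hedge ratio (so the spread series itself does not change) and uses a synthetic spread; the function name is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def trailing_zscore(spread, window=60):
    """Z-score of each bar against the `window` bars ending at that bar.

    Returns one value per bar, starting from index window-1 of the input.
    """
    windows = sliding_window_view(np.asarray(spread, dtype=float), window)
    mean = windows.mean(axis=1)
    std = windows.std(axis=1, ddof=1)
    return (windows[:, -1] - mean) / std

rng = np.random.default_rng(7)
spread = rng.normal(0.0, 1.0, 300)       # synthetic stationary spread

z_then = trailing_zscore(spread[:250])   # computed when only 250 bars existed
z_now = trailing_zscore(spread)          # recomputed after 50 new bars arrive
unchanged = np.allclose(z_then, z_now[: len(z_then)])
print(f"historical Z-scores unchanged: {unchanged}")
```

The check holds only while the hedge ratio stays fixed. Re-estimating the hedge ratio on a window that includes new bars changes the historical spread, and with it the historical Z-scores, so signals from before and after a refit are not directly comparable.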
Section 4: Practical Use Cases
Intraday pairs:
- Setup: Use intraday cointegrated pairs, such as same-sector ETFs (e.g., XLF vs. KBE for banks)
- Signal: Z-score on 20-bar intraday window breaches ±1.5 standard deviations
- Entry: Long lagging asset, short leading asset, verified on 15-minute close
- Exit: Z-score returns to 0 or reverses to the entry side
- Key rule: Use tight stops, because intraday cointegration is weaker than daily; keep position small and spread cost manageable
Swing pairs (daily):
- Setup: Test cointegration on 252-day daily data for sector peer pairs (e.g., JPM vs. BAC, WFC vs. USB)
- Signal: Z-score on 60-day rolling spread breaches -2.0 or +2.0
- Entry: Long the cheap leg, short the expensive leg in equal beta-weighted dollar amounts
- Stop: Z-score exceeds ±3.0; close both legs and reassess the cointegration relationship
- Target: Z-score returns to 0; take profit rather than waiting for a swing to the other side
Position pairs (weekly):
- Setup: Run Johansen test quarterly on long-term cointegrated pairs across sectors (e.g., gold miners: NEM vs. GFI)
- Signal: Weekly Z-score of rolling 252-day spread breaches ±2.0
- Entry: Size the legs so their beta-weighted market exposures offset, leaving the pair approximately market-neutral
- Exit: Z-score returns inside ±0.5, or quarterly retest shows cointegration has broken (fails at the 90% level)
- Key rule: The pair is market-neutral only while the hedge ratio is current; recalculate it monthly to stay balanced
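
The beta-weighted sizing mentioned in the setups above can be illustrated with simple arithmetic. The sketch below assumes "beta" means each leg's market beta and that the goal is for the long and short beta-weighted dollar exposures to cancel; the function name, capital, and beta values are hypothetical.

```python
def beta_neutral_legs(gross_capital, beta_long, beta_short):
    """Split gross_capital so long_dollars * beta_long == short_dollars * beta_short."""
    if beta_long <= 0 or beta_short <= 0:
        raise ValueError("both market betas must be positive")
    long_dollars = gross_capital * beta_short / (beta_long + beta_short)
    short_dollars = gross_capital - long_dollars
    return long_dollars, short_dollars

# Hypothetical example: long leg has beta 1.2, short leg has beta 0.8
long_usd, short_usd = beta_neutral_legs(100_000, beta_long=1.2, beta_short=0.8)
net_beta_dollars = long_usd * 1.2 - short_usd * 0.8
print(f"long ${long_usd:,.0f}, short ${short_usd:,.0f}, "
      f"net beta exposure ${net_beta_dollars:,.0f}")
```

The higher-beta leg receives fewer dollars, so the two legs are not dollar-neutral. Sizing by the regression hedge ratio instead (beta units of B per unit of A) is a different convention and will generally give different weights.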
Real example: XOM and CVX have been cointegrated over multiple years as the two largest US oil majors sharing the same commodity price input, capital structure, and regulatory environment. In April 2023, the Z-score of their spread dropped to -2.3, signaling XOM had lagged CVX unusually. Entering long XOM, short CVX in beta-adjusted equal dollar amounts, the spread reverted to 0 by July 2023 — a clean 3-month pairs trade with minimal market directional exposure.
Section 5: Pseudo Code
INPUT: price_a[], price_b[] (aligned price series of equal length), confidence=0.95
PROCESS:
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.vector_ar.vecm import coint_johansen
price_a = np.asarray(price_a, dtype=float)
price_b = np.asarray(price_b, dtype=float)
Step 1: Verify both series are I(1) (non-stationary)
p_a = adfuller(price_a)[1] # element [1] of the result is the p-value
p_b = adfuller(price_b)[1]
both_i1 = (p_a > 0.05) and (p_b > 0.05) # both should FAIL to reject the unit-root null
Step 2: Run Johansen cointegration test
result = coint_johansen(np.column_stack([price_a, price_b]), det_order=0, k_ar_diff=1)
trace_stat = result.lr1[0] # Trace statistic for r=0
crit_col = {0.90: 0, 0.95: 1, 0.99: 2}[confidence] # cvt columns hold the 90%, 95%, 99% critical values
crit_value = result.cvt[0, crit_col]
cointegrated = bool(trace_stat > crit_value)
Step 3: Calculate hedge ratio via OLS (if cointegrated)
hedge_ratio = np.polyfit(price_b, price_a, 1)[0] # slope of price_a regressed on price_b
Step 4: Build spread and Z-score
spread = pd.Series(price_a - hedge_ratio * price_b)
rolling_mean = spread.rolling(window=60).mean()
rolling_std = spread.rolling(window=60).std()
z_score = (spread - rolling_mean) / rolling_std # first 59 values are NaN
Step 5: Generate signals
z_last = z_score.iloc[-1]
if z_last < -2.0: signal = "LONG SPREAD (long A, short B)"
elif z_last > 2.0: signal = "SHORT SPREAD (short A, long B)"
elif abs(z_last) < 0.5: signal = "EXIT"
else: signal = "HOLD" # between the exit band and the entry thresholds, or Z-score not yet available
OUTPUT: cointegrated (bool), hedge_ratio (float), spread[], z_score[], signal (str)
EDGE CASES:
- If either series is I(0) (stationary): cointegration test is invalid — return error
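- If test does not reject null at 95%: treat the pair as not cointegrated. Dropping to 90% or extending the historical window until a test passes raises the false-positive rate, so any pair found that way needs out-of-sample confirmation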
- Recalculate hedge ratio monthly — it drifts as the relationship evolves
- Stop-loss if abs(z_score) exceeds 3.5 and has not reversed within 10 bars
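
The last edge case can be sketched as a small helper. The wording "has not reversed within 10 bars" is open to interpretation; this sketch assumes it means the absolute Z-score has stayed above the threshold for 10 consecutive bars. The function name is hypothetical.

```python
def stop_loss_triggered(z_scores, threshold=3.5, patience=10):
    """True once |z| has stayed above `threshold` for `patience` consecutive bars."""
    streak = 0
    for z in z_scores:
        # Any bar back inside the threshold resets the count
        streak = streak + 1 if abs(z) > threshold else 0
        if streak >= patience:
            return True
    return False

# Ten straight bars beyond the threshold trigger the stop
print(stop_loss_triggered([3.6] * 10))                    # → True
# A single bar back inside the threshold resets the count
print(stop_loss_triggered([3.6] * 9 + [3.0] + [3.6] * 9)) # → False
```

Missing values (NaN) compare as not above the threshold, so they also reset the streak in this version.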
Section 6: Parameters & Optimization
Johansen Test Configuration
| Setting | Default | Notes |
|---|---|---|
| det_order | 0 (constant term, no linear trend) | In statsmodels: -1 = no deterministic terms, 0 = constant, 1 = linear trend; 0 is the usual choice for price levels |
| k_ar_diff | 1 | Number of lagged differences in VECM — select using AIC/BIC |
| Confidence | 95% | 99% for conservative filtering; 90% for wider candidate set |
| Minimum history | 252 bars | 2+ years preferred — longer samples give more reliable test |
Z-Score Window Choices
| Window | Behavior | Use |
|---|---|---|
| 20-bar rolling spread | Fast mean, noisy Z-score | Short-term intraday pairs |
| 60-day rolling spread | Standard swing pairs window | Most daily pairs strategies |
| 252-day rolling spread | Near-static reference level | Long-term position pairs |
What is the difference between the Johansen test and the Engle-Granger test?
The Engle-Granger test is a two-step method, most often applied to exactly two series. It runs an OLS regression and then tests the residuals for stationarity with ADF. It is simple, but it can identify at most one cointegrating relationship, its result can change depending on which series is used as the dependent variable, and it generally has lower statistical power. The Johansen test handles N series simultaneously, can detect more than one cointegrating relationship, and provides the hedge ratios directly as eigenvectors. Prefer Johansen for production pairs trading work, especially with baskets of more than two assets.
How do I select the lag order k_ar_diff?
Fit the model with k_ar_diff values from 1 to 10. For each, calculate the AIC: AIC = 2p minus 2 times log-likelihood, where p is the number of estimated parameters (it grows with the lag order). Choose the lag order that minimizes AIC. Most financial time series pairs need 1 to 3 lags. Choosing too many lags wastes degrees of freedom; too few leaves autocorrelation in the residuals.
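The selection loop can be sketched with NumPy. This is a simplification under stated assumptions: it scores a plain VAR on the differenced prices and omits the error-correction term of a full VECM, all lag orders are compared on the same rows, and the function names and simulated data are hypothetical. statsmodels also ships a `select_order` helper in its `vecm` module, which is the more complete route.

```python
import numpy as np

def var_aic(dy, k, max_lag):
    """AIC of a VAR(k) with intercept, fitted by OLS to differenced prices `dy`.

    The first max_lag rows are dropped for every k so the scores are comparable.
    """
    T, n = dy.shape
    Y = dy[max_lag:]
    lags = [dy[max_lag - j : T - j] for j in range(1, k + 1)]
    X = np.hstack([np.ones((T - max_lag, 1))] + lags)
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    m = len(Y)
    sigma = resid.T @ resid / m  # ML estimate of the residual covariance
    loglik = -0.5 * m * (n * np.log(2 * np.pi) + np.log(np.linalg.det(sigma)) + n)
    return 2 * coef.size - 2 * loglik  # AIC = 2p - 2 ln L

def select_lag(prices, max_lag=10):
    """Return (best lag order, {lag: AIC}) for a (T, n) array of price levels."""
    dy = np.diff(np.asarray(prices, dtype=float), axis=0)
    scores = {k: var_aic(dy, k, max_lag) for k in range(1, max_lag + 1)}
    return min(scores, key=scores.get), scores

# Simulated two-asset system sharing one stochastic trend (illustrative only)
rng = np.random.default_rng(3)
trend = np.cumsum(rng.normal(0.0, 1.0, 600))
prices = np.column_stack([100 + trend + rng.normal(0.0, 1.0, 600),
                          100 + trend + rng.normal(0.0, 1.0, 600)])

best_k, scores = select_lag(prices, max_lag=10)
print(f"AIC-selected lag order: {best_k}")
```

The selected value would then be passed as `k_ar_diff`. BIC can be substituted by replacing the `2 * coef.size` penalty with `np.log(m) * coef.size`, which tends to pick shorter lags.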
When does cointegration break?
Cointegration breaks when the economic rationale for the pair fundamentally changes: a merger announcement (two companies become one), a bankruptcy, a commodity regime shift, or a regulatory change that affects only one asset. Monitor pairs monthly. If the Johansen test no longer rejects the null at 90% confidence, assume cointegration has broken and close all positions immediately.
Section 7: Synergies & Conflicts
| Indicator / Condition | Works Well With | Avoid Combining With |
|---|---|---|
| Pearson Correlation | Use as pre-screen — only test cointegration for pairs with r above 0.7 to reduce computation and false positives | — |
| ADF Unit Root Test | Required prerequisite — both series must be I(1) before Johansen is valid | — |
| Beta | Calculate Beta-adjusted position sizes for both legs to keep the pair truly market-neutral | — |
| Z-Score entry/exit rules | The Z-score of the spread is the complete trade execution framework — clean, rule-based, backtestable | — |
| Short-history data | — | Fewer than 252 bars makes the Johansen test statistically unreliable — spurious cointegration likely |
| Illiquid pairs | — | Wide bid-ask spreads consume the mean-reversion profit before it accrues — both legs need tight spreads and high volume |
| Binary event exposure | — | Earnings, FDA decisions, or merger speculation on either leg can cause permanent spread divergence — check catalysts before entering |
Section 8: Common Mistakes
| Mistake | Root Cause | Solution |
|---|---|---|
| Skipping the ADF unit root test | Applying Johansen to stationary series invalidates the test | Always run ADF first — reject unit root on either series = stop, do not proceed |
| Using correlation to confirm a pair without cointegration | High correlation does not imply mean-reverting spread | Run Johansen test; correlation is a useful screen but not sufficient |
| Static hedge ratio | Hedge ratio drifts as the fundamental relationship evolves | Recalculate hedge ratio monthly using the most recent 252-day regression |
| Ignoring spread transaction costs | Mean reversion often delivers 1–3% gains; costs of 0.5% or more per leg destroy the edge | Calculate round-trip cost before entering: both legs, bid-ask, and borrow cost for the short |
| Over-fitting cointegration to historical data | Running hundreds of pairs tests finds spurious cointegration by chance | Apply Bonferroni correction or split-sample validation — test on out-of-sample data before trading |
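
The last two rows of the table come down to simple arithmetic, sketched below. Both helpers are hypothetical illustrations: the Bonferroni helper assumes every unordered pair in a universe is tested once, and the cost helper assumes the per-leg cost is charged on each leg at both entry and exit, expressed as a fraction of pair notional.

```python
from math import comb

def bonferroni_alpha(n_tickers, alpha=0.05):
    """Number of unordered pairs and the per-test significance level after correction."""
    n_pairs = comb(n_tickers, 2)
    return n_pairs, alpha / n_pairs

def net_edge(gross_gain, cost_per_leg, legs=2, sides=2):
    """Expected gain minus costs; cost_per_leg is paid per leg on entry and on exit."""
    return gross_gain - cost_per_leg * legs * sides

# Screening a 50-ticker universe means 1225 pair tests
n_pairs, alpha_adj = bonferroni_alpha(50)
expected_false_positives = n_pairs * 0.05  # if every null were true and untested at 5%
print(f"{n_pairs} pairs, corrected alpha {alpha_adj:.2e}, "
      f"about {expected_false_positives:.0f} false positives without correction")

# A 2% expected reversion with 0.5% cost per leg per side leaves nothing
print(f"net edge: {net_edge(0.02, 0.005):.4f}")
```

Bonferroni is conservative when the tests are correlated, as pair tests within a sector usually are, which is one reason split-sample validation is often used alongside it.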
Section 9: Cheat Sheet
USE WHEN: Identifying pairs for statistical arbitrage, testing whether a spread is mean-reverting before trading it, constructing market-neutral books
AVOID WHEN: Either series is stationary (I(0)), history is shorter than 252 bars, either asset has a binary catalyst upcoming
ENTRY SIGNAL: Spread Z-score below -2.0 = long spread (long A, short B); Z-score above +2.0 = short spread (short A, long B)
EXIT SIGNAL: Z-score returns to between -0.5 and +0.5 (mean reversion complete); stop-loss at Z-score beyond ±3.5 if not reverting
PARAMETERS: Johansen test at 95% confidence; k_ar_diff selected by AIC; 60-day rolling window for Z-score; hedge ratio recalculated monthly
CONFLUENCE: Confirm with Pearson r above 0.7, verify both series are I(1) via ADF, size both legs to market-neutral beta
RISK: Cointegration relationships break permanently — monitor monthly and close immediately on test failure
BEST TIMEFRAME: Daily bars for calculation; applicable to swing (daily) and position (weekly) timeframes