Cointegration identifies pairs of assets that share a stable long-run relationship — they may diverge in the short term, but revert to a stable spread over time. It is the mathematical foundation of pairs trading and statistical arbitrage, distinguishing temporary mispricing from permanent divergence.

Category: Correlation & Multi-Asset
Difficulty: Advanced
Output Range: Statistical test (trace statistic vs. critical value); spread Z-score -3 to +3
Default Period: Minimum 252 days (1 year) of daily closing prices per series
Repaint Risk: None (backward-looking statistical test on closed historical bars)
Data Need: Heavy (requires long historical series and statistical software such as Python or R)
Tags: CORRELATION · DATA_INTENSIVE · CODE_HEAVY · LAGGING · REAL_TIME

Section 1: Core Mechanics

A cointegrated pair of assets has a property that correlation alone cannot confirm: their price spread is mean-reverting. You can subtract one from the other (with the right hedge ratio) and get a stationary series — one with a stable mean and variance over time. This is what makes pairs trading viable as a systematic strategy.

The critical distinction between correlation and cointegration: Two assets can be highly correlated (both rise in bull markets) but not cointegrated (their spread widens without reverting). Cointegrated assets share a common stochastic trend — they are mathematically linked in a way that prevents indefinite divergence.
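The distinction can be illustrated with simulated data. The sketch below is illustrative only: the series, noise levels, and the helper `lag1_autocorr` are assumptions chosen for the demonstration, not part of any real pair. Pair 1 has highly correlated returns, but its price gap accumulates idiosyncratic shocks and wanders like a random walk. Pair 2 shares one stochastic trend, so its gap is just temporary noise.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation; values near 1 indicate a persistent, wandering series."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float((x[:-1] @ x[1:]) / (x @ x))

# Pair 1: correlated returns, NOT cointegrated.
# Both assets load on the same market shock, but each also accumulates
# its own idiosyncratic shocks, so the price gap is itself a random walk.
market = rng.normal(0.0, 1.0, n)
c = np.cumsum(market + rng.normal(0.0, 0.2, n))
d = np.cumsum(market + rng.normal(0.0, 0.2, n))
return_corr = float(np.corrcoef(np.diff(c), np.diff(d))[0, 1])
drifting_spread = c - d

# Pair 2: cointegrated.
# Both assets track one shared random walk plus temporary noise that does not accumulate.
trend = np.cumsum(rng.normal(0.0, 1.0, n))
a = trend + rng.normal(0.0, 0.5, n)
b = trend + rng.normal(0.0, 0.5, n)
stationary_spread = a - b

print(f"return correlation (pair 1): {return_corr:.2f}")
print(f"spread persistence, pair 1:  {lag1_autocorr(drifting_spread):.2f}")
print(f"spread persistence, pair 2:  {lag1_autocorr(stationary_spread):.2f}")
```

A spread persistence close to 1 means yesterday's gap is the best guess for today's gap, so there is no pull back to a mean. A value near 0 means deviations die out quickly. Lag-1 autocorrelation is only a quick visual aid here, not a substitute for a formal cointegration test.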

Formula

The Johansen Test checks for cointegration among N time series. The two test variants are:

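Trace Statistic:

$$\lambda_{\text{trace}}(r) = -T \sum_{i=r+1}^{N} \ln\left(1 - \hat{\lambda}_i\right)$$

Maximum Eigenvalue Statistic:

$$\lambda_{\max}(r, r+1) = -T \ln\left(1 - \hat{\lambda}_{r+1}\right)$$

Where $T$ is the sample size, $\hat{\lambda}_i$ are the ordered eigenvalues from the reduced-rank regression, and $r$ is the number of cointegrating vectors being tested (null hypothesis: r = 0 means no cointegration).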

If the test statistic exceeds the critical value at the 95% confidence level, reject the null hypothesis of no cointegration.
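As a worked illustration of the two statistics, the sketch below evaluates the standard Johansen formulas, trace = -T · Σ ln(1 - λ_i) summed over i = r+1..N and max-eigenvalue = -T · ln(1 - λ_{r+1}), for a given set of eigenvalues. The function name `johansen_statistics` and the eigenvalues are hypothetical; in practice the eigenvalues come out of the reduced-rank regression performed by statistical software.

```python
import math

def johansen_statistics(eigenvalues, T, r):
    """Trace and max-eigenvalue statistics for the null of at most r cointegrating vectors.

    eigenvalues: estimated eigenvalues (any order; sorted largest first internally)
    T: sample size
    """
    lam = sorted(eigenvalues, reverse=True)
    # Trace: sums over the N - r smallest eigenvalues (indices r+1..N, 1-based).
    trace = -T * sum(math.log(1.0 - ev) for ev in lam[r:])
    # Max-eigenvalue: uses only the (r+1)-th largest eigenvalue.
    max_eig = -T * math.log(1.0 - lam[r])
    return trace, max_eig

# Hypothetical eigenvalues for a two-asset system with 250 observations.
trace_0, max_eig_0 = johansen_statistics([0.10, 0.02], T=250, r=0)
trace_1, max_eig_1 = johansen_statistics([0.10, 0.02], T=250, r=1)
print(f"r=0: trace={trace_0:.2f}, max-eig={max_eig_0:.2f}")
print(f"r=1: trace={trace_1:.2f}, max-eig={max_eig_1:.2f}")
```

For the last hypothesis (r = N - 1) only one eigenvalue remains, so the two statistics coincide. Each statistic is then compared against its own tabulated critical value.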

For a pair of two assets, the Engle-Granger two-step method is simpler: run OLS regression of Price_A on Price_B, then test the residuals for stationarity using the ADF unit root test. If residuals are stationary — cointegration exists.
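A minimal sketch of the two steps, using only NumPy and simulated prices, might look like the following. Several things here are assumptions: the function name `engle_granger_sketch`, the simulated data, the use of a Dickey-Fuller regression with no lagged differences, and the approximate 5% critical value of -3.34 (a commonly tabulated asymptotic value for two series with a constant). Residuals from an estimated regression need these stricter Engle-Granger critical values rather than the ordinary ADF ones. For real work, a library routine such as `statsmodels.tsa.stattools.coint` handles lag selection and critical values.

```python
import numpy as np

def engle_granger_sketch(price_a, price_b):
    """Return (hedge_ratio, df_t_stat) from a simplified Engle-Granger procedure."""
    a = np.asarray(price_a, dtype=float)
    b = np.asarray(price_b, dtype=float)

    # Step 1: OLS of Price_A on Price_B with an intercept.
    X = np.column_stack([np.ones_like(b), b])
    (alpha, beta), *_ = np.linalg.lstsq(X, a, rcond=None)
    resid = a - (alpha + beta * b)

    # Step 2: Dickey-Fuller regression on the residuals (no lagged differences):
    #   delta_e[t] = rho * e[t-1] + u[t]
    lagged = resid[:-1]
    delta = np.diff(resid)
    rho = (lagged @ delta) / (lagged @ lagged)
    u = delta - rho * lagged
    se = np.sqrt((u @ u) / (len(delta) - 1) / (lagged @ lagged))
    return float(beta), float(rho / se)

# Simulated cointegrated pair: A = 5 + 2 * B + stationary noise.
rng = np.random.default_rng(1)
price_b = 50 + np.cumsum(rng.normal(0.0, 1.0, 1000))
price_a = 5 + 2.0 * price_b + rng.normal(0.0, 1.0, 1000)

APPROX_CRIT_5PCT = -3.34  # assumed approximate critical value, see note above
beta, t_stat = engle_granger_sketch(price_a, price_b)
print(f"hedge ratio: {beta:.3f}, DF t-stat: {t_stat:.1f}, "
      f"cointegrated: {t_stat < APPROX_CRIT_5PCT}")
```

A t-statistic more negative than the critical value rejects the unit root in the residuals, which is the evidence for cointegration.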

Inputs

  • Two (or more) price series: Must be non-stationary I(1) processes — confirm with ADF unit root test first. If the series are stationary (I(0)), cointegration is not applicable.
  • Lookback window: Minimum 252 days; 2–5 years preferred for stable estimates
  • Lag order for VECM: Selected using AIC/BIC criterion in the Vector Error Correction Model

Parameters

| Parameter | Default | Range | Impact |
|---|---|---|---|
| Lookback period | 252 days | 252–1260 days | Longer = more statistically reliable; shorter may miss structural breaks |
| Confidence level | 95% | 90%, 95%, 99% | Higher confidence = fewer pairs identified; greater certainty for those found |
| Lag order (k) | AIC-selected | 1–10 lags | Incorrect lag order invalidates the test; always use an information criterion |

Output

  • Test decision: Cointegrated or not cointegrated (rejecting the null means cointegration exists; failing to reject means no cointegration was detected)
  • Hedge ratio: The slope coefficient from OLS regression of Price_A on Price_B — how many units of B to sell for each unit of A held long
  • Spread series: Price_A minus hedge_ratio times Price_B — should be stationary if cointegrated
  • Z-score: Standardized spread — (spread - mean) / standard deviation of spread
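The spread and Z-score outputs can be sketched in a few lines of NumPy. The helper name `rolling_zscore`, the toy prices, and the hedge ratio of 2.0 are illustrative assumptions. The sketch uses a trailing window that includes the current bar and the sample standard deviation (ddof=1); other conventions are equally defensible.

```python
import numpy as np

def rolling_zscore(spread, window):
    """Z-score of each point against the trailing `window` observations (current bar included).

    The first window - 1 entries are NaN because the window is not yet full.
    """
    s = np.asarray(spread, dtype=float)
    z = np.full(s.shape, np.nan)
    for t in range(window - 1, len(s)):
        w = s[t - window + 1 : t + 1]
        sd = w.std(ddof=1)
        if sd > 0:  # leave NaN when the window has no variation
            z[t] = (s[t] - w.mean()) / sd
    return z

# Toy data: hypothetical prices and hedge ratio, chosen so the arithmetic is easy to follow.
price_a = np.array([10.0, 11.0, 12.0, 13.0, 14.0])
price_b = np.array([4.5, 4.5, 4.5, 4.5, 4.5])
hedge_ratio = 2.0

spread = price_a - hedge_ratio * price_b   # Price_A minus hedge_ratio times Price_B
z = rolling_zscore(spread, window=3)
print(spread)
print(z)
```

With a 3-bar window, the third bar compares the spread value 3 against a window mean of 2 and a standard deviation of 1, giving a Z-score of 1.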

Visual Behavior

Plot the Z-score of the spread as a separate panel. A stationary Z-score oscillates around 0 within a stable band. Entry zones appear when the Z-score breaches ±2.0. Exit signals appear when the Z-score returns to 0.


Section 2: Interpretation & Signals

Correlation vs. Cointegration — Side by Side

| Property | Correlation | Cointegration |
|---|---|---|
| What it measures | Direction of co-movement of returns | Stability of long-run price relationship |
| Input series | Returns (log or simple) | Price levels (non-stationary I(1)) |
| Can be high with no mean-reversion | Yes: two trending assets both look correlated | No: cointegration requires mean-reversion of spread |
| Pairs trading sufficient? | No: correlation alone can't confirm spread reversion | Yes: stationary spread = tradeable mean reversion |
| Test used | Pearson r statistic | Johansen trace or Engle-Granger ADF |

Step-by-Step Pairs Trade Execution

Step 1 — Screen: Find candidate pairs with Pearson r above 0.7 over the past 252 days. This is a practical pre-screen, not proof: high correlation alone does not establish cointegration.

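Step 2 — Verify I(1): Run the ADF unit root test on each price series. Both must be non-stationary (ADF fails to reject the null, so the series is treated as having a unit root). Strictly, I(1) also requires the first differences to be stationary, which can be confirmed by running ADF on the differenced series. If either price series is already stationary, cointegration testing is invalid.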

Step 3 — Test cointegration: Run Johansen test (or Engle-Granger) at 95% confidence. Confirm the test statistic exceeds the critical value.

Step 4 — Calculate hedge ratio: Use OLS regression: Price_A = alpha + beta times Price_B + residual. The slope beta is the hedge ratio.

Step 5 — Build the spread: Spread = Price_A minus beta times Price_B. Calculate rolling mean and rolling standard deviation of the spread.

Step 6 — Z-score: Z = (Spread minus rolling_mean) / rolling_std.

Step 7 — Trade signals:

  • Z below -2.0: Long Asset A, short (beta units of) Asset B — spread expected to revert upward
  • Z above +2.0: Short Asset A, long (beta units of) Asset B — spread expected to revert downward
  • Z between -0.5 and +0.5: Exit position — spread has reverted to mean

Z-Score Signal Zones

| Z-Score | Signal |
|---|---|
| Below -2.0 | Enter long spread (long A, short B) |
| -2.0 to -0.5 | Hold long spread position |
| -0.5 to +0.5 | Exit: spread at mean |
| +0.5 to +2.0 | Hold short spread position |
| Above +2.0 | Enter short spread (short A, long B) |
| Beyond ±3.0 | Stop-loss consideration: spread may not revert |

⚠️ WARNING
A Z-score beyond ±3.0 on a cointegrated pair can indicate a fundamental break in the relationship — not just a wider-than-usual divergence. The cointegration relationship can break permanently if a fundamental event changes one asset (merger, bankruptcy, index removal). Review the economic rationale for the pair whenever the Z-score exceeds ±3.0.
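Because the hold zones depend on which side the trade was opened from, the signal zones are easier to implement as a small state machine than as independent threshold checks. The sketch below is one possible reading, with the function name `next_position` and the sample Z-score path as assumptions. It also simplifies the ±3.0 "stop-loss consideration" into a hard rule that flattens the position and blocks new entries, which is stricter than the text requires.

```python
def next_position(position, z, entry=2.0, exit_band=0.5, stop=3.0):
    """Return the new spread position: +1 long spread, -1 short spread, 0 flat."""
    if abs(z) > stop:
        return 0                      # stop zone: stand aside and review the pair
    if position == 0:
        if z < -entry:
            return 1                  # long spread: long A, short B
        if z > entry:
            return -1                 # short spread: short A, long B
        return 0
    if position == 1 and z > -exit_band:
        return 0                      # long spread has reverted to the mean zone (or beyond)
    if position == -1 and z < exit_band:
        return 0                      # short spread has reverted to the mean zone (or beyond)
    return position                   # still between entry and exit: hold

# Hypothetical daily Z-score path.
z_path = [0.0, -1.2, -2.3, -1.4, -0.3, 0.8, 2.4, 3.2, 1.0]
positions = []
pos = 0
for z in z_path:
    pos = next_position(pos, z)
    positions.append(pos)
print(positions)
```

In this path the long spread opens at -2.3, is held at -1.4, and closes at -0.3. The short spread opened at 2.4 is stopped out when the Z-score reaches 3.2.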

[Chart: XOM vs CVX Spread Z-Score, Pairs Trade Signals (2023)]


Section 3: Past vs. Live (Real-Time Reliability)

Repaint Risk: None. The historical cointegration test does not change; the spread and Z-score update daily
Lag: The cointegration relationship is tested over historical data, so it cannot predict when or if the pair will diverge
Confirmation Timing: Spread Z-score updates once per day at close; no intrabar recalculation needed
Best Use: Statistical arbitrage, market-neutral pairs trading, relative value hedge funds
Avoid: Pairs with low liquidity, wide bid-ask spreads, or upcoming binary events (earnings, FDA decisions)

The Johansen test result is fixed once calculated on the historical window. The Z-score of the spread updates daily as new prices arrive. There is no repaint. The key real-time risk is relationship breakdown — not a calculation artifact.


Section 4: Practical Use Cases

  • Setup: Use intraday cointegrated pairs, such as same-sector ETFs (e.g., XLF vs. KBE for banks)
  • Signal: Z-score on 20-bar intraday window breaches ±1.5 standard deviations
  • Entry: Long lagging asset, short leading asset, verified on 15-minute close
  • Exit: Z-score returns to 0 or reverses to the entry side
  • Key rule: Use tight stops. Intraday cointegration is weaker than daily; keep position small and spread cost manageable

Real example: XOM and CVX have been cointegrated over multiple years as the two largest US oil majors sharing the same commodity price input, capital structure, and regulatory environment. In April 2023, the Z-score of their spread dropped to -2.3, signaling XOM had lagged CVX unusually. Entering long XOM, short CVX in beta-adjusted equal dollar amounts, the spread reverted to 0 by July 2023 — a clean 3-month pairs trade with minimal market directional exposure.


Section 5: Pseudo Code

INPUT: price_a[], price_b[], confidence=0.95

PROCESS:
  Step 1: Verify both series are I(1) — non-stationary
            from statsmodels.tsa.stattools import adfuller
            adf_a = adfuller(price_a)
            adf_b = adfuller(price_b)
            # Both should FAIL to reject null (p > 0.05 = non-stationary = I(1))

  Step 2: Run Johansen cointegration test
            import numpy as np
            from statsmodels.tsa.vector_ar.vecm import coint_johansen
            endog = np.column_stack([price_a, price_b])   # 2-D array, shape (n_obs, 2)
            result = coint_johansen(endog, det_order=0, k_ar_diff=1)
            trace_stat = result.lr1[0]   # Trace statistic for r=0
            crit_value = result.cvt[0, 1]  # 95% critical value
            cointegrated = (trace_stat > crit_value)

  Step 3: Calculate hedge ratio via OLS (if cointegrated)
            import numpy as np
            hedge_ratio = np.polyfit(price_b, price_a, 1)[0]

  Step 4: Build spread and Z-score
            import pandas as pd
            spread = pd.Series(np.asarray(price_a) - hedge_ratio * np.asarray(price_b))
            rolling_mean = spread.rolling(window=60).mean()
            rolling_std = spread.rolling(window=60).std()
            z_score = (spread - rolling_mean) / rolling_std

  Step 5: Generate signals
            z_last = z_score.iloc[-1]
            signal = "HOLD"   # default when no threshold is crossed
            if z_last < -2.0: signal = "LONG SPREAD (long A, short B)"
            elif z_last > +2.0: signal = "SHORT SPREAD (short A, long B)"
            elif abs(z_last) < 0.5: signal = "EXIT"

OUTPUT: cointegrated (bool), hedge_ratio (float), spread[], z_score[], signal (str)
EDGE CASES:
  - If either series is I(0) (stationary): cointegration test is invalid — return error
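  - If test does not reject null at 95%: treat the pair as not cointegrated; retrying at 90% or on a longer historical window widens the candidate set but raises the false-positive risk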
  - Recalculate hedge ratio monthly — it drifts as the relationship evolves
  - Stop-loss if abs(z_score) exceeds 3.5 and has not reversed within 10 bars
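The stop-loss edge case can be read in more than one way. The sketch below takes one interpretation as an assumption: the stop fires when the Z-score has stayed beyond the threshold, on the same side, for the breach bar plus the following 10 bars. The function name `stop_loss_triggered` and its parameters are hypothetical.

```python
def stop_loss_triggered(z_scores, threshold=3.5, max_bars=10):
    """True if the last max_bars + 1 Z-scores all sit beyond `threshold` on the same side.

    Interpretation (an assumption): the breach bar plus `max_bars` further bars
    without the spread coming back inside the threshold.
    """
    recent = list(z_scores)[-(max_bars + 1):]
    if len(recent) < max_bars + 1:
        return False  # not enough history since the breach
    stuck_high = all(z > threshold for z in recent)
    stuck_low = all(z < -threshold for z in recent)
    return stuck_high or stuck_low

print(stop_loss_triggered([3.6] * 11))          # breach persisted through 10 further bars
print(stop_loss_triggered([3.6] * 10 + [3.2]))  # came back inside the threshold
```

A stricter variant could also require that the absolute Z-score has not shrunk since the breach; which reading is appropriate depends on how "reversed" is defined for the strategy.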

Section 6: Parameters & Optimization

Johansen Test Configuration

| Setting | Default | Notes |
|---|---|---|
| det_order | 0 (constant term, no trend) | Use 0 for most financial price series; -1 removes all deterministic terms (no constant), 1 adds a linear trend |
| k_ar_diff | 1 | Number of lagged differences in VECM; select using AIC/BIC |
| Confidence | 95% | 99% for conservative filtering; 90% for wider candidate set |
| Minimum history | 252 bars | 2+ years preferred; longer samples give a more reliable test |

Z-Score Window Choices

| Window | Behavior | Use |
|---|---|---|
| 20-day rolling spread | Fast mean, noisy Z-score | Short-term intraday pairs |
| 60-day rolling spread | Standard swing pairs window | Most daily pairs strategies |
| 252-day rolling spread | Near-static reference level | Long-term position pairs |

What is the difference between the Johansen test and the Engle-Granger test?

The Engle-Granger test is a two-step method most often applied to exactly two series. It runs an OLS regression and then tests the residuals for stationarity with ADF. It is simple, but it can identify at most one cointegrating relationship, its result depends on which series is chosen as the dependent variable, and it generally has lower statistical power. The Johansen test works for N series simultaneously, treats all series symmetrically, and provides the hedge ratios directly as eigenvectors. Use Johansen for all production pairs trading work.

How do I select the lag order k_ar_diff?

Fit the VECM with k_ar_diff values from 1 to 10. For each, calculate the AIC: AIC = 2p minus 2 times the log-likelihood, where p is the number of estimated parameters (which grows with the lag order). Choose the lag order that minimizes AIC. Most financial time series pairs need 1 to 3 lags. Choosing too many lags wastes degrees of freedom; too few leaves autocorrelation in the residuals.

When does cointegration break?

Cointegration breaks when the economic rationale for the pair fundamentally changes: a merger announcement (two companies become one), a bankruptcy, a commodity regime shift, or a regulatory change that affects only one asset. Monitor pairs monthly. If the Johansen test no longer rejects the null at 90% confidence, assume cointegration has broken and close all positions immediately.


Section 7: Synergies & Conflicts

Works Well With

  • Pearson Correlation: Use as pre-screen; only test cointegration for pairs with r above 0.7 to reduce computation and false positives
  • ADF Unit Root Test: Required prerequisite; both series must be I(1) before Johansen is valid
  • Beta: Calculate Beta-adjusted position sizes for both legs to keep the pair truly market-neutral
  • Z-Score entry/exit rules: The Z-score of the spread is the complete trade execution framework: clean, rule-based, backtestable

Avoid Combining With

  • Short-history data: Fewer than 252 bars makes the Johansen test statistically unreliable; spurious cointegration likely
  • Illiquid pairs: Wide bid-ask spreads consume the mean-reversion profit before it accrues; both legs need tight spreads and high volume
  • Binary event exposure: Earnings, FDA decisions, or merger speculation on either leg can cause permanent spread divergence; check catalysts before entering

Section 8: Common Mistakes

| Mistake | Root Cause | Solution |
|---|---|---|
| Skipping the ADF unit root test | Applying Johansen to stationary series invalidates the test | Always run ADF first; if the unit root is rejected on either series, stop and do not proceed |
| Using correlation to confirm a pair without cointegration | High correlation does not imply mean-reverting spread | Run Johansen test; correlation alone is not sufficient |
| Static hedge ratio | Hedge ratio drifts as the fundamental relationship evolves | Recalculate hedge ratio monthly using the most recent 252-day regression |
| Ignoring spread transaction costs | Mean reversion often delivers 1–3% gains; costs of 0.5% or more per leg destroy the edge | Calculate round-trip cost before entering: both legs, bid-ask, and borrow cost for the short |
| Over-fitting cointegration to historical data | Running hundreds of pairs tests finds spurious cointegration by chance | Apply Bonferroni correction or split-sample validation; test on out-of-sample data before trading |
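The multiple-testing problem in the last row is easy to quantify. The sketch below assumes each pair's test yields a p-value (as an Engle-Granger style test does) and that 300 pairs were screened; the pair names, p-values, and the helper `bonferroni_alpha` are hypothetical.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at or below `alpha`."""
    return alpha / n_tests

n_pairs = 300   # hypothetical number of candidate pairs screened
alpha = 0.05

# With no true cointegration anywhere, about alpha * n_pairs tests still "pass" by chance.
expected_false_positives = alpha * n_pairs
per_test_alpha = bonferroni_alpha(alpha, n_pairs)

# Hypothetical p-values for three of the screened pairs.
p_values = {"PAIR_1": 0.0400, "PAIR_2": 0.0001, "PAIR_3": 0.0030}
survivors = [name for name, p in p_values.items() if p < per_test_alpha]

print(f"expected chance passes at 5%: {expected_false_positives:.0f}")
print(f"Bonferroni per-test threshold: {per_test_alpha:.6f}")
print(f"pairs surviving correction: {survivors}")
```

Bonferroni is conservative, so it will also discard some genuine pairs. That is one reason to pair it with out-of-sample validation rather than rely on it alone.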

Section 9: Cheat Sheet

ℹ️ INFO
**Cointegration Test (Johansen)**

USE WHEN: Identifying pairs for statistical arbitrage, testing whether a spread is mean-reverting before trading it, constructing market-neutral books

AVOID WHEN: Either series is stationary (I(0)), history is shorter than 252 bars, either asset has a binary catalyst upcoming

ENTRY SIGNAL: Spread Z-score below -2.0 = long spread (long A, short B); Z-score above +2.0 = short spread (short A, long B)

EXIT SIGNAL: Z-score returns to between -0.5 and +0.5 (mean reversion complete); stop-loss at Z-score beyond ±3.5 if not reverting

PARAMETERS: Johansen test at 95% confidence; k_ar_diff selected by AIC; 60-day rolling window for Z-score; hedge ratio recalculated monthly

CONFLUENCE: Confirm with Pearson r above 0.7, verify both series are I(1) via ADF, size both legs to market-neutral beta

RISK: Cointegration relationships break permanently — monitor monthly and close immediately on test failure

BEST TIMEFRAME: Daily bars for calculation; applicable to swing (daily) and position (weekly) timeframes