Regime Detection with Sentiment
The most effective risk control is not a stop-loss. It is not a position size limit. It is knowing, before a position is opened, whether the current market environment is one where your strategy has a positive expected outcome.
Every systematic strategy has a regime it was built for. A mean-reversion strategy assumes that prices oscillate around a stable center and that deviations correct. It was calibrated on data from periods when that assumption held. When the regime shifts — when a credit crisis, a pandemic-level shock, or a sustained geopolitical escalation pushes markets into a trending panic — the assumption breaks, and the strategy loses systematically. Not because the strategy is wrong, but because it is operating outside its validity domain.
The challenge is detecting when the domain boundary has been crossed. Price-based regime signals — volatility indices, trend filters, correlation breakdowns — detect the shift after it has already affected price. Sentiment-based regime signals detect it as the information environment changes, which precedes the price response. This is the primary reason aggregate sentiment is a useful regime input: it is earlier.
Why Regime Matters
Most published strategy backtests do not segment performance by regime. They report Sharpe ratio, drawdown, and win rate across the entire test period — which averages good regimes and bad regimes into a single number. This averaging obscures the fact that most strategies have dramatically different characteristics depending on the market environment.
A mean-reversion strategy that has a Sharpe ratio of 1.4 over a ten-year backtest may have a Sharpe ratio of 2.2 in calm, low-volatility regimes and a Sharpe ratio of -0.8 in crisis periods. If the crisis periods represent 15% of trading days but 40% of total drawdown, the aggregate backtest is hiding a severe regime dependency. Running that strategy with no regime filter means periodically deploying capital into an environment where it is expected to lose.
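One way to surface this kind of regime dependency is to segment backtest returns by a regime label before computing performance statistics. The sketch below is a minimal illustration, not a full performance report. It assumes daily strategy returns and a regime label per date are already available, takes the risk-free rate as zero, and uses synthetic data purely for demonstration. The function names `sharpe` and `sharpe_by_regime` are hypothetical.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # annualization convention assumed for this sketch


def sharpe(returns: pd.Series) -> float:
    """Annualized Sharpe ratio of daily returns (risk-free rate taken as zero)."""
    std = returns.std(ddof=1)
    if len(returns) < 2 or std == 0:
        return float("nan")
    return float(returns.mean() / std * np.sqrt(TRADING_DAYS))


def sharpe_by_regime(returns: pd.Series, regime: pd.Series) -> pd.DataFrame:
    """Sharpe ratio and share of trading days for each regime label."""
    aligned = pd.concat({"ret": returns, "regime": regime}, axis=1).dropna()
    grouped = aligned.groupby("regime")["ret"]
    return pd.DataFrame({
        "sharpe": grouped.apply(sharpe),
        "share_of_days": grouped.size() / len(aligned),
    })


# Synthetic demonstration data: 500 sessions, 75 of them (15%) labeled as crisis.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=500)
regime = pd.Series("calm", index=dates)
regime.iloc[200:275] = "crisis"
returns = pd.Series(rng.normal(0.001, 0.01, 500), index=dates)
returns[regime == "crisis"] = rng.normal(-0.002, 0.03, 75)

print("aggregate Sharpe:", round(sharpe(returns), 2))
print(sharpe_by_regime(returns, regime))
```

Reporting the per-regime rows next to the aggregate figure makes it visible when a single headline Sharpe ratio is averaging over very different environments.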
The same problem affects momentum strategies in the opposite direction. Momentum strategies work well in trending regimes — persistent directional moves driven by clear information or structural shifts. They fail in choppy, mean-reverting environments where the trend signals are generated by noise. A momentum strategy that looks excellent over a period containing the 2020-2021 bull run looks substantially worse when that period is removed from the test.
Regime detection is not about predicting the future. It is about classifying the current environment accurately enough to make binary decisions: is this a regime where my strategy has positive expected value, or is it a regime where I should be flat?
Aggregate Sentiment as a Regime Indicator
Single-stock sentiment is a signal about one company. Aggregate sentiment, computed across a broad basket of the most heavily traded tickers, is a signal about the information environment itself.
When aggregate sentiment across the top 100 or 200 stocks by volume turns persistently negative, it is not a coincidence. It reflects a sustained shift in how information is being generated and interpreted across the entire market. Analysts are downgrading. Coverage is shifting to risk factors and downside scenarios. Earnings calls are flagging uncertainty. Macro commentary is negative. This is not noise — it is a regime-level information signal.
The stability advantage of aggregate sentiment over single-stock sentiment is significant. Any individual ticker can have elevated negative sentiment for idiosyncratic reasons — a product recall, a management change, a sector rotation. These do not represent regime changes. When the average sentiment across 100 diverse tickers is negative and stays negative across multiple trading sessions, the signal is more robust: the information environment has shifted broadly, and strategies calibrated on neutral or positive information environments should be treated with caution.
The contrast with price-based regime signals is instructive. The VIX, the standard regime indicator for many practitioners, measures implied volatility derived from options pricing, so it largely reacts to price moves that have already happened. Aggregate sentiment measures the information flow that causes price moves, and it can shift before the price response is visible. In practice, aggregate sentiment often turns negative one to three sessions before volatility indices register a regime shift in crisis periods. That lead is not large, but in terms of drawdown reduction it can be material.
Building the Sentiment Market Index
The sentiment market index is a rolling weighted average of daily sentiment scores across a target basket of tickers. The construction has three parameters: the basket size, the rolling window, and the decay function.
Basket size is typically the top 100 or 200 stocks by average daily volume over the prior 30 days. This basket represents the highest-participation part of the market — the securities where aggregate sentiment will be most responsive to broad market information. A narrower basket (top 20 stocks) is too concentrated in megacap names to capture sector-level shifts. A wider basket (all listed equities) adds noise from thinly-traded names where sentiment coverage is sparse and unreliable.
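Basket selection along these lines might be sketched as follows. This assumes a long-format volume table with hypothetical columns `ticker`, `date`, and `volume`, and it interprets "prior 30 days" as calendar days, which is an assumption. The function name `select_basket` is illustrative.

```python
import pandas as pd


def select_basket(volume_df: pd.DataFrame, as_of: pd.Timestamp,
                  top_n: int = 100, lookback_days: int = 30) -> list[str]:
    """Top-N tickers by mean daily volume over the lookback ending before `as_of`."""
    start = as_of - pd.Timedelta(days=lookback_days)
    # Strictly before as_of, so the basket uses only data known at selection time.
    in_window = (volume_df["date"] >= start) & (volume_df["date"] < as_of)
    adv = volume_df[in_window].groupby("ticker")["volume"].mean()
    return adv.nlargest(top_n).index.tolist()


# Tiny illustrative table: the 2024-01-10 rows fall outside the lookback window.
volume_df = pd.DataFrame({
    "ticker": ["A", "B", "C"] * 3,
    "date": pd.to_datetime(["2024-01-02"] * 3 + ["2024-01-03"] * 3 + ["2024-01-10"] * 3),
    "volume": [100, 300, 200, 100, 300, 200, 10_000, 1, 1],
})
basket = select_basket(volume_df, pd.Timestamp("2024-01-10"), top_n=2)
print(basket)
```

Excluding the selection date itself keeps the basket free of look-ahead: the volume spike in ticker A on the selection date does not influence membership.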
The rolling window governs the regime classification's sensitivity. A short window (3–5 days) produces a regime indicator that responds quickly to sentiment shifts — useful for capturing rapid deteriorations like a flash crash or a sudden geopolitical event — but also more prone to false positives from brief sentiment dips that do not represent structural regime changes. A longer window (10–20 days) is more stable but has more lag. An exponential moving average with a span of 5 sessions is a reasonable starting point: it weights recent sessions more heavily than a simple moving average while avoiding the sensitivity of a 3-day window.
The daily sentiment average for the basket is computed as the mean score across all covered tickers on each date. Tickers with no coverage on a given day are excluded from the denominator. Absence of sentiment is not the same as neutral sentiment, and counting uncovered tickers as zeros would pull the index toward neutral and mute genuine shifts.
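The effect of the denominator choice can be seen in a small example. The sketch below assumes a wide table of scores (dates as rows, tickers as columns) in which NaN marks a ticker with no coverage that day. The tickers and scores are made up, and the `min_coverage` guard is an optional addition, not something the construction above requires.

```python
import numpy as np
import pandas as pd

# Rows are dates, columns are tickers; NaN means "no coverage that day".
scores = pd.DataFrame(
    {
        "AAA": [0.4, np.nan, -0.2],
        "BBB": [0.2, 0.3, np.nan],
        "CCC": [np.nan, np.nan, -0.4],
    },
    index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"]),
)

# mean() skips NaN by default, so the denominator is the number of covered tickers.
covered_mean = scores.mean(axis=1)

# Treating absence as a neutral 0 shrinks every reading toward zero.
zero_filled_mean = scores.fillna(0).mean(axis=1)

# Optional guard (an assumption): blank out days with too few covered tickers.
min_coverage = 2
coverage = scores.notna().sum(axis=1)
guarded_mean = covered_mean.where(coverage >= min_coverage)

print(pd.DataFrame({"covered": covered_mean, "zero_filled": zero_filled_mean,
                    "coverage": coverage, "guarded": guarded_mean}))
```

On the last date the covered mean is -0.3 while the zero-filled mean is only -0.2, which is the muting effect described above.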
Risk-On vs Risk-Off Classification
Classification translates the continuous sentiment index into a discrete regime state: risk-on, risk-off, or neutral. The threshold values that define these states need to be calibrated against historical data.
A starting framework: a sentiment index above +0.15 is risk-on, below -0.15 is risk-off, and anything in between is neutral. These values are placeholders; calibrate them against the historical sentiment distribution for your basket. The goal is for risk-on periods to include the majority of low-volatility trending and range-bound environments, and for risk-off periods to include the identifiable crisis and correction periods in your backtest data.
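One simple way to tie the thresholds to the historical distribution is to read them off as quantiles of the smoothed index. This is a sketch of one possible calibration approach, not the only one. The function name `calibrate_thresholds` and the default quantiles (15th and 50th percentile) are illustrative assumptions and should be checked against known crisis and calm periods in your own data.

```python
import numpy as np
import pandas as pd


def calibrate_thresholds(index_history: pd.Series,
                         lower_q: float = 0.15,
                         upper_q: float = 0.50) -> tuple[float, float]:
    """Return (risk_off_threshold, risk_on_threshold) as quantiles of the index history."""
    clean = index_history.dropna()
    return float(clean.quantile(lower_q)), float(clean.quantile(upper_q))


# Illustrative history: an evenly spaced grid stands in for real index values.
history = pd.Series(np.linspace(-1.0, 1.0, 101))
risk_off_threshold, risk_on_threshold = calibrate_thresholds(history)
print(risk_off_threshold, risk_on_threshold)
```

With quantile-based thresholds, the share of days in each regime is fixed by construction, so the remaining check is whether the days that land in risk-off are the ones you would recognize as crises and corrections.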
The neutral band is intentional. The regime classification is not a binary on/off — the neutral zone is a genuine state where the expected value of running a strategy is unclear. Some practitioners treat neutral as equivalent to risk-on; others as equivalent to risk-off; others reduce position sizes in neutral but do not exit entirely. The right choice depends on the strategy's sensitivity to regime and the practitioner's preference for false positives (running in a bad regime) versus false negatives (sitting out a good regime).
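The third option, reducing size in neutral without exiting, can be expressed as a per-regime exposure multiplier. The sketch below is illustrative: the mapping `EXPOSURE`, the 0.5 multiplier for neutral, and the function name `scale_by_regime` are assumptions, not recommended values.

```python
from typing import Mapping

import pandas as pd

# Hypothetical exposure per regime; the 0.5 for neutral is purely illustrative.
EXPOSURE: Mapping[str, float] = {"risk_on": 1.0, "neutral": 0.5, "risk_off": 0.0}


def scale_by_regime(signal: pd.Series, regime: pd.Series,
                    exposure: Mapping[str, float] = EXPOSURE) -> pd.Series:
    """Multiply each signal value by the exposure assigned to that date's regime."""
    # Unknown labels or dates with no regime reading get zero exposure (fail closed).
    multiplier = regime.map(dict(exposure)).reindex(signal.index).fillna(0.0)
    return signal * multiplier


dates = pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05"])
signal = pd.Series([1.0, -1.0, 1.0, 1.0], index=dates)
regime = pd.Series(["risk_on", "neutral", "risk_off", "neutral"], index=dates)
print(scale_by_regime(signal, regime))
```

Setting the neutral multiplier to 1.0 or 0.0 recovers the other two treatments mentioned above, so the choice becomes a single parameter.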
Persistent regime states matter more than individual daily classifications. A single day of sub-threshold sentiment in a sustained risk-on period is not a regime change; it is likely a noise event or a brief sector-specific deterioration. Requiring two or three consecutive days below the risk-off threshold before triggering a regime change flag reduces the number of whipsaws at the cost of some lag.
Using the Regime Index as an Execution Gate
The regime index functions as an AND condition on strategy entry. When the regime is risk-on, the strategy executes normally. In the strictest configuration, when the regime is risk-off or neutral, the strategy opens no new positions; existing positions may be held until natural exits or closed immediately, depending on the strategy's exit logic.
This structure is sometimes described as a "master switch" for strategy execution. It is more precise to call it an outer filter: the strategy's signal logic continues to run and generate potential trades, but the regime gate decides whether those signals become executable entries. This has a practical benefit: when the regime returns to risk-on, the strategy immediately begins executing on fresh signals without a warm-up period.
The risk reduction from this approach is asymmetric. As an illustration, a strategy that would have lost 18% in a crisis drawdown might, when gated by a regime filter, avoid 12 percentage points of that loss, at the cost of missing the first 1–2% of the recovery because the sentiment index returns above threshold slightly after the price bottom. The tradeoff is typically favorable: crisis drawdowns are deeper and more sustained than the recovery lag is costly.
The aggregate nature of the signal also means it is relevant across strategies. A practitioner running three independent strategies — a gap-and-go, a mean-reversion, and a momentum strategy — can apply the same regime gate to all three simultaneously. When the sentiment market index enters risk-off, all three strategies pause. This reduces the infrastructure overhead of per-strategy regime logic and ensures consistency across the portfolio.
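Applying one gate to several strategies at once might look like the sketch below. It assumes each strategy's signal is a column in a shared date-indexed table. The column names and the function name `gate_all` are hypothetical.

```python
import pandas as pd


def gate_all(signals: pd.DataFrame, regime: pd.Series) -> pd.DataFrame:
    """Zero every strategy's signal on dates where the regime is not risk-on."""
    # Dates with no regime reading compare as False, so they are gated too.
    allowed = regime.reindex(signals.index).eq("risk_on")
    return signals.where(allowed, 0)


dates = pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"])
signals = pd.DataFrame(
    {"gap_and_go": [1, 1, 0], "mean_reversion": [-1, 1, 1], "momentum": [0, 1, -1]},
    index=dates,
)
regime = pd.Series(["risk_on", "risk_off", "neutral"], index=dates)
gated = gate_all(signals, regime)
print(gated)
```

Because the gate is computed once and broadcast across columns, adding a fourth strategy requires no additional regime logic.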
The Lag Problem
The regime index has inherent lag. Being earlier than price-based signals does not make it instantaneous. Sentiment scores are derived from articles that must be written, published, and ingested, and the ingest pipeline has its own latency. The rolling window smooths out noise but adds delay. A genuine regime shift will appear in the sentiment index one to three sessions after it has begun, and the regime gate will not trigger until the smoothed index crosses the threshold, which adds further sessions of delay on top of the raw signal latency.
The practical implication is that threshold choice trades response speed against false positives. A sensitive threshold, one that requires only mild negative sentiment to trigger risk-off, fires quickly but also produces false positives: periods classified as risk-off that turn out to be brief corrections the strategy could have traded through profitably. A strict threshold requires deeper and more sustained negative sentiment before triggering, which reduces false positives at the cost of a slower response to genuine regime shifts.
The asymmetry of consequences should guide the calibration. A false positive (entering risk-off during a correction that quickly recovers) costs missed trades. A false negative (staying in risk-on during a genuine crisis) costs drawdown. Drawdown in an automated system is typically more damaging than missed trades: to the strategy's risk profile, to capital preservation, and to the practitioner's ability to continue operating. When uncertain, err conservative in the capital-preservation sense: accept some missed trades in exchange for a gate that does not react too late.
Python Implementation
import pandas as pd


def build_sentiment_index(
    sentiment_df: pd.DataFrame,  # columns: ticker, date, score
    top_n: int = 100,            # not applied here: pre-filter sentiment_df to the top-N tickers by volume
    window: int = 5,             # EMA span in sessions
) -> pd.Series:
    # Mean over tickers that have a score on each date; groupby returns dates in sorted order.
    daily_avg = sentiment_df.groupby('date')['score'].mean()
    return daily_avg.ewm(span=window).mean()


def classify_regime(
    sentiment_index: pd.Series,
    risk_on_threshold: float = 0.15,
    risk_off_threshold: float = -0.15,
) -> pd.Series:
    # Start from neutral, then overwrite the dates that cross either threshold.
    regime = pd.Series('neutral', index=sentiment_index.index)
    regime[sentiment_index >= risk_on_threshold] = 'risk_on'
    regime[sentiment_index <= risk_off_threshold] = 'risk_off'
    return regime


def apply_regime_gate(signal: pd.Series, regime: pd.Series) -> pd.Series:
    # Zero out the signal wherever the regime is not risk-on.
    return signal.where(regime == 'risk_on', 0)
The build_sentiment_index function computes the daily average across the basket and applies exponential smoothing. In a production system you would filter sentiment_df to the top-N tickers by volume before passing it to this function.
The classify_regime function is threshold-based and operates on the smoothed index. The persistence requirement described earlier, which asks for consecutive sub-threshold readings rather than a single-session crossing, can be implemented by adding a rolling(3).sum() step: require three consecutive risk-off days before activating the flag.
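The rolling-sum persistence check can be sketched as follows. This is a minimal illustration with made-up index values. The function name `confirmed_risk_off` and the `min_days` parameter are hypothetical, and the threshold comparison mirrors the `<=` used in classify_regime.

```python
import pandas as pd


def confirmed_risk_off(sentiment_index: pd.Series,
                       risk_off_threshold: float = -0.15,
                       min_days: int = 3) -> pd.Series:
    """True once the index has been at or below the threshold for min_days sessions in a row."""
    below = (sentiment_index <= risk_off_threshold).astype(int)
    # The rolling sum equals min_days only if every session in the window was below threshold.
    # The first min_days - 1 sessions have an incomplete window and evaluate to False.
    return below.rolling(min_days).sum().eq(min_days)


index_values = pd.Series([-0.2, -0.2, 0.0, -0.2, -0.2, -0.2, -0.3, 0.1])
flag = confirmed_risk_off(index_values)
print(flag.tolist())
```

In this example the two-session dip at the start never activates the flag, while the four-session run later does, starting on its third session. The cost is visible too: the flag lags the first sub-threshold reading of that run by two sessions.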
apply_regime_gate zeroes out the strategy signal series wherever the regime is not risk-on. This preserves the signal computation pipeline while preventing execution in bad regimes.
The news sentiment fundamentals article covers the structural reasons why sentiment signals carry information that price-based signals do not. The regime index is one application of that information layer at a market-wide scale.
The Oyamori Approach
Oyamori's sentiment infrastructure is built to support both single-stock and aggregate sentiment use cases. The Newsvibe index — a market-wide sentiment aggregate computed from the platform's news coverage — is available as a regime signal that strategy logic can query directly.
The design intent is to make regime-awareness a default property of strategies built on the platform, rather than an optional add-on. When a strategy is configured with a regime gate, the platform evaluates the current sentiment index state before routing any entry signals to the execution layer. The strategy author sets the thresholds; the platform enforces them consistently across all entry conditions.
The regime index also feeds the platform's monitoring layer. When the sentiment market index crosses into risk-off territory, the monitoring dashboard surfaces this as a contextual annotation — not an alert, but a visible state indicator that gives the practitioner context for understanding why their strategies are paused and for how long the current regime has been active.
Algorithmic trading carries substantial risk. This article is educational, not investment advice.