Most traders approach edge discovery backwards. They look at price charts, find a pattern that repeats a few times, and declare it an edge. Then they backtest it, see a positive return curve, and start trading it. The pattern fails within weeks. The trader never understands why it failed, because the edge was never real in the first place.

The correct process runs in the opposite direction. You start with a mechanism — a reason the market should be inefficient in a specific, predictable way — then you test whether that mechanism is observable in price data. The mechanism is the hypothesis. The backtest is the test. The distinction matters enormously.

Before building any signal, it is worth being precise about what a trading edge is: not a pattern, but a repeatable structural, behavioral, or informational inefficiency. That foundation shapes everything that follows.

Where Edges Come From

Markets are efficient enough that random patterns quickly disappear under the weight of capital trying to exploit them. Genuine edges survive because they are rooted in one of three persistent sources of inefficiency.

Behavioral edges arise from cognitive patterns that repeat across market participants. Humans overreact to recent news, anchor to round numbers, and exhibit loss aversion at predictable thresholds. These behaviors are not going away: they are features of human cognition, not quirks of a particular market period. Momentum anomalies, post-earnings drift, and short-term mean reversion after sharp selloffs all have behavioral roots. Because the mechanism is tied to human psychology rather than market structure, behavioral edges tend to be more durable than structural ones.

Structural edges come from the mechanics of how markets operate. Index funds must buy stocks added to the index on the inclusion date; the flow is predictable and the price pressure is real. Options market makers hedge delta exposure by trading the underlying, creating mechanical price pressure around major strikes near expiry. ETF creation and redemption arbitrage produces specific patterns in correlated assets. Structural edges are often more precise — the mechanism is clearer and the timing is tighter — but they can be eliminated if the structural constraint changes.

Informational edges arise from processing data faster or more completely than other participants. Alternative data — satellite imagery, credit card transaction flows, web traffic, natural language processing of regulatory filings — can provide a signal before the consensus has formed. These edges are expensive to source and erode as the data becomes commoditized, but they represent a real class of inefficiency.

These three categories are not mutually exclusive. Post-earnings drift has both behavioral roots (investors underreact to fundamental information) and structural elements (analyst estimate revisions lag the event). Understanding which source is driving the edge tells you how stable it is likely to be.

Why Most Edge Discovery Fails

The failure mode is almost always the same: the discovery process starts with the data and works backwards to a narrative.

A trader looks at historical prices, tries 50 indicator combinations, finds the three that performed best, and declares an edge. The problem is that among 50 tests of pure noise, some will look good by chance: at a 5% significance level you should expect two or three false positives, and if the tests are roughly independent the probability of at least one is about 92%. The trader has not found an edge. They have found the best noise in the sample.
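To make the point concrete, here is a minimal simulation sketch. It assumes each "strategy" is just a year of zero-mean Gaussian daily returns, so none of them has any edge by construction. The function name, the 1% daily volatility, and the seed are illustrative choices, not part of any real backtest.

```python
import random
import statistics

def noise_strategy_sharpes(n_strategies=50, n_days=252, seed=0):
    """Annualized Sharpe ratios of strategies that have no edge by construction."""
    rng = random.Random(seed)
    sharpes = []
    for _ in range(n_strategies):
        # Zero-mean daily returns: any positive result is luck, not signal.
        returns = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        mean = statistics.fmean(returns)
        stdev = statistics.stdev(returns)
        sharpes.append(mean / stdev * 252 ** 0.5)
    return sorted(sharpes, reverse=True)

sharpes = noise_strategy_sharpes()
print("Top three Sharpe ratios from pure noise:", [round(s, 2) for s in sharpes[:3]])
```

Picking the top three from this list is the same operation as picking the three best indicator combinations: the selection step alone manufactures an attractive-looking result.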

This is called data mining bias, and a longer backtest does not cure it. A longer backtest feels more reliable, but if the same dataset was used to generate and test the hypothesis, the validation is circular no matter how many years it covers. The strategy has been fit to the data, not validated against it.

A related failure is the overfitted mechanism. The trader invents a post-hoc explanation for a pattern that was discovered empirically. The explanation feels plausible, so it gets accepted. But the true mechanism — if there is one — would have generated the hypothesis before the data was examined, not after.

The test for whether your edge is real or an artifact is simple: form the hypothesis before you look at the data. If you cannot articulate the mechanism clearly enough to make a directional prediction before seeing the price series, you are working in the wrong order.

The Edge Discovery Process

The process that consistently separates real edges from artifacts follows the structure of a scientific experiment. It has four distinct stages that must be executed in order.

Stage 1 — Form a hypothesis with a stated mechanism. Write the hypothesis in two parts: the claim (what price behavior you expect to observe) and the mechanism (why that behavior should exist, in terms of market structure or human behavior). Example: "After strongly positive earnings surprises, price drifts upward over the following 5 to 15 trading days because analyst consensus revisions and institutional rebalancing are slow relative to the initial price reaction." Both parts must be written before any data is examined.
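One way to enforce the two-part structure is to record the hypothesis as an immutable, timestamped object before any data is loaded. The sketch below is an assumption about how that might look; the `Hypothesis` class and its field names are hypothetical, not part of any platform API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class Hypothesis:
    claim: str      # what price behavior is expected
    mechanism: str  # why that behavior should exist
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self):
        # Refuse a hypothesis that is missing either half.
        if not self.claim.strip() or not self.mechanism.strip():
            raise ValueError("Both claim and mechanism must be written down")

pead = Hypothesis(
    claim="After strongly positive earnings surprises, price drifts upward "
          "over the following 5 to 15 trading days",
    mechanism="Analyst consensus revisions and institutional rebalancing are "
              "slow relative to the initial price reaction",
)
print(pead.recorded_at.isoformat(), "-", pead.claim)
```

Because the dataclass is frozen, the claim cannot be quietly reworded after the results come in; a changed hypothesis has to be recorded as a new object with a new timestamp.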

Stage 2 — Test on in-sample data. Define a specific date range as your in-sample period. Run the test on that range only. Measure the relevant statistics: edge frequency, average gain per instance, variance, and performance during drawdown periods. If the hypothesis is not supported in-sample, stop. Do not adjust parameters to make it work — that is fitting, not testing. If it is supported, move to stage three.

Stage 3 — Validate on out-of-sample data. Reserve a second, completely separate date range that was not examined during hypothesis formation or in-sample testing. Run the same strategy, with the same parameters, on this unseen data. A strategy that works in-sample and degrades significantly out-of-sample is almost certainly overfit. A strategy that holds up — even with reduced performance, which is expected — is showing genuine signal.
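The split between Stage 2 and Stage 3 can be sketched as a hard date cutoff with the same metric computed on each side. This is a minimal illustration that assumes per-period strategy returns are already available as (date, return) pairs and that the risk-free rate is zero; the helper names and the sample numbers are made up for the example.

```python
import statistics
from datetime import date

def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    sd = statistics.stdev(returns)
    if sd == 0:
        return 0.0
    return statistics.fmean(returns) / sd * periods_per_year ** 0.5

def split_by_date(dated_returns, cutoff):
    """In-sample is strictly before the cutoff; out-of-sample is on or after it."""
    in_sample = [r for d, r in dated_returns if d < cutoff]
    out_sample = [r for d, r in dated_returns if d >= cutoff]
    return in_sample, out_sample

# Hypothetical strategy returns, for illustration only.
history = [
    (date(2020, 1, 2), 0.010), (date(2020, 1, 3), -0.004),
    (date(2020, 1, 6), 0.006), (date(2021, 1, 4), 0.003),
    (date(2021, 1, 5), -0.002), (date(2021, 1, 6), 0.004),
]
in_sample, out_sample = split_by_date(history, cutoff=date(2021, 1, 1))
print("in-sample Sharpe:", round(sharpe(in_sample), 2))
print("out-of-sample Sharpe:", round(sharpe(out_sample), 2))
```

The cutoff should be fixed before Stage 2 begins, and the out-of-sample list should not be opened until the in-sample parameters are final.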

Stage 4 — Document the mechanism and its falsification conditions. Write down, precisely, what conditions would indicate the edge has stopped working. This could be a statistical threshold (edge frequency drops below X over a rolling N-period window), a market condition (the structural constraint that creates the edge is removed), or a behavioral shift. This documentation is not optional. Without it, you cannot distinguish between a dead edge and a temporary drawdown when performance softens.
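A statistical falsification condition of the "drops below X over a rolling N-period window" kind can be written down as code at the same time it is documented. The sketch below interprets edge frequency as the share of winning trades, which is an assumption; the class name, window length, and threshold are hypothetical.

```python
from collections import deque

class FalsificationMonitor:
    """Flags an edge when its rolling hit rate falls below a documented threshold."""

    def __init__(self, window, min_hit_rate):
        self.window = window
        self.min_hit_rate = min_hit_rate
        self._outcomes = deque(maxlen=window)  # keeps only the last `window` trades

    def record(self, trade_return):
        """Record one trade; return True if the falsification condition is met."""
        self._outcomes.append(trade_return > 0)
        if len(self._outcomes) < self.window:
            return False  # not enough trades yet to judge
        hit_rate = sum(self._outcomes) / self.window
        return hit_rate < self.min_hit_rate

monitor = FalsificationMonitor(window=4, min_hit_rate=0.5)
trade_returns = [0.01, 0.02, -0.01, 0.01, -0.02, -0.01, -0.03]
flags = [monitor.record(r) for r in trade_returns]
print(flags)
```

Fixing the window and threshold in advance is what lets the monitor separate a dead edge from an ordinary drawdown; choosing them after performance softens would reintroduce the fitting problem.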

When You've Found Something Real

A genuine edge will show a specific profile that is different from a backtest artifact.

The out-of-sample Sharpe ratio will be lower than the in-sample Sharpe ratio, but still positive. This is expected — you are measuring the edge's signal, not the noise you fit to. A result where the out-of-sample ratio is higher than the in-sample ratio is suspicious: it usually means the out-of-sample period happened to be unusually favorable, not that the edge is stronger than it appeared.
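The reading of the two Sharpe ratios described above can be expressed as a small decision rule. This is only a sketch of the heuristic in the text, with hypothetical labels; it does not replace judgment about how favorable the out-of-sample period was.

```python
def classify_out_of_sample(in_sample_sharpe, out_of_sample_sharpe):
    """Rough interpretation of an in-sample versus out-of-sample comparison."""
    if out_of_sample_sharpe <= 0:
        return "no out-of-sample signal"
    if out_of_sample_sharpe > in_sample_sharpe:
        return "suspicious: out-of-sample period may have been unusually favorable"
    return "consistent with a genuine edge"

print(classify_out_of_sample(1.4, 0.8))
print(classify_out_of_sample(1.4, 1.9))
print(classify_out_of_sample(1.4, -0.2))
```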

The edge should be explainable to a skeptic without reference to the backtest results. If your only evidence that the mechanism exists is that the backtest shows positive returns, you have a circular argument. The mechanism needs independent support — academic literature, market structure documentation, observable behavioral data, or logical necessity given known constraints.

Parameter sensitivity should be low. A real edge works across a reasonable range of parameter values. An edge that produces positive returns only when the lookback window is exactly 23 periods, the entry threshold is exactly 0.74, and the exit is triggered on the close rather than the open is almost certainly overfit. Real signal is robust to small parameter changes. Noise is brittle.
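A simple way to check this is to re-evaluate the strategy at parameter values around the chosen one and see what share of them still score positively. The sketch below assumes a single numeric parameter and an `evaluate` callable that returns a performance score; both toy scoring functions are invented stand-ins for a real backtest.

```python
def neighborhood_robustness(evaluate, center, step, radius=2):
    """Share of parameter values around `center` where `evaluate` scores above zero."""
    values = [center + k * step for k in range(-radius, radius + 1)]
    scores = [evaluate(v) for v in values]
    return sum(s > 0 for s in scores) / len(scores)

# Toy stand-ins for a backtest: one degrades smoothly, one works at a single value.
def broad_edge(window):
    return 1.0 - abs(window - 20) / 20

def brittle_edge(window):
    return 1.0 if window == 23 else -0.2

robust_share = neighborhood_robustness(broad_edge, center=23, step=1)
brittle_share = neighborhood_robustness(brittle_edge, center=23, step=1)
print("broad:", robust_share, "brittle:", brittle_share)
```

A share near 1.0 means the neighbors agree with the chosen value; a share near the reciprocal of the number of values tested means only the chosen value works.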

Transaction costs should not eliminate the edge when estimated realistically. Include spread, slippage proportional to your expected position size, and any holding costs. Many edges that look strong gross of costs are neutral or negative net of costs at realistic trade sizes.
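The arithmetic is worth doing explicitly. The sketch below assumes all quantities are expressed as fractions of notional, that one full spread is paid per round trip, and that slippage grows linearly with position size on both entry and exit. The linear slippage model and every number in the example are illustrative assumptions.

```python
def net_return(gross, spread, slippage_per_unit, position_size,
               holding_cost_per_day=0.0, holding_days=0):
    """Per-trade return after spread, size-dependent slippage, and holding costs."""
    slippage = 2 * slippage_per_unit * position_size  # entry plus exit
    holding = holding_cost_per_day * holding_days
    return gross - spread - slippage - holding

# Same gross edge of 40 basis points per trade at two position sizes.
small = net_return(0.0040, 0.0010, 0.0005, position_size=2,
                   holding_cost_per_day=0.00002, holding_days=10)
large = net_return(0.0040, 0.0010, 0.0005, position_size=4,
                   holding_cost_per_day=0.00002, holding_days=10)
print(f"net at size 2: {small:.4f}, net at size 4: {large:.4f}")
```

In this example the same gross edge stays positive at the smaller size and turns negative at the larger one, which is why costs need to be estimated at the size you actually intend to trade.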

The Oyamori Approach

Oyamori applies this framework systematically across a catalog of edges. Each entry in the catalog includes the stated mechanism, the market conditions under which the edge is expected to be active, and the falsification criteria that would indicate the edge is no longer valid.

The platform does not generate edge hypotheses for you. Hypothesis generation requires judgment about market structure and behavior — it is the part of the process that cannot be automated without reproducing the same data-mining errors that produce false edges at scale. What Oyamori does is provide the infrastructure to test hypotheses rigorously, track out-of-sample performance against in-sample benchmarks, and monitor live deployments against the documented falsification conditions.

Finding an edge and deploying one are two different problems. A well-tested hypothesis tells you there is a signal worth pursuing. Deploying it requires position sizing, risk management, and operational infrastructure that determines whether the signal translates to real returns. The edge discovery process described here is the beginning of that chain, not the end.
