How Newsvibe Works
Most sentiment tools hand you a number and ask you to trust it. You receive a score, perhaps a direction label, sometimes a confidence percentage — but no explanation of how those values were derived. When the signal is wrong, you have nothing to audit. When it is right, you cannot tell whether the system understood the news or got lucky.
Newsvibe is designed differently. Every field in the output payload corresponds to a specific processing step with a defined meaning. This article walks through each step in order: how news enters the system, how it is scored, how urgency and confidence are derived, and what a developer actually receives when the signal fires. Understanding the anatomy of the signal is a prerequisite for using it correctly.
News Ingestion
Newsvibe ingests from four primary source categories: financial news wires, earnings releases and transcript services, SEC filings, and macro announcements.
Financial news wires include the real-time feeds that publish breaking corporate and market news — earnings previews, analyst rating changes, product announcements, merger disclosures. These arrive at high frequency and require deduplication at the content level, not at the URL level alone, because the same story is often published across multiple outlets within minutes of each other.
Earnings releases and transcripts arrive on a schedule tied to the earnings calendar. When a company reports, both the press release and the prepared remarks from the earnings call are ingested. Transcript processing is separated from wire processing because the information density and structure differ significantly — an earnings call contains forward guidance, analyst Q&A, and management commentary that carry different signal weights than the headline EPS number.
SEC filings (8-K, 10-Q, 10-K, and proxy filings) are ingested on a document-type-specific schedule. 8-K filings are treated as breaking events because they disclose material information to the public for the first time. 10-Q and 10-K filings are scheduled and predictable, so they receive less urgency weighting but more depth of analysis.
Macro announcements — Federal Reserve decisions, CPI releases, non-farm payroll, GDP revisions — are tracked on a fixed economic calendar. These events affect broad sentiment across sectors rather than individual tickers, so their output maps to sector-level and market-wide signals rather than single-company scores.
Before scoring begins, each ingested document passes through two filters. Deduplication removes near-duplicate content across sources using a hash-based comparison on the normalized article body, not the headline alone. Source quality scoring applies a credibility weight based on the source's historical accuracy for the relevant company and sector — a regional business publication covering a local company may have better forward-looking accuracy than a national outlet that rarely covers that company directly.
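A minimal sketch of body-level deduplication, written under stated assumptions: the normalization rules, the `body` and `url` keys, and the example URLs are all hypothetical, and nothing here is confirmed as Newsvibe's actual pipeline. It hashes the normalized article body, so syndicated copies that differ only in case, punctuation, or spacing collapse to one document. Copies with edited wording would need a fuzzier fingerprint than an exact hash.

```python
import hashlib
import re


def normalize_body(body: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = body.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def body_fingerprint(body: str) -> str:
    """SHA-256 digest of the normalized body (the headline is ignored)."""
    return hashlib.sha256(normalize_body(body).encode("utf-8")).hexdigest()


def deduplicate(documents: list[dict]) -> list[dict]:
    """Keep the first document seen for each body fingerprint."""
    seen: set[str] = set()
    unique = []
    for doc in documents:
        fingerprint = body_fingerprint(doc["body"])
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(doc)
    return unique


# Hypothetical documents: the first two are the same story from two outlets.
docs = [
    {"url": "https://wire-a.example/1", "body": "Acme beats Q2 estimates; raises guidance."},
    {"url": "https://wire-b.example/9", "body": "ACME beats Q2 estimates, raises  guidance"},
    {"url": "https://wire-c.example/4", "body": "Acme misses Q2 estimates."},
]
unique_docs = deduplicate(docs)
```

Keying on the body instead of the URL is what lets the second outlet's copy be dropped even though its address and formatting differ.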
Sentiment Scoring
Scoring operates on the deduplicated, quality-weighted document. The process has three sub-steps: ticker attribution, event categorization, and base sentiment scoring.
Ticker attribution answers the question: which company or companies does this document affect, and how directly? A document may mention a dozen tickers. Attribution determines whether each mention is the subject of the news (primary attribution, weight 1.0), a peer or competitor referenced for comparison (secondary attribution, weight 0.4), or a tangential mention in an industry context (tertiary attribution, weight 0.1). The same document can produce signals for multiple tickers at different attribution weights.
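The attribution weights can be sketched as a simple lookup. The weights (1.0, 0.4, 0.1) come from the text above; the `Attribution` type, the `attribute` function, and the idea that an upstream classifier hands over a ticker-to-role mapping are assumptions made for illustration.

```python
from dataclasses import dataclass

# Weights as stated in the article.
ATTRIBUTION_WEIGHTS = {"primary": 1.0, "secondary": 0.4, "tertiary": 0.1}


@dataclass(frozen=True)
class Attribution:
    ticker: str
    role: str
    attribution_weight: float


def attribute(mentions: dict[str, str]) -> list[Attribution]:
    """Map each mentioned ticker to its weight.

    `mentions` maps ticker -> role, as decided by a hypothetical upstream
    classifier that is not shown here.
    """
    results = []
    for ticker, role in mentions.items():
        if role not in ATTRIBUTION_WEIGHTS:
            raise ValueError(f"unknown attribution role: {role!r}")
        results.append(Attribution(ticker, role, ATTRIBUTION_WEIGHTS[role]))
    # Strongest attribution first.
    return sorted(results, key=lambda a: a.attribution_weight, reverse=True)


# One document about AAPL that compares it to MSFT and mentions QCOM in passing.
attributions = attribute({"MSFT": "secondary", "AAPL": "primary", "QCOM": "tertiary"})
```

Each entry would become its own signal for that ticker, which is how one document can yield several signals at different weights.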
Event categorization classifies the primary event type in the document. The categories used by Newsvibe are: earnings_beat, earnings_miss, earnings_in_line, guidance_raise, guidance_cut, analyst_upgrade, analyst_downgrade, macro_positive, macro_negative, ma_activity, regulatory_approval, regulatory_rejection, executive_change, and sector_news. Event type is used downstream for routing — a strategy that trades only on analyst actions receives only analyst_upgrade and analyst_downgrade events, not earnings events.
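Routing by event type might look like the sketch below. The category names are the ones listed above; `SignalRouter`, its methods, and the strategy names are hypothetical and only illustrate the subscription idea.

```python
# Event categories as listed in the article.
EVENT_TYPES = frozenset({
    "earnings_beat", "earnings_miss", "earnings_in_line",
    "guidance_raise", "guidance_cut",
    "analyst_upgrade", "analyst_downgrade",
    "macro_positive", "macro_negative",
    "ma_activity", "regulatory_approval", "regulatory_rejection",
    "executive_change", "sector_news",
})


class SignalRouter:
    """Hypothetical router: strategies subscribe to a set of event types."""

    def __init__(self) -> None:
        self._subscriptions: dict[str, frozenset[str]] = {}

    def subscribe(self, strategy: str, event_types: set[str]) -> None:
        unknown = set(event_types) - EVENT_TYPES
        if unknown:
            raise ValueError(f"unknown event types: {sorted(unknown)}")
        self._subscriptions[strategy] = frozenset(event_types)

    def route(self, signal: dict) -> list[str]:
        """Return the strategies subscribed to this signal's event type."""
        event_type = signal["event_type"]
        return sorted(
            name for name, types in self._subscriptions.items()
            if event_type in types
        )


router = SignalRouter()
router.subscribe("analyst_actions", {"analyst_upgrade", "analyst_downgrade"})
router.subscribe("earnings", {"earnings_beat", "earnings_miss", "earnings_in_line"})

upgrade_targets = router.route({"ticker": "AAPL", "event_type": "analyst_upgrade"})
earnings_targets = router.route({"ticker": "AAPL", "event_type": "earnings_beat"})
```

Validating subscriptions against the known category set catches a misspelled event type at subscription time instead of silently routing nothing.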
The base sentiment score is the output of the NLP model applied to the document after categorization. The score runs from -1.0 (maximally negative) to +1.0 (maximally positive), with 0 indicating neutral or balanced sentiment. The NLP model is trained on financial text specifically — it is not a general-purpose sentiment classifier applied to finance. General-purpose models misclassify financial language regularly because the polarity of words in financial contexts differs from everyday usage. The word "volatile" is neutral in common speech; in a financial document covering an earnings call, it is consistently negative.
How this signal layer connects to actual strategy design is covered in the companion article on news sentiment and algo trading.
Urgency and Confidence
The base sentiment score alone is insufficient for trading signal generation. Two additional dimensions are required: urgency and confidence.
Urgency measures how time-sensitive the signal is. A breaking 8-K filing that discloses a major acquisition has high urgency — the information is new, the price has not yet moved to reflect it, and the signal value decays quickly as the market processes the news over the following minutes and hours. An analyst upgrade published in a morning note at 7:00 AM has lower urgency — the analyst's view was likely known to institutional clients the day before, and the signal represents information that has been partially priced in. Urgency is a function of the source type, the event type, and the elapsed time since the primary disclosure.
Urgency is output as one of four levels: breaking, high, medium, or low. Strategies that require fresh information set a minimum urgency threshold. Strategies that trade slower, multi-day trends can accept low-urgency signals without penalty.
Confidence measures how clearly the document expresses a directional sentiment. High confidence means the document contains unambiguous language expressing a single direction — strong positive language in an earnings beat with raised guidance, or explicit negative language in a regulatory rejection notice. Low confidence means the document contains mixed signals, hedged language, or contradictory claims that the model cannot resolve to a clean direction. The confidence score runs from 0.0 to 1.0. A score below 0.5 indicates that the sentiment direction is uncertain; strategies should treat such signals as low-weight or exclude them entirely.
Confidence is the primary filter for noise. A base sentiment score of +0.8 with confidence 0.3 is a noisy signal that should not drive position sizing. The same score with confidence 0.9 is a high-quality directional signal.
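A strategy-side gate on urgency and confidence could be sketched as follows. The four urgency levels and the 0.5 confidence floor come from the text; the numeric ranking, the `passes_gate` function, and its default thresholds are assumptions that each strategy would tune for itself.

```python
# Ordering of the four urgency levels, lowest to highest (ranks are illustrative).
URGENCY_RANK = {"low": 0, "medium": 1, "high": 2, "breaking": 3}


def passes_gate(signal: dict, min_urgency: str = "medium",
                min_confidence: float = 0.5) -> bool:
    """True if the signal meets both the urgency and the confidence floor."""
    urgent_enough = URGENCY_RANK[signal["urgency"]] >= URGENCY_RANK[min_urgency]
    confident_enough = signal["confidence"] >= min_confidence
    return urgent_enough and confident_enough


# Same score, different confidence: only the second should drive sizing.
noisy = {"score": 0.8, "urgency": "high", "confidence": 0.3}
clean = {"score": 0.8, "urgency": "high", "confidence": 0.9}

noisy_ok = passes_gate(noisy)
clean_ok = passes_gate(clean)
# A slower, multi-day strategy can lower the urgency floor instead.
slow_ok = passes_gate({"score": 0.6, "urgency": "low", "confidence": 0.8},
                      min_urgency="low")
```

The score itself never enters the gate, which reflects the point above: a strong score with weak confidence is still noise.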
Signal Anatomy — What a Developer Receives
Each fired signal is a structured JSON payload. Here is an example:
{
  "ticker": "AAPL",
  "score": 0.72,
  "tier": 4,
  "urgency": "high",
  "confidence": 0.85,
  "event_type": "earnings_beat",
  "source": "SEC Filing",
  "timestamp": "2026-05-18T06:32:11Z"
}
ticker — The primary attributed company for this signal. Signals with secondary or tertiary ticker attribution carry a separate attribution_weight field indicating their relationship to the primary subject.
score — The base sentiment score from -1.0 to +1.0. Positive values indicate net-positive sentiment for the attributed ticker. Negative values indicate net-negative. Values between -0.3 and +0.3 are in a near-neutral band where directional reliability is lower.
tier — A five-level classification of signal strength and source quality: Tier 1 is the lowest (low score, low confidence, low-quality source); Tier 5 is the highest (strong score, high confidence, primary-source disclosure). Tier 4 and 5 signals are suitable for direct strategy input. Tier 1 and 2 signals are informational only. Most strategies set a minimum tier threshold at Tier 3.
urgency — Time sensitivity of the signal: breaking, high, medium, or low. Strategies using intraday execution weight high and breaking urgency signals at higher priority.
confidence — The model's confidence in the directional score, from 0.0 to 1.0. A confidence below 0.6 should trigger reduced position sizing or exclusion in most strategies.
event_type — The categorized event type. Used for signal routing to relevant strategies. A strategy trading only regulatory events subscribes to regulatory_approval and regulatory_rejection only.
source — The originating source category: SEC Filing, Earnings Transcript, News Wire, Analyst Report, or Macro Announcement. Source type affects the default urgency and trust weight applied.
timestamp — UTC ISO 8601 timestamp of the original document publication, not the time Newsvibe processed it. Signal age is computed from this field. Strategies define a maximum signal age beyond which the signal is discarded regardless of score or tier.
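On the consuming side, the payload can be parsed into a typed record and checked against strategy thresholds. This is a sketch under assumptions: the `Signal` class, `parse_signal`, `is_actionable`, and the 15-minute maximum age are hypothetical, while the field names, value ranges, and the Tier 3 and 0.6 defaults follow the descriptions above.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Signal:
    ticker: str
    score: float
    tier: int
    urgency: str
    confidence: float
    event_type: str
    source: str
    timestamp: datetime  # publication time of the source document, in UTC


def parse_signal(raw: str) -> Signal:
    """Parse the JSON payload and check the documented value ranges."""
    data = json.loads(raw)
    signal = Signal(
        ticker=data["ticker"],
        score=float(data["score"]),
        tier=int(data["tier"]),
        urgency=data["urgency"],
        confidence=float(data["confidence"]),
        event_type=data["event_type"],
        source=data["source"],
        # Replace the trailing "Z" so older Python versions accept the string.
        timestamp=datetime.fromisoformat(data["timestamp"].replace("Z", "+00:00")),
    )
    if not -1.0 <= signal.score <= 1.0:
        raise ValueError("score must be in [-1.0, 1.0]")
    if not 1 <= signal.tier <= 5:
        raise ValueError("tier must be in 1..5")
    if not 0.0 <= signal.confidence <= 1.0:
        raise ValueError("confidence must be in [0.0, 1.0]")
    return signal


def is_actionable(signal: Signal, now: datetime, min_tier: int = 3,
                  min_confidence: float = 0.6,
                  max_age: timedelta = timedelta(minutes=15)) -> bool:
    """Apply tier, confidence, and signal-age cutoffs (max_age is illustrative)."""
    age = now - signal.timestamp  # age is measured from publication time
    return (signal.tier >= min_tier
            and signal.confidence >= min_confidence
            and timedelta(0) <= age <= max_age)


RAW = """{"ticker": "AAPL", "score": 0.72, "tier": 4, "urgency": "high",
"confidence": 0.85, "event_type": "earnings_beat", "source": "SEC Filing",
"timestamp": "2026-05-18T06:32:11Z"}"""

signal = parse_signal(RAW)
fresh_now = datetime(2026, 5, 18, 6, 40, tzinfo=timezone.utc)
stale_now = fresh_now + timedelta(hours=1)
```

Because the age check uses the publication timestamp, a signal that was delayed in processing is discarded on the same schedule as one that arrived promptly.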
Where NLP Fails
No NLP-based sentiment system is accurate at the edges of language. The failure modes are predictable, which makes them manageable — but they must be acknowledged.
Sarcasm and irony do not translate well to sentiment models trained on financial text. Financial writing rarely uses them, which means the training corpus underrepresents these patterns. A headline written sarcastically — the sort that appears in financial opinion journalism — may score opposite to its actual meaning. The confidence score provides a partial defense: sarcastic content tends to produce lower confidence because the directional signals are inconsistent. But this is a heuristic, not a guarantee.
Conflicting reports on the same event are a sharper problem. When two credible sources report opposite interpretations of the same event simultaneously — both quoting legitimate experts — the model scores each article independently and produces signals that point in opposite directions. The net score averages out near zero, which is the correct statistical behavior, but it may not reflect that one source is more authoritative than the other. Source quality weighting mitigates this but does not eliminate it entirely.
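The arithmetic behind this can be illustrated with a quality-weighted mean. The article does not specify how Newsvibe aggregates conflicting signals, so the formula, the `net_score` function, and the example weights are assumptions chosen only to show the effect.

```python
def net_score(signals: list[tuple[float, float]]) -> float:
    """Quality-weighted mean of (score, source_quality_weight) pairs."""
    total_weight = sum(weight for _, weight in signals)
    if total_weight <= 0:
        raise ValueError("total source weight must be positive")
    return sum(score * weight for score, weight in signals) / total_weight


# Two equally trusted sources with opposite readings cancel out.
equal_trust = net_score([(0.7, 1.0), (-0.7, 1.0)])

# With unequal (hypothetical) quality weights, the more trusted source dominates.
unequal_trust = net_score([(0.7, 0.9), (-0.7, 0.3)])
```

With equal weights the net score is zero even though each article was strongly directional. Unequal weights tilt the result toward the more trusted source, but only as far as the weights themselves are accurate.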
Domain-specific jargon in biotech and cryptocurrency requires specialized vocabulary handling. A drug described as having "disappointing Phase 2 results with a statistically significant primary endpoint miss" is unambiguously negative — but only to a reader who understands clinical trial language. General financial NLP models may score "statistically significant" as positive and miss the negative context. Newsvibe maintains domain-specific vocabulary layers for pharma, biotech, and crypto, but the jargon evolves faster than training cycles. Edge cases occur.
Headline-to-body divergence is a structural limitation that affects all article-based sentiment systems. A headline may be written to attract clicks with an alarming or optimistic framing that is contradicted or heavily qualified by the article body. Newsvibe weights article body more heavily than headline for most source types, but headline-only processing is sometimes unavoidable for very short documents. When headline and body diverge, confidence scores are penalized, but the resulting signal may still be directionally misleading.
The Oyamori Approach
Transparency is a design constraint, not a feature addition. Black-box sentiment tools — Trade Ideas Holly, Bloomberg Intelligence — produce outputs without exposing their derivation. That is acceptable when you are using the signal as a suggestion. It is not acceptable when the signal is wired into automated execution. When a strategy makes a wrong trade based on a wrong signal, the developer needs to be able to trace the error back to its source.
Every Newsvibe field is auditable. The event_type tells you what the model classified. The confidence tells you how clearly it read the document. The source tells you which input generated the signal. When a signal fires incorrectly, you have a complete record to examine. This is a different design philosophy from tools that optimize for apparent accuracy at the expense of interpretability.
The limitation of any NLP-based system is that language is harder to parse than price. Price is a number with a clear direction. Language has context, history, tone, and domain-specific meaning that numerical models approximate rather than fully capture. Knowing which edge cases break those approximations is what allows a developer to build strategies that treat Newsvibe signals as high-quality inputs rather than black-box oracles.
Newsvibe integrates directly into the Oyamori execution layer. The signal payload is the contract between the sentiment engine and the strategy. Understanding each field means you can use the contract correctly rather than trusting it blindly.
Algorithmic trading carries substantial risk. This article is educational, not investment advice.