
Codex Modular Signal Architecture Requirements

Requirements for the baseline ensemble model: LightGBM, gradient boosting, feature importance, stacking architecture, and benchmark targets.

December 15, 2025·9 min read

Timestamp: 2026-04-23 16:10:00 EST
Author: codex
Purpose: requirements baseline for a modular, database-integrated signal research platform

Executive View

Claude's architectural instinct is correct:

  • independent projects are the right scientific structure
  • PostgreSQL should be the integration bus
  • the assembly layer must be treated as a first-class project
  • each project should earn its place through independent out-of-sample evidence

This note keeps that direction and adds Codex-specific critiques, interface requirements, and execution guidance.

Codex: Core Agreement

The strongest parts of the proposed architecture are:

  1. Independent evaluability
    Each project should be measurable on its own before it is allowed into the ensemble.
  2. Bounded failure domains
    A broken news scorer should not corrupt the price model. A bad options ingest should not silently poison regime detection.
  3. Database contracts over code imports
    Downstream projects should read tables, not internal Python functions from sibling modules.
  4. A real assembly layer
    Signal combination is not bookkeeping. It is a modeling and optimization problem in its own right.

Codex: Important Additions To The Narrative

1. Add a seventh concern even if it is not a separate project: Evaluation Infrastructure

Claude's six projects are correct as business capabilities, but there is a cross-cutting requirement that should be treated almost like a seventh platform concern:

  • evaluation, calibration, and lineage

Without this, the system becomes modular in code but not modular in scientific truth.

Every project should have:

  • a reproducible backtest/evaluation job
  • rolling calibration output
  • versioned model metadata
  • feature/data lineage
  • run status and failure logging

This can live as shared infrastructure, but it must exist from day one.

2. The database is the integration bus, but schemas need versioning

The interface should not just be "a table exists." It should be:

  • a stable table name or stable view
  • a schema version
  • freshness metadata
  • source model version
  • evaluation and calibration columns

Otherwise downstream code will still break when a project evolves.

3. Separate signal production from signal serving

A project should have two outputs:

  1. a research-grade detailed table
  2. a serving-grade stable view

Example:

  • detailed table: news_scores_v1_raw
  • serving view: news_scores_daily_current

This lets us improve internals without breaking the rest of the system.
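
The split between a detailed table and a stable serving view can be sketched as follows. This is only an illustration: Python's built-in sqlite3 module stands in for PostgreSQL, the column set is a guess, and the rule "serve the latest model_version" is an assumption rather than something this note specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite is a stand-in for PostgreSQL here
conn.executescript("""
CREATE TABLE news_scores_v1_raw (
    ticker TEXT, date TEXT, model_version TEXT, sentiment_score REAL
);
INSERT INTO news_scores_v1_raw VALUES
    ('AAPL', '2026-04-22', 'v1.0', 0.10),
    ('AAPL', '2026-04-22', 'v1.1', 0.25),
    ('AMD',  '2026-04-22', 'v1.1', -0.40);

-- Serving view: expose only rows from the latest model version.
-- Assumption: version strings sort lexicographically ('v1.1' > 'v1.0').
CREATE VIEW news_scores_daily_current AS
SELECT r.ticker, r.date, r.model_version, r.sentiment_score
FROM news_scores_v1_raw AS r
WHERE r.model_version = (SELECT MAX(model_version) FROM news_scores_v1_raw);
""")

# Downstream projects query the view and never touch the raw table.
rows = conn.execute(
    "SELECT ticker, sentiment_score FROM news_scores_daily_current ORDER BY ticker"
).fetchall()
```

Because consumers depend only on the view name and its columns, the raw table can be replaced (for example by a hypothetical news_scores_v2_raw) by redefining the view.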

4. Distinguish alpha signals from risk controls

Some modules produce predictive alpha. Some modules mainly produce risk overlays.

That distinction matters.

Likely alpha-heavy:

  • Project 1 Price/Market Signal
  • Project 2 News Intelligence
  • Project 3 Options/Volatility Intelligence

Likely risk-heavy:

  • Project 4 Regime Intelligence
  • Project 5 Sector/Cluster Relationship Intelligence

Project 6 should not average them all as if they were the same kind of object. Risk modules often belong in gating, weighting, and position constraints rather than direct alpha summation.

5. Project outputs need forecast horizons, not generic scores

A daily sentiment score without horizon context is ambiguous.

Each project should ideally state:

  • what horizon it is trying to forecast
  • what target definition it was calibrated against
  • whether it is directional, ranking, volatility, or event-risk oriented

The ensemble should combine signals only when their horizon semantics are aligned.
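
One way to enforce horizon alignment is to attach a small descriptor to each signal and only combine signals whose descriptors match. The sketch below is hypothetical: the SignalSpec fields and the example target names are assumptions chosen to mirror the three bullets above.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class SignalSpec:
    project: str
    horizon_days: int        # what horizon the signal forecasts
    target_definition: str   # what it was calibrated against
    kind: str                # "directional", "ranking", "volatility", or "event_risk"

def group_by_semantics(specs):
    """Group projects that share horizon, target definition, and kind."""
    groups = defaultdict(list)
    for spec in specs:
        key = (spec.horizon_days, spec.target_definition, spec.kind)
        groups[key].append(spec.project)
    return dict(groups)

# Illustrative specs only; the horizons and targets are made up.
specs = [
    SignalSpec("price", 5, "fwd_return_5d", "directional"),
    SignalSpec("news", 5, "fwd_return_5d", "directional"),
    SignalSpec("options", 21, "realized_vol_21d", "volatility"),
]
groups = group_by_semantics(specs)
```

Only signals that land in the same group are candidates for direct combination; anything else needs an explicit mapping first.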

Refined Project Definitions

Project 1 — Price / Market Signal Model

Claude's framing is right. This is the most mature project today.

Minimum role

  • read market and derived features
  • forecast forward returns by horizon
  • produce uncertainty estimates
  • write clean forecast outputs

Codex's additions

  • this project should not keep computing most of its feature logic inline indefinitely
  • move reusable derived feature families into persisted daily feature tables over time
  • maintain separate outputs for:
    • point estimate
    • interval estimate
    • rolling calibration

Table: scenario_forecast_daily

Suggested columns:

  • ticker
  • date
  • horizon
  • model_version
  • forecast_p10
  • forecast_p50
  • forecast_p90
  • signal_value
  • signal_confidence
  • rolling_ic_21d
  • rolling_ic_63d
  • rolling_ic_ir_63d
  • freshness_ts
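
The rolling_ic columns could be produced along the following lines. This is a minimal standard-library sketch under stated assumptions: IC is taken to be the cross-sectional Spearman rank correlation between the signal and the realized forward return, and IC-IR is taken to be mean IC divided by its population standard deviation over the window. Neither definition is fixed by this note.

```python
from statistics import mean, pstdev

def ranks(values):
    """Average ranks (1-based); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0

def daily_ic(signal, realized):
    """Spearman rank correlation across tickers for a single date."""
    return pearson(ranks(signal), ranks(realized))

def rolling_ic_and_ir(daily_ics, window):
    """Mean IC and IC-IR (mean / std) over the trailing `window` days."""
    tail = daily_ics[-window:]
    mu, sd = mean(tail), pstdev(tail)
    return mu, (mu / sd if sd > 0 else 0.0)
```

Calling rolling_ic_and_ir with window=21 and window=63 on the same daily IC history would fill rolling_ic_21d, rolling_ic_63d, and rolling_ic_ir_63d.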

Standalone value

Yes. This project stands alone, and it already does so today.

Project 2 — News Intelligence Engine

Claude is right that this should become an independent intelligence layer.

Minimum role

  • ingest raw headlines and bodies
  • score article-level sentiment and event type
  • aggregate to ticker / sector / market daily features
  • support user query and explanation

Codex's critique

This project is more than sentiment.

It should be decomposed internally into:

  • article scoring
  • entity linking
  • event classification
  • aggregation
  • retrieval / query serving

If all of that is bundled into one opaque "news model," debugging will become hard again.

Raw article scoring table:

  • news_article_scores

Daily aggregated serving table:

  • news_scores_daily

Suggested article-level columns:

  • article_id
  • published_at
  • primary_ticker
  • mentioned_tickers
  • sector_tags
  • sentiment_pos
  • sentiment_neu
  • sentiment_neg
  • sentiment_score
  • impact_score
  • novelty_score
  • event_type
  • model_version

Suggested daily columns:

  • ticker
  • date
  • headline_count
  • article_count
  • news_sentiment_1d
  • news_sentiment_3d
  • news_impact_1d
  • earnings_event_flag
  • analyst_event_score
  • regulatory_risk_score
  • sector_sentiment_1d
  • market_sentiment_1d
  • rolling_ic_63d
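
The step from article-level scores to the daily table might look like the sketch below. The aggregation rule is an assumption: news_sentiment_1d is computed here as an impact-weighted mean of sentiment_score, and published_at is assumed to be an ISO-8601 string whose first ten characters are the date.

```python
from collections import defaultdict

def aggregate_daily(articles):
    """Impact-weighted mean sentiment per (ticker, date)."""
    # Accumulators: weighted sentiment sum, total weight, article count.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for a in articles:
        key = (a["primary_ticker"], a["published_at"][:10])
        acc = sums[key]
        acc[0] += a["sentiment_score"] * a["impact_score"]
        acc[1] += a["impact_score"]
        acc[2] += 1
    return {
        key: {
            "article_count": n,
            "news_sentiment_1d": ws / w if w > 0 else 0.0,
        }
        for key, (ws, w, n) in sums.items()
    }
```

Keeping this step separate from article scoring means a change to the aggregation rule does not require rescoring any articles.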

Standalone value

Yes. This can stand alone as:

  • a research dataset
  • a user-facing query system
  • a sentiment/event intelligence tool even before it improves trading

Project 3 — Options / Volatility Intelligence

Claude is right that this should mature into its own project instead of staying as inline feature logic.

Minimum role

  • ingest underlying-level and later chain-level options information
  • compute volatility and skew signals
  • publish daily options intelligence features

Codex's additions

There are likely two stages here:

Stage A:

  • underlying-level implied/historical volatility metrics

Stage B:

  • full chain-derived structure
    • term structure
    • skew
    • open interest concentration
    • volume concentration
    • event-driven implied move

Stage A is live now. Stage B should be treated as a separate maturity step.

Table: options_signals_daily

Suggested columns:

  • ticker
  • date
  • iv_level
  • hv_level
  • iv_hv_spread
  • iv_ret_5d
  • iv_ret_21d
  • term_structure_slope
  • put_call_skew
  • implied_move_next_earnings
  • options_signal_score
  • rolling_ic_63d
  • model_version
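
For the Stage A columns, a hedged sketch of two of the derived features follows. The definitions are assumptions: iv_hv_spread is taken as implied minus historical volatility on the latest date, and iv_ret_5d as the relative change in implied volatility over five observations.

```python
def iv_features(iv_series, hv_series, lookback=5):
    """Return (iv_hv_spread, iv_ret) for the most recent observation.

    Both series are ordered oldest to newest and expressed in the same units.
    """
    if len(iv_series) <= lookback or not hv_series:
        raise ValueError("not enough history")
    iv_now, iv_then = iv_series[-1], iv_series[-1 - lookback]
    spread = iv_now - hv_series[-1]   # iv_hv_spread
    iv_ret = iv_now / iv_then - 1.0   # iv_ret_5d when lookback=5
    return spread, iv_ret
```

The same function with lookback=21 would give iv_ret_21d.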

Standalone value

Yes. This can stand alone as a volatility and event-risk intelligence product.

Project 4 — Regime Intelligence

This is probably the highest-leverage unbuilt project.

Minimum role

  • identify market state
  • identify state uncertainty
  • expose regime labels and probabilities for downstream conditioning

Codex's additions

Do not force this to be a single regime label only. The most useful output is:

  • discrete label
  • regime probability vector
  • confidence / entropy
  • change-point flag

The ensemble should be able to reduce risk when regime confidence is low.

Table: market_regime_state

Suggested columns:

  • date
  • regime_label
  • regime_probability
  • regime_entropy
  • vol_regime
  • trend_regime
  • credit_regime
  • change_point_flag
  • window_days
  • model_version
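
To make regime_entropy concrete, here is one possible definition and one way the ensemble could use it to reduce risk. Both are assumptions: entropy is the Shannon entropy of the regime probability vector, normalized to the range 0 to 1, and exposure_scale with its floor parameter is a hypothetical helper, not a prescribed rule.

```python
import math

def regime_entropy(probs):
    """Shannon entropy of the regime probabilities, normalized to [0, 1]."""
    if len(probs) < 2 or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("need at least two probabilities that sum to 1")
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def exposure_scale(probs, floor=0.25):
    """Shrink exposure linearly as entropy rises, never below `floor`."""
    return max(floor, 1.0 - regime_entropy(probs))
```

A confident state such as [1.0, 0.0, 0.0] leaves exposure untouched, while a uniform vector drives it down to the floor.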

Standalone value

Yes. This can stand alone as a risk-monitoring and portfolio-conditioning layer.

Project 5 — Sector / Cluster Relationship Intelligence

Claude's distinction is useful and worth keeping separate from macro regime.

Minimum role

  • identify cluster structure
  • identify detachment / outlier behavior
  • identify when a name is behaving unusually relative to peers

Codex's additions

This project should not just detect outliers. It should also generate relative-value context.

Examples:

  • "AAPL is weak relative to mega-cap tech"
  • "AMD is strong relative to semis"
  • "This negative news item is inconsistent with cluster behavior"

That makes it useful for both signal confirmation and risk control.
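
A statement like "AAPL is weak relative to mega-cap tech" can be backed by a simple peer-relative score. The sketch below is one assumed formulation: a z-score of the ticker's return against the other members of its cluster. The function name and the minimum peer count are illustrative.

```python
from statistics import mean, pstdev

def relative_strength_z(returns, ticker):
    """Z-score of one ticker's return against the other cluster members.

    `returns` maps ticker -> return over the same period for one cluster.
    """
    peers = [r for t, r in returns.items() if t != ticker]
    if len(peers) < 2:
        raise ValueError("need at least two peers")
    sd = pstdev(peers)
    return (returns[ticker] - mean(peers)) / sd if sd > 0 else 0.0
```

A strongly negative score flags weakness relative to peers; a score near zero on a day with negative news for that name would be the kind of inconsistency worth routing to news_review_queue.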

Tables:

  • correlation_cluster_daily
  • relative_strength_cluster_daily
  • outlier_event_daily
  • news_review_queue

Standalone value

Yes, especially for analyst workflows, anomaly investigation, and risk dashboards.

Project 6 — Strategy Assembly Layer

Claude is exactly right that this is not glue.

Minimum role

  • combine project outputs into tradable portfolio signals
  • apply calibration-aware weighting
  • apply regime-aware gating
  • apply turnover and cost constraints

Codex's strongest addition

Project 6 should be split conceptually into three stages:

  1. signal normalization
  2. signal weighting / gating
  3. portfolio construction / trade decision

These should be auditable separately, because many ensemble failures happen when:

  • raw signals are on incompatible scales
  • gating and alpha are conflated
  • portfolio optimization masks signal weakness
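
Stage 1, signal normalization, addresses the incompatible-scales failure. A minimal sketch, assuming a cross-sectional z-score with clipping is an acceptable normalization (other choices, such as rank transforms, would also fit):

```python
from statistics import mean, pstdev

def zscore_cross_section(raw, clip=3.0):
    """Standardize one date's raw signal across tickers, then clip outliers."""
    vals = list(raw.values())
    mu, sd = mean(vals), pstdev(vals)
    if sd == 0:
        return {t: 0.0 for t in raw}
    return {t: max(-clip, min(clip, (v - mu) / sd)) for t, v in raw.items()}
```

After this step a return forecast measured in decimals and a news score measured on an arbitrary scale are expressed in the same units, so the weighting stage can be audited without scale effects mixed in.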

Table: strategy_signals_daily

Suggested columns:

  • ticker
  • date
  • horizon
  • combined_signal
  • combined_confidence
  • target_weight
  • weight_before_costs
  • weight_after_costs
  • position_cap_reason
  • regime_gate_applied
  • outlier_gate_applied
  • expected_cost_bps
  • source_signal_breakdown
  • assembly_model_version
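
The relationship between weight_before_costs, expected_cost_bps, and weight_after_costs could be as simple as the filter below. It is a sketch under assumptions: expected_edge_bps and margin_bps are hypothetical inputs, and the rule "hold the current weight unless the edge clears cost plus margin" is one possible policy.

```python
def apply_cost_filter(current_w, target_w, expected_edge_bps,
                      expected_cost_bps, margin_bps=2.0):
    """Move to the target weight only if the edge clears cost plus a margin."""
    if expected_edge_bps > expected_cost_bps + margin_bps:
        return target_w
    return current_w  # hold the existing position: the trade is not worth its cost
```

Here target_w plays the role of weight_before_costs and the return value plays the role of weight_after_costs.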

Standalone value

Yes, but only after upstream modules are real. Before that, it is just scaffolding.

Codex: Database Contract Principles

The database-contract idea is correct, but I would make it stricter.

Every serving table should include:

  • date
  • ticker if applicable
  • model_version
  • data_version if applicable
  • freshness_ts
  • is_valid
  • rolling_ic_21d
  • rolling_ic_63d
  • rolling_ic_ir_63d

Optional but valuable:

  • signal_horizon
  • target_definition
  • coverage_count
  • confidence

This turns each table from "some output" into a calibrated interface.
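
A contract like this is easy to check mechanically. The helper below is a hypothetical sketch: it treats the "if applicable" columns as optional, except that ticker is required for per-ticker tables, and reports which required columns a table is missing.

```python
# Required serving-table columns, taken from the list above.
REQUIRED = {
    "date", "model_version", "freshness_ts", "is_valid",
    "rolling_ic_21d", "rolling_ic_63d", "rolling_ic_ir_63d",
}

def missing_contract_columns(columns, per_ticker=True):
    """Return the sorted list of required columns absent from `columns`."""
    required = REQUIRED | ({"ticker"} if per_ticker else set())
    return sorted(required - set(columns))
```

A market-level table such as market_regime_state would be checked with per_ticker=False. An empty result means the table satisfies the column part of the contract; freshness and validity still have to be checked per row.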

Codex: Assembly Layer Weighting Guidance

Claude is right that static weights are a trap.

I would explicitly prohibit fixed equal-weight averaging as the default production method.

Recommended weighting hierarchy:

  1. eligibility gate
    A signal is ignored if:
    • stale
    • invalid
    • negative rolling IC beyond threshold
  2. calibration weight
    Weight by rolling regime-conditioned IC or IC-IR
  3. confidence modifier
    Downweight uncertain forecasts
  4. risk overlay
    Reduce exposure in uncertain or unstable regimes
  5. cost-aware trade filter
    Only trade when expected benefit exceeds cost plus margin

Conceptually:

\[ w_i \propto \max\left(0, \mathrm{ICIR}_i^{\mathrm{regime}}\right) \cdot \mathrm{confidence}_i \cdot \mathrm{freshness}_i \]

and then portfolio construction applies risk and cost constraints afterward.
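
The eligibility gate and the proportional weighting rule can be sketched together. Several things here are assumptions: the SignalState fields, the convention that confidence and freshness lie between 0 and 1 with freshness 0 meaning stale, the min_icir threshold, and the choice to normalize weights to sum to 1 (the rule above only fixes proportions).

```python
from dataclasses import dataclass

@dataclass
class SignalState:
    name: str
    icir_regime: float    # rolling IC-IR conditioned on the current regime
    confidence: float     # assumed to lie in [0, 1]
    freshness: float      # assumed to lie in [0, 1]; 0 means stale
    is_valid: bool = True

def ensemble_weights(signals, min_icir=0.0):
    """Weights proportional to max(0, ICIR) * confidence * freshness."""
    raw = {}
    for s in signals:
        # Eligibility gate: drop stale, invalid, or poorly calibrated signals.
        eligible = s.is_valid and s.freshness > 0 and s.icir_regime > min_icir
        raw[s.name] = (
            max(0.0, s.icir_regime) * s.confidence * s.freshness if eligible else 0.0
        )
    total = sum(raw.values())
    # If nothing is eligible, every weight is zero and the ensemble stays flat.
    return {name: (v / total if total > 0 else 0.0) for name, v in raw.items()}
```

Risk overlays and the cost-aware trade filter are deliberately left out of this function, since they belong to the later portfolio construction stage.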

Codex: The Most Important Hidden Risk

Claude correctly names the score-compatibility problem. I want to sharpen that further.

The biggest hidden risk is not just scale mismatch. It is semantic mismatch.

Examples:

  • a price forecast is a return estimate
  • a news score may be an event polarity
  • a regime score may be a state probability
  • an outlier score may be a warning flag

These should not all be treated as additive alpha.

My recommendation:

  • classify every project output as one of:
    • alpha
    • risk_gate
    • uncertainty
    • context

Then require Project 6 to use them accordingly.

That one rule will prevent a lot of future architecture confusion.
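
The classification rule could be enforced in code roughly as follows. This is a sketch, not a prescribed design: the combine function and its conventions (gate values between 0 and 1 that multiply exposure, uncertainty between 0 and 1 that shrinks it, context carried for explanation only) are assumptions.

```python
from enum import Enum

class OutputKind(Enum):
    ALPHA = "alpha"
    RISK_GATE = "risk_gate"
    UNCERTAINTY = "uncertainty"
    CONTEXT = "context"

def combine(outputs):
    """Combine (kind, value) pairs without treating everything as additive alpha.

    Alpha values add; gates and uncertainty scale the result; context is ignored.
    """
    alpha = sum(v for k, v in outputs if k is OutputKind.ALPHA)
    scale = 1.0
    for kind, value in outputs:
        if kind is OutputKind.RISK_GATE:
            scale *= max(0.0, min(1.0, value))        # 0 blocks the position entirely
        elif kind is OutputKind.UNCERTAINTY:
            scale *= 1.0 - max(0.0, min(1.0, value))  # more uncertainty, smaller position
    return alpha * scale
```

With this shape, a regime probability or an outlier flag can never be summed into alpha by accident, because only values tagged ALPHA enter the sum.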

Codex: Project Readiness Order

I broadly agree with Claude's order, with one refinement.

Recommended order:

  1. Project 1 Price / Market Signal
    This is the foundation and should continue improving.
  2. Project 3 Options / Volatility Intelligence
    Already underway and cheap to isolate further.
  3. Project 4 Regime Intelligence
    Highest-leverage context layer.
  4. Project 2 News Intelligence
    Upgrade from lexicon-based scoring to finance-specific NLP, but keep it modular.
  5. Project 5 Sector / Cluster Intelligence
    Valuable for both signal triage and risk controls.
  6. Project 6 Strategy Assembly
    Begin schema design early, but full optimization should come only after at least Projects 1, 3, and 4 are calibrated.

Codex's nuance

Project 6 should begin in parallel at the schema and simulation level, but not at the "final model weighting" level until upstream modules are producing trustworthy rolling calibration.

Codex: Standalone Project Assessment

These projects should be able to stand alone:

  • Project 1 as a return forecasting engine
  • Project 2 as a news intelligence and query product
  • Project 3 as an options / event-risk analytics product
  • Project 4 as a market regime monitor
  • Project 5 as a cluster anomaly and analyst review tool

That is a strength, not duplication. Standalone value means each module can justify itself independently.

Codex: Requirements Summary

To move forward cleanly, I recommend making the following requirements explicit.

Required for every project

  • independent evaluation path
  • rolling calibration output
  • versioned serving table
  • freshness metadata
  • failure logging
  • documented target horizon / target definition

Required for the ensemble

  • no fixed equal-weight default
  • no direct code imports across project boundaries
  • separate treatment of alpha vs risk vs context outputs
  • cost-aware trade filtering
  • regime-aware weighting

Required for research discipline

  • every project must beat a null baseline on its own
  • every added module must improve the ensemble out of sample
  • if a module does not improve combined IC/IC-IR or post-cost Sharpe, it should not remain in production by default

Bottom Line

Claude's modular six-project architecture is the right baseline.

Codex's additions are:

  • treat evaluation/calibration as a first-class cross-cutting system
  • version and calibrate the database contracts, not just the models
  • distinguish alpha outputs from risk/control outputs explicitly
  • decompose the assembly layer into normalization, weighting, and portfolio construction
  • require every module to stand on its own before it is allowed to influence production weights

That combination gives you:

  • scientific clarity
  • operational isolation
  • replaceable internals
  • a real path to an ensemble that is explainable rather than accidental