LDTM v2 — Complete Architecture & Implementation Reference

Model: Long-Duration Temporal Model (LDTM) v2
Date: 2026-04-21
System: NVIDIA GB10 DGX · PostgreSQL 15 · PyTorch 2.x (NGC 25.01)
Status: Production — 103 NDX-100 tickers trained and inferring daily

1. System Overview
2. AI Model Design
3. Mathematical Foundations
4. Feature Engineering
5. Dataset Design
6. Training Architecture
7. Inference Pipeline
8. Orchestration & Parallelism
9. Database Schema
10. Prediction Outputs
11. Dashboard & LLM Interface
12. Container Architecture
13. Cron Schedule
14. Known Limitations & Caveats

1. System Overview

LDTM is a per-ticker LSTM-based price prediction system designed to forecast three forward price targets for every ticker in its NASDAQ-100 universe (103 symbols, including the ETFs TQQQ and SQQQ):

Horizon	Target	Notes
Next trading day	Next-day close price	Primary signal; used for daily decision support
Next Monday	Following Monday close	Weekly directional signal
One month (~21 trading days)	Monthly close	Trend confirmation

The system processes 20+ years of daily OHLCV data per ticker, augments it with six engineered features, and trains a separate LSTM model per ticker. All 103 models run independently — there is no shared weight matrix or transfer learning. This design choice is intentional: AAPL and SQQQ have fundamentally different volatility regimes and price dynamics; forcing them to share representations would average away edge cases for both.

Pipeline Overview

Interactive Brokers TWS (IB Gateway)
        ↓
  ingestion/ingest_daily.py
        ↓
  market_data_daily (PostgreSQL)
        ↓
  model/ldtm/dataset.py  ← OHLCVDataset
        ↓
  model/ldtm/trainer.py  ← LDTM LSTM + AdamW + AMP
        ↓
  /model_weights/ldtm/{ticker}_ldtm.pt
        ↓
  model/ldtm/predict.py  ← build_inference_window → forward pass
        ↓
  ldtm_run_log (PostgreSQL)
        ↓
  snapshot_writer.py → ldtm_daily_snapshots
        ↓
  dashboard/app.py (Streamlit) ← localhost:8501
        ↓
  llm/llm_query.py → Mistral-7B FP8 (Triton, localhost:8000/v1)

2. AI Model Design

2.1 Architecture

Model class: LDTMModel (PyTorch nn.Module)

Input tensor: (batch_size, window_size, input_size)
                = (32, 30, 11)

→ LSTM
    input_size  = 11  (OHLCV + 6 engineered features)
    hidden_size = 128
    num_layers  = 2
    dropout     = 0.2  (applied between layers, not after last)
    batch_first = True

→ Last hidden state: h_n[-1]  shape (batch_size, 128)

→ Three independent prediction heads (FC towers):
    head_next_day     : Linear(128→64) → ReLU → Linear(64→1)
    head_next_monday  : Linear(128→64) → ReLU → Linear(64→1)
    head_one_month    : Linear(128→64) → ReLU → Linear(64→1)

Output: dict {
    "next_day":    tensor (batch_size, 1)  — normalized price
    "next_monday": tensor (batch_size, 1)  — normalized price
    "one_month":   tensor (batch_size, 1)  — normalized price
}
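The layout above can be sketched in PyTorch as follows. This is a minimal illustration rather than the production class: the keyword-argument constructor and the ModuleDict keyed by horizon are assumptions made for brevity (the source builds LDTMModel from a config object and names the heads head_next_day, head_next_monday, and head_one_month).

```python
import torch
import torch.nn as nn


class LDTMModel(nn.Module):
    """Sketch: 2-layer LSTM backbone followed by three independent FC heads."""

    def __init__(self, input_size=11, hidden_size=128, num_layers=2, dropout=0.2):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=input_size,
            hidden_size=hidden_size,
            num_layers=num_layers,
            dropout=dropout,      # applied between stacked layers only
            batch_first=True,
        )
        # One small tower per horizon (hypothetical container; see lead-in).
        self.heads = nn.ModuleDict({
            name: nn.Sequential(
                nn.Linear(hidden_size, 64), nn.ReLU(), nn.Linear(64, 1)
            )
            for name in ("next_day", "next_monday", "one_month")
        })

    def forward(self, x):
        # x: (batch, window, features); h_n: (num_layers, batch, hidden)
        _, (h_n, _) = self.lstm(x)
        last_hidden = h_n[-1]     # final hidden state of the top layer
        return {name: head(last_hidden) for name, head in self.heads.items()}


model = LDTMModel()
out = model(torch.randn(32, 30, 11))
print({name: tuple(t.shape) for name, t in out.items()})
```

Each head receives the same 128-dimensional summary but has its own weights, so the three horizons do not compete for a single output layer.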

2.2 Design Rationale

Why LSTM over Transformer?

LSTMs handle variable-length historical sequences natively with O(n) memory vs O(n²) for attention
Daily OHLCV sequences are auto-regressive; the LSTM hidden state provides an efficient rolling summary of historical regime
Transformers need explicit positional encodings to represent bar order; the LSTM's recurrence encodes order implicitly
Inference on a 30-day window takes ~2 ms per ticker on GPU (~8 ms on CPU); transformers would call for Flash Attention and a more complex deployment

Why per-ticker models? Each ticker has a unique volatility profile, price range (KLAC ~$1800 vs CPRT ~$34), earnings cadence, and sector exposure. A shared model would learn the median behavior of NDX-100, suppressing the outlier momentum patterns that generate the most useful signals.

Why 2 LSTM layers?

Layer 1 is intended to capture short-term momentum (candle patterns, gap fills, intraday reversals expressed in daily closes)
Layer 2 is intended to capture medium-term regime (trending vs mean-reverting behavior within the 30-day window)
3+ layers provide diminishing returns on daily data at this timescale; they also require longer training and are prone to gradient vanishing despite LSTM's gating

Why window_size = 30?

30 trading days ≈ 6 weeks ≈ half of a quarterly earnings cycle
Captures: one full monthly options expiry cycle, short-term momentum decay, institutional rebalancing patterns
Shorter windows (< 20): insufficient context for MA20 and RSI convergence
Longer windows (> 60): the earliest prices in the window may belong to a different macro regime; per-window normalization would span across a regime change

Why 3 independent prediction heads? Each horizon has a different optimal feature representation:

Next-day close is dominated by recent momentum (ret1, RSI14)
Next-Monday is influenced by weekly patterns (ret5, MA5)
One-month is driven by medium-term trend (MA10, MA20)

Sharing a single output head forces a compromise between these regimes. Independent FC towers allow each head to learn its own weighting of the LSTM hidden state.

2.3 Parameter Count

Component	Parameters
LSTM layer 1	4 × (11 × 128 + 128 × 128 + 128) = 71,680
LSTM layer 2	4 × (128 × 128 + 128 × 128 + 128) = 131,584
head_next_day	128×64 + 64 + 64×1 + 1 = 8,321
head_next_monday	8,321
head_one_month	8,321
Total	~228,000 parameters (228,227 with one bias vector per gate; 229,251 as stored by PyTorch's nn.LSTM, which keeps two)

This is intentionally small. At roughly 0.23M parameters and 30-day windows, the model is sized for the ~4,000–5,000 samples that 20 years of daily data provide per ticker. Larger models would be more prone to overfitting.
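The count can be reproduced with plain arithmetic. The sketch below is illustrative; the helper names are hypothetical. The n_bias argument covers both conventions: one bias vector per gate, as in the gate equations of section 3.1, or two, which is how PyTorch's nn.LSTM stores them (bias_ih and bias_hh).

```python
def lstm_layer_params(input_size: int, hidden_size: int, n_bias: int = 1) -> int:
    """Parameters of one LSTM layer: 4 gates, each with input weights,
    recurrent weights, and n_bias bias vectors."""
    per_gate = (input_size * hidden_size
                + hidden_size * hidden_size
                + n_bias * hidden_size)
    return 4 * per_gate


def head_params(hidden_size: int = 128, mid: int = 64) -> int:
    """Linear(hidden -> mid) plus Linear(mid -> 1), weights and biases."""
    return (hidden_size * mid + mid) + (mid * 1 + 1)


def total_params(n_bias: int = 1) -> int:
    layer1 = lstm_layer_params(11, 128, n_bias)
    layer2 = lstm_layer_params(128, 128, n_bias)
    return layer1 + layer2 + 3 * head_params()


print("LSTM layer 1:", lstm_layer_params(11, 128))
print("LSTM layer 2:", lstm_layer_params(128, 128))
print("one head    :", head_params())
print("total (1 bias per gate):", total_params(1))
print("total (2 biases, nn.LSTM):", total_params(2))
```

Under either convention the model stays close to 0.23M parameters, which matches a checkpoint size of roughly 900 KB at 4 bytes per FP32 weight.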

3. Mathematical Foundations

3.1 LSTM Gate Equations

At each timestep t, given input x_t ∈ ℝ^11 and previous hidden state h_{t-1} ∈ ℝ^128:

Forget gate — what to discard from cell state:

f_t = σ(W_f · [h_{t-1}, x_t] + b_f)

Input gate — what new information to store:

i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
g_t = tanh(W_g · [h_{t-1}, x_t] + b_g)

Cell state update:

C_t = f_t ⊙ C_{t-1} + i_t ⊙ g_t

Output gate:

o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
h_t = o_t ⊙ tanh(C_t)

Where:

σ = sigmoid activation
tanh = hyperbolic tangent
⊙ = element-wise (Hadamard) product
W_{f,i,g,o} ∈ ℝ^{128×(128+11)} — weight matrices
b_{f,i,g,o} ∈ ℝ^{128} — bias vectors

The LSTM processes all 30 timesteps sequentially. Only the final hidden state h_{30} (from the last layer) is passed to the three prediction heads.
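As a concrete reading of the gate equations, here is a single-layer NumPy sketch that steps through one 30-bar window. The weights are random placeholders and the dictionary layout is an assumption for illustration; it is not how PyTorch stores LSTM weights internally.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM timestep. W[k] has shape (hidden, hidden + input) and
    b[k] has shape (hidden,) for each gate k in f, i, g, o."""
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate
    g_t = np.tanh(W["g"] @ z + b["g"])       # candidate cell values
    c_t = f_t * c_prev + i_t * g_t           # cell state update
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate
    h_t = o_t * np.tanh(c_t)                 # new hidden state
    return h_t, c_t


rng = np.random.default_rng(0)
H, D, T = 128, 11, 30
W = {k: rng.normal(0.0, 0.1, size=(H, H + D)) for k in "figo"}
b = {k: np.zeros(H) for k in "figo"}

h, c = np.zeros(H), np.zeros(H)
window = rng.random((T, D))                  # stand-in for a normalized window
for x_t in window:
    h, c = lstm_step(x_t, h, c, W, b)

print(h.shape)                               # final hidden state h_30
```

Because h_t is a sigmoid output multiplied by a tanh output, every component of the final hidden state lies strictly inside (-1, 1).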

3.2 Per-Window Min-Max Normalization

Critical design decision. Rather than fitting a global scaler on training data (which fails catastrophically when inference-time prices are out of the training range — AAPL was $81 in 2019 training data, $266 today), LDTM v2 normalizes each 30-day window independently.

For each window W of shape (30, 11):

feat_min_j = min_{t=1..30} W[t, j]    for j = 0..10
feat_max_j = max_{t=1..30} W[t, j]    for j = 0..10

W_norm[t, j] = (W[t, j] - feat_min_j) / (feat_max_j - feat_min_j + ε)

Where ε = 1e-8 (prevents division by zero for constant-price windows).

Inverse transform (dollars from normalized prediction):

P_dollars = P_norm × (c_max - c_min) + c_min

Where c_min = feat_min[3] and c_max = feat_max[3] (column index 3 = close).

This means the model learns relative movements within each window, not absolute price levels. A normalized value of 0.5 means "mid-range of this 30-day window", regardless of whether that window is from 2005 (AAPL ~$5) or 2026 (AAPL ~$270).
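The forward and inverse transforms can be checked with a short round trip. This is a sketch on synthetic data; the function names are illustrative, and only ε = 1e-8 and the close column index 3 come from the text above.

```python
import numpy as np

EPS = 1e-8
CLOSE_COL_IDX = 3


def normalize_window(window):
    """Min-max scale each feature column using only this window's values."""
    feat_min = window.min(axis=0)
    feat_max = window.max(axis=0)
    scaled = (window - feat_min) / (feat_max - feat_min + EPS)
    return scaled, feat_min[CLOSE_COL_IDX], feat_max[CLOSE_COL_IDX]


def to_dollars(p_norm, c_min, c_max):
    """Inverse transform for values expressed on the close-price scale."""
    return p_norm * (c_max - c_min) + c_min


rng = np.random.default_rng(1)
window = 250.0 + rng.normal(0.0, 5.0, size=(30, 11))   # synthetic 30 x 11 window

scaled, c_min, c_max = normalize_window(window)
recovered = to_dollars(scaled[:, CLOSE_COL_IDX], c_min, c_max)

print(scaled.min(), scaled.max())
print(np.abs(recovered - window[:, CLOSE_COL_IDX]).max())
```

The same inverse transform extrapolates naturally: with c_min = 100 and c_max = 110, a normalized prediction of 1.2 maps to about $112, that is, above the window's high.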

3.3 Loss Function

Multi-head mean squared error:

L = MSE(ŷ_nd, y_nd) + MSE(ŷ_nm, y_nm) + MSE(ŷ_om, y_om)

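Where all predictions and targets are in per-window normalized space (nominally [0,1]; values outside this range occur when a target lies beyond the window's close range):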

MSE(ŷ, y) = (1/n) Σ_{i=1}^{n} (ŷ_i - y_i)²

Why MSE and not MAE? MSE penalizes large errors quadratically, which is preferable for financial prediction — a 10% prediction error is much worse than ten 1% errors from a risk management perspective.

3.4 Label Normalization

Training labels are also normalized per-window:

y_nd  = (close_{t+1}  - c_min) / (c_max - c_min + ε)
y_nm  = (close_{mon}  - c_min) / (c_max - c_min + ε)
y_om  = (close_{+21d} - c_min) / (c_max - c_min + ε)

Labels outside [0,1] are possible (and expected) — a prediction beyond the current 30-day range is a directional signal that the model expects a breakout.

3.5 Direction Accuracy Metric

For evaluation, the raw dollar prediction is compared against the actual next-day close:

direction_pred   = "UP"   if P̂_next_day > P_today  else "DOWN"
direction_actual = "UP"   if P_actual    > P_today  else "DOWN"
direction_correct = (direction_pred == direction_actual)

direction_accuracy = (Σ direction_correct) / n_evaluated × 100%

3.6 Percentage Error

pct_error = (P̂_pred - P_actual) / P_actual × 100

Reported as signed (positive = overestimate) and unsigned |pct_error| for accuracy assessment.
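The two evaluation metrics above fit in a few lines. The sketch below uses made-up prices purely to exercise the formulas; the function names are illustrative.

```python
def direction(price: float, today: float) -> str:
    """'UP' if the price is above today's close, otherwise 'DOWN'."""
    return "UP" if price > today else "DOWN"


def pct_error(pred: float, actual: float) -> float:
    """Signed percentage error; positive means the prediction was too high."""
    return (pred - actual) / actual * 100.0


def direction_accuracy(rows) -> float:
    """rows: iterable of (p_today, p_pred, p_actual) tuples."""
    rows = list(rows)
    correct = sum(direction(pred, today) == direction(actual, today)
                  for today, pred, actual in rows)
    return 100.0 * correct / len(rows)


# Hypothetical evaluation rows: (today's close, predicted, actual next close)
rows = [
    (100.0, 102.0, 101.0),   # UP predicted, UP actual
    (100.0, 99.0, 101.0),    # DOWN predicted, UP actual
    (50.0, 49.0, 48.0),      # DOWN predicted, DOWN actual
    (20.0, 21.0, 19.5),      # UP predicted, DOWN actual
]
acc = direction_accuracy(rows)
print(acc, pct_error(102.0, 101.0))
```

Note that under this rule an unchanged price counts as "DOWN" for both the prediction and the actual, since only a strict increase is labeled "UP".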

4. Feature Engineering

LDTM v2 expands the raw OHLCV 5-feature set to 11 features by adding momentum and trend indicators.

4.1 Feature Table

Index	Name	Formula	Purpose
0	open	raw	Price gap context
1	high	raw	Intraday range ceiling
2	low	raw	Intraday range floor
3	close	raw	Prediction anchor (`CLOSE_COL_IDX=3`)
4	volume	raw	Participation/liquidity
5	ret1	ln(C_t / C_{t-1})	1-day momentum
6	ret5	ln(C_t / C_{t-5})	Weekly momentum
7	rsi14	Wilder RSI(14)	Overbought/oversold
8	ma5	SMA(C, 5)	Short-term trend
9	ma10	SMA(C, 10)	Medium-term trend
10	ma20	SMA(C, 20)	Swing trend / regime

4.2 Log Returns

ret1_t = ln(C_t / C_{t-1})
ret5_t = ln(C_t / C_{t-5})

Log returns are used instead of simple returns (C_t/C_{t-1} - 1) for two reasons:

Log-normality: daily log returns are approximately normally distributed, making min-max normalization more stable
Time-additivity: multi-period log returns sum: ln(C_t/C_{t-5}) = Σ_{k=1}^{5} ln(C_{t-k+1}/C_{t-k})
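The time-additivity property is easy to verify numerically. The closes below are arbitrary illustrative values.

```python
import numpy as np

close = np.array([100.0, 101.5, 100.8, 102.3, 103.0, 102.1, 104.4])

# ret1[k] is the 1-day log return ending at t = k + 1
ret1 = np.log(close[1:] / close[:-1])

# ret5 computed directly for t = 5 and t = 6
ret5_direct = np.log(close[5:] / close[:-5])

# ret5 as the sum of the five 1-day log returns ending at t
ret5_summed = np.array([ret1[t - 5:t].sum() for t in (5, 6)])

print(ret5_direct)
print(ret5_summed)
```

Simple returns do not share this property; they compound multiplicatively instead of adding.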

4.3 Wilder RSI (14 periods)

Δ_t = C_t - C_{t-1}
gain_t = max(Δ_t, 0)
loss_t = max(-Δ_t, 0)

avg_gain_t = EWM(gain, α=1/14, min_periods=14)_t
avg_loss_t = EWM(loss, α=1/14, min_periods=14)_t

RS_t = avg_gain_t / (avg_loss_t + ε)
RSI_t = 100 - (100 / (1 + RS_t))

Wilder's smoothing uses com = period - 1 = 13 in pandas EWM notation:

avg_gain = gain.ewm(com=13, min_periods=14).mean()

RSI ∈ [0, 100]:

RSI > 70: overbought (potential reversal signal)
RSI < 30: oversold (potential bounce signal)
RSI 30–70: neutral momentum
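A pandas sketch of the RSI computation, following the ewm(com=13, min_periods=14) call shown above. It keeps pandas' default adjust=True, as that call does; the classic recursive Wilder smoothing corresponds to adjust=False, so treat the choice as an assumption to confirm against the actual feature code. The function name is illustrative.

```python
import numpy as np
import pandas as pd


def wilder_rsi(close: pd.Series, period: int = 14, eps: float = 1e-8) -> pd.Series:
    delta = close.diff()
    gain = delta.clip(lower=0)           # positive moves, else 0
    loss = (-delta).clip(lower=0)        # magnitude of negative moves, else 0
    avg_gain = gain.ewm(com=period - 1, min_periods=period).mean()
    avg_loss = loss.ewm(com=period - 1, min_periods=period).mean()
    rs = avg_gain / (avg_loss + eps)
    return 100.0 - 100.0 / (1.0 + rs)


# A strictly rising series has no losses, so RSI should approach 100.
rising = pd.Series(np.arange(1.0, 41.0))
rsi_rising = wilder_rsi(rising)

# A synthetic random walk should stay within [0, 100].
rng = np.random.default_rng(7)
noisy = pd.Series(100.0 + rng.normal(0.0, 1.0, 200).cumsum())
rsi_noisy = wilder_rsi(noisy).dropna()

print(rsi_rising.dropna().iloc[-1], rsi_noisy.min(), rsi_noisy.max())
```

The leading rows are NaN until min_periods observations are available, which is part of the warmup handled in section 4.5.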

4.4 Simple Moving Averages

MA_k(t) = (1/k) Σ_{i=0}^{k-1} C_{t-i}

For k ∈ {5, 10, 20}. The MA5/MA20 relationship (available to the model because both features are present in the normalized window) acts as a short-horizon analogue of the Golden Cross / Death Cross signal, which is conventionally defined on the 50- and 200-day averages.

4.5 Warmup Requirement

MA20 requires 20 observations (the current bar plus the 19 before it), the longest lookback among the features. For each ticker, the first valid row is:

first_valid = argmax_{t} [~any(isnan(features[t, :]))]

Windows begin at first_valid + window_size - 1, discarding the first ~20 warm-up rows. For tickers with 20+ years of data (~5,000 bars), this loses about 0.4% of samples, which is negligible.
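The warmup rule can be illustrated on a synthetic close series. The sketch below builds only the return and moving-average columns; RSI is left out for brevity because, with min_periods=14, it becomes valid before MA20 and therefore does not change the first valid row. Column names follow the feature table.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
close = pd.Series(100.0 + rng.normal(0.0, 1.0, 300).cumsum())   # synthetic prices

features = pd.DataFrame({
    "close": close,
    "ret1": np.log(close / close.shift(1)),
    "ret5": np.log(close / close.shift(5)),
    "ma5": close.rolling(5).mean(),
    "ma10": close.rolling(10).mean(),
    "ma20": close.rolling(20).mean(),
})

# First row (0-based) where no feature is NaN
valid_mask = ~features.isna().any(axis=1).to_numpy()
first_valid = int(np.argmax(valid_mask))

window_size = 30
first_window_end = first_valid + window_size - 1

print(first_valid, first_window_end)
```

With zero-based indexing the first fully populated row is index 19, so the first complete 30-row window ends at index 48.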

5. Dataset Design

5.1 OHLCVDataset

class OHLCVDataset(torch.utils.data.Dataset):
    # split: 'train' (70%) | 'val' (15%) | 'test' (15%)
    # window_size: 30
    # Returns: (x_tensor[30, 11], y_labels_dict)

Split boundaries (chronological, not random):

|←─────── train (70%) ───────→|←── val (15%) ──→|←── test (15%) ──→|
t=0                           t=0.70N           t=0.85N           t=N

Chronological split is mandatory — random splits would cause data leakage (future prices appearing in training windows).
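A minimal sketch of the chronological boundaries, assuming the split is taken over time-ordered sample indices; whether the real dataset splits raw rows or finished windows is not stated here. Integer percentages avoid floating-point surprises at the boundaries.

```python
def chronological_split(n_samples: int, train_pct: int = 70, val_pct: int = 15):
    """Return index ranges for train / val / test in time order."""
    train_end = n_samples * train_pct // 100
    val_end = n_samples * (train_pct + val_pct) // 100
    return {
        "train": range(0, train_end),
        "val": range(train_end, val_end),
        "test": range(val_end, n_samples),
    }


splits = chronological_split(5000)
for name, idx in splits.items():
    print(name, idx.start, idx.stop, len(idx))
```

Every training index precedes every validation index, which in turn precedes every test index, so no later bar can appear in an earlier split.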

Per-window normalization in __getitem__:

window  = feat_vals[end-window_size:end]       # shape (30, 11)
feat_min = window.min(axis=0)                  # shape (11,)
feat_max = window.max(axis=0)                  # shape (11,)
c_range  = feat_max[3] - feat_min[3]

x_scaled = (window - feat_min) / (feat_max - feat_min + eps)

labels = {
    "next_day":    (close[end]   - feat_min[3]) / (c_range + eps),
    "next_monday": (close[mon]   - feat_min[3]) / (c_range + eps),
    "one_month":   (close[+21d]  - feat_min[3]) / (c_range + eps),
    "close_min":   feat_min[3],
    "close_max":   feat_max[3],
}

5.2 Inference Window (`build_inference_window`)

For inference, there is no ground truth label. The function:

Loads the last window_size + _WARMUP + 10 = 60 rows from market_data_daily
Computes all 11 features
Drops NaN warmup rows
Takes the last 30 rows
Normalizes and returns (tensor[1, 30, 11], c_min, c_max)

The extra 10 rows beyond _WARMUP = 20 are a safety buffer for tickers whose recent history contains gaps or unusable rows (for example, weekend or holiday rows stored as separate records).

6. Training Architecture

6.1 Optimizer: AdamW

θ_{t+1} = θ_t - α_t × m̂_t / (√v̂_t + ε) - α_t × λ × θ_t

where:
  m_t = β_1 × m_{t-1} + (1 - β_1) × g_t        (1st moment)
  v_t = β_2 × v_{t-1} + (1 - β_2) × g_t²       (2nd moment)
  m̂_t = m_t / (1 - β_1^t)                       (bias-corrected)
  v̂_t = v_t / (1 - β_2^t)                       (bias-corrected)

Hyperparameters:

lr = 1e-3 (initial)
β_1 = 0.9, β_2 = 0.999 (Adam defaults)
ε = 1e-8
weight_decay λ = 1e-4 (decoupled weight decay, applied directly to the weights rather than added to the gradient as an L2 penalty)

AdamW is preferred over Adam because decoupled weight decay (Loshchilov & Hutter 2019) keeps the decay term out of the adaptive gradient scaling, which tends to improve generalization.
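To make the decoupling concrete, here is a NumPy sketch of one update following the formula above, with the listed hyperparameters as defaults. It is an illustration of the update rule, not a replacement for torch.optim.AdamW; the function name is hypothetical.

```python
import numpy as np


def adamw_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-4):
    """One AdamW update; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad            # 1st moment
    v = beta2 * v + (1 - beta2) * grad ** 2       # 2nd moment
    m_hat = m / (1 - beta1 ** t)                  # bias correction
    v_hat = v / (1 - beta2 ** t)
    adaptive = lr * m_hat / (np.sqrt(v_hat) + eps)
    decay = lr * weight_decay * theta             # does not pass through m or v
    return theta - adaptive - decay, m, v


theta = np.array([1.0, -2.0])
zeros = np.zeros_like(theta)

# With a zero gradient, only the decoupled decay term moves the weights.
theta_decay, _, _ = adamw_step(theta, zeros, zeros, zeros, t=1)

# With a real gradient, the first bias-corrected step has magnitude close to lr.
theta_grad, _, _ = adamw_step(theta, np.array([0.5, -0.5]), zeros, zeros, t=1)

print(theta_decay, theta_grad)
```

The zero-gradient case shows the point of decoupling: the weights shrink by a factor of (1 - lr × λ) regardless of the gradient history stored in the moment estimates.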

6.2 Learning Rate Schedule: CosineAnnealingLR

α_t = α_min + (1/2)(α_0 - α_min)(1 + cos(π × t/T_max))

α_0 = 1e-3 (initial)
α_min = 1e-6 (floor)
T_max = epochs (100)

Cosine annealing allows aggressive early exploration (high LR in early epochs) before converging to a tight minimum (near-zero LR at epoch 100). This is well-suited to financial data where the loss surface is relatively flat and noisy.
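The schedule is simple enough to evaluate by hand. The sketch below applies the formula with α_0 = 1e-3, α_min = 1e-6, and T_max = 100; the function name is illustrative.

```python
import math


def cosine_lr(epoch: int, lr0: float = 1e-3, lr_min: float = 1e-6,
              t_max: int = 100) -> float:
    """Learning rate at a given epoch under cosine annealing."""
    return lr_min + 0.5 * (lr0 - lr_min) * (1 + math.cos(math.pi * epoch / t_max))


for epoch in (0, 10, 35, 68, 100):
    print(epoch, f"{cosine_lr(epoch):.2e}")
```

Since early stopping typically ends training after 5–35 epochs (section 6.4), most runs finish well before the floor is reached: at epoch 35 the rate is still roughly 7e-4.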

6.3 Automatic Mixed Precision (AMP)

Training uses torch.amp.GradScaler and torch.amp.autocast:

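optimizer.zero_grad(set_to_none=True)   # clear gradients from the previous step
with torch.amp.autocast(device.type, enabled=use_amp):
    preds = model(x_batch)
    loss = (criterion(preds["next_day"], y["next_day"])
            + criterion(preds["next_monday"], y["next_monday"])
            + criterion(preds["one_month"], y["one_month"]))

scaler.scale(loss).backward()   # backward on the scaled loss
scaler.step(optimizer)          # unscales gradients, skips the step on inf/NaN
scaler.update()                 # adjusts the scale factor for the next step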

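AMP runs forward/backward passes in reduced precision (FP16 by default on CUDA, or BF16 when dtype=torch.bfloat16 is passed to autocast) while keeping FP32 master weights. On the GB10:

FP16 tensor operations run at roughly 2× the throughput of FP32
Gradient scaling prevents FP16 underflow: the loss is multiplied by a scale factor before backward so that small gradients stay representable, and the gradients are unscaled again before the optimizer update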

6.4 Early Stopping

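class EarlyStopping:
    def __init__(self, patience: int = 10):
        self.patience = patience
        self.best_loss = float("inf")
        self.best_epoch = 0
        self.counter = 0

    def step(self, val_loss: float, epoch: int) -> bool:
        """Return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.counter = 0     # an improvement resets the patience counter
            return False         # continue
        self.counter += 1
        return self.counter >= self.patience  # stop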

Training stops when validation loss does not improve for 10 consecutive epochs. The best model weights (lowest val_loss) are restored before saving.

Observed behavior across 103 tickers:

Fastest convergence: FER (2 epochs), GEHC (2 epochs), EA (3 epochs) — limited data or trivially learnable patterns
Typical convergence: 5–35 epochs
Slowest: MSTR (59 epochs), ADP (65 epochs), DDOG (68 epochs) — high-volatility or complex regime

6.5 Training Loop Summary

for epoch in 1..100:
    model.train()
    for x_batch, y_batch in train_loader:
        forward pass (AMP)
        loss = MSE(nd) + MSE(nm) + MSE(om)
        backward pass (GradScaler)
        AdamW step
        GradScaler update
    
    val_loss = evaluate(val_loader)
    scheduler.step()
    
    if val_loss < best_loss:
        save best_state (CPU clone)
    
    if early_stopping.step(val_loss, epoch):
        break

model.load_state_dict(best_state)
save checkpoint: {epoch, model_state, config, val_loss}
log_run(DB)

7. Inference Pipeline

7.1 Checkpoint Format

PyTorch .pt file stored at /model_weights/ldtm/{TICKER}_ldtm.pt:

{
    "epoch":       int,           # best epoch number
    "model_state": OrderedDict,   # PyTorch state_dict
    "config":      dict,          # LDTMConfig as dict
    "val_loss":    float,         # best validation loss
}

7.2 Inference Steps

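# 1. Load checkpoint
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
checkpoint = torch.load(ckpt_path, map_location="cpu")
config = LDTMConfig(**checkpoint["config"])
model = LDTMModel(config)
model.load_state_dict(checkpoint["model_state"])
model.to(device)  # weights must live on the same device as the input in step 3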

# 2. Build inference window (last 30 trading days)
x, c_min, c_max = build_inference_window(ticker, db_url, config.window_size)

# 3. Forward pass (no gradient)
model.eval()
with torch.no_grad():
    preds = model(x.to(device))

# 4. Inverse transform to dollars
def to_dollars(norm_val):
    return round(norm_val * (c_max - c_min) + c_min, 2)

result = {
    "ticker":            ticker,
    "next_day_close":    to_dollars(preds["next_day"].item()),
    "next_monday_close": to_dollars(preds["next_monday"].item()),
    "one_month_close":   to_dollars(preds["one_month"].item()),
}

7.3 Inference Speed

Step	Time	Notes
Load checkpoint	~50ms	CPU load; model is tiny (~900KB)
Build inference window	~80ms	DB query + feature computation
Forward pass (GPU)	~2ms	30 timesteps, 128 hidden, 227K params
Forward pass (CPU)	~8ms	Fallback if no GPU available
DB logging	~10ms	psycopg2 insert
Total per ticker	~150ms

8. Orchestration & Parallelism

8.1 orchestrate.py — GPU-Aware Job Dispatcher

orchestrate.py replaces naive bash background-job parallelism when --parallel > 1.

Algorithm:

Query nvidia-smi for all GPUs and their VRAM
Compute concurrent slots per GPU: slots = max(1, min(16, vram_mib // 600))
Build a thread-safe queue.Queue with slots copies of each GPU index
Spawn ThreadPoolExecutor(max_workers=total_slots)
Each worker: pop GPU token → docker run --gpus "device=N" -e CUDA_VISIBLE_DEVICES=0 → push token back
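The token-queue pattern described above can be sketched with the standard library alone. In this illustration the docker run call is replaced by a stand-in function that only records peak concurrency; treating a missing VRAM reading as None and falling back to 4 slots is an assumption modeled on the GB10 notes in this section, and all function names are hypothetical.

```python
import queue
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def slots_for(vram_mib, fallback=4, per_job_mib=600, cap=16):
    """Concurrent slots for one GPU; fallback applies when VRAM is unknown."""
    if vram_mib is None:
        return fallback
    return max(1, min(cap, vram_mib // per_job_mib))


def run_jobs(jobs, gpu_vram, run_one):
    """Run jobs so that no GPU exceeds its slot count at any time."""
    tokens = queue.Queue()
    for gpu_idx, vram in gpu_vram.items():
        for _ in range(slots_for(vram)):
            tokens.put(gpu_idx)              # one token per available slot
    total_slots = tokens.qsize()

    def worker(job):
        gpu_idx = tokens.get()               # blocks until a slot frees up
        try:
            return run_one(job, gpu_idx)
        finally:
            tokens.put(gpu_idx)              # always hand the slot back

    with ThreadPoolExecutor(max_workers=total_slots) as pool:
        return list(pool.map(worker, jobs))


# Stand-in for launching a container: tracks how many jobs overlap.
state = {"active": 0, "peak": 0}
lock = threading.Lock()


def fake_job(ticker, gpu_idx):
    with lock:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
    time.sleep(0.01)
    with lock:
        state["active"] -= 1
    return ticker, gpu_idx


results = run_jobs([f"T{i}" for i in range(20)], {0: None}, fake_job)
print(len(results), state["peak"])
```

Returning the token in a finally block matters: a failed job would otherwise leak its slot and gradually starve the pool.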

Wave ordering (for --mode both):

Wave 1: All 103 training jobs (checkpoints must exist before inference)
Wave 2: All 103 inference jobs (parallelized separately)

GB10 configuration:

Single GPU (NVIDIA GB10), VRAM = N/A (unified memory, ~128GB)
Fallback: FALLBACK_SLOTS_PER_GPU = 4
Observed VRAM per LDTM container: ~310 MiB (CUDA context + model weights + batch)

8.2 Slot Math

GB10 total VRAM:         ~128 GB (unified LPDDR5X)
Triton/Mistral-7B FP8:  ~39.4 GB
Desktop (Xorg, GNOME):  ~600 MB
Available:              ~88 GB

Per LDTM container:      ~310 MiB
Theoretical max slots:   88,000 / 310 ≈ 283

Compute cap (practical): 16 slots
Reason: GPU-Util saturates at ~90%+ with 4 jobs competing with Triton.
        16 slots on training-only runs (no Triton competition) is safe.

9. Database Schema

9.1 `ldtm_run_log` — Append-only audit log

CREATE TABLE ldtm_run_log (
    id              BIGSERIAL   PRIMARY KEY,
    run_at          TIMESTAMPTZ DEFAULT NOW(),
    ticker          TEXT        NOT NULL,
    mode            TEXT        NOT NULL,      -- 'train' | 'infer'
    status          TEXT        NOT NULL,      -- 'success' | 'failed'
    duration_sec    FLOAT,
    epochs_run      INTEGER,                   -- training only
    best_val_loss   FLOAT,                     -- training only
    next_day_close      FLOAT,                 -- inference only
    next_monday_close   FLOAT,                 -- inference only
    one_month_close     FLOAT,                 -- inference only
    error_msg       TEXT
);

9.2 `ldtm_daily_snapshots` — Queryable daily predictions with actuals

CREATE TABLE ldtm_daily_snapshots (
    id                          BIGSERIAL   PRIMARY KEY,
    run_date                    DATE        NOT NULL,
    ticker                      TEXT        NOT NULL,
    generated_at                TIMESTAMPTZ DEFAULT NOW(),

    -- Predictions (written at generation time)
    next_day_close_pred         FLOAT NOT NULL,
    next_monday_close_pred      FLOAT NOT NULL,
    one_month_close_pred        FLOAT NOT NULL,
    run_date_close              FLOAT,         -- baseline for direction calculation

    -- Actuals (filled by snapshot_fillback.py)
    next_day_actual             FLOAT,
    next_day_actual_date        DATE,
    next_day_pct_error          FLOAT,         -- signed: (pred - actual) / actual * 100
    next_day_direction_pred     TEXT,          -- 'UP' | 'DOWN'
    next_day_direction_actual   TEXT,
    next_day_direction_correct  BOOLEAN,

    next_monday_actual          FLOAT,
    next_monday_actual_date     DATE,
    next_monday_pct_error       FLOAT,

    one_month_actual            FLOAT,
    one_month_actual_date       DATE,
    one_month_pct_error         FLOAT,

    source_run_log_id           BIGINT REFERENCES ldtm_run_log(id),
    UNIQUE (run_date, ticker)
);

9.3 `ldtm_accuracy_30d` — View

CREATE VIEW ldtm_accuracy_30d AS
SELECT
    ticker,
    COUNT(*) FILTER (WHERE next_day_direction_correct IS NOT NULL) AS evaluated_days,
    ROUND(AVG(ABS(next_day_pct_error))::numeric, 2)               AS avg_abs_pct_error,
    ROUND(AVG(ABS(next_monday_pct_error))::numeric, 2)            AS avg_abs_pct_error_weekly,
    ROUND(
        100.0 * COUNT(*) FILTER (WHERE next_day_direction_correct = true)
        / NULLIF(COUNT(*) FILTER (WHERE next_day_direction_correct IS NOT NULL), 0), 1
    )                                                             AS direction_accuracy_pct,
    MAX(run_date)                                                 AS latest_run_date
FROM ldtm_daily_snapshots
WHERE run_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY ticker
ORDER BY direction_accuracy_pct DESC NULLS LAST;

10. Prediction Outputs

10.1 JSON Output (per ticker, written to `predictions/`)

{
    "ticker":            "AAPL",
    "next_day_close":    268.55,
    "next_monday_close": 268.74,
    "one_month_close":   275.05
}

10.2 DB Output (`ldtm_run_log`)

Column	Example Value
ticker	"AAPL"
mode	"infer"
status	"success"
duration_sec	0.152
next_day_close	268.55
next_monday_close	268.74
one_month_close	275.05

10.3 Training Output (`ldtm_run_log` mode='train')

Column	Example Value
ticker	"AAPL"
mode	"train"
epochs_run	10
best_val_loss	0.577952
duration_sec	31.2

10.4 Derived Signal (computed at analysis time)

implied_1m_return = (one_month_close_pred / next_day_close_pred - 1) × 100

Example (AAPL):  (275.05 / 268.55 - 1) × 100 = +2.42%
Example (AMD):   (315.00 / 282.22 - 1) × 100 = +11.62%
Example (SQQQ):  (53.15  / 56.66  - 1) × 100 = -6.19%
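The derived signal is a one-line computation. The sketch below reproduces the AAPL example from the predictions shown in 10.1; the function name is illustrative.

```python
def implied_1m_return(one_month_pred: float, next_day_pred: float) -> float:
    """Implied return (%) from the next-day prediction to the one-month prediction."""
    return (one_month_pred / next_day_pred - 1.0) * 100.0


aapl = implied_1m_return(275.05, 268.55)
print(f"AAPL implied 1m return: {aapl:+.2f}%")
```

Because both inputs are model outputs, the signal compares two predictions with each other, not a prediction with the last observed close.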

10.5 2026-04-22 Run Summary (103 tickers)

Category	Count	Representative
Bullish >5% (1-month)	15	AMD +11.6%, INTU +8.4%, ADBE +8.4%
Bullish +1–5%	65	MSFT +6.7%, NVDA +5.1%, AAPL +2.4%
Neutral ±1%	13	KLAC +0.2%, LIN -0.02%, BKNG -0.05%
Bearish -1 to -5%	9	ODFL -21%, DDOG -3.6%, KHC -1.8%
ETF signal	2	TQQQ +5.2%, SQQQ -6.2%

11. Dashboard & LLM Interface

11.1 Streamlit Dashboard (`dashboard/app.py`)

Dual-mode design:

DATA_SOURCE=db: full features, live PostgreSQL, LLM tab enabled
DATA_SOURCE=blob: read-only Azure Blob JSON, LLM tab disabled

5 tabs:

Ticker View — prediction history chart + accuracy metrics
Today's Predictions — all 103 tickers ranked by implied 1-month return
Accuracy Leaderboard — 30-day direction accuracy per ticker
TQQQ Signal — Random Forest signal + equity curve
LLM Query — Mistral-7B Q&A interface

11.2 LLM Context Assembly (`llm/llm_query.py`)

Three SQL queries assembled into a structured text prompt:

=== LDTM Context — 2026-04-22 ===

--- NVDA ---
Prediction date  : 2026-04-22
Last known close : $193.47
Next-day pred    : $198.00
Next-Monday pred : $199.67
1-month pred     : $208.06
Implied 1m return: +5.1%
30d accuracy     : direction=N/A  avg|%err|=N/A  n=0
Last retrain     : 2026-04-21  val_loss=0.7451  epochs=44
Recent headlines :
  [2026-04-20] NVIDIA unveils next-gen Blackwell Ultra GPUs
  ...

LLM parameters:

Model: engine-fp8 (Mistral-7B FP8 via Triton)
Temperature: 0.3 (low → factual responses)
Max tokens: 1024
Context window: 32K tokens (safe for 8-ticker queries ~3K tokens)

12. Container Architecture

Container	Base Image	GPU	Purpose
`model-ldtm`	`nvcr.io/nvidia/pytorch:25.01-py3`	Yes	Train + infer
`trading-dashboard`	`python:3.11-slim`	No	Streamlit UI
`ldtm-llm-query`	`python:3.11-slim`	No	LLM CLI
`ldtm-snapshot-writer`	`model-ldtm` image	No	DB snapshot upsert
`ldtm-snapshot-fillback`	`model-ldtm` image	No	Fill actuals
`blob-export`	`trading-dashboard`	No	Azure Blob export
`trading-postgres`	`postgres:15`	No	Database

All containers use --network host for zero-overhead DB access on the DGX host.

13. Cron Schedule

Time	Command	Purpose
6:00 PM Mon-Fri	`run_ingestion.sh`	Fetch daily OHLCV from IB
6:15 PM Mon-Fri	`run_ldtm_infer.sh`	Infer all 103 → snapshots
6:30 PM Mon-Fri	`run_news_ingestion.sh`	Fetch news headlines + bodies
7:00 PM Mon-Fri	`run_blob_export.sh`	Export JSON to Azure Blob
2:00 AM Saturday	`run_ldtm_canary_retrain.sh`	Retrain NVDA, TQQQ, AAPL
1:00 AM 1st Sunday	`run_ldtm_monthly_retrain.sh`	Full retrain all 103 tickers

Why 6:15 PM for inference (not 6:05 PM)? The ingestion job processes 103 tickers via 3 parallel IB connections at ~7-8s/ticker. At 103 tickers, parallel completion is ~4-5 minutes. 15 minutes gives a 10-minute safety buffer for slow market days or IB connection retries.

14. Known Limitations & Caveats

Absolute Price Accuracy

A small number of tickers show predictions far outside their actual price range (NFLX ~$93 predicted vs ~$1000 actual). This is a data coverage issue, not a model bug. When market_data_daily has insufficient history for a ticker (< 30 training samples, or history begins after a stock split that wasn't detected), the per-window normalization produces reasonable relative predictions but in the wrong absolute range.

Mitigation: Query SELECT ticker, COUNT(*), MIN(date), MAX(date) FROM market_data_daily GROUP BY ticker to verify history depth. Any ticker with < 250 rows (1 year) should be considered unreliable for absolute price prediction.

Val Loss Does Not Indicate Accuracy

A low val_loss (e.g. GEHC = 0.308) does not necessarily mean the model makes accurate absolute predictions. Tickers with very limited history (GEHC IPO in 2023) converge quickly on training data but have seen almost no macro regime changes. Val_loss measures fit on the held-out 15% of historical data — not forward predictive accuracy.

No Exogenous Data

The model uses only price and volume data. It does not incorporate:

Earnings announcements
Fed meeting dates
Macro indicators (CPI, jobs reports)
Analyst estimate revisions

This is a design choice for simplicity. Adding exogenous features would likely require a different architecture (for example, a multi-input LSTM or a temporal fusion transformer).

Short-Horizon Prediction Challenge

Even with engineered features, predicting next-day close is an exceptionally hard task. The Efficient Market Hypothesis suggests that publicly available price information is already reflected in today's price. LDTM's value lies not in its absolute price predictions but in:

The relative momentum signal (implied 1-month return direction)
The TQQQ/SQQQ pair as a market direction indicator
Sector clustering of bullish/bearish signals