LDTM v2 — Complete Architecture & Implementation Reference
Model: Long-Duration Temporal Model (LDTM) v2
Date: 2026-04-21
System: NVIDIA GB10 DGX · PostgreSQL 15 · PyTorch 2.x (NGC 25.01)
Status: Production — 103 NDX-100 tickers trained and inferring daily
Table of Contents
- System Overview
- AI Model Design
- Mathematical Foundations
- Feature Engineering
- Dataset Design
- Training Architecture
- Inference Pipeline
- Orchestration & Parallelism
- Database Schema
- Prediction Outputs
- Dashboard & LLM Interface
- Container Architecture
- Cron Schedule
- Known Limitations & Caveats
1. System Overview
LDTM is a per-ticker LSTM-based price prediction system designed to forecast three forward price targets for every equity in the NASDAQ-100 universe:
| Horizon | Target | Notes |
|---|---|---|
| Next trading day | Next-day close price | Primary signal; used for daily decision support |
| Next Monday | Following Monday close | Weekly directional signal |
| One month (~21 trading days) | Monthly close | Trend confirmation |
The system processes 20+ years of daily OHLCV data per ticker, augments it with six engineered features, and trains a separate LSTM model per ticker. All 103 models run independently — there is no shared weight matrix or transfer learning. This design choice is intentional: AAPL and SQQQ have fundamentally different volatility regimes and price dynamics; forcing them to share representations would average away edge cases for both.
Pipeline Overview
Interactive Brokers TWS (IB Gateway)
↓
ingestion/ingest_daily.py
↓
market_data_daily (PostgreSQL)
↓
model/ldtm/dataset.py ← OHLCVDataset
↓
model/ldtm/trainer.py ← LDTM LSTM + AdamW + AMP
↓
/model_weights/ldtm/{ticker}_ldtm.pt
↓
model/ldtm/predict.py ← build_inference_window → forward pass
↓
ldtm_run_log (PostgreSQL)
↓
snapshot_writer.py → ldtm_daily_snapshots
↓
dashboard/app.py (Streamlit) ← localhost:8501
↓
llm/llm_query.py → Mistral-7B FP8 (Triton, localhost:8000/v1)
2. AI Model Design
2.1 Architecture
Model class: LDTMModel (PyTorch nn.Module)
Input tensor: (batch_size, window_size, input_size)
= (32, 30, 11)
→ LSTM
input_size = 11 (OHLCV + 6 engineered features)
hidden_size = 128
num_layers = 2
dropout = 0.2 (applied between layers, not after last)
batch_first = True
→ Last hidden state: h_n[-1] shape (batch_size, 128)
→ Three independent prediction heads (FC towers):
head_next_day : Linear(128→64) → ReLU → Linear(64→1)
head_next_monday : Linear(128→64) → ReLU → Linear(64→1)
head_one_month : Linear(128→64) → ReLU → Linear(64→1)
Output: dict {
"next_day": tensor (batch_size, 1) — normalized price
"next_monday": tensor (batch_size, 1) — normalized price
"one_month": tensor (batch_size, 1) — normalized price
}
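The layout above can be sketched in PyTorch roughly as follows. This is a minimal illustration, assuming a plain-keyword constructor; the real LDTMModel takes an LDTMConfig object, and the argument names and the make_head helper here are hypothetical.

```python
import torch
from torch import nn


class LDTMModel(nn.Module):
    """Sketch of the 2-layer LSTM with three independent heads (assumed signature)."""

    def __init__(self, input_size=11, hidden_size=128, num_layers=2, dropout=0.2):
        super().__init__()
        # nn.LSTM applies dropout between stacked layers only, not after the last one.
        self.lstm = nn.LSTM(
            input_size=input_size,
            hidden_size=hidden_size,
            num_layers=num_layers,
            dropout=dropout,
            batch_first=True,
        )

        def make_head():
            # Hypothetical helper: one FC tower per horizon.
            return nn.Sequential(
                nn.Linear(hidden_size, 64), nn.ReLU(), nn.Linear(64, 1)
            )

        self.head_next_day = make_head()
        self.head_next_monday = make_head()
        self.head_one_month = make_head()

    def forward(self, x):
        # x: (batch_size, window_size, input_size)
        _, (h_n, _) = self.lstm(x)
        last_hidden = h_n[-1]  # top layer's final hidden state: (batch_size, hidden_size)
        return {
            "next_day": self.head_next_day(last_hidden),
            "next_monday": self.head_next_monday(last_hidden),
            "one_month": self.head_one_month(last_hidden),
        }


model = LDTMModel()
batch = torch.zeros(32, 30, 11)
outputs = model(batch)
n_params = sum(p.numel() for p in model.parameters())
```

Each entry of outputs has shape (32, 1), one normalized price per sample and horizon.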
2.2 Design Rationale
Why LSTM over Transformer?
- LSTMs handle variable-length historical sequences natively with O(n) memory vs O(n²) for attention
- Daily OHLCV sequences are auto-regressive; the LSTM hidden state provides an efficient rolling summary of historical regime
- Transformers would require positional encoding over 5,000+ daily bars; LSTMs handle this implicitly
- Inference on a 30-day window takes ~2ms per ticker on GPU (~8ms on CPU); transformers would require Flash Attention and a more complex deployment
Why per-ticker models? Each ticker has a unique volatility profile, price range (KLAC ~$1800 vs CPRT ~$34), earnings cadence, and sector exposure. A shared model would learn the median behavior of NDX-100, suppressing the outlier momentum patterns that generate the most useful signals.
Why 2 LSTM layers?
- Layer 1 captures short-term momentum (candle patterns, gap fills, intraday reversals expressed in daily closes)
- Layer 2 captures medium-term regime (trending vs mean-reverting behavior within the 30-day window)
- 3+ layers provide diminishing returns on daily data at this timescale; they also require longer training and are prone to gradient vanishing despite LSTM's gating
Why window_size = 30?
- 30 trading days ≈ 6 weeks ≈ half of a quarterly earnings cycle
- Captures: one full options expiry cycle, intraday momentum decay, institutional rebalancing patterns
- Shorter windows (< 20): insufficient context for MA20 and RSI convergence
- Longer windows (> 60): the earliest prices in the window may belong to a different macro regime; per-window normalization would span across a regime change
Why 3 independent prediction heads? Each horizon has a different optimal feature representation:
- Next-day close is dominated by recent momentum (ret1, RSI14)
- Next-Monday is influenced by weekly patterns (ret5, MA5)
- One-month is driven by medium-term trend (MA10, MA20)
Sharing a single output head forces a compromise between these regimes. Independent FC towers allow each head to learn its own weighting of the LSTM hidden state.
2.3 Parameter Count
| Component | Parameters |
|---|---|
| LSTM layer 1 | 4 × (11 × 128 + 128 × 128 + 128) = 71,680 |
| LSTM layer 2 | 4 × (128 × 128 + 128 × 128 + 128) = 131,584 |
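| head_next_day | 128×64 + 64 + 64×1 + 1 = 8,321 |
| head_next_monday | 8,321 |
| head_one_month | 8,321 |
| Total | 228,227 (~228,000 parameters) |
The LSTM rows use the textbook single-bias form. PyTorch's nn.LSTM stores two bias vectors per layer (bias_ih and bias_hh), which adds 512 parameters per layer and gives a reported total of 229,251.
This is intentionally small. At ~228K parameters and 30-day windows, the model generalizes well on ~4,000–5,000 training samples (20 years of daily data). Larger models would be more likely to overfit.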
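The arithmetic in the table can be checked with a few lines of Python. This is a sketch only: the function names are made up for illustration, and bias_vectors=2 models the fact that PyTorch's nn.LSTM keeps separate bias_ih and bias_hh vectors per layer.

```python
def lstm_layer_params(input_size, hidden_size, bias_vectors=1):
    """Parameters of one LSTM layer: 4 gates, each with input weights,
    recurrent weights, and `bias_vectors` bias vectors."""
    per_gate = (
        input_size * hidden_size
        + hidden_size * hidden_size
        + bias_vectors * hidden_size
    )
    return 4 * per_gate


def head_params(hidden_size=128, mid=64):
    """Linear(hidden_size -> mid) followed by Linear(mid -> 1), weights plus biases."""
    return (hidden_size * mid + mid) + (mid * 1 + 1)


def total_params(bias_vectors=1):
    layer1 = lstm_layer_params(11, 128, bias_vectors)
    layer2 = lstm_layer_params(128, 128, bias_vectors)
    return layer1 + layer2 + 3 * head_params()


print(lstm_layer_params(11, 128), lstm_layer_params(128, 128), head_params())
print(total_params(bias_vectors=1), total_params(bias_vectors=2))
```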
3. Mathematical Foundations
3.1 LSTM Gate Equations
At each timestep t, given input x_t ∈ ℝ^11 and previous hidden state h_{t-1} ∈ ℝ^128:
Forget gate — what to discard from cell state:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
Input gate — what new information to store:
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
g_t = tanh(W_g · [h_{t-1}, x_t] + b_g)
Cell state update:
C_t = f_t ⊙ C_{t-1} + i_t ⊙ g_t
Output gate:
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
h_t = o_t ⊙ tanh(C_t)
Where:
- σ = sigmoid activation
- tanh = hyperbolic tangent
- ⊙ = element-wise (Hadamard) product
- W_{f,i,g,o} ∈ ℝ^{128×(128+11)} — weight matrices
- b_{f,i,g,o} ∈ ℝ^{128} — bias vectors
The LSTM processes all 30 timesteps sequentially. Only the final hidden state h_{30} (from the last layer) is passed to the three prediction heads.
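The gate equations translate almost line for line into NumPy. The sketch below runs a single layer over one synthetic 30-step window; the weights and inputs are random placeholders, not trained LDTM parameters, and the dictionary layout of W and b is an assumption made for readability.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM timestep.

    W maps gate name -> matrix of shape (hidden, hidden + input);
    b maps gate name -> vector of shape (hidden,).
    """
    concat = np.concatenate([h_prev, x_t])      # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ concat + b["f"])     # forget gate
    i_t = sigmoid(W["i"] @ concat + b["i"])     # input gate
    g_t = np.tanh(W["g"] @ concat + b["g"])     # candidate cell values
    o_t = sigmoid(W["o"] @ concat + b["o"])     # output gate
    c_t = f_t * c_prev + i_t * g_t              # cell state update
    h_t = o_t * np.tanh(c_t)                    # new hidden state
    return h_t, c_t


hidden, n_features, window = 128, 11, 30
rng = np.random.default_rng(0)
W = {k: rng.normal(scale=0.1, size=(hidden, hidden + n_features)) for k in "figo"}
b = {k: np.zeros(hidden) for k in "figo"}

h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.normal(size=(window, n_features)):  # iterate over the 30 timesteps
    h, c = lstm_step(x_t, h, c, W, b)
# h now plays the role of h_30, the vector handed to the prediction heads
```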
3.2 Per-Window Min-Max Normalization
Critical design decision. Rather than fitting a global scaler on training data (which fails catastrophically when inference-time prices are out of the training range — AAPL was $81 in 2019 training data, $266 today), LDTM v2 normalizes each 30-day window independently.
For each window W of shape (30, 11):
feat_min_j = min_{t=1..30} W[t, j] for j = 0..10
feat_max_j = max_{t=1..30} W[t, j] for j = 0..10
W_norm[t, j] = (W[t, j] - feat_min_j) / (feat_max_j - feat_min_j + ε)
Where ε = 1e-8 (prevents division by zero for constant-price windows).
Inverse transform (dollars from normalized prediction):
P_dollars = P_norm × (c_max - c_min) + c_min
Where c_min = feat_min[3] and c_max = feat_max[3] (column index 3 = close).
This means the model learns relative movements within each window, not absolute price levels. A normalized value of 0.5 means "mid-range of this 30-day window", regardless of whether that window is from 2005 (AAPL ~$5) or 2026 (AAPL ~$270).
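A small NumPy sketch of the forward and inverse transforms, assuming the window is a (30, 11) array with close in column 3. The synthetic data is there only to show that two windows with the same shape of moves but very different price levels normalize to (almost exactly) the same values.

```python
import numpy as np

EPS = 1e-8
CLOSE_COL_IDX = 3


def normalize_window(window):
    """Min-max scale each column of a (30, 11) window independently."""
    feat_min = window.min(axis=0)
    feat_max = window.max(axis=0)
    scaled = (window - feat_min) / (feat_max - feat_min + EPS)
    return scaled, feat_min[CLOSE_COL_IDX], feat_max[CLOSE_COL_IDX]


def to_dollars(p_norm, c_min, c_max):
    """Inverse transform for a normalized close prediction."""
    return p_norm * (c_max - c_min) + c_min


rng = np.random.default_rng(42)
cheap = rng.uniform(4.0, 6.0, size=(30, 11))  # synthetic low-priced window
expensive = cheap * 50.0                     # same moves at 50x the price level

cheap_scaled, cheap_min, cheap_max = normalize_window(cheap)
exp_scaled, exp_min, exp_max = normalize_window(expensive)
```

Because the scale is recomputed per window, c_min and c_max have to travel with every prediction so it can be mapped back to dollars.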
3.3 Loss Function
Multi-head mean squared error:
L = MSE(ŷ_nd, y_nd) + MSE(ŷ_nm, y_nm) + MSE(ŷ_om, y_om)
Where all predictions and targets are in per-window normalized space (nominally [0,1]; section 3.4 covers values that fall outside that range):
MSE(ŷ, y) = (1/n) Σ_{i=1}^{n} (ŷ_i - y_i)²
Why MSE and not MAE? MSE penalizes large errors quadratically, which is preferable for financial prediction — a 10% prediction error is much worse than ten 1% errors from a risk management perspective.
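The "one 10% error versus ten 1% errors" comparison can be made concrete with a short calculation. The error values are illustrative and expressed in percentage points.

```python
def mse(errors):
    return sum(e * e for e in errors) / len(errors)


def mae(errors):
    return sum(abs(e) for e in errors) / len(errors)


# Ten predictions, errors in percentage points.
one_large = [10.0] + [0.0] * 9   # a single 10% miss
ten_small = [1.0] * 10           # ten 1% misses

print(mae(one_large), mae(ten_small))  # identical under MAE
print(mse(one_large), mse(ten_small))  # the single large miss costs 10x more under MSE
```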
3.4 Label Normalization
Training labels are also normalized per-window:
y_nd = (close_{t+1} - c_min) / (c_max - c_min + ε)
y_nm = (close_{mon} - c_min) / (c_max - c_min + ε)
y_om = (close_{+21d} - c_min) / (c_max - c_min + ε)
Labels outside [0,1] are possible (and expected) — a prediction beyond the current 30-day range is a directional signal that the model expects a breakout.
3.5 Direction Accuracy Metric
For evaluation, the raw dollar prediction is compared against the actual next-day close:
direction_pred = "UP" if P̂_next_day > P_today else "DOWN"
direction_actual = "UP" if P_actual > P_today else "DOWN"
direction_correct = (direction_pred == direction_actual)
direction_accuracy = (Σ direction_correct) / n_evaluated × 100%
3.6 Percentage Error
pct_error = (P̂_pred - P_actual) / P_actual × 100
Reported as signed (positive = overestimate) and unsigned |pct_error| for accuracy assessment.
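Both evaluation metrics can be sketched in plain Python. The function names and the sample prices below are invented for illustration; the production values are computed when actuals are filled into ldtm_daily_snapshots.

```python
def direction(price, baseline):
    """'UP' if price is strictly above the baseline close, else 'DOWN'."""
    return "UP" if price > baseline else "DOWN"


def evaluate(rows):
    """rows: iterable of (p_today, p_pred, p_actual) tuples in dollars.

    Returns (direction_accuracy_pct, signed_pct_errors).
    """
    correct = 0
    pct_errors = []
    for p_today, p_pred, p_actual in rows:
        if direction(p_pred, p_today) == direction(p_actual, p_today):
            correct += 1
        pct_errors.append((p_pred - p_actual) / p_actual * 100)
    return correct / len(rows) * 100, pct_errors


# Made-up prices for illustration only.
sample = [
    (100.0, 102.0, 101.0),  # predicted UP, actual UP     -> correct, overestimate
    (100.0, 99.0, 101.0),   # predicted DOWN, actual UP   -> wrong, underestimate
    (50.0, 49.0, 48.0),     # predicted DOWN, actual DOWN -> correct, overestimate
    (200.0, 204.0, 198.0),  # predicted UP, actual DOWN   -> wrong, overestimate
]
accuracy, errors = evaluate(sample)
mean_abs_error = sum(abs(e) for e in errors) / len(errors)
```

Note that an unchanged close counts as "DOWN" under the strict inequality used in the definition.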
4. Feature Engineering
LDTM v2 expands the raw OHLCV 5-feature set to 11 features by adding momentum and trend indicators.
4.1 Feature Table
| Index | Name | Formula | Purpose |
|---|---|---|---|
| 0 | open | raw | Price gap context |
| 1 | high | raw | Intraday range ceiling |
| 2 | low | raw | Intraday range floor |
| 3 | close | raw | Prediction anchor (CLOSE_COL_IDX=3) |
| 4 | volume | raw | Participation/liquidity |
| 5 | ret1 | ln(C_t / C_{t-1}) | 1-day momentum |
| 6 | ret5 | ln(C_t / C_{t-5}) | Weekly momentum |
| 7 | rsi14 | Wilder RSI(14) | Overbought/oversold |
| 8 | ma5 | SMA(C, 5) | Short-term trend |
| 9 | ma10 | SMA(C, 10) | Medium-term trend |
| 10 | ma20 | SMA(C, 20) | Swing trend / regime |
4.2 Log Returns
ret1_t = ln(C_t / C_{t-1})
ret5_t = ln(C_t / C_{t-5})
Log returns are used instead of simple returns (C_t/C_{t-1} - 1) for two reasons:
- Log-normality: daily log returns are approximately normally distributed, making min-max normalization more stable
- Time-additivity: multi-period log returns sum: ln(C_t/C_{t-5}) = Σ_{k=1}^{5} ln(C_{t-k+1}/C_{t-k})
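The additivity property is easy to verify numerically. The closes below are illustrative values, not market data.

```python
import math

closes = [100.0, 101.5, 99.8, 102.3, 103.1, 104.0]  # illustrative closes, C_{t-5} .. C_t

ret1 = [math.log(closes[k] / closes[k - 1]) for k in range(1, len(closes))]
ret5 = math.log(closes[-1] / closes[0])

simple1 = [closes[k] / closes[k - 1] - 1 for k in range(1, len(closes))]
simple5 = closes[-1] / closes[0] - 1

print(sum(ret1), ret5)        # equal up to floating-point rounding
print(sum(simple1), simple5)  # simple returns do not add up the same way
```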
4.3 Wilder RSI (14 periods)
Δ_t = C_t - C_{t-1}
gain_t = max(Δ_t, 0)
loss_t = max(-Δ_t, 0)
avg_gain_t = EWM(gain, α=1/14, min_periods=14)_t
avg_loss_t = EWM(loss, α=1/14, min_periods=14)_t
RS_t = avg_gain_t / (avg_loss_t + ε)
RSI_t = 100 - (100 / (1 + RS_t))
Wilder's smoothing uses com = period - 1 = 13 in pandas EWM notation:
avg_gain = gain.ewm(com=13, min_periods=14).mean()
RSI ∈ [0, 100]:
- RSI > 70: overbought (potential reversal signal)
- RSI < 30: oversold (potential bounce signal)
- RSI 30–70: neutral momentum
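A dependency-free sketch of the RSI calculation above. It assumes the pandas defaults implied by gain.ewm(com=13, min_periods=14).mean(), namely adjust=True, and reproduces that weighted-average form recursively. Wilder's original recursive smoothing corresponds to adjust=False; the two agree closely once enough history has accumulated.

```python
import math


def wilder_rsi(closes, period=14, eps=1e-8):
    """RSI from exponentially weighted average gains and losses (alpha = 1/period).

    Uses the bias-adjusted weighted-average form (pandas' default adjust=True),
    computed recursively. Returns NaN until `period` price changes are available.
    """
    alpha = 1.0 / period
    decay = 1.0 - alpha
    gain_num = loss_num = weight_sum = 0.0
    rsi = [math.nan]  # no price change exists for the first bar
    for k in range(1, len(closes)):
        delta = closes[k] - closes[k - 1]
        gain_num = max(delta, 0.0) + decay * gain_num
        loss_num = max(-delta, 0.0) + decay * loss_num
        weight_sum = 1.0 + decay * weight_sum
        if k < period:
            rsi.append(math.nan)  # still warming up
            continue
        avg_gain = gain_num / weight_sum
        avg_loss = loss_num / weight_sum
        rs = avg_gain / (avg_loss + eps)
        rsi.append(100.0 - 100.0 / (1.0 + rs))
    return rsi


rising = [100.0 + k for k in range(30)]
falling = [100.0 - k for k in range(30)]
print(wilder_rsi(rising)[-1], wilder_rsi(falling)[-1])
```

A steadily rising series pins the RSI near 100 and a steadily falling one at 0, which matches the overbought and oversold readings listed above.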
4.4 Simple Moving Averages
MA_k(t) = (1/k) Σ_{i=0}^{k-1} C_{t-i}
For k ∈ {5, 10, 20}. The MA5/MA20 relationship (implicitly available to the model, as both features are present in the normalized window) is a short-horizon analogue of the Golden Cross / Death Cross signal, which is classically defined on the 50-day and 200-day averages.
4.5 Warmup Requirement
MA20 requires 20 prior observations. For each ticker, the first valid row is:
first_valid = argmax_{t} [~any(isnan(features[t, :]))]
Windows begin at first_valid + window_size - 1, discarding the first ~20 warmup rows. For tickers with 20+ years of data (~5,000 bars), this loses about 0.4% of samples (20 / 5,000), which is negligible.
5. Dataset Design
5.1 OHLCVDataset
class OHLCVDataset(torch.utils.data.Dataset):
# split: 'train' (70%) | 'val' (15%) | 'test' (15%)
# window_size: 30
# Returns: (x_tensor[30, 11], y_labels_dict)
Split boundaries (chronological, not random):
|←─────── train (70%) ───────→|←── val (15%) ──→|←── test (15%) ──→|
t=0 t=0.70N t=0.85N t=N
Chronological split is mandatory — random splits would cause data leakage (future prices appearing in training windows).
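The split boundaries can be computed with integer arithmetic, which avoids floating-point surprises at the cut points. This is a sketch with a hypothetical helper name, not the OHLCVDataset code itself.

```python
def chronological_split(n_samples, train_pct=70, val_pct=15):
    """Return index ranges for train/val/test in time order (integer arithmetic)."""
    train_end = n_samples * train_pct // 100
    val_end = n_samples * (train_pct + val_pct) // 100
    return {
        "train": range(0, train_end),
        "val": range(train_end, val_end),
        "test": range(val_end, n_samples),
    }


splits = chronological_split(5000)
print({name: (r.start, r.stop) for name, r in splits.items()})
```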
Per-window normalization in __getitem__:
window = feat_vals[end-window_size:end] # shape (30, 11)
feat_min = window.min(axis=0) # shape (11,)
feat_max = window.max(axis=0) # shape (11,)
c_range = feat_max[3] - feat_min[3]
x_scaled = (window - feat_min) / (feat_max - feat_min + eps)
labels = {
"next_day": (close[end] - feat_min[3]) / (c_range + eps),
"next_monday": (close[mon] - feat_min[3]) / (c_range + eps),
"one_month": (close[+21d] - feat_min[3]) / (c_range + eps),
"close_min": feat_min[3],
"close_max": feat_max[3],
}
5.2 Inference Window (build_inference_window)
For inference, there is no ground truth label. The function:
- Loads the last window_size + _WARMUP + 10 = 60 rows from market_data_daily
- Computes all 11 features
- Drops NaN warmup rows
- Takes the last 30 rows
- Normalizes and returns (tensor[1, 30, 11], c_min, c_max)
The extra 10 rows above _WARMUP = 20 provide a safety buffer for tickers with recent data gaps (weekends, holidays loaded as separate rows).
6. Training Architecture
6.1 Optimizer: AdamW
θ_{t+1} = θ_t - α_t × m̂_t / (√v̂_t + ε) - α_t × λ × θ_t
where:
m_t = β_1 × m_{t-1} + (1 - β_1) × g_t (1st moment)
v_t = β_2 × v_{t-1} + (1 - β_2) × g_t² (2nd moment)
m̂_t = m_t / (1 - β_1^t) (bias-corrected)
v̂_t = v_t / (1 - β_2^t) (bias-corrected)
Hyperparameters:
- lr = 1e-3 (initial)
- β_1 = 0.9, β_2 = 0.999 (Adam defaults)
- ε = 1e-8
- weight_decay λ = 1e-4 (decoupled weight decay applied directly to the weights, not routed through the gradient moments)
AdamW is preferred over Adam because decoupled weight decay (Loshchilov & Hutter 2019) prevents weight decay from interacting with the adaptive gradient scaling, improving generalization on financial time series.
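To make the decoupling concrete, here is one AdamW update for a single scalar parameter, written directly from the equations above. It is an illustrative sketch, not the torch.optim.AdamW implementation.

```python
import math


def adamw_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-4):
    """One AdamW update for a single scalar parameter (t is the 1-based step count)."""
    m = beta1 * m + (1 - beta1) * grad           # 1st moment
    v = beta2 * v + (1 - beta2) * grad * grad    # 2nd moment
    m_hat = m / (1 - beta1 ** t)                 # bias correction
    v_hat = v / (1 - beta2 ** t)
    adaptive = lr * m_hat / (math.sqrt(v_hat) + eps)
    decay = lr * weight_decay * theta            # decoupled: never enters m or v
    return theta - adaptive - decay, m, v


theta, m, v = 1.0, 0.0, 0.0
theta_after_grad, m1, v1 = adamw_step(theta, grad=0.5, m=m, v=v, t=1)
theta_zero_grad, _, _ = adamw_step(theta, grad=0.0, m=m, v=v, t=1)
```

With a zero gradient the parameter still shrinks by the factor (1 - lr × λ), which is the decay term acting on its own. On the first step with a nonzero gradient the adaptive term has magnitude close to lr regardless of the gradient's scale.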
6.2 Learning Rate Schedule: CosineAnnealingLR
α_t = α_min + (1/2)(α_0 - α_min)(1 + cos(π × t/T_max))
- α_0 = 1e-3 (initial)
- α_min = 1e-6 (floor)
- T_max = epochs (100)
Cosine annealing allows aggressive early exploration (high LR in early epochs) before converging to a tight minimum (near-zero LR at epoch 100). This is well-suited to financial data where the loss surface is relatively flat and noisy.
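The schedule's closed form is short enough to tabulate directly. The function below is a sketch of the formula with the hyperparameters listed above, not a call into torch.optim.lr_scheduler.

```python
import math


def cosine_lr(epoch, lr_max=1e-3, lr_min=1e-6, t_max=100):
    """Closed-form cosine annealing value at a given epoch (0 <= epoch <= t_max)."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * epoch / t_max))


for epoch in (0, 10, 25, 35, 50, 75, 100):
    print(epoch, f"{cosine_lr(epoch):.2e}")
```

One consequence worth keeping in mind: at epoch 35 the schedule is still near 7.3e-4, so runs that stop early never reach the low-LR tail of the curve.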
6.3 Automatic Mixed Precision (AMP)
Training uses torch.amp.GradScaler and torch.amp.autocast:
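optimizer.zero_grad(set_to_none=True)  # clear gradients from the previous step
with torch.amp.autocast(device.type, enabled=use_amp):
    preds = model(x_batch)
    loss = criterion(preds["next_day"], y["next_day"]) + ...
scaler.scale(loss).backward()  # scale the loss so small FP16 gradients do not underflow
scaler.step(optimizer)         # unscales gradients; skips the step if inf/NaN is found
scaler.update()                # adjust the scale factor for the next iteration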
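AMP runs eligible operations in FP16 (autocast's default on CUDA; BF16 can be requested with dtype=torch.bfloat16 on Ampere and later) while the model weights stay in FP32. On the GB10:
- FP16 tensor operations run at 2× throughput vs FP32
- Gradient scaling prevents underflow in FP16: the loss is multiplied by a scale factor before backward so that small gradients stay representable, and the gradients are unscaled before the optimizer update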
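The underflow that gradient scaling guards against can be reproduced with NumPy's float16 type. The gradient magnitude below is an illustrative value, not one measured from LDTM training; 2**16 is GradScaler's default initial scale.

```python
import numpy as np

SCALE = 2.0 ** 16          # GradScaler's default initial scale
tiny_grad = 1e-8           # illustrative gradient magnitude, not measured from LDTM

unscaled_fp16 = np.float16(tiny_grad)           # underflows to zero in FP16
scaled_fp16 = np.float16(tiny_grad * SCALE)     # representable after scaling
recovered = np.float32(scaled_fp16) / SCALE     # unscale in FP32 before the optimizer step

print(unscaled_fp16, scaled_fp16, recovered)
```

The recovered value matches the original gradient to within FP16's relative precision, whereas the unscaled cast loses it entirely.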
6.4 Early Stopping
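class EarlyStopping:
    def __init__(self, patience=10):
        self.patience = patience
        self.best_loss = float("inf")
        self.best_epoch = None
        self.counter = 0

    def step(self, val_loss, epoch):
        """Return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.counter = 0
            return False  # improved: continue
        self.counter += 1
        return self.counter >= self.patience  # stop after `patience` stale epochs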
Training stops when validation loss does not improve for 10 consecutive epochs. The best model weights (lowest val_loss) are restored before saving.
Observed behavior across 103 tickers:
- Fastest convergence: FER (2 epochs), GEHC (2 epochs), EA (3 epochs) — limited data or trivially learnable patterns
- Typical convergence: 5–35 epochs
- Slowest: MSTR (59 epochs), ADP (65 epochs), DDOG (68 epochs) — high-volatility or complex regime
6.5 Training Loop Summary
for epoch in 1..100:
    model.train()
    for x_batch, y_batch in train_loader:
        forward pass (AMP)
        loss = MSE(nd) + MSE(nm) + MSE(om)
        backward pass (GradScaler)
        AdamW step
        GradScaler update
    val_loss = evaluate(val_loader)
    scheduler.step()
    if val_loss < best_loss:
        save best_state (CPU clone)
    if early_stopping.step(val_loss, epoch):
        break
model.load_state_dict(best_state)
save checkpoint: {epoch, model_state, config, val_loss}
log_run(DB)
7. Inference Pipeline
7.1 Checkpoint Format
PyTorch .pt file stored at /model_weights/ldtm/{TICKER}_ldtm.pt:
{
"epoch": int, # best epoch number
"model_state": OrderedDict, # PyTorch state_dict
"config": dict, # LDTMConfig as dict
"val_loss": float, # best validation loss
}
7.2 Inference Steps
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 1. Load checkpoint (on CPU first, then move the model to the target device)
checkpoint = torch.load(ckpt_path, map_location="cpu")
config = LDTMConfig(**checkpoint["config"])
model = LDTMModel(config)
model.load_state_dict(checkpoint["model_state"])
model.to(device)

# 2. Build inference window (last 30 trading days)
x, c_min, c_max = build_inference_window(ticker, db_url, config.window_size)

# 3. Forward pass (no gradient)
model.eval()
with torch.no_grad():
    preds = model(x.to(device))

# 4. Inverse transform to dollars
def to_dollars(norm_val):
    return round(norm_val * (c_max - c_min) + c_min, 2)

result = {
    "ticker": ticker,
    "next_day_close": to_dollars(preds["next_day"].item()),
    "next_monday_close": to_dollars(preds["next_monday"].item()),
    "one_month_close": to_dollars(preds["one_month"].item()),
}
7.3 Inference Speed
| Step | Time | Notes |
|---|---|---|
| Load checkpoint | ~50ms | CPU load; model is tiny (~900KB) |
| Build inference window | ~80ms | DB query + feature computation |
| Forward pass (GPU) | ~2ms | 30 timesteps, 128 hidden, ~228K params |
| Forward pass (CPU) | ~8ms | Fallback if no GPU available |
| DB logging | ~10ms | psycopg2 insert |
| Total per ticker | ~150ms | GPU path; the steps above sum to ~142ms |
8. Orchestration & Parallelism
8.1 orchestrate.py — GPU-Aware Job Dispatcher
orchestrate.py replaces naive bash background-job parallelism when --parallel > 1.
Algorithm:
- Query nvidia-smi for all GPUs and their VRAM
- Compute concurrent slots per GPU: slots = max(1, min(16, vram_mib // 600))
- Build a thread-safe queue.Queue with slots copies of each GPU index
- Spawn ThreadPoolExecutor(max_workers=total_slots)
- Each worker: pop GPU token → docker run --gpus "device=N" -e CUDA_VISIBLE_DEVICES=0 → push token back
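The token-queue pattern in the steps above can be sketched with the standard library alone. This is not the orchestrate.py source: dispatch, fake_job, and the VRAM figures are hypothetical, and the job function only records concurrency instead of launching docker run.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def slots_for(vram_mib, per_job_mib=600, cap=16):
    """Concurrent jobs allowed on one GPU: at least 1, at most `cap`."""
    return max(1, min(cap, vram_mib // per_job_mib))


def dispatch(jobs, gpu_vram_mib, run_job):
    """Run every job on some GPU, never exceeding each GPU's slot count.

    gpu_vram_mib: {gpu_index: vram_mib}; run_job(job, gpu_index) does the work.
    """
    tokens = queue.Queue()
    total_slots = 0
    for gpu_index, vram in gpu_vram_mib.items():
        for _ in range(slots_for(vram)):
            tokens.put(gpu_index)  # one token per concurrent slot
            total_slots += 1

    def worker(job):
        gpu_index = tokens.get()       # blocks until a slot is free
        try:
            return run_job(job, gpu_index)
        finally:
            tokens.put(gpu_index)      # always hand the slot back

    with ThreadPoolExecutor(max_workers=total_slots) as pool:
        return list(pool.map(worker, jobs))


# Stand-in for `docker run`: records peak concurrency per GPU instead of launching anything.
lock = threading.Lock()
active, peak = {}, {}


def fake_job(ticker, gpu_index):
    with lock:
        active[gpu_index] = active.get(gpu_index, 0) + 1
        peak[gpu_index] = max(peak.get(gpu_index, 0), active[gpu_index])
    threading.Event().wait(0.01)  # simulate a short-running container
    with lock:
        active[gpu_index] -= 1
    return ticker, gpu_index


results = dispatch([f"T{k}" for k in range(20)], {0: 1800, 1: 700}, fake_job)
```

Returning the token in a finally block means a failed job cannot permanently shrink the pool of slots.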
Wave ordering (for --mode both):
- Wave 1: All 103 training jobs (checkpoints must exist before inference)
- Wave 2: All 103 inference jobs (parallelized separately)
GB10 configuration:
- Single GPU (NVIDIA GB10), VRAM = N/A (unified memory, ~128GB)
- Fallback: FALLBACK_SLOTS_PER_GPU = 4
- Observed VRAM per LDTM container: ~310 MiB (CUDA context + model weights + batch)
8.2 Slot Math
GB10 total VRAM: ~128 GB (unified LPDDR5X)
Triton/Mistral-7B FP8: ~39.4 GB
Desktop (Xorg, GNOME): ~600 MB
Available: ~88 GB
Per LDTM container: ~310 MiB
Theoretical max slots: 88,000 / 310 ≈ 283
Compute cap (practical): 16 slots
Reason: GPU-Util saturates at ~90%+ with 4 jobs competing with Triton.
16 slots on training-only runs (no Triton competition) is safe.
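The slot arithmetic can be reproduced as follows. The figures are the ones quoted in the breakdown, treated as decimal MB for simplicity (the breakdown itself mixes GB and MiB, so the result is approximate).

```python
# Figures from the slot-math breakdown, in MB (decimal, as in the 88,000 / 310 line).
total_mb = 128_000
triton_mb = 39_400
desktop_mb = 600
per_container_mb = 310
compute_cap = 16

available_mb = total_mb - triton_mb - desktop_mb
memory_bound_slots = available_mb // per_container_mb
practical_slots = min(compute_cap, memory_bound_slots)

print(available_mb, memory_bound_slots, practical_slots)
```

At the 16-slot cap the containers need only about 5 GB in total, so compute contention rather than memory is the binding constraint.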
9. Database Schema
9.1 ldtm_run_log — Append-only audit log
CREATE TABLE ldtm_run_log (
id BIGSERIAL PRIMARY KEY,
run_at TIMESTAMPTZ DEFAULT NOW(),
ticker TEXT NOT NULL,
mode TEXT NOT NULL, -- 'train' | 'infer'
status TEXT NOT NULL, -- 'success' | 'failed'
duration_sec FLOAT,
epochs_run INTEGER, -- training only
best_val_loss FLOAT, -- training only
next_day_close FLOAT, -- inference only
next_monday_close FLOAT, -- inference only
one_month_close FLOAT, -- inference only
error_msg TEXT
);
9.2 ldtm_daily_snapshots — Queryable daily predictions with actuals
CREATE TABLE ldtm_daily_snapshots (
id BIGSERIAL PRIMARY KEY,
run_date DATE NOT NULL,
ticker TEXT NOT NULL,
generated_at TIMESTAMPTZ DEFAULT NOW(),
-- Predictions (written at generation time)
next_day_close_pred FLOAT NOT NULL,
next_monday_close_pred FLOAT NOT NULL,
one_month_close_pred FLOAT NOT NULL,
run_date_close FLOAT, -- baseline for direction calculation
-- Actuals (filled by snapshot_fillback.py)
next_day_actual FLOAT,
next_day_actual_date DATE,
next_day_pct_error FLOAT, -- signed: (pred - actual) / actual * 100
next_day_direction_pred TEXT, -- 'UP' | 'DOWN'
next_day_direction_actual TEXT,
next_day_direction_correct BOOLEAN,
next_monday_actual FLOAT,
next_monday_actual_date DATE,
next_monday_pct_error FLOAT,
one_month_actual FLOAT,
one_month_actual_date DATE,
one_month_pct_error FLOAT,
source_run_log_id BIGINT REFERENCES ldtm_run_log(id),
UNIQUE (run_date, ticker)
);
9.3 ldtm_accuracy_30d — View
CREATE VIEW ldtm_accuracy_30d AS
SELECT
ticker,
COUNT(*) FILTER (WHERE next_day_direction_correct IS NOT NULL) AS evaluated_days,
ROUND(AVG(ABS(next_day_pct_error))::numeric, 2) AS avg_abs_pct_error,
ROUND(AVG(ABS(next_monday_pct_error))::numeric, 2) AS avg_abs_pct_error_weekly,
ROUND(
100.0 * COUNT(*) FILTER (WHERE next_day_direction_correct = true)
/ NULLIF(COUNT(*) FILTER (WHERE next_day_direction_correct IS NOT NULL), 0), 1
) AS direction_accuracy_pct,
MAX(run_date) AS latest_run_date
FROM ldtm_daily_snapshots
WHERE run_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY ticker
ORDER BY direction_accuracy_pct DESC NULLS LAST;
10. Prediction Outputs
10.1 JSON Output (per ticker, written to predictions/)
{
"ticker": "AAPL",
"next_day_close": 268.55,
"next_monday_close": 268.74,
"one_month_close": 275.05
}
10.2 DB Output (ldtm_run_log)
| Column | Example Value |
|---|---|
| ticker | "AAPL" |
| mode | "infer" |
| status | "success" |
| duration_sec | 0.152 |
| next_day_close | 268.55 |
| next_monday_close | 268.74 |
| one_month_close | 275.05 |
10.3 Training Output (ldtm_run_log mode='train')
| Column | Example Value |
|---|---|
| ticker | "AAPL" |
| mode | "train" |
| epochs_run | 10 |
| best_val_loss | 0.577952 |
| duration_sec | 31.2 |
10.4 Derived Signal (computed at analysis time)
implied_1m_return = (one_month_close_pred / next_day_close_pred - 1) × 100
Example (AAPL): (275.05 / 268.55 - 1) × 100 = +2.42%
Example (AMD): (315.00 / 282.22 - 1) × 100 = +11.62%
Example (SQQQ): (53.15 / 56.66 - 1) × 100 = -6.19%
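The derived signal is a one-line function. The sketch below applies it to the three prediction pairs quoted in the examples; the function name mirrors the formula, but its placement in any particular module is an assumption.

```python
def implied_1m_return(one_month_close_pred, next_day_close_pred):
    """Percent move implied between the next-day and one-month predictions."""
    return (one_month_close_pred / next_day_close_pred - 1) * 100


examples = {
    "AAPL": (275.05, 268.55),
    "AMD": (315.00, 282.22),
    "SQQQ": (53.15, 56.66),
}
for ticker, (one_month, next_day) in examples.items():
    print(ticker, f"{implied_1m_return(one_month, next_day):+.2f}%")
```

The baseline is the next-day prediction rather than the last known close, so the signal measures the slope between two model outputs.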
10.5 2026-04-22 Run Summary (103 tickers)
| Category | Count | Representative |
|---|---|---|
| Bullish >5% (1-month) | 15 | AMD +11.6%, INTU +8.4%, ADBE +8.4% |
| Bullish +1–5% | 65 | MSFT +6.7%, NVDA +5.1%, AAPL +2.4% |
| Neutral ±1% | 13 | KLAC +0.2%, LIN -0.02%, BKNG -0.05% |
| Bearish -1 to -5% | 9 | ODFL -21%, DDOG -3.6%, KHC -1.8% |
| ETF signal | 2 | TQQQ +5.2%, SQQQ -6.2% |
11. Dashboard & LLM Interface
11.1 Streamlit Dashboard (dashboard/app.py)
Dual-mode design:
- DATA_SOURCE=db: full features, live PostgreSQL, LLM tab enabled
- DATA_SOURCE=blob: read-only Azure Blob JSON, LLM tab disabled
5 tabs:
- Ticker View — prediction history chart + accuracy metrics
- Today's Predictions — all 103 tickers ranked by implied 1-month return
- Accuracy Leaderboard — 30-day direction accuracy per ticker
- TQQQ Signal — Random Forest signal + equity curve
- LLM Query — Mistral-7B Q&A interface
11.2 LLM Context Assembly (llm/llm_query.py)
Three SQL queries assembled into a structured text prompt:
=== LDTM Context — 2026-04-22 ===
--- NVDA ---
Prediction date : 2026-04-22
Last known close : $193.47
Next-day pred : $198.00
Next-Monday pred : $199.67
1-month pred : $208.06
Implied 1m return: +5.1%
30d accuracy : direction=N/A avg|%err|=N/A n=0
Last retrain : 2026-04-21 val_loss=0.7451 epochs=44
Recent headlines :
[2026-04-20] NVIDIA unveils next-gen Blackwell Ultra GPUs
...
LLM parameters:
- Model: engine-fp8 (Mistral-7B FP8 via Triton)
- Temperature: 0.3 (low → factual responses)
- Max tokens: 1024
- Context window: 32K tokens (safe for 8-ticker queries ~3K tokens)
12. Container Architecture
| Container | Base Image | GPU | Purpose |
|---|---|---|---|
| model-ldtm | nvcr.io/nvidia/pytorch:25.01-py3 | Yes | Train + infer |
| trading-dashboard | python:3.11-slim | No | Streamlit UI |
| ldtm-llm-query | python:3.11-slim | No | LLM CLI |
| ldtm-snapshot-writer | model-ldtm image | No | DB snapshot upsert |
| ldtm-snapshot-fillback | model-ldtm image | No | Fill actuals |
| blob-export | trading-dashboard | No | Azure Blob export |
| trading-postgres | postgres:15 | No | Database |
All containers use --network host for zero-overhead DB access on the DGX host.
13. Cron Schedule
| Time | Command | Purpose |
|---|---|---|
| 6:00 PM Mon-Fri | run_ingestion.sh | Fetch daily OHLCV from IB |
| 6:15 PM Mon-Fri | run_ldtm_infer.sh | Infer all 103 → snapshots |
| 6:30 PM Mon-Fri | run_news_ingestion.sh | Fetch news headlines + bodies |
| 7:00 PM Mon-Fri | run_blob_export.sh | Export JSON to Azure Blob |
| 2:00 AM Saturday | run_ldtm_canary_retrain.sh | Retrain NVDA, TQQQ, AAPL |
| 1:00 AM 1st Sunday | run_ldtm_monthly_retrain.sh | Full retrain all 103 tickers |
Why 6:15 PM for inference (not 6:05 PM)? The ingestion job processes 103 tickers via 3 parallel IB connections at ~7-8s/ticker. At 103 tickers, parallel completion is ~4-5 minutes. 15 minutes gives a 10-minute safety buffer for slow market days or IB connection retries.
14. Known Limitations & Caveats
Absolute Price Accuracy
A small number of tickers show predictions far outside their actual price range (NFLX ~$93 predicted vs ~$1000 actual). This is a data coverage issue, not a model bug. When market_data_daily has insufficient history for a ticker (< 30 training samples, or history begins after a stock split that wasn't detected), the per-window normalization produces reasonable relative predictions but in the wrong absolute range.
Mitigation: Query SELECT ticker, COUNT(*), MIN(date), MAX(date) FROM market_data_daily GROUP BY ticker to verify history depth. Any ticker with < 250 rows (1 year) should be considered unreliable for absolute price prediction.
Val Loss Does Not Indicate Accuracy
A low val_loss (e.g. GEHC = 0.308) does not necessarily mean the model makes accurate absolute predictions. Tickers with very limited history (GEHC began trading in 2023 after its spin-off from GE) converge quickly on training data but have seen almost no macro regime changes. Val_loss measures fit on the held-out 15% of historical data, not forward predictive accuracy.
No Exogenous Data
The model uses only price and volume data. It does not incorporate:
- Earnings announcements
- Fed meeting dates
- Macro indicators (CPI, jobs reports)
- Analyst estimate revisions
This is a design choice for simplicity. Adding exogenous features would require a fundamentally different architecture (multi-input LSTM or temporal fusion transformer).
Short-Horizon Prediction Challenge
Even with engineered features, predicting next-day close is an exceptionally hard task. The Efficient Market Hypothesis suggests that publicly available price information is already reflected in today's price. LDTM's value lies not in its absolute price predictions but in:
- The relative momentum signal (implied 1-month return direction)
- The TQQQ/SQQQ pair as a market direction indicator
- Sector clustering of bullish/bearish signals