Data Infrastructure

Tardis, ClickHouse, NATS, LunarCrush — the institutional data stack.

Data sources

Smartbull ingests data from multiple institutional-grade sources:

  • Tardis.dev — historical and real-time L2 order book snapshots, trades, liquidations
  • LunarCrush — social intelligence (Galaxy Score, AltRank, social volume, engagement)
  • CoinGecko — prices, market caps, volumes for 500+ assets
  • CoinGlass — funding rates, open interest, liquidation maps
  • DefiLlama — DeFi TVL, yields, protocol metrics
  • Arkham Intelligence — entity-level on-chain tracking
  • CryptoQuant — on-chain metrics, exchange flows, MVRV
  • Santiment — social and development activity metrics
  • Kaiko — institutional-grade market data (planned)

All data flows through a unified pipeline: Source → NATS (streaming) → ClickHouse (persistence) → Feature Store (serving).

ClickHouse analytics

ClickHouse stores all time-series data with sub-second query performance:

  • ohlcv_1m — 1-minute candles for all tracked symbols
  • l2_snapshots — L2 order book snapshots (top 20 levels) every 5 seconds
  • trades_raw — individual trades for microstructure analysis
  • funding_snapshots — hourly funding rates across all venues
  • bot_orders — complete audit trail of every order placed
  • meta_learner_snapshots — daily snapshots of allocation weights and performance
  • server_errors — structured error logs for anomaly detection
  • decision_traces — full context vectors for every AI decision

Data retention: 90 days hot (SSD), 2 years cold (S3/R2). Total ingestion: ~50 GB/day.
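As a rough sizing aid, the retention figures above can be turned into an upper-bound storage estimate. The sketch below assumes the ~50 GB/day ingestion figure is also the stored size (ClickHouse compression would normally reduce it) and sizes the cold tier for the full 2-year window, since the text does not say whether that window includes the first 90 days. All names are illustrative.

```python
# Upper-bound storage estimate from the stated retention policy.
# Assumption: stored size equals ingested size (no compression applied).
INGEST_GB_PER_DAY = 50      # "~50 GB/day" from the text
HOT_DAYS = 90               # hot tier (SSD)
COLD_DAYS = 2 * 365         # cold tier (S3/R2), full 2-year window


def tier_size_tb(days: int, gb_per_day: float = INGEST_GB_PER_DAY) -> float:
    """Return the data volume accumulated over `days`, in decimal terabytes."""
    return days * gb_per_day / 1000


hot_tb = tier_size_tb(HOT_DAYS)
cold_tb = tier_size_tb(COLD_DAYS)
print(f"hot: {hot_tb} TB, cold: {cold_tb} TB")  # hot: 4.5 TB, cold: 36.5 TB
```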

NATS streaming

NATS provides the real-time messaging layer:

  • Tardis L2 snapshots are published to NATS subjects per symbol
  • The bot-engine subscribes to relevant subjects for microstructure signals
  • ClickHouse consumers batch-insert from NATS (1000 rows or 5 seconds, whichever comes first)
  • Latency (source → NATS → consumer): < 50ms p99
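The "1000 rows or 5 seconds, whichever comes first" rule can be illustrated with a small buffer that flushes on either condition. This is a minimal sketch, not the platform's consumer: `BatchBuffer`, its `tick` method, and the `flush` callback (which stands in for a ClickHouse insert) are hypothetical names, and the 5-second window is assumed to start when the first row of a batch arrives.

```python
import time
from typing import Any, Callable, List, Optional


class BatchBuffer:
    """Buffer rows and flush when the batch is full or old enough."""

    def __init__(
        self,
        flush: Callable[[List[Any]], None],    # stand-in for a ClickHouse insert
        max_rows: int = 1000,
        max_age_s: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush = flush
        self._max_rows = max_rows
        self._max_age_s = max_age_s
        self._clock = clock
        self._rows: List[Any] = []
        self._first_at: Optional[float] = None  # arrival time of oldest buffered row

    def add(self, row: Any) -> None:
        """Buffer one row; flush immediately if the size limit is reached."""
        if not self._rows:
            self._first_at = self._clock()
        self._rows.append(row)
        if len(self._rows) >= self._max_rows:
            self.flush()

    def tick(self) -> None:
        """Call periodically; flushes if the oldest row has waited too long."""
        if self._rows and self._clock() - self._first_at >= self._max_age_s:
            self.flush()

    def flush(self) -> None:
        """Hand the current batch to the callback and reset the buffer."""
        if not self._rows:
            return
        batch, self._rows = self._rows, []
        self._first_at = None
        self._flush(batch)
```

In a real consumer, `add` would be called from the NATS message handler and `tick` from a timer, so a quiet symbol still gets written within the time limit.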

NATS also handles internal pub/sub for kill-switch propagation, circuit-breaker events, and Meta-Learner weight updates.
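Per-symbol market data and internal control events can share one NATS deployment because subscribers select traffic by subject. NATS subjects are dot-separated tokens, where `*` matches exactly one token and `>` matches one or more trailing tokens. The sketch below reimplements that matching rule in plain Python for illustration only; the subject names used in the examples (`md.l2.BTCUSDT`, `control.killswitch`) are assumptions, not the platform's actual naming scheme.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if `subject` matches a NATS-style subscription `pattern`."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' is only valid as the last token and needs at least one token to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False          # pattern is longer than the subject
        if p != "*" and p != s_tokens[i]:
            return False          # literal token mismatch
    return len(p_tokens) == len(s_tokens)


# Hypothetical subjects: one per symbol for L2 data, one for the kill switch.
print(subject_matches("md.l2.*", "md.l2.BTCUSDT"))             # True
print(subject_matches("md.l2.*", "control.killswitch"))        # False
print(subject_matches("control.>", "control.breaker.binance"))  # True
```

With a layout like this, a strategy process could subscribe to a single symbol's subject for signals while every process subscribes to the control subjects, so a kill-switch message reaches all of them.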

Observability & monitoring

The platform runs a full observability stack:

  • SLO alert engine — configurable rules that query ClickHouse and fire on threshold breach
  • PagerDuty integration — critical alerts page on-call engineers
  • Telegram admin alerts — real-time notifications for all severity levels
  • LLM Anomaly Detector — hourly scan of error patterns using AI to flag unusual activity
  • Grafana dashboards — OpenTelemetry traces, metrics, and logs
  • DR drill — automated weekly disaster recovery test (RTO < 15min, RPO < 5min)

Every SLO has a linked runbook for rapid incident response.
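A threshold rule of the kind the SLO alert engine evaluates might look like the following. This is a sketch under assumptions, not the engine's actual schema: `SloRule`, `Alert`, `evaluate`, the `pipeline_latency` table, and the runbook path are all hypothetical, and `run_query` is a caller-supplied function that executes the SQL against ClickHouse and returns a single number.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Optional

# Comparisons a rule may use to decide that a threshold is breached.
_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass(frozen=True)
class SloRule:
    name: str
    query: str        # SQL returning one numeric value (hypothetical table below)
    op: str           # one of the keys in _OPS
    threshold: float
    severity: str     # e.g. "critical" pages on-call, lower levels only notify
    runbook: str      # link to the runbook for this SLO


@dataclass(frozen=True)
class Alert:
    rule: str
    value: float
    severity: str
    runbook: str


def evaluate(rule: SloRule, run_query: Callable[[str], float]) -> Optional[Alert]:
    """Run the rule's query and return an Alert if the threshold is breached."""
    value = run_query(rule.query)
    if _OPS[rule.op](value, rule.threshold):
        return Alert(rule.name, value, rule.severity, rule.runbook)
    return None


# Example rule based on the "< 50ms p99" latency target stated earlier.
latency_rule = SloRule(
    name="nats_pipeline_p99_latency",
    query=(
        "SELECT quantile(0.99)(latency_ms) FROM pipeline_latency "
        "WHERE ts > now() - INTERVAL 5 MINUTE"
    ),
    op=">=",
    threshold=50.0,
    severity="critical",
    runbook="runbooks/nats-pipeline-latency",  # hypothetical path
)
```

Carrying the runbook reference on the rule itself means every alert that fires can include it, whichever channel delivers the alert.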