
RocksDB CDC Log

Yoda's default CDC sink is the SQLite _yoda_cdc_log table, written by triggers on every INSERT / UPDATE / DELETE. The rocksdb-cdc feature replaces (or augments) that sink with a durable, high-throughput RocksDB-backed log. It delivers 5–7× higher write throughput because RocksDB's LSM-tree write path has lower per-write overhead than SQLite's trigger execution and WAL commit.

Feature flag

Enable the rocksdb-cdc feature on the yoda dependency in your Cargo.toml:

toml
[dependencies]
yoda = { version = "1", features = ["rocksdb-cdc"] }

Or, when building from a checkout of the workspace:

sh
cargo build --features rocksdb-cdc

No external RocksDB installation is required — the rust-rocksdb crate bundles the C++ library.

When to use it

Scenario                                      Recommended CDC backend
Development / low-volume OLTP                 SQLite triggers (default)
High-throughput OLTP with many small writes   RocksDbCdcLog (this crate)
Sidecar mode (external source)                N/A — sidecar polls the external DB directly

Operating modes

Standalone mode

The application appends CDC events directly to RocksDbCdcLog, bypassing SQLite triggers entirely. This is the simplest path: no triggers are installed, and RocksDbCdcLog is the sole source of truth for the sync engine.

rust
use yoda_cdc_rocksdb::{RocksDbCdcLog, RocksDbCdcConfig};

let config = RocksDbCdcConfig {
    path: "/var/lib/yoda/cdc_rocksdb".to_string(),
    create_if_missing: true,
};
let log = RocksDbCdcLog::open(&config)?;
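
From there the application appends events to the log. The sketch below is illustrative only: the append method name and the JSON event shape are assumptions, not the crate's documented API (see log.rs for the real signatures):

rust
// Illustrative only: `append` and the event shape are assumed here, not
// the crate's published API.
let event = serde_json::json!({
    "table": "users",
    "op":    "INSERT",
    "rowid": 42,
});

// Events are serialised as JSON and keyed by a monotonically increasing
// sequence number.
let seq = log.append(&event)?;
println!("appended CDC event at sequence {seq}");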

Bridge mode (RocksDbBridge)

In bridge mode, SQLite triggers continue to fire into _yoda_cdc_log (the normal HTAP flow). A bridge consumer periodically polls SQLite for new events and moves them into RocksDB in a single atomic write. The sync engine then reads from RocksDB instead of SQLite.

This gives you crash durability: even if the process dies mid-cycle, the bridge watermark and the events it covers are committed atomically. On restart, the bridge resumes from get_bridge_watermark() and skips already-bridged rows.

text
SQLite triggers → _yoda_cdc_log
                        ↓  (bridge consumer, atomic WriteBatch)
                  RocksDbCdcLog
                        ↓  (CdcSyncEngine reads from here)
                  OlapBackend (DataFusion / DuckDB)

The atomic guarantee: append_and_set_bridge_watermark writes both the CDC events and the SQLite watermark in the same WriteBatch. Either both are committed, or neither is — no partial state survives a crash.
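A minimal sketch of one bridge cycle, assuming the watermark accessor hangs off the log handle and using a hypothetical poll_sqlite_cdc helper that reads rows above the watermark from _yoda_cdc_log:

rust
// One bridge cycle (sketch). `poll_sqlite_cdc` is hypothetical; the real
// consumer lives in the bridge implementation.
let watermark = log.get_bridge_watermark()?; // last SQLite seq already bridged

// Hypothetical helper: returns (sqlite_seq, event) pairs with seq > watermark.
let pending: Vec<(i64, serde_json::Value)> = poll_sqlite_cdc(&sqlite_conn, watermark)?;

if let Some(&(max_seq, _)) = pending.last() {
    // Events and the advanced watermark commit in one WriteBatch: after a
    // crash, either both are visible or neither is.
    log.append_and_set_bridge_watermark(&pending, max_seq)?;
}
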

Configuration

Set rocksdb_cdc_path in the [engine] section of your TOML config (see Configuration):

toml
[engine]
oltp_path        = "app.db"
olap_backend     = "datafusion"
sync_interval_ms = 500

# Enable RocksDB CDC log (requires --features rocksdb-cdc)
rocksdb_cdc_path = "/var/lib/yoda/cdc_rocksdb"

Feature gate required

rocksdb_cdc_path is silently ignored unless built with --features rocksdb-cdc.

Storage layout

All data lives in RocksDB's default column family. Two key namespaces coexist:

Key range                                       Content
8-byte big-endian i64 (sequence 1 … i64::MAX)   CDC event payloads serialised as JSON
\xff__meta__/*                                  Metadata: latest sequence number, bridge watermark

Why big-endian? RocksDB's default comparator is byte-wise lexicographic. Big-endian encoding preserves numerical order, so a forward iterator yields events in sequence order without a custom comparator.

Why 0xFF prefix for metadata? Valid sequence keys are derived from non-negative i64 values and never start with 0xFF, so metadata keys sort naturally after all event keys. A forward iterator stops before metadata automatically.
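
Both properties are cheap to verify. The sketch below uses illustrative stand-ins for the real constants in keys.rs:

rust
// Event keys: 8-byte big-endian encoding of the i64 sequence number.
fn event_key(seq: i64) -> [u8; 8] {
    seq.to_be_bytes()
}

// Metadata keys: 0xFF prefix (illustrative constant; see keys.rs).
const META_LATEST_SEQ: &[u8] = b"\xff__meta__/latest_seq";

// Big-endian preserves numerical order under byte-wise comparison...
assert!(event_key(255) < event_key(256));
// ...where little-endian would not:
assert!(255i64.to_le_bytes() > 256i64.to_le_bytes());

// Non-negative i64s start with a byte <= 0x7F, so every event key sorts
// strictly before any 0xFF-prefixed metadata key.
assert!(event_key(i64::MAX).as_slice() < META_LATEST_SEQ);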

Key constants

  • \xff__meta__/latest_seq — latest committed RocksDB sequence number (restored on open)
  • \xff__meta__/bridge_watermark — last SQLite CDC sequence number durably bridged into RocksDB

Crash recovery

On RocksDbCdcLog::open, the persisted META_LATEST_SEQ key is read to restore the in-memory AtomicI64 counter. Appends resume from the correct position without scanning all events.
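
A sketch of that restore step, assuming the counter is persisted as 8 big-endian bytes under the metadata key (the real logic lives in log.rs):

rust
use std::sync::atomic::AtomicI64;

// Sketch: read the persisted counter on open. `db` is a rocksdb::DB
// handle; error plumbing is simplified for brevity.
let latest = match db.get(META_LATEST_SEQ)? {
    Some(bytes) => i64::from_be_bytes(bytes.as_slice().try_into()?),
    None => 0, // fresh log: nothing has been appended yet
};

// Appends resume from here without iterating over the event keyspace.
let next_seq = AtomicI64::new(latest);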

Thread safety

RocksDbCdcLog is both Send and Sync. The sequence counter is an AtomicI64 (lock-free increments); the underlying DB handle is Arc-shared so the log can be cheaply cloned across Tokio tasks.

RocksDB I/O is synchronous — CdcConsumer::poll and CdcConsumer::prune dispatch to tokio::task::spawn_blocking to avoid stalling the async runtime.
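
The pattern is the standard spawn_blocking hop; read_events_from below is a hypothetical synchronous method standing in for the real poll body:

rust
// Hop to the blocking pool for synchronous RocksDB I/O. `read_events_from`
// is hypothetical; CdcConsumer::poll wraps its real equivalent like this.
let log = log.clone(); // cheap: shares the Arc'd DB handle
let events = tokio::task::spawn_blocking(move || {
    log.read_events_from(last_seen_seq) // blocking RocksDB iteration
})
.await??; // first ? for the JoinError, second for the RocksDB error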

Source

crates/yoda-cdc-rocksdb/lib.rs (crate overview), log.rs (RocksDbCdcLog implementation), keys.rs (key encoding and metadata constants), config.rs (RocksDbCdcConfig).

Released under the Apache-2.0 License.