Skip to content

[SLOP(claude-opus-4-8-high)] perf(universaldb): batch leader apply and fold follower commit round-trips#5342

Open
MasterPtato wants to merge 1 commit into
stack/slopfix-docs-self-hosting-promote-postgres-from-experimental-to-recommended-oss-multi-node-backend-zsmysqukfrom
stack/slop-claude-opus-4-8-high-perf-universaldb-batch-leader-apply-and-fold-follower-commit-round-trips-vtrtqpyp
Open

[SLOP(claude-opus-4-8-high)] perf(universaldb): batch leader apply and fold follower commit round-trips#5342
MasterPtato wants to merge 1 commit into
stack/slopfix-docs-self-hosting-promote-postgres-from-experimental-to-recommended-oss-multi-node-backend-zsmysqukfrom
stack/slop-claude-opus-4-8-high-perf-universaldb-batch-leader-apply-and-fold-follower-commit-round-trips-vtrtqpyp

Conversation

@MasterPtato

Copy link
Copy Markdown
Contributor

No description provided.

@MasterPtato

Copy link
Copy Markdown
Contributor Author

Stack for rivet-dev/rivet

Current stack:

Dependencies:

Get stack: forklift get 5342
Push local edits: forklift submit
Merge when ready: forklift merge 5342

@claude

claude Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Review

This PR bundles several well-designed performance improvements to the UniversalDB Postgres driver: combining the follower's INSERT + pg_notify into one autocommit round-trip, resolving commits directly from the NOTIFY payload instead of always polling the DB, splitting lease renewal into its own task so it can't be starved by a long drain, folding all winners' operations in memory and writing them in O(1) bulk SQL statements per batch, and fixing the recovery floor bug (max(durable, seq_high)durable_version only). The architecture is substantially cleaner after this change.


1. Debug log level baked into compose templates (medium)

Files: self-host/compose/template/src/docker-compose.ts:295, self-host/dev-multinode/docker-compose.yml:182, 222, 262

RUST_LOG=universaldb::driver::postgres=debug is added to the shipped templates for all three node services. Under commit load, tracing::debug! fires on every await_result resolution (committed / conflict / backstop) and every batch drain — potentially hundreds of times per second. This creates significant log I/O and CPU overhead for production deployments and appears to be a temporary debugging artifact. Either remove it or gate it on a separate env var before merging.


2. .get(1) column index is fragile after combined INSERT+NOTIFY query (low)

File: engine/packages/universaldb/src/driver/postgres/commit.rs

The new query:

SELECT pg_notify($5, id::text), id FROM ins

returns (void, id), and the code calls .get(1) to extract id. This works today because tokio-postgres lazily decodes columns, but the index is non-obvious and silently breaks if the SELECT order is ever changed.

A safer form that puts id at column 0 — so .get(0) is unambiguous — is to move pg_notify into a side-effect CTE:

WITH ins AS (
    INSERT INTO udb_commit_requests (...) VALUES (...) RETURNING id
),
_notify AS (SELECT pg_notify($5, id::text) FROM ins)
SELECT id FROM ins, _notify

or simply reorder to SELECT id, pg_notify($5, id::text) FROM ins and change the call to .get(0).


3. atomic_read_keys emits duplicate keys without deduplication (low, efficiency)

File: engine/packages/universaldb/src/driver/postgres/resolver/apply.rs:atomic_read_keys

If multiple winners in a batch have read-modify-write atomic ops on the same key (e.g., two sequential Add ops on a shared counter — a common hot-key pattern), that key is pushed into keys once per winner. WHERE key = ANY($1::bytea[]) handles duplicates correctly, but sends redundant bytes over the wire and may do extra Postgres scan work for large keys vectors.

A simple fix before returning:

keys.sort_unstable();
keys.dedup();

Or collect into a HashSet and convert back. Under a hot-key batch this matters; under typical workloads it's minor.


Other observations (no action needed)

  • The recovery_floor fix (using only durable_version rather than max(durable, seq_high)) is well-reasoned. The long comment explaining why seq_high is wrong (loser/cold-reject versions race far ahead, causing mass-reject storms) is exactly the kind of non-obvious invariant CLAUDE.md asks you to document.
  • The Overlay fold correctly handles all edge cases: set-after-clear-range re-inserts a point write, point writes win over range tombstones on read-through, and the final SQL execution order (range deletes → point deletes → upserts) matches the fold semantics.
  • The sibling-task architecture (AbortOnDropHandle for both renew_loop and drain_loop) correctly ensures a hard abort of lead() tears down both tasks, and a crashed drain rolls back cleanly via the pool connection drop.
  • The parse_reply function safely returns None for payloads from other waiters on the shared reply channel, with no ID collision risk across i64 values.

Generated with Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant