You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This PR adds a Postgres leader failover integration test for universaldb, fixes the resolver task to be properly aborted when the driver drops (preventing zombie lease renewal), refactors the universalpubsub Postgres driver from per-subscriber heartbeats to a single per-node heartbeat, and introduces a no-responders fast-fail path for request/reply patterns. The core correctness fix (aborting the resolver's JoinHandle in Drop) is well-targeted.
Findings
1. engine/packages/perf/src/lib.rs — PerfMeasure drop level change breaks a CI assertion
The PR demotes the dropped-without-finish log from tracing::warn! to tracing::debug!. The existing test in engine/packages/perf/tests/perf_measure.rs at line 201 asserts Level::WARN for exactly this event:
assert_eq!(Level::WARN, events[0].level);
This assertion will now fail. Beyond the broken test, demoting this to debug silently hides violations of the CLAUDE.md invariant that every PerfMeasure must end with perf_finish! or perf_abandon!, making it easy for future callers to accidentally discard measurements undetected.
2. engine/packages/universalpubsub/src/driver/postgres/mod.rs — Extra DB roundtrip on every request-reply publish
has_responders() is called unconditionally on every publish(subject, payload, Some(reply_subject)). This adds a two-subquery Postgres round-trip to every request-reply publish, doubling the DB load on what is the most latency-sensitive publish path (actor action dispatch, pegboard coordination, etc.).
A targeted alternative: only probe has_responders on the first attempt and use a shorter timeout for retry, or cache the subscriber count per subject with a short TTL.
3. engine/packages/universalpubsub/src/driver/postgres/mod.rs — Queue-subscriber branch in has_responders filters by hash only, not full subject
The broadcast-subscriber branch guards correctly with WHERE s.subject_hash = $1 AND s.subject = $2, but the queue-subscriber branch uses only WHERE s.subject_hash = $1 because ups_queue_subs stores no raw subject string. A hash collision between two different subjects would cause has_responders to return true for a subject with no actual subscriber, causing the request to publish a message that sits unclaimed until GC rather than returning an immediate NoResponders. With a 64-bit hash this is low-probability but the asymmetry is worth closing by adding a subject column to ups_queue_subs.
4. engine/packages/universaldb/tests/failover.rs — Hardcoded 4-second sleep before first connection attempt
tokio::time::sleep(Duration::from_secs(4)).await is the only guard between docker_config.start() and the first connect_raw / make_db call. On a loaded CI machine, Postgres may not be accepting connections in 4 seconds; make_db will then fail and the result will be misattributed to a product regression. The scripts/run/engine-postgres.sh change in this same PR replaced nc -z with pg_isready for exactly this reason. The test should do the same: retry the connection in a loop instead of sleeping a fixed amount.
5. engine/packages/universalpubsub/src/driver/postgres/mod.rs — tokio::spawn in Drop impls may not run during shutdown
Both PostgresSubscriber::drop and PostgresQueueSubscriber::drop spawn a cleanup task to DELETE the subscriber row. If the Tokio runtime is shutting down when the subscriber is dropped, the spawned future may never be polled, leaving stale rows in ups_subs / ups_queue_subs. The node-level GC eventually removes them, but the window (up to NODE_TTL_SECS = 30 s) means messages can be queued for dead subscribers on normal process shutdown. This is best-effort cleanup that is probably acceptable given GC coverage, but worth a comment explaining the trade-off.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.