Skip to content

fix(app): prevent dashboard edits from clobbering each other#2574

Open
rachit367 wants to merge 3 commits into
hyperdxio:mainfrom
rachit367:claude/fix-dashboard-mutation-race
Open

fix(app): prevent dashboard edits from clobbering each other#2574
rachit367 wants to merge 3 commits into
hyperdxio:mainfrom
rachit367:claude/fix-dashboard-mutation-race

Conversation

@rachit367

Copy link
Copy Markdown
Contributor

Fixes #2216

Why

Two setDashboard(...) calls in quick succession on a remote dashboard (e.g. toggle bordered on group A, then add a tab to group B) can leave the backend with only the second mutation's effect. The first change appears to save in the UI but is lost on reload — silent data loss.

Root cause

setDashboard calls updateDashboard.mutate(newDashboard) with no optimistic cache update, so the component keeps reading dashboard from the ['dashboards'] React Query cache until invalidateQueries -> refetch completes. Two consecutive setDashboard(produce(dashboard, ...)) calls both produce from the same pre-mutation snapshot, so the second PATCH body derives from state that doesn't include the first mutation's change and overwrites it.

Fix

Add optimistic updating to useUpdateDashboard (the maintainer-suggested Option 1, the smallest change and the standard TanStack Query pattern):

  • onMutate: cancel in-flight ['dashboards'] queries, snapshot the current cache, and merge the pending change into the cached dashboard so the next produce reads up-to-date state.
  • onError: roll the cache back to the snapshot.
  • onSettled: invalidate ['dashboards'] to reconcile with the server (replaces the previous onSuccess invalidate so it also runs after a rollback).

Tests

New dashboard.optimistic.test.ts:

  • onMutate writes the pending change into the cache (and leaves sibling dashboards untouched);
  • onError restores the previous snapshot;
  • onSettled invalidates.

All 3 pass; the existing 42 dashboard tests still pass; tsc --noEmit and eslint are clean on the changes. Changeset included.

useUpdateDashboard had no optimistic cache update, so the dashboard read
from the ['dashboards'] query cache stayed stale until refetch. Two
consecutive setDashboard(produce(dashboard, ...)) calls both derived from
the same pre-mutation snapshot, so the second PATCH overwrote the first
and the earlier change was lost on reload.

Optimistically update the dashboards cache in onMutate (rollback in
onError, invalidate in onSettled) so a following produce reads the
latest state.

Fixes hyperdxio#2216
@vercel

vercel Bot commented Jul 2, 2026

Copy link
Copy Markdown

@rachit367 is attempting to deploy a commit to the HyperDX Team on Vercel.

A member of the Team first needs to authorize it.

@changeset-bot

changeset-bot Bot commented Jul 2, 2026

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: 3bc9c36

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
Name Type
@hyperdx/app Patch
@hyperdx/api Patch
@hyperdx/otel-collector Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@greptile-apps

greptile-apps Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes a silent data-loss race condition in remote dashboard saves by adding standard TanStack Query optimistic updates to useUpdateDashboard. When two edits fired in quick succession, both produce calls read the same stale snapshot and the second PATCH overwrote the first.

  • onMutate cancels in-flight ['dashboards'] queries, snapshots the current cache, and merges the pending partial update so any immediately following setDashboard call reads the already-updated state rather than the pre-mutation snapshot.
  • onError restores the snapshot, and onSettled (replacing the old onSuccess) invalidates the cache to reconcile with the server whether or not the mutation succeeded.
  • A new test file directly proves two sequential onMutate calls compose correctly (both changes survive) and covers rollback and the edge case where the cache is empty.

Confidence Score: 5/5

Safe to merge — the change is a well-scoped addition of standard optimistic-update hooks with no modifications to the mutation logic, API contract, or local-mode path.

The fix is minimal and mechanical: three lifecycle callbacks added to one useMutation call, each doing exactly what the TanStack Query documentation prescribes. The snapshot is correctly captured before the optimistic write and restored on error; onSettled ensures the cache is reconciled with the server regardless of outcome.

No files require special attention.

Important Files Changed

Filename Overview
packages/app/src/dashboard.ts Adds standard TanStack Query optimistic update hooks (onMutate / onError / onSettled) to useUpdateDashboard; implementation is correct and follows the documented pattern.
packages/app/src/tests/dashboard.optimistic.test.ts New test file with an in-memory store mock that lets two sequential onMutate calls genuinely share state; covers composition, rollback, empty-cache edge case, and onSettled invalidation.
.changeset/fix-dashboard-mutation-race.md Patch-level changeset entry accurately describing the fix; no issues.

Reviews (3): Last reviewed commit: "refactor(app): type the onError rollback..." | Re-trigger Greptile

Comment thread packages/app/src/__tests__/dashboard.optimistic.test.ts Outdated
@github-actions

github-actions Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Deep Review

No critical issues found. The change correctly applies the standard TanStack Query optimistic-update pattern and genuinely fixes the primary #2216 failure (permanent silent loss on reload). The findings below are residual-race and test-coverage recommendations; the server-side partial $set merge means the remaining concurrency windows self-heal to server truth rather than losing data.

🟡 P2 — recommended

  • packages/app/src/dashboard.ts:165onSettled invalidates unconditionally per mutation, so a fast edit's refetch can land while a slower sibling edit is still in flight, transiently overwriting the sibling's optimistic write and re-opening a flicker window.
    • Fix: Gate the refetch on the last in-flight mutation, e.g. assign a shared mutationKey and only invalidateQueries when queryClient.isMutating({ mutationKey }) <= 1.
    • correctness, julik-frontend-races, adversarial, reliability
  • packages/app/src/dashboard.ts:157onError restores the whole ['dashboards'] snapshot captured at that mutation's onMutate; if an earlier edit fails while a later edit is optimistically applied, the rollback reverts to a pre-sibling snapshot and wipes the concurrent edit until the next refetch reconciles.
    • Fix: Restore only the failed dashboard entity, or skip the blind snapshot restore and let onSettled reconcile when other mutations are still in flight.
    • julik-frontend-races, adversarial, reliability, correctness
  • packages/app/src/dashboard.ts:157 — The if (context?.previousDashboards) no-op branch (empty-cache case where no snapshot was taken) has no test; only the rollback path is exercised.
    • Fix: Add a test that runs onMutate against an empty cache then fires onError, asserting the cache stays undefined and no rollback setQueryData occurs.
    • testing, kieran-typescript
🔵 P3 nitpicks (5)
  • packages/app/src/dashboard.ts:141 — The mutation variables type Partial<Dashboard> & { id: Dashboard['id'] } is spelled out twice (mutationFn and onMutate) and can silently drift.
    • Fix: Hoist to a named alias (e.g. UpdateDashboardInput) and reference it in both signatures.
    • kieran-typescript, maintainability
  • packages/app/src/dashboard.ts:165useUpdateDashboard now invalidates in onSettled while sibling useCreateDashboard/useDeleteDashboard still use onSuccess; the divergence is intentional but non-obvious and invites a future "consistency" revert that reintroduces the bug.
    • Fix: Add a one-line comment explaining onSettled must fire on error too, to resync after rollback.
  • packages/app/src/dashboard.ts:144 — The ['dashboards'] query-key literal is repeated many times across the module, and this change adds four more copies.
    • Fix: Extract a shared DASHBOARDS_QUERY_KEY constant used by every query and mutation.
  • packages/app/src/dashboard.ts:148 — In IS_LOCAL_MODE the optimistic onMutate write and onSettled invalidate are redundant with the synchronous localDashboards write and briefly cache raw, un-normalized tile colors before the refetch heals them.
    • Fix: Normalize in onMutate, or skip the optimistic path when IS_LOCAL_MODE since the underlying write is synchronous.
  • packages/app/src/__tests__/dashboard.optimistic.test.ts:86 — Asserting on setQueryData.mock.calls[0][0] couples the test to positional call ordering.
    • Fix: Assert via toHaveBeenCalledWith(['dashboards'], expect.any(Function)) and the shared in-memory store instead of the call index.

Pre-existing (not introduced by this diff; do not block merge):

  • packages/app/src/dashboard.ts:263isSetting is a single boolean that cannot represent N overlapping saves; the first mutation to settle clears it while others are still in flight. Optimistic updates make this more visible. Consider deriving from queryClient.isMutating(...) or a ref counter.
  • packages/app/src/api.ts:66hdxServer has no request timeout, so a hung PATCH leaves the optimistic state applied and isSetting stuck (onSettled never fires). An explicit timeout would surface onError rollback + toast instead.

Reviewers (10): correctness, testing, maintainability, project-standards, julik-frontend-races, kieran-typescript, reliability, adversarial, agent-native, learnings-researcher.

Testing gaps: No test interleaves two genuinely in-flight mutations with an onSettled refetch (the residual race). No test covers onError rollback while a sibling's optimistic write is still pending. No IS_LOCAL_MODE coverage for the optimistic path (config is hard-mocked to false).

Address review feedback on the hyperdxio#2216 fix:
- back the mock query client with an in-memory cache and add a test that
  drives two sequential onMutate calls, proving both edits survive
  instead of the second clobbering the first
- assert setQueryData targets the ['dashboards'] key
- cover the empty-cache path so onMutate no longer writes undefined
  (guarded in the implementation)
@rachit367

Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review. Addressed the test-coverage feedback (@greptile-apps, deep-review):

  • Proves the actual fix, not just wiring: backed the mock query client with an in-memory cache and added a test that drives two sequential onMutate calls for the same dashboard with different partial edits, asserting both survive — this is the Dashboard: setDashboard mutations clobber each other when fired back-to-back #2216 clobber scenario, and it fails against the pre-fix code.
  • Key assertion: the first test now asserts setQueryData targets the ['dashboards'] key, so a wrong key would be caught.
  • Empty-cache path: guarded onMutate to leave an empty/unresolved cache untouched instead of writing undefined, with a test covering it.

On the same-tick batching note (P2): that's a real but separate edge — the reachable reported scenario is two edits as distinct user actions (separate renders), which this resolves. Deriving each PATCH from getQueryData freshest state would harden the same-tick case further; happy to follow up if you'd like it in this PR or a separate one. Kept the full-snapshot rollback as the standard TanStack pattern for the single-mutation path.

5 tests pass, existing 42 dashboard tests still green, tsc + eslint clean.

@greptile-apps

greptile-apps Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

The changes look solid. The implementation is clean and correct, and the test suite now genuinely exercises the fix rather than just testing wiring.

Implementation (dashboard.ts): The onMutate guard (current ? ... : current) correctly handles the unresolved-cache case, and returning { previousDashboards } follows the standard TanStack pattern for rollback context.

Tests: The three-part coverage is exactly right:

  1. The #2216 composition test is the most valuable — it directly proves the bug is fixed and would fail against the old code.
  2. The empty-cache guard test covers a real edge case (user edits before initial fetch completes).
  3. The rollback test confirms context.previousDashboards round-trips correctly through onError.

One minor observation: the rollback in onError calls setQueryData without the type parameter (queryClient.setQueryData(['dashboards'], context.previousDashboards)), while onMutate uses the typed version (setQueryData<Dashboard[]>). Functionally fine since previousDashboards is already Dashboard[] | undefined, but adding the type param for consistency is a cosmetic improvement if you want it — not a blocker.

Otherwise this is ready to go. The fix is correct, minimal, and well-tested.

Match the typed setQueryData<Dashboard[]> used in onMutate, per review.
@rachit367

Copy link
Copy Markdown
Contributor Author

Thanks @greptile-apps! Added the <Dashboard[]> type param to the onError rollback setQueryData for consistency with onMutate.

@greptile-apps

greptile-apps Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Good change — adding <Dashboard[]> to the setQueryData call in onError is the right call. Without it, TypeScript infers the setter's type from the value being passed (context.previousDashboards), which is Dashboard[] | undefined from the onMutate return. Making it explicit ensures the compiler validates the shape being written back matches what's expected in the cache, keeping it symmetric with onMutate. No issues on my end.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Dashboard: setDashboard mutations clobber each other when fired back-to-back

1 participant