fix(execution): size the wasm runner V8 heap so warmup stops OOMing #129
Merged
The wasm runner isolate is started with JavascriptExecutionLimits::default(), so its V8 heap falls back to isolate::DEFAULT_HEAP_LIMIT_MB (128 MiB -- the per-GUEST isolate budget). But the runner is trusted infrastructure that must compile the WASI runtime + the guest's wasm module (e.g. bash.wasm) into its own heap before the guest runs, and that routinely exceeds 128 MiB. The near-heap-limit guard then terminates the isolate with an uncatchable, message-less exception, so warmup dies and surfaces as the opaque 'WebAssembly warmup exited with status 1 (Error: null)' (ERR_AGENTOS_NODE_SYNC_RPC). A clean release hits this too.

Size the runner heap explicitly (default 2048 MiB, operator-tunable via AGENTOS_WASM_RUNNER_HEAP_LIMIT_MB) instead of leaving it on the per-guest default. This does NOT weaken guest isolation: the guest module's memory/fuel/stack stay bounded separately, Rust-side, from request.limits. The value is a V8 heap ceiling (heap_limits(0, cap)), committed only as used.

Unit test wasm_runner_heap_limit_defaults_and_honors_operator_override covers the default (> 128), a positive override, and zero/non-numeric fallback.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
🚅 Deployed to the secure-exec-pr-129 environment in rivet-frontend
🚅 Deployed to the secure-exec-pr-129 environment in secure-exec
NathanFlurry
added a commit
that referenced
this pull request
Jun 26, 2026
…sponses, fix service-test build

Fixes surfaced while syncing agent-os against latest secure-exec main:

1. limits: classify DEFAULT_WASM_RUNNER_HEAP_LIMIT_MB (#129) and MAX_TIMER_DELAY_MS (#131); both were added without inventory entries, so limits_audit failed on main.
2. sidecar: accept_sidecar_response drops a stale sidecar_response with no matching pending request (UnmatchedResponse) or already completed (DuplicateResponse) instead of failing the whole sidecar: a per-VM callback can be answered by the host after that VM is disposed on the shared sidecar process. Real protocol violations stay fatal.
3. tests: re-export crate::EventSinkTransport into the source-included service test crate (#132 added the use in src/service.rs without the matching test re-export, breaking 'cargo test -p secure-exec-sidecar --test service' compilation).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
NathanFlurry
added a commit
that referenced
this pull request
Jun 26, 2026
…sponses, fix service-test build (#133)

Fixes surfaced while syncing agent-os against latest secure-exec main:

1. limits: classify DEFAULT_WASM_RUNNER_HEAP_LIMIT_MB (#129) and MAX_TIMER_DELAY_MS (#131); both were added without inventory entries, so limits_audit failed on main.
2. sidecar: accept_sidecar_response drops a stale sidecar_response with no matching pending request (UnmatchedResponse) or already completed (DuplicateResponse) instead of failing the whole sidecar: a per-VM callback can be answered by the host after that VM is disposed on the shared sidecar process. Real protocol violations stay fatal.
3. tests: re-export crate::EventSinkTransport into the source-included service test crate (#132 added the use in src/service.rs without the matching test re-export, breaking 'cargo test -p secure-exec-sidecar --test service' compilation).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem

WASM command warmup fails with the opaque ERR_AGENTOS_NODE_SYNC_RPC: WebAssembly warmup exited with status 1 (Error: null).

Root cause:

1. The wasm runner isolate is started with JavascriptExecutionLimits::default() (crates/execution/src/wasm.rs), so its V8 heap falls back to isolate::DEFAULT_HEAP_LIMIT_MB = 128 MiB, the per-guest-isolate budget (mirrors Cloudflare Workers).
2. The runner is trusted infrastructure that must compile the WASI runtime and the guest's wasm module (e.g. bash.wasm) into its own heap before the guest runs. That routinely exceeds 128 MiB.
3. The near_heap_limit_callback then terminates the isolate with an uncatchable, message-less exception, so warmup dies and surfaces as status 1 (Error: null). The sidecar log shows warn "bounded limit exhausted" limit=v8_heap_bytes observed=131072000 capacity=131072000 fill_percent=100. A clean release hits this too.

Fix

Size the runner heap explicitly (default 2048 MiB, operator-tunable via AGENTOS_WASM_RUNNER_HEAP_LIMIT_MB) instead of leaving it on the per-guest default.

This does not weaken guest isolation. The guest module's memory/fuel/stack stay bounded separately, Rust-side, from request.limits (AGENTOS_WASM_MAX_MEMORY_BYTES etc.). The value is a V8 heap ceiling (heap_limits(0, cap)), committed only as used.

Test

wasm_runner_heap_limit_defaults_and_honors_operator_override covers the bounded default (asserts > 128), a positive operator override, and zero/non-numeric fallback to default. (A live OOM repro needs a >128 MiB module + wasm artifacts, so it's not a deterministic unit test; the resolver + call-site change are covered.)
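
For readers without the diff open, the resolver behavior described under Fix and Test can be sketched roughly as follows. This is a minimal illustration, not the merged code: the function names resolve_wasm_runner_heap_limit_mb and heap_limit_bytes are hypothetical, and only the two constants, the environment variable name, and the default/override/fallback behavior come from the description above.

```rust
/// Per-guest isolate default named in the PR description (128 MiB).
const DEFAULT_HEAP_LIMIT_MB: usize = 128;
/// Runner default named in the PR description (2048 MiB).
const DEFAULT_WASM_RUNNER_HEAP_LIMIT_MB: usize = 2048;

/// Hypothetical resolver (name assumed for illustration). `raw` is the
/// value of AGENTOS_WASM_RUNNER_HEAP_LIMIT_MB, if the operator set it.
/// A positive integer is honored; unset, zero, or non-numeric input
/// falls back to the runner default.
fn resolve_wasm_runner_heap_limit_mb(raw: Option<&str>) -> usize {
    raw.and_then(|value| value.trim().parse::<usize>().ok())
        .filter(|&mb| mb > 0)
        .unwrap_or(DEFAULT_WASM_RUNNER_HEAP_LIMIT_MB)
}

/// Hypothetical helper: convert MiB to bytes for a V8 heap ceiling,
/// saturating instead of overflowing on absurdly large overrides.
fn heap_limit_bytes(mb: usize) -> usize {
    mb.saturating_mul(1024 * 1024)
}
```

A call site would presumably read the variable with std::env::var("AGENTOS_WASM_RUNNER_HEAP_LIMIT_MB").ok(), pass raw.as_deref() to the resolver, and hand the byte value to the isolate's heap-limit setup as the maximum. Taking the raw value as a parameter rather than reading the environment inside the resolver keeps the unit test deterministic.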