miner, core, consensus/bor, eth, triedb: pipelined state root computation (PoC) by pratikspatil024 · Pull Request #2180 · 0xPolygon/bor

pratikspatil024 · 2026-04-01T15:57:10Z

Description

Overlap state root computation (SRC) of block N with transaction execution of block N+1, for both block production and block import.
On the miner side, the 500ms buffer previously reserved after transaction execution for SRC is removed when the pipeline is active. Transactions now get the full block time for inclusion since SRC runs in the background. The chain DB write is also moved off the critical path by writing asynchronously after broadcast, with witnesses cached in memory so stateless peers can fetch them immediately.
On the import side, after executing block N, the node defers IntermediateRoot + CommitWithUpdate to a background SRC goroutine and immediately proceeds to block N+1 using a FlatDiff overlay for state reads. The pipeline state persists across insertChain calls, enabling overlap even for single-block imports at the chain tip. Witnesses generated by the import pipeline are served to stateless peers via the WIT protocol.

This is built on top of the delayed SRC PoC and takes the approach further: instead of just deferring SRC, it pipelines SRC with the next block's work.

How it works - Miner (block production)

After producing block N, the miner:

Extracts a FlatDiff (in-memory snapshot of state mutations) instead of computing the state root inline
Spawns an SRC goroutine that computes the root in the background
Opens a speculative state for block N+1 using the FlatDiff overlay
Fills transactions for N+1 in a goroutine (concurrently with SRC)
Collects the SRC result as soon as it's ready, seals and broadcasts block N
Writes block N to the chain DB asynchronously
Repeats in a continuous loop
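
The producer loop above can be sketched in a few lines of Go. This is only an illustration of the overlap, not the PR's code: `flatDiff`, `computeRoot`, and `produce` are hypothetical names, and the "root" is a plain string derived from the sorted mutations rather than a real trie hash.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// flatDiff is a hypothetical stand-in for the PR's FlatDiff: the state
// mutations produced by executing one block.
type flatDiff map[string]string

// computeRoot stands in for the expensive state root computation (SRC).
// It derives a deterministic string from the sorted mutations.
func computeRoot(d flatDiff) string {
	keys := make([]string, 0, len(d))
	for k := range d {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var sb strings.Builder
	for _, k := range keys {
		sb.WriteString(k + "=" + d[k] + ";")
	}
	return sb.String()
}

// produce overlaps SRC for block N with transaction filling for block N+1.
func produce(diffN flatDiff, nextTxs []string) (rootN string, filled []string) {
	srcDone := make(chan string, 1)
	go func() { srcDone <- computeRoot(diffN) }() // SRC(N) runs in the background

	// Fill block N+1 concurrently. A real miner would read state through
	// the FlatDiff overlay while selecting transactions.
	for _, tx := range nextTxs {
		filled = append(filled, tx)
	}

	rootN = <-srcDone // collect SRC(N); block N can now be sealed and broadcast
	return rootN, filled
}

func main() {
	root, txs := produce(flatDiff{"a": "1", "b": "2"}, []string{"tx1", "tx2"})
	fmt.Println(root, txs)
}
```

The buffered result channel lets the SRC goroutine finish even if the caller gives up early, which is one simple way to avoid leaking it.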

How it works - Import (block validation)

When importing block N:

Execute block N using FlatDiff overlay from block N-1 (if pipeline active)
Run ValidateStateCheap (gas, bloom, receipt root - no IntermediateRoot)
Extract FlatDiff via CommitSnapshot
Collect previous SRC(N-1) - verify root, write witness, handle trie GC
Write block metadata to DB immediately (sync protocol sees it)
Store FlatDiff for PostExecutionStateAt and RPC reads
Spawn SRC(N) in background - overlaps with block N+1's execution
Continue to next block without waiting for SRC(N)
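
A minimal sketch of the deferred root check on the import side, assuming a simplified model: `block`, `pendingSRC`, `spawnSRC`, and `importChain` are hypothetical names, roots are strings, and the final pending SRC is collected at the end of the call (the PR instead keeps it alive across insertChain calls).

```go
package main

import "fmt"

type block struct {
	number uint64
	root   string // state root claimed by the header
}

// pendingSRC is a handle for an in-flight background root computation.
type pendingSRC struct {
	blk    block
	result chan string
}

// spawnSRC starts computing the root of b in the background.
func spawnSRC(b block, compute func(block) string) *pendingSRC {
	p := &pendingSRC{blk: b, result: make(chan string, 1)}
	go func() { p.result <- compute(b) }()
	return p
}

// collect waits for the computed root and compares it with the header.
func (p *pendingSRC) collect() error {
	if got := <-p.result; got != p.blk.root {
		return fmt.Errorf("block %d: root mismatch: got %q, want %q", p.blk.number, got, p.blk.root)
	}
	return nil
}

// importChain verifies each root one block late, so SRC(N) overlaps with
// the execution of block N+1.
func importChain(blocks []block, compute func(block) string) error {
	var prev *pendingSRC
	for _, b := range blocks {
		// Execution of b and the cheap checks (gas, bloom, receipt root)
		// would happen here while the previous SRC may still be running.
		if prev != nil {
			if err := prev.collect(); err != nil {
				return err
			}
		}
		prev = spawnSRC(b, compute)
	}
	if prev != nil {
		return prev.collect()
	}
	return nil
}

func main() {
	compute := func(b block) string { return fmt.Sprintf("root-%d", b.number) }
	fmt.Println(importChain([]block{{1, "root-1"}, {2, "root-2"}}, compute))
	fmt.Println(importChain([]block{{1, "root-1"}, {2, "bogus"}}, compute))
}
```

The trade-off this makes visible is that a bad root for block N is only detected after block N+1 has already been executed on top of it.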

Config

--miner.pipelined-src - enable/disable (default: enabled)
--miner.pipelined-src-logs - verbose pipeline logging (default: enabled)
--pipeline.enable-import-src - enable/disable import pipeline (default: enabled)
--pipeline.import-src-logs - verbose import pipeline logging (default: enabled)

Key changes beyond the miner pipeline

Path DB reader fallback (triedb/pathdb/reader.go) - chained fallback (entry-point layer → base disk layer) when concurrent SRC commits trigger layer flattening. Fixes "layer stale" errors for RPC reads.
FlatDiff overlay for state reads (core/blockchain_reader.go) - StateAt and StateAtWithReaders serve state from FlatDiff when the block's SRC hasn't committed yet. Enables correct eth_call, eth_getCode, eth_estimateGas, and miner pending blocks during the pipeline window.
Trie-only reader for witness building (core/state/database.go) - SRC goroutine uses NewTrieOnly to force all reads through the MPT, ensuring complete witness capture. Flat readers bypass the trie and leave proof paths out of the witness.
Witness serving for pipelined imports (eth/handler_wit.go, eth/handler.go) - WIT handler waits for in-flight SRC before returning empty. WitnessReadyEvent announces witness availability to stateless peers. Early caching in SRC goroutine minimizes the availability gap.
System contract witness completeness (consensus/bor/bor.go) - PropagateReadsTo in checkAndCommitSpan ensures the validator contract's trie proof nodes are captured in the witness, even when read via a copied statedb.
FlatDiff.Destructs correctness (core/state/statedb.go) - getStateObject checks FlatDiff.Destructs before falling through to the trie reader, preventing self-destructed accounts from appearing to still exist.
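
The overlay lookup order described for FlatDiff can be illustrated with a small sketch. The types here are assumptions made for the example (`overlay`, `balance`, string addresses, a map standing in for the committed trie); only the idea of consulting the destruct set before falling through to committed state comes from the description.

```go
package main

import "fmt"

// overlay is a hypothetical, simplified FlatDiff: values written by the
// not-yet-committed block plus the accounts it self-destructed.
type overlay struct {
	accounts  map[string]uint64
	destructs map[string]struct{}
}

// balance resolves an account through the overlay first and only then
// through the committed state.
func balance(o *overlay, committed map[string]uint64, addr string) (uint64, bool) {
	if bal, ok := o.accounts[addr]; ok {
		return bal, true // written (or re-created) by the pending block
	}
	if _, gone := o.destructs[addr]; gone {
		return 0, false // destroyed by the pending block: do not fall through
	}
	bal, ok := committed[addr]
	return bal, ok
}

func main() {
	committed := map[string]uint64{"0xa": 10, "0xb": 20}
	o := &overlay{
		accounts:  map[string]uint64{"0xa": 7},
		destructs: map[string]struct{}{"0xb": {}},
	}
	fmt.Println(balance(o, committed, "0xa"))
	fmt.Println(balance(o, committed, "0xb"))
}
```

Without the destruct check, the second lookup would return the stale committed balance and the account would appear to still exist.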

…oved the post tx execution buffer time


claude · 2026-04-01T16:18:03Z

Code Review

Found 6 issues: 4 bugs and 2 security concerns.

Bugs

miner/worker.go:1117 — writeElapsed always measures ~0 (broken metric)
writeElapsed is computed immediately after writeStart, before either write call executes. The original code had the write call between writeStart and writeElapsed. The writeBlockAndSetHeadTimer metric will always report approximately zero. Fix: move writeElapsed := time.Since(writeStart) to after the if/else block.

bor/miner/worker.go

Lines 1116 to 1123 in 07345ad

    
    writeStart := time.Now()
    writeElapsed := time.Since(writeStart)
    if task.pipelined {
    	_, err = w.chain.WriteBlockAndSetHeadPipelined(block, receipts, logs, task.state, true, task.witnessBytes)
    } else {
    	_, err = w.chain.WriteBlockAndSetHead(block, receipts, logs, task.state, true)
    }
    writeBlockAndSetHeadTimer.Update(writeElapsed)

miner/pipeline.go:380-383 — nil pointer dereference when chainHead == nil
When chainHead is nil, the || short-circuits to true and enters the if-body, where chainHead.Number.Uint64() panics. This is in the block production path. Per security-common.md: No panics in consensus, sync, or block production paths. Fix: split the nil check from the number check into separate if-blocks.

bor/miner/pipeline.go

Lines 379 to 384 in 07345ad

    
    chainHead := w.chain.CurrentBlock()
    if chainHead == nil || chainHead.Number.Uint64() != blockNNum {
    	log.Error("Pipelined SRC: chain head mismatch after waiting", "expected", blockNNum,
    		"got", chainHead.Number.Uint64())
    	return
    }
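
A sketch of the suggested split, assuming a simplified `header` type with a plain `uint64` number (the real header uses `*big.Int`). `checkHead` is a hypothetical helper that returns an error instead of logging.

```go
package main

import (
	"errors"
	"fmt"
)

type header struct{ Number uint64 }

// checkHead handles the nil case on its own, so the mismatch branch can
// safely read head.Number when building its message.
func checkHead(head *header, want uint64) error {
	if head == nil {
		return errors.New("chain head is nil")
	}
	if head.Number != want {
		return fmt.Errorf("chain head mismatch: expected %d, got %d", want, head.Number)
	}
	return nil
}

func main() {
	fmt.Println(checkHead(nil, 10))
	fmt.Println(checkHead(&header{Number: 9}, 10))
	fmt.Println(checkHead(&header{Number: 10}, 10))
}
```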

core/stateless/witness.go:101 — NewWitness no longer copies the context header (mutation risk)
The old code did ctx := types.CopyHeader(context) and zeroed Root/ReceiptHash. The new code stores the caller pointer directly. In miner/worker.go:1196, the raw header pointer is passed — this header is later mutated in place. The Witness will silently see those mutations. See state-security.md threat model.

bor/core/stateless/witness.go

Lines 88 to 107 in 07345ad

    
    func NewWitness(context *types.Header, chain HeaderReader) (*Witness, error) {
    	// When building witnesses, retrieve the parent header, which will *always*
    	// be included to act as a trustless pre-root hash container
    	var headers []*types.Header
    	if chain != nil {
    		parent := chain.GetHeader(context.ParentHash, context.Number.Uint64()-1)
    		if parent == nil {
    			return nil, errors.New("failed to retrieve parent header")
    		}
    		headers = append(headers, parent)
    	}
    	// Create the witness with a reconstructed gutted out block
    	return &Witness{
    		context: context,
    		Headers: headers,
    		Codes:   make(map[string]struct{}),
    		State:   make(map[string]struct{}),
    		chain:   chain,
    	}, nil
    }
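
The aliasing risk is easy to reproduce in isolation. The sketch below uses hypothetical `header` and `witness` types with value-only fields, so a plain struct copy is enough; the real `types.Header` has pointer fields, which is why the old code went through `types.CopyHeader`.

```go
package main

import "fmt"

type header struct {
	Number      uint64
	Root        string
	ReceiptHash string
}

type witness struct{ context *header }

// newWitness stores a private copy of the header and clears the fields the
// witness must not depend on, mirroring what the old code is described as
// doing.
func newWitness(h *header) *witness {
	ctx := *h // value fields only here; real headers need a deep copy
	ctx.Root, ctx.ReceiptHash = "", ""
	return &witness{context: &ctx}
}

func main() {
	h := &header{Number: 5, Root: "pending"}
	w := newWitness(h)
	h.Number = 6 // later mutation by the caller
	fmt.Println(w.context.Number, h.Number)
}
```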

miner/pipeline.go:124 — SetLastFlatDiff stores a provisional header hash that never matches
env.header.Hash() lacks both Root and the seal signature. In PostExecutionStateAt, the comparison uses the sealed header — so FlatDiff overlay path is never taken. The txpool falls back to StateAt(header.Root) which may fail if SRC hasn't committed. Same issue at lines 521 and 783.

bor/miner/pipeline.go

Lines 123 to 125 in 07345ad

    
    w.chain.SetLastFlatDiff(flatDiff, env.header.Hash())
    // Note: this counts block N as "entering the pipeline." If Prepare() fails

Security Concerns

core/stateless/witness.go:56 — pre-state root validation anchored to untrusted witness data
The old ValidateWitnessPreState took a caller-supplied expectedPreStateRoot. The new version fetches the parent using witness.context.ParentHash (from the witness itself). For peer-received witnesses, no call site verifies witness.context.ParentHash == block.ParentHash(). A malicious peer could bypass the pre-state root check. Per state-security.md and security-common.md peer-triggerable escalation.

bor/core/stateless/witness.go

Lines 49 to 71 in 07345ad

    
    // Get the witness context header (the block this witness is for).
    contextHeader := witness.Header()
    if contextHeader == nil {
    	return fmt.Errorf("witness context header is nil")
    }
    // Get the parent block header from the chain.
    parentHeader := headerReader.GetHeader(contextHeader.ParentHash, contextHeader.Number.Uint64()-1)
    if parentHeader == nil {
    	return fmt.Errorf("parent block header not found: parentHash=%x, parentNumber=%d",
    		contextHeader.ParentHash, contextHeader.Number.Uint64()-1)
    }
    // Get witness pre-state root (from first header which should be parent).
    witnessPreStateRoot := witness.Root()
    // Compare with actual parent block's state root.
    if witnessPreStateRoot != parentHeader.Root {
    	return fmt.Errorf("witness pre-state root mismatch: witness=%x, parent=%x, blockNumber=%d",
    		witnessPreStateRoot, parentHeader.Root, contextHeader.Number.Uint64())
    }
    return nil
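
One way to close the gap the review describes is to bind the witness context to the block being validated before trusting anything derived from it. The sketch below is an assumption about what such a check could look like; `header` and `checkWitnessContext` are hypothetical and hashes are plain strings.

```go
package main

import (
	"errors"
	"fmt"
)

type header struct {
	ParentHash string
	Number     uint64
}

// checkWitnessContext rejects a witness whose context header does not
// describe the same parent and height as the block being validated.
func checkWitnessContext(witnessCtx, blockHeader *header) error {
	if witnessCtx == nil || blockHeader == nil {
		return errors.New("missing header")
	}
	if witnessCtx.Number != blockHeader.Number {
		return fmt.Errorf("witness number %d does not match block number %d",
			witnessCtx.Number, blockHeader.Number)
	}
	if witnessCtx.ParentHash != blockHeader.ParentHash {
		return fmt.Errorf("witness parent %s does not match block parent %s",
			witnessCtx.ParentHash, blockHeader.ParentHash)
	}
	return nil
}

func main() {
	blk := &header{ParentHash: "0xaa", Number: 7}
	fmt.Println(checkWitnessContext(&header{ParentHash: "0xaa", Number: 7}, blk))
	fmt.Println(checkWitnessContext(&header{ParentHash: "0xbb", Number: 7}, blk))
}
```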

core/blockchain.go:4402 — SpawnSRCGoroutine uses raw go func() without panic recovery
The old code used bc.wg.Go(func() { ... }) for lifecycle-safe goroutine management. The new code uses bc.wg.Add(1) + raw go func(). If the goroutine panics, the process crashes without graceful shutdown. Per security-common.md: No panics in block production paths.

bor/core/blockchain.go

Lines 4399 to 4410 in 07345ad

    
    pending.wg.Add(1)
    bc.wg.Add(1)
    go func() {
    	defer bc.wg.Done()
    	defer pending.wg.Done()
    	tmpDB, err := state.New(parentRoot, bc.statedb)
    	if err != nil {
    		log.Error("Pipelined SRC: failed to open tmpDB", "parentRoot", parentRoot, "err", err)
    		pending.err = err
    		return

claude · 2026-04-01T16:31:28Z

Code Review

Found 5 issues in miner/worker.go and miner/pipeline.go. Checked for bugs and CLAUDE.md compliance.

1. Bug: writeElapsed always ~0ns (miner/worker.go L1116-L1123)

writeElapsed := time.Since(writeStart) is computed immediately after writeStart := time.Now(), before either WriteBlockAndSetHeadPipelined or WriteBlockAndSetHead executes. writeBlockAndSetHeadTimer always records ~0, and workerMgaspsTimer (line 1148) reports inflated MGas/s. Fix: move writeElapsed := time.Since(writeStart) to after the if/else block.

2. Bug: nil pointer dereference (miner/pipeline.go L379-L384)

When chainHead is nil, the || short-circuits into the if-body, but chainHead.Number.Uint64() in log.Error dereferences nil and panics. Per CLAUDE.md: No panics in consensus, sync, or block production paths. Fix: split into two if-checks.

3. Bug: unchecked type assertion (miner/pipeline.go L335-L341)

borEngine, _ := w.engine.(*bor.Bor) discards the ok boolean. If w.engine is not *bor.Bor, borEngine is nil and borEngine.AssembleBlock(...) panics. The same assertion at line 96 correctly checks ok. Fix: check ok and return early.
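
A sketch of the checked assertion, using hypothetical `engine`, `borEngine`, and `otherEngine` types in place of the real consensus engine.

```go
package main

import "fmt"

type engine interface{ Name() string }

type borEngine struct{}

func (*borEngine) Name() string          { return "bor" }
func (*borEngine) AssembleBlock() string { return "assembled" }

type otherEngine struct{}

func (*otherEngine) Name() string { return "other" }

// assemble checks the ok result of the assertion and returns early instead
// of calling a method on a nil *borEngine.
func assemble(e engine) (string, error) {
	b, ok := e.(*borEngine)
	if !ok {
		return "", fmt.Errorf("unsupported engine type %T", e)
	}
	return b.AssembleBlock(), nil
}

func main() {
	fmt.Println(assemble(&borEngine{}))
	fmt.Println(assemble(&otherEngine{}))
}
```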

4. Bug: goroutine leak on 5 return paths (miner/pipeline.go L293-L345)

initialFillDone channel (line 293) goroutine is not drained on return paths at lines 345, 357, 371, 373, 383. Only WaitForSRC error (line 331) and happy path (line 390) drain it. Fix: defer drain after line 293.
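
The "defer drain" fix can be sketched as follows. `build`, `fillDone`, and the `running` counter are hypothetical; the counter exists only so the example can show that no fill goroutine outlives the call on any return path.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// running counts fill goroutines that have not exited yet.
var running atomic.Int32

func build(fail bool) error {
	fillDone := make(chan struct{})
	running.Add(1)
	go func() {
		defer close(fillDone) // runs last: signals completion
		defer running.Add(-1) // runs first: the goroutine is finished
		// ... fill transactions for the speculative block ...
	}()
	// Registered right after the spawn, so every return below waits for
	// the goroutine instead of abandoning it.
	defer func() { <-fillDone }()

	if fail {
		return errors.New("precondition failed")
	}
	return nil
}

func main() {
	fmt.Println(build(true), running.Load())
	fmt.Println(build(false), running.Load())
}
```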

5. Bug: trie DB race after SpawnSRCGoroutine (miner/pipeline.go L206-L229)

SpawnSRCGoroutine called at line 213 launches a goroutine doing CommitWithUpdate. If StateAtWithFlatDiff fails (line 219) or GetHeader returns nil (line 228), fallbackToSequential does IntermediateRoot inline on the same parent root concurrently. The comments at lines 206-211 identify this as causing missing trie node / layer stale errors but only guard the Prepare() case. Fix: WaitForSRC() before fallbackToSequential, or move spawn after preconditions.

…ss manager hash matching

codecov · 2026-04-02T07:37:41Z

Codecov Report

❌ Patch coverage is 43.05382% with 1365 lines in your changes missing coverage. Please review.
✅ Project coverage is 53.43%. Comparing base (0464d7b) to head (08b8c62).
⚠️ Report is 2 commits behind head on develop.

Files with missing lines	Patch %	Lines
miner/pipeline.go	5.42%	749 Missing ⚠️
miner/worker.go	60.83%	92 Missing and 20 partials ⚠️
core/state/statedb.go	81.04%	63 Missing and 6 partials ⚠️
core/state/warm_snapshot.go	45.08%	58 Missing and 9 partials ⚠️
consensus/bor/bor.go	40.00%	63 Missing ⚠️
core/state/trie_prefetcher.go	22.22%	51 Missing and 5 partials ⚠️
core/blockchain_reader.go	57.37%	47 Missing and 5 partials ⚠️
triedb/pathdb/reader.go	5.26%	36 Missing ⚠️
core/txpool/legacypool/legacypool.go	8.69%	21 Missing ⚠️
core/block_validator.go	20.00%	14 Missing and 6 partials ⚠️
... and 13 more

❌ Your patch check has failed because the patch coverage (43.05%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #2180      +/-   ##
===========================================
+ Coverage    53.39%   53.43%   +0.04%     
===========================================
  Files          896      899       +3     
  Lines       159745   162799    +3054     
===========================================
+ Hits         85294    86995    +1701     
- Misses       69125    70396    +1271     
- Partials      5326     5408      +82

Files with missing lines	Coverage Δ
core/blockchain.go	`65.02% <ø> (+1.94%)`	⬆️
core/blockchain_insert.go	`78.50% <100.00%> (+1.27%)`	⬆️
core/evm.go	`93.67% <100.00%> (+19.65%)`	⬆️
core/stateless/encoding.go	`63.49% <ø> (ø)`
core/stateless/witness.go	`44.64% <100.00%> (+6.02%)`	⬆️
core/txpool/blobpool/blobpool.go	`54.87% <100.00%> (ø)`
core/types/block.go	`42.65% <ø> (ø)`
eth/ethconfig/config.go	`78.94% <ø> (ø)`
eth/peer.go	`95.80% <100.00%> (ø)`
internal/cli/server/config.go	`64.09% <100.00%> (+0.23%)`	⬆️
... and 25 more

... and 28 files with indirect coverage changes



lucca30 · 2026-04-06T22:10:13Z

Additionally, the 500ms buffer previously reserved after transaction execution for SRC is removed when the pipeline is active. Transactions now get the full block time for inclusion since SRC runs in the background.

I am okay with the idea of removing the remaining 100ms.

We already reduced this buffer from 500ms to 100ms in v2.7.1, and from what we have seen so far, this remaining time looks small enough that removing it seems reasonable.

My main concern is not the removal of the 100ms itself. My concern is the cost of pipelining SRC with the next block production.

In other words: by doing SRC in parallel with block building, how much do we impact SRC time itself?

Do we expect SRC to remain roughly the same, or does it become meaningfully slower because it is now competing with the next block production? That is the part I would like to understand better.

I think this is basically a TPS vs finality question:

on one side, we gain more block time for transaction inclusion, which is good for TPS
on the other side, if SRC takes longer to complete, we may delay block completion, which could hurt finality

So I am supportive of the direction, but I think the key question is still:

How much TPS do we gain, and how much finality do we lose, if any, by making SRC fully pipelined with block production?

If the impact on SRC time is only slight, then the tradeoff is probably clearly worth it.

But if SRC time increases materially once it is pipelined with block production, then we should make that tradeoff explicit.

cffls · 2026-04-07T20:57:51Z

Do we expect SRC to remain roughly the same, or does it become meaningfully slower because it is now competing with the next block production? That is the part I would like to understand better.

I think SRC will be roughly the same, because the time-consuming part, trie node prefetching, already runs concurrently with tx execution today, and this PR doesn't change that behavior.

cffls · 2026-04-07T22:52:43Z

+// state resets for pipelined SRC. This avoids import cycles between txpool
+// and legacypool packages.
+type SpeculativeResetter interface {
+	ResetSpeculativeState(newHead *types.Header, statedb *state.StateDB)


In terms of naming, I would simply name it as SpeculativeSetter and SetSpeculativeState. The reset seems redundant.

Addressed here, thanks!

cffls · 2026-04-07T23:05:14Z

+// The state commit is handled separately by the SRC goroutine that already
+// called CommitWithUpdate. This avoids the "layer stale" error that occurs
+// when two CommitWithUpdate calls diverge from the same parent root.
+func (bc *BlockChain) WriteBlockAndSetHeadPipelined(block *types.Block, receipts []*types.Receipt, logs []*types.Log, statedb *state.StateDB, emitHeadEvent bool, witnessBytes []byte) (WriteStatus, error) {


There is some shared code between this and WriteBlockAndSetHead. Could we refactor and dedupe it?

Addressed here, thanks!

cffls · 2026-04-07T23:06:48Z

+// This is used by the txpool and RPC layer to get correct state when the chain
+// head was produced via the pipeline (where the committed trie root may lag
+// behind the actual post-execution state).
+func (bc *BlockChain) PostExecutionStateAt(header *types.Header) (*state.StateDB, error) {


nitpick PostExecutionStateAt -> PostExecState to make it simpler

Addressed here, thanks!

cffls · 2026-04-08T01:27:24Z

+// speculatively using the FlatDiff overlay, then waits for SRC(N) to complete,
+// assembles block N, and sends it for sealing. Then it finalizes N+1 and
+// seals it as well.
+func (w *worker) commitSpeculativeWork(req *speculativeWorkReq) {


This is a huge function with 500+ lines. Can we decompose it into smaller functions for maintainability?

Addressed during the diffguard refactoring, thanks!

cffls · 2026-04-08T01:50:52Z

+	var coinbase common.Address
+	if w.chainConfig.Bor != nil && w.chainConfig.Bor.IsRio(new(big.Int).SetUint64(nextBlockNumber)) {
+		coinbase = common.HexToAddress(w.chainConfig.Bor.CalculateCoinbase(nextBlockNumber))
+	}
+	if coinbase == (common.Address{}) {
+		coinbase = w.etherbase()
+	}
+
+	specHeader := &types.Header{
+		ParentHash: placeholder,
+		Number:     new(big.Int).SetUint64(nextBlockNumber),
+		GasLimit:   core.CalcGasLimit(blockNHeader.GasLimit, w.config.GasCeil),
+		Time:       blockNHeader.Time + w.chainConfig.Bor.CalculatePeriod(nextBlockNumber),
+		Coinbase:   coinbase,
+	}
+	if w.chainConfig.IsLondon(specHeader.Number) {
+		specHeader.BaseFee = eip1559.CalcBaseFee(w.chainConfig, blockNHeader)
+	}
+
+	// Call Prepare() via the speculative chain reader with waitOnPrepare=false.
+	// This sets Difficulty, Extra (validator bytes at sprint boundary), and timestamp
+	// but does NOT sleep. The timing wait is deferred until after the abort check
+	// to avoid wasting a full block period if the speculative block is discarded.
+	// NOTE: Prepare() will zero out specHeader.Coinbase. The real coinbase
+	// is preserved in the local `coinbase` variable above.
+	if err := w.engine.Prepare(specReader, specHeader, false); err != nil {
+		log.Warn("Pipelined SRC: speculative Prepare failed, falling back", "err", err)
+		w.fallbackToSequential(req)
+		return
+	}


This duplicates a few things from makeHeader in worker.go. Maybe worth unifying.

Unified coinbase resolution via resolveCoinbase(blockNumber, fallback). The rest can't be merged (placeholder parent, deterministic bor-period timestamp, no engine.Prepare); comment on buildInitialSpecHeader documents why.

cffls · 2026-04-08T02:35:37Z

+		w.fallbackToSequential(req)
+		return
+	}
+	specState.StartPrefetcher("miner-speculative", nil, nil)


Regarding "layer stale" errors from prefetcher, I think we can delay the prefetching of N+1 until SRC for block N has completed. Asked claude about this idea and this is what it suggested:

The existing getStateObject/GetCommittedState code already calls prefetcher.prefetch() during execution, which queues tasks and records what was accessed. The problem is that subfetcher.loop() immediately calls openTrie() and starts resolving, hitting the stale layer. If we just delay the resolution, the queueing and dedup logic stays untouched. The change:

1. trie_prefetcher.go (~30 lines) - add a gate channel to subfetcher:

    type subfetcher struct {
    	// ... existing fields ...
    	gate chan struct{} // If non-nil, loop blocks until closed
    }

    func (sf *subfetcher) loop() {
    	defer close(sf.term)
    	// Wait for gate to open before touching the trie
    	if sf.gate != nil {
    		select {
    		case <-sf.gate:
    		case <-sf.stop:
    			return
    		}
    	}
    	if err := sf.openTrie(); err != nil {
    		return
    	}
    	// ... existing loop unchanged ...
    }

Add Resume() to triePrefetcher:

    func (p *triePrefetcher) Resume() {
    	p.lock.Lock()
    	defer p.lock.Unlock()
    	for _, f := range p.fetchers {
    		if f.gate != nil {
    			close(f.gate)
    			// Re-signal wake since signals were dropped while gated
    			select {
    			case f.wake <- struct{}{}:
    			default:
    			}
    		}
    	}
    }

Wire the gate through: newSubfetcher accepts a gate channel, triePrefetcher stores a gated bool, and prefetch() passes the gate when creating subfetchers.

2. statedb.go (~10 lines) - expose resume:

    func (s *StateDB) ResumePrefetcher() {
    	if s.prefetcher != nil {
    		s.prefetcher.Resume()
    	}
    }

3. pipeline.go (~5 lines) - start gated, resume after SRC:

    // Before execution (line 225):
    specState.StartPrefetcherGated("miner-speculative", nil, nil)
    // After WaitForSRC returns (line 339):
    specState.ResumePrefetcher()

The one tricky bit is the wake signal: schedule() has select { case sf.wake <- struct{}{}: default: }, so if the loop isn't listening (gated), the signal is dropped. The Resume() method handles this by re-signaling wake after opening the gate. Any subfetcher with queued tasks will pick them up.

That's it. No changes to pathdb, no changes to the hot execution path (getStateObject/GetCommittedState), no changes to the trie layer. The prefetcher's existing dedup tracking (seenReadAddr, seenReadSlot) means repeated accesses during execution are collapsed: when the gate opens, only unique trie paths get resolved. In the loop iterations (lines 620-652), the same pattern applies: the fill goroutine runs with a gated prefetcher, and Resume() is called after the iteration's WaitForSRC returns.

The only concern with gating is that it delays prefetching until SRC completes, making the overlap window slower. WDYT?

I think it is fine to wait until the SRC of the previous block completes. The longest execution path (the bottleneck) is transaction execution. As long as prefetch + SRC takes less time than txn execution, it is fine to run SRC and prefetch in sequence. I think the code will be cleaner, or require fewer changes, if we force the prefetcher to wait for SRC.
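
To make the gating idea concrete outside the prefetcher, here is a self-contained sketch of a worker that queues tasks while gated and only processes them after resume. Everything in it (`gatedWorker`, `schedule`, `resume`, `processed`) is hypothetical. It also makes two choices that are assumptions of this sketch rather than part of the suggestion: the wake channel is buffered with capacity 1, so one signal survives the gated period, and the gate is closed through sync.Once so a repeated resume cannot close it twice.

```go
package main

import (
	"fmt"
	"sync"
)

type gatedWorker struct {
	mu      sync.Mutex
	queue   []string
	done    []string
	pending sync.WaitGroup // one count per scheduled, unprocessed task
	once    sync.Once
	gate    chan struct{} // closed by resume
	wake    chan struct{} // capacity 1: a pending signal is never lost
	stop    chan struct{}
	term    chan struct{}
}

func newGatedWorker() *gatedWorker {
	w := &gatedWorker{
		gate: make(chan struct{}),
		wake: make(chan struct{}, 1),
		stop: make(chan struct{}),
		term: make(chan struct{}),
	}
	go w.loop()
	return w
}

// schedule queues a task; it never blocks, even while the worker is gated.
func (w *gatedWorker) schedule(task string) {
	w.pending.Add(1)
	w.mu.Lock()
	w.queue = append(w.queue, task)
	w.mu.Unlock()
	select {
	case w.wake <- struct{}{}:
	default:
	}
}

// resume opens the gate; calling it more than once is harmless.
func (w *gatedWorker) resume() { w.once.Do(func() { close(w.gate) }) }

func (w *gatedWorker) processed() []string {
	w.mu.Lock()
	defer w.mu.Unlock()
	return append([]string(nil), w.done...)
}

func (w *gatedWorker) loop() {
	defer close(w.term)
	select { // do not touch any work until the gate opens
	case <-w.gate:
	case <-w.stop:
		return
	}
	for {
		select {
		case <-w.wake:
			w.mu.Lock()
			batch := w.queue
			w.queue = nil
			w.done = append(w.done, batch...)
			w.mu.Unlock()
			for range batch {
				w.pending.Done()
			}
		case <-w.stop:
			return
		}
	}
}

func main() {
	w := newGatedWorker()
	w.schedule("a")
	w.schedule("b")
	fmt.Println(len(w.processed())) // nothing runs while gated
	w.resume()
	w.pending.Wait()
	fmt.Println(w.processed())
	close(w.stop)
	<-w.term
}
```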

Hi @cffls, I had a chance to look into this. Sorry for not being clear about all the errors. These were the errors observed on the devnet:

RPC: Unexpected trie node / failed opening storage trie

BP: Unexpected trie node / failed opening storage trie and layer stale

Regarding delaying prefetch until SRC - it can reduce layer stale errors on the miner speculative path, but it does not fix the storage-root mismatch / Unexpected trie node problem. Also, it will not address the RPC/import-side failures.

Also to answer your other question "Is it still necessary if we make sure all the state root is requested correctly during SRC? I am wondering whether this can cause problems when a layer is actually stale":

Yes, it is needed because correct roots and nodeFallback are solving different problems.

Correct root handling fixes opening a storage trie with a root that is inconsistent with the prefetcher’s reader state

nodeFallback fixes a read that was valid when started, but walks into a layer that became stale due to concurrent cap()/persist()

And regarding the concern:

NodeFallback only triggers on errSnapshotStale

It retries through the current layer chain first, then the current base disk layer

After fallback, Node() still does the normal got != hash check

So it should not silently return wrong data. Worst case, it still errors.

ok makes sense. Thanks for taking a look!

cffls · 2026-04-08T02:37:35Z

@@ -0,0 +1,933 @@
+package miner


Nice job on isolating the new logic in a new file!

…r block import

Overlap SRC(N) with execution of block N+1 on importing/RPC nodes. After executing block N, defer IntermediateRoot + CommitWithUpdate to a background SRC goroutine and immediately proceed to block N+1 using a FlatDiff overlay for state reads. Cross-call persistence allows the SRC to run across insertChain boundaries.

Key changes:

- Pipeline path in insertChainWithWitnesses with ValidateStateCheap
- FlatDiff overlay in StateAt, StateAtWithReaders, PostExecutionStateAt
- Path DB reader chained fallback for concurrent layer flattening
- Trie-only reader for SRC witness generation (no flat reader bypass)
- WIT handler waits for pipelined witness before returning empty
- WitnessReadyEvent for announcing witnesses to stateless peers
- PropagateReadsTo in checkAndCommitSpan for witness completeness
- Feature gated: --pipeline.enable-import-src

Adds TestPipelinedImportSRC_SelfDestruct to verify that the FlatDiff Destructs check in getStateObject correctly handles self-destructed contracts during pipelined import.

Two fixes for prefetcher errors during pipelined state root computation:

1. Storage root mismatch: FlatDiff accounts had storage roots from block N's post-state, but the prefetcher's NodeReader was at the committed parent root (grandparent). Add prefetchRoot field to stateObject that stores the grandparent's storage root, read from the flat state reader when loading from FlatDiff. Use it consistently across all prefetcher interactions.

2. Layer stale during trie node resolution: SRC's cap() flattens diff layers concurrently with prefetcher trie walks. Add nodeFallback to reader.Node(), mirroring the existing accountFallback/storageFallback pattern: it retries via the current base disk layer on errSnapshotStale.

cffls · 2026-04-11T00:34:09Z

+		// the current base disk layer — same strategy as accountFallback and
+		// storageFallback.
+		if errors.Is(err, errSnapshotStale) {
+			blob, got, loc, err = r.nodeFallback(owner, path)


Is it still necessary if we make sure all the state root is requested correctly during SRC? I am wondering whether this can cause problems when a layer is actually stale.

I have not tried that yet, but Claude thinks yes, because prefetchRoot and nodeFallback fix different races: prefetchRoot fixes the storage root mismatch, while nodeFallback handles cap() marking the disk layer stale during concurrent SRC. Even with all roots correct, cap() can still stale the layer mid-walk.

Posted a reply here

A series of fixes for pipelined SRC under EIP-2935/BLOCKHASH aborts and abort-heavy devnet load:

1. Skip pipeline pre-Rio. Pre-Rio speculative Prepare walks unsigned speculative headers and can hit ecrecover failures on zero-seal Extra data. Disable pipelined SRC before Rio so the miner stays on the safe sequential path there.

2. Move slot waiting fully to Seal and keep abort rebuilds in-slot. The miner now always builds block bodies early and uses the slot for tx selection, while Bor holds propagation until the target time in Seal(). Abort-recovery headers carry a miner-local AbortRecovery flag so late speculative rebuilds stay in-slot instead of getting pushed to the next slot by minBlockBuildTime.

3. Isolate block-build timeout state per build environment. Sequential builds and speculative fills previously shared a worker-global timeout flag, so one build's timer could interrupt another build's tx selection. Move timeout state onto each environment and make timer cancel stop the timer without poisoning the build as timed out.

4. Improve speculative fill behavior and fix DAG metadata on refill. Speculative blocks now take a late refill pass when they are still under about 75% full by gas and there is at least 300ms left before the slot, not only when fully empty. Keep tx dependency DAG state on the block environment across refill passes so multi-pass speculative fills do not restart dependency indices from zero and drop metadata with non-sequential transaction index errors.

5. Harden abort recovery and mined-block propagation. After speculative aborts, requeue normal work through the standard worker path instead of re-entering commitWork recursively. On the networking side, mined inline blocks now still announce correctly when witness data is already cached but the async block write is not yet visible in the DB.

6. Add regression coverage and clean up logs. Add tests for Bor timing behavior, speculative refill decisions, per-build interrupt isolation, DAG metadata persistence across refill passes, cached-witness announcement, and BLOCKHASH(N) abort-flag behavior. Also remove duplicate EIP-2935 abort logs and fix negative seal-delay logging so slightly-late blocks no longer print huge wrapped unsigned delays.

Wires a complete metrics suite for A/B comparing pipelined vs non-pipelined import and block production on mainnet.

New pipelined metrics (import):

- chain/imports/pipelined/{hit,miss,root_mismatch,enabled}
- chain/imports/witness_ready_end_to_end: apples-to-apples end-to-end timer, fires in both modes (primary A/B KPI)

New pipelined metrics (build):

- worker/pipelineSpeculativeCommitted, pipelineSRCWait, pipelineSealDuration
- worker/pipelineAnnounceEarlinessMs (signed ms; PIP-66 earliness signal)
- worker/pipelineSpeculativeAborts/{blockhash,src_failed,fallback}
- worker/build_to_announce: producer-side end-to-end, both modes
- worker/pipeline/enabled

Parity wiring for legacy metrics so dashboards work in both modes:

- chain/inserts, account/storage read + hash + update + commit timers, snapshot/triedb commits, stateCommitTimer, blockBatchWriteTimer, witnessEncode/DbWrite: emitted from the pipelined branch (main statedb or SRC goroutine's tmpDB as appropriate)
- worker/writeBlockAndSetHead: emitted from inlineSealAndBroadcast's async write goroutine
- pipelineAnnounceEarlinessMs and pipelineSpeculativeCommittedCounter also emitted from resultLoop for the sealBlockViaTaskCh path

Throughput and overlay observability:

- chain/{gas_used_per_block,txs_per_block,mgasps} + chain/witness/size_bytes
- worker/chain/{gas_used_per_block,txs_per_block}
- state/flatdiff/{account_hits,storage_hits}: FlatDiff overlay effectiveness

Metrics that have no clean pipelined semantic (chain/validation, chain/write, worker/commit, worker/finalizeAndAssemble, worker/intermediateRoot) are left unemitted in pipelined mode with inline comments documenting the reason and pointing to the closest pipeline equivalent.

claude · 2026-04-21T11:25:48Z

Code Review

Found 3 issues in core/blockchain.go. Checked for bugs and CLAUDE.md compliance.

Missing ValidateReorg guard in pipelined import path (HIGH) — The non-pipelined path checks forker.ValidateReorg per-block before writing to prevent a race with newly-whitelisted Heimdall milestones/checkpoints. The pipelined path skips this check, allowing a block that conflicts with a freshly-anchored milestone to be accepted as chain head.
writeHeadBlock called without chainmu in auto-collection goroutine (HIGH) — On root mismatch, the recovery goroutine calls writeHeadBlock(parentBlock) which has a documented contract requiring the mu mutex to be held. The goroutine does not acquire it, creating a data race on chain head state.
flushPendingImportSRC error silently discarded (HIGH) — The function returns a meaningful error (SRC failure or root mismatch from a previously-committed block) but the error is discarded with _ in the block-error cleanup path. Every other call site handles this error.

claude · 2026-04-21T11:33:08Z

Inline Review Comments

Since inline comments could not be posted via the review API, here are the detailed findings with line references:

Issue 1 (HIGH): Missing ValidateReorg guard in pipelined import path — core/blockchain.go:3457

The non-pipelined path (lines 3618-3631) calls forker.ValidateReorg per-block immediately before writing to DB to prevent a race where a Heimdall milestone/checkpoint is whitelisted during block execution. The pipelined path skips this check entirely, calling writeBlockAndSetHeadPipelined without a preceding ValidateReorg. A block that conflicts with a freshly-anchored Heimdall milestone or checkpoint can be accepted as chain head in pipelined mode. The upfront check at line 3103 runs before execution begins and does not close this window.

Fix: Add the same per-block ValidateReorg guard before writeBlockAndSetHeadPipelined.

CLAUDE.md: blockchain-security.md and consensus-security.md

Issue 2 (HIGH): writeHeadBlock called without chainmu in auto-collection goroutine — core/blockchain.go:3507

writeHeadBlock has a documented contract at line 1733: "this function assumes that the mu mutex is held!". The auto-collection goroutine calls it without acquiring chainmu. After insertChainWithWitnesses returns and releases chainmu, this goroutine may still be running. If a root mismatch is detected, writeHeadBlock is called without the mutex — while another goroutine could concurrently acquire chainmu for a new InsertChain call. This is a data race on chain head state.

Fix: Acquire bc.chainmu.Lock() before calling writeHeadBlock in the error recovery path.
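The locking contract behind this fix can be illustrated with a minimal sketch. It assumes a plain `sync.Mutex` as a stand-in for `chainmu` (bor's actual `chainmu` type and its API are not shown in this thread), and the `chain` type and `revertHead` helper are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// chain is a hypothetical stand-in for BlockChain; mu plays the role of chainmu.
type chain struct {
	mu   sync.Mutex
	head uint64
}

// writeHeadBlock assumes mu is already held by the caller,
// mirroring the documented contract quoted in the review.
func (c *chain) writeHeadBlock(n uint64) {
	c.head = n
}

// revertHead is what a background recovery goroutine would call on a
// root mismatch: it acquires the lock itself before touching head state.
func (c *chain) revertHead(parent uint64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.writeHeadBlock(parent)
}

func main() {
	c := &chain{head: 10}

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		c.revertHead(9) // safe even if another goroutine is inserting blocks
	}()
	wg.Wait()

	fmt.Println("head after revert:", c.head)
}
```

The point of the sketch is only that the goroutine, not its spawner, is responsible for holding the lock at the moment of the write, because the spawner may have released it long before the goroutine runs.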

CLAUDE.md: security-common.md — "Shared mutable state protected by mutex or atomic operations"

Issue 3 (HIGH): flushPendingImportSRC error silently discarded — core/blockchain.go:3407

flushPendingImportSRC() returns a meaningful error (state root mismatch or SRC failure from a previously-committed block). Discarding it with _ means a block with a bad state root could persist undetected. Every other call site handles this error (line 1784, line 3328).

Fix: Replace _ = bc.flushPendingImportSRC() with if err := bc.flushPendingImportSRC(); err != nil { log.Error(...) } consistent with other call sites.

CLAUDE.md: security-common.md — "Error values checked — never discard errors with _ in security-sensitive paths"

…ions for diffguard compliance

Decompose large pipelined-src-authored functions into focused helpers so every function owned by this branch sits under diffguard's 50-line / complexity-10 limits. Pure structural refactor; no behavior change.

miner/pipeline.go:
- commitSpeculativeWork (599) → orchestrator (35) + specSession struct with ~18 methods (setupInitial, waitForSRCAndSealBlockN, runOneIteration, prepareNextIteration, sealCurrentAndAdvance, shiftToNext, etc.)
- inlineSealAndBroadcast (100) → 35 + sealViaPrivateChannel, rebindReceiptsToSealedBlock, announceInlineSealedBlock
- commitPipelined (59) → 37 + buildSpeculativeReq, spawnSRCForFinalBlock
- sealBlockViaTaskCh (52) → 48 (reuses spawnSRCForFinalBlock)

miner/worker.go:
- fillTransactions (59) → 47 + commitTxMaps
- makeEnv (51) → 38 + resolveStateFor
- updateTxDependencyMetadata (68) → 32 + buildTxDependencyArray

Pre-existing develop functions where pipelined-src had grown the body are reduced back close to or below their develop size by extracting the added branches:
- commitWork (67 → 36) via clearPendingWorkOnExit + maybeStartPrefetch
- resultLoop (191 → 124; develop was 123) via emitExecutionMetrics, emitCommitMetrics, writeTaskBlock, announceTaskBlock
- mainLoop (135 → 120; develop was 116) via handleSpeculativeWork
- buildAndCommitBlock (93 → 83; develop was 80) via submitForSealing

core/state/statedb.go:
- CommitSnapshot (95, complexity 40) → 30 + captureMutation, captureObjectStorage, captureReadOnlyAccount, captureNonExistentRead
- ApplyFlatDiffForCommit (49, complexity 20) → 16 + applyFlatMutation
- ApplyFlatDiff (36, complexity 11) → 13 + applyFlatAccountOverlay
- TouchAllAddresses (25, complexity 11) → 12 + touchAddressAndStorage, mutatedStorageKeys

core/blockchain.go:
- SpawnSRCGoroutine (127, complexity 35) → 13 + runSRCCompute, openSRCStateDB, preloadFlatDiffReads, emitSRCStateDBMetrics, encodeAndCachePendingWitness
- writeBlockAndSetHeadPipelined (108, complexity 29) → 16 + writePipelinedBlockBatch, writeBorStateSyncLogs, resolveWriteStatus, emitPipelinedWriteEvents
- handleImportTrieGC (52, complexity 16) → 21 + capTrieIfDirty, maybeFlushChosen, dereferenceUpTo
- waitForPipelinedWitness (complexity 11) → 9 + waitForPendingSRCWitness, pollWitnessCache

core/evm.go:
- SpeculativeGetHashFn (complexity 12) → 17 + newPendingBlockNResolver

core/blockchain.go insertChainWithWitnesses pipelined branch (had grown +222 lines on top of develop's 452) → +42 via buildPipelineImportOpts, persistPipelinedImport, collectPrevImportSRCIfAny, emitStateSyncFeed, runImportAutoCollection, verifyImportSRCRoot, publishImportWitness, emitPipelinedImportParityMetrics.

core/blockchain.go ProcessBlock pipelined branches (+22 lines) → +4 via pipelineReaderRoot, applyFlatDiffOverlayToAll, validateStateForPipeline.

eth/peer.go:
- doWitnessRequest (pipelined-src pushed from 38 → 65) → 32 + awaitWitnessResponse extracting the goroutine body

eth/handler_wit.go:
- handleGetWitness (pipelined-src pushed from 70 → 91) → 66 + resolveWitnessSizes consolidating per-hash size resolution (rawdb + header-existence DoS guard + SRC cache fallback)

tests/bor/helper.go:
- InitMinerWithPipelinedSRC (65) → 32 + newPipelineTestNode (17), importValidatorKey (11)
- InitImporterWithPipelinedSRC (64) → 31 (same helpers)

Mutation coverage. Ran diffguard in diff-scoped mode (-base develop -include-paths <module>) across every module pipelined-src touches and filled the gaps it surfaced:
- core/state: adds core/state/statedb_pipeline_mutations_test.go with 41 targeted tests that kill 24 of 28 mutation survivors in pipelined-src FlatDiff code (statedb.go lines 2031-2330, 2492-2499). The 4 remaining are equivalent mutants: Finalise removes destructed addrs before the guarded branches can fire (2114, 2163), a zero-length loop produces the same output with or without the guard (2141), and an empty-slice map entry is observationally equivalent to a missing entry (2150). Covers CommitSnapshot and its capture helpers, ApplyFlatDiff + applyFlatAccountOverlay, ApplyFlatDiffForCommit + applyFlatMutation, NewWithFlatBase, TouchAllAddresses + touchAddressAndStorage + mutatedStorageKeys, WasStorageSlotRead, and PropagateReadsTo; 14 of 15 functions at 100% line coverage (captureReadOnlyAccount at 90.9%).
- core/stateless: extends witness_test.go with 3 tests targeting ValidateWitnessPreState's expectedBlock guard (ParentHash and Number checks that defend against a malicious peer substituting a witness for a different block / fork). Previous tests all passed nil for expectedBlock, leaving the entire anti-forgery branch uncovered.
- eth/filters: adds TestResolveBlockNumForRangeCheck and TestCheckBlockRangeLimit (16 subcases) to api_test.go covering the RPC range-limit DoS guard at the unit level (sentinel resolution, span-at-limit boundary, sum-vs-span distinction). Extends TestInvalidGetRangeLogsRequest in filter_system_test.go to also exercise GetBorBlockLogs with an inverted range; previously only GetLogs was covered.

Per-module mutation scores after this coverage: miner 96%, consensus/bor 100% (41/41), core 100% (447/447), core/state 86% (24/28 equivalent), core/stateless 100% (8/8), core/txpool 100% (12/12), tests/bor 100% (43/43), triedb/pathdb 100% (20/20). eth at 53%: remaining survivors are in auto-generated gen_config.go boilerplate (36), P2P dispatcher cancel-channel plumbing awaitWitnessResponse goroutine cleanup (3); documented as accepted gaps requiring complex mock infrastructure for diminishing security return.

Remaining diffguard violations in miner and core are pre-existing develop functions (commitTransactions, insertChainWithWitnesses, newWorkLoop, NewBlockChain, ProcessBlock, writeBlockWithState, etc.) that were over threshold on develop before pipelined-src. Their pipelined-src deltas are now small (+1 to +42 lines) and out of scope for this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Renames (reviewer nits):
- PostExecutionStateAt → PostExecState (BlockChain + txpool/legacypool/blobpool interfaces + test mocks).
- ResetSpeculativeState → SetSpeculativeState, SpeculativeResetter → SpeculativeSetter (the method overwrites, it doesn't revert).

Dedup between writeBlockAndSetHead and writeBlockAndSetHeadPipelined. Both paths now share resolvePostWriteStatus(block, stateless) for fork-choice + reorg (stateless flag preserves the errInvalidNewChain escape for fast-forward sync), emitPostWriteEvents for the feed sends, and writeBorStateSyncLogs for the pre-Madhugiri bor receipt. ~80 lines of duplicated fork-choice + event logic removed; writeBlockAndSetHead drops from ~70 to ~11 lines. Batch bodies intentionally not merged: witness source (statedb.Witness() vs pre-encoded bytes) and trie-commit timing genuinely differ.

Miner coinbase unification: extracted resolveCoinbase(blockNumber, fallback) used by makeHeader (fallback=genParams.coinbase) and the speculative header builders (fallback=etherbase()). Divergence between the speculative and real header would cause a state root mismatch, so single-sourcing this is security-meaningful. Rest of buildInitialSpecHeader kept separate from makeHeader (placeholder parent, deterministic bor period timestamp, static GasCeil, no engine.Prepare); comment documents why unifying further would hurt readability.

Pipelined import correctness fixes (core/blockchain.go):
1. persistPipelinedImport now runs the Heimdall milestone/checkpoint ValidateReorg guard before writeBlockAndSetHeadPipelined, mirroring the non-pipelined path. Without it, a milestone whitelisted during block execution could be bypassed.
2. verifyImportSRCRoot wraps its writeHeadBlock revert in chainmu TryLock/Unlock. The call ran in the auto-collection goroutine without the mutex, racing any concurrent InsertChain on head state. Skips + warns if chainmu is closed (shutdown).
3. flushPendingImportSRC error in insertChain's ProcessBlock error path no longer discarded with `_`; logged like the other two call sites.

Linter: dropped two `tc := tc` loop-var copies in eth/filters/api_test.go (copyloopvar, redundant since Go 1.22).

Move the expensive WarmSnapshot construction out of the synchronous pipelined import post-exec path. The import thread now stops the execution-side prefetcher, collects the quiesced warm-node maps into a WarmSnapshotInput, and passes that handoff to SRC. The SRC goroutine builds the final immutable WarmSnapshot before opening the snapshot-aware trie reader. This keeps the same safety boundary: subfetchers have exited before their trie witness maps are read, and the final WarmSnapshot still owns copied node blobs. The difference is that copy/hash/index work no longer inflates prefetchStop/postExec on the import thread. The warm_snapshot/build metric now measures SRC-side build time. Slow pipelined import logs no longer report warmBuild as a synchronous phase.

Pipelined SRC only needs warm nodes that are already loaded by the execution prefetcher. It does not need to synchronously drain every queued speculative prefetch task before spawning SRC; missing warm nodes are safe performance misses because SRC falls through to pathdb. Add a snapshot-fast prefetcher stop mode for StopAndCollectWarmSnapshot. The new mode rejects new work, drops queued/unstarted tasks, avoids starting new trie/pathdb reads after stop is requested, and still waits for in-flight subfetcher goroutines to exit before reading trie witness maps. Keep the existing full-drain behavior for normal StopPrefetcher callers. Add tests covering queued-task discard, full-drain preservation, and the synchronous production terminateForSnapshot path.

Pipelined SRC only needs warm nodes that the execution prefetcher has already loaded. Missing warm nodes are safe cache misses because SRC falls through to pathdb, so the warm-snapshot stop path should not wait for one large speculative prefetch batch to finish. Split subfetcher account and storage prefetches into bounded chunks. In snapshot-fast mode, the subfetcher now checks for stop between chunks and exits before starting more trie/pathdb reads. Normal full-drain termination still processes every chunk, preserving existing StopPrefetcher semantics. Add tests covering snapshot-fast account/storage chunk cancellation and the full-drain invariant for chunked account prefetches.
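The chunked stop check described in this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the subfetcher code: `prefetchChunked`, its parameters, and the callback-based stop signal are hypothetical, and a real implementation would use an atomic flag or channel rather than a plain closure.

```go
package main

import "fmt"

// prefetchChunked loads keys in chunks of chunkSize (which must be > 0).
// In fast mode it checks stopped() between chunks and exits before starting
// more reads; otherwise it drains every chunk (full-drain semantics).
// It returns the number of keys actually loaded.
func prefetchChunked(keys []string, chunkSize int, fast bool, stopped func() bool, load func(string)) int {
	loaded := 0
	for start := 0; start < len(keys); start += chunkSize {
		if fast && stopped() {
			break // stop requested: do not begin another chunk
		}
		end := start + chunkSize
		if end > len(keys) {
			end = len(keys)
		}
		for _, k := range keys[start:end] {
			load(k)
			loaded++
		}
	}
	return loaded
}

func main() {
	keys := []string{"a", "b", "c", "d", "e"}

	// Simulate a stop request arriving while the first chunk is in flight.
	stop := false
	n := prefetchChunked(keys, 2, true, func() bool { return stop }, func(string) { stop = true })
	fmt.Println("fast mode loaded:", n)

	// Full-drain mode ignores the stop flag and processes every chunk.
	n = prefetchChunked(keys, 2, false, func() bool { return true }, func(string) {})
	fmt.Println("full drain loaded:", n)
}
```

A chunk that has already started still runs to completion in this sketch, which matches the idea that skipped work is only a cache miss while in-flight work must finish before its results are read.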

Move pipelined import prefetcher shutdown out of the import thread. After CommitSnapshot, the execution StateDB now detaches its trie prefetcher and hands it to the SRC goroutine. SRC then synchronously drains/reports the detached prefetcher before computing the root.

WarmSnapshot remains optional: when enabled, SRC builds the snapshot from the fully drained detached prefetcher; when disabled, SRC just waits for the prefetcher to finish and discards the warm nodes. This lets us A/B the prefetch lifecycle independently from the snapshot reader.

Add explicit import-thread vs SRC-thread metrics:
- chain/imports/pipelined/prefetch_detach
- chain/imports/pipelined/src/prefetch_wait
- chain/imports/pipelined/src/prefetch_report
- chain/imports/pipelined/src/prefetch_subfetchers

Remove the superseded snapshot-fast stop path and its tests. The remaining detached-prefetcher lifecycle uses full-drain semantics only, with focused tests for detach, nil/empty handles, and one-shot stop consumption.

sonarqubecloud · 2026-05-09T17:20:05Z

Quality Gate failed

Failed conditions
8.8% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

Restore the Bor timing split removed in 5d45f02: normal producers wait in Prepare until the parent slot boundary, while Giugliano+ primary producers return immediately from Seal so blocks can be announced before their own timestamp. Thread the required waitOnPrepare flag through the consensus Engine interface. Normal mining passes true, while speculative/prefetch pipeline paths pass false and perform their own parent-boundary wait before sealing. This preserves early announcement semantics without blocking speculative header preparation. The import-side future-block checks already match develop: post-Giugliano headers are accepted once local time has reached the parent timestamp, subject to the existing upper bound. Also add/update tests for Prepare wait behavior, Seal early return behavior, and explicit Prepare call sites.

Remove the production-side pipelined SRC config and CLI surface while keeping import-side pipelined SRC configurable. Miner pipeline eligibility now stays hard-disabled, the worker pipeline gauge reports disabled, and block-production interrupt timers use the normal non-pipelined boundary. Keep the previous production eligibility logic as a commented re-enable reference so the constraints are easy to recover if this path is revisited. Update CLI defaults and docs for import SRC: default import pipelining and verbose logs off, default warm-snapshot on, with an explicit note that warm-snapshot has no effect when import SRC is disabled.

…-bkp # Conflicts: # consensus/bor/bor.go # miner/worker.go # miner/worker_test.go

When pipelined SRC exposes the previous block through a FlatDiff overlay, accounts loaded from the overlay carry block N's post-state storage root. The miner/import prefetch readers, however, are opened at the committed parent root. For accounts that already existed in the committed parent, the overlay path resolves the committed storage root and uses that as the prefetch root. For accounts created only by the FlatDiff, there is no committed-parent storage trie to prefetch. The merge with BlockSTM v2 left those new FlatDiff accounts falling back to their post-state storage root. That lets storage prefetch scheduling hand a block-N root to a committed-parent reader, which pathdb reports as Unexpected trie node at the storage-trie root. Set the prefetch root to the empty storage root when the account is absent from the committed parent, and make all storage prefetch/get-prefetched/used paths skip empty roots. This keeps the execution overlay correct while preventing best-effort prefetch from opening an impossible trie. Update the FlatDiff-new-account regression test to pin the empty-root behavior.
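The root-selection rule in this commit can be sketched as a small decision function. Everything here is a simplified illustration: `hash`, `emptyRoot`, `prefetchRoot`, and the `committed` map are hypothetical stand-ins, not the StateDB types or the actual empty-root constant.

```go
package main

import "fmt"

// hash is a simplified stand-in for a 32-byte trie root.
type hash string

// emptyRoot stands in for the empty storage trie root.
const emptyRoot hash = "empty"

// prefetchRoot picks the storage root that a reader opened at the committed
// parent root is allowed to prefetch against.
//   - dataRoot is the root carried by the loaded account object.
//   - fromFlatDiff reports whether the account came from the FlatDiff overlay.
//   - committed maps address -> storage root as of the committed parent.
func prefetchRoot(addr string, dataRoot hash, fromFlatDiff bool, committed map[string]hash) hash {
	if !fromFlatDiff {
		return dataRoot // normal account: its root belongs to the reader's state
	}
	if root, ok := committed[addr]; ok {
		return root // overlay account that already existed in the committed parent
	}
	return emptyRoot // account created only by the FlatDiff: nothing to prefetch
}

// shouldPrefetch skips empty roots so no impossible trie is ever opened.
func shouldPrefetch(root hash) bool {
	return root != emptyRoot
}

func main() {
	committed := map[string]hash{"0xA": "rootA@parent"}

	existing := prefetchRoot("0xA", "rootA@blockN", true, committed)
	created := prefetchRoot("0xB", "rootB@blockN", true, committed)

	fmt.Println(existing, shouldPrefetch(existing))
	fmt.Println(created, shouldPrefetch(created))
}
```

The post-state root from block N never reaches the prefetcher for overlay accounts, which is the property the regression test in this commit is described as pinning.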

…dsrc-and-blockstmv2

The combined pipelined-SRC + BlockSTM branch was still logging bursts of pathdb hash mismatches during startup: Unexpected trie node location=diff ... path=[] The previous FlatDiff prefetch-root fix covered the normal stateObject paths, but V2's FinaliseFastWithPrefetch still snapshotted dirty storage slots using obj.data.Root. For accounts loaded from a FlatDiff overlay, obj.data.Root is the previous block's post-state storage root. The execution prefetcher, however, is opened at the committed parent root, so scheduling a storage trie with the FlatDiff post-state root can ask pathdb for a root that does not belong to that reader's state. Carry the resolved prefetch root through snapshotDirtyStorageSlots and have FinaliseFastWithPrefetch schedule storage prefetches with that root. This preserves normal accounts by falling back to data.Root, uses the committed-parent root for FlatDiff overlay accounts, and skips new FlatDiff accounts whose storage trie did not exist at the committed parent root. Add regressions for both cases: existing FlatDiff accounts must prefetch with the committed storage root, and new FlatDiff accounts must not schedule a storage prefetch against the FlatDiff post-state root.

Add finer-grained timers around persistPipelinedImport so mainnet experiments can explain post-execution overhead instead of relying on the aggregate post_exec timer alone. The new metrics split out witness capture, collect bookkeeping, error-path prefetch cleanup, SRC block construction, pending-state publication, and a residual bucket for any post_exec time that is not covered by the known phases. Slow pipelined import logs now include the accounted and residual totals too, making it easier to tell whether a spike is a known phase or missing instrumentation.

sonarqubecloud · 2026-05-21T16:31:57Z

Quality Gate failed

Failed conditions
8.7% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

…elinedsrc-and-blockstmv2

Pipelined import can execute block N+1 against committed root N-1 plus the pending FlatDiff for N. V2 SafeBase reads were checking the shared storage cache before the FlatDiff, so a slot warmed from the committed root could mask the newer parent-block overlay value. That can make parallel execution charge gas/refunds from stale SSTORE state and fail ValidateStateCheap with a gas-used mismatch. A retry can appear clean once the pending SRC layer has been committed and the same block executes against parent root N directly. Route SafeBase and lazy StateDB committed-storage reads through a shared FlatDiff storage overlay helper. Explicit FlatDiff storage entries now win over shared trie caches, and FlatDiff destruct entries cover all slots by returning zero for old pre-destruction storage that was not rewritten by a resurrection. Add regressions for FlatDiff entries winning over SafeBase shared cache, destruct masks hiding shared-cache values, and lazy destruct+resurrect overlays not exposing old storage.
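The read ordering this commit describes (explicit FlatDiff entry first, then destruct mask, then the shared cache or committed state) can be sketched as below. The types and the `readStorage` function are hypothetical simplifications with integer slot values; they are not the StateDB or SafeBase API.

```go
package main

import "fmt"

// slotKey identifies one storage slot of one account.
type slotKey struct {
	addr, slot string
}

// flatDiff is a simplified overlay holding block N's pending mutations.
type flatDiff struct {
	storage   map[slotKey]uint64 // slots explicitly written in block N
	destructs map[string]bool    // accounts destructed in block N
}

// readStorage resolves a slot for block N+1 execution:
//  1. an explicit FlatDiff entry always wins;
//  2. a destructed account hides all older slots (returns zero) unless the
//     slot was rewritten by a resurrection, which rule 1 already covers;
//  3. otherwise fall through to the shared cache of committed values.
func readStorage(fd *flatDiff, cache map[slotKey]uint64, addr, slot string) uint64 {
	k := slotKey{addr, slot}
	if fd != nil {
		if v, ok := fd.storage[k]; ok {
			return v
		}
		if fd.destructs[addr] {
			return 0
		}
	}
	return cache[k]
}

func main() {
	// Values warmed from committed root N-1.
	cache := map[slotKey]uint64{
		{"A", "s1"}: 1,
		{"B", "s1"}: 7,
		{"C", "s1"}: 9,
	}
	fd := &flatDiff{
		storage:   map[slotKey]uint64{{"A", "s1"}: 2, {"B", "s2"}: 5},
		destructs: map[string]bool{"B": true},
	}

	fmt.Println(readStorage(fd, cache, "A", "s1")) // overlay value, not the stale cached one
	fmt.Println(readStorage(fd, cache, "B", "s1")) // masked by the destruct
	fmt.Println(readStorage(fd, cache, "B", "s2")) // rewritten after resurrection
	fmt.Println(readStorage(fd, cache, "C", "s1")) // uncovered account: cache
}
```

Checking the cache before the overlay, as the commit says the earlier code did, would return the stale value for account A and make SSTORE gas accounting diverge.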

Pipelined V2 flatdiff import executes block N+1 over committed root N-1 plus the collected FlatDiff for block N. The previous FlatDiff SafeBase fix made storage slots consult that overlay before shared trie caches, but account scalar reads could still fall through to pooled StateDB copies. Those pooled copies may already hold stateObjects loaded from root N-1. In that case GetBalance, GetNonce, GetCode, GetCodeHash, Exist, or GetStorageRoot can observe stale pre-FlatDiff account data while storage reads observe the FlatDiff view. Mainnet devnode logs showed this as flatdiff-only gas mismatches that immediately disappeared on direct retry, with the divergent transactions being EIP-7702 type-4 calls that are sensitive to account/code/existence state. Add a FlatDiff account overlay helper and have SafeBase scalar account getters consult it before acquiring pooled StateDB readers. Account updates in the FlatDiff now provide balance, nonce, code hash, storage root, existence, and changed code bytes; destruct entries mask stale stateObjects as non-existent account data. Uncovered accounts still use the existing pooled read path. Add regression coverage for stale stateObjects loaded before the FlatDiff reference is attached, covering both updated-account and destructed-account cases. Tests: go test ./core/state

StateDB getters such as GetState, GetBalance, GetNonce, GetCode, GetCodeHash, Exist, and GetStorageRoot record database read failures internally and return zero-ish values to their caller. SafeBase previously cached those returned values unconditionally. If a pooled StateDB copy hit a transient missing-node, stale PBSS layer, or similar read failure during V2 execution, the zero-ish result could become a stable SafeBase cache entry for the rest of the block. That is unsafe for pipelined SRC and PBSS because SRC can advance or flatten pathdb layers while the next block is executing or prefetching. A stale-layer read should make the current speculative V2 result unusable, not silently convert missing account/storage/code data into consensus state. Track the first read error observed by SafeBase, avoid caching pooled StateDB read results unless the read completed cleanly, and replace any pooled StateDB copy that has recorded an error instead of returning it to the worker pool. ExecuteV2BlockSTM now carries the SafeBase/base read error in V2ExecutionResult, and V2StateProcessor aborts the block with v2: base read so the importer can retry through the normal fallback path. The FlatDiff overlay paths still cache their explicit overlay values directly because they do not perform a database read. The guarded cache writes only apply to fallback reads through StateDB. Add SafeBase regression coverage for storage, account scalar, and code read failures to prove failed zero-ish results do not poison caches. Update the V2 gas determinism fixture selection to skip incomplete embedded witnesses now that base read failures are surfaced instead of ignored. Tests: go test ./core/state ./core
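The guarded cache write can be illustrated with a small read-through cache that refuses to store results from failed reads and remembers the first error. This is a sketch under assumptions: `safeCache`, `get`, and `errMissingNode` are hypothetical names, and holding the lock across the read is a simplification a concurrent implementation would likely avoid.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errMissingNode simulates a transient read failure such as a stale layer.
var errMissingNode = errors.New("missing trie node")

// safeCache is a read-through cache that never stores failed reads.
type safeCache struct {
	mu       sync.Mutex
	vals     map[string]uint64
	firstErr error // first read error seen; the caller aborts the block on it
}

func newSafeCache() *safeCache {
	return &safeCache{vals: make(map[string]uint64)}
}

// get returns the cached value or reads through. A failed read returns the
// zero-ish value to the caller but is not cached, so it cannot become a
// stable entry for the rest of the block.
func (c *safeCache) get(key string, read func(string) (uint64, error)) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.vals[key]; ok {
		return v
	}
	v, err := read(key)
	if err != nil {
		if c.firstErr == nil {
			c.firstErr = err
		}
		return v
	}
	c.vals[key] = v
	return v
}

func main() {
	c := newSafeCache()

	failing := func(string) (uint64, error) { return 0, errMissingNode }
	healthy := func(string) (uint64, error) { return 42, nil }

	fmt.Println(c.get("balance", failing)) // zero-ish, not cached
	fmt.Println(c.get("balance", healthy)) // real value after a clean read
	fmt.Println("first error:", c.firstErr)
}
```

The recorded `firstErr` is what lets the caller discard the speculative result and retry, instead of treating the zero-ish value as state.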

SafeBase should be a concurrent read-through cache over the block's logical base, not the owner of FlatDiff semantics. A StateDB constructed with a FlatDiff reference is the ground truth for block N+1 execution on top of committed root N-1, so every SafeBase miss must go through StateDB getters. Remove the SafeBase storage cache and account-scalar FlatDiff bypasses. Those paths made SafeBase reason about pending system-contract writes, shared trie storage caches, and FlatDiff coverage directly, which duplicated StateDB rules and could let raw root-N-1 cached values win before StateDB applied the overlay. Move the remaining overlay ordering into StateDB and stateObject. FlatDiff account coverage now masks stale stateObjects loaded before the reference unless the current execution has already dirtied that account. FlatDiff-backed objects are marked so repeated reads reuse the overlay-backed object, and FlatDiff storage coverage now beats stale originStorage populated from committedParentRoot before the reference was attached. Keep SafeBase's read-error handling intact: failed StateDB reads are not cached and poisoned pooled copies are replaced, so a missing-node or stale-layer read cannot become a cached zero-ish base value. Tests: go test -count=1 ./core/state -run 'Test(StateDB_FlatDiff|SafeBase_)'; go test -count=1 ./core; git diff --check

…lockstmv2

Integrates BlockSTM-v2 parallel EVM with pipelined-import-SRC after a clean multi-day devnode soak (2026-06-11 .. 2026-06-17, no bad blocks). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Add direct observability for the pipelined import question: how much of SRC for block N actually overlaps execution of block N+1, and whether that overlap correlates with slower execution or SRC work.

The implementation timestamps each pending SRC goroutine, carries the SRC handle into the next block's PipelineImportOpts, and records the intersection between the winning execution branch's Process window and the previous SRC window. This gives a per-block overlap signal instead of relying only on temporal dashboard correlation.

New overlap metrics:
- chain/imports/pipelined/overlap/execution: duration for which the previous block's SRC was running during the current block's execution.
- chain/imports/pipelined/overlap/execution_percent: overlap/execution ratio for the current block, emitted as 0..100.
- chain/imports/pipelined/overlap/blocks: count of blocks whose execution had positive overlap with previous SRC.
- chain/imports/pipelined/overlap/no_overlap: count of pipeline-hit blocks where previous SRC had no execution overlap.

New execution-path metrics:
- chain/imports/pipelined/execution: winning execution branch duration, measured around the processor Process call only.
- chain/imports/pipelined/execution/with_overlap: execution duration for blocks with positive previous-SRC overlap.
- chain/imports/pipelined/execution/no_overlap: execution duration for blocks without previous-SRC overlap.
- chain/imports/pipelined/execution/overlap_0_percent: execution duration for blocks with 0% overlap.
- chain/imports/pipelined/execution/overlap_1_25_percent: execution duration for blocks with >0% and <25% overlap.
- chain/imports/pipelined/execution/overlap_25_50_percent: execution duration for blocks with >=25% and <50% overlap.
- chain/imports/pipelined/execution/overlap_50_75_percent: execution duration for blocks with >=50% and <75% overlap.
- chain/imports/pipelined/execution/overlap_75_100_percent: execution duration for blocks with >=75% overlap.

New SRC-path metrics:
- chain/imports/pipelined/src/open_statedb: time to open the temporary StateDB used by SRC.
- chain/imports/pipelined/src/apply_flatdiff: time to replay the FlatDiff into the SRC StateDB.
- chain/imports/pipelined/src/commit: time spent in SRC CommitWithUpdate/root computation, also preserving the existing chain/state/commit parity sample.
- chain/imports/pipelined/src/with_next_exec_overlap: total SRC wall-clock for SRCs that overlapped the next block's execution.
- chain/imports/pipelined/src/no_next_exec_overlap: total SRC wall-clock for SRCs that did not overlap the next block's execution.

The SRC with/no-next-exec split is intentionally one block delayed: SRC_N is classified when block N+1 records its execution overlap and then N is collected. The trailing pending SRC at the end of a short run may remain unclassified, which is acceptable for long catch-up windows.

Update TestPipelinedImportMetrics to assert the new metric streams fire, the execution split matches overlap/no-overlap counters, the overlap percent buckets classify every pipeline hit, and the SRC next-exec split mirrors the overlap classification.

Validated with:
- go test ./core -run 'TestPipelinedImportMetrics|TestPipelinedImportSRC_MakeWitnessFalse|TestPipelinedImportSRC_MultipleBlocks'
- go test ./core
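The overlap measurement described in this commit reduces to intersecting two time windows and bucketing the ratio. A minimal sketch, assuming hypothetical `window`, `overlap`, `overlapPercent`, and `bucket` helpers; only the bucket suffixes are taken from the metric names listed in the commit message.

```go
package main

import (
	"fmt"
	"time"
)

// window is a half-open wall-clock interval [start, end).
type window struct {
	start, end time.Time
}

// base is an arbitrary fixed origin used to build example windows.
var base = time.Unix(0, 0)

// mkWindow builds a window from millisecond offsets relative to base.
func mkWindow(startMs, endMs int64) window {
	return window{
		start: base.Add(time.Duration(startMs) * time.Millisecond),
		end:   base.Add(time.Duration(endMs) * time.Millisecond),
	}
}

// overlap returns how long the two windows intersect, or zero if they don't.
func overlap(a, b window) time.Duration {
	s := a.start
	if b.start.After(s) {
		s = b.start
	}
	e := a.end
	if b.end.Before(e) {
		e = b.end
	}
	if !e.After(s) {
		return 0
	}
	return e.Sub(s)
}

// overlapPercent is overlap/execution for the current block, in 0..100.
func overlapPercent(exec, src window) float64 {
	d := exec.end.Sub(exec.start)
	if d <= 0 {
		return 0
	}
	return 100 * float64(overlap(exec, src)) / float64(d)
}

// bucket maps a percentage to the execution-duration bucket suffix.
func bucket(p float64) string {
	switch {
	case p <= 0:
		return "overlap_0_percent"
	case p < 25:
		return "overlap_1_25_percent"
	case p < 50:
		return "overlap_25_50_percent"
	case p < 75:
		return "overlap_50_75_percent"
	default:
		return "overlap_75_100_percent"
	}
}

func main() {
	src := mkWindow(0, 150)    // SRC for block N
	exec := mkWindow(100, 200) // execution of block N+1
	p := overlapPercent(exec, src)
	fmt.Println(overlap(exec, src), p, bucket(p))
}
```

Measuring the intersection directly per block is what separates this from inferring overlap by lining up two dashboard graphs.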

sonarqubecloud · 2026-06-22T03:33:43Z

Quality Gate failed

Failed conditions
8.6% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

cffls and others added 5 commits March 5, 2026 21:00

Initial delayed SRC PoC

8bd82f9

Merge branch 'develop' of github.com:0xPolygon/bor into delay_src

b232538

miner, core, consensus/bor: pipelined state root computation (PoC)

d63daed

miner: run speculative fillTransactions concurrently with SRC and removed the post tx execution buffer time

ceca519

miner: async DB write, concurrent fill, and interrupt timer improvements

3dcdf80

claude Bot reviewed Apr 1, 2026

View reviewed changes

pratikspatil024 requested review from a team and cffls April 1, 2026 15:57

llint fix

07345ad

pratikspatil024 added 2 commits April 2, 2026 10:32

addressed comments and fix test, lint

a86db50

core/stateless: (fix unit test) fix NewWitness zeroing breaking witness manager hash matching

8d5ed1b

cffls reviewed Apr 8, 2026

View reviewed changes

pratikspatil024 changed the title ~~miner: pipelined state root computation (PoC)~~ miner, core, consensus/bor, eth, triedb: pipelined state root computation Apr 9, 2026

pratikspatil024 changed the title ~~miner, core, consensus/bor, eth, triedb: pipelined state root computation~~ miner, core, consensus/bor, eth, triedb: pipelined state root computation (PoC) Apr 9, 2026

pratikspatil024 added 2 commits April 10, 2026 09:37

tests/bor: add pipelined import SRC self-destruct integration test

b283227

Adds TestPipelinedImportSRC_SelfDestruct to verify that the FlatDiff Destructs check in getStateObject correctly handles self-destructed contracts during pipelined import.

cffls reviewed Apr 11, 2026

View reviewed changes

pratikspatil024 added 2 commits April 16, 2026 15:54

pratikspatil024 and others added 3 commits April 22, 2026 23:31

core: added metrics for preloadFlatDiffReads in pipelined SRC

946f137

pratikspatil024 added 4 commits May 7, 2026 09:55

pratikspatil024 added 5 commits May 13, 2026 15:58

Merge branch 'develop' of github.com:0xPolygon/bor into pipelined-src-bkp

732cc8b

# Conflicts:
# consensus/bor/bor.go
# miner/worker.go
# miner/worker_test.go

miner: preserve disabled pipeline behavior

c58155c

miner: snapshot pipeline eligibility for cleanup

1f9419a

pratikspatil024 changed the base branch from delay_src to develop May 20, 2026 12:03

pratikspatil024 added 6 commits May 20, 2026 20:35

core, state: merge pipelined SRC with BlockSTM v2

d4c7d04

Merge remote-tracking branch 'origin/blockstm_redesign' into pipelinedsrc-and-blockstmv2

a8faaaa

Merge branch 'pipelined-src' into pipelinedsrc-and-blockstmv2

52b0e3a

pratikspatil024 and others added 10 commits May 26, 2026 10:12

Merge branch 'blockstm_redesign' of github.com:0xPolygon/bor into pipelinedsrc-and-blockstmv2

7db144e

core: log pipelined import mode on V2 failures

ef692db

eth: mask pipelined head state sync checks

279437a

Merge remote-tracking branch 'origin/develop' into pipelinedsrc-and-blockstmv2

1c39714

Merge pipelinedsrc-and-blockstmv2 into pipelined-src

a21ee56

Integrates BlockSTM-v2 parallel EVM with pipelined-import-SRC after a clean multi-day devnode soak (2026-06-11 .. 2026-06-17, no bad blocks). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

pratikspatil024 force-pushed the pipelined-src branch from 08b8c62 to 8a20f42 Compare June 22, 2026 03:32

Conversation

pratikspatil024 commented Apr 1, 2026 • edited

Description

How it works - Miner (block production)

How it works - Import (block validation)

Config

Key changes beyond the miner pipeline

claude Bot left a comment

Claude Code Review

claude Bot commented Apr 1, 2026

Code Review

Bugs

Security Concerns

claude Bot commented Apr 1, 2026

Code Review

codecov Bot commented Apr 2, 2026 • edited

Codecov Report

lucca30 commented Apr 6, 2026

cffls commented Apr 7, 2026

claude Bot commented Apr 21, 2026

Code Review

claude Bot commented Apr 21, 2026

Inline Review Comments

sonarqubecloud Bot commented May 9, 2026

Quality Gate failed

sonarqubecloud Bot commented May 21, 2026

Quality Gate failed

sonarqubecloud Bot commented Jun 22, 2026
