perf(cubestore): early compaction split by waralexrom · Pull Request #11079 · cube-js/cube

waralexrom · 2026-06-13T02:45:05Z

Check List

Tests have been run in packages where changes have been made if available
Linter has been run for changed code
Tests for the changes have been added if not covered yet
Docs have been added / updated if required

…multiplier, size-aware split - Import: cap a CSV data frame by estimated Arrow buffer size in addition to row count (CUBESTORE_WAL_SPLIT_SIZE_THRESHOLD, default 100MB) so wide rows bound peak ingest memory instead of accumulating 524K-row frames. - Compaction: CUBESTORE_COMPACTION_CHUNKS_THRESHOLD_MULTIPLIER (f64, default 3.0) scales the persisted-chunk count and total-size thresholds, letting more chunks accumulate and merge in one streaming k-way pass. - Compaction split now sizes children by the bytes actually written (existing main table + pending chunk bytes) so a partition splits by file size in a single pass instead of under-splitting and re-splitting. Ignores 4 tests whose partition-count assertions assume the old lagged split-by-file-size semantics (FIXME(early-split)).

claude · 2026-06-13T02:45:26Z

Claude finished @waralexrom's task in 1m 7s —— View job

Review: perf(cubestore): early compaction split

1 high, 2 medium, 2 low

Full review (click to expand)

Summary

The core change has three parts:

compaction.rs — sizing the partition split by partition_file_size + chunks_total_file_size so a partition whose chunks dwarf the main table splits in one pass instead of under-splitting and re-splitting.
config/mod.rs — a new compaction_chunks_threshold_multiplier (default 3.0) applied to the count + total-size compaction thresholds, deferring compaction so it runs on larger inputs that can be sized correctly.
import/mod.rs — a byte cap (CUBESTORE_WAL_SPLIT_SIZE_THRESHOLD, default 100 MiB) on WAL frames during ingestion, in addition to the existing row count threshold.

The shape is reasonable and the eager-split-by-bytes logic in compaction.rs:586-598 is a sensible fix. A few concerns below.

High — three tests left `#[ignore]` with "FIXME needs updating"

rust/cubestore/cubestore/src/sql/mod.rs:3313 (over_10k_join), :4047 (inactive_partitions_cleanup), and rust/cubestore/cubestore/src/store/compaction.rs:2402 (partition_compaction_decimal96), :2460 (partition_split_by_file_size) are all disabled with identical comments stating the assertions are tuned to the old lagged semantics and "need updating."

These are exactly the tests that validate the behavior this PR changes — splitting and partition counts. Shipping a perf change to the split path while silencing the tests that exercise it leaves no executed coverage of the new semantics. Please update the assertions to the new expected counts in this PR rather than deferring. If the test expectations are non-trivial to recompute, that itself is signal that the new behavior deserves a fresh, explicit test of the split-by-pending-bytes path (e.g. small/empty main table + large pending chunks → split count expected).

Medium — unrelated `autoRunMode` env var

packages/cubejs-backend-shared/src/env.ts:713-720 adds CUBEJS_AUTO_RUN_MODE ("Whether queries are executed automatically… Only used in Cube Cloud"). This has no apparent relation to "early compaction split" and isn't read anywhere in this diff. Either it's leftover from another branch and should be dropped, or it belongs in its own PR — bundling it makes the change history harder to bisect and review.

Medium — default multiplier of 3.0 is a behavior change for existing deployments

rust/cubestore/cubestore/src/config/mod.rs:1345-1348 defaults compaction_chunks_threshold_multiplier to 3.0, which triples both compaction_chunks_total_size_threshold (8 MiB → 24 MiB) and compaction_chunks_count_threshold (4 → 12) for every existing installation on upgrade. Compaction will fire later and on larger inputs, which is the intent, but this is a meaningful operational change: more memory pressure during compaction, longer chunk-list scans, more pending data at crash time.

At minimum this should be called out in the PR description and changelog. Consider whether 1.0 is a safer default and the 3.0 value should be opt-in per deployment until validated under production load.

Low — multiplier can floor count threshold to zero

config/mod.rs:715-719: (compaction_chunks_count_threshold as f64 * multiplier).floor() as u64 can produce 0 for small multipliers (e.g. 0.1 * 4 = 0.4 → 0). Not a real risk at the default, but a .max(1) would make the getter robust to any operator-supplied value.

Low — Float32 path: `as f64` cast precision

query_executor.rs:2181-2190: (a.value(i) as f64).into() is fine; just noting that f32 → f64 is exact so there's no precision loss to worry about. The added test at :2574 covers it. Nice addition, including the Float16/Float32/Float64 → ColumnType::Float test at :2570-2576.

Nits

import/mod.rs:90-101 — the estimate_arrow_value_size accounting (4 bytes per offset for varlen, 16 for Int96/Decimal*, 1 for bool) looks correct and the "exactness is unnecessary for a threshold" comment is the right framing.
compaction.rs:586-598 — the unwrap_or(0) + chunks_total_file_size form is clean; the comment explains why well.

· [Branch](https://github.com/cube-js/cube/tree/cubestore-early-split)

claude · 2026-06-13T02:45:57Z

Claude finished @waralexrom's task in 1m 32s —— View job

Claude Code Review

Verdict: Approach is sound (size-aware import split + eager compaction split by total bytes), but landing it disables four behavioral tests and silently rescales two existing env vars. Both should be addressed before merge. 1 high, 1 medium, 2 low.

Details

Tasks

Read PR diff
Review Rust changes (cubestore)
Review JS env config changes (none — only Rust env parsing)
Check for existing review threads (none)
Post review

High

sql/mod.rs:3314, compaction.rs:2403/2460 — Four tests #[ignore]d with FIXME ("assertions tuned to the old lagged semantics and need updating"). These are precisely the tests that cover splitting/compaction behavior this PR changes. Update them, or at minimum file a tracked follow-up and reference it in the ignore reason.

Medium

config/mod.rs:713–723 — compaction_chunks_threshold_multiplier (default 3.0) is folded into the accessors for both compaction_chunks_total_size_threshold() and compaction_chunks_count_threshold(). Existing deployments that tuned CUBESTORE_CHUNKS_TOTAL_SIZE_THRESHOLD / CUBESTORE_CHUNKS_COUNT_THRESHOLD will see their configured values silently 3×'d on upgrade. Either default the multiplier to 1.0, apply it only at the call sites that need the widening, or document the rescale.

Low

import/mod.rs:794–807 — estimated_bytes accumulates inside the visit_line closure regardless of is_data_row, while num_rows only counts data rows. Cosmetic given the 100MB threshold, but inconsistent.
compaction.rs:590–599 — chunks_total_file_size is capped by the in-memory size break at line 537, so on partitions with very large pending backlogs the split still reflects only the bytes in this pass, not the eventual total. The comment reads like the full pending set is counted; worth a one-line clarification.

Nits / observations

estimate_arrow_value_size: + 4 for variable-length offset is fine as an approximation; null strings still contribute 4 bytes/row, which is harmless at scale. Boolean = 1 is conservative (Arrow packs to bits). Decimal96 = 16 is slightly over (i96 = 12) — fine for a threshold.
wal_split_size_threshold_bytes uses env_parse_size (accepts 100M etc.) — good UX, matches sibling size knobs.
Comment at compaction.rs:586–589 clearly explains the under-splitting failure mode this fixes; nice.

· branch: `cubestore-early-split`

claude · 2026-06-13T02:47:48Z

    }

    fn compaction_chunks_total_size_threshold(&self) -> u64 {
-        self.compaction_chunks_total_size_threshold
+        (self.compaction_chunks_total_size_threshold as f64
+            * self.compaction_chunks_threshold_multiplier)
+            .floor() as u64
    }

    fn compaction_chunks_count_threshold(&self) -> u64 {
-        self.compaction_chunks_count_threshold
+        (self.compaction_chunks_count_threshold as f64
+            * self.compaction_chunks_threshold_multiplier)


Multiplier silently rescales two pre-existing env vars.

The multiplier (default 3.0) is applied inside the accessors for both compaction_chunks_total_size_threshold() and compaction_chunks_count_threshold(). That means every existing deployment that has tuned CUBESTORE_CHUNKS_TOTAL_SIZE_THRESHOLD or CUBESTORE_CHUNKS_COUNT_THRESHOLD will now see those values multiplied by 3 with no opt-in — a 4-chunk count threshold becomes 12, an 8MB size threshold becomes 24MB. That's a meaningful behavioral change for an upgrade.

A few options worth considering:

Apply the multiplier only at the call sites that need the early-split widening, leaving the trait getters returning the configured value verbatim.

Default the multiplier to 1.0 and let operators opt in.

At minimum, document this in the changelog / config docs so the rescale isn't a surprise.

Also worth a comment near the field explaining what the multiplier is widening and why — the name reads like a simple knob, but it changes the contract of two other knobs.

claude · 2026-06-13T02:47:49Z

            let line = line?;
            let is_data_row = parser.visit_line(line.as_str(), |insert_pos, column, value| {
                let builder = builders[insert_pos].as_mut();
+                estimated_bytes +=
+                    ImportFormat::estimate_arrow_value_size(column.get_column_type(), value);
                match value {
                    None => {
                        append_value(builder, column.get_column_type(), &TableValue::Null);


Minor: estimated_bytes accumulates on non-data rows.

The visit_line closure increments estimated_bytes unconditionally, but num_rows += 1 is gated on is_data_row. If headers / non-data lines ever invoke the per-column callback, their bytes silently leak into the running total and never get reset on continue. In practice the leak is tiny relative to a 100MB threshold so this is cosmetic, but it would be cleaner to either accumulate into a per-line scratch and commit it only when is_data_row, or reset both counters in the same place.

claude · 2026-06-13T02:47:50Z

+            let new_partitions_count_by_file_size = {
+                let total_file_size =
+                    partition.get_row().file_size().unwrap_or(0) + chunks_total_file_size;
+                if total_file_size > 0 {
                    let threshold = self.config.partition_size_split_threshold_bytes();
-                    (div_ceil(partition_file_size, threshold) as usize).min(16)
+                    (div_ceil(total_file_size, threshold) as usize).min(16)
                } else {
                    1
-                };
+                }
+            };


Note that chunks_total_file_size here is capped by the in-memory size threshold break above (line 537) — if pending chunks exceed that threshold, only the chunks taken into this compaction pass are counted, not the full pending set. That's likely the intended behavior (we split by what we're actually writing), and matches the comment, but it does mean a partition with a huge backlog of pending chunks will still under-split on any single pass relative to the eventual total. Worth a one-line note acknowledging this, since the comment currently implies the split reflects "the bytes actually being written" without flagging the cap.

…t budget wait Greppable by "csv-import-timing": logs the moment a CREATE TABLE request is received, and the CSV import job start both before and after the rate-limiter budget wait (with wait_ms), so the budget-wait gap is readable without trace.

claude · 2026-06-13T03:04:46Z

Claude finished @waralexrom's task in 50s —— View job

Review: latest push (info-level timing logs)

Verdict: New commit is a small diagnostic-logging-only change. No new substantive findings. The four prior unresolved review threads still apply to the current diff.

Details

What's new since the last review

Commit 7c17003 only adds two info! log sites tagged [csv-import-timing]:

sql/mod.rs:920 — logs every CREATE TABLE request received.
cluster/ingestion/job_runner.rs:318,322 — logs before and after the rate-limiter wait_for_allow call on CSV import jobs, with wait_ms.

The substantive logic of the PR (size-aware split, threshold multiplier, WAL byte cap) is unchanged.

Observations on the logging

Info-level on the CREATE TABLE hot path. [csv-import-timing] fires once per CREATE TABLE and twice per CSV import job at info-level. Greppability is fine, but on busy clusters this becomes persistent log volume for what looks like a temporary diagnostic. Consider debug! or gating behind a config flag if these are meant to stay; if they are temporary, a TODO/issue link in the message would make their lifecycle obvious.
Mixed macro styles. sql/mod.rs:920 uses log::info! while job_runner.rs:316 uses bare info! — minor inconsistency, both work but the file already imports one or the other; pick whichever matches the surrounding module.

Prior threads still applicable (not re-posting)

All four open Claude threads remain unresolved and the underlying code is unchanged:

config/mod.rs:723 — multiplier silently 3×'s CUBESTORE_CHUNKS_TOTAL_SIZE_THRESHOLD / CUBESTORE_CHUNKS_COUNT_THRESHOLD for existing deployments.
sql/mod.rs:3318 — four #[ignore]d tests covering the behavior this PR changes.
import/mod.rs:807 — estimated_bytes accumulates on non-data rows.
compaction.rs:599 — chunks_total_file_size is capped by the in-memory size break upstream; comment should flag this.

Re-affirming the four prior threads rather than posting duplicates.

· · Branch

… logs Inline DefaultHasher loc_hash on every csv-import-timing line (create_table loc_hashes, import job before/after budget wait) plus job completed/failed lines, so a location ties create_table to its import job lifecycle.

claude · 2026-06-13T03:10:29Z

Claude finished @waralexrom's task in 43s —— View job

Review: latest push (`bafd592` — csv-import-timing log enrichment)

Verdict: Diagnostic-logging-only change. No new substantive findings. Four prior unresolved review threads still apply to the current diff.

Details

Diff prior commit (7c17003..HEAD) reviewed
Existing review threads checked for duplicates
Logging-only change confirmed — substantive PR logic unchanged

What's new since the last review

bafd592 adds two enrichments to the existing [csv-import-timing] log lines, no logic change:

sql/mod.rs:920–931 — create_table log now emits loc_hashes=[…] (DefaultHasher-formatted hex of each location string) so the same load can be correlated across the create_table site and the per-location job sites.
cluster/ingestion/job_runner.rs:312–317 — computes a loc_hash once per job spawn from the location string.
job_runner.rs:325, 333 — start + budget-acquired logs now include loc_hash=… alongside table_id.
job_runner.rs:351–354, 359–362 — new import job completed (with data_loaded) and import job failed (with err) info-level lines, also tagged with loc_hash.

Observations on the new logging

Stable correlation key — good. DefaultHasher is non-cryptographic but deterministic within a single process run, which is all that's needed to join a create_table line to its later per-job lines. Note that hashes are NOT comparable across processes (Rust's RandomState reseeds), so cross-cluster log joins won't work — fine for in-process diagnostics, just be aware.
Failure path now logged at info-level. The new import job failed line at :359 is helpful — previously a job error returned from this branch with no [csv-import-timing] line, which made failures invisible to anyone grepping the timing tag. Good addition.
Same caveats as the prior logging commit still apply. These are info-level on the CSV import hot path and tagged as transient diagnostic markers. If they're meant to stay, consider debug! or a config gate; if temporary, a TODO/issue link in the message would make their lifecycle obvious. (Re-flagging rather than re-posting an inline thread.)
Hasher import inside the closure is fine but cluttered. use std::hash::{Hash, Hasher}; inside the closure body works; if these logs become permanent, a small location_hash(&str) -> String helper near the top of the module would dedupe the seven-line idiom that now appears in two files.

Prior threads still applicable (not re-posting)

All four open Claude threads remain unresolved and the underlying code is unchanged in this push:

config/mod.rs:723 — multiplier silently 3×'s CUBESTORE_CHUNKS_TOTAL_SIZE_THRESHOLD / CUBESTORE_CHUNKS_COUNT_THRESHOLD for existing deployments.
sql/mod.rs:3327 — four #[ignore]d tests covering the behavior this PR changes.
import/mod.rs:807 — estimated_bytes accumulates on non-data rows.
compaction.rs:599 — chunks_total_file_size is capped by the in-memory size break upstream; comment should flag this.

Re-affirming the four prior threads rather than posting duplicates.

· Branch ·

codecov · 2026-06-13T03:35:00Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.50%. Comparing base (f6b1f71) to head (97e756a).

Additional details and impacted files

@@                        Coverage Diff                        @@
##           cubestore-compaction-improvement   #11079   +/-   ##
=================================================================
  Coverage                             58.50%   58.50%           
=================================================================
  Files                                   216      216           
  Lines                                 17269    17269           
  Branches                               3524     3524           
=================================================================
  Hits                                  10103    10103           
  Misses                                 6651     6651           
  Partials                                515      515

Flag	Coverage Δ
cube-backend	`58.50% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…it tests Removes the temporary [csv-import-timing] info logs and re-enables the four tests ignored after the size-aware split change: - partition_compaction_decimal96, inactive_partitions_cleanup: pin partition_size_split_threshold_bytes high so the split stays row-based (these cover decimal handling / inactive-file GC, not size split). - over_10k_join: raise max_joined_partitions above the now-larger partition count (the test asserts join correctness, not the cap). - partition_split_by_file_size: assert the first compaction already splits (eager split-by-file-size), keeping the per-partition size invariant.

claude · 2026-06-13T15:07:25Z

Claude finished @waralexrom's task in 51s —— View job

Review: latest push (`97e756a` — drop timing logs + un-ignore tests)

Verdict: Strong push. Resolves the prior High finding (four #[ignore]d tests now re-enabled with updated assertions/config) and drops all [csv-import-timing] info-level diagnostic logs. Three prior threads still apply (multiplier rescale, non-data-row byte accounting, compaction comment). No new findings.

Details

What's new since the last review

97e756a does two things, no substantive logic change to the perf path:

1. Un-ignores the four behavioral tests that earlier commits had disabled, and adjusts them to the new eager-split semantics rather than masking them:

sql/mod.rs:3313 over_10k_join — lifts max_joined_partitions from 10 → 50 with an inline comment that the test asserts join correctness, not the partition cap. Reasonable: the right table now fragments further because of size-aware split.
sql/mod.rs:4047 inactive_partitions_cleanup — sets partition_size_split_threshold_bytes = 1 MiB so a single active partition remains; comment says it covers inactive-file GC, not size split. Reasonable scoping.
compaction.rs:2403 partition_compaction_decimal96 — same partition_size_split_threshold_bytes = 1 MiB knob; keeps split row-based to isolate decimal-handling coverage.
compaction.rs:2460 partition_split_by_file_size — assertion flipped from partitions.len() == 1 to partitions.len() > 1, which is exactly what the new "eager split on first compaction when pending chunks exceed the size threshold" behavior should produce. Comment explains why.

This addresses the prior High thread directly. The previously-outdated Four tests disabled, not updated. thread is now resolved.

2. Drops the [csv-import-timing] diagnostic logs added in 7c17003/bafd592:

sql/mod.rs:917-930 — create_table log + loc_hash collection removed.
cluster/ingestion/job_runner.rs:312-360 — start/budget-acquired/completed/failed log lines and the DefaultHasher block removed.

This addresses the "info-level on the hot path" caveat I raised on the prior two pushes — good outcome.

Prior threads still applicable (not re-posting)

The substantive perf code (config/mod.rs multiplier, import/mod.rs byte cap, compaction.rs size-aware split) is unchanged in this push, so:

config/mod.rs:723 — multiplier (default 3.0) silently 3×'s CUBESTORE_CHUNKS_TOTAL_SIZE_THRESHOLD / CUBESTORE_CHUNKS_COUNT_THRESHOLD for existing deployments.
import/mod.rs:807 — estimated_bytes accumulates on non-data rows.
compaction.rs:599 — chunks_total_file_size is capped by the in-memory size break upstream; the comment doesn't flag this.

Re-affirming these three rather than posting duplicates.

· Branch ·

waralexrom requested review from a team and keydunov as code owners June 13, 2026 02:45

github-actions Bot added cube store Issues relating to Cube Store rust Pull requests that update Rust code javascript Pull requests that update Javascript code labels Jun 13, 2026

waralexrom force-pushed the cubestore-early-split branch from 281cc13 to 11eb290 Compare June 13, 2026 02:45

claude Bot reviewed Jun 13, 2026

View reviewed changes

Comment thread rust/cubestore/cubestore/src/sql/mod.rs Outdated

claude Bot reviewed Jun 13, 2026

View reviewed changes

vercel Bot deployed to Preview June 13, 2026 02:48 View deployment

Conversation

waralexrom commented Jun 13, 2026

Uh oh!

claude Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review: perf(cubestore): early compaction split

Summary

High — three tests left #[ignore] with "FIXME needs updating"

Medium — unrelated autoRunMode env var

Medium — default multiplier of 3.0 is a behavior change for existing deployments

Low — multiplier can floor count threshold to zero

Low — Float32 path: as f64 cast precision

Nits

Uh oh!

claude Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Claude Code Review

Tasks

High

Medium

Low

Nits / observations

Uh oh!

claude Bot Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

claude Bot Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review: latest push (info-level timing logs)

What's new since the last review

Observations on the logging

Prior threads still applicable (not re-posting)

Uh oh!

claude Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review: latest push (bafd592 — csv-import-timing log enrichment)

What's new since the last review

Observations on the new logging

Prior threads still applicable (not re-posting)

Uh oh!

codecov Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

claude Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review: latest push (97e756a — drop timing logs + un-ignore tests)

What's new since the last review

Prior threads still applicable (not re-posting)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented Jun 13, 2026 •

edited

Loading

High — three tests left `#[ignore]` with "FIXME needs updating"

Medium — unrelated `autoRunMode` env var

Low — Float32 path: `as f64` cast precision

claude Bot commented Jun 13, 2026 •

edited

Loading

claude Bot commented Jun 13, 2026 •

edited

Loading

claude Bot commented Jun 13, 2026 •

edited

Loading

Review: latest push (`bafd592` — csv-import-timing log enrichment)

codecov Bot commented Jun 13, 2026 •

edited

Loading

claude Bot commented Jun 13, 2026 •

edited

Loading

Review: latest push (`97e756a` — drop timing logs + un-ignore tests)