Add multi build support #2480

Draft
xiaoyu-work wants to merge 19 commits into main from xiaoyu/builds-schema

@xiaoyu-work xiaoyu-work commented May 29, 2026

Collaborator

This pull request introduces a new, flexible "builds" workflow to the Olive engine, enabling multiple independent build pipelines within a single run configuration. It adds support for per-build defaults, validation, and selective execution on model components, primarily targeting composite models. The changes also include robust schema validation and improved modularity for configuring builds.

Key changes include:

Builds Workflow and Configuration:

  • Added a new builds field to RunConfig, allowing users to define multiple named build pipelines, each with its own pipeline, component selection, and system/evaluator overrides. A special _default key enables partial defaults to be merged into sibling builds. (olive/engine/config.py, olive/workflows/run/config.py, olive/workflows/run/run.py) [1] [2] [3]

  • Introduced BuildConfigPartial and BuildConfig schemas for partial and full build configurations, and a merge_build_default function for merging defaults into builds. (olive/engine/config.py)

Validation and Reference Resolution:

  • Added pre- and post-validation to RunConfig to ensure build defaults are correctly merged and that all build references (passes, systems, evaluators) resolve to known entries. (olive/workflows/run/config.py) [1] [2]
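
The reference check described above could look roughly like the following. This is a minimal sketch, not the validator in the PR: it assumes builds are plain dicts and that `passes`, `systems`, and `evaluators` are name-keyed dicts, and `find_unresolved_refs` is a hypothetical helper name.

```python
def find_unresolved_refs(builds, passes, systems, evaluators):
    """Return (build_name, field, ref) triples for references that do not resolve.

    Hypothetical helper for illustration; the real validator lives on RunConfig.
    """
    problems = []
    for name, build in builds.items():
        # Every pipeline entry must name a known pass.
        for pass_name in build.get("pipeline", []):
            if pass_name not in passes:
                problems.append((name, "pipeline", pass_name))
        # host/target are checked only when given as string references.
        for field in ("host", "target"):
            ref = build.get(field)
            if isinstance(ref, str) and ref not in systems:
                problems.append((name, field, ref))
        # Same rule for the evaluator reference.
        ref = build.get("evaluator")
        if isinstance(ref, str) and ref not in evaluators:
            problems.append((name, "evaluator", ref))
    return problems
```

Collecting every unresolved reference before failing, rather than raising on the first one, lets a single validation error list all the broken names at once.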

Component Selection for Composite Models:

  • Implemented select_components methods in both ModelConfig and CompositeModelHandler to allow builds to operate on specific named components of a composite model, returning either a single component or a sub-composite as needed. (olive/model/config/model_config.py, olive/model/handler/composite.py) [1] [2]
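
To illustrate the selection rule (one name returns the component itself, several names return a smaller composite), here is a minimal sketch. `Component` and `Composite` are hypothetical stand-ins, not Olive's `CompositeModelHandler`, and the error type is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Stand-in for a child model handler (hypothetical)."""
    name: str


@dataclass
class Composite:
    """Stand-in for a composite model: component name -> Component."""
    components: dict

    def select_components(self, names):
        # Reject unknown names up front, listing what is available.
        unknown = [n for n in names if n not in self.components]
        if unknown:
            raise ValueError(
                f"Unknown components {unknown}; available: {list(self.components)}"
            )
        # A single name unwraps to the child itself.
        if len(names) == 1:
            return self.components[names[0]]
        # Several names produce a sub-composite in the requested order.
        return Composite({n: self.components[n] for n in names})
```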

Workflow Execution Logic:

  • Refactored the main run logic to dispatch to a new _run_builds function when builds are present, running each build as an independent workflow with its own engine, pipeline, and input model slice. Includes helper functions for validation, engine config construction, and reference resolution. (olive/workflows/run/run.py) [1] [2]
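
The dispatch loop can be sketched as follows. This is an illustration under stated assumptions, not the PR's `_run_builds`: builds are plain dicts, and `run_one` is a hypothetical callable standing in for constructing an engine and calling its run method.

```python
def run_builds_sketch(builds, passes, run_one):
    """Run each build independently and collect outputs keyed by build name.

    `run_one(pipeline, output_dir)` is a hypothetical stand-in for the
    per-build engine invocation.
    """
    outputs = {}
    for name, build in builds.items():
        # Keep only the passes this build names, in the order its pipeline declares.
        pipeline = {p: passes[p] for p in build["pipeline"]}
        outputs[name] = run_one(pipeline, build.get("output_dir"))
    return outputs
```

Because the result is a dict rather than a single output, callers that previously expected one workflow output need to branch on the return type.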

Together, these changes let a single Olive run configuration drive several independent pipelines, each targeting its own components of a multi-component model.

xiaoyu-work and others added 2 commits May 28, 2026 15:27
Introduce a top-level builds section on RunConfig that lets users declare
multiple independent execution units (pipelines x devices x components) in one
workflow config.

* Add BuildConfigPartial / BuildConfig and a merge_build_default helper in
  olive/engine/config.py. _default lives inside builds as a sentinel key
  whose partial fields are merged into every sibling build with full-replace
  semantics (lists are not deep-merged).
* Add builds: dict[str, BuildConfig] to RunConfig with an
  expand_build_defaults before-validator that pops _default and merges it
  into siblings, plus a validate_builds_references after-validator that
  checks pipeline/host/target/evaluator string refs resolve to known entries.
* Schema-only change: the engine runner does not yet act on builds. Existing
  workflows without builds keep their current behavior.
* Add 8 unit tests in test/workflows/test_run_config_builds.py covering the
  merge, override, full-replace, missing-field, invalid-ref, absent-builds and
  empty-default cases.
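
A minimal sketch of the full-replace merge described above, assuming builds are plain dicts before validation. The function names carry a `_sketch` suffix because they are illustrative, not the helpers added in olive/engine/config.py.

```python
def merge_default_sketch(default, build):
    """Shallow merge: any field set on the build replaces the default outright.

    Lists are replaced as a whole, never concatenated or deep-merged.
    """
    merged = dict(default)
    merged.update(build)
    return merged


def expand_defaults_sketch(builds):
    """Pop the `_default` sentinel and merge it into every sibling build."""
    builds = dict(builds)  # copy so the caller's dict is not mutated
    default = builds.pop("_default", None) or {}
    return {name: merge_default_sketch(default, b) for name, b in builds.items()}
```

For example, a `_default` with `pipeline: ["convert", "quantize"]` and a build that sets `pipeline: ["convert"]` yields `["convert"]` for that build, not a combined list.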

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Execute the builds schema added in Phase 1.

* Add CompositeModelHandler.select_components(names) that returns the
  unwrapped child handler when one name is given and a sliced
  CompositeModelHandler otherwise. Unknown names raise a clear error.
* Add ModelConfig.select_components(names) so the runner can slice a
  composite input config without materializing the full handler.
* Add a builds-aware execution branch in olive/workflows/run/run.py. When
  builds is non-empty, the runner: validates components against the
  composite input model, then loops over builds. For each build it builds a
  per-build engine config (host/target/evaluator/search_strategy overrides
  resolved against systems/evaluators), a per-build pipeline subset from
  passes in the order declared by pipeline, the per-build accelerator
  spec, and calls engine.run with build.output_dir. Returns
  dict[build_name -> WorkflowOutput]. The no-builds path is unchanged and
  still returns a single WorkflowOutput.
* Tests:
  - 7 new composite handler / ModelConfig select_components cases in
    test/model/test_composite_model.py.
  - 7 new runner smoke tests in test/workflows/test_run_builds.py with
    mocked Engine.run covering: no-builds backward compat, multi-build
    dispatch, pipeline-subset ordering, per-build output_dir, host/target
    override, non-composite + components error, unknown component error.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@xiaoyu-work xiaoyu-work changed the title Xiaoyu/builds schema Add multi build support May 29, 2026
xiaoyu-work and others added 14 commits June 2, 2026 12:56
…us HfModel

Support the two component-discovery paths from the multi-component design:

- Flow A Option 2 (two steps): load a Mobius export directory as a
  CompositeModel, using per-component subfolder names as component names.
  Adds discover_onnx_components() and directory auto-discovery in
  CompositeModelHandler and ModelConfig.get_components/select_components.
- Flow B (optimize then export): resolve an HfModel's components by querying
  Mobius (olive/common/mobius_utils.inspect_components, lazy import).
  HfModel.get_components returns Mobius component names; select_components
  tags the chosen component's submodule path in model_attributes for
  PyTorch-stage per-component passes.

No 'input' build dependency is used. Build component validation updated for
both sources.
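
Directory auto-discovery of this kind can be sketched as below. This is not the PR's discover_onnx_components(): the rule that a subfolder counts as a component only if it contains at least one `.onnx` file is an assumption, and `discover_components_sketch` is a hypothetical name.

```python
from pathlib import Path


def discover_components_sketch(export_dir):
    """Map each component subfolder name to an ONNX file found inside it.

    Assumption: a subfolder is a component only if it holds a *.onnx file.
    """
    components = {}
    for sub in sorted(Path(export_dir).iterdir()):
        if not sub.is_dir():
            continue
        onnx_files = sorted(sub.glob("*.onnx"))
        if onnx_files:
            # The subfolder name doubles as the component name.
            components[sub.name] = onnx_files[0]
    return components
```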

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…exports

When mobius exports a multi-component model, log each component's name and
its ONNX file path so the export layout is visible in the run log.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
mobius.inspect_components returns frozen ComponentInfo dataclasses, but Olive's coercion only handled its own ComponentInfo or a plain dict and crashed calling .get() on a mobius object. Broaden ComponentInfo.coerce (renamed from from_dict) to also accept duck-typed objects exposing name/kind/source_path. Add test/common/test_mobius_utils.py covering the object, dict, passthrough, and missing-mobius paths that the existing inspect_components mocks never exercised.
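
The broadened coercion can be illustrated with a small sketch. The dataclass below is a stand-in: only the `name`/`kind`/`source_path` attributes come from the commit message, while the field types, defaults, and error type are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ComponentInfo:
    """Illustrative stand-in; field types and defaults are assumed."""
    name: str
    kind: Optional[str] = None
    source_path: Optional[str] = None

    @classmethod
    def coerce(cls, value):
        # Already the right type: pass through unchanged.
        if isinstance(value, cls):
            return value
        # Plain dict: read keys, tolerating missing optional ones.
        if isinstance(value, dict):
            return cls(
                name=value["name"],
                kind=value.get("kind"),
                source_path=value.get("source_path"),
            )
        # Duck-typed object (for example a frozen dataclass from another library).
        if hasattr(value, "name"):
            return cls(
                name=value.name,
                kind=getattr(value, "kind", None),
                source_path=getattr(value, "source_path", None),
            )
        raise TypeError(f"Cannot coerce {type(value).__name__} to ComponentInfo")
```

Checking `dict` before falling back to attribute access is what avoids calling `.get()` on a foreign object.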

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The autouse _stub_mobius_module fixture guarded on whether mobius was already in sys.modules, which is False when real mobius is installed but not yet imported at fixture setup. It then injected a non-package stub that shadowed real mobius, breaking test_write_genai_config_requires_real_mobius (imports mobius.integrations). This was masked in CI where mobius is absent and the test is skipped. Guard on _HAS_REAL_MOBIUS instead so the fixture is a true no-op when mobius is installed. Also drop the log-capture assertion from test_multi_component_returns_composite_handler (logging stays in the pass).
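
The difference between "already imported" and "installed" is the crux of this fix. A minimal sketch of the two checks follows; how `_HAS_REAL_MOBIUS` is actually computed in the test suite is not shown in this PR text, so using `importlib.util.find_spec` here is an assumption.

```python
import importlib.util
import sys


def is_imported(name):
    """True only if the module has already been imported in this process."""
    return name in sys.modules


def is_installed(name):
    """True if a top-level module can be found, whether or not it is imported yet."""
    return importlib.util.find_spec(name) is not None
```

A fixture that injects a stub when `is_imported` is false will shadow a real package that simply has not been imported yet; guarding on an installed-style check avoids that.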

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
builds.components could only target CompositeModel (directory) and HfModel (via mobius); a DiffusersModel input fell through to get_components()/select_components() returning None, so the design's per-component diffusion example failed validation. Add an optional components filter to DiffusersModelHandler (restricts get_exportable_components to a subset in canonical variant order) and resolve/select diffusion components in ModelConfig. select_components scopes the handler so each build's conversion emits just that component's ONNX, with later passes auto-mapping over the single-component composite. Variant detection only reads config files, so resolution stays cheap at validation time.
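
Restricting to a subset while keeping canonical order can be sketched as below. `filter_components_sketch` is a hypothetical helper, and the component names in the usage note are illustrative rather than the handler's actual list.

```python
def filter_components_sketch(canonical, requested=None):
    """Return the requested components in canonical order.

    `canonical` is the full ordered list for the model variant; `requested`
    is an optional subset given in any order.
    """
    if requested is None:
        return list(canonical)
    unknown = sorted(set(requested) - set(canonical))
    if unknown:
        raise ValueError(f"Unknown components: {unknown}")
    wanted = set(requested)
    # Iterate the canonical list so the output order never depends on the request.
    return [c for c in canonical if c in wanted]
```

With a canonical list of `["text_encoder", "transformer", "vae_encoder", "vae_decoder"]`, requesting `["vae_decoder", "transformer"]` yields `["transformer", "vae_decoder"]`.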

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two runnable Flow A recipes plus a README documenting the two-step flow (CLI Mobius export, then an Olive config whose builds optimize each exported component): SD3 (transformer + VAE encoder/decoder; Mobius skips the CLIP/T5 text encoders) and Qwen3-VL-2B-Instruct (decoder/vision_encoder/embedding, the exact names Mobius produces). Both validate against RunConfig.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@xiaoyu-work

Collaborator Author

@copilot resolve the merge conflicts in this pull request

…i/base.py and passes/onnx/mobius_model_builder.py

Copilot AI commented Jun 24, 2026

Contributor

@copilot resolve the merge conflicts in this pull request

Merge conflicts resolved in commit 97da0323. Two files had conflicts:

  • olive/cli/base.py: Combined the builds-dict output handling (from this branch) with the save_discrepancy_check_results call (from main). The discrepancy check is now skipped when workflow_output is a dict (i.e., a builds run), since that function only handles a single WorkflowOutput.

  • olive/passes/onnx/mobius_model_builder.py: Kept the component_paths.append logging added in this branch, and adopted main's improvements: the component_additional_files variable rename with its descriptive comment, and the shared "additional_files" key on the composite handler to carry genai sidecar files to the package root.

Now that mobius exports text encoders (CLIP-L, CLIP-G, T5-XXL),
update the inference script to use ONNX Runtime for all components
instead of PyTorch text encoders and VAE.

Only tokenizers and the scheduler (pure logic, no neural network)
remain as Python dependencies.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@github-advanced-security github-advanced-security AI left a comment

Contributor

lintrunner found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

3 participants