Extend mypy type-checking across the foundry shared layer and rfd3 by lyskov-ai · Pull Request #327 · RosettaCommons/foundry

lyskov-ai · 2026-06-18T20:30:26Z

Extends static type-checking (mypy) coverage and adds unit tests, in two areas.

Shared layer (src/foundry, src/foundry_cli). Brings the remaining modules under strict checking (no untyped function definitions; function bodies are checked), and replaces the per-module opt-in list with foundry.* / foundry_cli.* wildcards so new modules are strict by default. Adds unit tests for the helpers this newly exercises (trainers, XPU plugins, CLI, callbacks, metrics, schedulers, common utils, …). Also removes a dead, import-broken module (rf3.data.paired_msa) and fixes a latent fallback-weights bug in utils/datasets — it now uses the computed sampler weights instead of silently-uniform ones (relevant to mpnn training, which is cluster-only and unverified in CI).

rfd3 model. Brings a large set of rfd3 modules into type-checking by removing their per-module exemptions and fixing the annotations. Most fixes are mechanical (Optional defaults, container/tuple types, narrowing), but honest typing surfaced several real latent bugs, fixed here:

an undefined logger name (NameError) on several log lines in utils/inference;
an h-bond helper that crashed on the documented None ("whole stack") selection;
two incorrect annotations — encode_residues_to: int → str (it sets a residue-name string), and residue_cache_dir: bool → a path type.

Adds CPU unit tests for pure rfd3 helpers (utils/io, trainer/trainer_utils).

Two rfd3 modules (engine, utils/vizualize) are intentionally left exempt: their type errors reflect genuine pre-existing bugs (a broken import to a function that no longer exists; .split called on a value that is already a list) that deserve dedicated fixes rather than being suppressed.

Behaviour is unchanged for working code paths — the bug fixes only affect cases that previously crashed.

instantiators is already fully annotated, so this adds it to the per-module strict mypy override (a lock-in for future defs) and fills the test gap with tests/test_instantiators.py covering the callback/logger instantiation control flow. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

…st logging Adds the three remaining small foundry.utils modules to the per-module strict mypy override. squashfs was already fully annotated (pure lock-in); ddp needed one annotation (RankedLogger.log varargs); logging needed five. New tests/test_logging.py covers the two pure helpers (CachedDataFilter.filter and condense_count_columns_of_grouped_df). Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

rf3.data.paired_msa no longer imports against the installed atomworks: MultiInputDatasetWrapper subclasses StructuralDatasetWrapper, which atomworks turned into a deprecated factory function, so subclassing it raises TypeError at import. The module was reachable only through domain_distillation.yaml, which is itself referenced only in commented-out lines of pdb_and_distillation.yaml, and its LoadPairedMSAs class was used nowhere. Remove the module, its orphaned domain_distillation.yaml config, the dangling commented references in pdb_and_distillation.yaml, and the stale pyproject mypy-exemption comment. rf3 now type-checks fully with no in-file suppressions. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

…+ tests In wrap_dataset_and_sampler_with_fallbacks the fallback-weights branch was gated on `"weights" in sampler_to_fallback_to`. Sampler defines no __contains__, so `in` iterates the sampler's integer indices and never matches the string — the documented "use the sampler's weights" branch was dead and the membership test needlessly drew samples. Switch to `hasattr(sampler_to_fallback_to, "weights")` (the idiomatic form atomworks itself uses); mypy then narrows the type so the `# type: ignore[attr-defined]` on `.weights` is dropped. Behaviour change: assemble_distributed_loader's reachable samplers (DistributedSampler / DistributedMixedSampler) have no `.weights`, so it stays uniform (behaviour-neutral). But models/mpnn/src/mpnn/train.py passes a WeightedRandomSampler built from computed PDB weights as the fallback sampler, so its training fallback sampling now uses those weights instead of silently-uniform ones — the function's documented intent. mpnn train.py is a cluster-coupled script (not in CI), so this is unverified here; flagged for mpnn owners to validate. Also add foundry.utils.datasets to the strict mypy override (already fully annotated — a pure lock-in) and add tests/test_datasets.py (18 tests, no prior coverage) covering the fallback-weights fix, the config-key sampler dispatch, and the distributed-loader sampler conversions + validation guards. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

…t mypy + tests Add foundry.training.{schedulers,EMA,checkpoint} to the per-module direction-(b) strict mypy override (disallow_untyped_defs/check_untyped_defs). schedulers was already fully annotated (pure lock-in); EMA and checkpoint get annotation-only type hints on their update/forward and decorator/wrapper signatures. Add unit tests for each module's real logic: the EMA update formula, frozen- param skip, verbatim buffer copy, training guard, and train/eval dispatch; the AF3 warmup/decay LR schedule and SchedulerConfig round-trip; and the activation- checkpointing wrappers (kwarg binding, both grad branches, gradient flow). Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

…on helpers Add foundry.common and foundry.constants to the per-module direction-(b) strict mypy override. No source change: common is already fully annotated (pure lock-in) and constants is data-only (the override is a forward-looking no-op there). version.py is VCS-generated/untracked and is not a target. Add tests/test_common.py covering the 10 previously-untested pure helpers, including the easy-to-miss edges: exists/default treat only None as absent, run_once fires exactly once, listmap consumes iterables, and ensure_dtype is a same-object no-op when the dtype already matches. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

Add foundry.metrics.metric and foundry.metrics.losses to the per-module direction-(b) strict mypy override. metric.py is already annotated (pure lock-in). losses.py gets honest annotations on Loss.__init__/forward and a documented cast(torch.Tensor, loss) for the int-zero accumulator (kept as int 0 so the running sum adopts the child losses' device/dtype via scalar promotion; a 0-d CPU tensor would break GPU training). Add tests for the MetricManager/Metric introspection machinery (tag filtering, result key-prefixing, compute_from_kwargs remapping + optional handling, tag-conflict validation, from_metrics) and the Loss aggregator forward (sum + dict merge, detached total_loss vs grad-carrying return). Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

Add the whole src/foundry/callbacks/ package (the BaseCallback base plus timing_logging, metrics_logging, train_logging, health_logging) to the direction-(b) disallow_untyped_defs/check_untyped_defs override and annotate the fallout. Hook annotation is mostly mechanical. Four hooks carry a documented # type: ignore[override]: subclasses override the base's positional-param hooks with **kwargs, which is safe because the trainer dispatches every hook by keyword (fabric.call(name, trainer=..., batch=..., ...)). check_untyped_defs also surfaced a real bug in LogAF3TrainingLossesCallback.start_time (None initialised, then used as a float) — typed float | None with an assert documenting that on_train_epoch_start always precedes on_train_epoch_end. Add tests/test_callbacks.py covering the one non-obvious CPU-portable helper, StoreValidationMetricsInDFCallback._load_and_concatenate_csvs (cross-rank CSV de-duplication keyed on example_id|dataset). Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

… + tests Add the small remaining src/foundry modules (inference_engines.base, inference_engines.checkpoint_registry, hydra.resolvers, model.layers.blocks) to the direction-(b) disallow_untyped_defs/check_untyped_defs override and annotate the fallout (annotation-only on source). check_untyped_defs surfaced one inference quirk in resolvers: a dict literal mixing two differently-signed functions joined to the bare 'function' type; annotating it dict[str, Callable[..., Any]] fixes it. Add tests/test_checkpoint_registry.py, tests/test_resolvers.py and tests/test_blocks.py for the previously-uncovered pure helpers (checkpoint dir search order, resolver attribute walking, FourierEmbedding/Dropout behaviour). Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

Add foundry_cli.download_checkpoints to the direction-(b) disallow_untyped_defs/check_untyped_defs override. Annotate the four typer commands with -> None and fix list_installed's total_size accumulator (int 0 -> 0.0); this also clears the long-standing download_checkpoints.py:201 annotation-unchecked note. Add tests/test_download_checkpoints.py covering the checkpoint-dir priority logic and the list-available / list-installed commands via typer's CliRunner. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

Add the Intel-XPU Lightning plugins (utils/xpu, already annotated -> pure lock-ins) and the testing/ pytest helpers to the direction-(b) disallow_untyped_defs/check_untyped_defs override. testing/ needed two annotations: gpu() -> bool and configure_pytest(config: pytest.Config, ...). Add tests/test_xpu.py (device-independent XPU accelerator/precision/strategy contracts, CPU-reachable only) and tests/test_testing_helpers.py (get_test_data_dir). This leaves trainers/fabric as the only shared-layer module not yet under the strict override. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

Add foundry.trainers.fabric to the direction-(b) disallow_untyped_defs/check_untyped_defs override. Only six return annotations (-> None) were missing; check_untyped_defs surfaced no real bugs since the file was already annotated for the baseline. Add tests/test_fabric_trainer.py for the static get_latest_checkpoint helper (missing/empty dir -> None, highest-epoch selection). This completes strict type-checking for the entire foundry shared layer. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

The direction-(b) per-module strict override (disallow_untyped_defs / check_untyped_defs) listed all 34 src/foundry + src/foundry_cli modules individually. That list is now exhaustive, so replace it with the wildcards foundry.* + foundry_cli.* — brief, and new modules are strict by default. Scoped via wildcards rather than the global [tool.mypy] switch because files also covers models/rf3 + models/rfd3, which are not strict-ready (rfd3 still carries an ignore_errors ratchet). The wildcard newly covers only the package __init__.py modules and the VCS-generated version.py, none of which define functions, so mypy stays green with no fallout. Config-only; no source or test changes. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

Clear the 8 type errors in rfd3.utils.io and drop it from the mypy ignore_errors ratchet (23 modules remain). The fixes mirror the near-identical rf3 utils/io sibling (cleared in an earlier task): - Path() wrap the to_cif_file path args (str -> PathLike). - Two documented cast(torch.Tensor, ...) on the weighted_rigid_align call; the trajectory_list param is loosely Tensor | np.ndarray but the body is tensor-only. Replace the always-true isinstance reassignment with a coords conditional (no Tensor -> ndarray reassignment). - Annotate find_files_with_extension's accumulator list[Path]. - Fix create_example_id_extractor's return type: it returns a closure, so -> str was wrong; -> Callable[[PathLike], str]. Add models/rfd3/tests/test_rfd3_io.py (15 tests) for the pure helpers: the example-id path/regex extractors, find_files_with_extension (incl. pinning the actual non-recursive behaviour), and the AtomArray-stack builder. CPU-only, no cluster data/GPU/checkpoints. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

Clear the 3 type errors in rfd3.trainer.trainer_utils and drop it from the mypy ignore_errors ratchet (22 modules remain). All three stem from one spot in _readout_seq_from_struc: `atom_names = np.array(atom_names)` reassigned the (restype, atom_names) loop variable (declared list[Any | None]) to an ndarray, which both broke the assignment type and left the later boolean-indexing reads typed as a non-iterable element. Bind the array to a distinct name (atom_names_arr); behaviour is identical since the original overwrote the loop variable before its only uses, and the ndarray form is required for the boolean mask indexing. Add models/rfd3/tests/test_rfd3_trainer_utils.py (8 tests) for the two pure helpers: _reorder_dict (fixed first/middle/last key ordering) and _remap_outputs (per-row coordinate scatter). Tensors/dicts only. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

Bundle three small rfd3 modules off the mypy ignore_errors ratchet (19 modules remain). All are implicit-Optional fixes (a None default on a non-Optional param), behaviour-preserving since None was already the default: - transforms/util_transforms: encode_residues_to int -> int | None - transforms/conditioning_base: additional set | list -> ... | None - trainer/dump_validation_structures: epoch str -> str | None, plus a type: ignore[override] on on_validation_batch_end for the Fabric keyword-dispatch hook (same pattern as the foundry callbacks). mypy-only; these modules are atomworks Transform/callback glue with no clean CPU-test targets. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

Clear the 2-error tier off the rfd3 mypy ignore_errors ratchet (17 remain): - transforms/training_conditions: the base TrainingCondition.name = None was inferred as type None, so subclasses' name = "ppi"/"subtype" were incompatible str assignments. Widen the base attr to name: str | None = None. - inference/symmetry/symmetry_utils: make_symmetric_atom_array reassigns its sym_conf (SymmetryConfig | dict) from the untyped check_symmetry_config, which widens it back to the union, so .is_symmetric_motif/.is_unsym_motif fail on the dict arm. Annotate convery_sym_conf_to_symmetry_config -> SymmetryConfig and add a documented assert isinstance re-narrow (annotating check_symmetry_config directly would be a circular import via checks.py). Behaviour-preserving. mypy-only; both modules are atomworks/AtomArray-coupled glue. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

Clear 12 rfd3 modules off the mypy ignore_errors ratchet (17 -> 5): dna_crop, hbonds, ppi_transforms, utils/inference, virtual_atoms, model/inference_sampler, trainer/rfd3, inference/datasets, transforms/design_transforms, model/RFD3, transforms/pipelines, run_inference. Mostly mechanical implicit-Optional / union / tuple / list-item / var-annotation fixes, plus documented casts/asserts/# type: ignore mirroring the rf3 0021/0024 precedents (DictConfig ** unpack, validation_step/run override, n_recycles_train cast, OmegaConf.to_container cast). Honest typing also surfaced several real latent bugs, fixed here: a logger -> ranked_logger NameError in utils/inference; an hbonds .sum()-on-None crash (guarded to match the docstring/biotite None handling); a wrong encode_residues_to int -> str; a wrong residue_cache_dir bool -> str | Path | None; a gamma_0 None > 0.5 TypeError (default added). Held back two modules whose mypy errors are genuine pre-existing bugs (filed, not papered over): rfd3.utils.vizualize imports a function that does not exist (broken import), and rfd3.engine's process_input calls .split on a list. mypy-only; these modules are atomworks Transform/AtomArray-coupled glue. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

lyskov and others added 18 commits June 16, 2026 19:43

lyskov-ai requested a review from woodsh17 June 18, 2026 20:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extend mypy type-checking across the foundry shared layer and rfd3#327

Extend mypy type-checking across the foundry shared layer and rfd3#327
lyskov-ai wants to merge 18 commits into
RosettaCommons:productionfrom
lyskov-ai:0056-rfd3-3to8-error-modules-mypy

lyskov-ai commented Jun 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

lyskov-ai commented Jun 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants