Strict type-checking + unit tests across the foundry shared library#326
Open
lyskov-ai wants to merge 11 commits into
Open
Strict type-checking + unit tests across the foundry shared library#326lyskov-ai wants to merge 11 commits into
lyskov-ai wants to merge 11 commits into
Conversation
instantiators is already fully annotated, so this adds it to the per-module strict mypy override (a lock-in for future defs) and fills the test gap with tests/test_instantiators.py covering the callback/logger instantiation control flow. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
…st logging Adds the three remaining small foundry.utils modules to the per-module strict mypy override. squashfs was already fully annotated (pure lock-in); ddp needed one annotation (RankedLogger.log varargs); logging needed five. New tests/test_logging.py covers the two pure helpers (CachedDataFilter.filter and condense_count_columns_of_grouped_df). Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
rf3.data.paired_msa no longer imports against the installed atomworks: MultiInputDatasetWrapper subclasses StructuralDatasetWrapper, which atomworks turned into a deprecated factory function, so subclassing it raises TypeError at import. The module was reachable only through domain_distillation.yaml, which is itself referenced only in commented-out lines of pdb_and_distillation.yaml, and its LoadPairedMSAs class was used nowhere. Remove the module, its orphaned domain_distillation.yaml config, the dangling commented references in pdb_and_distillation.yaml, and the stale pyproject mypy-exemption comment. rf3 now type-checks fully with no in-file suppressions. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
…+ tests In wrap_dataset_and_sampler_with_fallbacks the fallback-weights branch was gated on `"weights" in sampler_to_fallback_to`. Sampler defines no __contains__, so `in` iterates the sampler's integer indices and never matches the string — the documented "use the sampler's weights" branch was dead and the membership test needlessly drew samples. Switch to `hasattr(sampler_to_fallback_to, "weights")` (the idiomatic form atomworks itself uses); mypy then narrows the type so the `# type: ignore[attr-defined]` on `.weights` is dropped. Behaviour change: assemble_distributed_loader's reachable samplers (DistributedSampler / DistributedMixedSampler) have no `.weights`, so it stays uniform (behaviour-neutral). But models/mpnn/src/mpnn/train.py passes a WeightedRandomSampler built from computed PDB weights as the fallback sampler, so its training fallback sampling now uses those weights instead of silently-uniform ones — the function's documented intent. mpnn train.py is a cluster-coupled script (not in CI), so this is unverified here; flagged for mpnn owners to validate. Also add foundry.utils.datasets to the strict mypy override (already fully annotated — a pure lock-in) and add tests/test_datasets.py (18 tests, no prior coverage) covering the fallback-weights fix, the config-key sampler dispatch, and the distributed-loader sampler conversions + validation guards. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
…t mypy + tests
Add foundry.training.{schedulers,EMA,checkpoint} to the per-module direction-(b)
strict mypy override (disallow_untyped_defs/check_untyped_defs). schedulers was
already fully annotated (pure lock-in); EMA and checkpoint get annotation-only
type hints on their update/forward and decorator/wrapper signatures.
Add unit tests for each module's real logic: the EMA update formula, frozen-
param skip, verbatim buffer copy, training guard, and train/eval dispatch; the
AF3 warmup/decay LR schedule and SchedulerConfig round-trip; and the activation-
checkpointing wrappers (kwarg binding, both grad branches, gradient flow).
Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
…on helpers Add foundry.common and foundry.constants to the per-module direction-(b) strict mypy override. No source change: common is already fully annotated (pure lock-in) and constants is data-only (the override is a forward-looking no-op there). version.py is VCS-generated/untracked and is not a target. Add tests/test_common.py covering the 10 previously-untested pure helpers, including the easy-to-miss edges: exists/default treat only None as absent, run_once fires exactly once, listmap consumes iterables, and ensure_dtype is a same-object no-op when the dtype already matches. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
Add foundry.metrics.metric and foundry.metrics.losses to the per-module direction-(b) strict mypy override. metric.py is already annotated (pure lock-in). losses.py gets honest annotations on Loss.__init__/forward and a documented cast(torch.Tensor, loss) for the int-zero accumulator (kept as int 0 so the running sum adopts the child losses' device/dtype via scalar promotion; a 0-d CPU tensor would break GPU training). Add tests for the MetricManager/Metric introspection machinery (tag filtering, result key-prefixing, compute_from_kwargs remapping + optional handling, tag-conflict validation, from_metrics) and the Loss aggregator forward (sum + dict merge, detached total_loss vs grad-carrying return). Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
Add the whole src/foundry/callbacks/ package (the BaseCallback base plus timing_logging, metrics_logging, train_logging, health_logging) to the direction-(b) disallow_untyped_defs/check_untyped_defs override and annotate the fallout. Hook annotation is mostly mechanical. Four hooks carry a documented # type: ignore[override]: subclasses override the base's positional-param hooks with **kwargs, which is safe because the trainer dispatches every hook by keyword (fabric.call(name, trainer=..., batch=..., ...)). check_untyped_defs also surfaced a real bug in LogAF3TrainingLossesCallback.start_time (None initialised, then used as a float) — typed float | None with an assert documenting that on_train_epoch_start always precedes on_train_epoch_end. Add tests/test_callbacks.py covering the one non-obvious CPU-portable helper, StoreValidationMetricsInDFCallback._load_and_concatenate_csvs (cross-rank CSV de-duplication keyed on example_id|dataset). Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
… + tests Add the small remaining src/foundry modules (inference_engines.base, inference_engines.checkpoint_registry, hydra.resolvers, model.layers.blocks) to the direction-(b) disallow_untyped_defs/check_untyped_defs override and annotate the fallout (annotation-only on source). check_untyped_defs surfaced one inference quirk in resolvers: a dict literal mixing two differently-signed functions joined to the bare 'function' type; annotating it dict[str, Callable[..., Any]] fixes it. Add tests/test_checkpoint_registry.py, tests/test_resolvers.py and tests/test_blocks.py for the previously-uncovered pure helpers (checkpoint dir search order, resolver attribute walking, FourierEmbedding/Dropout behaviour). Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
Add foundry_cli.download_checkpoints to the direction-(b) disallow_untyped_defs/check_untyped_defs override. Annotate the four typer commands with -> None and fix list_installed's total_size accumulator (int 0 -> 0.0); this also clears the long-standing download_checkpoints.py:201 annotation-unchecked note. Add tests/test_download_checkpoints.py covering the checkpoint-dir priority logic and the list-available / list-installed commands via typer's CliRunner. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
Add the Intel-XPU Lightning plugins (utils/xpu, already annotated -> pure lock-ins) and the testing/ pytest helpers to the direction-(b) disallow_untyped_defs/check_untyped_defs override. testing/ needed two annotations: gpu() -> bool and configure_pytest(config: pytest.Config, ...). Add tests/test_xpu.py (device-independent XPU accelerator/precision/strategy contracts, CPU-reachable only) and tests/test_testing_helpers.py (get_test_data_dir). This leaves trainers/fabric as the only shared-layer module not yet under the strict override. Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Continues opting the Foundry shared library (
src/foundry+src/foundry_cli) into strict type-checking —disallow_untyped_defs+check_untyped_defs, module by module — and fills unit-test gaps in the code brought under the checker. After this change the entire shared layer is strictly type-checked except the largetrainers/fabricmodule.Modules brought under strict checking (annotations added as needed)
The rest of
utils/(instantiators, ddp, squashfs, logging, datasets),training/(schedulers, EMA, checkpoint), the top-levelcommon/constants,metrics/(metric, losses), the wholecallbacks/package,inference_engines/,hydra/resolvers,model/layers/blocks, thefoundry_clicheckpoint CLI, the Intel-XPU plugins (utils/xpu/), and thetesting/helpers.New unit tests
Cover the previously-untested logic in those modules — dataset/sampler fallback wiring, schedulers/EMA/checkpoint, the metric manager and loss aggregation, callback CSV merging, checkpoint-directory resolution, the Hydra resolvers, and the XPU plugin contracts.
Notable changes beyond annotations
utils/datasets. The weighted-sampler fallback was gated on"weights" in sampler, butSamplerhas no__contains__, so that branch was dead and the fallback silently used uniform weights. Switched tohasattr(...). Behaviour-neutral for the distributed-loader path, butmodels/mpnn/train.pypasses aWeightedRandomSamplerbuilt from computed PDB weights, so its fallback sampling now uses those weights (the documented intent). mpnn training is cluster-only and unverified in CI — flagged for mpnn owners.rf3.data.paired_msasubclassed an atomworks class that is now a factory function, so it raisedTypeErroron import and was reachable only through a disabled config; removed it along with its orphaned dataset config.floattotal typed asintin the checkpoint CLI, and aNone-initialisedstart_timelater used as afloatin a training-loss callback.