NGLess: NGS Processing with Less Work

NGLess is a domain-specific language for processing next-generation sequencing (NGS) data.

Build & test | MIT licensed | Install with Bioconda | Citation for NGLess

For questions and discussions, please use the NGLess mailing list.

If you are using NGLess, please cite:

NG-meta-profiler: fast processing of metagenomes using NGLess, a domain-specific language by Luis Pedro Coelho, Renato Alves, Paulo Monteiro, Jaime Huerta-Cepas, Ana Teresa Freitas, Peer Bork, Microbiome (2019) https://doi.org/10.1186/s40168-019-0684-8

Example

ngless "1.5"
input = fastq(['ctrl1.fq','ctrl2.fq','stim1.fq','stim2.fq'])
input = preprocess(input) using |read|:
    read = read[5:]
    read = substrim(read, min_quality=26)
    if len(read) < 31:
        discard

mapped = map(input,
                reference='hg19')
write(count(mapped, features=['gene']),
        ofile='gene_counts.csv',
        format={csv})

For more information, check the docs. We also have a YouTube tutorial on how to use NGLess and SemiBin together (but you can learn to use NGLess independently of SemiBin).

Installing

See the install documentation for more information.

Bioconda

The recommended way to install NGLess is through bioconda:

conda install -c bioconda ngless 

Docker

Alternatively, a docker container with NGLess is available at docker hub:

docker run -v $PWD:/workdir -w /workdir -it nglesstoolkit/ngless:1.5.0 ngless --version

Adapt the mount flags (-v) as needed.
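
As an illustration of adapting the mounts, the sketch below prints (rather than runs) a command that adds a second, read-only mount for input data. The `/data` mount point, the function name and its arguments are assumptions made for this example; only the image tag and the `/workdir` mount come from the command above.

```shell
# Print (not run) a docker command that mounts the current directory as the
# working directory plus a second, read-only data directory.
# Assumptions: /data as the in-container mount point and the argument names
# are illustrative; paths containing spaces would need extra quoting.
ngless_docker_cmd() {
    data_dir="$1"   # host directory holding the input files
    script="$2"     # NGLess script, relative to the current directory
    printf 'docker run -v %s:/workdir -v %s:/data:ro -w /workdir -it nglesstoolkit/ngless:1.5.0 ngless %s\n' \
        "$PWD" "$data_dir" "$script"
}

# Hypothetical paths, for illustration only.
ngless_docker_cmd /srv/fastq profile.ngl
```

Inside the container, a script would then refer to its inputs under `/data`, while outputs written to the working directory land in the host's current directory.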

Linux

You can download a statically linked build of NGLess 1.5.0.

This should work across a wide range of Linux versions (please report any issues you encounter):

curl -L -O https://github.com/ngless-toolkit/ngless/releases/download/v1.5.0/NGLess-v1.5.0-Linux-static-full
chmod +x NGLess-v1.5.0-Linux-static-full
./NGLess-v1.5.0-Linux-static-full

This downloaded file bundles bwa, samtools and megahit (also statically linked).

From Source

NGLess can also be compiled from source, which is hosted at https://github.com/ngless-toolkit/ngless

NGLess is written in Rust and builds with a standard cargo toolchain:

git clone https://github.com/ngless-toolkit/ngless
cd ngless
cargo build --release      # produces target/release/ngless

The external tools NGLess drives (samtools, bwa, minimap2, prodigal, megahit) are not bundled: they are located on your $PATH (or via per-tool NGLESS_<TOOL>_BIN environment variables). The pinned versions used for testing are declared in pixi.toml, so a quick way to obtain them is pixi:

pixi run --environment default target/release/ngless --version

Running Sample Test Scripts on Local Machine

Developers who have compiled and installed NGLess can run the test scripts in the tests folder to see the output of sample test cases.

cd tests

Once in the tests directory, pick any of the test folders and run NGLess there.

For example, to run the regression-fqgz test:

cd regression-fqgz
ngless ungzip.ngl

After running this script, open the newly generated folder ungzip.ngl.output_ngless and view the index.html file inside it.

Developers who have completed these steps can find further datasets for testing in the following documentation pages:

Human Gut Metagenomics Functional & Taxonomic Profiling
Ocean Metagenomics Functional Profiling
Ocean Metagenomics Assembly and Gene Prediction

Implementation (Rust)

NGLess was originally written in Haskell and has been reimplemented in Rust; the Haskell implementation was removed at the 1.6 release. The Rust sources live at the repository root (Cargo.toml, src/). See rust-migration.md for the port history and a module-by-module account of what was ported.

Only scripts that declare ngless "1.5" or later are supported. Behavioral parity with the former Haskell implementation (byte-identical output) is verified against the functional test suite under tests/.

Build & test

cargo build --release      # produces target/release/ngless
cargo test                 # unit tests
cargo fmt --all -- --check  # formatting is enforced in CI

Functional / parity test suite

The committed expected.* files in each tests/ directory were produced by the Haskell binary, so running the functional suite against the Rust binary is a parity check against Haskell. Point the harness at the build via NGLESS_BIN (it needs the external tools on $PATH; pixi run --environment default provides the pinned versions):

NGLESS_BIN=target/release/ngless ./run-tests.sh          # all tests
NGLESS_BIN=target/release/ngless ./run-tests.sh regression   # only tests/regression*
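
Conceptually, a byte-identical parity check comes down to comparing each committed expected.* file with the file the Rust binary produced. The sketch below illustrates that idea for a single test directory. It is not the logic of run-tests.sh: the pairing of `expected.<suffix>` with `output.<suffix>` and the function name `check_parity` are assumptions made for this example.

```shell
# Compare every expected.* file in a directory against a same-suffix output.*
# file, byte for byte.
# Assumption: outputs are named output.<suffix>; the real harness
# (run-tests.sh) may pair files differently.
check_parity() {
    dir="$1"
    status=0
    for expected in "$dir"/expected.*; do
        [ -e "$expected" ] || continue       # directory has no expected.* files
        suffix="${expected##*/expected.}"
        actual="$dir/output.$suffix"
        # cmp -s is silent and exits non-zero on any difference or missing file.
        if ! cmp -s "$expected" "$actual"; then
            echo "MISMATCH: $actual differs from $expected" >&2
            status=1
        fi
    done
    return "$status"
}
```

Called as `check_parity tests/regression-fqgz` (after running the script in that directory), it returns a non-zero status if any pair differs.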

More information

Authors