Getting Started

Background

Lancet2 is a command-line somatic variant caller (SNVs and InDels) for short-read sequencing data, implemented in modern C++. It performs joint multi-sample localized colored de Bruijn graph assembly for more accurate variant calls, especially InDels.

In addition to better variant calling accuracy and improved somatic filtering, Lancet2 runs significantly faster than Lancet1 (up to ∼10x speedup and 50% less peak memory usage).

Installation

Build prerequisites

Build commands

git clone https://github.com/nygenome/Lancet2.git
cd Lancet2 && mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release .. && make -j$(nproc)

Static binary

Note

It is recommended to build Lancet2 from scratch on the target machine where processing is expected to happen for maximum runtime performance.

If you have a Linux-based operating system and a CPU that supports AVX2 instructions, the simplest way to use Lancet2 is to download the binary from the latest available release. The release binary is statically linked, has no dependencies, and needs only executable permissions before it can be used.

chmod +x Lancet2
./Lancet2 --help
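
Before downloading the static binary, it can help to confirm that the CPU reports the AVX2 flag. The following is a minimal sketch, assuming a Linux x86 system where CPU features are listed on the `flags` line of `/proc/cpuinfo`; the helper name `has_cpu_flag` is hypothetical and not part of Lancet2.

```shell
# has_cpu_flag FLAG [CPUINFO_FILE]
# Hypothetical helper: succeeds (exit status 0) when FLAG appears as a whole
# word on the first "flags" line of CPUINFO_FILE (default: /proc/cpuinfo).
has_cpu_flag() {
    flag="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    # -m1 keeps only the first "flags" line; -w matches whole words only,
    # so asking for "avx" does not match "avx2".
    grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw -- "$flag"
}

if has_cpu_flag avx2; then
    echo "AVX2 reported: the static release binary should be usable"
else
    echo "AVX2 not reported: consider building from source instead"
fi
```

The same helper can be called with `avx512f`, the base AVX-512 flag on Linux, to check the requirement for the pre-built Docker images.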

Docker images

Note

A CPU that supports the AVX512 instruction set is required to use the pre-built public docker images. To target older CPUs, build a custom docker image after modifying the BUILD_ARCH argument in the Dockerfile.

Public docker images hosted on Google Cloud are available for recent tagged releases.

Basic Usage

The following command demonstrates basic usage of the Lancet2 variant calling pipeline on a tumor and normal BAM file pair, restricted to chr22.

Lancet2 pipeline \
    --normal /path/to/normal.bam \
    --tumor /path/to/tumor.bam \
    --reference /path/to/reference.fasta \
    --region "chr22" --num-threads $(nproc) \
    --out-vcfgz /path/to/output.vcf.gz
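
Because a run can take a while, a quick pre-flight check that the input paths are readable can catch typos before the pipeline starts. The sketch below is an illustration only: `check_inputs` is a hypothetical helper, and the paths are the placeholders from the command above. Whether Lancet2 also expects index files next to these inputs is not stated here, so only the three files named in the command are checked.

```shell
# check_inputs FILE...
# Hypothetical helper: reports every path that is missing or unreadable on
# stderr and returns non-zero if at least one such path was found.
check_inputs() {
    missing=0
    for path in "$@"; do
        if [ ! -r "$path" ]; then
            echo "missing or unreadable: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Placeholder paths; replace with the real locations before running.
if check_inputs /path/to/normal.bam /path/to/tumor.bam /path/to/reference.fasta; then
    echo "inputs look readable"
fi
```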

See here for more information on how to score and filter somatic variants using explainable machine learning models.

License

Lancet2 is distributed under the BSD 3-Clause License.

Citing Lancet2

Funding

Informatics Technology for Cancer Research (ITCR) under the NCI U01 award 1U01CA253405-01A1.