The Iqbal Lab

Software

Pling

Tool that uses rearrangement distances to build a relatedness network of plasmid genomes. Allows one to explore relatedness of plasmids in a way that respects the biological mechanisms through which they vary.

Code: https://github.com/iqbal-lab-org/pling

Main Paper: https://www.microbiologyresearch.org/content/journal/mgen/10.1099/mgen.0.001300

Protocols for how to use it: https://www.biorxiv.org/content/10.1101/2025.09.02.673752v1

LexicMap

Alignment to millions of bacterial genomes, allowing BLAST-style queries against all current bacterial data (e.g. AllTheBacteria).

Code: https://github.com/shenwei356/LexicMap

Docs: https://bioinf.shenwei.me/LexicMap/

Paper: https://www.nature.com/articles/s41587-025-02812-8

Amira

AMR gene identification from long reads, designed to correctly identify multi-copy genes.

Code: https://github.com/Danderson123/amira

Preprint: https://www.biorxiv.org/content/10.1101/2025.05.16.654303v2

Viridian

Rigorous assembler for tiled amplicon sequencing of viruses.

Code: https://github.com/iqbal-lab-org/viridian

Paper: https://www.nature.com/articles/s41592-025-02947-1

Mykrobe

Tool for rapid, lightweight analysis of Mycobacterium tuberculosis, Staphylococcus aureus, Shigella sonnei, Salmonella typhi and Salmonella enterica serotype Paratyphi B, giving species/lineage information and drug resistance predictions.

Code: https://github.com/Mykrobe-tools/mykrobe

Papers: DOI: 10.1038/ncomms10063, https://doi.org/10.12688/wellcomeopenres.15603.1

Pandora

Bacterial genomes can be remarkably variable even within a species, leading to the concept of a pan-genome. With standard tools, it is only possible to study SNP/mutation variation in the parts of the genome that are shared across all samples in a cohort (the "core"). Using a new genome graph implementation, we developed a new tool, pandora, for joint analysis of SNP and gene-presence information across entire bacterial pan-genomes. Pandora supports Nanopore and Illumina data.

Code: https://github.com/rmcolq/pandora

Paper: DOI: 10.1186/s13059-021-02473-1

make_prg

Python implementation of the Recursive-Cluster-Collapse algorithm described in the Pandora paper. It builds genome graphs from either an MSA or a VCF, and is used by both gramtools and pandora.

Code: https://github.com/iqbal-lab-org/make_prg

Gramtools

Tool for joint analysis of SNP/indel variation in cohorts, allowing analysis of mutations on different haplotypes and on alternate backgrounds to long deletions. The underlying data structure is a generalised BWT. Application has focussed primarily on surface antigens in P. falciparum. Gramtools supports Illumina data.

Code: https://github.com/iqbal-lab-org/gramtools

Paper: DOI: 10.1186/s13059-021-02474-0

Minos

Tool for combining multiple callsets (VCF) made for the same sample using different variant callers (e.g. samtools, freebayes), using a genome graph to adjudicate where the callsets disagree. Used heavily in the CRyPTIC project to analyse tens of thousands of M. tuberculosis genomes.

Code: https://github.com/iqbal-lab-org/minos

Paper: https://link.springer.com/article/10.1186/s13059-022-02714-x

Clockwork

Tool that runs multiple Illumina variant callers with different strengths (usually Cortex and samtools, though this can be changed), and then combines the results rigorously using minos.

Code: https://github.com/iqbal-lab-org/clockwork

Paper: see the Minos paper above, where it is evaluated on M. tuberculosis, S. aureus and K. pneumoniae.

Varifier

Tool (introduced in the Minos paper) for evaluating a VCF file of calls when you have a high-quality truth assembly (as is common with bacteria, where there are no issues with phasing calls). A probe is constructed for each record in the VCF, with flanking sequence from the reference genome (with nearby variants applied), and is then mapped to the truth genome; this allows varifier to measure precision. Measuring recall depends on having reliable true variants: varifier can use minimap and nucmer to compare the reference genome with the truth assembly and derive a conservative "truth set" of variants (using the probe method above to filter errors out of the minimap+nucmer calls), and then uses that truth set to measure recall.
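The probe idea can be illustrated with a toy sketch. This is not varifier's code or API: the function names, the flank length, and the example sequences are all hypothetical, exact substring search stands in for real read mapping, and the step of applying nearby variants to the flanks is omitted (as is reverse-complement handling).

```python
def build_probe(ref, pos, ref_allele, alt_allele, flank=5):
    """Build a probe: the ALT allele with flanking sequence from the reference.

    pos is 0-based. Nearby variants are NOT applied to the flanks here
    (a simplifying assumption of this sketch).
    """
    if ref[pos:pos + len(ref_allele)] != ref_allele:
        raise ValueError("REF allele does not match the reference at pos")
    left = ref[max(0, pos - flank):pos]
    end = pos + len(ref_allele)
    right = ref[end:end + flank]
    return left + alt_allele + right


def probe_supported(probe, truth):
    """Stand-in for mapping: the probe must occur exactly in the truth genome."""
    return probe in truth


def precision(ref, truth, calls, flank=5):
    """Fraction of calls whose probe is found in the truth genome."""
    if not calls:
        raise ValueError("no calls to evaluate")
    supported = sum(
        probe_supported(build_probe(ref, pos, r, a, flank), truth)
        for pos, r, a in calls
    )
    return supported / len(calls)


# Hypothetical toy data: the truth genome differs from the reference
# by a single G->T SNP at position 10.
REF = "ACGTACGTTAGCCGATAGGCTA"
TRUTH = "ACGTACGTTATCCGATAGGCTA"
CALLS = [
    (10, "G", "T"),  # real difference between REF and TRUTH
    (15, "T", "C"),  # spurious call, absent from TRUTH
]
toy_precision = precision(REF, TRUTH, CALLS)
```

Recall would be measured the same way in reverse: build probes for each variant in the truth set and ask how many are recovered by the callset.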

Code: https://github.com/iqbal-lab-org/varifier/

BIGSI

Tool for creating a k-mer index of large sets of microbial sequence data.

Code: https://github.com/iqbal-lab-org/BIGSI

Paper: https://pubmed.ncbi.nlm.nih.gov/30718882/

Note that if you want to build a BIGSI of many samples, the method outlined in the paper is quite memory-intensive. We have a better method of merging indexes, documented on the wiki. However, BIGSI is quite outdated now and mostly of historical interest.

COBS

High-performance C++ reimplementation of BIGSI with new ideas (faster, and uses less disk space).

Code: https://github.com/iqbal-lab-org/cobs

Paper: https://arxiv.org/abs/1905.09624

Cortex

This rather venerable tool builds coloured de Bruijn graphs and uses them to detect variation between a sample and a reference, or between different samples.
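For readers unfamiliar with the idea, here is a minimal sketch of a coloured de Bruijn graph. It is not Cortex's implementation: the function names and sequences are made up, k-mers are not canonicalised across strands, edges are left implicit (k-mers overlapping by k-1 bases), and comparing k-mer colour sets is a much cruder signal than real variant calling.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_coloured_graph(samples, k):
    """Map each k-mer to the set of colours (sample names) that contain it.

    samples: dict of colour name -> list of sequences.
    """
    graph = defaultdict(set)
    for colour, seqs in samples.items():
        for seq in seqs:
            for kmer in kmers(seq, k):
                graph[kmer].add(colour)
    return graph


def private_kmers(graph, colour):
    """K-mers seen only in the given colour: a crude signal of variation."""
    return {kmer for kmer, colours in graph.items() if colours == {colour}}


# Hypothetical toy data: the sample differs from the reference by one base.
samples = {"ref": ["ACGTACGA"], "sample": ["ACGTTCGA"]}
graph = build_coloured_graph(samples, k=3)
sample_only = private_kmers(graph, "sample")
ref_only = private_kmers(graph, "ref")
```

K-mers carrying both colours are shared sequence; runs of k-mers private to one colour mark where the two sequences diverge.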

Code: https://github.com/iqbal-lab/cortex

Paper: https://www.nature.com/articles/ng.1028

Cortex is no longer actively developed, but it is still heavily used. In particular, our group have analysed large cohorts of M. tuberculosis using both Cortex and samtools, then combining the results with minos; this is packaged in Clockwork. We don't think there is a modern reimplementation of this tool. ska comes close and uses similar ideas, so if you are interested only in SNPs, we would recommend you use that; we are not sure what the best recommendation is if you care about indels.