Command Reference
This page documents the current gmlst CLI surface.
Help Behavior
- -h and --help are equivalent at all levels.
- Running a command group with no subcommand prints its usage/help text:
  - gmlst
  - gmlst scheme
  - gmlst utils
  - gmlst visual
Top-Level CLI
gmlst [OPTIONS] COMMAND [ARGS]...
Global options:
- -V, --version
- -v, --verbose
- -q, --quiet
- -h, --help
Top-level commands:
- typing - type FASTA/FASTQ samples against a scheme
- scheme - scheme/provider/cache management
- utils - extraction and sequence utility commands
- visual - local web visualization tools
typing
gmlst typing [OPTIONS] COMMAND [ARGS]...
Subcommands:
- mlst - MLST schemes only
- cgmlst - cgMLST/wgMLST schemes only
- tgmlst - scheme-free typing mode
Examples:
gmlst typing mlst -s saureus_1 sample.fna
gmlst typing cgmlst -s vparahaemolyticus_3 sample.fna
gmlst typing cgmlst -s vparahaemolyticus_3 --prefilter-k 31 --prefilter-top-n 20 sample.fna
gmlst typing tgmlst sample.fna
Legacy compatibility:
gmlst typing -s saureus_1 sample.fna
gmlst typing -s schemefree sample.fna
mlst and cgmlst common options:
- -s, --scheme TEXT (required)
- -b, --backend [blastn|kma|minimap2|nucmer]
- --format [tsv|json|pretty]
- -o, --output PATH
- -t, --threads INTEGER
- --max-workers INTEGER (sample-level parallel workers)
- -q, --quiet
- --data-dir, --output-dir PATH (preferred: --data-dir)
- --novel-allele
- --novel-profile (requires --novel-allele)
- -h, --help
cgmlst prefilter options:
- --cgmlst-mode [standard|chew-fast|chew-ultrafast|chew-bsr|chew-balanced]
- --prefilter-k INTEGER
- --prefilter-top-n INTEGER
- --prefilter-min-loci-fraction FLOAT
- --cds-coordinates-out PATH (export predicted CDS coordinates as TSV)
- --call-policy [default|chewbbaca] (chew-style output classification)
- --chew-cds-gate/--no-chew-cds-gate (only for --call-policy chewbbaca)
cgmlst defaults and performance notes:
- Default backend for typing cgmlst is minimap2.
- --cgmlst-mode standard: conservative behavior, no forced chew-style overrides.
- --cgmlst-mode chew-fast: enables exact-hash + minimap2 hash prefilter plus automatic missing-locus minimap2 refinement (default cap: 500 loci), then targeted blastn evidence fallback for low-confidence loci (default cap: 500 loci).
- --cgmlst-mode chew-ultrafast: same as chew-fast, but uses representative-only main alignment, disables minimap2 FASTA CIGAR emission, applies an ultrafast minimap2 FASTA speed profile, performs a strict low-confidence rescue pass (default limit: 120 loci), and then runs a second targeted pass with an adaptive budget over remaining partial/closest loci.
- --cgmlst-mode chew-bsr: adds protein-level exact-hash pre-resolution on top of chew-fast (including missing-locus refinement cap 500 and targeted blastn fallback cap 500). By default, no additional strict confirmation pass is performed (BSR_CONFIRM_MAX_LOCI=0), but you can enable targeted confirmation via environment variable when needed.
- --cgmlst-mode chew-balanced: enables exact-hash + minimap2 hash prefilter + targeted blastn fallback for low-confidence loci.
- For FASTQ inputs, typing cgmlst now auto-switches -b minimap2 to -b kma and treats --cgmlst-mode as compatibility-only (standard) because chew-style mode optimizations are FASTA-oriented.
- --call-policy chewbbaca requires FASTA assemblies and keeps raw calls unchanged while rendering chew-style per-locus class labels in output.
- By default, --call-policy chewbbaca enforces CDS-gated classification (--chew-cds-gate). Use --no-chew-cds-gate to allow classification from any matched sequence context.
Architecture lock:
- FASTA: chew-style mode branches are active and interpreted normally.
- FASTQ: KMA-first policy is enforced at CLI layer; mode-specific chew branches are not interpreted as FASTQ features.
- Full contract and flow diagrams: see docs/architecture.md.
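The FASTA/FASTQ policy above can be pictured with a small Python sketch. It is illustrative only and is not gmlst's code: the helper names, the suffix list, and the choice to reset the mode to standard for every FASTQ input (not only when the backend is switched) are assumptions.

```python
from pathlib import Path
from typing import Tuple

# Assumed FASTQ suffixes; the real detection logic may differ.
FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def is_fastq(path: str) -> bool:
    """Return True if the file name looks like a FASTQ file."""
    return Path(path).name.lower().endswith(FASTQ_SUFFIXES)


def resolve_cgmlst_settings(sample: str, backend: str = "minimap2",
                            mode: str = "standard") -> Tuple[str, str]:
    """Hypothetical helper: return the effective (backend, mode) pair.

    FASTA input keeps the requested backend and mode. For FASTQ input,
    minimap2 is switched to kma and the mode is treated as standard.
    """
    if not is_fastq(sample):
        return backend, mode
    effective_backend = "kma" if backend == "minimap2" else backend
    return effective_backend, "standard"
```

For example, resolve_cgmlst_settings("reads_R1.fastq.gz", "minimap2", "chew-fast") yields ("kma", "standard"), while the same call with sample.fna leaves both values unchanged.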
Additional tuning:
- GMLST_MINIMAP2_FASTA_SPEED_PROFILE=default|fast|ultrafast
  - default: existing minimap2 behavior
  - fast: moderate seed/chaining acceleration (-w 15 -e 1000 -K 1G)
  - ultrafast: aggressive speed profile (fast + -f 0.001 -U 50,1000)
- GMLST_CGMLST_MINIMAP2_ULTRA_SECOND_PASS_MAX_LOCI=adaptive|<int>
  - adaptive (default): auto-scales second-pass budget by residual partial/closest burden
  - <int>: forces a fixed budget for the ultrafast second pass
- GMLST_CGMLST_FASTQ_KMA_AUTO_THREADS=<int>
  - Default: 8
  - FASTQ cgMLST with KMA auto-raises per-sample threads from 1 to this value (capped by CPU count)
  - Set to 1 to disable auto-raise
- GMLST_CGMLST_KMA_FASTQ_MEM_MODE=1|0
  - Default: 1
  - Enables KMA -mem_mode for FASTQ cgMLST to accelerate single-thread mapping.
- GMLST_CGMLST_KMA_FASTQ_MEM_CONFIRM_MAX_LOCI=<int>
  - Default: 64
  - After mem_mode pass, re-check up to this many closest loci with strict KMA (without -mem_mode) to recover exact calls.
- Prefilter auto-skip threshold is controlled by GMLST_CGMLST_PREFILTER_MAX_LOCI (default: 3000).
  - Set to 0 to disable auto-skip and always attempt prefilter.
- For -b kma and default -b minimap2, cgMLST prefilter is skipped and the persistent full-index path is used.
- Set GMLST_CGMLST_EXACT_HASH_PREFILTER=1 to enable chewBBACA-style DNA exact-match pre-resolution (CDS hash first).
- Set GMLST_CGMLST_MINIMAP2_HASH_PREFILTER=1 to enable experimental hash-first prefilter for minimap2 FASTA.
- GMLST_CGMLST_CDS_PREDICTION_MODE=single|meta controls Pyrodigal CDS mode for cgMLST exact-hash pre-resolution (default: single).
- GMLST_CGMLST_CDS_TRAINING_FILE=/path/to/pyrodigal_training.trn uses a fixed training file; if unset and mode is single, gmlst auto-creates and reuses pre_computed/pyrodigal_training.trn on first run.
- GMLST_CGMLST_CDS_CLOSED_ENDS=1|0 controls Pyrodigal closed-end prediction behavior (default: 0).
- GMLST_CGMLST_CDS_COORDINATES_OUT=/path/to/cds_coordinates.tsv exports predicted CDS coordinates for chewBBACA coordinate comparison.
- GMLST_CGMLST_MINIMAP2_HASH_REFINE_MAX_LOCI controls max missing loci for second-pass refinement when mode override does not set it (default: 0, disabled).
- GMLST_CGMLST_EVIDENCE_FALLBACK_BACKEND enables evidence-based targeted fallback for low-confidence loci (none/blastn/kma/nucmer, default: none).
- GMLST_CGMLST_EVIDENCE_FALLBACK_MAX_LOCI limits fallback scope by locus count (default: 300, set 0 for no limit).
- For large cgMLST schemes with -b kma, set -t (for example -t 8 to -t 16); -t 1 can be much slower.
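As a rough illustration of the GMLST_CGMLST_FASTQ_KMA_AUTO_THREADS rule, the Python sketch below resolves an effective thread count. It is not taken from gmlst: the function name, the fallback on an unparsable value, and the reading that only a request of exactly one thread is auto-raised are assumptions.

```python
import os


def resolve_kma_fastq_threads(requested_threads, environ=None, cpu_count=None):
    """Hypothetical helper mirroring the documented auto-raise rule."""
    environ = os.environ if environ is None else environ
    cpu_count = cpu_count or os.cpu_count() or 1

    raw = environ.get("GMLST_CGMLST_FASTQ_KMA_AUTO_THREADS", "8")
    try:
        auto_threads = int(raw)
    except ValueError:
        auto_threads = 8  # assumption: fall back to the documented default

    # Assumption: explicit thread counts other than 1 are left untouched,
    # and a setting of 1 (or lower) disables the auto-raise.
    if requested_threads != 1 or auto_threads <= 1:
        return requested_threads
    # Raise to the configured value, capped by the CPU count.
    return max(1, min(auto_threads, cpu_count))
```

With the default setting on a 4-core machine, a request of one thread becomes four; with the variable set to 1, the request stays at one.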
tgmlst options (scheme-free):
- --format [tsv|json|pretty]
- -o, --output PATH
- --no-header
- --hash-strategy [safe|fast|ultra|strict|blast]
- --save-scheme PATH
- --load-scheme PATH
- --stats
- --max-workers INTEGER
- --assemble-timeout FLOAT
- --error-report PATH
- --fail-on-error
- --summary-report PATH
Notes:
- JSON output includes per-locus novel_sequence data for downstream extraction.
- --count-same-copy currently applies to blastn same-allele multicopy counting.
- In mlst/cgmlst modes, FASTQ paired-end files are auto-detected and passed as paired input (no pre-merge) when naming matches common pairs:
  - _R1/_R2
  - _1/_2
  - .1/.2
- Supports .fastq, .fq, and .gz variants.
- minimap2 FASTQ mode uses a candidate pass plus targeted validation on uncertain loci.
- GMLST_MINIMAP2_KMER_ENGINE controls minimap2 k-mer support scoring (python, kmc, auto).
- GMLST_TMPDIR can be set to control where temporary files are created.
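The paired-end naming rules in the notes above could be matched roughly as follows. This is a minimal sketch under stated assumptions, not the detection code gmlst ships: the regular expression and the group_fastq_pairs helper are hypothetical, and gmlst's exact rules (for example case handling) may differ.

```python
import re

# Assumed pattern: <sample><marker><1|2>.<fastq|fq>[.gz], where the marker
# is one of "_R", "_" or "." (covering _R1/_R2, _1/_2 and .1/.2).
PAIR_RE = re.compile(
    r"^(?P<sample>.+?)(?P<tag>_R|_|\.)(?P<read>[12])\.(?:fastq|fq)(?:\.gz)?$",
    re.IGNORECASE,
)


def group_fastq_pairs(filenames):
    """Split file names into (R1, R2) pairs and unpaired files."""
    mates_by_key = {}
    singles = []
    for name in filenames:
        match = PAIR_RE.match(name)
        if match is None:
            singles.append(name)  # no recognizable read marker
            continue
        key = (match["sample"], match["tag"])
        mates_by_key.setdefault(key, {})[match["read"]] = name

    pairs = []
    for mates in mates_by_key.values():
        if "1" in mates and "2" in mates:
            pairs.append((mates["1"], mates["2"]))
        else:
            singles.extend(mates.values())  # mate is missing
    return pairs, singles
```

A file whose mate is missing (for example only e_R1.fq) is reported as unpaired rather than guessed at.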
scheme
gmlst scheme [OPTIONS] COMMAND [ARGS]...
Subcommands:
- list
- show
- download
- update
- create
- update-custom
- export
scheme download
gmlst scheme download [OPTIONS]
Options:
- -s, --scheme TEXT (required)
- --force
- -q, --quiet
- --download-tool [auto|aria2c|curl|wget|httpx|requests]
- -x, --connections INTEGER
- --token TEXT
- --cache-dir PATH
scheme list
gmlst scheme list [OPTIONS]
Typical options:
- -p, --provider [<registered-provider>|local|all]
- -t, --type [mlst|cgmlst|wgmlst|all]
- -n, --name TEXT
- -f, --format [text|table|csv|tsv|json]
- -a, --available
- --cache-dir PATH
Blocked scheme configuration:
- scheme list filters entries using gmlst/data/blocked_schemes.json.
- scheme show, scheme download, and scheme update reject blocked schemes.
- Format: provider name → list of scheme_name values to hide.
- Template:
{
"_comment": "List of schemes that should be blocked/hidden from the user",
"pubmlst": ["salmonella_1"],
"pasteur": [],
"enterobase": [],
"cgmlst": []
}
Example (hide one scheme):
{
"pubmlst": ["vparahaemolyticus_3"],
"pasteur": [],
"enterobase": [],
"cgmlst": []
}
Notes:
- Values must use canonical scheme_name (for example saureus_1, vparahaemolyticus_3).
- Filtering is currently applied to scheme list output.
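To make the blocked-scheme format concrete, here is a short Python sketch of how such a file could be parsed and applied to a listing. The helper names are hypothetical and the logic is an assumption about behavior, not gmlst's implementation; in particular, skipping keys that start with an underscore is simply one way to ignore the _comment entry.

```python
import json


def load_blocked(text):
    """Parse blocked_schemes.json content into {provider: set(scheme_name)}."""
    raw = json.loads(text)
    return {
        provider: set(names)
        for provider, names in raw.items()
        if not provider.startswith("_")  # skip "_comment" and similar keys
    }


def is_blocked(blocked, provider, scheme_name):
    """Return True if the scheme is listed under the given provider."""
    return scheme_name in blocked.get(provider, set())


def filter_schemes(schemes, blocked):
    """Drop blocked (provider, scheme_name) pairs from a listing."""
    return [(p, s) for p, s in schemes if not is_blocked(blocked, p, s)]
```

Blocking is keyed by provider in this sketch, so the same scheme_name under a different provider would still be listed.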
scheme show
gmlst scheme show [OPTIONS]
Options:
- -s, --scheme TEXT
- -f, --format [text|table|csv|tsv|json]
- --cache-dir PATH
Behavior:
- With -s: show detailed information for one scheme.
- Without -s: show guidance, then fall back to listing output.
scheme update
gmlst scheme update [OPTIONS]
Options:
- -s, --scheme TEXT
- -f, --force
- --download-tool [auto|aria2c|curl|wget|httpx|requests]
- -x, --connections INTEGER
- --token TEXT
- --cache-dir PATH
Behavior:
- Without -s: refresh provider catalogs.
- With -s: refresh/update the cached copy of that scheme from its provider.
Provider endpoint override (for self-hosted BIGSdb):
- GMLST_PUBMLST_BASE_URL (default: https://rest.pubmlst.org/db)
- GMLST_PASTEUR_BASE_URL (default: https://bigsdb.pasteur.fr/api/db)
- GMLST_PRIVATE_BIGSDB_URL (register private BIGSdb provider)
- GMLST_PRIVATE_BIGSDB_NAME (optional, default: private)
- GMLST_PRIVATE_BIGSDB_LABEL (optional display label)
Example:
export GMLST_PUBMLST_BASE_URL="http://127.0.0.1:8000/api/db"
gmlst scheme list -p pubmlst
export GMLST_PRIVATE_BIGSDB_URL="http://127.0.0.1:9000/api/db"
export GMLST_PRIVATE_BIGSDB_NAME="labdb"
gmlst scheme list -p labdb
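The override variables can be read as a simple lookup with defaults. The Python sketch below shows one plausible resolution order; resolve_providers and its return shape are hypothetical and are not gmlst's API. Only the variable names and default URLs come from the list above.

```python
import os

# Environment variable and documented default for each built-in provider.
DEFAULT_BASE_URLS = {
    "pubmlst": ("GMLST_PUBMLST_BASE_URL", "https://rest.pubmlst.org/db"),
    "pasteur": ("GMLST_PASTEUR_BASE_URL", "https://bigsdb.pasteur.fr/api/db"),
}


def resolve_providers(environ=None):
    """Hypothetical helper: map provider name -> base URL."""
    environ = os.environ if environ is None else environ
    providers = {
        name: environ.get(var, default)
        for name, (var, default) in DEFAULT_BASE_URLS.items()
    }
    # A private BIGSdb provider is registered only when its URL is set.
    private_url = environ.get("GMLST_PRIVATE_BIGSDB_URL")
    if private_url:
        name = environ.get("GMLST_PRIVATE_BIGSDB_NAME", "private")
        providers[name] = private_url
    return providers
```

With the exports from the example above, this sketch would return entries for pubmlst (pointing at 127.0.0.1:8000), pasteur (default URL), and labdb.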
scheme create
gmlst scheme create [OPTIONS]
Options:
- -t, --type [mlst] (required)
- -s, --source TEXT (required)
- --data-dir, --datadir DIRECTORY (required; preferred: --data-dir)
- --desc TEXT
- --cache-dir PATH
scheme update-custom
gmlst scheme update-custom [OPTIONS]
Options:
- -s, --scheme TEXT (required)
- --data-dir, --datadir DIRECTORY (required; preferred: --data-dir)
- --cache-dir PATH
scheme export
gmlst scheme export [OPTIONS]
Options:
- -s, --scheme TEXT (required)
- --format [grapetree|original] (required)
- -o, --output PATH (required)
- --cache-dir PATH
utils
gmlst utils [OPTIONS] COMMAND [ARGS]...
Subcommands:
- extract
- concat
- benchmark
- check
utils extract
gmlst utils extract [OPTIONS]
Primary modes:
- Allele extraction from sample FASTA/FASTQ
gmlst utils extract -i genome.fasta -s ecoli_1 [--allele dnaN,tsvA]
- Novel data extraction from typing JSON
gmlst utils extract -i typing_results.json --novel-allele --novel-profile --data-dir novel
- TSV fallback for novel allele extraction (re-typing mode)
gmlst utils extract -i typing_results.tsv -s ecoli_1 --novel-allele --novel-profile \
--samples-dir ./samples --data-dir novel
Key options:
- -i, --input PATH (required)
- -s, --scheme TEXT (required for allele extraction and TSV fallback re-typing)
- -p, --provider TEXT
- --allele TEXT
- -b, --backend TEXT
- --novel-allele
- --novel-profile
- --data-dir PATH
- --samples-dir DIRECTORY (TSV fallback with --novel-allele)
- --cache-dir PATH
utils concat
gmlst utils concat -i genome_mlst.fasta [-o genome_mlst_concat.fasta]
Behavior:
- Concatenates input FASTA records in order into one FASTA sequence.
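The concatenation behavior amounts to joining record sequences in input order. A minimal Python sketch follows; the concat_fasta helper and the output header name are assumptions, and the header gmlst actually writes is not documented here.

```python
def concat_fasta(text, header="concatenated"):
    """Join all FASTA records, in input order, into a single record."""
    parts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and record headers
        parts.append(line)
    return ">{}\n{}\n".format(header, "".join(parts))
```

Record order matters: swapping two input records changes the concatenated sequence.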
utils check
gmlst utils check -b blastn
Behavior:
- Runs backend dependency check and reports availability.
- Exits with non-zero status if dependency is missing.
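A dependency check of this kind can be approximated with a PATH lookup. The Python sketch below is an assumption about the general approach, not gmlst's check: the helper names are hypothetical, and the real command may also verify versions or companion tools.

```python
import shutil


def backend_available(executable):
    """Return True if the executable can be found on PATH."""
    return shutil.which(executable) is not None


def check_backend(executable):
    """Return a process-style exit code: 0 if available, 1 otherwise."""
    if backend_available(executable):
        print(f"{executable}: available")
        return 0
    print(f"{executable}: not found on PATH")
    return 1
```

A wrapper script could pass the return value to sys.exit() so that a missing dependency fails a pipeline step early.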
utils benchmark
gmlst utils benchmark [OPTIONS] SAMPLES...
Options:
- -s, --scheme TEXT (required)
- -b, --backends TEXT
- -r, --repeat INTEGER
- -f, --format [table|tsv|json]
- --cgmlst-gate
- --gate-max-mismatches INTEGER
- --gate-details-output PATH
- --gate-details-format [jsonl|tsv]
- -o, --output PATH
- --cache-dir PATH
- --force-reindex
- -h, --help
visual
gmlst visual [OPTIONS] COMMAND [ARGS]...
Subcommands:
- web - start local HTTP app for minimal-spanning-tree visualization
visual web
gmlst visual web [OPTIONS]
Options:
- --host TEXT (default: 127.0.0.1)
- --port INTEGER (default: 8787)
- --open-browser
Usage:
gmlst visual web --open-browser
Then paste or upload a cgMLST TSV file in the web UI and click Build MST.
Implementation:
- Backend: Flask routes (/, /health, /api/mst)
- Frontend: Vue 3 app built by Vite and served as static assets (gmlst/web/frontend -> gmlst/web/static/visual/dist)
Behavior:
- Builds an MST from profile distances (per-locus allele differences).
- Supports missing-token penalty toggle (LNF, NIPH, NIPHEM, etc.).
- Supports two layouts in UI: tree and radial.
- Supports metadata-based node coloring (from TSV metadata columns).
- Supports SVG export from the UI.
- Accepts both gmlst TSV and GrapeTree-style profiles (#Strain first column).
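The distance and tree-building steps can be sketched in a few lines of Python. This is an illustration under assumptions rather than the /api/mst implementation: the missing-token set, the reading of the penalty toggle (missing calls either count as one difference or are skipped), and the use of Prim's algorithm are all choices made for this sketch.

```python
# Assumed missing tokens; the list in the web UI may be longer.
MISSING_TOKENS = {"LNF", "NIPH", "NIPHEM", "", "-"}


def profile_distance(a, b, penalize_missing=False):
    """Count loci whose allele calls differ between two profiles.

    A locus where either call is a missing token is skipped, unless
    penalize_missing is True, in which case it counts as one difference.
    """
    distance = 0
    for x, y in zip(a, b):
        if x in MISSING_TOKENS or y in MISSING_TOKENS:
            distance += 1 if penalize_missing else 0
        elif x != y:
            distance += 1
    return distance


def build_mst(profiles, penalize_missing=False):
    """Prim's algorithm over the complete pairwise-distance graph.

    profiles maps sample name -> list of allele calls in the same locus
    order. Returns a list of (parent, child, distance) edges.
    """
    names = list(profiles)
    if not names:
        return []
    root = names[0]
    # best[n] = (distance to the current tree, nearest node in the tree)
    best = {
        n: (profile_distance(profiles[root], profiles[n], penalize_missing), root)
        for n in names[1:]
    }
    edges = []
    while best:
        child = min(best, key=lambda n: best[n][0])
        dist, parent = best.pop(child)
        edges.append((parent, child, dist))
        # Relax distances of the remaining nodes against the new tree node.
        for n in best:
            d = profile_distance(profiles[child], profiles[n], penalize_missing)
            if d < best[n][0]:
                best[n] = (d, child)
    return edges
```

Turning the penalty on generally lengthens edges that touch samples with many missing loci, which can change which samples end up adjacent in the tree.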
For deeper discrepancy-analysis and experimental helper scripts, see internal
docs under docs/internal/.