Data Sources & Download
Auto-Downloadable Sources (18)
bioingest download-all --all --max-size 2gb
| Source | ID | Format |
|---|---|---|
| UniProt | uniprot | TSV, FASTA, XML |
| Open Targets | opentargets | Parquet |
| Reactome | reactome | TSV |
| ChEMBL | chembl | SQLite, TSV |
| DISEASES 2.0 | diseases | TSV |
| MarkerDB | markerdb | TSV |
| TTD | ttd | TSV, XLS |
| Complex Portal | complex_portal | TSV |
| HPA / Olink | hpa_olink | TSV |
| MeSH | mesh | XML |
| ICD-10-CM | icd | ZIP |
| PDB-KB / SIFTS | pdb_complexes | IDX, TSV.GZ |
| UK Biobank (Pan-UKB) | ukb_disease_assoc | TSV.BGZ |
| SomaLogic SomaScan | somalogic | TSV, SQLite |
| Gene Ontology | gene_ontology | OBO |
| Disease Ontology | disease_ontology | OBO |
| MONDO | mondo | OBO |
| EFO | efo | OBO |
Commands
bioingest download # list all sources
bioingest download uniprot # download default datasets
bioingest download uniprot --list # show available datasets
bioingest download uniprot --all # download everything
bioingest download-all --max-size 500mb # all sources, skip large files
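For scripted runs, the commands above can be assembled from Python. The following is a minimal sketch and not part of bioingest: `build_download_cmd` and `run_download` are hypothetical helper names, and it assumes the `bioingest` executable is on PATH. It also assumes that `--datasets` is passed after the source ID, which this section does not state explicitly.

```python
import subprocess
from typing import List, Optional, Sequence


def build_download_cmd(source_id: str,
                       datasets: Optional[Sequence[str]] = None,
                       everything: bool = False) -> List[str]:
    """Build the argv list for one `bioingest download` call (hypothetical helper)."""
    if datasets and everything:
        raise ValueError("choose either specific datasets or everything, not both")
    cmd = ["bioingest", "download", source_id]
    if everything:
        # Mirrors `bioingest download uniprot --all` from the command list.
        cmd.append("--all")
    elif datasets:
        # ASSUMPTION: --datasets takes space-separated names after the source ID.
        cmd += ["--datasets", *datasets]
    return cmd


def run_download(source_id: str,
                 datasets: Optional[Sequence[str]] = None,
                 everything: bool = False) -> int:
    """Run the command and return its exit code; requires bioingest on PATH."""
    cmd = build_download_cmd(source_id, datasets, everything)
    return subprocess.run(cmd).returncode
```

Building the argument list separately from running it keeps the command easy to inspect or log before anything is downloaded, and passing a list to `subprocess.run` avoids shell quoting issues.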
Features
Resumable — interrupted downloads resume from where they left off
Checksums — SHA-256 recorded in manifest.json per source
Selective — download specific datasets with --datasets X Y
Size limits — skip large files with --max-size
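The recorded checksums make it possible to re-verify files after a download. The sketch below shows one way this could be done with the Python standard library. It is an illustration under stated assumptions, not the tool's own verification code: the layout of manifest.json is not documented in this section, so the `{"files": {name: sha256}}` structure and the helper names `sha256_of` and `verify_source` are hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_source(source_dir: Path) -> Dict[str, bool]:
    """Compare each listed file against the digest recorded in manifest.json.

    ASSUMPTION: the manifest looks like {"files": {"<name>": "<sha256 hex>"}};
    the real bioingest layout may differ and the key names would need adjusting.
    """
    manifest = json.loads((source_dir / "manifest.json").read_text())
    results: Dict[str, bool] = {}
    for name, expected in manifest["files"].items():
        target = source_dir / name
        # A missing file counts as a failed check rather than raising.
        results[name] = target.is_file() and sha256_of(target) == expected.lower()
    return results
```

Any entry mapped to `False` in the returned dictionary points at a file that is missing or whose contents no longer match the recorded digest.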