BioIngest Data Explorer
Browse 28 biomedical data sources, 66 queryable tables, and 3 cross-reference mapping tables.
Click to preview real columns and rows before downloading.
Browse Data
Proteins & Targets
| Dataset | Description |
|---|---|
| uniprot/swissprot_tsv | 570K reviewed proteins (15 cols) |
| uniprot/idmapping_human | UniProt cross-references (5 cols) |
| string/protein_links | Protein-protein interactions (3 cols) |
| interpro/protein2ipr | Protein domain families (6 cols) |
| chembl/uniprot_mapping | ChEMBL target IDs (4 cols) |
| complex_portal/homo_sapiens | Protein complexes (19 cols) |
| markerdb/proteins | 4K biomarker proteins (12 cols) |
Disease & Pathway Associations
| Dataset | Description |
|---|---|
| diseases/knowledge | Disease-gene associations (7 cols) |
| diseases/experiments | Experiment-based associations |
| reactome/UniProt2Reactome | 2.5M pathway memberships (6 cols) |
| ttd/targets | Therapeutic targets & drugs |
| ttd/drugs | Drug-target-indication triples |
| clinvar/variant_summary | Variant-disease significance |
| gtex/median_tissue_tpm | Tissue expression matrix |
Ontologies
| Dataset | Description |
|---|---|
| mondo/mondo | 47K disease terms + hierarchy |
| disease_ontology/doid | 12K Disease Ontology terms |
| efo/efo | 50K experimental factors |
| gene_ontology/go_basic | 45K Gene Ontology terms |
| mesh/descriptors | MeSH medical vocabulary |
| icd/icd10cm | ICD-10-CM diagnosis codes |
Competitors & Assay Platforms
| Dataset | Description |
|---|---|
| competitors_msd/msd_assays | MSD assay specs (LOD, range) |
| competitors_quanterix/assays | Quanterix Simoa specs (8 cols) |
| competitors_alamar/all_targets | Alamar NULISAseq targets (26 cols) |
| competitors_nomic/omni_1000 | Nomic Omni 1000 targets |
| somalogic/somascan_11k | SomaScan 11K menu |
Olink & HPA Proteomics
| Dataset | Description |
|---|---|
| hpa_olink/proteinatlas | HPA full proteomics |
Genomics & Variants
| Dataset | Description |
|---|---|
| gnomad/constraint | Gene constraint (pLI, LOEUF) |
| ensembl/gene_info | Gene annotations (GRCh38) |
| dbsnp/variant_freq | Population variant frequencies |
| ukb_disease_assoc/pan_ukb | UK Biobank GWAS |
Mapping Tables
Cross-reference tables linking all sources through shared identifiers:
| Table | Rows | Cols | What it maps |
|---|---|---|---|
| protein_map | 26,499 | 9 | UniProt ↔ Ensembl ↔ STRING ↔ ChEMBL ↔ HGNC |
| disease_map | 31,884 | 10 | MONDO ↔ EFO ↔ DOID ↔ MeSH ↔ ICD-10 |
| drug_map | 42,939 | 5 | TTD drugs ↔ targets ↔ indications |
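As a rough illustration of how a mapping table can link two sources, the sketch below left-joins rows keyed by an Ensembl gene ID to a toy stand-in for protein_map. The column names (`uniprot_id`, `ensembl_gene_id`, `string_id`, `score`), the helper functions, and every identifier value are hypothetical placeholders chosen for the example. They are not the real schema or real rows, so check the actual column names in the preview before adapting it.

```python
# Toy stand-in for a few protein_map columns. Column names and IDs are
# made-up placeholders, not the real schema or real identifiers.
protein_map = [
    {"uniprot_id": "P00001", "ensembl_gene_id": "ENSG_A", "string_id": "9606.ENSP_A"},
    {"uniprot_id": "P00002", "ensembl_gene_id": "ENSG_B", "string_id": "9606.ENSP_B"},
]

# Toy stand-in for a source table keyed by Ensembl gene ID.
source_rows = [
    {"ensembl_gene_id": "ENSG_A", "score": 0.12},
    {"ensembl_gene_id": "ENSG_C", "score": 0.80},  # has no entry in the map
]


def build_index(mapping_rows, key):
    """Index mapping rows by one identifier column, skipping blank keys.

    If an identifier appears more than once, the last row wins; a real
    mapping may be one-to-many and need a list per key instead.
    """
    return {row[key]: row for row in mapping_rows if row.get(key)}


def join_via_map(rows, mapping_rows, key):
    """Left-join rows to the mapping table on a shared identifier column."""
    index = build_index(mapping_rows, key)
    joined = []
    for row in rows:
        match = index.get(row[key], {})  # empty dict when nothing maps
        joined.append({**match, **row})  # source values override on clashes
    return joined


joined = join_via_map(source_rows, protein_map, "ensembl_gene_id")
for row in joined:
    # Unmapped rows are kept, with no UniProt ID attached.
    print(row.get("uniprot_id"), row["ensembl_gene_id"], row["score"])
```

A left join keeps source rows that have no mapping entry, which makes gaps in identifier coverage visible instead of silently dropping them.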
Don't see what you need?
Request a New Data Source
We add new sources regularly. Tell us what database you'd like and we'll build the connector.