Configuration

All configuration is set through environment variables or a .env file in the project root.

Full Reference

Neo4j (KG construction)

| Variable | Default | Used by |
|---|---|---|
| NEO4J_URI | bolt://localhost:7687 | ingest, ontology build |
| NEO4J_USER | neo4j | ingest, ontology build |
| NEO4J_PASSWORD | | ingest, ontology build |
| NEO4J_DATABASE | neo4j | ingest, ontology build |

LLM

| Variable | Default | Used by / purpose |
|---|---|---|
| LLM_SERVICE | bedrock | ingest, ontology extract |
| LLM_MODEL_ID | | Override default model |
| BEDROCK_MODEL_ID | us.meta.llama3-1-8b-instruct-v1:0 | Bedrock default |
| BEDROCK_REGION | us-east-1 | Bedrock API region |
| OLLAMA_API_URL | http://localhost:11434 | Ollama base URL |

PubMed / PMC

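| Variable | Default | Used by / purpose |
|---|---|---|
| ENTREZ_EMAIL | bioingest@example.com | ingest pubmed, ingest pmc |
| ENTREZ_API_KEY | | Optional; raises the NCBI E-utilities rate limit to 10 requests per second (3 per second without a key) |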

AWS (Publish)

| Variable | Default | Used by |
|---|---|---|
| AWS_PROFILE | dsinternal | publish |
| AWS_REGION | eu-north-1 | publish |

Scraping

| Variable | Default | Used by |
|---|---|---|
| FIRECRAWL_API_KEY | | scrape commands |

Data Directory

| Variable | Default | Used by |
|---|---|---|
| BIOINGEST_DATA_DIR | ./data | All commands |
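
The defaults in the reference above can be summarized as a lookup with fallbacks. The following is a minimal sketch, not bioingest's actual implementation: `DEFAULTS` and `resolve_settings` are hypothetical names, and variables with no documented default are assumed to resolve to `None` when unset.

```python
import os

# Defaults copied from the reference tables in this page.
# Variables without a documented default map to None (an assumption).
DEFAULTS = {
    "NEO4J_URI": "bolt://localhost:7687",
    "NEO4J_USER": "neo4j",
    "NEO4J_PASSWORD": None,
    "NEO4J_DATABASE": "neo4j",
    "LLM_SERVICE": "bedrock",
    "LLM_MODEL_ID": None,
    "BEDROCK_MODEL_ID": "us.meta.llama3-1-8b-instruct-v1:0",
    "BEDROCK_REGION": "us-east-1",
    "OLLAMA_API_URL": "http://localhost:11434",
    "ENTREZ_EMAIL": "bioingest@example.com",
    "ENTREZ_API_KEY": None,
    "AWS_PROFILE": "dsinternal",
    "AWS_REGION": "eu-north-1",
    "FIRECRAWL_API_KEY": None,
    "BIOINGEST_DATA_DIR": "./data",
}


def resolve_settings(environ=None):
    """Return every documented variable, falling back to its default.

    `environ` is any mapping of variable names to values; it defaults to
    the real process environment.
    """
    env = os.environ if environ is None else environ
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}
```

Calling `resolve_settings({"NEO4J_DATABASE": "olink3"})` would yield `olink3` for the database and the documented defaults for everything else, which makes it easy to spot unset secrets such as `NEO4J_PASSWORD` before running a command.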

Example .env

# KG construction
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=secret
NEO4J_DATABASE=olink3

# LLM
LLM_SERVICE=bedrock
BEDROCK_REGION=us-east-1

# PubMed
ENTREZ_EMAIL=your@email.com
ENTREZ_API_KEY=your-key

# AWS
AWS_PROFILE=dsinternal
AWS_REGION=eu-north-1

# Scraping
FIRECRAWL_API_KEY=fc-your-key
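
A file in this format could be read with a few lines of Python. This is a minimal sketch under stated assumptions, not the loader bioingest ships: `parse_dotenv` and `apply_dotenv` are hypothetical helpers, and the rule that already-exported environment variables take precedence over the file is an assumption that this page does not confirm.

```python
import os


def parse_dotenv(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may contain '='.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def apply_dotenv(path=".env", environ=None):
    """Copy values from a .env file into `environ` without overwriting."""
    env = os.environ if environ is None else environ
    try:
        with open(path, encoding="utf-8") as handle:
            parsed = parse_dotenv(handle.read())
    except FileNotFoundError:
        return env  # no .env file: rely on the environment alone
    for key, value in parsed.items():
        # Assumption: variables already set in the environment win.
        env.setdefault(key, value)
    return env
```

Using `setdefault` keeps the file as a fallback, so a one-off `NEO4J_DATABASE=test bioingest ...` style override in the shell would still apply under this assumed precedence.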

CLI Overrides

Most config can be overridden via CLI flags:

bioingest ingest pubmed -q "..." --service local --database mydb
bioingest ontology build --uri bolt://remote:7687 --user admin --password pass
bioingest publish --profile other-profile --region us-west-2
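
One common way to implement this kind of override is to use the environment value as each flag's default, so that an explicit flag wins, then the environment, then the documented default. The sketch below illustrates that pattern with `argparse` for the `--uri`, `--user`, and `--password` flags shown above; `build_parser` is a hypothetical name, and the real CLI may be wired differently.

```python
import argparse
import os


def build_parser(environ=None):
    """Build a parser whose flag defaults come from the environment."""
    env = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(prog="ontology-build-sketch")
    # Precedence (assumed): CLI flag > environment variable > default.
    parser.add_argument("--uri", default=env.get("NEO4J_URI", "bolt://localhost:7687"))
    parser.add_argument("--user", default=env.get("NEO4J_USER", "neo4j"))
    parser.add_argument("--password", default=env.get("NEO4J_PASSWORD"))
    return parser
```

With `NEO4J_USER=alice` in the environment, parsing `["--uri", "bolt://remote:7687"]` would give the remote URI from the flag, `alice` from the environment, and no password, since neither source supplies one.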