GaiaLab — AI-Powered Biological Intelligence Platform

Drug Repurposing Engine

Six-factor weighted scoring — target overlap, clinical evidence, mechanism alignment, pathway relevance, safety profile, and disease context — ranks FDA-approved and investigational agents across Tier I, II, and III. AlphaFold pLDDT-gated structural druggability provides an orthogonal signal. CIViC and OncoKB evidence directly calibrate confidence. Precision@10 = 100% across 22 disease areas.
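As a rough illustration of how a six-factor weighted score could be combined into the 0–100 scale, here is a minimal Python sketch. The weights, the factor key names, and the function names are hypothetical; the page names the six factors but does not publish their weights or the tier cut-offs.

```python
# Hypothetical weights: the six factor names come from the page,
# but the numeric weights below are illustrative assumptions.
FACTOR_WEIGHTS = {
    "target_overlap": 0.25,
    "clinical_evidence": 0.25,
    "mechanism_alignment": 0.20,
    "pathway_relevance": 0.10,
    "safety_profile": 0.10,
    "disease_context": 0.10,
}


def weighted_score(factors: dict) -> float:
    """Combine six factor values, each in [0, 1], into a 0-100 score."""
    missing = set(FACTOR_WEIGHTS) - set(factors)
    if missing:
        raise ValueError(f"missing factors: {sorted(missing)}")
    total = 0.0
    for name, weight in FACTOR_WEIGHTS.items():
        value = factors[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
        total += weight * value
    return round(100.0 * total, 1)


def rank_candidates(candidates: dict) -> list:
    """Return (drug, score) pairs sorted from highest to lowest score."""
    scored = [(drug, weighted_score(f)) for drug, f in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A linear combination keeps each factor's contribution easy to audit, which fits the page's emphasis on transparent scoring context.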

Prediction Accountability

Every therapeutic candidate is logged prospectively with a timestamp and confidence score, then automatically cross-referenced against ClinicalTrials.gov. Results are published at /validation with no login required. No comparable open-access drug repurposing platform makes this data public. Current dataset: 7,652 predictions · 77% directional alignment · AUROC 0.545.

CIViC + OncoKB Clinical Evidence

Each gene is cross-referenced against CIViC (peer-reviewed clinical variant evidence, Levels A–E, Washington University) and OncoKB (FDA-recognized precision oncology biomarker levels from Memorial Sloan Kettering — Level 1 companion diagnostics through Level R2 resistance). Drug associations, AMP/ACMG tiers, and oncogene/tumor-suppressor classifications surface automatically. No token required for CIViC; optional token for OncoKB.

Mechanism Derivation

Checkpoint and resistance genes are mapped to immunotherapy escape mechanisms through pathway enrichment, protein interaction topology, and cross-validated source agreement. All mechanistic assignments derive from structured database outputs — not generative text synthesis.

Interaction Network

Force-directed protein–protein interaction graph with temporal overlay. Surfaces hub centrality, computationally predicted edges, and topological context aggregated across multiple curated interaction databases.

Evidence Ledger

Per-claim PMID-linked evidence trail spanning pathways, therapeutic signals, and mechanistic hypotheses. Each citation is assigned a polarity classification — supporting, contradicting, or mixed — with full traceability to the primary source.

Immutable Snapshots

Every analysis run is captured as a tamper-evident snapshot encoding gene inputs, model versions, scoring parameters, and complete outputs. Snapshots can be diffed against prior runs or replayed independently for methods reproducibility and audit compliance.
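One common way to make a snapshot tamper-evident is to hash a canonical serialization of its contents. The sketch below assumes a JSON payload and SHA-256; the field names and helper functions are hypothetical and are not taken from GaiaLab's code.

```python
import hashlib
import json


def snapshot_digest(snapshot: dict) -> str:
    """SHA-256 over a canonical (sorted-key, compact) JSON encoding."""
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def seal(snapshot: dict) -> dict:
    """Wrap a payload together with its digest."""
    return {"payload": snapshot, "sha256": snapshot_digest(snapshot)}


def verify(sealed: dict) -> bool:
    """True when the payload still matches the stored digest."""
    return snapshot_digest(sealed["payload"]) == sealed["sha256"]


def diff_payloads(old: dict, new: dict) -> dict:
    """Top-level keys whose values differ, as {key: (old, new)}."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}
```

Sorting keys before hashing means two runs with identical content produce the same digest regardless of dictionary ordering, so a changed digest signals a real change in inputs, parameters, or outputs.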

Credibility Gating

Evidence Depth Score (EDS), Contention Index (CI), and grounded ratio collectively gate every output. Claims below quality thresholds are flagged or suppressed with transparent, auditable rationale — not silently discarded.
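A gate of this kind can be sketched as a function that returns both a decision and the reasons behind it. All thresholds below, the assumed [0, 1] scale for the Contention Index, and the rule that one failure flags while two or more suppress are illustrative assumptions; the page does not publish the actual cut-offs.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds for illustration only.
MIN_EDS = 40.0            # Evidence Depth Score, 0-100
MAX_CONTENTION = 0.5      # Contention Index, assumed to lie in [0, 1]
MIN_GROUNDED_RATIO = 0.7  # share of claims backed by a citation


@dataclass
class GateResult:
    decision: str                      # "pass", "flag", or "suppress"
    reasons: list = field(default_factory=list)


def credibility_gate(eds: float, contention: float, grounded_ratio: float) -> GateResult:
    """Apply the three checks and keep a readable rationale for each failure."""
    reasons = []
    if eds < MIN_EDS:
        reasons.append(f"EDS {eds:.0f} below {MIN_EDS:.0f}")
    if contention > MAX_CONTENTION:
        reasons.append(f"contention {contention:.2f} above {MAX_CONTENTION:.2f}")
    if grounded_ratio < MIN_GROUNDED_RATIO:
        reasons.append(f"grounded ratio {grounded_ratio:.2f} below {MIN_GROUNDED_RATIO:.2f}")
    if not reasons:
        decision = "pass"
    elif len(reasons) == 1:
        decision = "flag"
    else:
        decision = "suppress"
    return GateResult(decision, reasons)
```

Returning the reasons alongside the decision is what makes the outcome auditable rather than a silent discard.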

16-Channel Evidence Consensus

Every conclusion is cross-validated across 16 independent data channels — genomics, protein structure, pathway enrichment, literature, drug bioactivity, clinical trials, disease association, interaction networks, expression, safety, and more. Channel agreement elevates confidence; divergence triggers a contradiction flag and downgrades the claim. No single source drives a conclusion.

Structured Export

Analysis results are exportable as JSON evidence packages or formatted briefs. Each export includes scoring context, PMID citations, contradiction annotations, and complete model configuration metadata for methods-section reproducibility.

Adaptive Learning

Evidence-driven recalibration loop: prediction outcomes are verified against ClinicalTrials.gov, hypothesis outputs are cross-checked against PubMed, and calibration drift is detected and corrected automatically. Confidence estimates improve with each completed analysis cycle.

TCGA Survival Stratification

Kaplan-Meier overall survival curves stratified by mutation status across 15 TCGA cancer cohorts — BRCA, LUAD, GBM, PAAD, and 11 additional. Log-rank p-value, hazard ratio, and median overall survival are returned in seconds via the cBioPortal public API. No institutional subscription required.
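For readers unfamiliar with the estimator, the following pure-Python sketch computes a Kaplan-Meier curve and median survival for a single group. It is a textbook implementation under stated assumptions, not GaiaLab's code; the log-rank test and hazard ratio are omitted, and the function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up durations (e.g. months)
    events : 1 if death was observed at that time, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    order = sorted(zip(times, events))
    at_risk = len(order)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share this follow-up time.
        while i < len(order) and order[i][0] == t:
            deaths += order[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

In a stratified analysis the same function would be applied separately to the mutated and wild-type groups, and the two curves compared.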

Hypothesis Feedback Loop

Researchers can submit confirmed, refuted, partial, or inconclusive outcomes directly from the results page. GaiaLab aggregates submissions into a per-decile calibration curve and applies a recalibration multiplier at the next server start — closing an active learning loop not available on any comparable open platform.
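A per-decile calibration curve can be sketched as follows. The mapping of outcome labels to numeric values (partial counted as 0.5, inconclusive excluded), the minimum sample count, and the definition of the multiplier as observed rate divided by mean confidence are all assumptions made for illustration.

```python
# Hypothetical mapping from submitted outcome labels to numeric values.
OUTCOME_VALUE = {"confirmed": 1.0, "partial": 0.5, "refuted": 0.0}


def decile_calibration(records):
    """records: iterable of (confidence in [0, 1], outcome label).

    Returns {decile: (mean_confidence, observed_rate, n)}.
    Labels not in OUTCOME_VALUE (e.g. "inconclusive") are skipped.
    """
    buckets = {}
    for confidence, label in records:
        if label not in OUTCOME_VALUE:
            continue
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"confidence out of range: {confidence}")
        decile = min(int(confidence * 10), 9)
        buckets.setdefault(decile, []).append((confidence, OUTCOME_VALUE[label]))
    curve = {}
    for decile, items in sorted(buckets.items()):
        n = len(items)
        mean_conf = sum(c for c, _ in items) / n
        rate = sum(v for _, v in items) / n
        curve[decile] = (mean_conf, rate, n)
    return curve


def recalibration_multipliers(curve, min_count=5):
    """Observed rate / mean confidence per decile; 1.0 when data are sparse."""
    multipliers = {}
    for decile, (mean_conf, rate, n) in curve.items():
        if n < min_count or mean_conf == 0:
            multipliers[decile] = 1.0
        else:
            multipliers[decile] = rate / mean_conf
    return multipliers
```

A multiplier below 1.0 for a decile indicates that predictions in that confidence band were confirmed less often than their stated confidence implied.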

Living Knowledge Graph Alerts

Each analysis is compared against prior conclusions in the knowledge graph for the same disease context. A contradiction alert is raised when a therapeutic score shifts more than 20 points relative to prior runs. Weekly digests surface the most materially changed conclusions before teams proceed to publication — persistent institutional memory that updates in real time.
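The 20-point rule can be expressed as a simple comparison between score tables. This sketch assumes each run is reduced to a mapping from drug name to a 0–100 therapeutic score and that the comparison is against a single prior reference score per drug; both are illustrative simplifications.

```python
ALERT_THRESHOLD = 20.0  # from the page: a shift of more than 20 points


def contradiction_alerts(prior, current, threshold=ALERT_THRESHOLD):
    """Return (drug, prior_score, current_score, delta) for large shifts.

    prior, current: dicts mapping drug name -> score on a 0-100 scale.
    Only drugs present in both runs are compared.
    """
    alerts = []
    for drug in sorted(set(prior) & set(current)):
        delta = current[drug] - prior[drug]
        if abs(delta) > threshold:
            alerts.append((drug, prior[drug], current[drug], delta))
    return alerts
```

Because the comparison is strict, a shift of exactly 20 points does not raise an alert, matching the "more than 20 points" wording.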

CIViC Clinical Evidence Integration

Each gene is cross-referenced against CIViC (Clinical Interpretation of Variants in Cancer) — the peer-reviewed, community-curated database maintained by Washington University in St. Louis. Evidence levels A (validated association) through E (inferential), AMP/ACMG tier, and therapeutic drug associations are returned per variant and per gene.

MSigDB Hallmark Enrichment

Gene panels are enriched against the MSigDB Hallmark collection (50 curated cancer hallmark gene sets), KEGG 2021 Human, and WikiPathways via the Enrichr API. Statistical significance is assessed with Benjamini-Hochberg FDR correction, surfacing which canonical oncogenic programs are active beyond what any single-collection enrichment reports.
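The Benjamini-Hochberg adjustment itself is short enough to show in full. The sketch below is the standard step-up procedure applied to a list of raw p-values; it is independent of Enrichr, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values
```

For example, the p-values 0.01, 0.04, 0.03, and 0.005 adjust to q-values 0.02, 0.04, 0.04, and 0.02, so all four gene sets would pass a 5% FDR threshold.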

OncoKB Precision Oncology

When an API token is configured, each gene is queried against OncoKB — the FDA-recognized precision oncology knowledge base from Memorial Sloan Kettering Cancer Center. Level 1 biomarkers (FDA-approved companion diagnostics), Level 2 (standard of care), and Level R1/R2 (resistance markers) are returned alongside oncogene versus tumor suppressor classification.

Your analysis, delivered in under 60 seconds

🔬

Enriched Pathways

Top pathways ranked by hypergeometric p-value, with Benjamini-Hochberg FDR-adjusted q-values

💊

Scored Therapeutic Candidates

FDA-approved and investigational agents scored 0–100 by target match, clinical evidence, and mechanism alignment. Precision@10 = 100% · AUROC 0.545 across 22 disease areas

🧠

Mechanistic Hypotheses

Debate-refined hypotheses with calibrated confidence scores and testable experimental designs

📋

PMID Evidence Ledger

Every claim traced to primary PubMed citations with supporting, contradicting, or mixed polarity classification

🔁

Self-Calibrating Confidence

Predictions validated against ClinicalTrials.gov · hypotheses cross-checked against PubMed · calibration updated automatically

Export formats: JSON evidence package · PDF brief · CSV drug table · Reproducible snapshot for methods sections

Signal Credibility Metrics — live from the most recent analysis run

Citation Coverage: 94%
Grounded Ratio: 87%
Contradiction Rate: 2.1%
Evidence Depth Score: 62/100

Live metrics from most recent run · Run an analysis to refresh · Full methodology →

Independently validated · published openly · no login required · gaialabai.com/validation

7,652

drug predictions logged and validated against ClinicalTrials.gov

77%

of predicted research directions matched an active or completed clinical trial

100%

Precision@10 — every top-ranked prediction matched a clinical trial across 22 disease areas

0.545

AUROC vs. 0.500 random baseline · bootstrap 95% CI: 0.526–0.562

No other open-access drug repurposing platform publishes this data. Raw JSON: /api/predictions/calibration · Benchmark source: scripts/benchmark-auroc.js in the public repository.
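For reference, the three reported metrics can be computed as in the sketch below. This is a generic Python illustration, not a port of scripts/benchmark-auroc.js: Precision@K, a rank-based (Mann-Whitney) AUROC with tie handling, and a percentile bootstrap confidence interval. Function names, the resample count, and the seed are assumptions.

```python
import random


def precision_at_k(ranked_labels, k=10):
    """Fraction of the top-k ranked predictions labelled 1 (matched)."""
    top = ranked_labels[:k]
    if not top:
        raise ValueError("no predictions to evaluate")
    return sum(top) / len(top)


def auroc(scores, labels):
    """Rank-based AUROC; tied scores receive their average rank."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC."""
    auroc(scores, labels)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auroc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(int((1 - alpha / 2) * n_boot), n_boot - 1)]
    return lower, upper
```

Precision@10 looks only at the head of the ranking, while AUROC measures discrimination across the full list, which is why the two figures can differ widely on the same data.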

Platform

The only open drug repurposing engine that publishes its accuracy.

GaiaLab converts a gene list into scored therapeutic candidates in under 60 seconds — grounded in 40 live biological databases including CIViC clinical variant evidence, OncoKB FDA-recognized biomarkers, MSigDB Hallmark enrichment, AlphaFold structural druggability, DepMap CRISPR essentiality, and TCGA survival stratification. Six specialized AI agents independently evaluate and debate every conclusion. Every drug prediction is logged prospectively and validated against ClinicalTrials.gov — the accuracy data is publicly available. No subscription. No account. No waiting.

Houston, TX · Applied to glioblastoma, AML, Alzheimer's disease, breast cancer, pancreatic cancer, and more · Research use only · partnerships@gaialabai.com

Live biological databases queried per analysis

P@10=100%

Top-10 prediction accuracy across 22 disease areas

Cost to run your first analysis

Background

Built in Houston, TX to give translational research teams the analytical depth of large pharmaceutical informatics groups — without the institutional cost, turnaround time, or reproducibility gaps of manual literature synthesis. Applied to GBM, AML, Alzheimer's disease, breast cancer, and pancreatic cancer.

Mission

Make drug repurposing intelligence accessible without a subscription, institutional login, or computational background. A graduate student in Lagos, a biotech founder in Houston, and a pharma team in Basel should run the same analysis in the same 60 seconds — and receive identical, transparent accuracy data.

Guiding Principles

Accuracy-first: AUROC, Precision@K, and calibration curves are published openly. No comparable open platform does this.
Reproducibility: Every run produces an immutable snapshot — inputs, model versions, scoring parameters, and complete outputs — for independent verification.
Honest limitations: LLM-assisted synthesis, not trained prediction models. Public APIs, not proprietary databases. Transparent about both.

Methods & Scoring

Confidence tiers are derived from cross-source agreement, study design classification, and citation depth across 40 databases.
Evidence polarity scoring identifies contradictions and contention where published data diverges.
Per-claim evidence ledger with PMID traceability, polarity classification (supporting/contradicting/mixed), and full scoring context.
Immutable run snapshots capture database versions, model configuration, credibility gate outcomes, and all scored outputs.

Inspect a Sample Snapshot

Download a complete audit snapshot containing evidence packages, scoring context, data sources, and model configuration metadata.

Download sample snapshot · Load melanoma IO panel

Includes reproducible gene inputs, data source versions, and full model configuration details.

Melanoma IO Resistance Panel

Oncology reference panels

Internal demos / reference panels

Anti-PD-1 Resistance Audit: Melanoma

10-gene IO resistance panel: PDCD1, CD274, CTLA4, LAG3, HAVCR2, PTEN, B2M, JAK1, STK11, BRAF
IO Response Score + TCGA SKCM mutation frequencies (n=440) queried live from cBioPortal
Export a reproducible JSON evidence package with per-claim PMID traceability

Breast Cancer Panel

TP53, BRCA1, EGFR analyzed in breast cancer disease context
Inspect pathway enrichment rankings and grounded ratio
Diff against a prior snapshot for run-to-run stability

Colorectal KRAS Panel

KRAS, NRAS, BRAF analyzed in colorectal cancer context
Explore 3D interaction network hub centrality
Review mechanism classifications and therapeutic overlap

IO Responder Profile: Inflamed TME

Inflamed panel: CD8A, CXCL9, CXCL10, PDCD1, LAG3, TIGIT — cytotoxic T-cell infiltration with chemoattractant signature
IO Response Score 100/100 (strong response likelihood) — contrast with 16/100 resistance panel
Identify actionable checkpoints: LAG3 → relatlimab, TIGIT → tiragolumab

Configure Analysis

Enter any gene list and disease context. The pipeline queries 35+ biological databases in parallel — including PubMed, ChEMBL, OpenTargets, ClinicalTrials.gov, OpenFDA, and PharmGKB — then synthesises pathways, therapeutic candidates, mechanistic hypotheses, and a confidence-scored evidence ledger. No account required.

Gene Symbols

Enter 2–15 gene symbols separated by commas
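A client that prepares input for this field might validate it along these lines. The 2–15 limit and comma separator come from the form text; the symbol pattern, the de-duplication, and the function name are assumptions for illustration.

```python
import re

MIN_GENES, MAX_GENES = 2, 15  # limits stated on the form

# Assumed pattern: letters, digits, and hyphens, starting with a letter or digit.
SYMBOL_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9-]*$")


def parse_gene_list(text: str) -> list:
    """Split a comma-separated string into unique gene symbols, in order."""
    symbols = []
    for raw in text.split(","):
        symbol = raw.strip()
        if not symbol:
            continue  # ignore empty entries such as a trailing comma
        if not SYMBOL_RE.match(symbol):
            raise ValueError(f"unexpected characters in symbol: {symbol!r}")
        if symbol not in symbols:
            symbols.append(symbol)
    if not MIN_GENES <= len(symbols) <= MAX_GENES:
        raise ValueError(
            f"expected {MIN_GENES}-{MAX_GENES} gene symbols, got {len(symbols)}"
        )
    return symbols
```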

Disease Context

Be specific — disease, subtype, and mechanism context improve output quality

Output Perspective

Workspace Memory

Optional. Reuse the same workspace ID to track prior runs, contradictions, and changed conclusions over time.

Workspace IDs can stay anonymous, or you can create a protected workspace with invite-based team access.

Include therapeutic signal mapping (drug targets, trials, safety)

Adds DGIdb, ChEMBL, DrugCentral, ClinicalTrials.gov, OpenFDA, and PubChem. Adds ~10s to analysis time.

Clinical Biomarkers · FDA-Approved IO Predictors (optional)

TMB (mut/Mb) · FDA-Approved

≥10 mut/Mb = TMB-H (KEYNOTE-158). ≥20 = very high.
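The two cut-offs in the hint text translate directly into a small classifier. The thresholds are those stated on the form; the label for values under 10 mut/Mb and the function name are assumptions.

```python
def classify_tmb(mut_per_mb: float) -> str:
    """Map tumor mutational burden (mutations per megabase) to a category."""
    if mut_per_mb < 0:
        raise ValueError("TMB cannot be negative")
    if mut_per_mb >= 20:
        return "very high"
    if mut_per_mb >= 10:
        return "TMB-H"
    return "below TMB-H threshold"  # assumed label for values under 10
```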

MSI / MMR Status · FDA-Approved

Options: Not tested · MSS / pMMR · MSI-L · MSI-H / dMMR

PD-L1 Expression · FDA-Approved

CPS (Combined Positive Score)

TPS (Tumor Proportion, %)

CPS≥1 nivolumab eligible · CPS≥10 pembrolizumab preferred · TPS≥50% monotherapy

No account required · Results stream in ~30 seconds · Data not stored beyond your session

Running Analysis

Pipeline stages: Databases · Literature · AI Synthesis · Evidence Gate · Assembly