Datasets
Curated, citable datasets backing SciDEX papers and hypotheses: ML benchmarks, omics tables, and structured cohorts. Until the dedicated `dataset` artifact type ships, the closest mapping is the wiki and paper corpus filtered to dataset-like content.
Browsing datasets.
- dataset
ROSMAP: Religious Orders Study and Memory and Aging Project
CC-BY-NC-4.0 csv 3,500 rows alzheimers
Two longitudinal cohorts of older adults with annual cognitive testing, brain donation, RNA-seq, proteomics, methylation, and detailed neuropathological assessment.
- dataset
Tabula Sapiens — Human Single-Cell Atlas
CC-BY-4.0 h5ad 1,100,000 rows neuroscience
Cross-tissue single-cell transcriptomic atlas of healthy human donors (1.1M cells, 24 organs), used as a reference for cell-type deconvolution.
- dataset
DepMap 24Q4 — Cancer Cell Line Dependency Map
CC-BY-4.0 csv 1,100 rows neurodegeneration
Genome-wide CRISPR knockouts (Chronos) across 1,100+ cancer cell lines plus drug sensitivity panels; a foundational resource for target validation.
- dataset
Allen Brain Atlas — Mouse Adult Brain ISH
CC-BY-4.0 csv 26,000 rows neuroscience
In situ hybridization expression-energy maps for ~20K genes across adult mouse brain regions; the canonical brain-region-specific expression reference.
- dataset
GTEx v10 — Genotype-Tissue Expression
CC-BY-4.0 h5ad 17,382 rows neuroscience
Bulk RNA-seq across 54 human tissues, including 13 brain regions; eQTL maps are used as instruments in Mendelian randomization (MR) analyses.
- dataset
DIAN — Dominantly Inherited Alzheimer Network
CC-BY-NC-4.0 csv 600 rows alzheimers
Autosomal-dominant AD families with longitudinal CSF, plasma, MRI, PET, and cognitive trajectories from pre-symptomatic through dementia stages.
- dataset
UK Biobank — Population Cohort with Genomics + Imaging
CC-BY-NC-4.0 parquet 500,000 rows neurodegeneration
500K participants with genome-wide genotyping, neuroimaging (~50K), biofluid biomarkers, and longitudinal clinical outcomes; primary cohort for late-onset disease genetics.
- dataset
Answer ALS — Multi-omic Profiling of ALS Patients
CC-BY-NC-4.0 h5ad 1,000 rows als
Multi-omic ALS patient profiling with iPSC-derived motor neurons, RNA-seq, ATAC-seq, proteomics, and longitudinal clinical data.
- dataset
PPMI — Parkinson's Progression Markers Initiative
CC-BY-4.0 csv 1,500 rows parkinsons
Longitudinal observational cohort of de novo Parkinson's patients with imaging, biofluid, genetic, clinical, and digital biomarkers.
- dataset
ADNI — Alzheimer's Disease Neuroimaging Initiative
CC-BY-4.0 csv 2,400 rows alzheimers
Longitudinal multisite study with structural MRI, PET, CSF, plasma biomarkers, and cognitive evaluations across cognitively normal, MCI, and AD subjects.