Storife / medical-datasets

tracking medical datasets, with a focus on medical imaging

List of Medical Datasets

I maintain this list mostly as a personal braindump of interesting medical datasets, with a focus on medical imaging.
Rather than try to group / cluster datasets, I'm going to try to maintain a set of keywords for each.
See commit log for a list of additions over time.
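
The keyword scheme can be used to filter the list programmatically. Below is a minimal sketch, assuming a hypothetical in-memory mapping from dataset name to keyword set; `DATASETS` and `find_by_keywords` are illustrative names and not part of this repo, and the keywords are copied from a few entries in this list.

```python
# Hypothetical representation (an assumption, not a file shipped with this list):
# dataset name -> set of keywords, copied from a handful of entries.
DATASETS = {
    "CheXpert": {"very-large", "X-ray", "labels"},
    "DeepLesion": {"very-large", "CT", "labels"},
    "MRNet": {"large", "MRI", "labels"},
    "fastMRI": {"large", "MRI", "k-space"},
    "SIMON": {"small", "MRI", "longitudinal"},
}


def find_by_keywords(datasets, *wanted):
    """Return the sorted names of datasets tagged with every keyword in `wanted`."""
    required = set(wanted)
    # `required <= tags` is a subset test: all wanted keywords must be present.
    return sorted(name for name, tags in datasets.items() if required <= tags)


if __name__ == "__main__":
    print(find_by_keywords(DATASETS, "MRI"))
    print(find_by_keywords(DATASETS, "very-large", "labels"))
```

Keeping keywords as sets makes "has all of these" a single subset comparison, which is one reason a flat keyword list can be easier to query than a fixed grouping.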

Disclaimer: please remember to solve real clinical problems ☺

Main Medical Imaging List

CheXpert

224,316 chest radiographs of 65,240 patients, with labels from reports
Keywords: very-large, X-ray, labels

ChestXray-NIHCC

100,000 radiographs
Keywords: very-large, X-ray, labels

MIMIC-CXR

371,920 chest x-rays associated with 227,943 imaging studies
3/16/2019: Not yet linked with MIMIC ICU data. See news article
Need to request access
Keywords: very-large, X-ray, labels

PadChest

160,000 images from 67,000 patients that were interpreted and reported by radiologists
labeled with 174 different radiographic findings, 19 differential diagnoses and 104 anatomic locations organized as a hierarchical taxonomy mapped to standard Unified Medical Language System (UMLS)
Keywords: very-large, X-ray, labels

Cancer Image Archive

Several collections
Tons of images of various kinds, including CT, MR, pathology, and PT, with diagnoses
Keywords: very-large, CT, MR, labels

National Lung Screening Trial

Part of Cancer Imaging Archive
50000+ patients with CT data, some pathology, limited availability
Keywords: very-large, CT, labels

DeepLesion

32000+ CT scans with annotations, metadata, and semantic labels from radiological reports
Keywords: very-large, CT, labels

ABCD Neurocognitive Prediction Challenge

MRI for 8500 young (9-10yo) subjects (about 4100 for training)
Keywords: large, MRI

MRNet

1,370 knee MRI exams with diagnosis (healthy/ACL tear/meniscal tear)
Keywords: large, MRI, labels

fastMRI

k-space data
1500 fully sampled knee MRIs and 10K clinical MRIs
Part of a challenge
Keywords: large, MRI, k-space

PREVENT-AD

1704 MRI, 556 amyloid and tau CSF samples, blood markers, genetic info and longitudinal cognitive data on ~400 at risk individuals
Keywords: medium, MRI, genetics, labels

Medical Segmentation Decathlon

10 Medical image datasets with segmentations
2000+ CT & MR images of various organs from different sources
Keywords: medium, MRI, segmentations

MASSIVE

Multiple Acquisitions for Standardization of Structural Imaging Validation and Evaluation
8000 diffusion-weighted volumes
10 3D FLAIR, T1-, and T2-weighted datasets of a single healthy subject
Keywords: large, MRI

MRIdata

List of MRI k-space datasets

Studyforrest

Few subjects, but many modalities (T1, T2, SWI, Angio, DWI, fMRI during Forrest Gump at 3T (audio+visual+eyetracking+physio) and 7T (audio+physio only), some audio tasks, and other important visual tasks)
Keywords: small, multi-modal

Lung Image Database Consortium

LIDC-IDRI consists of diagnostic and lung cancer screening CTs.
1018 cases with some Radiologist Annotations/Segmentations and nodule counts
Also available through LUng Nodule Analysis (LUNA) challenge
Keywords: large, CT, labels

UK Biobank

All imaging
Fundus imaging
Keywords: very-large

ADNI

Various imaging (longitudinal MRI), Genetics, Clinical data
Several thousand patients
Keywords: large, MRI, genetics, clinical

VISCERAL

~120 image volumes (whole body CT and MRI images)
more than 1900 annotated anatomical structures
Keywords: medium, MRI, CT, whole-body, manual-segmentation

Mindboggle

Seems like 101 manually labelled brain MRIs
Keywords: medium, MRI, brain, manual-segmentation

Neuromorphometrics

63 manually labelled brain scans. Costs ($1500?) Discussion
Keywords: medium, MRI, brain, manual-segmentation, costly

Automatic Non-rigid Histological Image Registration

This is a challenge for ISBI2019

7-Tesla rs-fMRI

22 participants with cognitive and physiological measures, and 7T rs-fMRI

SpineWeb

200+ subjects across several datasets (CTs, Xrays, MRIs)

Whole-Heart and Great Vessel Segmentation from 3D Cardiovascular MRI in Congenital Heart Disease

20 cardiac MR images in Congenital Heart Disease

Longitudinal Neuroimaging on arithmetic processing in children

paper
3T fMRI of 132 typically developing children, 2 time points, four tasks
Keywords: medium, fMRI, longitudinal

ATLAS: Anatomical Tracings of Lesions After Stroke

229 T1-weighted MRI scans (n=220) with lesion segmentation
MNI152 standard-space T1-weighted average structural template image
A .csv file containing lesion metadata
paper
Keywords: medium, MRI, segmentations

SIMON

Single volunteer, 73 sessions at multiple sites over ~17 years
MRI, at least T1 at each session, with other modalities varying by session.
Phenotype file provided
Keywords: small, MRI, longitudinal

100 micron MRI of Human Brain

Single volume, ultra-high resolution MRI dataset (100-micron)
Keywords: small, MRI, brain

Brain Catalogue

(ex-vivo) MRIs of brains of different animals
Keywords: small, MRI, brain, animals

Non-imaging

PhysioNet / Computing in Cardiology 2019 Challenge

predict sepsis in an ICU population
5000 ICU patients in three separate hospital systems

eICU-CRD

detailed information about critical care stays for over 200,000 admissions at 200+ hospitals across the US.
With access to MIMIC, can access eICU-CRD immediately after signing an updated DUA.
paper

Non-medical but useful / fun

Moments in Time

Other lists or pooling resources (relevant xkcd)
