AnVIL Makes Groundbreaking Genomic Datasets Available on AWS for Free
NHGRI's AnVIL platform has partnered with Amazon Web Services (AWS) to make major genomic datasets available at no cost, eliminating data transfer fees that previously exceeded $15,000 for some datasets.
The initiative removes a financial barrier that kept many researchers, particularly those with limited funding, from accessing critical genomic data.
Available Datasets
All open-access AnVIL datasets are currently available through this program:
- AnVIL 1000G PRIMED Data Model
- AnVIL 1000G High Coverage 2019
- AnVIL GTEx Public Data
- AnVIL HPRC
- AnVIL NIA CARD Coriell Cell Lines Open
- AnVIL T2T
- AnVIL T2T ChrY
- AnVIL ENCORE 293T
- AnVIL ENCORE RS293
- AnVIL IGVF Mouse R1
- AnVIL MAGE
Additional datasets and updates to existing datasets will be made available as they are released by NHGRI's AnVIL Project.
Accessing the Free Datasets
Free datasets can be browsed in the AnVIL Data Explorer by filtering for open-access consent groups.
Data can be downloaded using any of the following methods:
- Dataset Download via curl: Download a complete dataset using the curl command, or use curl to download files of selected types across multiple datasets simultaneously.
- TSV File Manifest: Export a manifest containing the S3 URI for each file.
- Individual File Download: In the Explorer's Files tab, use the download icon to select individual files.
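For readers who choose the TSV manifest route, the following is a minimal Python sketch of how such a manifest might be processed. The actual column layout of an AnVIL manifest is not described here, so the sketch does not rely on a column name: it scans every cell and keeps values that begin with "s3://". The sample manifest, its column headers, and the bucket name are all made up for illustration.

```python
import csv
import io
from urllib.parse import urlparse

# Hypothetical manifest content: the column names and bucket are invented
# for illustration and are not taken from a real AnVIL export.
SAMPLE_MANIFEST = (
    "file_name\tfile_size\ts3_uri\n"
    "sample1.vcf.gz\t1024\ts3://example-bucket/data/sample1.vcf.gz\n"
    "sample2.vcf.gz\t2048\ts3://example-bucket/data/sample2.vcf.gz\n"
)


def read_s3_uris(manifest_text):
    """Return every cell in a TSV manifest that looks like an S3 URI."""
    reader = csv.DictReader(io.StringIO(manifest_text), delimiter="\t")
    uris = []
    for row in reader:
        for value in row.values():
            # Skip missing cells (None) and overflow cells (lists).
            if isinstance(value, str) and value.startswith("s3://"):
                uris.append(value)
    return uris


def split_s3_uri(uri):
    """Split 's3://bucket/key' into a (bucket, key) pair."""
    parsed = urlparse(uri)
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    for uri in read_s3_uris(SAMPLE_MANIFEST):
        bucket, key = split_s3_uri(uri)
        print(bucket, key)
```

Each (bucket, key) pair can then be passed to whichever S3 client you prefer. To read an exported manifest from disk, pass the file's text to read_s3_uris in place of the sample string.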