Requesting Data Access


Data Access Types

AnVIL provides three types of data access:

  1. Open Access - Open access datasets are accessible to all in the AnVIL Data Explorer.
  2. Controlled Access - Controlled Access datasets are accessible via the AnVIL Data Explorer to researchers approved for the data's dbGaP consent codes. Access is granted via dbGaP or DUOS within the Data Explorer using the process outlined below.
  3. Consortium Access - Consortium Access datasets are accessible to consortia members under the consortium data sharing agreement.

Accessing Controlled Access Data

This document outlines the process by which external, non-consortium members can gain access to a given cohort housed within the AnVIL. To learn more about AnVIL data and best practices for working with AnVIL data, see Finding and using data in AnVIL.

Goals

This article shows a novice user how to:

  1. Set up a Terra account with Google billing and data access via a linked eRA Commons address.
  2. Find data in the AnVIL Data Explorer.
  3. Submit a Data Access Request (DAR) within the AnVIL Data Explorer using either dbGaP or DUOS.
  4. Gain access to the data (with a valid, approved DAR).

1. Set up billing and access in Terra

AnVIL recommends storing and analyzing data within the AnVIL ecosystem on the Terra platform. The step-by-step instructions (linked below) include guidance about setting up an account and billing in Terra.

Linking Your Terra Account And Your eRA Commons Address

  1. You must have an eRA Commons or NIH account to access controlled AnVIL data. Go here for instructions to set up an eRA Commons or NIH account.
  2. Establish a link in Terra to your eRA Commons/NIH account. To link an eRA Commons account to your Terra account, go to your Profile page in Terra and log in with your NIH credentials. (Note: You will need to relink these accounts once per month to keep your access active.)

Instructions

  • For step-by-step instructions on how to set up an account in Terra, including how to set up and link Google billing to cover the costs of cloud resources and how to link your dbGaP authorization for controlled data, see Part 1: Set up billing and access in Terra.

2. Find data and request access in the AnVIL Data Explorer

The AnVIL Data Explorer provides faceted search and selection of released AnVIL data, including by dataset, donor, biosample, and file formats. You can request access through dbGaP or DUOS right from the Data Explorer.

Instructions

  • For step-by-step instructions on how to search and filter data and explore studies in the AnVIL Data Explorer, see Part 2 in Terra Support.
  • For details on how to check/request access to controlled data in the AnVIL Data Explorer, click here.
  • Is the dataset not listed in the AnVIL Data Explorer?
    If a desired dataset is not in the AnVIL Data Explorer, note that recently released data may not yet be indexed there. For step-by-step instructions on how to use DUOS to find and access your data, see How to access GTEx data in Terra. This documentation uses GTEx data as an example, but the same steps apply to any data stored in TDR and accessed via DUOS, including AnVIL data that is not yet available in the AnVIL Data Explorer.

Next steps: Analyzing AnVIL data

Once you have access to controlled data, you can analyze it in Terra to keep it within the AnVIL ecosystem.

Instructions

Accessing Consortium Access Data

Many consortia have data-sharing agreements between members, granting each member access to every other member's data within the consortium.

The AnVIL offers a streamlined access process for members of data-sharing consortia. See Consortia Access guidelines for more details and instructions.

The consortium contributing the data designates a contact person, and that person is added to an access list as an access control admin for the consortium's datasets.

The admin can add or remove users as their needs demand, and any users added to that list will see all the workspaces for their group.

For example, someone added to the CCDG access list will be able to see all the CCDG workspaces.

Participating Consortia

The following consortia currently participate in AnVIL’s consortia data-sharing program:

If you are a member of a participating consortium and would like access to a consortium’s data, please reach out to your consortium leadership to request access.

Requester Pays

  • All AnVIL buckets have Requester Pays enabled, meaning that you will need to provide a billing account in order to cover any costs associated with egress, storage, or compute.
  • If you are working in gsutil, you must pass the -u option with the ID of the Google Cloud project to bill for the request.
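
As an illustration of the -u option, the following is a minimal Python sketch that assembles a gsutil copy command for a Requester Pays bucket. The helper name build_requester_pays_cp, the project ID my-billing-project, and the bucket path are hypothetical placeholders, not AnVIL-provided values; the only gsutil behavior relied on is that -u is a top-level option placed before the subcommand.

```python
import shlex
from typing import List


def build_requester_pays_cp(billing_project: str, source_uri: str, destination: str) -> List[str]:
    """Return the argument list for a gsutil copy billed to `billing_project`.

    Hypothetical helper for illustration; it only builds the command and
    does not contact Google Cloud.
    """
    if not billing_project:
        raise ValueError("a billing project is required for Requester Pays buckets")
    if not source_uri.startswith("gs://"):
        raise ValueError("source_uri must be a gs:// URI")
    # -u is a top-level gsutil option, so it goes before the subcommand (cp).
    return ["gsutil", "-u", billing_project, "cp", source_uri, destination]


# Placeholder values (assumptions): substitute your own project ID and bucket path.
command = build_requester_pays_cp(
    "my-billing-project",
    "gs://example-anvil-bucket/path/to/file.vcf.gz",
    ".",
)

# Print the command as a single shell-quoted line for review before running it.
print(shlex.join(command))
```

To run the command rather than print it, the same list could be passed to subprocess.run(command, check=True), assuming gsutil is installed and authenticated. Building the arguments as a list, rather than concatenating a string, avoids quoting problems when paths contain spaces.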

Troubleshooting

If you are having trouble with your access to AnVIL data, please email our help desk at help@lists.anvilproject.org, and someone will reach out to you as soon as we are able.

