Data Access

Data access guide

Currently, Globus is our preferred mode of data transfer.

Update: We stopped using the iRODS server for data distribution in October 2022.

Check our new data access guide: Accessing data from IGF

Globus based transfer

Imperial College’s Research Data Store (RDS) is now linked to Globus, which allows the following options:

  • Transfer large volumes of data between the RDS, your personal computer and Globus-accessible storage at other institutions
  • Share RDS project allocation data with selected third parties, without requiring them to have a College account (Globus identity required)

Check our new slides for Globus transfer: globus data transfer

Requirements

For users from Imperial College London

We require your College username (e.g., username@ic.ac.uk) for this mode of data sharing. Please note that data sharing will fail if you provide an alternative form of your username (e.g., your.fullname@imperial.ac.uk or user.name@ic.ac.uk will not work). For more details, please see Imperial College’s guideline for Globus data transfer: Transferring data to other sites with Globus
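As a rough illustration of the username rule above, the sketch below checks an identity before it is sent to us. The pattern is an assumption made for this example (an alphanumeric login with no dots, followed by @ic.ac.uk); it is not an official College rule, and the function name is hypothetical.

```python
import re

# Assumption: a College username is an alphanumeric login without dots,
# followed by @ic.ac.uk. Illustrative only, not an official rule.
COLLEGE_ID = re.compile(r"[A-Za-z0-9]+@ic\.ac\.uk")


def looks_like_college_username(identity: str) -> bool:
    """Return True if the identity has the form username@ic.ac.uk."""
    return COLLEGE_ID.fullmatch(identity.strip()) is not None


# The three examples quoted in the paragraph above.
for candidate in (
    "username@ic.ac.uk",
    "your.fullname@imperial.ac.uk",
    "user.name@ic.ac.uk",
):
    print(candidate, looks_like_college_username(candidate))
```

Only the first of the three examples passes the check; the other two are the forms that the paragraph above says will not work.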

For users without an Imperial College account

Please send us the email address linked to your Globus account.

Globus transfer process

  • After receiving the user’s request, we create a new Globus collection and copy the existing data to it.
  • The new collection is shared with the user’s Globus account (i.e., email address or username). We can add more than one user to the same Globus collection.
  • The user needs to follow the Globus or Imperial College documentation to transfer the data to any preferred storage location.
  • Files from any new sequencing run or analysis are added to the existing Globus collection directory.
  • The user needs to transfer each new batch of data separately after receiving an email notification from us.
  • Each file in the Globus collection directory is removed 30 days after the file is created.
  • Once all the files have been removed from the collection directory, we may remove old Globus collections without prior notice.
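To make the 30-day retention rule concrete, here is a minimal sketch that estimates when a file would be removed and how many whole days remain. The exact timestamp used by the clean-up is not stated above, so treating the file creation time as the starting point is an assumption; the function names and the dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # retention period stated in the list above


def removal_time(created: datetime) -> datetime:
    """Estimated removal time: creation time plus the retention period."""
    return created + RETENTION


def days_left(created: datetime, now: datetime) -> int:
    """Whole days remaining before removal; 0 if the deadline has passed."""
    remaining = removal_time(created) - now
    return max(remaining.days, 0)


# Hypothetical dates, used only to show the arithmetic.
created = datetime(2023, 3, 1, tzinfo=timezone.utc)
now = datetime(2023, 3, 21, tzinfo=timezone.utc)
print(removal_time(created).date(), days_left(created, now))
```

With these example dates, a file created on 1 March would be due for removal on 31 March, leaving 10 days on 21 March.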

Imperial College Research Data Store based transfer

Please note: We can only use RDS transfer at the very end of the project, when we have data from all the sequencing runs and analysis pipelines.

Imperial College now offers a central service for storing large volumes of research data. Please follow these steps to set up a new storage volume for your sequencing project:

Step 1: Check the documentation about the Research Data Store (and the wiki page) and set up a new allocation for your project. We also have a few slides on setting up an RDS project allocation.

Step 2: Add Imperial BRC Genomics Facility (username: igf) as a new member of the RDS allocation once it is available

Step 3: Inform IGF of your new RDS storage path on the HPC

Step 4: Data will be copied to the top level of the storage using the layout RDS_PATH/live/PROJECT_NAME

Step 5: Remove the IGF user from the RDS allocation when all the sequencing runs are finished and the data transfer is complete (IMPORTANT)
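The layout named in Step 4 can be sketched as a simple path join. Both arguments in the example call are hypothetical placeholders, and the function name is invented for this illustration.

```python
from pathlib import PurePosixPath


def project_destination(rds_path: str, project_name: str) -> PurePosixPath:
    """Build RDS_PATH/live/PROJECT_NAME for a given allocation and project."""
    return PurePosixPath(rds_path) / "live" / project_name


# Placeholder values; substitute your own allocation path and project name.
print(project_destination("/path/to/rds-allocation", "PROJECT_NAME"))
```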

How to remove the IGF user from the RDS allocation

Follow these steps to remove the IGF user from the RDS allocation:

  • Log in to the RCS self-service portal using your Imperial College credentials
  • Click on Research Data Storage projects in the left panel
  • Click on the correct “rds-xyz” id to access the “Membership” info for the selected RDS project
  • Check whether you have admin privileges for this project (i.e., whether the Admin? column shows yes)
  • Go to the row with the entry for user ‘igf’, select the checkbox in the Remove? column, and click the Update button at the bottom of the page

Illumina BaseSpace Sequence Hub based file transfer (Discontinued)

Please note: We can only transfer fastq files via BaseSpace.

Fastq files from the sequencing runs can be uploaded to Illumina BaseSpace Sequence Hub on request. The following information is required for this mode of data transfer:

  • Your BaseSpace account email (an existing account or a new free basic subscription account)
  • Confirmation regarding the sample consent type


Data access via iRODS server (Discontinued)

A local installation of the iRODS server is used to hand data over to users. A copy of the data is kept on this server for a limited time only and is removed automatically after the data transfer deadline. Access to this server is restricted by the Imperial College firewall: users can only access it while connected to the College network (either directly or via VPN).

Command line file transfer

Steps for setting up the iRODS client on HPC CX1

Please follow these steps to set up the iRODS client on the HPC for the first time:

  • Create the directory .irods under your home directory (e.g. mkdir -p ~/.irods)
  • Create the iRODS environment file ~/.irods/irods_environment.json
  • Copy the configuration below into this file (replace the username placeholder with your actual username) and validate the file format using JSONLint

Authentication Type: Standard

If the authentication type is Standard, use your IGF login password to set up the iRODS account on the HPC. You should receive the account credentials in a separate email from IGF.

{
  "irods_host": "eliot.med.ic.ac.uk",
  "irods_port": 1247,
  "irods_default_resource": "woolfResc",
  "irods_user_name": "YOUR_IGF_USERNAME",
  "irods_zone_name": "igfZone"
}
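As a local alternative to pasting the file into JSONLint, the environment file can be checked with Python's standard json module. This is a minimal sketch: the list of required keys is taken from the Standard configuration shown above, and the function name is hypothetical.

```python
import json

# Keys present in the Standard configuration shown above.
REQUIRED_KEYS = {
    "irods_host",
    "irods_port",
    "irods_default_resource",
    "irods_user_name",
    "irods_zone_name",
}


def check_irods_environment(text: str) -> dict:
    """Parse the file content and confirm the expected keys are present."""
    settings = json.loads(text)  # raises json.JSONDecodeError if malformed
    missing = REQUIRED_KEYS - settings.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return settings


example = """
{
  "irods_host": "eliot.med.ic.ac.uk",
  "irods_port": 1247,
  "irods_default_resource": "woolfResc",
  "irods_user_name": "YOUR_IGF_USERNAME",
  "irods_zone_name": "igfZone"
}
"""
print(check_irods_environment(example)["irods_host"])
```

To check the real file, read it first, for example with pathlib.Path("~/.irods/irods_environment.json").expanduser().read_text(), and pass the result to the function.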

  

Authentication Type: PAM

If the authentication type is PAM, use your Imperial login credentials to set up the iRODS account on the HPC.

{
  "irods_host": "eliot.med.ic.ac.uk",
  "irods_port": 1247,
  "irods_default_resource": "woolfResc",
  "irods_user_name": "YOUR_HPC_USERNAME",
  "irods_zone_name": "igfZone",
  "irods_ssl_ca_certificate_file": "/apps/irods/certs/igf-chain.pem",
  "irods_ssl_ca_certificate_path": "/apps/irods/certs",
  "irods_ssl_verify_server": "cert",
  "irods_authentication_scheme": "PAM"
}

  

Steps for command-line transfer on HPC CX1

Step 1: Load the iRODS module (e.g. module load irods/4.2.0)

Step 2: Set up your iRODS account using the command iinit and enter your password when prompted

Step 3: Download data using the command-line tool iget (e.g. iget -Pr /igfZone/home/USERNAME/PROJECT_NAME/PATH)

Step 3.1: Download fastq data: iget -Pr /igfZone/home/USERNAME/PROJECT_NAME/fastq

Step 3.2: Download analysis data: iget -Pr /igfZone/home/USERNAME/PROJECT_NAME/analysis
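The download commands in Steps 3.1 and 3.2 differ only in the final path component, so they can be assembled programmatically. The sketch below only builds and prints the commands; it does not run them. USERNAME and PROJECT_NAME are the same placeholders used in the steps above, and the function name is hypothetical.

```python
import shlex


def iget_command(username: str, project: str, subdir: str) -> list:
    """Assemble the iget call for one project sub-directory."""
    source = f"/igfZone/home/{username}/{project}/{subdir}"
    return ["iget", "-Pr", source]


# Print one command per data type described in Steps 3.1 and 3.2.
for subdir in ("fastq", "analysis"):
    print(shlex.join(iget_command("USERNAME", "PROJECT_NAME", subdir)))
```

After Steps 1 and 2 have been completed in the same session, each list could be passed to subprocess.run(command, check=True) to start the download.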

Access QC report pages

QC report pages for the raw and analysed data files are available from our ftp site at URLs of the form http://eliot.med.ic.ac.uk/report/project/PROJECTNAME. Use the same login credentials to access these pages. For more details, please check the Project QC Report section of the help page.

You can also access these pages from a mobile device if it is connected to the Imperial-WPA wifi network.
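The report URL format described above can be filled in for a given project as follows. Percent-encoding the project name is an assumption added for safety in this sketch (the page itself only shows a plain name), and the function name is hypothetical.

```python
from urllib.parse import quote

BASE_URL = "http://eliot.med.ic.ac.uk/report/project/"


def qc_report_url(project_name: str) -> str:
    """Return the QC report URL for one project."""
    # Assumption: names with unusual characters should be percent-encoded.
    return BASE_URL + quote(project_name, safe="")


print(qc_report_url("PROJECTNAME"))
```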
