AhmedElsherbini / download_hmp_data

Download Human microbiome project (WGS) data from hmpdacc.org via https


download_hmp_data

This simple script downloads data from the HMP portal. Tools such as the HMP client and the portal client already exist for downloading data from this website, but they did not work for me with some files. So I wrote this script as a workaround, and you can use it if the main tools do not work for you.

Note: So far, this script supports only valid HTTPS links (not Amazon S3 or FTP).

Usage

First, after you get your manifest file, put it in the same directory as the script and run this Python 3 script:

python3 download_urls.py -i example_manifest.tsv

As dependencies, you need to have (via pip3 or conda) pandas, argparse, and get.

To check whether the HTTPS links in your current manifest are valid, randomly pick a few of them and try one of these two options.

1- Manually, on the website itself (as in the example), try the manual download button for an individual file. If it works, that is a good sign.

2- From your TSV file, copy a link (https://.............bz2) and paste it into your browser. If the file starts downloading, that is a good sign.

As output, your manifest will be split into one successful manifest file and one failed manifest file (listing the samples that were not downloaded).
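To illustrate the idea of splitting the manifest, here is a rough sketch of how such a split could be done. It is not the code of download_urls.py. It assumes a tab-separated manifest with a header row and a column named urls (an assumption about the manifest layout), and the function names, the output file names, and the fetch parameter are all hypothetical. It uses the standard csv module instead of pandas to stay dependency-free.

```python
import csv
import os
import urllib.request


def download_https(url, dest_dir="."):
    """Download one file over HTTPS; return True on success, False otherwise."""
    target = os.path.join(dest_dir, os.path.basename(url))
    try:
        urllib.request.urlretrieve(url, target)
        return True
    except (OSError, ValueError):
        return False


def split_manifest(manifest_path, ok_path, failed_path,
                   fetch=download_https, url_column="urls"):
    """Try every row of the manifest and write successes and failures to separate TSV files."""
    with open(manifest_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        fieldnames = reader.fieldnames or []
        rows = list(reader)

    ok_rows, failed_rows = [], []
    for row in rows:
        # Only HTTPS links are tried; S3 or FTP entries are skipped.
        links = [u.strip() for u in row[url_column].split(",")
                 if u.strip().startswith("https://")]
        succeeded = any(fetch(link) for link in links)
        (ok_rows if succeeded else failed_rows).append(row)

    for path, subset in ((ok_path, ok_rows), (failed_path, failed_rows)):
        with open(path, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames, delimiter="\t")
            writer.writeheader()
            writer.writerows(subset)
    return len(ok_rows), len(failed_rows)
```

For example, split_manifest("example_manifest.tsv", "success.tsv", "failed.tsv") would return the number of downloaded and failed rows. Because the failed file keeps the original header, it could be fed back in as a new manifest for a retry.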

Contributing

Everything is clear, right? But anyhow, contact me here or directly via email: drahmedsherbini@yahoo.com

License

This tool aims to help others. Kindly cite my GitHub page!


