vidboda / BamCramConvert

script to convert bam2cram and reverse using find and samtools

BamCramConvert

Script to convert BAM to CRAM and back using find and samtools (requires UNIX find and samtools), and optionally a slightly modified fork of bam2cram-check. BamCramConvert can also apply Crumble compression to your files!

Usage : bash bcc.sh

Mandatory arguments :

	* -d|--directory	<path to search dir>: root dir for the find command
	* -s|--size		<file size to search for (man find -size)>: e.g. +20000000k or +20G will search for files larger than about 20 GB; see the -size argument in man find
	* -mt|--modif-time	<file last-modification time to search for (man find -mtime)>: e.g. +180 will search for files last modified more than 180 days (about 6 months) ago; see the -mtime argument in man find
	* -f|--file-type	<bam|cram>: file type to find and convert from (bam will search for BAM files and convert them to CRAM)
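
The four mandatory options map naturally onto a single find call. The following is a minimal sketch of that mapping, not the code used in bcc.sh: the function name `list_candidates` and its argument order are hypothetical, and the actual script may add further filters.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from bcc.sh: lists candidate files
# the way the four mandatory options describe them.
# $1 = search root (-d), $2 = size test (-s), $3 = mtime test (-mt), $4 = bam|cram (-f)
list_candidates() {
    local dir=$1 size=$2 mtime=$3 type=$4
    find "$dir" -type f -name "*.${type}" -size "$size" -mtime "$mtime"
}

# Example: BAM files larger than 100 MiB, last modified more than 7 days ago
# list_candidates path/to/dir +100M +7 bam
```

Running such a listing first is a cheap way to see which files a given -s/-mt combination would select.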

Optional arguments :

	* -rm|--remove		: removes the original file and its index, only if the conversion fully succeeds (implies bam2cram-check) - default: false
	* -st|--samtools	<path to samtools> - default: try to locate in PATH
	* -fa|--ref-fasta	<path to ref genome .fa>: path to a reference genome fasta file (the directory containing the fasta file must also contain its samtools index) - default: /usr/local/share/refData/genome/hg19/hg19.fa
	* -drc|--disable-ref-check      : disables fasta reference file checking. Currently bcc can check for hg19 and hg38 based on chr1 length. Disable for other assemblies.
	* -c|--check		: uses bam2cram-check (slightly modified) to check the conversion - implicitly included with -rm (if the check fails with -rm, the removal is canceled) - requires python > 3.5 and samtools > 1.3
	* -p|--python3		<path to python3> - used in combination with -c or -rm: needed to run submodule bam2cram-check - default: /usr/bin/python3 - python version must be > 3.5
	* -uc|--use-crumble     : uses crumble to compress the converted BAM/CRAM file - Note: a file that already contains "crumble" in its name will not be converted again
	* -cp|--crumble-path    <path to crumble> - used in combination with -uc: needed to run crumble - default: try to locate in PATH
	* -v | --verbosity 	<integer> : decrease or increase verbosity level (ERROR : 1 | WARNING : 2 | INFO : 3 | COMMAND [default] : 4 | DEBUG : 5)
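
The reference check that -drc disables relies on the chr1 length. A minimal sketch of such a check, assuming the length is read from the samtools .fai index that sits next to the fasta file (the function name `guess_assembly` is hypothetical, and bcc.sh may obtain the length differently):

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from bcc.sh: guesses the assembly from
# the chr1 length recorded in a samtools .fai index (column 2 = length).
guess_assembly() {
    local fai=$1 len
    # Accept both "chr1" and "1" as the sequence name.
    len=$(awk '$1 == "chr1" || $1 == "1" { print $2; exit }' "$fai")
    case "$len" in
        249250621) echo "hg19" ;;      # chr1 length in GRCh37/hg19
        248956422) echo "hg38" ;;      # chr1 length in GRCh38/hg38
        *)         echo "unknown" ;;   # any other assembly: use -drc
    esac
}

# Example:
# guess_assembly /path/to/hg38.fa.fai
```

Any assembly other than these two yields "unknown", which is the situation where -drc is needed.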

General arguments :

	* -sl|--slurm   : when running in a SLURM environment, generates srun commands - default: false
	* -th|--threads : number of additional threads passed to the samtools -@ option (0 => 1 total thread, 1 => 2 total threads...)
	* -h		: show this help message and exit
	* -t		: test mode (do not execute commands, just print them) - default: false
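
The -th and -t options can be illustrated with two small helpers. This is only a sketch under stated assumptions: the variable `TEST_MODE` and the functions `run_cmd` and `total_threads` are hypothetical names, not identifiers from bcc.sh.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not taken from bcc.sh.
TEST_MODE=${TEST_MODE:-true}   # assumed variable standing in for -t

# Print the command; execute it only when test mode is off.
run_cmd() {
    printf '%s\n' "$*"
    if [ "$TEST_MODE" != "true" ]; then
        "$@"
    fi
}

# samtools -@ counts threads in addition to the main one,
# so -th N corresponds to N + 1 threads in total.
total_threads() {
    echo $(( $1 + 1 ))
}

# Example:
# run_cmd samtools index path/to/file.cram
# total_threads 3
```

Defaulting the sketch to test mode keeps it harmless when pasted into a shell.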

Installation:

see INSTALL.md

Use cases:

  • convert BAM files in directory dir (recursively) into CRAM, minimum BAM size 100 MB and older than 7 days, do not remove the original BAM files; the reference genome is hg38 (it must match the BAM files to be converted; non-matching files, based on chr1 length, will be ignored - supports hg19 and hg38 only):
bash bcc.sh -d path/to/dir/ -mt +7 -s +100M -f bam -fa /path/to/hg38.fa
  • the same in a SLURM environment using 4 threads for samtools commands:
bash bcc.sh -d path/to/dir/ -mt +7 -s +100M -f bam -fa /path/to/hg38.fa -sl -th 3
  • the same in dry run mode, providing the optional samtools path (otherwise it is searched for in PATH) - it is highly recommended to use the dry run mode before launching a command, just to be sure of what will be done:
bash bcc.sh -d path/to/dir/ -mt +7 -s +100M -f bam -fa /path/to/hg38.fa -sl -th 3 -st /special/place/samtools -t
  • convert CRAM files in directory dir (recursively) into BAM, minimum CRAM size 5 GB, modified within the last 24 hours, remove the original CRAM files (implies bam2cram-check):
bash bcc.sh -d path/to/dir/ -mt -1 -s +5G -f cram -fa /path/to/hg38.fa -rm
  • the same as above without removing the original files, now explicitly applying bam2cram-check to the new files (and providing the optional python3 (>3.5) path):
bash bcc.sh -d path/to/dir/ -mt -1 -s +5G -f cram -fa /path/to/hg38.fa -c -p /usr/bin/python3
  • convert BAM files in directory dir (recursively) into CRAM, minimum BAM size 100 MB and older than 7 days, apply crumble to the new CRAM and provide the optional crumble path (otherwise it is searched for in PATH), remove the original file:
bash bcc.sh -d path/to/dir/ -mt +7 -s +100M -f bam -fa /path/to/hg38.fa -uc -cp /special/place/crumble -rm
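
For orientation, the conversion behind these use cases boils down to a samtools view call per file. The sketch below only prints such a call and is not the code used in bcc.sh: the function name `print_convert_cmd` is hypothetical, and naming the output by swapping the file extension is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not taken from bcc.sh: prints the samtools call
# for one file instead of running it.
# $1 = input file, $2 = reference fasta, $3 = value for samtools -@
print_convert_cmd() {
    local src=$1 ref=$2 threads=${3:-0}
    case "$src" in
        # -C writes CRAM, -b writes BAM, -T names the reference fasta
        *.bam)  echo "samtools view -C -T $ref -@ $threads -o ${src%.bam}.cram $src" ;;
        *.cram) echo "samtools view -b -T $ref -@ $threads -o ${src%.cram}.bam $src" ;;
        *)      echo "unsupported file type: $src" >&2; return 1 ;;
    esac
}

# Example:
# print_convert_cmd path/to/dir/sample.bam /path/to/hg38.fa 3
```

Comparing this with the commands printed by bcc.sh in dry run mode (-t) shows what the script actually runs on a given installation.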

License:GNU General Public License v3.0

