Python, Awk, Sed Snippets
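- Read a file N lines at a time (Python 3)
import itertools
N = 4  # lines per group (example value)
with open("file1") as file1:
    for lines in itertools.zip_longest(*[file1] * N):
        print(lines)  # tuple of N consecutive lines; the last group is padded with None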
- Insert "NN" into every empty tab-separated field (at the start of a line, between two tabs, or before a trailing "\r")
perl -pe 's/(?:^|\t)\K(?=\t|\r)/NN/g' file
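A rough Python counterpart of the Perl one-liner above, as a sketch rather than an exact port: the function name fill_empty_fields and the filler argument are made up for illustration. Unlike the Perl version, it also fills a trailing empty field on lines with Unix line endings, and it returns the line without its line ending.

```python
def fill_empty_fields(line, filler="NN"):
    """Replace every empty tab-separated field in one line with `filler`."""
    # Drop the line ending first so a trailing empty field is seen as empty.
    body = line.rstrip("\r\n")
    if not body:
        # Leave completely empty lines alone.
        return body
    # An empty string is falsy, so `field or filler` substitutes the filler.
    return "\t".join(field or filler for field in body.split("\t"))


print(repr(fill_empty_fields("a\t\tb\t\n")))  # 'a\tNN\tb\tNN'
```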
- Split each line starting with "@" on "_" and print the first piece
perl -lne 'print ((split(/_/,$_))[0]) if /^@/' file
- Use column 1 of file1 as keys to fetch the rows of file2 whose column 3 matches (fields separated by "|")
awk -F'|' 'NR==FNR { a[$1]=1; next } ($3 in a) { print $3, $1 }' file1 file2
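The same two-pass lookup can be sketched in Python, assuming both inputs are iterables of lines (open file objects work). The name fetch_rows and its parameters are hypothetical. Rows with fewer than three fields are skipped here, which differs slightly from awk, where a missing $3 is an empty string.

```python
def fetch_rows(key_lines, data_lines, sep="|"):
    """Yield (column 3, column 1) of each data row whose column 3 is a key."""
    # First pass: collect column 1 of the key file into a set.
    keys = {line.rstrip("\n").split(sep)[0] for line in key_lines}
    # Second pass: keep the rows whose third column is one of the keys.
    for line in data_lines:
        fields = line.rstrip("\n").split(sep)
        if len(fields) >= 3 and fields[2] in keys:
            yield fields[2], fields[0]


key_file = ["k1|x\n", "k2|y\n"]
data_file = ["a|b|k1\n", "c|d|zz\n", "e|f|k2\n"]
for col3, col1 in fetch_rows(key_file, data_file):
    print(col3, col1)
```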
- Log2 transform of x+1, where x is column 1
awk '{print log($1 +1)/log(2)}' filename.txt >filename.norm
- Convert multi-line FASTA to single-line FASTA
awk '/^>/ {printf("%s%s\n", (NR>1 ? "\n" : ""), $0); next} {printf("%s",$0)} END {printf("\n")}' file
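For comparison, a minimal Python sketch of the same unwrapping, assuming the input is an iterable of lines. The name unwrap_fasta is made up for illustration, and any text before the first ">" header is ignored.

```python
def unwrap_fasta(lines):
    """Yield (header, sequence) pairs with each sequence joined onto one line."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    # Emit the final record.
    if header is not None:
        yield header, "".join(chunks)


example = [">s1\n", "AC\n", "GT\n", ">s2\n", "TT\n"]
for header, sequence in unwrap_fasta(example):
    print(header)
    print(sequence)
```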
- Split one file into small files by column value
awk -v name_prefix="name_prefix" '(NR>1){print > (name_prefix $3)}' input_file
- Calculate average value of column2
awk '{ total += $2 } END { print total/NR }' yourFile.whatever
- Find the "gene_id" value in a GTF file
awk -F 'gene_id' '{print $2}' GTF.file | awk -F '"' '{print $2}'
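A regular expression does the same extraction in one step. This is a sketch that assumes the usual GTF attribute form gene_id "VALUE"; the name gene_ids and the example line are made up. Unlike the awk pipeline, lines without a gene_id attribute are skipped instead of producing an empty line.

```python
import re

# Capture whatever sits between the quotes that follow gene_id.
GENE_ID = re.compile(r'gene_id "([^"]*)"')


def gene_ids(lines):
    """Yield the gene_id value of every line that has one."""
    for line in lines:
        match = GENE_ID.search(line)
        if match:
            yield match.group(1)


# Made-up GTF-like line, for illustration only.
example = ['chr1\tsrc\tgene\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n']
print(list(gene_ids(example)))  # ['G1']
```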
- Interleave lines from multiple files
awk '{print; if ((getline line < "file2") > 0) print line}' file1
You can also use the paste command:
paste -d'\n' file1 file2
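In Python, interleaving can be sketched with itertools.zip_longest. The function name interleave is hypothetical. When one input is shorter, this version simply carries on with the remaining lines of the longer one, whereas paste emits empty lines for the exhausted file.

```python
from itertools import zip_longest


def interleave(*line_iterables):
    """Yield one line from each input in turn until all are exhausted."""
    for group in zip_longest(*line_iterables):
        for line in group:
            # None marks an input that has already run out of lines.
            if line is not None:
                yield line


print(list(interleave(["a1", "a2", "a3"], ["b1", "b2"])))
# ['a1', 'b1', 'a2', 'b2', 'a3']
```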
- Split a comma-separated column and print its elements on one line
awk '{ n = split($3, a, ","); printf "%s ", $1; for (i = 1; i <= n; i++) printf "%s ", a[i]; print "" }' file
- Split a large file into batches and keep the header in every output file (GNU parallel)
cat gencode.v31.primary_assembly.cryptic.exon.csv | parallel --header : --pipe -N 10000 'cat > gencode.v31.primary_assembly.cryptic.exon.batch{#}.csv'
- Replace word1 with word2 on the first line of a file
sed -i '1s/word1/word2/g' NHEK_pAplus.rci
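- Insert a header line at the top of a file ($i is the file name, e.g. a loop variable; "\n" in the replacement needs GNU sed)
header="header line"
sed -i "1s/^/$header\n/" "$i"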
- Rewrite FASTA headers: strip the leading digits and the character that follows them
sed -i 's/^>[0-9]\+[^ ]/>/g' mm9_salmon_index.fa
- Interleave lines from two files with sed (R is a GNU sed command)
sed Rfile2 file1
- Print a specific line of a file (replace LINE_NUMBER with the line number)
sed -n LINE_NUMBERp file.txt
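The Python counterpart can use itertools.islice so that the whole file is never held in memory. A sketch; the name nth_line is made up, and line numbers start at 1 as in sed.

```python
from itertools import islice


def nth_line(lines, n):
    """Return line number n (1-based) from an iterable of lines, or None."""
    if n < 1:
        raise ValueError("line numbers start at 1")
    # islice skips the first n-1 lines and stops right after line n.
    return next(islice(lines, n - 1, n), None)


print(repr(nth_line(["a\n", "b\n", "c\n"], 2)))  # 'b\n'
```

With a real file this would be called as nth_line(open("file.txt"), 5), ideally inside a with block.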
- Generate all pairwise combinations of lines from two files
while read -r a
do
    while read -r b
    do
        printf '%s\t%s\n' "$a" "$b"
    done < hACTB-R1.bed
done < hACTB-F1.bed
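The nested loop is a Cartesian product, which Python provides as itertools.product. A sketch assuming two iterables of lines; the name line_pairs is hypothetical. The second input is read into a list first, since it has to be traversed once per line of the first input.

```python
from itertools import product


def line_pairs(lines_a, lines_b):
    """Return every line of A joined by a tab with every line of B."""
    a = [line.rstrip("\n") for line in lines_a]
    b = [line.rstrip("\n") for line in lines_b]
    # product varies the second input fastest, like the inner shell loop.
    return [f"{x}\t{y}" for x, y in product(a, b)]


for pair in line_pairs(["a1\n", "a2\n"], ["b1\n", "b2\n"]):
    print(pair)
```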