CrescentLuo / PASS

Python, Awk, Sed Snippets

Python Snippets

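Read a file N lines at a time

from itertools import zip_longest  # Python 3; named izip_longest in Python 2
N = 4  # lines per group (example value)
with open(File1) as file1:  # File1 holds the path to the input file
	# The same iterator repeated N times, so each step consumes N consecutive lines
	for lines in zip_longest(*[file1] * N):
		print(lines)  # tuple of N lines, padded with None at end of file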
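
The same idiom as a self-contained Python 3 sketch, run on an in-memory file so it can be tried without any input data. The helper name `read_in_groups` and the sample text are illustrative assumptions, not part of the original snippet.

```python
import io
from itertools import zip_longest


def read_in_groups(handle, n):
    """Yield tuples of n consecutive lines from an open file-like object."""
    # The same iterator is passed n times, so every tuple that
    # zip_longest builds consumes n consecutive lines from the handle.
    for group in zip_longest(*[handle] * n):
        # Strip newlines from real lines; keep None padding at end of file.
        yield tuple(line.rstrip("\n") if line is not None else None
                    for line in group)


# Illustrative five-line "file", read two lines at a time.
sample = io.StringIO("a\nb\nc\nd\ne\n")
for pair in read_in_groups(sample, 2):
    print(pair)
```

The last tuple is padded with `None` when the number of lines is not a multiple of `n`, so callers that expect complete records should check for it.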

Perl one-liners

  1. Insert "NN" into every empty tab-separated field: at the start of a line, between two consecutive tabs, or before a trailing carriage return.
perl -pe 's/(?:^|\t)\K(?=\t|\r)/NN/g' file
  1. Split each line that starts with "@" on "_" and print the first field
perl -lne 'print ((split(/_/,$_))[0]) if /^@/' file
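
For comparison, filling empty tab-separated fields can be sketched in Python without a regular expression. The function name `fill_empty_fields` and its parameters are hypothetical. Unlike the Perl substitution, this version also fills an empty last field and an entirely empty line, so treat it as an approximation rather than an exact equivalent.

```python
def fill_empty_fields(line, placeholder="NN", sep="\t"):
    """Replace every empty sep-delimited field in a line with placeholder."""
    # Set the line ending aside so it is not treated as field content.
    body = line.rstrip("\r\n")
    ending = line[len(body):]
    fields = [field if field else placeholder for field in body.split(sep)]
    return sep.join(fields) + ending


print(repr(fill_empty_fields("a\t\tb\t\n")))
```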

AWK snippets

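  1. Use column 1 of file1 as keys to select rows of file2 whose column 3 matches (fields separated by "|"); prints column 3 and column 1 of each matching row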
awk -F'|' 'NR==FNR { a[$1]=1; next } ($3 in a) { print $3, $1 }' file1 file2
  1. Log2 transform of column 1 with a pseudocount of 1, i.e. log2(x + 1)
awk '{print log($1 +1)/log(2)}' filename.txt >filename.norm
  1. Convert multi-line FASTA to single-line FASTA (one sequence line per record)
awk '/^>/ {printf("%s%s\n", (n++ ? "\n" : ""), $0); next} {printf("%s", $0)} END {printf("\n")}' file
  1. Split one file into smaller files by the value of column 3, skipping the header line
awk -v name_prefix="name_prefix" '(NR>1){print > (name_prefix $3)}' input_file
  1. Calculate the average value of column 2
awk '{ total += $2 } END { print total/NR }' yourFile.whatever
  1. Extract the "gene_id" value from a GTF file
awk -F 'gene_id' '{print $2}' GTF.file | awk -F '"' '{print $2}'
  1. Interleave lines from two files
awk '{print; if ((getline line < "file2") > 0) print line}' file1

You can also use the paste command:

paste -d'\n' file1 file2
  1. Split column 3 on commas and print column 1 followed by the elements on one line
awk '{ n = split($3, a, ","); printf "%s", $1; for (i = 1; i <= n; i++) printf " %s", a[i]; print "" }' file
  1. Split a large file into batches of 10000 records with GNU parallel, keeping the header line in every output file
cat gencode.v31.primary_assembly.cryptic.exon.csv | parallel --header : --pipe -N 10000 'cat > gencode.v31.primary_assembly.cryptic.exon.batch{#}.csv'
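
The two-file lookup at the top of this list relies on the `NR==FNR` idiom: while awk reads the first file, the overall record number equals the per-file record number, so that block only collects keys. The rough Python sketch below makes the two passes explicit. The function name `select_rows`, its parameters, and the sample data are illustrative assumptions. Rows with fewer than three fields are skipped here, whereas awk would treat the missing field as an empty string.

```python
import io


def select_rows(key_lines, data_lines, sep="|", key_col=0, match_col=2):
    """Yield (match field, first field) for data rows whose match field is a key."""
    # First pass (awk: NR==FNR { a[$1]=1; next }): collect the keys.
    keys = {line.rstrip("\n").split(sep)[key_col] for line in key_lines}
    # Second pass (awk: ($3 in a) { print $3, $1 }): filter the data rows.
    for line in data_lines:
        fields = line.rstrip("\n").split(sep)
        if len(fields) > match_col and fields[match_col] in keys:
            yield fields[match_col], fields[0]


# Illustrative in-memory stand-ins for file1 and file2.
file1 = io.StringIO("g1|x\ng3|y\n")
file2 = io.StringIO("r1|a|g1\nr2|b|g2\nr3|c|g3\n")
for matched, first in select_rows(file1, file2):
    print(matched, first)
```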

Sed

  1. Replace word1 with word2 on line 1 only (drop the leading "1" address to replace on every line)
sed -i '1s/word1/word2/g' NHEK_pAplus.rci
  1. Insert a header line at the top of a file (GNU sed; $i is the target file, e.g. a loop variable)
header="header line"
sed -i "1s/^/$header\n/" "$i"
  1. Strip the leading numeric prefix from FASTA header lines
sed -i 's/^>[0-9]\+[^ ]/>/g' mm9_salmon_index.fa
  1. Interleave lines from two files (the R command is a GNU sed extension)
sed 'R file2' file1
  1. Print a specific line of a file (replace LINE_NUMBER with the line number, e.g. 5p prints line 5)
sed -n LINE_NUMBERp file.txt
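
The interleaving behaviour of the GNU sed `R` command (after each line of the first file, emit the next line of the second file, if any remain) can be sketched in Python as follows. The function name `interleave` is hypothetical, and the sample lists stand in for the lines of file1 and file2.

```python
def interleave(first, second):
    """Yield lines of first, each followed by the next line of second if any."""
    second_iter = iter(second)
    for line in first:
        yield line
        extra = next(second_iter, None)
        # Once second is exhausted, only lines from first are emitted.
        if extra is not None:
            yield extra


for line in interleave(["a", "b", "c"], ["1", "2"]):
    print(line)
```

As in the sed version, lines left over in the second input after the first input ends are not emitted.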

Pure bash

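  1. Generate all pairwise combinations of the lines of two files
while IFS= read -r a
do
    while IFS= read -r b
    do
        # printf expands \t; plain echo in bash would print it literally
        printf '%s\t%s\n' "$a" "$b"
    done < hACTB-R1.bed
done < hACTB-F1.bed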
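
The nested loop is a Cartesian product of the two files' lines, which `itertools.product` expresses directly. A minimal sketch, assuming both inputs fit in memory; the function name `pair_lines` and the in-memory sample files are illustrative.

```python
import io
from itertools import product


def pair_lines(first, second):
    """Yield every line of first joined by a tab to every line of second."""
    left = [line.rstrip("\n") for line in first]
    right = [line.rstrip("\n") for line in second]
    # product iterates the right-hand list fully for each left-hand line,
    # matching the order of the nested while loops.
    for a, b in product(left, right):
        yield f"{a}\t{b}"


forward = io.StringIO("a\nb\n")
reverse = io.StringIO("1\n2\n")
for row in pair_lines(forward, reverse):
    print(row)
```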
