r3fang / SnapATAC

Analysis Pipeline for Single Cell ATAC-seq

Removing b'chr1' from peak files generated from macs2 in R for homer/chromVar

kendyhoangucsc opened this issue · comments

Hello,

I was wondering whether it is possible to remove the b and the single quotes from b'chr1' so that I can run the HOMER and chromVAR analyses in R. I know this issue comes from the different Python versions used when running macs2. On the command line, I was able to remove the b with sed 's/b//g' and the single quotes with sed -e "s/'//g", which got the files into the right format to run HOMER there.

When I look at peaks.ls in R, I get the following:
[[17]]
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1 b'chr1' 3094619 3095395 atacbrain.17_peak_1 57 . 5.92560 8.32681 5.77795 626
2 b'chr1' 3399994 3400470 atacbrain.17_peak_2 57 . 5.81395 8.26990 5.73217 91
3 b'chr1' 3515080 3515280 atacbrain.17_peak_3 35 . 4.60880 5.89366 3.56619 72
4 b'chr1' 3670863 3671185 atacbrain.17_peak_4 148 . 8.71212 18.15270 14.83340 186
5 b'chr1' 3671811 3672270 atacbrain.17_peak_5 117 . 7.57576 14.76630 11.70870 172
6 b'chr1' 4414219 4414715 atacbrain.17_peak_6 71 . 6.25000 9.79414 7.12422 176
7 b'chr1' 4492148 4492443 atacbrain.17_peak_7 44 . 4.61538 6.86932 4.45786 161
8 b'chr1' 4519579 4519984 atacbrain.17_peak_8 41 . 4.93827 6.50488 4.12535 160
9 b'chr1' 4544039 4544514 atacbrain.17_peak_9 78 . 7.05128 10.59360 7.86153 150
10 b'chr1' 4571480 4572267 atacbrain.17_peak_10 120 . 8.33333 15.12260 12.03810 374
11 b'chr1' 4770105 4770305 atacbrain.17_peak_11 22 . 3.75000 4.41054 2.26143 48
12 b'chr1' 4785410 4786233 atacbrain.17_peak_12 172 . 9.61539 20.75500 17.22340 535
13 b'chr1' 4807646 4808447 atacbrain.17_peak_13 187 . 9.92647 22.39720 18.72320 409

Is there a way to loop over peaks.ls in R and remove the b and the single quotes from the peaks called in every cluster, so that each element looks like this:

[[17]]
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1 chr1 3094619 3095395 atacbrain.17_peak_1 57 . 5.92560 8.32681 5.77795 626
2 chr1 3399994 3400470 atacbrain.17_peak_2 57 . 5.81395 8.26990 5.73217 91
3 chr1 3515080 3515280 atacbrain.17_peak_3 35 . 4.60880 5.89366 3.56619 72
4 chr1 3670863 3671185 atacbrain.17_peak_4 148 . 8.71212 18.15270 14.83340 186
5 chr1 3671811 3672270 atacbrain.17_peak_5 117 . 7.57576 14.76630 11.70870 172
6 chr1 4414219 4414715 atacbrain.17_peak_6 71 . 6.25000 9.79414 7.12422 176
7 chr1 4492148 4492443 atacbrain.17_peak_7 44 . 4.61538 6.86932 4.45786 161
8 chr1 4519579 4519984 atacbrain.17_peak_8 41 . 4.93827 6.50488 4.12535 160
9 chr1 4544039 4544514 atacbrain.17_peak_9 78 . 7.05128 10.59360 7.86153 150
10 chr1 4571480 4572267 atacbrain.17_peak_10 120 . 8.33333 15.12260 12.03810 374
11 chr1 4770105 4770305 atacbrain.17_peak_11 22 . 3.75000 4.41054 2.26143 48
12 chr1 4785410 4786233 atacbrain.17_peak_12 172 . 9.61539 20.75500 17.22340 535
13 chr1 4807646 4808447 atacbrain.17_peak_13 187 . 9.92647 22.39720 18.72320 409

My Python version is 3.8.
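
For reference, here is a minimal sketch of the kind of loop I have in mind. It assumes peaks.ls is a list of data frames with the chromosome name in the first column, as in the output above. The helper name clean_chrom and the two-row toy table are made up for illustration; they are not part of SnapATAC. The pattern is anchored, so it only strips a leading b' and a trailing ' from the chromosome column and leaves names such as atacbrain.17_peak_1 untouched.

```r
# Hypothetical helper (not part of SnapATAC): strip the Python bytes-literal
# wrapper b'...' from the first column of one peak table.
clean_chrom <- function(peaks) {
  # as.character() also covers the case where the column was read as a factor
  peaks[[1]] <- sub("^b'(.*)'$", "\\1", as.character(peaks[[1]]))
  peaks
}

# Toy stand-in for one element of peaks.ls, using the first two rows shown above
peaks.ls <- list(
  data.frame(
    V1 = c("b'chr1'", "b'chr1'"),
    V2 = c(3094619L, 3399994L),
    V3 = c(3095395L, 3400470L),
    V4 = c("atacbrain.17_peak_1", "atacbrain.17_peak_2"),
    stringsAsFactors = FALSE
  )
)

# Apply the helper to every cluster's peak table
peaks.clean.ls <- lapply(peaks.ls, clean_chrom)
```

Values in the first column that do not match the b'...' pattern are left unchanged, so running this on already clean tables should be harmless.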