bjmt / universalmotif

Motif manipulation functions for R.

Importing PWMs with read_cisbp

avergro opened this issue

Hi,

Thank you for writing this package!
I am having trouble importing PWM data from the CIS-BP database. I have already tried both the Bioconductor and the GitHub versions.
I am looking for a way to obtain the consensus sequences of all the PWM files I downloaded from the CIS-BP database. One example:

read_cisbp("M11070_2.00.txt")

I am getting the following error:

Error in mapply(function(x, y) raw_lines[x:y], meta_starts, meta_stops, :
zero-length inputs cannot be mixed with those of non-zero length

This is what the M11070_2.00.txt file looks like:

Pos A C G T
1 0.304347826086957 0.0869565217391304 0.565217391304348 0.0434782608695652
2 0.0 0.0 0.0 1.0
3 1.0 0.0 0.0 0.0
4 1.0 0.0 0.0 0.0
5 0.0 1.0 0.0 0.0
6 0.0434782608695652 0.565217391304348 0.326086956521739 0.0652173913043478
7 0.0 0.0 1.0 0.0
8 0.108695652173913 0.173913043478261 0.0 0.717391304347826
9 0.0888888888888889 0.177777777777778 0.0 0.733333333333333
10 0.0666666666666667 0.0444444444444444 0.0 0.888888888888889
11 0.146341463414634 0.170731707317073 0.024390243902439 0.658536585365854
12 0.292682926829268 0.170731707317073 0.024390243902439 0.51219512195122
13 0.275 0.1 0.075 0.55

Could you please help me find the issue here?
Thank you!

Thanks for checking out the package and letting me know about the error. I've figured out the cause and will try to patch it in the near future.
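For readers wondering where the message comes from, the error text itself can be reproduced in base R. This is only a sketch: the variable names are taken from the error message, but the values below are hypothetical and the real parser internals may differ. It assumes the parser found no metadata start positions while still finding a stop position.

```r
# Hypothetical values: no metadata start was found, but one stop index was.
raw_lines   <- c("Pos\tA\tC\tG\tT", "1\t0.25\t0.25\t0.25\t0.25")
meta_starts <- integer(0)
meta_stops  <- 1L

# mapply() refuses to recycle a zero-length argument against a
# non-zero-length one, so this call fails instead of returning a list.
msg <- tryCatch(
  mapply(function(x, y) raw_lines[x:y], meta_starts, meta_stops,
         SIMPLIFY = FALSE),
  error = function(e) conditionMessage(e)
)
print(msg)
```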

In the meantime, you can get the motif to read by adding a few extra lines of metadata (which I assumed all cisbp motifs had when I wrote the parser). In your case:

TF placeholder
TF Name placeholder
Gene placeholder
Motif placeholder
Family placeholder
Species placeholder
Pos A C G T
1 0.304347826086957 0.0869565217391304 0.565217391304348 0.0434782608695652
2 0.0 0.0 0.0 1.0
3 1.0 0.0 0.0 0.0
4 1.0 0.0 0.0 0.0
5 0.0 1.0 0.0 0.0
6 0.0434782608695652 0.565217391304348 0.326086956521739 0.0652173913043478
7 0.0 0.0 1.0 0.0
8 0.108695652173913 0.173913043478261 0.0 0.717391304347826
9 0.0888888888888889 0.177777777777778 0.0 0.733333333333333
10 0.0666666666666667 0.0444444444444444 0.0 0.888888888888889
11 0.146341463414634 0.170731707317073 0.024390243902439 0.658536585365854
12 0.292682926829268 0.170731707317073 0.024390243902439 0.51219512195122
13 0.275 0.1 0.075 0.55

You can replace the "placeholder" instances with whatever you wish. I will update you when the patch is live.
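If many downloaded files lack the header, the manual edit above could be scripted. The following is a minimal base R sketch, not part of universalmotif: `add_placeholder_header()` is a hypothetical helper, and it assumes the fields are tab-separated (the thread only shows whitespace, so adjust the separator if your files differ).

```r
# Hypothetical helper (not part of universalmotif): prepend placeholder
# metadata lines to a CIS-BP matrix file that starts directly with "Pos".
add_placeholder_header <- function(infile, outfile) {
  fields <- c("TF", "TF Name", "Gene", "Motif", "Family", "Species")
  # Assumption: tab-separated key/value pairs.
  header <- paste(fields, "placeholder", sep = "\t")
  body <- readLines(infile)
  # Only add the header when the first line is the matrix column header.
  if (length(body) > 0 && grepl("^Pos", body[1])) {
    body <- c(header, body)
  }
  writeLines(body, outfile)
  invisible(outfile)
}

# Small self-contained demonstration using temporary files.
tmp_in <- tempfile(fileext = ".txt")
writeLines(c("Pos\tA\tC\tG\tT", "1\t0.25\t0.25\t0.25\t0.25"), tmp_in)
tmp_out <- tempfile(fileext = ".txt")
add_placeholder_header(tmp_in, tmp_out)
patched <- readLines(tmp_out)
print(patched)
```

The patched file could then be passed to read_cisbp() in place of the original.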

It is working now after modifying the file.
Thanks a lot.

I just pushed the patch, which allows the motif to be read without any header metadata. You can either install the package from GitHub now or wait a couple of days for Bioconductor to update the package to version 1.6.4.
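On the original question of getting a consensus sequence from each matrix, here is a rough base R sketch using the matrix posted earlier in this thread. It simply takes the most probable base at each position, which is an assumption about what "consensus" should mean; a consensus reported by a motif package may differ, for example by using ambiguity codes for mixed positions.

```r
# Matrix copied from the M11070_2.00.txt example in this thread.
pwm_text <- "Pos A C G T
1 0.304347826086957 0.0869565217391304 0.565217391304348 0.0434782608695652
2 0.0 0.0 0.0 1.0
3 1.0 0.0 0.0 0.0
4 1.0 0.0 0.0 0.0
5 0.0 1.0 0.0 0.0
6 0.0434782608695652 0.565217391304348 0.326086956521739 0.0652173913043478
7 0.0 0.0 1.0 0.0
8 0.108695652173913 0.173913043478261 0.0 0.717391304347826
9 0.0888888888888889 0.177777777777778 0.0 0.733333333333333
10 0.0666666666666667 0.0444444444444444 0.0 0.888888888888889
11 0.146341463414634 0.170731707317073 0.024390243902439 0.658536585365854
12 0.292682926829268 0.170731707317073 0.024390243902439 0.51219512195122
13 0.275 0.1 0.075 0.55"

pwm   <- read.table(text = pwm_text, header = TRUE)
probs <- as.matrix(pwm[, c("A", "C", "G", "T")])

# Pick the column (base) with the highest probability in each row (position).
best      <- max.col(probs, ties.method = "first")
consensus <- paste(colnames(probs)[best], collapse = "")
print(consensus)  # "GTAACCGTTTTTT"
```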