Issue with parentheses in print, using Python 3

Question

sahilseth opened this issue 8 years ago · comments

Here is an example error:

 File "utils/classify_WHAM_vcf.py", line 46
    print '##INFO=<ID=WC,Number=1,Type=String,Description="WHAM classifier variant type">'
                                                                                         ^
SyntaxError: Missing parentheses in call to 'print'
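
For reference, the error above is what Python 3 reports whenever it meets a Python 2 style print statement. A minimal sketch of the portable form follows; the constant name `WC_HEADER` is made up for illustration and is not taken from `classify_WHAM_vcf.py`, and the header text is copied from the traceback.

```python
# Header line copied from the traceback above; WC_HEADER is a hypothetical name.
WC_HEADER = '##INFO=<ID=WC,Number=1,Type=String,Description="WHAM classifier variant type">'

# Python 2 only (SyntaxError on Python 3):
#     print '##INFO=<ID=WC,...>'
# Calling print with a single parenthesized argument behaves the same
# on Python 2.7 and Python 3:
print(WC_HEADER)
```

For print calls with several arguments, adding `from __future__ import print_function` at the top of the script gives Python 2.7 the same function semantics as Python 3.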

Zev Kronenberg · Answer 1 · Wed Apr 13 2016 05:26:12 GMT+0800 (China Standard Time)

Are you using WHAM for SV discovery? If so, I'd advise you to use WHAM-GRAPHENING -k.

sahil seth · Answer 2 · Wed Apr 13 2016 06:00:22 GMT+0800 (China Standard Time)

I have a tumor normal pair, and would like to explore translocations.

Thanks,
Sahil

Zev Kronenberg · Answer 3 · Wed Apr 13 2016 06:01:48 GMT+0800 (China Standard Time)

That'd be the correct use case for WHAM. Did the classifier run for you after changing the quote?

ejodude · Answer 4 · Wed Apr 13 2016 06:20:15 GMT+0800 (China Standard Time)

@Sahil, your current set of errors stems from syntax incompatibilities between Python 2.x and Python 3. Unfortunately, I had not tested the code against any version of Python 3 (I just noticed that we claim Python 3 is supported in the docs, so I apologize for that error).

Do you happen to have a version of python 2.7 installed on your system? You can have multiple different versions of python on your machine and so running an instance of python2.7 is probably your easiest fix. I can also try and port the code over to python 3, but I expect that this will probably take up to a week to update.

My recommendation would be to install an Anaconda distribution of Python 2.7 from https://www.continuum.io/downloads. It comes with all of the packages you need to run the classifier and will not overwrite the default Python installed on your machine.

-EJ

sahil seth · Answer 5 · Wed Apr 13 2016 06:43:44 GMT+0800 (China Standard Time)

Yes, I just created a new python2 env, and compiled again to be sure. Now it works, but shows a lot of warnings:

python2.7/site-packages/sklearn/utils/validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)

I do see the file is getting created, here is the first call:

# wham bam
1       10004   .       N       AACCCCNANCCCNACCCCAACCCCCACCCCAN        .       .       LRT=0;WAF=1,0.500001,0.750001;GC=1,1;AT=1,0.0434783,0.0434783,0,0,0,0,0,0,0,0,0,0.0434783,0,0.475082;CF=0.173913;CISTART=9965,10041;CIEND=10203,10203;PU=3;SU=0;CU=13;RD=23;NC=3;MQ=13.4783;MQF=1;SP=1,0,0;CHR2=1;DI=f;END=10204;SVLEN=201      GT:GL:NR:NA:NS:RD       0/1:-49.7373,-13.8629,-19.7461:2:18:18:20       1/1:-255,-255,-0.374464:0:3:3:3
# classifier:
1       10004   .       N       AACCCCNANCCCNACCCCAACCCCCACCCCAN        .       .       LRT=0;WAF=1,0.500001,0.750001;GC=1,1;AT=1,0.0434783,0.0434783,0,0,0,0,0,0,0,0,0,0.0434783,0,0.475082;CF=0.173913;CISTART=9965,10041;CIEND=10203,10203;PU=3;SU=0;CU=13;RD=23;NC=3;MQ=13.4783;MQF=1;SP=1,0,0;CHR2=1;DI=f;END=10204;SVLEN=201;WC=INR;WP=0.254,0.158,0.372,0.216    GT:GL:NR:NA:NS:RD       0/1:-49.7373,-13.8629,-19.7461:2:18:18:20       1/1:-255,-255,-0.374464:0:3:3:3

interpretation
WC=INR; this probably means insertion.
WP=0.254,0.158,0.372,0.216: not sure of the sequence of probabilities.

Sorting the last column of the training data (lexicographically), I get: DEL, DUP, INR, INV. In this example, the variant was classified as INR, and the third WP value (0.372) is the highest of the four. So can I assume that the probabilities are also ordered as DEL, DUP, INR and INV?

info from docs
WP:
The probabilities for each class label generated by the random forest classifier.
The format field is comprised of six colon-delimited fields.

WP is comma-separated, and the number of values depends on the training data supplied. Am I getting this right?

thanks!

Zev Kronenberg · Answer 6 · Wed Apr 13 2016 07:36:25 GMT+0800 (China Standard Time)

@sahilseth That is correct.
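
To make the confirmed mapping concrete, here is a minimal sketch that pairs the WP values with the class labels in lexicographic order. It assumes the ordering discussed above (labels sorted as DEL, DUP, INR, INV); the helper names `parse_info` and `label_probabilities` are hypothetical and not part of WHAM, and the INFO string is shortened from the classifier output shown earlier.

```python
def parse_info(info_field):
    """Split a VCF INFO string into a dict; keys without '=' map to True."""
    parsed = {}
    for entry in info_field.split(";"):
        key, sep, value = entry.partition("=")
        parsed[key] = value if sep else True
    return parsed


def label_probabilities(info_field, labels):
    """Pair each WP probability with a label, assuming sorted label order."""
    probs = [float(p) for p in parse_info(info_field)["WP"].split(",")]
    ordered = sorted(labels)  # assumption: classifier orders classes lexicographically
    if len(probs) != len(ordered):
        raise ValueError("WP has %d values but %d labels were given"
                         % (len(probs), len(ordered)))
    return dict(zip(ordered, probs))


# Shortened INFO field from the classifier output in this thread.
info = "END=10204;SVLEN=201;WC=INR;WP=0.254,0.158,0.372,0.216"
by_label = label_probabilities(info, ["INV", "DEL", "INR", "DUP"])
best = max(by_label, key=by_label.get)
print("%s %s" % (best, by_label[best]))  # INR 0.372
```

The label list has to come from the training data actually supplied, since the number of WP values depends on it.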

Zev Kronenberg · Answer 7 · Wed May 11 2016 23:27:15 GMT+0800 (China Standard Time)

@ejodude Any movement on this EJ?

ejodude · Answer 8 · Thu May 12 2016 00:23:16 GMT+0800 (China Standard Time)

Thanks @sahilseth for the heads up. It looks like the code is running fine, but we will need to add an update before scikit-learn moves to v0.19. I've also changed the wiki to highlight the requirement for Python 2.7 rather than 3.0+.