metageni / SUPER-FOCUS

A tool for agile functional analysis of shotgun metagenomic data

Joining files makeblastdb error 803.7

bashirhamidi opened this issue · comments

Thanks for your prompt response on the conda superfocus_downloadDB.py issue. I've overcome that, but during the unpacking and joining step I'm running into the error below. Any insight would be appreciated.

...
...
...
inflating: clusters/100_clusters/216_cluster.faa  
inflating: clusters/100_clusters/70_cluster.faa  
inflating: clusters/100_clusters/1198_cluster.faa  
inflating: clusters/100_clusters/71_cluster.faa  

Joining files

Formatting DB
blast: DB_100


Building a new DB, current time: 07/18/2018 11:07:42
New DB name:   /home/bah29/packages/SUPER-FOCUS-0.30/db/static/blast/100.db
New DB title:  100.db
Sequence type: Protein
Keep MBits: T
Maximum file size: 1000000000B
Error: (803.7) [makeblastdb] Blast-def-line-set.E.title
Bad char [0x96] in string at byte 151
fig|419947.9.peg.1104__1009__Mycobacterial_MmpL5_membrane_protein_cluster__Rv0678__MarR_family_transcriptional_regulator_associated_with_MmpL5?MmpS5_efflux_system
Error: (803.7) [makeblastdb] Blast-def-line-set.E.title
Bad char [0x96] in string at byte 151
fig|419947.9.peg.1104__1009__Mycobacterial_MmpL5_membrane_protein_cluster__Rv0678__MarR_family_transcriptional_regulator_associated_with_MmpL5?MmpS5_efflux_system
                                            
Adding sequences from FASTA; added 7143907 sequences in 333.523 seconds.
blast: DB_98

Building a new DB, current time: 07/18/2018 11:13:15
New DB name:   /home/bah29/packages/SUPER-FOCUS-0.30/db/static/blast/98.db
New DB title:  98.db
Sequence type: Protein
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 5234971 sequences in 249.484 seconds.
blast: DB_95

Building a new DB, current time: 07/18/2018 11:17:25
New DB name:   /home/bah29/packages/SUPER-FOCUS-0.30/db/static/blast/95.db
New DB title:  95.db
Sequence type: Protein
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 4520139 sequences in 214.329 seconds.
blast: DB_90

Building a new DB, current time: 07/18/2018 11:21:00
New DB name:   /home/bah29/packages/SUPER-FOCUS-0.30/db/static/blast/90.db
New DB title:  90.db
Sequence type: Protein
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 3865833 sequences in 184.21 seconds.
Done! Now you can run superfocus.py
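
For anyone hitting the same makeblastdb message: byte 0x96 is an en dash in Windows-1252, and it appears to sit where the `?` is printed in the header above (between `MmpL5` and `MmpS5`). The following is a minimal sketch, not part of SUPER-FOCUS, for locating such bytes in FASTA headers before formatting the database. The function names are hypothetical, and replacing the byte with `-` is an assumption about what is acceptable for your analysis.

```python
import re

# Matches any byte outside printable ASCII (0x20-0x7E); 0x96 is one of them.
# Note that this also flags tabs and other control characters.
NON_ASCII = re.compile(rb"[^\x20-\x7e]")


def find_bad_headers(lines):
    """Yield (line_number, byte_offset, byte_value) for every offending
    byte found in a FASTA header line (a line starting with '>')."""
    for number, line in enumerate(lines, start=1):
        if line.startswith(b">"):
            for match in NON_ASCII.finditer(line.rstrip(b"\r\n")):
                yield number, match.start(), match.group()[0]


def sanitize_header(line, replacement=b"-"):
    """Return the line with offending bytes replaced, keeping its line ending.
    Assumption: a plain hyphen is an acceptable stand-in for the bad byte."""
    body = line.rstrip(b"\r\n")
    ending = line[len(body):]
    return NON_ASCII.sub(replacement, body) + ending
```

To use it, open the joined FASTA file in binary mode (`open(path, "rb")`) and pass the file object to `find_bad_headers`. Working on bytes rather than text avoids a decoding error on the very character being searched for.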

@Mark1plus1 Hey Mark. Sorry, I've been very busy, so I will need some time before I can answer you properly.

The quick solution to all that is to not use the bioconda version for now. Instead, download version 0.28 (https://github.com/metageni/SUPER-FOCUS/releases/tag/0.28), run the database download script included there, and run SUPER-FOCUS from that folder. Sorry about that.

@metageni I can confirm that I was able to successfully get SUPER-FOCUS 0.30 from github to run and would leave it up to you to resolve or close this issue as you see fit.

@Mark1plus1 I have refactored SUPER-FOCUS, and it should be working fine now. I still need to update it on bioconda, but it should be good to go just by running superfocus.py or installing it with setuptools.

Let me know in case you have any feedback.