metageni / SUPER-FOCUS

A tool for agile functional analysis of shotgun metagenomic data

Very very large .m8 temporary files

theo-allnutt-bioinformatics opened this issue

I am running superfocus on 96 samples with ~20 GB read files each. As it runs, each sample produces a ~60 GB .m8 file, so disk space is becoming an issue. Is there any way to alter this behaviour? There is no way I can store 96 × 60 GB of files before the run has finished.

Thanks,

T.

@theo-allnutt-bioinformatics Bummer!

I created a PR with a solution for it here (#43)

The quick solution deletes these large files.

Just add `-d` when you run superfocus. It should delete these large files after each alignment is done and the data has been parsed.
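A minimal invocation sketch. Only the `-d` flag is confirmed by this thread; the other arguments (`-q` query, `-dir` output directory, `-a` aligner) are typical SUPER-FOCUS options and may differ by version, so check `superfocus -h` on your install:

```shell
# -d deletes each sample's .m8 alignment file once its hits have been parsed
superfocus -q sample01.fastq -dir superfocus_output/ -a diamond -d
```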

You will probably want to keep your existing database so you don't have to download it again.

Let me know if you have any problems cloning the new version, etc. I will merge it into master once I address other suggestions that people have made.

Best

I have actually merged it into master, but no new release has been cut yet.

Thanks very much. I will try it on my next run. Does this mean that I can delete the alignment files for completed samples while it is still running?

Yes, once the alignment is done and the file has been parsed, it can be deleted, and that is what `-d` will do. I kept the original behaviour of leaving the alignments on disk as the default because some people like having them.

So if you have 90 samples, the alignment of sample 1 will be deleted before the alignment of sample 2 starts, and the parsed results will be stored in a dict.
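The per-sample lifecycle described above can be sketched in shell. This is only an illustration of the `-d` behaviour, not the tool's actual code: the file names and the `cat`/`rm` stand-ins for aligning and parsing are hypothetical.

```shell
#!/bin/sh
# Sketch of what -d does: each sample's alignment file is parsed and then
# deleted before the next sample's alignment starts, so at most one large
# .m8 file exists on disk at a time. All names here are hypothetical.
set -e
for sample in sample01 sample02 sample03; do
    # stand-in for the aligner writing a (normally ~60 GB) .m8 file
    printf 'read1\tsubject1\t99.0\n' > "demo_${sample}.m8"
    # stand-in for parsing: fold the hits into the aggregated results
    cat "demo_${sample}.m8" >> demo_parsed_results.tsv
    # free the disk space immediately after parsing
    rm "demo_${sample}.m8"
done
```

After the loop finishes, only the small aggregated results file remains; none of the per-sample `.m8` files are left on disk.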

Great, thanks for your quick replies.