Failed to build Diamond-database with the taxonomy files
emilhaegglund opened this issue · comments
Thanks for creating these taxonomy files. I was trying to use the files from R207 to build a Diamond database, but it failed while reading names.dmp with the following error message: "Failed to allocate sufficient memory. Please refer to the manual for instructions on memory usage."
I was just wondering whether you have tried these taxonomy files with Diamond, and if you had any success?
Cheers, Emil
It seems it needs more memory than is available on your machine.
Strange, I'm on a machine with 128 GB of RAM, and building with the NCBI taxdumps was no problem. I will check with the Diamond developers then.
Thanks,
Emil
GTDB taxonomy has 47894 species in r202 which belong to 28073+ species of NCBI taxonomy. Is this the cause?
Hi, I asked the developers of Diamond if they had any clue. The reason is that the taxids in GTDB-taxdump are larger than 2^31, which is not supported in Diamond. See bbuchfink/diamond#611 (comment). I will see if they can create a fix for this issue.
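For anyone who wants to check their own taxdump before building, here is a minimal sketch that lists taxids above the signed 32-bit maximum. It assumes the usual names.dmp layout, where the taxid is the first `|`-separated field. The function name `taxids_exceeding_int32` and the sample lines are made up for illustration and are not taken from GTDB-taxdump or Diamond.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def taxids_exceeding_int32(lines):
    """Return the sorted, unique taxids that are larger than INT32_MAX.

    Each line is assumed to follow the names.dmp layout, with the taxid
    as the first '|'-separated field.
    """
    too_large = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        taxid = int(line.split("|", 1)[0].strip())
        if taxid > INT32_MAX:
            too_large.add(taxid)
    return sorted(too_large)


# Hypothetical sample lines: one small taxid and one above 2^31.
sample = [
    "562\t|\tEscherichia coli\t|\t\t|\tscientific name\t|",
    "3000000000\t|\thypothetical taxon\t|\t\t|\tscientific name\t|",
]
print(taxids_exceeding_int32(sample))
```

To run it on a real file, pass an open file handle instead of the sample list, e.g. `taxids_exceeding_int32(open("names.dmp"))`.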
Thanks for the quick replies.
That could happen, since I use 'uint32' to store taxids.
I plan to use int32.
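To illustrate why the two types do not mix, here is a small sketch showing what happens when the four bytes of an unsigned 32-bit value are read back as a signed 32-bit value. This is only an illustration of the integer ranges; the helper `as_int32` is hypothetical and says nothing about how Diamond or taxonkit actually parse taxids.

```python
import struct


def as_int32(value):
    """Reinterpret the 4 bytes of an unsigned 32-bit value as signed 32-bit."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


# Values up to 2^31 - 1 are identical in both representations.
print(as_int32(2**31 - 1))
# From 2^31 upwards, the same bytes read as a negative number.
print(as_int32(2**31))
print(as_int32(2**32 - 1))
```

Any taxid that fits in int32 also fits in uint32, so keeping taxids at or below 2^31 - 1 avoids this kind of mismatch.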
Check taxonkit v0.14.0