ecotaxa / ecotaxa_front

Front end of the EcoTaxa application



"Skip tsv files that have already been imported" option is implemented in a too primitive way, if useful.

grololo06 opened this issue · comments

Below is an extract of "loadedfiles" in the DB after many imports of the same directory, but started at different levels:

_work/m158_mn19_n5_d1_3_sur_5_1/ecotaxa_m158_mn19_n5_d1_3_sur_5_1.tsv 
m158_mn19_n5_d1_3_sur_5_1/ecotaxa_m158_mn19_n5_d1_3_sur_5_1.tsv

Beyond this simple path problem, the main concern is that there is no guarantee that a TSV with the same name contains the same data as a previous one, especially when it comes from, e.g., a re-run of the upstream toolchain (namely ZooProcess).

I think we should store a signature for each TSV and care more about consistency. Or maybe drop the option and keep only "Skip objects that have already been imported", which has a clear meaning (but takes more time, since the TSVs have to be scanned).
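
To make the signature idea concrete, here is a minimal sketch, not EcoTaxa's actual import code. It assumes the signature is a SHA-256 digest of the file bytes; the names `tsv_signature`, `should_skip` and `loaded_signatures` are hypothetical. Keying on content rather than on the path would address both problems above: the same file seen under two different relative paths is recognised, and a re-processed file with the same name is not skipped.

```python
import hashlib
from pathlib import Path
from typing import Set


def tsv_signature(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of the file's raw bytes (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large TSVs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def should_skip(path: Path, loaded_signatures: Set[str]) -> bool:
    """Skip a TSV only if a file with identical content was already imported.

    `loaded_signatures` stands in for whatever would be stored in the DB
    instead of (or alongside) the relative path; this is an assumption.
    """
    return tsv_signature(path) in loaded_signatures
```

Hashing still reads every TSV once, but it avoids parsing the rows, so it should sit between the two existing options in cost.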