razorx89 / roco-dataset

Radiology Objects in COntext (ROCO): A Multimodal Image Dataset

zlib error

lfb-1 opened this issue

commented

Hi,

I keep getting "zlib.error: Error -3 while decompressing data: invalid code lengths set" or other zlib errors during the download. Is there a fix for the script, or is this a data problem?

Cheers

It seems to work fine for me on Linux. Which OS are you using? Can you show the exact command line you are using to run the script?

commented

I am also running on Linux. The command is "python scripts/fetch.py".

The command starts fine, but the error occurs occasionally during the download. If I run the same command again, it resumes the download without a problem, but the same error pops up again after some iterations.

Could it be a network issue? I will see whether I can add some way to validate the downloaded packages and/or retry on extraction failure.
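
The validation and retry idea mentioned above could look roughly like the following. This is only a minimal sketch, not the code in scripts/fetch.py: it assumes the downloaded packages are gzip-compressed tar archives, and the names `archive_is_readable`, `fetch_with_retry`, and the `download` callable are hypothetical.

```python
import tarfile
import zlib


def archive_is_readable(path):
    """Return True if every file in the .tar.gz at `path` decompresses cleanly."""
    try:
        with tarfile.open(path, mode="r:gz") as archive:
            for member in archive:
                if not member.isfile():
                    continue
                extracted = archive.extractfile(member)
                # Read the member in 1 MiB chunks so that corrupted or
                # truncated data raises here instead of during extraction.
                while extracted.read(1 << 20):
                    pass
        return True
    except (tarfile.TarError, zlib.error, EOFError, OSError):
        return False


def fetch_with_retry(download, path, max_attempts=3):
    """Call download(path) until the archive validates; return the attempt count.

    `download` is a hypothetical callable that writes the package to `path`.
    """
    for attempt in range(1, max_attempts + 1):
        download(path)
        if archive_is_readable(path):
            return attempt
    raise RuntimeError(f"{path} is still unreadable after {max_attempts} attempts")
```

Reading every member before extraction costs one extra pass over the archive, but it means a damaged download is detected and fetched again rather than aborting the whole run.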

commented

Hi, after several attempts, I still cannot retrieve the images via the scripts. However, I found the dataset on Kaggle. So, problem solved, I guess...

Same issue here!

Thanks for reporting. I have not been able to reproduce this, but I will look again into implementing the above-mentioned measures to retry on failure.

I have now added "fixes" for these errors: instead of aborting the whole process, the download is retried. However, in my local tests on a slow internet connection, the best and simplest way to avoid these problems in the first place was to reduce the number of processes (i.e., parallel downloads). I have added this to the README.
@sarahESL Let me know if it works for you now!
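
For illustration, the two measures described above (retrying a failed download and capping the number of parallel downloads) can be sketched as follows. This is not the actual change made to the script: `call_with_retry`, `process_all`, `num_workers`, and `max_retries` are hypothetical names, and the sketch uses a thread pool for brevity where the script uses processes.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def call_with_retry(task, item, max_retries=3):
    """Run task(item), retrying when decompression or I/O fails."""
    for attempt in range(1, max_retries + 1):
        try:
            return task(item)
        except (zlib.error, EOFError, OSError):
            # Give up and re-raise only once the last attempt has failed.
            if attempt == max_retries:
                raise


def process_all(task, items, num_workers=2, max_retries=3):
    """Apply task to every item with at most num_workers running at once."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [
            pool.submit(call_with_retry, task, item, max_retries)
            for item in items
        ]
        # Results come back in the same order as the submitted items.
        return [future.result() for future in futures]
```

A small `num_workers` value keeps fewer connections open at the same time, which is the setting to lower first on a slow connection.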

Excuse me, could you please post the Kaggle link?

commented

https://www.kaggle.com/datasets/virajbagal/roco-dataset

I did not verify whether it is exactly the same as the data downloaded by the script, but it seems to be.