allada / bsc-archive-snapshot

Free public Binance Smart Chain (BSC) Archive Snapshot

Download Issues

Prof-SD opened this issue · comments

commented

Hi,

I've tried to download the snapshot twice over the last couple of days and it has failed each time. The error message is the same both times (below), but it appears at different points: the first download failed at around 680GB, the second at 1.24TB. Do you have any ideas what this could be, please? Is there an issue with the file?

/stdin\ : Decoding error (36) : Corrupted block detected <=> ]
1.24TiB 20:22:22 [17.7MiB/s] [ <=> ]
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
download failed: s3://public-blockchain-snapshots/bsc/erigon-latest.tar.zstd to - [Errno 32] Broken pipe

commented

The first time failed after around 14 hours and the most recent was around 20 hours. Funnily enough, I think the failure was at around the same time of day both times though.

Is it safe to assume the downloaded data is of no use and it's not possible to use in parts and sync from the missing bit? Sorry, I'm not very familiar with ZFS and how it stores files.

commented

I cleared it out and started the download again. I'll let you know how it goes.

Thanks again for all of your help and running this project!

The fact that this is taking so long makes me think you are downloading this to somewhere outside the us-west-1 data center. If so, I'd suggest spinning up an EC2 instance there first, downloading the snapshot onto that instance manually, then copying it to your final location.

I suspect the problem is that AWS terminates any active downloads when a new version is uploaded. Another idea is to download a specific version instead of the active latest version.

commented

Thanks for getting back to me again, really appreciate it!

Yes, I'm downloading to a server outside of AWS. I think you're right about the specific version. Maybe I'm somehow downloading the oldest version and it's being removed before the download finishes.

Could you point me in the direction of how I can download a specific version, please? I'm only seeing the command in your script about downloading the file without a version.

commented

I called aws s3 ls s3://public-blockchain-snapshots/bsc/ --request-payer=requester to see available files but I only see erigon-latest.tar.zstd in there.

commented

Just a quick update. It failed again at around 9:30pm PST, similar to the other times. Is that around the time you update it?

I've followed your suggestion of keeping it inside us-west-1 and have created a bucket that I'm transferring it to. At current speeds it will probably take around 20 hours still but will hopefully get there just before the new version knocks it off.

I don't create different files, instead I rely on S3's version system.

I have updated the permissions of the bucket. You should now be able to run:

aws s3api list-object-versions --bucket public-blockchain-snapshots --prefix bsc/

That lists the versions available to download. To download a specific version, run that command, find the latest version (don't use the oldest), then run:

aws s3api get-object --bucket public-blockchain-snapshots --key bsc/erigon-latest.tar.zstd --version-id "VERSION_ID_HERE" /path/to/location/to/download.tar.zstd --request-payer=requester
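For anyone scripting this, the two commands above can be combined into a rough sketch. The bucket and key are the ones from this thread; the function names `latest_version_id` and `download_version` are made up for illustration, and the `--query` filter assumes the listing succeeds with your credentials and CLI version (it may not for requester-pays access). Because `--prefix` is a prefix match, the filter also assumes only this one key matches.

```shell
#!/bin/sh
# Sketch only: pin one S3 object version so a newer upload cannot
# replace the object while the download is still running.
# BUCKET and KEY are taken from the thread; function names are hypothetical.
BUCKET="public-blockchain-snapshots"
KEY="bsc/erigon-latest.tar.zstd"

# Print the version ID that S3 currently marks as the latest for KEY.
latest_version_id() {
  aws s3api list-object-versions \
    --bucket "$BUCKET" \
    --prefix "$KEY" \
    --query 'Versions[?IsLatest].VersionId' \
    --output text
}

# Download exactly one version of KEY to a local path (requester pays).
download_version() {
  version_id="$1"
  dest="$2"
  aws s3api get-object \
    --bucket "$BUCKET" \
    --key "$KEY" \
    --version-id "$version_id" \
    --request-payer requester \
    "$dest"
}
```

A possible invocation would be `vid=$(latest_version_id) && download_version "$vid" /path/to/location/to/download.tar.zstd`. Capturing the version ID once, before the download starts, is the point: the download keeps reading that version even if a newer one is uploaded partway through.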

9:30 seems plausible. The upload starts at midnight UTC and takes about 2-3 hours to finish, which would put it finishing around 7pm PST. AWS might do some internal bookkeeping that delays it too.
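As a quick check of the timing above, here is the conversion worked out in shell arithmetic. It assumes PST is a fixed UTC-8 offset (no daylight saving), and the helper name `utc_hour_to_pst` is made up for illustration.

```shell
#!/bin/sh
# Convert an hour of day in UTC to PST, assuming a fixed UTC-8 offset.
utc_hour_to_pst() {
  echo $(( ( $1 - 8 + 24 ) % 24 ))
}

upload_start_utc=0   # midnight UTC, per the thread
start_pst=$(utc_hour_to_pst "$upload_start_utc")
end_early_pst=$(utc_hour_to_pst $(( upload_start_utc + 2 )))
end_late_pst=$(utc_hour_to_pst $(( upload_start_utc + 3 )))

# prints: upload window (PST, 24h): 16:00 start, 18:00-19:00 finish
echo "upload window (PST, 24h): ${start_pst}:00 start, ${end_early_pst}:00-${end_late_pst}:00 finish"
```

So the upload itself would end between 6pm and 7pm PST, and a failure at 9:30pm PST would be roughly 2.5 hours after the later bound.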

You may run into issues using the list-object-versions command because it doesn't appear to support requester pays in the CLI (though it does at the API layer). Here's the latest version to test out if needed. This should be good for ~3 days: vHFTVtAgp9Nxe1PWdzZT0B_9vsy0gQb0.

In any case I suggest doing what I posted above to see if it helps.

commented

Thanks so much for this. Before I saw your message, I was able to get the file onto an EC2 server and upload it to an S3 bucket (I was having problems going straight to an S3 bucket because of the list-objects permissions).

Depending on how things go, though, I may still give this a try so that I know how to do it in the future if I need a quick snapshot. Thank you for all of the help!

FYI, you might want to check out: #14

It will likely take a few weeks to get a new version uploaded with this fix. It should not cause issues for most people's use cases, but if you are a purist, be warned.