Script finishes with no errors at 568GB.
Breisoft opened this issue · comments
The script finishes and launches with no errors at 568GB. I noticed another person opened a similar issue, but you said changes had been made since then. I'm using the most up-to-date version of the repo and am having the same issue as far as I can tell. When I re-run the bash script, it starts deleting the files. Thank you!
Yeah, I have seen this too; it appears to be related to the way the multi-process download works. I'll look into it when I have time.
Thanks allada. I tried again with half as many processes and still ran into the same issue, if that helps. I'd try to fix it myself and make a pull request, but I don't know shell.
You can try a larger instance with more cores; it may help. The issue is likely that the parallel downloads are too aggressive.
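If the parallelism is the culprit, one possible workaround (assuming the script ultimately shells out to the AWS CLI, which is an assumption on my part) is to cap the CLI's own S3 transfer concurrency before re-running the script:

```shell
# Lower the AWS CLI's S3 transfer concurrency (the default is 10
# concurrent requests). These settings persist in ~/.aws/config.
aws configure set default.s3.max_concurrent_requests 4
# Larger multipart chunks mean fewer requests in flight at once.
aws configure set default.s3.multipart_chunksize 64MB
```

This only throttles transfers that go through the AWS CLI's default profile; if the script spawns its own worker processes, it would need to be changed directly.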
You can also download the mdbx file manually.
I was using an im4gn.4xlarge, but I can try a larger one. The mdbx file? Is that in the S3 bucket?
Yes, you should be able to set up your own server quite easily: just install the needed software and run:
# Downloads the mdbx file.
aws s3 cp --request-payer=requester s3://public-blockchain-snapshots/bsc/erigon/archive/latest/v1/chaindata/mdbx.dat.zstd - \
| pv \
| zstd -q -d -o /erigon/data/bsc/chaindata/mdbx.dat
# Download the snapshot files.
aws s3 sync --request-payer=requester s3://public-blockchain-snapshots/bsc/erigon/archive/latest/v1/snapshots/ /erigon/data/bsc/snapshots/
Then start erigon with:
erigon --chain bsc --snapshots=true --db.pagesize=16k --datadir=/erigon/data/bsc --txpool.disable
You'll need to setup the disk drives and install everything manually though.
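For the manual route, the disk setup might look roughly like the following. This is a sketch only: the pool name, dataset properties, and NVMe device names are my assumptions for an im4gn-style instance with instance storage, not the script's actual commands, so check lsblk for your real devices first:

```shell
# Install the tooling used by the download pipeline above (assumes
# a Debian/Ubuntu AMI; package names differ on other distros).
sudo apt-get install -y zfsutils-linux pv zstd awscli

# Create a pool striped across the instance-store NVMe devices.
sudo zpool create -o ashift=12 erigon-pool /dev/nvme1n1 /dev/nvme2n1

# recordsize=16k matches erigon's --db.pagesize=16k; lz4 keeps CPU cost low.
sudo zfs create -o mountpoint=/erigon -o recordsize=16k -o compression=lz4 \
    erigon-pool/data

# Target directories for the mdbx file and the snapshot sync.
sudo mkdir -p /erigon/data/bsc/chaindata /erigon/data/bsc/snapshots
```

Instance-store disks are wiped on stop, so this layout is only suitable if you can re-download the snapshot when the instance is recycled.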
Do those commands give the full archive history of BSC? I only see data going back two months.
Yes, they should.
I believe I've discovered the issue with the script. Everything appears to be working fine now, here's a detailed explanation of the problem:
- I ssh in directly after creating an ec2 instance and clone this repository and then run the script directly
- The script installs all of the required dependencies, but fails before it starts downloading because aws configure hasn't been run yet on the fresh instance
- Although the script fails before the download starts, it still creates the ZFS file system first
- The snapshot folder appeared to download fine even before, hence me getting 568GB and then exiting successfully. That's because the 568GB number is the proper size of the snapshot folder.
- The chaindata file did NOT download properly, I believe because of the download_database_file() function. Unlike the download functions for snapshots, nodes, parlia, etc., this function returns early if the chaindata folder already exists. Since the script had already created that folder before download_database_file() ran, the download never started.
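The failure mode described above can be reproduced with a small self-contained sketch. This is a hypothetical reconstruction, not the repo's actual code; the function names and the touch stand-in for the real aws s3 cp | zstd pipeline are illustrative:

```shell
#!/bin/bash
# Hypothetical reconstruction of the early-return bug; names are
# illustrative, not the repo's actual code.
set -euo pipefail

chaindata_dir="$(mktemp -d)/chaindata"

# Buggy pattern: bail out whenever the directory exists, even if a
# previous failed run created it empty.
download_database_file_buggy() {
  [ -d "$chaindata_dir" ] && return 0
  mkdir -p "$chaindata_dir"
  touch "$chaindata_dir/mdbx.dat"  # stand-in for the real download pipeline
}

# Fixed pattern: key the skip on the file the download actually produces,
# not on the directory that an earlier failed run may have left behind.
download_database_file_fixed() {
  [ -f "$chaindata_dir/mdbx.dat" ] && return 0
  mkdir -p "$chaindata_dir"
  touch "$chaindata_dir/mdbx.dat"
}

# Reproduce the failure mode: the directory already exists but is empty.
mkdir -p "$chaindata_dir"
download_database_file_buggy
[ -f "$chaindata_dir/mdbx.dat" ] && echo "buggy: downloaded" || echo "buggy: skipped"
download_database_file_fixed
[ -f "$chaindata_dir/mdbx.dat" ] && echo "fixed: downloaded" || echo "fixed: skipped"
# prints:
#   buggy: skipped
#   fixed: downloaded
```

Checking for the mdbx.dat file itself (or a completion marker written after a successful download) makes the script safe to re-run after a partial failure.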
Hope this helps! I would write a PR myself, but I'm not familiar with Bash or ZFS commands. Appreciate your help as well.
This should be done now in the latest code.