ipfs-inactive / archives

[ARCHIVED] Repo to coordinate archival efforts with IPFS

Home Page: https://awesome.ipfs.io/datasets

Alpine-linux packages

victorb opened this issue

About half a year ago I made an effort to mirror all the alpine-linux packages. I completely forgot about it, but it would be worth continuing if someone would like to pick it up.

It mirrors all the packages for alpine-linux 3.4 and 3.5 and adds them to IPFS. Each release is about 30 GB.

Hash for 3.4: QmRsvEpJggeu4HhoafzRFobV4sbwVVTXMrdb2p8XWv7bCS
Hash for 3.5: QmQeE2YmmmWakwXs42NpcEzTYtEWYXDHjrZ1U5CfhZzjYq
Hash for edge: QmQ3HQWa3yySXstXLbRZmaRPj9G3KG19voBXvqpNCCgCvW
Full mirror: QmWMpm2dmzSfVphzZGJBAYfSPZha8S3skkJeLEzbJ1BGRj
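
For anyone who wants to help host these, here is a minimal Python sketch that builds the `ipfs pin add` commands for the hashes above. The `MIRRORS` mapping and the `pin_commands` helper are illustrative names, not part of the alpine-mirror repository. Actually running the commands assumes a local IPFS daemon and enough free disk space (roughly 30 GB per release).

```python
# Hashes copied from the list above; the dict name and its keys are illustrative.
MIRRORS = {
    "3.4": "QmRsvEpJggeu4HhoafzRFobV4sbwVVTXMrdb2p8XWv7bCS",
    "3.5": "QmQeE2YmmmWakwXs42NpcEzTYtEWYXDHjrZ1U5CfhZzjYq",
    "edge": "QmQ3HQWa3yySXstXLbRZmaRPj9G3KG19voBXvqpNCCgCvW",
    "full": "QmWMpm2dmzSfVphzZGJBAYfSPZha8S3skkJeLEzbJ1BGRj",
}


def pin_commands(releases):
    """Return one `ipfs pin add` argv list per requested release."""
    return [["ipfs", "pin", "add", "--progress", MIRRORS[r]] for r in releases]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is pinned by accident.
    for argv in pin_commands(["3.4", "3.5"]):
        print(" ".join(argv))
```

Pinning "full" most likely already covers the individual releases, but that is an assumption about how the full mirror hash was built.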

Code and steps to reproduce: https://github.com/VictorBjelkholm/alpine-mirror
Pinned on: Pollux

Sounds really cool. Have you considered using IPNS and a mirroring machine to get periodic updates?

@MarkusTeufelberger right now it's fully pinned on one of our hosts, pollux, and nothing is publishing the latest hash over IPNS. That would make sense for the edge mirror, so it can update continuously. Added an issue in the alpine-mirror repository: victorb/alpine-mirror#1
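
As a rough illustration of what continuous publishing for edge could look like: after each sync, point an IPNS name at the new root hash. This is only a sketch. The key name `alpine-edge` is hypothetical and would have to be created first with `ipfs key gen`, and nothing like this exists in the alpine-mirror repository yet.

```python
import subprocess


def publish_command(root_hash, key_name="alpine-edge"):
    """Build the `ipfs name publish` argv that points an IPNS key at root_hash.

    key_name is a hypothetical key, created beforehand with `ipfs key gen`.
    """
    if not root_hash:
        raise ValueError("root_hash must not be empty")
    return ["ipfs", "name", "publish", "--key=" + key_name, "/ipfs/" + root_hash]


def publish(root_hash, key_name="alpine-edge"):
    """Run the publish command; requires a running IPFS daemon."""
    subprocess.run(publish_command(root_hash, key_name), check=True)
```

Clients would then resolve the IPNS name instead of a fixed hash, so they follow the mirror as it updates.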

@victorb If you're still interested in this and running updates: I set up a collab cluster for Arch Linux packages a month ago.

It's basically an rsync-to-IPFS and ipfs-cluster script, which also keeps a snapshot after every sync that brought changes. The cluster automatically unpins snapshots older than 2 months, but users can of course pin and provide older snapshots if they like.

The listing of the snapshots (a simple HTML page) will never be purged, so old CIDs can still be found even after the cluster has unpinned and garbage-collected them.
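
To make the retention rule above concrete, here is a small Python sketch that picks which snapshots fall outside the window. It is not taken from the script: the function name and data layout are made up for illustration, "2 months" is approximated as 60 days, and the real cluster may rely on pin expiry dates rather than an explicit sweep like this.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "2 months" is approximated as 60 days here.
RETENTION = timedelta(days=60)


def expired_snapshots(snapshots, now, retention=RETENTION):
    """Return the CIDs of snapshots taken more than `retention` before `now`.

    snapshots maps a snapshot CID to the datetime it was taken.
    The result is sorted so the output is stable.
    """
    cutoff = now - retention
    return sorted(cid for cid, taken in snapshots.items() if taken < cutoff)


if __name__ == "__main__":
    # Placeholder CIDs, not real snapshot hashes.
    example = {
        "cid-old": datetime(2020, 3, 1, tzinfo=timezone.utc),
        "cid-new": datetime(2020, 5, 20, tzinfo=timezone.utc),
    }
    print(expired_snapshots(example, datetime(2020, 6, 1, tzinfo=timezone.utc)))
```

Only the pins would be dropped for the returned CIDs. The entries in the HTML listing stay, which is what keeps old snapshots discoverable for anyone still providing them.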

The whole script is somewhat lengthy, since I want to pin every single package file instead of pinning full directories recursively. This has the advantage of sharding (for example, if you just want to hold 10 copies of each file in the cluster), and you can set different pin expiry dates for files and folders.
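
A hedged sketch of what pinning a single file with its own replication factor and expiry might look like, written in Python for illustration. It only builds the command line. The flags are `ipfs-cluster-ctl pin add` options as I understand them, so check them against your installed version. The helper name, the pin name, and the CID in the usage example are placeholders, and this is not how repo2cluster.sh is actually written.

```python
def cluster_pin_command(cid, name, replicas=10, expire_in=None):
    """Build an `ipfs-cluster-ctl pin add` argv for one file.

    replicas  : number of cluster peers that should hold the file
    expire_in : optional duration string such as "1440h" (60 days)
    """
    argv = [
        "ipfs-cluster-ctl", "pin", "add",
        "--name", name,
        "--replication-min", str(replicas),
        "--replication-max", str(replicas),
    ]
    if expire_in is not None:
        argv += ["--expire-in", expire_in]
    argv.append(cid)
    return argv


if __name__ == "__main__":
    # Placeholder CID and file name, for illustration only.
    print(" ".join(cluster_pin_command("QmExampleCid", "foo-1.0.pkg.tar.xz", expire_in="1440h")))
```

Because each file is its own pin, a package file and the folder snapshot that references it can carry different expiry values.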

The script is also very efficient: it changes as little data as possible, issuing ipfs commands only for the file additions, changes, and deletions that rsync fetched.

So it doesn't tediously re-add all files to IPFS recursively and leave it to figure out the changes and redundancies.
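
The incremental idea can be sketched in a few lines of Python: read rsync's `--itemize-changes` output and turn each line into one action, instead of re-adding the whole tree. This is an illustration under assumptions, not the logic of repo2cluster.sh; the function name and the action labels are made up. Each action could then translate to something like `ipfs add` plus `ipfs files cp` for new or changed files and `ipfs files rm` for deleted ones.

```python
def plan_operations(itemized_lines):
    """Map `rsync --itemize-changes` output lines to (action, path) pairs."""
    plan = []
    for line in itemized_lines:
        # Split the change flags from the path; paths may contain spaces.
        parts = line.rstrip("\n").split(None, 1)
        if len(parts) != 2:
            continue  # blank or unexpected line
        flags, path = parts
        if flags == "*deleting":
            plan.append(("remove", path))
        elif flags.startswith(">f"):
            # ">f+++++++++" marks a newly created file; other ">f" flags a changed one.
            action = "add" if flags[2:] == "+++++++++" else "replace"
            plan.append((action, path))
        # Directory entries (flags starting with "cd" or ".d") need no file add.
    return plan


if __name__ == "__main__":
    # Hypothetical rsync output; the package names are placeholders.
    sample = [
        ">f+++++++++ core/os/x86_64/foo-1.0.pkg.tar.xz",
        ">f.st...... core/os/x86_64/core.db",
        "*deleting   core/os/x86_64/foo-0.9.pkg.tar.xz",
    ]
    for action, path in plan_operations(sample):
        print(action, path)
```

The work per sync then scales with the number of changed files rather than with the size of the repository.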

Note that running multiple Linux distributions with this script is currently not supported: the second --force-full-add would wipe the folders, even though the directory structure is meant to handle multiple architectures, distributions, and repositories.

This limitation will probably go away soon, when I look into supporting different pacman-based distributions.

But your usage scenario probably looks very different anyway :)

https://github.com/RubenKelevra/pacman.store/blob/master/toolset/repo2cluster.sh