addypy / datagovindia

Python Client for India’s - Open Government Data (OGD) (https://data.gov.in/) platform APIs

Optimize repo cloning

sayanarijit opened this issue · comments

Hi, thanks for publishing the library. I wanted to contribute to this repo; however, cloning takes an absurd amount of time on a typical internet connection, making it impractical to clone without using --depth. I think this will only get worse as more "Automatic Meta-Update" commits are added to the repo. GitHub also doesn't recommend such usage.

So, I was wondering if we can stop committing this data directly to the repo and instead provide a simple script to download it from a source (govt site, GitHub releases, or a custom CDN) on individual machines.
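To make the idea concrete, here is a minimal sketch of what such a download script could look like, using only the Python standard library. The function name `fetch_metadata`, the cache path, and the URL in the usage comment are all hypothetical placeholders, not part of the library; the real location of the data would have to be decided by the maintainers.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def fetch_metadata(url: str, dest: Path, force: bool = False) -> Path:
    """Download `url` to `dest`, reusing an existing local copy unless `force` is set.

    Hypothetical helper: the name and behaviour are illustrative only.
    """
    if dest.exists() and not force:
        # A cached copy is already present, so skip the download.
        return dest

    dest.parent.mkdir(parents=True, exist_ok=True)

    # Stream into a temporary file in the same directory, then rename it,
    # so an interrupted download does not leave a truncated file at `dest`.
    with urllib.request.urlopen(url) as response:
        with tempfile.NamedTemporaryFile(dir=dest.parent, delete=False) as tmp:
            shutil.copyfileobj(response, tmp)
            tmp_path = Path(tmp.name)
    tmp_path.replace(dest)
    return dest


# Example call (placeholder URL and path, shown for illustration only):
# fetch_metadata(
#     "https://example.com/datagovindia/metadata.bin",
#     Path.home() / ".cache" / "datagovindia" / "metadata.bin",
# )
```

With something like this, the data would be fetched once per machine and updated on demand with `force=True`, instead of being added to the Git history on every meta-update.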

Also, I think it'd be best to hard reset this repo to 33ab07a, so that it becomes easy to clone and contribute.

git reset --hard 33ab07ab65583c76f11c684aab3213152099acf1
git push --force

Hi @sayanarijit

Thank you for your interest in our library and for bringing up this important issue. I couldn't agree more about the challenge posed by the long cloning time of our repository due to the substantial increase in API resources since its inception.
Your suggestion of not committing the data directly to the repo and instead providing a script to download it from an external source is an excellent one. In particular, the CDN option you mentioned might be a better way to update the compressed data dictionaries. I would love to explore this further with your insights.

Regarding the hard reset of the repository, while it might make the repo smaller and easier to clone, I want to tread carefully so that it doesn't cause potential issues for others who have already cloned or forked the repository. I'll get back to you on this as soon as I can.

Thanks!