mario-dg / dvc-test

A simple repository that shows how DVC (Data Version Control) can add large binary file tracking to a Git repository.

Set up DVC for a Git repo

Run

dvc init

inside an already initialized Git repository. Make sure that your large binary files have not been added or committed to Git.
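One way to check that no large binaries are already tracked by Git is sketched below. It is a minimal sketch, not part of this repository: the helper names `git_tracked_files` and `files_over` and the size threshold are assumptions chosen for illustration.

```python
import os
import subprocess


def git_tracked_files(repo_dir="."):
    """Return the paths Git currently tracks in repo_dir (relative to it)."""
    out = subprocess.run(
        ["git", "ls-files", "-z"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    # -z separates entries with NUL bytes, so unusual file names survive.
    return [p for p in out.split("\0") if p]


def files_over(paths, limit_bytes, base_dir="."):
    """Return the subset of paths whose size on disk exceeds limit_bytes."""
    large = []
    for rel in paths:
        full = os.path.join(base_dir, rel)
        # Skip entries that are missing from the working tree.
        if os.path.isfile(full) and os.path.getsize(full) > limit_bytes:
            large.append(rel)
    return large
```

For example, `files_over(git_tracked_files(), 10 * 1024 * 1024)` would list tracked files above 10 MiB (an arbitrary threshold). A file reported here can be untracked with `git rm --cached <file>` before handing it to DVC.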

Add binary files or directories containing binary files to DVC

dvc add /path/to/large_directory
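`dvc add` writes a small `.dvc` metafile next to the target and lists the target in a `.gitignore` in the same parent directory, so Git tracks only the metafile. The sketch below computes those two paths for a given target; the function name `dvc_add_outputs` is hypothetical and the code only derives paths, it does not call DVC.

```python
from pathlib import Path


def dvc_add_outputs(target):
    """Paths that `dvc add <target>` is expected to create or update."""
    target = Path(target)
    # The metafile sits beside the target, not inside it.
    metafile = target.with_name(target.name + ".dvc")
    # The .gitignore in the parent directory lists the target itself.
    gitignore = target.parent / ".gitignore"
    return metafile, gitignore
```

These are the files that later need to be committed to Git in place of the data itself.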

Set up a remote (Amazon S3, Google Cloud, Google Drive, SSH/SFTP server) or local storage

dvc remote add -d storage_name path/to/storage
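The storage location can be a plain path or a URL whose scheme selects the backend. The sketch below lists one example URL per backend mentioned above and builds the matching `dvc remote add` argument list. All bucket, host, and folder names are hypothetical placeholders, and `remote_add_command` is an illustrative helper, not a DVC API.

```python
# Hypothetical example locations; replace bucket, host, and folder names.
EXAMPLE_REMOTE_URLS = {
    "Amazon S3": "s3://example-bucket/dvc-store",
    "Google Cloud Storage": "gs://example-bucket/dvc-store",
    "Google Drive": "gdrive://<folder-id>/dvc-store",
    "SSH/SFTP": "ssh://user@example.com/srv/dvc-store",
    "local directory": "/mnt/shared/dvc-store",
}


def remote_add_command(name, url, default=True):
    """Build the argv for `dvc remote add`, with -d to mark it as default."""
    cmd = ["dvc", "remote", "add"]
    if default:
        cmd.append("-d")
    return cmd + [name, url]
```

The `-d` flag makes the remote the default, which is why the later `dvc push` and `dvc pull` calls need no remote name. Cloud backends typically require the matching DVC extra to be installed, for example `pip install "dvc[s3]"`.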

Push large binary files to remote

dvc push

Add all created DVC files to git version control

git add .dvc/ .dvcignore /path/to/large_directory.dvc /path/to/.gitignore
git commit -m "Initialized and setup DVC for large file tracking"
git push -u origin main

Clone repository on other machine

git clone https://path/to/git_repo

Fetch and Checkout large binary files from DVC

cd git_repo
dvc pull -R /path/to/large_directory

Verify that all binaries were downloaded to the specified directory and check whether the pipeline is up to date

dvc status
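The clone, pull, and status steps can also be scripted, for example on a build machine. The following is a minimal sketch under stated assumptions: `consumer_steps` and `run_steps` are hypothetical helper names, and the arguments are the same placeholders used in the commands above.

```python
import subprocess


def consumer_steps(repo_url, clone_dir, data_dir):
    """Return (argv, cwd) pairs for the clone, pull, and status steps."""
    return [
        (["git", "clone", repo_url, clone_dir], None),
        (["dvc", "pull", "-R", data_dir], clone_dir),
        (["dvc", "status"], clone_dir),
    ]


def run_steps(steps):
    """Run each step in order; raise on the first non-zero exit code."""
    for argv, cwd in steps:
        # check=True raises CalledProcessError if the command fails.
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_steps(consumer_steps("https://path/to/git_repo", "git_repo", "/path/to/large_directory"))` would run the three commands in sequence, with the two DVC commands executed inside the cloned directory.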
