Utility scripts to convert a Subversion repository to a Git repository.
These scripts require the following software to be installed:
- Bash
- Git
- Git Flow
- Curl
In Subversion, commits are identified by username, but in Git users are identified by email. To keep the author information correct, a translation table is required.
To get all user names from Subversion, execute the following command:
$ ./get_authors_file.sh <svn_url> <output_file>
This creates a file where the keys are the Subversion usernames and the values are empty. For example:
john.doe =
jane.doe =
john.smith =
The value should be the Git user name and email. For example:
john.doe = John Doe <john.doe@mail.com>
jane.doe = Jane Doe <jane.doe@mail.com>
john.smith = John Smith <john.smith@mail.com>
All values must be set before proceeding.
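Since every value must be filled in before the conversion, a quick check of the authors file can catch entries that were missed. The following is a minimal sketch and not part of the scripts; the function name check_authors_file is hypothetical, and it assumes the "username = Name <email>" layout shown above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints every authors-file line whose value is still
# empty, prefixed with its line number.
# Returns 0 when all entries are filled in, 1 otherwise.
check_authors_file() {
  local authors_file="$1"
  # An unmapped entry ends in '=' followed only by optional whitespace.
  if grep -nE '=[[:space:]]*$' "$authors_file"; then
    return 1
  fi
  return 0
}
```

Running check_authors_file with the path to the authors file before the conversion lists the entries that still need a Git name and email.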
To convert a Subversion repository to a Git repository execute the following command:
$ ./svn_to_git.sh <root_url> <repo_path> <authors_file> <output_dir>
The arguments are detailed below:
- Root URL: The Subversion root URL (e.g., http://svn.apache.org/repos/asf)
- Repository Path: The Subversion repository path (e.g., subversion, xml/axkit)
- Authors File: Subversion to Git username translation file
- Output Directory: The directory where the Git repository is output
This command will clone the Subversion repository, create branches and tags, clean up Subversion metadata, initialize Git Flow and optimize the repository.
Git provides a tool to clone Subversion repositories. The conversion script executes the following command:
$ git svn clone <root_url>/<repo_path> \
    --authors-file=<authors_file> \
    --stdlayout <output_dir>
The command clones a Subversion repository assuming a standard directory structure (trunk, branches and tags). The commit author information is taken from the authors file.
The Subversion trunk branch is equivalent to the Git master branch. Cloning a Subversion repository leaves an unnecessary remote reference to trunk. The tool simply removes the reference:
git branch -D -r "origin/trunk"
Cloning a Subversion repository also creates remote references to tags. The tag remote references can be found by executing:
git for-each-ref --format='%(refname:short)' refs/remotes/origin/tags
Which may result in the following output:
origin/tags/v1.0
origin/tags/v2.2
In Subversion, a tag is created by copying a branch (e.g., trunk, branches/feature1) to the tags folder. This results in an additional commit. In Git, a tag is simply a reference to a commit. Based on this, a valid tag is a commit without differences compared to the previous commit:
git diff --quiet "$remote_ref" "$remote_ref"~1
The above command returns zero if there are no differences between the two commits, making the tag valid.
The tool creates a lightweight tag of the previous commit if the tag is valid and deletes the tag remote reference.
# Removes the 'origin/tags/' part
tag_name="${remote_ref/origin\/tags\/}"
# Creates a lightweight tag of the previous commit of the tag remote reference
git tag "${tag_name}" "${remote_ref}~1"
# Deletes the remote reference
git branch -D -r "${remote_ref}"
Note that invalid tags are ignored. These should be handled manually.
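The tag handling described above can be put together in a single loop. This is a minimal sketch under assumptions, not the actual script: the function names tag_name_from_ref and convert_tags are hypothetical, and convert_tags must be run inside the cloned repository. Unlike the description above, it also prints a message for each invalid tag so they are easier to handle manually.

```shell
#!/usr/bin/env bash

# Hypothetical helper: strips the 'origin/tags/' prefix from a tag remote reference.
tag_name_from_ref() {
  local remote_ref="$1"
  printf '%s\n' "${remote_ref#origin/tags/}"
}

# Hypothetical helper: converts every valid tag remote reference into a
# lightweight tag and reports the invalid ones on standard error.
convert_tags() {
  local remote_ref tag_name
  git for-each-ref --format='%(refname:short)' refs/remotes/origin/tags |
  while read -r remote_ref; do
    tag_name="$(tag_name_from_ref "$remote_ref")"
    # A tag is valid when its commit does not differ from the previous commit.
    if git diff --quiet "$remote_ref" "${remote_ref}~1"; then
      git tag "$tag_name" "${remote_ref}~1"
      git branch -D -r "$remote_ref"
    else
      echo "Invalid tag, handle manually: ${remote_ref}" >&2
    fi
  done
}
```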
Cloning a Subversion repository creates remote references to branches. The branch remote references, excluding trunk, can be found by executing:
git for-each-ref --format='%(refname:short)' refs/remotes/origin \
| grep -v 'origin/trunk' | grep -v 'origin/tags'
For example, the result could be:
origin/feature1
origin/feature2
For each branch remote reference, the tool creates a branch and removes the remote reference:
# Removes the 'origin/' part
branch_name="${remote_ref/origin\//}"
# Creates a branch
git branch "${branch_name}" "${remote_ref}"
# Deletes the remote reference
git branch -D -r "${remote_ref}"
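Note that the unanchored grep patterns shown earlier would also drop any branch whose name merely contains origin/trunk or origin/tags. The following sketch shows a stricter filter; the function names filter_branch_refs and branch_name_from_ref are hypothetical and are not taken from the scripts.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads short ref names on standard input and prints only
# branch references, dropping exactly 'origin/trunk' and everything under
# 'origin/tags/'.
filter_branch_refs() {
  grep -v -x 'origin/trunk' | grep -v '^origin/tags/'
}

# Hypothetical helper: strips the 'origin/' prefix from a branch remote reference.
branch_name_from_ref() {
  printf '%s\n' "${1#origin/}"
}
```

Inside a cloned repository, the output of git for-each-ref --format='%(refname:short)' refs/remotes/origin could be piped through filter_branch_refs to obtain the list of branches to create.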
Git has a tool to import Subversion ignore properties. The Subversion ignore properties are imported as follows:
# Import svn ignore
git svn show-ignore > ".gitignore"
# Commit
git add ".gitignore"
git commit -m "Add ignore"
At this point the Subversion metadata is no longer required and can be safely deleted. The Subversion metadata is removed as follows:
# Removes Subversion configuration
git config --remove-section "svn"
git config --remove-section "svn-remote.svn"
# Removes Subversion metadata
rm -r ".git/svn"
Git Flow is an extension that provides high-level operations for Vincent Driessen's branching model. Git Flow is initialized as follows:
# Creates the develop branch
git branch develop
# Initializes Git Flow
git flow init -d
After everything is done, the repository should be optimized. Optimization is achieved by executing:
git gc
To convert multiple Subversion projects from the same root URL, first create a file with the repository paths. For example:
main/project1
main/project2
utils/project3
This file can be input to the following script:
$ ./multi_svn_to_git.sh <root_url> <authors_file> <repo_paths_file> <repos_output_dir>
The arguments are detailed below:
- Root URL: The Subversion root URL (e.g., http://svn.apache.org/repos/asf)
- Authors file: Subversion to Git username translation file
- Repository Paths File: The Subversion repository paths file
- Output Directory: The directory to output the converted Git repositories
In Subversion, projects are structured as a filesystem, but in Git each project should be stored in its own repository. For that reason, the name of the Git output repository is the Subversion path with slashes replaced by dashes (e.g., master/project1 => master-project1).
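The naming rule can be expressed with Bash parameter expansion. This is a minimal sketch; the function names repo_name_from_path and print_repo_names are hypothetical, and the paths file is assumed to hold one repository path per line, as in the example above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: replaces every slash in a Subversion path with a dash.
repo_name_from_path() {
  local repo_path="$1"
  printf '%s\n' "${repo_path//\//-}"
}

# Hypothetical helper: prints the Git repository name for each path listed in
# the given file, skipping empty lines.
print_repo_names() {
  local repo_path
  # The '|| [ -n ... ]' part also handles a last line without a trailing newline.
  while IFS= read -r repo_path || [ -n "$repo_path" ]; do
    [ -z "$repo_path" ] && continue
    repo_name_from_path "$repo_path"
  done < "$1"
}
```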
For each Subversion clone, a log file is written to the project's output directory. This log is colored and can be viewed with a regular Unix tool:
less -r <log_file>
After converting from Subversion to Git and checking that everything is correct, the code can be pushed to a hosting service. The next sections show how to do this with Bitbucket.
The scripts require non-interactive password authentication for Git and Curl. To configure Curl, create the ~/.netrc file with the following content:
machine <host> login <username> password <password>
For example:
machine mybitbucket.com login joedoe password mypassword
In Git, one option is to store the credentials so that you only have to log in once:
git config credential.helper store
After the repositories are pushed you may want to delete these configurations.
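Because ~/.netrc holds a plain-text password, it is worth restricting its permissions and removing the entry when it is no longer needed. The following is a minimal sketch; the function names add_netrc_entry and remove_netrc_entry are hypothetical and are not part of the scripts.

```shell
#!/usr/bin/env bash

# Hypothetical helper: appends an entry to the given netrc file and makes the
# file readable and writable by its owner only.
add_netrc_entry() {
  local netrc_file="$1" host="$2" user="$3" password="$4"
  printf 'machine %s login %s password %s\n' "$host" "$user" "$password" >> "$netrc_file"
  chmod 600 "$netrc_file"
}

# Hypothetical helper: removes the entries for a host, leaving the others untouched.
remove_netrc_entry() {
  local netrc_file="$1" host="$2" tmp_file
  tmp_file="$(mktemp)"
  # grep exits with 1 when no lines remain, which is not an error here.
  grep -v -F "machine ${host} " "$netrc_file" > "$tmp_file" || true
  cat "$tmp_file" > "$netrc_file"
  rm -f "$tmp_file"
}
```

For the Git side, the stored helper setting can be removed with git config --unset credential.helper. The store helper keeps passwords unencrypted on disk, by default in ~/.git-credentials, so you may want to delete that file as well.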
To create a Bitbucket repository and push all branches and tags from a local repository, execute the following command:
$ ./push_repo.sh <bitbucket_host> <username> <repo_name> <repos_dir>
For example:
$ ./push_repo.sh mybitbucket.com joedoe test repos
Would execute the following steps:
- Create a public empty test repository owned by user joedoe hosted in mybitbucket.com
- Set the origin of the local test repository, in the repos directory, as the newly created Bitbucket repository
- Push all branches and tags to origin
This script assumes the local repository does not have origin set and that the Bitbucket repository does not exist.
To execute the previous script for multiple repositories, run the following command:
$ ./multi_push_repo.sh <bitbucket_host> <username> <repos_dir>
For example:
$ ./multi_push_repo.sh mybitbucket.com joedoe repos
Would execute the push_repo command for each repository found in the repos directory. Additionally, a log file is created in the repos directory for each execution.
If something went wrong you may want to delete all the Bitbucket repositories and unset the origin of the local Git repositories. To do that for a single repository execute:
$ ./delete_repo.sh <bitbucket_host> <username> <repo_name> <repos_dir>
For example:
$ ./delete_repo.sh mybitbucket.com joedoe test repos
Would execute the following steps:
- Delete the test repository owned by user joedoe hosted in mybitbucket.com
- Unset the origin of the local test repository in the repos directory
This script assumes the local repository has origin set and that the Bitbucket repository exists.
You can also execute the same script for multiple repositories:
$ ./multi_delete_repo.sh <bitbucket_host> <username> <repos_dir>
For example:
$ ./multi_delete_repo.sh mybitbucket.com joedoe repos
Would execute the delete_repo command for each repository found in the repos directory. Like the multiple push command, a log file is created in the repos directory for each execution.