When importing a Subversion repository into Git with git-svn, if you want Git to use each commit author’s name and e-mail address instead of their Subversion username, you have to create a file (called the “authors file” in the git-svn manual) containing the authors’ details. This works well for the most part, except that you have to enter details for everyone who has ever made a commit, and for a large public project that can be quite a few people. To make matters worse, Subversion doesn’t store that information itself, so you have to find it elsewhere.
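As a quick illustration of the format, each line of an authors file maps a Subversion login name to a name and e-mail address, as in `loginname = Joe User <user@example.com>`. The following is a minimal Python 3 sketch of reading such lines into a dictionary. The function name `parse_authors` and the sample entries are made up for the example, and this is not the parser git-svn itself uses.

```python
import re

# Shape of an authors-file line: loginname = Full Name <email>
AUTHOR_LINE = re.compile(r'^(.+?)\s*=\s*(.+?)\s*<(.+)>\s*$')


def parse_authors(text):
    """Map each Subversion login name to a (name, email) tuple."""
    authors = {}
    for line in text.splitlines():
        match = AUTHOR_LINE.match(line.strip())
        if match:  # lines that do not fit the format are skipped
            login, name, email = match.groups()
            authors[login] = (name, email)
    return authors


# Hypothetical sample entries, for illustration only
sample = """\
jdoe = Joe User <user@example.com>
(no author) = Unknown <unknown@example.com>
"""
print(parse_authors(sample))
```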
Luckily, Git provides tools that let you rewrite commits in an existing repository fairly easily. Below is a small script I wrote that changes the details of only a few select commit authors from a Subversion import, leaving all others as their old usernames. The file that stores the author details is compatible with the git-svn authors file.
Warning: Running this script on a public repository rewrites its history, which makes it incompatible with existing clones and fetches. All other developers will have to rebase their work onto the rewritten history, so use it with caution.
```python
#!/usr/bin/env python
##############################################################################
# Change Git commit authors
# Copyright (C) 2008 Lucas Murray
# http://www.undefinedfire.com
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
##############################################################################

import os, sys, re, commands

# Read arguments
try:
    authorsfile = sys.argv[1]
except:
    print 'Usage: git-new-authors AUTHORSFILE'
    print
    print 'Syntax of the AUTHORSFILE is compatible with the file used by'
    print 'git-cvsimport and git-svn:'
    print ' loginname = Joe User <user@example.com>'
    sys.exit(1)

# Read authors file
authors = {}
RE_AUTHOR = re.compile(r'^(.+?|\(no author\))\s*=\s*(.+?)\s*<(.+)>\s*$')
try:
    f = open(authorsfile)
    for line in f:
        match = RE_AUTHOR.match(line)
        if match:
            if match.group(2) == 'NAME' or match.group(3) == 'USER@DOMAIN':
                continue # Don't change placeholder entries
            authors[match.group(1)] = [match.group(2), match.group(3)]
    f.close()
except:
    print 'Error reading authors file'
    sys.exit(1)

# Generate the author conditions
authorStatements = []
for author, info in authors.iteritems():
    authorStatements.append("""if [ "$GIT_AUTHOR_NAME" = "%s" ];
then
    export GIT_AUTHOR_EMAIL="%s"
    export GIT_AUTHOR_NAME="%s"
fi""" % (author, info[1], info[0]))

# Create a temporary Bash script to get around the maximum command length
scriptFile = os.path.expanduser('~/__git-new-authors-shell')
if os.path.exists(scriptFile):
    print '%s already exists, please delete before continuing' % scriptFile
    sys.exit(1)
f = open(scriptFile, 'w')
f.write("""#!/bin/bash
%s
export GIT_COMMITTER_EMAIL=$GIT_AUTHOR_EMAIL
export GIT_COMMITTER_NAME=$GIT_AUTHOR_NAME""" % '\n'.join(authorStatements))
f.close()
os.chmod(scriptFile, 0755)

# Actually run it
print commands.getoutput('git filter-branch --env-filter \'. %s\' -- --all' % scriptFile)

# Clean up
os.unlink(scriptFile)
```
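The core of the script is the shell snippet it hands to `git filter-branch --env-filter`. The script pastes names and e-mail addresses straight into double-quoted shell strings, so a value containing a character such as `"` or `$` would not survive intact. The sketch below is one possible variation in Python 3, not part of the original script, that builds the same kind of snippet but passes every value through `shlex.quote`. The function name `build_env_filter` and the sample data are assumptions for illustration.

```python
import shlex

# One shell block per author; values are inserted already quoted
TEMPLATE = '''if [ "$GIT_AUTHOR_NAME" = {login} ]; then
    export GIT_AUTHOR_EMAIL={email}
    export GIT_AUTHOR_NAME={name}
fi'''


def build_env_filter(authors):
    """Return shell code suitable for `git filter-branch --env-filter`.

    `authors` maps a Subversion login name to a (name, email) tuple.
    """
    blocks = [
        TEMPLATE.format(login=shlex.quote(login),
                        email=shlex.quote(email),
                        name=shlex.quote(name))
        for login, (name, email) in sorted(authors.items())
    ]
    # Copy the author details to the committer, as the script does
    blocks.append('export GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"')
    blocks.append('export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"')
    return '\n'.join(blocks)


# Hypothetical entry, for illustration only
print(build_env_filter({'jdoe': ('Joe "JD" User', 'user@example.com')}))
```

The resulting text can be written to a temporary file and sourced from `--env-filter`, in the same way the script does with its own generated file.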
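The script leaves the original refs in the repository, backed up by git filter-branch under `refs/original`, so the old commits stay reachable through them. If you want to remove them, run the commands that follow. Note that the first command is only needed if you have run `git gc` before deciding to remove the original refs, because `git gc` packs refs and they are then no longer stored as files under `.git/refs/original`.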
```shell
git for-each-ref --format='%(refname)' refs/original | xargs -n 1 git update-ref -d
rm -rf .git/refs/original
git gc
```
To make things even easier, Josh from Technical Pickles has a script that generates an authors file with placeholder information directly from a Subversion checkout. If you use his script in conjunction with mine, you do not have to worry about removing the placeholder data for accounts whose details you do not know; my script automatically ignores them.
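As a rough idea of what such a generator does (this is not Josh's script), the Python 3 sketch below collects login names from the output of `svn log --quiet` and writes one placeholder line per login. It assumes the placeholder values are `NAME` and `USER@DOMAIN`, the two values the script above skips. The function name `placeholder_authors` and the sample log are hypothetical.

```python
def placeholder_authors(svn_log_quiet):
    """Build placeholder authors-file lines from `svn log --quiet` output."""
    logins = set()
    for line in svn_log_quiet.splitlines():
        # Revision lines look like: r42 | login | date
        fields = [field.strip() for field in line.split('|')]
        if len(fields) >= 3 and fields[0].startswith('r'):
            logins.add(fields[1])
    # NAME and USER@DOMAIN are assumed placeholders, to be filled in by hand
    return ['%s = NAME <USER@DOMAIN>' % login for login in sorted(logins)]


# Hypothetical log excerpt, for illustration only
sample_log = """\
------------------------------------------------------------------------
r2 | asmith | 2008-06-01 10:00:00 +0000 (Sun, 01 Jun 2008)
------------------------------------------------------------------------
r1 | jdoe | 2008-05-30 09:00:00 +0000 (Fri, 30 May 2008)
------------------------------------------------------------------------
"""
print('\n'.join(placeholder_authors(sample_log)))
```

Entries left with their placeholder values are then ignored by the script, so only the lines you fill in change any commits.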