Test repository for Git wrappers
Does your Git wrapper support commit messages encoded in latin-1? (encoding header) In Shift-JIS? Actually, any encoding you can think of, given that Git doesn't care.
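As an illustration, here is a minimal sketch of how a wrapper might honor the encoding header. It is written in Python for brevity (the tools mentioned in this post are Perl), and it works on a raw commit object such as the one printed by git cat-file commit. The function name decode_commit_message and the latin-1 fallback for unknown encoding names are assumptions of this sketch, not something Git or Git::Repository prescribes.

```python
import codecs


def decode_commit_message(raw: bytes) -> str:
    """Decode the message of a raw commit object using its encoding header."""
    # Headers and message are separated by the first empty line.
    head, _, body = raw.partition(b"\n\n")

    # Without an encoding header, the message is assumed to be UTF-8.
    encoding = "utf-8"
    for line in head.split(b"\n"):
        # Continuation lines start with a space, so this only matches a real header.
        if line.startswith(b"encoding "):
            encoding = line[len(b"encoding "):].decode("ascii", errors="replace")

    try:
        codecs.lookup(encoding)
    except (LookupError, ValueError):
        # Assumption of this sketch: fall back to latin-1, which never fails to decode.
        encoding = "latin-1"

    # Git does not validate the message bytes, so replace whatever cannot be decoded.
    return body.decode(encoding, errors="replace")
```

Decoding with errors="replace" is a deliberate choice here: since Git stores whatever bytes it is given, a declared encoding is no guarantee that the message is valid in that encoding.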
Do you know about multiline headers? Like in signed commits? (gpgsig header)
What about merge tags (mergetag), then? (another multiline header)
Did you know that a commit could have multiple mergetag headers? (If you happen to merge multiple signed tags at once)
Oh, and of course, one can also sign a commit that merges multiple signed tags! And write the message for that commit in any encoding they like!
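To make the questions above concrete, here is a minimal Python sketch of a parser for a raw commit object (as printed by git cat-file commit). It relies on two properties of the format: a header line that starts with a single space continues the previous header, and the first empty line separates the headers from the message. The name parse_commit and the hand-written SAMPLE object (with fake ids and a fake signature) are assumptions made for illustration; this is not how Git::Repository::Plugin::Log does it.

```python
def parse_commit(raw: bytes):
    """Split a raw commit object into ([(key, value), ...], message)."""
    head, _, message = raw.partition(b"\n\n")
    headers = []  # a list of pairs, because keys such as mergetag may repeat
    for line in head.split(b"\n"):
        if not line:
            continue  # tolerate a trailing newline when there is no message
        if line.startswith(b" ") and headers:
            # Continuation line: drop the leading space, append to the last header.
            key, value = headers[-1]
            headers[-1] = (key, value + b"\n" + line[1:])
        else:
            key, _, value = line.partition(b" ")
            headers.append((key.decode("ascii", errors="replace"), value))
    return headers, message


# Hand-written example: a signed merge of two signed tags (all values are fake).
SAMPLE = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author A U Thor <author@example.com> 1234567890 +0100\n"
    b"committer A U Thor <author@example.com> 1234567890 +0100\n"
    b"mergetag object aaaa\n type commit\n tag v1\n \n first tag\n"
    b"mergetag object bbbb\n type commit\n tag v2\n \n second tag\n"
    b"gpgsig -----BEGIN PGP SIGNATURE-----\n \n fake\n -----END PGP SIGNATURE-----\n"
    b"\n"
    b"merge two signed tags\n"
)

headers, message = parse_commit(SAMPLE)
mergetags = [value for key, value in headers if key == "mergetag"]
```

Keeping the headers as a list of pairs rather than a dictionary is what lets both mergetag values survive; a plain dictionary would silently keep only the last one.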
Because I want to write the best possible Git wrapper, I've started to collect all the weird commits (and other objects) I could find in a special repository: https://github.com/book/git-test-repository/.
Actually, rather than manually adding commits to that weird repository, I made a tool to write them for me, write the description, and package them all in a nice bundle I can simply ship with my tests. And because my tests need some way to get to the data I want to parse, I also wrote a tool to produce some Perl out of that bundle. The tools are available at https://github.com/book/git-test-repo-tool/.
Ironically, Git::Repository is such a thin wrapper around Git that I don't need that test repository to torture-test it. Git::Repository::Plugin::Log, on the other hand, tries to parse the output of git log, and that's when having a few edge cases is handy for testing. Check out its test suite for a nice example of using that test bundle.
In conclusion:
If you're using a Git wrapper (other than system and qx, that is), you might consider using Git::Repository, which has allegedly been tested against all those weird commits.
If you're writing a Git wrapper, you're welcome to use that bundle for your tests. And if you find some weirder objects, please know that I welcome them in my collection.
:-)
54fff looks like junk in my browser. Maybe GitHub could use your repository!
That's the "shift-jis content, shift-jis message" commit. I guess GitHub does not convert between encodings, or more likely blindly assumes all commits to be in UTF-8.
Note that all the ids change whenever I regenerate and push a new test repository to GitHub. And the current repository is likely to change soon, since I just came up with new test commits. :-)
For the record I just found out that even though Git will refuse an empty GIT_AUTHOR_NAME or GIT_COMMITTER_NAME, it will accept a single space as a value, which seems to produce author/committer lines that make Git::Repository::Log warn. Which obviously I must test and fix.
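For what it's worth, a parser can be made tolerant of such names. The Python sketch below splits the value of an author or committer header into name, email, timestamp and timezone, and accepts an empty or whitespace-only name. The exact line Git writes for a single-space name is not shown here, so the inputs in the comments are hypothetical, and parse_ident is a made-up name, not part of Git::Repository::Log.

```python
import re

# name (possibly empty), optional separating space, <email>, epoch seconds, timezone
IDENT = re.compile(
    r"^(?P<name>.*?) ?<(?P<email>[^<>]*)> (?P<time>\d+) (?P<tz>[+-]\d{4})$"
)


def parse_ident(value: str):
    """Parse 'Name <email> 1234567890 +0100' into (name, email, time, tz)."""
    m = IDENT.match(value)
    if m is None:
        raise ValueError(f"unparseable ident line: {value!r}")
    return m["name"], m["email"], int(m["time"]), m["tz"]


# Hypothetical inputs: a regular name, and a name reduced to nothing at all.
regular = parse_ident("A U Thor <author@example.com> 1234567890 +0100")
nameless = parse_ident(" <author@example.com> 1234567890 +0100")
```

The name group is matched lazily and may be empty, so a missing name yields an empty string instead of a failed match or a warning.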