Migration considerations

Felipe Contreras felipe.contreras at gmail.com
Mon Feb 7 19:33:51 EST 2011


I saw John Bailey's mail[1] regarding the final (?) decision to move
to hg, and I have a few comments.

> Also, as promised I'll explain the choices I made for the hg conversion that
> have influenced Richard's massive contributions thus far.  I chose to use hg's
> 'convert' extension to convert directly to hg.  I *could* have relatively easily
> used monotone's 'git_export' command and piped it into the hg fast-import
> extension, but I chose not to do so.  Using hg's convert, although slow, allows
> us to do some history cleanup.  Yes, this is modifying our history, but in this
> case it's for the better.

I wrote that svn-patch-authors file, and a simple 'git filter-branch' script
fixes the authors in no time.

There is no advantage from hg's convert whatsoever.

> As many of you know, we have a number of revisions in our monotone history that
> have multiple changelog certs.  I've been slowly reviewing these and carefully
> merging them as best I can to reduce duplication and potential for loss.  (This
> part could be done for a git conversion as well.)  As I mentioned previously in
> this thread, I expanded upon Felipe's prior work on an author map to get
> consistent author names across our entire history.  I've also patched the
> convert extension to handle comment certs (which monotone's git_export does
> already) and a new "committer" cert that's come in handy for Richard's extended
> work.  Richard has gone on to take a partial map of our ancient SVN history and
> use it to set author and committer certs on revisions where a developer
> committed someone else's patch.  The idea here is that our history can now be
> more accurate with respect to authorship.

I wrote that file, as you can see in the hg log. We have been collaborating: if
you take a closer look at the pidgin-mtn-conv-files repo, you will see that I
have contributed a lot of work.

> One unfortunate note here is that hg doesn't directly support a revision whose
> committer wasn't it's author, which I wasn't aware of until Richard pointed it
> out.  Sure, we can use 'hg commit -u SomeUserName', but there is no revision
> metadata that says which one of us actually committed the revision.  Since no
> such concept exists in hg to map to, the behavior I've patched into the convert
> extension adds the committer cert's value to the converted changelog entry.
> This is consistent with convert's behavior with git conversions and is supported
> by some tools, such as hgk.

In git, there are standard tags, such as 'Signed-off-by'. For example, if you
apply a patch by me, the s-o-b lines would be:

Signed-off-by: Felipe Contreras <felipe.contreras at gmail.com>
Signed-off-by: John Bailey <rekkanoryo at rekkanoryo.org>
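
As a rough sketch of how a conversion script could read such tags back out of a
changelog, here is a small Python example. The convention that the first
sign-off is the author and the last one is the committer is an assumption taken
from the two lines above, not something git enforces, and the sample message
and function names are made up for illustration.

```python
import re

# Matches one "Signed-off-by: Name <email>" line anywhere in the message.
SOB_RE = re.compile(
    r"^Signed-off-by:[ \t]*(.+?)[ \t]*<([^<>]*)>[ \t]*$", re.MULTILINE
)

def signoffs(message):
    """Return the (name, email) pairs of all Signed-off-by lines, in order."""
    return SOB_RE.findall(message)

def author_and_committer(message):
    """Assumed convention: first sign-off is the author, last is the committer.

    Returns (None, None) when the message carries no sign-off at all.
    """
    found = signoffs(message)
    if not found:
        return None, None
    return found[0], found[-1]

# Hypothetical changelog entry using the two sign-offs shown above.
message = """Fix a crash

Signed-off-by: Felipe Contreras <felipe.contreras at gmail.com>
Signed-off-by: John Bailey <rekkanoryo at rekkanoryo.org>
"""
author, committer = author_and_committer(message)
print("author:", author[0])
print("committer:", committer[0])
```

With a single sign-off, author and committer come out as the same person,
which matches the usual case of somebody committing their own work.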

> To get equivalent behavior out of monotone's git_export command, I'd also have
> needed to implement a patch.  It would have been more difficult to patch
> monotone's C++ code for the git fast-export stuff than it was to patch convert's
> python code.

No, you could just use 'git filter-branch' as I do in pidgin-git-import.

Even if you pick hg, my recommendation would be to use 'mtn git_export', as I'm
confident that tool outputs a correct repo. I wrote a conversion tool in Ruby
some time ago, and we (Daniel Lenski and I) have spent a considerable amount of
time making sure that the output is correct. You can then easily convert that
to hg.
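
Since 'mtn git_export' writes a fast-import stream, author fix-ups can in
principle also be applied to the stream itself before it is imported anywhere.
The following is only a sketch of that idea in Python, not the script from
pidgin-git-import: the function name, the layout of the map and the sample
addresses are made up for illustration, and only the counted 'data <n>' form of
the stream is handled.

```python
import io
import re

# "author"/"committer" header of a commit: optional name, <email>, then the date.
HEADER_RE = re.compile(rb"^(author|committer) (.*?) ?<([^<>]*)> (.*)$")

def rewrite_idents(src, dst, author_map):
    """Copy a fast-import stream from src to dst, remapping commit idents.

    author_map maps an old mail address (bytes) to a (name, email) pair.
    Payloads of 'data <n>' commands are copied verbatim, so file contents and
    commit messages are never rewritten. The delimited 'data <<EOF' form is
    not supported and makes int() raise ValueError.
    """
    while True:
        line = src.readline()
        if not line:
            break
        if line.startswith(b"data "):
            dst.write(line)
            dst.write(src.read(int(line[5:].strip())))
            continue
        match = HEADER_RE.match(line.rstrip(b"\n"))
        if match and match.group(3) in author_map:
            name, email = author_map[match.group(3)]
            line = b"%s %s <%s> %s\n" % (match.group(1), name, email, match.group(4))
        dst.write(line)

# Hypothetical input: one commit whose message happens to look like a header.
payload = b"author fake <jdoe@localhost> 0 +0000\n"
stream = (
    b"commit refs/heads/master\n"
    b"author jdoe <jdoe@localhost> 1297125231 -0500\n"
    b"committer jdoe <jdoe@localhost> 1297125231 -0500\n"
    + b"data %d\n" % len(payload) + payload
)
out = io.BytesIO()
rewrite_idents(io.BytesIO(stream), out,
               {b"jdoe@localhost": (b"Jane Doe", b"jane@example.com")})
print(out.getvalue().decode())
```

In real use src and dst would be sys.stdin.buffer and sys.stdout.buffer, with
the script sitting in the pipe between the exporter and the importer.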

Now, one of the proposals was to provide both git and hg repos. In any case, it
would be great if you stored data that can be easily exported to git:

 1) Provide both committer and author. Either with Signed-off-by or some other
    method, but you have to provide both somehow.
 2) Always provide valid authors. In git, they are in the form "Real Name
    <email at box.com>", and the mail part is _mandatory_. If there's no mail
    address, <unknown> would make sense.
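
Point 2 could be checked mechanically during the conversion. Below is a minimal
Python sketch, assuming the raw author strings are either "Name <email>" or a
bare name or nick. The function name and the example addresses are
hypothetical, and using the mail address as the name when the name is empty is
my own assumption.

```python
import re

# "Name <email>" with optional surrounding whitespace; the name may be empty.
IDENT_RE = re.compile(r"^\s*(?P<name>[^<>]*?)\s*<(?P<email>[^<>]*)>\s*$")

def git_ident(raw, fallback_email="unknown"):
    """Return raw as 'Name <email>', using '<unknown>' when no mail is given."""
    match = IDENT_RE.match(raw)
    if match:
        name = match.group("name")
        email = match.group("email").strip() or fallback_email
    else:
        # No well-formed <...> part: treat the whole string as the name.
        name = raw.replace("<", "").replace(">", "").strip()
        email = fallback_email
    if not name:
        # Assumption: reuse the mail address when there is no name at all.
        name = email
    return "{} <{}>".format(name, email)

for raw in ("Jane Doe <jane@example.com>", "jdoe", "Jane Doe <>"):
    print(git_ident(raw))
```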

Also, remember that the idea was to start a repository from scratch, so, even
if the conversion tool is not finished yet, that doesn't matter as it can
always be fixed later on.


P.S. Again, please 'reply to all'

[1] http://pidgin.im/pipermail/devel/2011-February/010135.html

Felipe Contreras
