Monotone analysis

Richard Laager rlaager at wiktel.com
Tue Jul 8 18:56:03 EDT 2008


On Tue, 2008-07-08 at 23:17 +0300, Felipe Contreras wrote:
> A cert is basically metadata signed by some person. It could be anything, it's
> a key string, and a string value, that's it.

This opens up some really interesting workflow possibilities that
wouldn't exist otherwise. That said, I'll stipulate that nobody seems to
be interested in doing these things in the real world right now, which
is really sad.

I have a great example of where this would be useful, though: Right now,
the kernel developers are doing Signed-Off-By markers (and the like) in
commits. Monotone has built-in support for this (with real crypto
backing it, even). It's also possible for someone like Linus to
configure Monotone to ignore all revisions that aren't signed by his
Lieutenants.

> there is no sane way to traverse a branch
> but to search through the whole repository for the commits.

I don't know what you mean by "traverse a branch".

> A similar thing happens with commits between certain date ranges.
> Since dates are just certs, finding revisions between certain ranges
> requires checking all the revisions.

It involves checking all the certs, which can be indexed (it's a SQL
database, after all): SELECT * FROM certs WHERE ...
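
As a rough sketch of what I mean, here is a date-range lookup against an indexed cert table. The table layout, column names, and index name are made up for illustration and are not Monotone's actual schema; the point is only that an index on (name, value) turns the range query into an index lookup rather than a scan of every revision.

```python
import sqlite3

# Hypothetical, simplified cert table; NOT Monotone's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE certs (revision_id TEXT, name TEXT, value TEXT)")
# Index on (name, value) so a date-range lookup need not scan every cert.
conn.execute("CREATE INDEX certs_name_value ON certs (name, value)")

conn.executemany(
    "INSERT INTO certs VALUES (?, ?, ?)",
    [
        ("rev1", "date", "2008-07-01T10:00:00"),
        ("rev2", "date", "2008-07-05T12:30:00"),
        ("rev3", "date", "2008-07-08T18:56:03"),
        ("rev2", "author", "someone@example.com"),
    ],
)


def revisions_between(conn, start, end):
    """Return revision ids whose 'date' cert falls in [start, end).

    ISO-8601 timestamps sort correctly as plain strings, so an ordinary
    string comparison is enough here.
    """
    rows = conn.execute(
        "SELECT revision_id FROM certs "
        "WHERE name = 'date' AND value >= ? AND value < ? "
        "ORDER BY value",
        (start, end),
    )
    return [row[0] for row in rows]


print(revisions_between(conn, "2008-07-02", "2008-07-08"))
```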

> It's not efficient.

I'm curious how this looks in Git.

> Another curious thing about certs is that you can actually set many different
> changelogs. Each developer can set a different message for the same revision.
> 
> It's confusing, and annoying.

This is an excellent feature. It allows you to go back and add more
information after the fact.

> In other DSCMS (git, bzr, hg) a revision is not complete without committer,
> author, date, and message. So you can't have more than one of these
> values or the revision would be different.

How does git deal with this case?

Imagine a branch with two heads.
A
|\
| \
|  \
|  |
B  C

Two people each merge them:
A
|\
| \
|  \
|  |
B  C
 \ /
  D

Then they push their changes. In Monotone, they both created the same
revision D with a set of certs, so this collapses nicely. Does git lead
to two new heads? (Or, since you can't have two heads in git, does it
force a merge?)
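
Here is a toy model of the difference I'm asking about. It assumes one scheme where the revision ID is a hash of only the parents and the tree contents (with author, date, and message kept outside, as certs), and another where the ID also covers that metadata, as described in the quoted text above. The function names and the hashing layout are invented for the example; this is not the real format of either tool.

```python
import hashlib


def content_only_id(parents, tree):
    """ID derived only from the parents and the tree contents."""
    data = "\n".join(["|".join(parents), tree])
    return hashlib.sha1(data.encode()).hexdigest()


def metadata_included_id(parents, tree, author, date, message):
    """ID that additionally covers author, date, and message."""
    data = "\n".join(["|".join(parents), tree, author, date, message])
    return hashlib.sha1(data.encode()).hexdigest()


parents = ["B", "C"]
tree = "merged tree contents"

# Two people perform the same merge independently.
alice = ("alice", "2008-07-08T10:00", "merge B and C")
bob = ("bob", "2008-07-08T11:00", "merge of the two heads")

# With metadata outside the ID, both merges collapse into one revision D.
alice_d = content_only_id(parents, tree)
bob_d = content_only_id(parents, tree)
print(alice_d == bob_d)

# With metadata inside the ID, the same merge yields two distinct revisions.
alice_commit = metadata_included_id(parents, tree, *alice)
bob_commit = metadata_included_id(parents, tree, *bob)
print(alice_commit == bob_commit)
```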

> In other DSCMS (git, bzr) revisions are stored individually, and if they are
> similar they are compressed. That achieves the same goal as just storing
> deltas; saving space. But also achieves the goal of efficiency on different
> operations.

This makes it more expensive to show a diff between two adjacent
revisions, a common operation. That said, from what I hear, git's
compression these days is SO GOOD that it achieves huge wins here.

> In Bazaar each branch is developed separately. That means there's only one
> head on each branch

Imagine a branch like this:

A -> B -> C -> D

If I have two checkouts from that and I make a different commit from
each, what happens? In Monotone, I get two heads. Is BZR/git/hg/whatever
going to stop me from committing the second change until I merge it up
with the new head? If so, I think this is a fundamental design flaw
because I can't easily preserve the state of the code BEFORE a risky
merge.
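
To make the two-heads case concrete, here is a small sketch that treats the revision graph as a plain parent map. The revision names E, F, and G are made up for the example, and this models no particular tool. A head is simply a revision that nothing else names as a parent, and after the merge both pre-merge states are still in the graph.

```python
def heads(parents_of):
    """Return the revisions that no other revision lists as a parent."""
    all_parents = {p for ps in parents_of.values() for p in ps}
    return sorted(rev for rev in parents_of if rev not in all_parents)


# A -> B -> C -> D
linear = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}

# Two checkouts of D each commit something different (hypothetical E and F).
forked = dict(linear, E=["D"], F=["D"])

# Later, someone merges the two heads into G; E and F remain reachable.
merged = dict(forked, G=["E", "F"])

print(heads(linear))  # ['D']
print(heads(forked))  # ['E', 'F']
print(heads(merged))  # ['G']
```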

> So each branch is simple; you can traverse it by looking at the parents of the
> head of the branch.

If this is your definition of traversing a branch, you can do the same
thing in Monotone, so I don't understand your concern above.

> In Git, each branch has only one head. So it's similar to bzr; branches are
> simple.

Same question as above. How do you prevent multiple heads from
happening?

> = Popularity =

> The more popular DSCMs right now are git and bzr, with hg not so far away.
> Choosing anything else doesn't seem to be a wise choice.

Right. I think this is also a potentially compelling reason to switch.

On Wed, 2008-07-09 at 01:05 +0300, Felipe Contreras wrote:
> Keith Packard understood this, and chose git knowing that the
> fundamentals were right [1].

Half of that document explains why a DVCS is better than a CVCS, so
that's not really a difference here.

Monotone has a much more flexible representation (with certs being
generic). Here your argument cuts against you. If you want to do
something new with git, you have to write support for another
"column" (using SQL terminology) in your "table", whereas with Monotone,
you just create the cert. For example, let's say you had Signed-Off-By
(which you don't) and now you want to add Reviewed-By...
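
A minimal sketch of the idea, assuming a cert is just a (signer, name, value) triple attached to a revision. The cert names and signers below are invented for the example, and the cryptographic signing is left out entirely. Adding a new kind of cert needs no change to the storage layout:

```python
from collections import defaultdict

# revision id -> list of (signer, name, value); signatures are not modeled.
certs = defaultdict(list)


def add_cert(revision, signer, name, value):
    """Attach an arbitrary key/value cert to a revision."""
    certs[revision].append((signer, name, value))


def cert_values(revision, name):
    """Return (signer, value) pairs for every cert with the given name."""
    return [(s, v) for s, n, v in certs[revision] if n == name]


# Hypothetical cert names; nothing here is built into any tool.
add_cert("rev-d", "alice", "signed-off-by", "alice@example.com")
add_cert("rev-d", "bob", "signed-off-by", "bob@example.com")

# A brand new kind of cert: no new "column" is needed.
add_cert("rev-d", "carol", "reviewed-by", "looks good")

print(cert_values("rev-d", "signed-off-by"))
print(cert_values("rev-d", "reviewed-by"))
```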

"Files containing object data are never modified. Once written, every
file is read-only from that point forward."

Strike "Files containing" and this applies to Monotone. In fact, that's where
it gets its name. There's always the danger that the database could be
corrupted. Basically, here we're comparing the potential for corrupting
a single-file database vs. a directory tree of files. In reality, I
don't think it's that big of a difference, but I do tend to trust ext3
more than sqlite. Of course, with this being about *distributed* version
control systems, you end up with backups all around the world regardless
of which system you choose.

However, if the files are write-once, then you must end up with at least
one file per revision (except, perhaps, for imports where you could batch
up the existing revisions). This leads to the same criticism that is
applied against SVN: "The FSFS backend places one file per revision in a single
directory; a test import of Mozilla generated hundreds of thousands of
files in this directory, causing performance to plummet as more
revisions were imported." How does git do write-once files without
having one file per revision?

> In any case; the important thing as you say is the cost of switching.
> I bet switching to mtn was painful, but that doesn't mean switching to
> git would be

Converting the repository from SVN was painful, but that cost wouldn't be
incurred again. Learning DVCS concepts took some work, which also
wouldn't have to be repeated.

I don't think that switching would be all that difficult and I like the
idea of choosing something that contributors would be more familiar with
(as well as something that has good hosting opportunities so said
contributors can really take advantage of the branch-sharing workflow).
I think the really killer application here is something like Launchpad,
with integrated code hosting, code review, bug tracking, etc.

If we decide to switch, I'd like to wait just a little longer before we
do it to see how git vs. bzr is going to turn out so we don't end up
switching twice.

Richard