Re: I'm not quite sure I buy that.
> % arch is now not only the most flexible
> % and distributed SCM system, but the
> % fastest and most space-efficient too.
> % You'll use only a fraction of the space
> % you need with other tools but without
> % loss of speed ;)
> tla 1.1 is certainly much faster than
> prior versions of tla, and there are
> many operations which it certainly does
> much faster than other tools. That said,
> I'm not sure I'd agree to "fastest"
> without hard numbers including
> comparisons of competitors (that is, svn
> and bk) --
In everyday operations like get (checkout) and changes (diff -R), if your working tree (a project) is hardlinked to the previous revision, it's almost impossible to be faster than arch. I'm not saying it can't be done, but neither the bk nor the svn architecture allows this kind of optimization. Even without hardlinks, the inode signature of each file allows lightning-fast comparisons. There's room for improvement in certain tagging methods, such as explicit tagging, but the current state is certainly very competitive.
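To make the inode-signature point concrete, here is a minimal sketch in Python. It is not tla's code: the function names and the choice of signature fields (device, inode, size, mtime) are my own assumptions, and it only illustrates why a tree hardlinked to a reference revision can be checked for changes without reading any file contents.

```python
import os
import tempfile


def inode_signature(path):
    """Cheap file identity (hypothetical choice of fields): no content is read."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino, st.st_size, st.st_mtime_ns)


def changed_files(tree, reference):
    """Relative paths under `tree` that are missing from, or differ from, `reference`."""
    changed = []
    for root, _dirs, files in os.walk(tree):
        for name in files:
            path = os.path.join(root, name)
            rel = os.path.relpath(path, tree)
            ref_path = os.path.join(reference, rel)
            if (not os.path.exists(ref_path)
                    or inode_signature(path) != inode_signature(ref_path)):
                changed.append(rel)
    return sorted(changed)


def demo():
    with tempfile.TemporaryDirectory() as base:
        ref = os.path.join(base, "revision")
        work = os.path.join(base, "working")
        os.mkdir(ref)
        os.mkdir(work)
        for name in ("a.txt", "b.txt"):
            with open(os.path.join(ref, name), "w") as f:
                f.write(name)
            # The working tree starts as hard links to the reference revision.
            os.link(os.path.join(ref, name), os.path.join(work, name))
        # Edit b.txt by replacing it, which breaks the hard link.
        os.remove(os.path.join(work, "b.txt"))
        with open(os.path.join(work, "b.txt"), "w") as f:
            f.write("edited")
        return changed_files(work, ref)
```

Calling `demo()` reports only `b.txt` as changed; `a.txt` is recognized as untouched from a single `stat` per file, which is the kind of shortcut I mean.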
> and I certainly don't agree
> with arch using a minimal amount of disk
> space, at least if one is using a
> filesystem which penalizes having a
> large number of small files --
ext3 has this kind of problem with lots of small files; it is not an arch issue. I do see a problem if you use explicit tagging, because you double the number of files (one extra file to store each id), but this is surmountable and is one of the points to improve during 1.2 development. I think the ext3 group is working on improving this situation; BTW, reiserfs does not suffer from it.
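A rough back-of-the-envelope sketch of the small-file penalty, in Python. The 4096-byte block size, the file counts and the file sizes are all assumptions picked for illustration, and inode and directory overhead is ignored; the point is only that on a filesystem that allocates whole blocks, a tiny id file costs as much as a small source file.

```python
def allocated_bytes(file_sizes, block_size=4096):
    """Space used when each file occupies a whole number of blocks (assumed model)."""
    # -(-a // b) is integer ceiling division.
    return sum(-(-size // block_size) * block_size for size in file_sizes)


def demo_overhead():
    sources = [1500] * 1000   # hypothetical: 1000 source files of 1500 bytes each
    id_files = [60] * 1000    # hypothetical: one 60-byte id file per source file
    plain = allocated_bytes(sources)
    tagged = allocated_bytes(sources + id_files)
    return plain, tagged
```

Under these assumptions the id files hold only 60 KB of data, yet they double the allocated space from 4,096,000 to 8,192,000 bytes.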
> and if
> one has very large revision libraries
> (ie. the complete gcc tree history),
> it's also for the better to be running a
> filesystem without a fixed maximum inode
With the new sparse revision libraries you do not need to keep all the revisions in the library. That's a big win, because you usually need only the recent revisions, not the ancient ones. Now it's up to you to decide which ones you need. And if you choose, whenever you need a revision that is missing (e.g. to diff against), it will be added automatically. Moreover, you can have libraries in multiple locations, a very useful feature when you work on different devices or computers, or when you want to share libraries within a development group.
One more plus: with tla-1.1 you no longer need pristine trees (full copies of a revision), which took a lot of space. They have been replaced by sparse revision libraries (that is the recommendation; in fact pristines are still there, but I'm sure they'll disappear in a couple of releases). Revision libraries are much more useful because they are hard linked to one another (less space, more efficiency) and shared (pristines belong to a working tree and, although they were searched for in sibling directories, that was only a good hack, now luckily superseded).
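For anyone who hasn't seen the hard-link trick, a minimal sketch in Python of how a library can share unchanged files between revisions. This is an illustration under my own assumptions (flat directories, made-up revision names like `patch-1`, hypothetical helper names), not how tla actually lays out or builds its libraries.

```python
import os
import tempfile


def add_revision(library, prev_name, new_name, changes):
    """Create library/new_name from library/prev_name (flat directories only).

    Unchanged files are hard-linked to the previous revision; files listed
    in `changes` (name -> new text) are written as fresh copies.
    """
    prev = os.path.join(library, prev_name)
    new = os.path.join(library, new_name)
    os.mkdir(new)
    for name in os.listdir(prev):
        if name not in changes:
            os.link(os.path.join(prev, name), os.path.join(new, name))
    for name, text in changes.items():
        with open(os.path.join(new, name), "w") as f:
            f.write(text)
    return new


def shared_files(rev_a, rev_b):
    """Names whose entries in both revisions point at the same inode."""
    shared = []
    for name in sorted(os.listdir(rev_a)):
        other = os.path.join(rev_b, name)
        if os.path.exists(other) and os.path.samefile(os.path.join(rev_a, name), other):
            shared.append(name)
    return shared


def demo_library():
    with tempfile.TemporaryDirectory() as library:
        base = os.path.join(library, "patch-1")
        os.mkdir(base)
        for name in ("Makefile", "main.c", "README"):
            with open(os.path.join(base, name), "w") as f:
                f.write("contents of " + name)
        new = add_revision(library, "patch-1", "patch-2",
                           {"main.c": "int main(void) { return 0; }\n"})
        return shared_files(base, new)
```

In `demo_library()` only `main.c` gets a new copy; `Makefile` and `README` exist once on disk no matter how many revisions reference them.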
> Keep in mind that many of Arch's design
> decisions emphasize robustness and
> simplicity over localized "common-case"
> optimizations which make presumptions
> about intended use-cases, with Tom's
> observations wrt how such optimizations
> are inappropriate given the state of
> modern computing hardware as the usual
> backing evidence. I certainly agree that
> this focus is The Right Thing to do, but
> it also means that arch isn't really
> optimized for tiny disk usage (at the
> expense of other things) in the same way
> that some of competing systems are.
I don't agree with you on this point. Revisions in a library are fully optimized caches for revision control use. They are like working trees that share all their unchanged inodes with the ancestor revision. You can even hardlink your working tree to a cached revision, making it even more effective. Caches should improve speed and, in this case, they do so with maximum optimization.
The archive is not stored in a fully optimized binary format but, as you say, this is a good decision (ext3 will catch up some day). It lets you host arch repositories without special servers and retrieve the information easily if you ever want to move to another system. A different backend server (svn, for instance) could be plugged into arch without great pain, as Tom has stated several times. The space trade-off is worth the simplicity and openness.
> Observe recent posts on arch-users
> regarding the cumulative disk usage of
> patch logs for an example of a case
> where arch uses more disk resources than
> might otherwise be the case. Is it an
> appropriate design decision for a modern
> revision control system, given the
> constraints within which it was
> designed? Absolutely. Does it make arch
> an optimally space-efficient revision
> control system? No, not really.
When you put a project under revision control, you should plan for adequate storage for the repository. If you have hundreds of changes, some space will be used. There are tricks to cut those numbers down if you suffer from that syndrome: recycle your archives from time to time, or create new tagged branches, and store a cached revision in the archive so that you do not need to keep the older patch logs around. The solution is already there, and I doubt you can get the same kind of optimization with other SCM tools.
It's time to move to arch and forget about CVS, bk and co
arch is now not only the most flexible and distributed SCM system, but the fastest and most space-efficient too. You'll use only a fraction of the space you need with other tools but without loss of speed ;)
Come on, what are you waiting for?