Re: [subversion-dev] Subversion design document up

From: Zack Weinberg <zack_at_wolery.cumb.org>
Date: 2000-06-05 18:28:17 CEST

On Mon, Jun 05, 2000 at 07:58:27AM -0400, Jonathan S. Shapiro wrote:
> > On Sun, Jun 04, 2000 at 11:08:29AM -0400, Jonathan S. Shapiro wrote:
> > > The SCCS storage format isn't terribly efficient. Actually, it's famous
> > > for having sucky performance.
> >
> > I am certain this is an artifact of the implementation.
>
> Actually, it's the nature of the file format. RCS stores the most recent
> version with deltas to get to old versions. SCCS stores the original and
> requires that all deltas be re-applied on each checkout (and checkin, for
> that matter). The SCCS file format is a bit cleaner, but after 100 deltas or
> so it's a profound lose.

This is not true. SCCS does not store the original plus a chain of
deltas to re-apply; it stores all the versions interleaved in one file.
As I said in the original message, it takes the same time to extract
*any* revision, and that time is always as good as or better than
RCS's, if implemented properly. We ran test repositories with 10,000
deltas in a file and it was still faster than RCS.
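
To illustrate why extraction time does not depend on which revision you
ask for, here is a minimal sketch of the interleaved idea. It assumes a
simplified in-memory format: the ("I", n) / ("D", n) / ("E", n) tuples,
the `extract` name, and the linear revision numbering are made up for
illustration and are not the real s.file syntax.

```python
def extract(weave, revision):
    """Return the lines of `revision` in a single pass over the weave.

    Control items are tuples: ("I", n) opens a block inserted by delta n,
    ("D", n) opens a block deleted by delta n, ("E", n) closes the most
    recently opened block belonging to delta n. Anything else is a text line.
    """
    active = []  # currently open (kind, serial) blocks
    out = []
    for item in weave:
        if isinstance(item, tuple):
            kind, serial = item
            if kind == "E":
                # Close the most recently opened block for this serial.
                for i in range(len(active) - 1, -1, -1):
                    if active[i][1] == serial:
                        del active[i]
                        break
            else:
                active.append((kind, serial))
        else:
            # A line is visible if every enclosing insert has already
            # happened and no enclosing delete has happened yet.
            visible = all(
                (serial <= revision) if kind == "I" else (serial > revision)
                for kind, serial in active
            )
            if visible:
                out.append(item)
    return out


# Revision 1 is a/b/c, revision 2 replaces "b" with "B",
# revision 3 appends "d".
weave = [
    ("I", 1), "a",
    ("D", 2), "b", ("E", 2),
    ("I", 2), "B", ("E", 2),
    "c",
    ("I", 3), "d", ("E", 3),
    ("E", 1),
]

for rev in (1, 2, 3):
    print(rev, extract(weave, rev))
```

Every revision costs one scan of the whole file, old or new, which is
the property I am describing; nothing has to be replayed delta by delta.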

I'll believe that the original SCCS was slower than RCS after a few
hundred deltas, but only because the implementation was lame.

> > I remember being annoyed at some version of PRCS because
> > the archives were always binary... In my opinion, if the data
> > stored in the file is 7-bit ASCII, the entire archive file should
> > be 7-bit ASCII.
>
> We have a significant disagreement here. In my opinion, the archive file
> format is the business of the archive, and it's none of the user's business
> whether that format is binary, ascii, or morse code.

I agree with the first half of that sentence but not the second. It
is the user's business. Why? Most obviously, because the repository
*will* get corrupted. Not because of filesystem or disk problems, but
because of bugs in the VC system. It is infinitely easier to debug a
text file.

A secondary but still important reason is that a text file can be
embedded in email with no special care. If you have a binary file,
you are forced to use MIME or uuencode. This is less important for
the archive files - but I've done tech support for VC, and it was a
great help to be able to ask someone to email me the problem s.file
without needing to explain MIME. It is essential for patches; emailed
distribution of patches _in human readable format_ is central to just
about everyone's development process. Now that format doesn't
necessarily need to be the same as the format used in the update
protocol, but wouldn't it be simpler to have just one?

> A binary convention within the repository actually simplifies things, as the
> server side of the repository system ceases to be conditioned on file type,
> which has proven a useful simplification for my stuff.

You're confusing treating all files the same with storing the metadata
in binary. It is easy to treat all files the same and still store the
metadata as text.
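
As a rough sketch of what I mean (the record layout, the `pack` and
`unpack` names, and the "length" field are hypothetical, not any real
archive format): the metadata is plain text lines, and the file content
is carried as opaque bytes that the code never inspects. It assumes
metadata keys and values are ASCII and contain no newlines.

```python
def pack(meta, payload):
    """Build a record: text header lines, a blank line, then the payload."""
    header = "".join(f"{key}: {value}\n" for key, value in meta.items())
    header += f"length: {len(payload)}\n\n"
    # The payload is appended verbatim; its type is never examined.
    return header.encode("ascii") + payload


def unpack(record):
    """Split a record back into its metadata dict and payload bytes."""
    head, _, rest = record.partition(b"\n\n")
    meta = dict(line.split(": ", 1) for line in head.decode("ascii").splitlines())
    length = int(meta.pop("length"))
    return meta, rest[:length]


text_record = pack({"rev": "3", "author": "zw"}, b"hello\n")
binary_record = pack({"rev": "1", "author": "zw"}, b"\x00\xff\n\n")

print(unpack(text_record))
print(unpack(binary_record))
```

The same two functions handle both records, and when the payload is
7-bit ASCII the whole record is too.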

zw
Received on Sat Oct 21 14:36:05 2006

This is an archived mail posted to the Subversion Dev mailing list.