
Re: SVN Book Method for Splitting Repos doesn't work

From: Roland Besserer <roland_at_motorola.com>
Date: 2004-12-30 21:57:25 CET

Sorry about mangling the name - honest typo :-)

Almost everything boils down to tradeoffs - we all have to make them in
our designs or implementations. I can certainly see the reason for
efficiency in the implementation of svn and the way data is stored and
managed. I do not think the efficiency argument is nearly as strong
for a dump utility and dump file format which are, by implication, used
sporadically. Since dumping is not one of the frequent operations
performed when running svn on a daily basis, efficiency concerns should
not supersede what I would call 'ease of use' - the ability to easily
process the dump file with the widest variety of tools. I simply fail
to understand why one would want to place any arbitrary restrictions
or limitations on the dump file format - particularly one that is
announced as human readable. In my opinion the dump file
format should be 7-bit ASCII (back to uuencode and the like :-)
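
To put rough numbers on the size cost of a 7-bit encoding, here is a small Python sketch using only the standard library. It compares base64 (close to uuencode in density) with a simple "=XX" escape scheme that leaves printable ASCII alone. The helper hex_escape, the ratio function and the sample data are made up for illustration; nothing here measures a real dump file.

```python
import base64


def hex_escape(data: bytes) -> bytes:
    """Escape every byte outside printable ASCII (and '=' itself) as =XX."""
    out = bytearray()
    for b in data:
        if 0x20 <= b <= 0x7E and b != 0x3D:
            out.append(b)          # printable ASCII passes through unchanged
        else:
            out += b"=%02X" % b    # everything else costs three bytes
    return bytes(out)


def ratio(encoded: bytes, raw: bytes) -> float:
    """Size of the encoded form relative to the raw bytes."""
    return len(encoded) / len(raw)


# Assumed sample: every byte value equally often, as compressed or
# otherwise binary content roughly would be.
sample = bytes(range(256)) * 100
high_bytes = b"\xff" * 1000        # worst case for the escape scheme

b64_ratio = ratio(base64.b64encode(sample), sample)
esc_ratio = ratio(hex_escape(sample), sample)
worst_ratio = ratio(hex_escape(high_bytes), high_bytes)

print(f"base64:            {b64_ratio:.2f}x")
print(f"=XX escape (even): {esc_ratio:.2f}x")
print(f"=XX escape (worst): {worst_ratio:.2f}x")
```

The block encodings (base64, uuencode) cost about one third extra regardless of content, while escape-style encodings stay readable for text but approach three times the size on binary data.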

Anyway, not much point in rambling on about this. I can always write
my own dump routine :-) ... but I think I can live with what we've got.

roland

"Max Bowsher"<maxb@ukf.net> writes:

> Roland Besserer wrote:
> ...
> > I would like to comment on the concept of 'human readable' though.
> > Although emacs (for example) can easily handle binary files just
> > dumped into the output file, including 8-bit data sure doesn't make
> > the dump file human readable anymore. It also makes processing the
> > dump file with text (or more accurately line) oriented tools error
> > prone.
> >
> > SVN is already, in my opinion, somewhat handicapped by the fact that
> > it uses a database backend
>
> You seem to have ignored FSFS.
>
> > and thus a binary file format that puts you
> > at the mercy of the decode/repair tools specifically designed for
> > it.
>
> Under the hood, the formats really aren't that much more difficult to
> comprehend than CVS's. Anyone who really wants to peek under the
> covers is free to do so.
>
> > It would be nice if at least the dump file format would stick to
> > an ASCII only representation that makes processing of dump files with
> > 'standard' utilities easy and less error prone.
> >
> > Max Bowsler made the interesting comment that "Personally, I think
> > that uuencoding (or similar) doesn't increase human-readability, it
> > just wastes processing time" which I completely disagree with. Who
> > cares about minute incremental decoding time or even file size in
> > this age of multi-GHz processors and 100GB disks? Human readable
> > is a term that should not be taken literally. To me it means that it
> > is an ASCII/text based representation I can feed any tool like sed or
> > awk with.
>
> Hi, it's me again :-). "Bowsher" not "Bowsler", by the way.
>
> This is a debate about tradeoffs -
>
> Subversion saves processing time and file size, at the cost of putting
> greater requirements on the tools used.
>
> I happen to feel that this is the right tradeoff to make in this case.
>
> Any small overhead can become quite magnified when dealing with
> gigabytes of data, and if you want to restrict the available byte
> values to printable ASCII, then the amount of space required to store
> arbitrary data will increase by approximately a factor of 3.
>
> The downside, of course, is the increased restrictions on the tools:
>
> I think expecting data processing tools to be 8-bit clean is a
> reasonable demand for newly engineered systems today.
>
> There is the further complication, of course, of dumpfile-header-like
> data appearing in the middle of file content - I admit that this is a
> harder problem. However, both perl and python are excellent tools, and
> the dumpfile format has been deliberately designed to be easily
> parseable, offering a way to cleanly circumvent this issue.
>
>
> One particular choice of tradeoffs will never be the optimum for all cases.
> The particular choice made by Subversion happens to work very nicely
> for the common cases of using dumpstreams for backup and migration.
>
>
> Max.
>
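
The length-based parsing that the quoted reply alludes to can be sketched in Python roughly as follows. This is a minimal sketch, not Subversion's own parser: it assumes each record is a block of "Name: value" header lines ended by a blank line, optionally followed by exactly Content-length bytes of body. The function name read_records and the sample data are made up for illustration.

```python
import io


def read_records(stream):
    """Yield (headers, body) pairs from a dump-like binary stream."""
    while True:
        line = stream.readline()
        # Skip blank separator lines between records.
        while line == b"\n":
            line = stream.readline()
        if not line:
            return  # end of stream
        # Header block: "Name: value" lines up to the first blank line.
        headers = {}
        while line not in (b"\n", b""):
            name, _, value = line.rstrip(b"\n").partition(b": ")
            headers[name.decode("ascii")] = value.decode("utf-8")
            line = stream.readline()
        # Read the body by byte count, never by scanning for the next header.
        length = int(headers.get("Content-length", "0"))
        yield headers, stream.read(length)


# Synthetic sample: the file content is 8-bit data that also contains a
# line looking exactly like a dumpfile header.
content = b"hello\nRevision-number: 99\n\x00\xff\n"
props = b"PROPS-END\n"
node = (
    b"Node-path: trunk/a.bin\n"
    b"Node-kind: file\n"
    b"Node-action: add\n"
    b"Prop-content-length: %d\n"
    b"Text-content-length: %d\n"
    b"Content-length: %d\n\n"
    % (len(props), len(content), len(props) + len(content))
) + props + content + b"\n\n"
sample = (
    b"SVN-fs-dump-format-version: 2\n\n"
    b"Revision-number: 1\nProp-content-length: 10\nContent-length: 10\n\n"
    + props + b"\n" + node
)

records = list(read_records(io.BytesIO(sample)))
for headers, body in records:
    print(sorted(headers), len(body))
```

Because the body is consumed by its declared length, the header-like line inside the file content never reaches the header parser, which is the failure mode that line-oriented tools such as sed or awk run into.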

Received on Thu Dec 30 21:59:52 2004

This is an archived mail posted to the Subversion Users mailing list.