
RE: svnrdump: The BIG update

From: Bert Huijben <bert_at_vmoo.com>
Date: Tue, 17 Aug 2010 09:30:08 -0700

> -----Original Message-----
> From: Ramkumar Ramachandra [mailto:artagnon_at_gmail.com]
> Sent: Tuesday, 17 August 2010 9:09
> To: Daniel Shahaf
> Cc: Subversion-dev Mailing List
> Subject: Re: svnrdump: The BIG update
> Hi Daniel,
> Daniel Shahaf writes:
> > Ramkumar Ramachandra wrote on Thu, Aug 12, 2010 at 12:17:34 +0530:
> > > > > The dump functionality is also complete- thanks to Stefan's review and
> > > > > MANY others for cleaning it up. It's however hit a brick wall now
> > > > > because of missing headers in the RA layer. Until I (or someone)
> > > > > figures out how to fix the RA layer, we can't do better than the
> > > > > copy-and-modify test I've committed.
> > > >
> > > > Part of the diff there is lack of SHA-1 headers --- which is
> > > > until editor is revved --- but part of it is a missing md5.
> > > > Why don't you output that information --- doesn't the editor give it you?
> > >
> > > Afaik, no. I don't see Text-copy-source-* anywhere in the RA
> > > layer. Maybe I'm not looking hard enough?
> > >
> >
> > Hmm. It seems you're right. So you might have to use two RA sessions in
> > parallel...
> >
> > (and then, you might have to have the user authenticate twice?)
> Hm, I also have to find out if it's allowed. The commit_editor doesn't
> allow it for instance. Besides, it's a very inelegant solution- I'd
> rather fix the RA layer than do this.

@Daniel, what would adding these headers add?

The extra headers exist to make corruption easier to detect: the checksums
can be verified at each step of the transfer.
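
As a rough illustration of that kind of check, here is a minimal Python
sketch (not Subversion code; the function name is made up for this example).
It assumes the node's headers are already parsed into a dict, and it uses the
Text-content-md5 and Text-content-sha1 header names from the dump format.
Headers that are absent are simply skipped.

```python
import hashlib

# Checksum headers that may accompany a node's text in a dump file.
# Which ones are present depends on the tool that wrote the dump.
CHECKSUM_HEADERS = {
    "Text-content-md5": "md5",
    "Text-content-sha1": "sha1",
}


def verify_text_checksums(headers, content):
    """Return the names of checksum headers that do not match content.

    headers: dict mapping header name to hex digest string (assumed parsed).
    content: the node's text as bytes.
    """
    mismatches = []
    for name, algorithm in CHECKSUM_HEADERS.items():
        expected = headers.get(name)
        if expected is None:
            # Optional header not present: nothing to check.
            continue
        actual = hashlib.new(algorithm, content).hexdigest()
        if actual != expected.lower():
            mismatches.append(name)
    return mismatches
```

A dump with neither header passes this check trivially, which is the point:
the headers add a way to detect damage, they are not required for the dump
to be usable.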

If we do additional work just to add those headers via a separate process,
it slows dumping down more than a bit, and it doesn't make the dump file
any safer, because the header is obtained through a different process.
I think you would have to obtain the source of the copyfrom and get some
checksum from that; maybe you can do that without transferring the file
again, but I'm not sure about that.

(And without the added headers the process is already as safe as svnsync.)

Yes, we can add more and more processing to also get those new SHA-1 headers
by recalculating them while dumping, but the idea for svnrdump was to create
a fast and secure way to dump and load repositories... not an incredibly
slow one that has to transfer files multiple times just to make all the
optional headers match the output of svnadmin.
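
To make the cost argument concrete, here is a small Python sketch (again not
svnrdump code; the function name is hypothetical). It assumes the text
arrives as an iterable of byte chunks. Checksums over text that is already
being streamed can be computed in the same pass, so they cost CPU but no
extra transfer. A checksum of a copy source is different: it needs the
source's bytes, which as far as I can tell would have to be fetched
separately.

```python
import hashlib


def checksums_in_one_pass(chunks):
    """Compute MD5 and SHA-1 hex digests and total length in a single pass.

    chunks: iterable of bytes objects, e.g. the text as it is streamed.
    Returns (md5_hex, sha1_hex, length).
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    length = 0
    for chunk in chunks:
        # Feed both digests from the same chunk; the data is read only once.
        md5.update(chunk)
        sha1.update(chunk)
        length += len(chunk)
    return md5.hexdigest(), sha1.hexdigest(), length
```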

Those headers were made optional for a reason: you don't always have them.
And different conversion processes have different headers available.
Svnadmin looks at the FS layer for dumping, so it sees different things than
an RA layer API. E.g. the dump in svnadmin has to create diffs from
fulltexts itself, while svnrdump receives diffs and must apply these itself
to get full texts. The checksums have a similar mangling. The FS has access
to some of the checksums and recalculates others for you. (See the
performance drop of svnadmin dump in 1.6.)

There is a similar case on the import side. Applying commits can't check all
the checksums, but the really important ones are already handled. Svnrdump
dump and svnrdump load are a nice match.

Received on 2010-08-17 18:31:18 CEST

This is an archived mail posted to the Subversion Dev mailing list.