
Re: Is there a way to dump the checksums from a svn repo?

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Thu, 29 Nov 2012 21:33:26 +0200

Philip Martin wrote on Thu, Nov 29, 2012 at 19:13:11 +0000:
> Daniel Shahaf <d.s_at_daniel.shahaf.name> writes:
>
> > Philip Martin wrote on Thu, Nov 29, 2012 at 18:26:04 +0000:
> >> Daniel Shahaf <d.s_at_daniel.shahaf.name> writes:
> >>
> >> > Les Mikesell wrote on Thu, Nov 29, 2012 at 09:59:47 -0600:
> >> >> But, the copy built by svnsync doesn't necessarily
> >> >> get stored the same way, does it?
> >> >
> >> > I think in 1.8/fsfs it will be byte-for-byte identical. (except
> >> > rep-cache.db, but you can remove that file without consequences)
> >> >
> >> > There was a dev@ thread by philipm about this not too long ago.
> >>
> >> No, an svnsync mirror is usually not identical to the master. It does
> >> contain the same versioned data but the representation of that data is
> >> different. For example, every failed commit on the master will bump the
> >> fsfs sequence number and that will cause the node-revision-ids to be
> >> different.
> >
> > Node-revision-id's in revisions don't embed transaction id's...
> >
> > For example the noderev header (yes, header, not just id) of
> > /subversion/trunk/notes is identical between svn.us and svn.eu.
>
> OK. But the sequence number differences do show up in other places:
>
> Further, node-revision-ids can vary for other reasons. Representations
> in the revision files are in whatever order the client sends
> representations to the server. There are no defined orders for clients
> to use so it is quite likely that commits to the master and the mirror
> will use different orders:
>
> That affects the offsets in the text: lines, often changing the line
> length, which in turn affects the position of the subsequent nodes, and
> the position of the nodes affects the node-revision-ids.
>

Yes, that's exactly what your thread <87mx2hw607.fsf_at_stat.home.lan> was
about. I thought in the end that patch got committed?

> svnadmin create repo
> svn mkdir -mm file://`pwd`/repo/A # r1
> svn mkdir -mm file://`pwd`/repo/A # fail
> svn mkdir -mm file://`pwd`/repo/A/B # r2
> svnadmin create repo2
> svnadmin dump repo | svnadmin load repo2
> diff repo/db/revs/0/2 repo2/db/revs/0/2
> 37c37
> < _1.0.t1-2 add-dir false false /A/B
> ---
> > _1.0.t1-1 add-dir false false /A/B
>

Well, that answers the question: revision files are not byte-for-byte
identical.
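
To check this across a whole repository rather than a single revision, something like the following might do. It is only a sketch: compare_revs is a made-up helper name, and it assumes both repositories are fsfs with unpacked shards under db/revs/<shard>/<rev>, as in the diff above. It does not look at packed shards, revprops or rep-cache.db.

```shell
# Hypothetical helper: report revision files that are not byte-for-byte
# identical between two fsfs repositories.
# Assumption: unpacked shards laid out as db/revs/<shard>/<rev>.
compare_revs() {
    master=$1
    mirror=$2
    status=0
    for f in "$master"/db/revs/*/*; do
        # Skip anything that is not a regular file (e.g. an unmatched glob).
        [ -f "$f" ] || continue
        # Path of the revision file relative to the repository root.
        rel=${f#"$master"/}
        # cmp -s is silent and only sets the exit status; a file that is
        # missing from the mirror is reported the same way as a differing one.
        if ! cmp -s "$f" "$mirror/$rel" 2>/dev/null; then
            echo "differs: $rel"
            status=1
        fi
    done
    return $status
}
```

Run as "compare_revs repo repo2" after the dump/load above. It prints one line per differing revision file and returns non-zero if any differ.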

I wonder, though, if we should be rewriting these to use the revfile
noderev id's? If not to avoid _* id's in revfiles, then to make the
revfiles deterministic by using the ("stable") revfile noderev id's ---
for the reasons given in your linked thread.
Received on 2012-11-29 20:34:22 CET

This is an archived mail posted to the Subversion Users mailing list.
