
Re: SHA-1 collision in repository?

From: Nico Kadel-Garcia <nkadel_at_gmail.com>
Date: Wed, 28 Feb 2018 09:17:17 -0500

On Tue, Feb 27, 2018 at 4:09 PM, Myria <myriachan_at_gmail.com> wrote:

> Not to mention that the two revisions complained about are unrelated, and
> 2/3 the repository history apart.
>
> One thing that's interesting is that the commit the svnsync failed on is a
> gigantic commit. It's 1.8 GB. Maybe that svnsync is failing because of a
> Subversion bug with huge files...?

Hmm. Could 2 GB file size limits be involved?

When someone starts encountering this kind of issue with such large
commits, it leads me to think "what the heck was in that commit"?
There are various tools more likely to break when hammered that hard,
such as pre-commit hooks written carelessly in Python that try to
preload a hash with the contents of the file and just say "holy son
of a !@#$, I'm out of resources!!!". Been there, done that, had to
explain the concept of reading a text file with a loop to the
programmer in question.
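
For anyone wondering what "reading a file with a loop" looks like in a hook, here is a minimal sketch. It assumes the hook only needs a SHA-1 of the file; the function name `sha1_of_file` and the 1 MiB chunk size are my own illustrative choices, not anything from the hook in question.

```python
import hashlib


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-1 hex digest of a file, reading it in fixed-size chunks.

    Hypothetical helper: memory use stays near chunk_size no matter how
    large the file is, instead of loading the whole file at once.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling handle.read(chunk_size)
        # until it returns b"", which signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result is the same digest you would get from hashing the whole file in one call, so a hook written this way should behave identically on small files and simply not fall over on a 1.8 GB one.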

Also, I'd like to think outside the box at such a point and say "can
that commit be skipped? is there anything in it that we actually need?
can we just do an export/import to a new repo, discard the old repo's
history, and get back to work?" And, since git-svn has been a useful tool
for re-arranging and discarding undesired branches, tags, and history,
"can we do a git-svn export, flush the history we don't need, and publish
it back up to a new canonical Subversion repository?" I've used that
approach now for several Subversion upgrades effectively, especially
to allow some sanitization of the Subversion history. I know such
history cleanup is often discouraged, that the history is considered
the critical component of the source control system, but there are
many environments where legacy history is no longer needed. I wonder
if this is one, and the questionable commit itself could be dumped.

> I started an svnadmin verify on my incomplete local copy last night, and no
> problems were reported when it finished this morning. I'll try again with
> this -M option you mention.
>
> I'll also start an svnsync from a Linux machine.
>
> I'm going to see how hard it would be to just copy the 43 GB repository
> directly. We'd have to shut down Subversion service during the copy, so it
> might be a while before I have a chance to.

Would you? If you can use a "rsync" based operation, such as mounting
the share on the Linux system via CIFS and using "rsync", you should
be able to verify that no operations occur during the filesystem based
replication and re-run the "rsync" command when completed, to catch
any dangling operations. If the Subversion server is busy, you might
have to block "write" operations for a while to support consistent
replication.
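
If you want an independent check that nothing changed under you while the copy ran, one rough way is to snapshot file sizes and modification times before and after and compare them. This is only a sketch and a stand-in for what the second "rsync" pass tells you; the function names `snapshot` and `changed_between` are hypothetical, and a size/mtime comparison can miss a rewrite that keeps both the same.

```python
import os


def snapshot(root):
    """Map each file path (relative to root) to its (size, mtime_ns)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.stat(full)
            state[os.path.relpath(full, root)] = (info.st_size, info.st_mtime_ns)
    return state


def changed_between(before, after):
    """Return sorted relative paths added, removed, or modified between snapshots."""
    paths = set(before) | set(after)
    return sorted(p for p in paths if before.get(p) != after.get(p))
```

Take one snapshot of the repository directory before the copy starts and another when it finishes; an empty list from `changed_between` suggests no writes landed in between, and a non-empty one tells you which files to copy again.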

With tools like Cygwin, the files could also be rsynced and copied
locally on the Windows box, then sent over to the Linux box. Or an
external USB drive could be used, or many other tools.

>>
>>
>>
>> --
>> Philip
Received on 2018-02-28 15:17:28 CET

This is an archived mail posted to the Subversion Users mailing list.
