[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: enforcing LF-normalization for svn:eol-style=native files (issue #4065)

From: Johan Corveleyn <jcorvel_at_gmail.com>
Date: Sun, 9 Dec 2012 21:15:24 +0100

On Sun, Dec 9, 2012 at 11:42 AM, Branko Čibej <brane_at_wandisco.com> wrote:
> On 09.12.2012 09:14, Daniel Shahaf wrote:
>> Johan Corveleyn wrote on Sun, Dec 09, 2012 at 01:02:55 +0100:
>>> Last week a colleague managed to commit a non-LF-normalized
>>> svn:eol-style=native file in our repository again. As explained in
>>> issue #4065 [1], this causes all kinds of problems.
>>>
>>> I suspect there might be a bug in SVNKit, some edge-case where it
>>> doesn't correctly translate the to-be-committed files to
>>> LF-termination. But regardless, I'd like to protect my repository. I
>>> don't fully control the clients that people use, and those clients can
>>> always have bugs, ...
>>>
>>> So ... how hard would it be to fix issue #4065, making the server
>>> enforce the right eol-ness for correct operation? It's a genuine
>>> question, I have no idea :-).
>>>
>> for i in $(svnlook changed -t $TXN $REPOS); do
>> if propget = native || propget = LF ; then
>> svnlook cat -t $TXN $REPOS $i | xxd -c1 -p | grep -i 0d
>> fi
>> done
>>
>> And writing that made me realise... "contains CR" is such a simple
>> condition, that you don't need the fulltext for it --- you would be able
>> to implement it by looking at the "literal new text" segments within
>> svndiff streams directly. However, I'm not quite sure whether this
>> observation is the key to a simple implementation or to an
>> unnecessarily-complicated one.
>>
>>> [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4065 - server
>>> should enforce LF normalization for svn:eol-style=native files
>
> The server does not look at the contents of files, or the value of
> properties. It does not even know that properties /have/ semantics. The
> only reasonable place to do this would be in a pre-commit hook. Anything
> else would require a major change in the design of the server and/or
> libsvn_repos.
>
> If the client encounters non-normalized files with svn:eol-style=native
> and does /not/ properly normalize them regardless, that's a client bug.
> Though of course the file would immediately show up as modified as soon
> as it was updated.

Bah, that just sounds like the client (any client) should paper over
these "corrupt files". I don't think that's a good solution.

These files are present in the repository in a corrupt form, not
following the expected invariant that they should have LF line
endings. If you 'svnlook cat' such a file, it will have CRLF line
endings, while normal svn:eol-style=native files will always have LF
line endings when 'svnlook cat'ed.

Any "client" that diffs that change ('svnlook diff', 'svn diff -c $REV
$REPOS', ...), will see a full file diff. They would all have to
normalize that "corrupt file" on the fly to give a good diff like it
should have been if it weren't corrupted.

Note that the Apache repository itself already contains such
corruptions (see [1], the user thread started by Konstantin Kolinko,
which prompted me to file the issue). Any large repository will have
them, because of the diversity of svn clients and bugs they may
contain. These buggy clients ruin the experience for everyone else.
I'd like there to be some better protection against this problem,
centrally.

A pre-commit hook is an option, and I'll probably have to go that
route if there is no other way. But I see two problems with it:

1) It's dog slow. Doing an "svnlook pg" and then potentially "svnlook
cat" for every changed file can easily make the pre-commit hook take
longer than the standard client-side timeout of 30 seconds (for neon,
dunno about the serf timeout). Just imagine doing a catch-up merge
touching 1000 files. Parsing both the properties and the content from
one "svnlook diff" call might be a workaround, but yuck.

2) Am I the only one who wants to protect his repository against this
corruption? Judging from [1], I don't think so. It doesn't make sense
that everyone starts writing this pre-commit hook, for something that
IMHO is an obvious anti-corruption protection. I think every
repository out there deserves to be protected against this.

[1] http://svn.haxx.se/users/archive-2011-10/0287.shtml

-- 
Johan
Received on 2012-12-09 21:16:18 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.