
Re: So which is subversion wrong about; changing imported files

From: Ryan Schmidt <subversion-2007a_at_ryandesign.com>
Date: 2007-03-03 04:26:45 CET

Yeah, I'm not sure what exactly your question is, but I will say this:

On Mar 2, 2007, at 13:47, Jeff Smith wrote:

> 1. Subversion will always PRESERVE the original file when importing
> (many authors)
[snip]
> I can completely support 1, however:
> The problem is that those insisting that 1 is solid have also proven
> to me that when dealing with svn:eol-style (in auto-properties),
> Subversion breaks the law. In the repository, it "normalizes" the
> file. Read "I, Robot" by Isaac Asimov, and see that these guys are
> probably calling it "normalize" instead of "modify" to save
> developers from breaking down, realizing that they have actually
> modified the file on import. Once imported, there is ABSOLUTELY no
> indication of whether that file originally ended lines with CR, LF,
> or CRLF characters. Let's face it... the file has been modified! I
> would rather subversion be consistent, but preserving this file.

I think I agree: Subversion has modified the file. However, I think
that by using the svn:eol-style property, you have asked Subversion
to modify the file. I'm not sure how the feature could otherwise
function.
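
To illustrate why any such normalization loses information, here is a
minimal Python sketch. It is not Subversion's actual code: the
normalize_eol helper is hypothetical, and the choice of LF as the
normalized form is an assumption made for the example.

```python
import re

def normalize_eol(data: bytes) -> bytes:
    """Hypothetical helper: rewrite CRLF and lone CR line endings as LF."""
    return re.sub(rb"\r\n|\r", b"\n", data)

# Three files with the same text but different line-ending conventions.
unix_file = b"one\ntwo\n"
dos_file = b"one\r\ntwo\r\n"
mac_file = b"one\rtwo\r"

# After normalization the three are byte-for-byte identical, so nothing
# in the result records which convention the original file used.
normalized = {normalize_eol(f) for f in (unix_file, dos_file, mac_file)}
print(len(normalized))  # prints 1
```

Once the three inputs collapse to one output, the original convention
can only be recovered if it is recorded somewhere outside the file.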

In researching this earlier, I found a Wikipedia page describing the
Unicode-approved way of dealing with newlines in a file:

From http://en.wikipedia.org/wiki/Newline#Unicode :

> The Unicode standard addresses the problem by defining a large
> number of characters that conforming applications should recognize
> as line terminators:
>
> LF: Line Feed, U+000A
> CR: Carriage Return, U+000D
> CR+LF: CR followed by LF, U+000D followed by U+000A
> NEL: Next Line, U+0085
> FF: Form Feed, U+000C
> LS: Line Separator, U+2028
> PS: Paragraph Separator, U+2029
>
> This may seem overly complicated compared to a simple approach like
> converting all line terminators to a single character, for example
> LF. The simple approach breaks down, however, when trying to
> convert a text file from an encoding like EBCDIC to Unicode and
> back. When converting to Unicode, NEL would have to be replaced by
> LF, but when converting back it would be impossible to decide if a
> LF should be mapped to an EBCDIC LF or NEL. The approach taken in
> the Unicode standard allows this transformation to be information-
> preserving while still enabling applications to recognize all
> possible types of line terminators.

So, it seems the Unicode recommendation coincides with your feeling
that Subversion should preserve whatever line endings the file
contains, and not modify them. However, this requires all the tools
you use to obey this as well, and to recognize all of the above
Unicode characters as line endings. Subversion can of course achieve
this preserving behavior if you simply do not set the svn:eol-style
property. It seems that any use of the svn:eol-style property the way
Subversion currently implements it (and perhaps in any way in which
it could be implemented) violates the above Unicode line ending
principles.
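
As a rough sketch of what "recognize every terminator but preserve
it" can look like in a tool, Python's built-in str.splitlines treats
all of the terminators listed above as line boundaries (along with a
few other control characters), and keepends=True leaves each one in
place. The sample string is made up for the example.

```python
# One sample line per terminator from the list above:
# LF, CR, CR+LF, NEL, FF, LS, PS, then a final unterminated line.
text = "a\nb\rc\r\nd\x85e\x0cf\u2028g\u2029h"

# keepends=True keeps each terminator attached to its line.
lines = text.splitlines(keepends=True)

print(len(lines))              # prints 8
print("".join(lines) == text)  # prints True: nothing was modified
```

A tool built this way can process the file line by line and still
write back exactly the bytes it read.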

If the only reason for the Unicode recommendation is support of
EBCDIC systems, then one can decide for oneself how important EBCDIC
support is. To me, for example, it's not important at all.
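
For anyone who does care, the round-trip problem from the quoted
passage can be sketched in Python. This assumes the cp037 codec as a
representative EBCDIC code page; the two function names are
hypothetical and exist only for the illustration.

```python
def to_unicode_preserving(ebcdic: bytes) -> str:
    """Decode EBCDIC, keeping NEL (U+0085) distinct from LF (U+000A)."""
    return ebcdic.decode("cp037")

def to_unicode_normalizing(ebcdic: bytes) -> str:
    """Decode EBCDIC, then apply the 'simple approach': turn NEL into LF."""
    return ebcdic.decode("cp037").replace("\x85", "\n")

# Build an EBCDIC sample that uses both NEL and LF.
nel = "\x85".encode("cp037")
lf = "\n".encode("cp037")
original = "one".encode("cp037") + nel + "two".encode("cp037") + lf

# Convert to Unicode and back again with each approach.
preserved = to_unicode_preserving(original).encode("cp037")
normalized = to_unicode_normalizing(original).encode("cp037")

print(preserved == original)   # prints True: the round trip is lossless
print(normalized == original)  # prints False: the NEL came back as an LF
```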

I'm not sure if I'm arguing for any particular behavior in Subversion
or recommending any particular practice by people who use Subversion,
just noting what I found.

Received on Sat Mar 3 04:27:19 2007

This is an archived mail posted to the Subversion Users mailing list.
