
Re: [PATCH] $LastChangedDate$ encoding

From: Vincent Lefevre <vincent+svn_at_vinc17.org>
Date: 2006-05-07 15:27:47 CEST

On 2006-05-07 05:40:40 -0500, Peter Samuelson wrote:
> Furthermore, I view keyword expansion as an action specific to a WC.

I disagree, in particular because the content encoding is fixed,
and Subversion doesn't convert the encoding used on the repository
side into the user's locale encoding (and IMHO it shouldn't, except
possibly in a future extension, as with svn:eol-style).

> The encoding should be consistent with filenames, which are also
> specific to a WC.

There's absolutely no reason why they should be the same.

> Ivan argued for not localising the keyword expansion at all, but using
> English with ASCII. [Note: I think that is reasonable, but perhaps not
> easy to implement efficiently, as it may involve switching locales.]

Well, Subversion already uses a locale-independent format for the date
in the Id keyword.
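
To illustrate what "locale-independent" means here, a minimal sketch that
formats a timestamp using only numeric fields in UTC, in the style of the date
that appears in an expanded Id keyword. The function name id_style_date is
hypothetical and this is not Subversion's actual code, only an illustration
of the idea.

```python
from datetime import datetime, timedelta, timezone


def id_style_date(ts: datetime) -> str:
    """Format a timezone-aware timestamp with numeric fields only, in UTC.

    Hypothetical helper: it uses no day or month names, so the result
    is pure ASCII and does not depend on the user's locale.
    """
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%SZ")


# Example input: 15:27:47 in a UTC+2 zone (CEST) is 13:27:47 UTC.
cest = timezone(timedelta(hours=2))
example = id_style_date(datetime(2006, 5, 7, 15, 27, 47, tzinfo=cest))
```

Because no localised names are involved, no locale switching is needed to
produce this form.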

> The reason I bring this up is that I applied my patch to the Debian
> unstable version of svn 1.3.1, and I got an angry complaint about it,

[from me]

> partly because my package is now inconsistent with official svn. I
> view my patch as a bug fix, but if consensus is reached that the status
> quo really is better, I'll seriously consider reverting the patch.
> Further arguments: http://bugs.debian.org/290774

I've posted the pros and cons of the various solutions available until
advanced keyword expansion is implemented:

    * Using UTF-8 (current behavior):
      + Pros: fixed encoding; no loss; compatible with file formats
        based on UTF-8, which are common (UTF-8 is more or less the
        default encoding nowadays).
      + Cons: may be incompatible with some documents.

    * Using US-ASCII (transliteration):
      + Pros: fixed encoding; compatible with any encoding (except
        EBCDIC, but this one is not tractable) and any file format.
      + Cons: small loss for non-ASCII characters.

    * Using the encoding specified by the locales:
      + Pros: compatible with tools that don't understand encodings
        different from the one specified by the locales.
      + Cons: all the documents using keywords should have the same
        encoding; also requires every user of the repository to use
        the same locales or compatible ones (which may require root
        access to install them, or may not even be available on some
        OS's); if externals are used, the corresponding repositories
        should assume compatible encodings; not backward compatible.
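
The three options above can be sketched as follows. This is only an
illustration, not Subversion's implementation: the function names and the
sample date string are made up, and the ASCII transliteration is a crude one
(strip accents, replace anything else with "?").

```python
import unicodedata


def expand_utf8(value: str) -> bytes:
    """Option 1: always write the keyword value as UTF-8 (no loss)."""
    return value.encode("utf-8")


def expand_ascii(value: str) -> bytes:
    """Option 2: transliterate to US-ASCII (small loss).

    Crude sketch: decompose accented characters, drop the combining
    marks, and replace whatever is still non-ASCII with '?'.
    """
    decomposed = unicodedata.normalize("NFKD", value)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.encode("ascii", errors="replace")


def expand_locale(value: str, locale_encoding: str) -> bytes:
    """Option 3: write the value in the encoding given by the user's locale.

    The encoding is passed explicitly here so that the example does not
    depend on the machine it runs on.
    """
    return value.encode(locale_encoding, errors="replace")


# Hypothetical localised date with a non-ASCII month name (French "août").
sample = "2006-08-05 15:27:47 +0200 (sam., 05 août 2006)"

as_utf8 = expand_utf8(sample)
as_ascii = expand_ascii(sample)
as_latin1 = expand_locale(sample, "iso-8859-1")
```

With these inputs, the UTF-8 and ISO-8859-1 results differ at the byte level
for the same file content, which is the incompatibility between users with
different locales described in the third item; the ASCII result is identical
everywhere but spells the month "aout".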

The first two are OK for me, but not the third. If the behavior is
made configurable, that is also OK for me.

Vincent Lefèvre <vincent_at_vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA
Received on Sun May 7 15:28:15 2006

This is an archived mail posted to the Subversion Dev mailing list.