
Re: svn log --xml not generating valid utf-8

From: B. Smith-Mannschott <bsmith.occs_at_gmail.com>
Date: Fri, 6 Nov 2009 19:38:23 +0100

On Fri, Nov 6, 2009 at 18:26, Justin Michel <justin_michel_at_hotmail.com> wrote:
> This usually happens when one of our devs enters a comment containing
> non-ascii text. There's a lot of this in a large legacy project we've
> inherited, and it makes it inconvenient to use tools that post-process the
> logs to extract information. (e.g. statsvn)

I suspect:

Subversion <= 1.5 assumes that the bytes for the log message (and
other internal subversion properties) are UTF-8 but does not actually
verify this.
This works provided the client software transcodes the message to
UTF-8 itself. I believe the svn client learns which encoding the
console uses from the environment variables LANG and friends.
If Subversion believes that the console is using a different encoding
than it actually is, hilarity ensues.
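
To make the mismatch concrete, here is a minimal Python sketch. It is not
Subversion's code, only a model of the behaviour described above. It assumes
a console that really sends Latin-1 bytes; the sample message and the helper
name is_valid_utf8 are made up for illustration.

```python
# Hypothetical log message, as typed on a console that uses Latin-1.
message = "Größe geändert"
console_bytes = message.encode("latin-1")

# Case 1: the client knows the console is Latin-1 and transcodes to UTF-8.
stored_ok = console_bytes.decode("latin-1").encode("utf-8")

# Case 2: the client wrongly treats the bytes as UTF-8 already and stores
# them unchanged, without verifying them.
stored_bad = console_bytes


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(stored_ok))
print(is_valid_utf8(stored_bad))
```

In the second case the stored bytes are not valid UTF-8, and anything that
later copies them verbatim into a UTF-8 document inherits the problem.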

When emitting XML Subversion just assumes that the bytes it's got are
correctly encoded and drops them into the output.
(I encountered a failure to properly escape & in an earlier release,
but that's neither here nor there and probably long since fixed.)
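
If you want to see where the bad bytes sit in captured `svn log --xml`
output, something like the following Python sketch might help. The function
name find_invalid_utf8 and the sample bytes are assumptions for
illustration; it only reports byte offsets and does not parse the XML.

```python
from typing import List, Tuple


def find_invalid_utf8(data: bytes) -> List[Tuple[int, bytes]]:
    """Return (offset, offending_bytes) for each invalid UTF-8 sequence."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # Decode the remainder; success means nothing else is broken.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice we decoded.
            start = pos + err.start
            end = pos + err.end
            problems.append((start, data[start:end]))
            pos = end  # resume scanning after the bad sequence
    return problems


# Hypothetical fragment: a Latin-1 e-acute (0xE9) dropped into UTF-8 output.
sample = b"<msg>caf\xe9 fixed</msg>"
for offset, bad in find_invalid_utf8(sample):
    print(offset, bad)
```

Re-decoding the remainder after each error is quadratic in the worst case,
which is acceptable for a one-off check but not for very large logs.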

I believe Subversion >= 1.6 is stricter on this count, rejecting log
messages which do not use the proper encoding (UTF-8) and eol-style
(LF).
But this doesn't help you if your server is older than 1.6, and it
won't help for old commits made with previous releases of Subversion.
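
For those old commits, one possible workaround (a sketch, not something
Subversion provides) is to clean the captured log output before handing it
to a post-processing tool. The function name sanitize_utf8 is made up, and
the approach is lossy: each invalid sequence is replaced with U+FFFD rather
than recovered.

```python
def sanitize_utf8(data: bytes) -> bytes:
    """Replace invalid UTF-8 sequences with U+FFFD and return valid UTF-8."""
    text = data.decode("utf-8", errors="replace")
    return text.encode("utf-8")


# Hypothetical input: log text containing a stray Latin-1 byte (0xE9).
raw = b"caf\xe9 ok"
clean = sanitize_utf8(raw)
print(clean.decode("utf-8"))
```

If you know which legacy encoding the old messages used, decoding the bad
spans with that encoding instead would preserve the original characters.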

Anyway, that's my understanding. Corrections welcome.

// Ben

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=2415185

Received on 2009-11-06 19:39:39 CET

This is an archived mail posted to the Subversion Users mailing list.