On 07/19/2011 03:49 AM, Ivan Zhakov wrote:
> On Mon, Jul 18, 2011 at 23:50, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
>> Ivan notes on IRC that this fixes issues with non-ASCII lock tokens.
>> (At least, he reports errors in 'svn ls -v' over http://; but for me
>> 'ls' and 'info' work fine over svn://.)
>>
>> Are lock tokens even permitted to be non-ASCII?
>> (both backends generate ASCII-only lock tokens (in the same manner))
>>
>> If they are, are they required to be in UTF-8? Or can they be, say,
>> arbitrary octet strings?
>>
> I didn't find an explicit specification for lock token encoding, but it
> is handled everywhere as a normal UTF-8 string.
>
> That's why I decided to implement a minimal patch that just decodes the
> lock token returned from the hook output to a UTF-8 string.
>
> But I'm fine to require lock token to be ASCII only string.
Having this as an open question is unacceptable, so we need to make a
decision about it. To my knowledge, ASCII is the only thing used for lock
tokens in the code we ship. There's no telling what encoding could be in
use if folks are taking advantage of the pre-lock-hook token dictation
feature -- Apache will give them a "C" locale environment, but surely anyone
who has an advanced enough configuration to require taking advantage of
this feature is also setting the locale from inside those hooks.

FWIW, I think UTF-8 is a fine choice, especially if we've been internally
handling this string as UTF-8 already. But let's document the fact
somewhere, please (preferably in the pre-lock-hook template comments).
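
To make the two candidate rules concrete, here is a rough sketch of what
checking a hook-supplied token could look like. It is illustrative only and
is not how the Subversion code does it: the function name
decode_lock_token and its ascii_only flag are hypothetical, and it assumes
the token arrives as the raw bytes the hook wrote to its output.

```python
def decode_lock_token(raw: bytes, ascii_only: bool = False) -> str:
    """Hypothetical helper: turn raw hook output into a lock token string.

    Assumes the hook wrote the token as raw bytes. With ascii_only=False
    the token must be valid UTF-8; with ascii_only=True it must also be
    pure ASCII.
    """
    # bytes.decode raises UnicodeDecodeError if the bytes are not UTF-8,
    # e.g. when the hook ran under a Latin-1 locale.
    token = raw.decode("utf-8").strip()
    if not token:
        raise ValueError("empty lock token")
    if ascii_only and not token.isascii():
        raise ValueError("lock token contains non-ASCII characters")
    return token
```

Under the UTF-8 rule a token such as "tök" would be accepted; under the
ASCII-only rule it would be rejected, while tokens of the form we generate
ourselves pass either way.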
--
C. Michael Pilato <cmpilato_at_collab.net>
CollabNet <> www.collab.net <> Distributed Development On Demand
Received on 2011-07-19 15:45:47 CEST