On Tue, Jul 19, 2011 at 21:46, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
> C. Michael Pilato wrote on Tue, Jul 19, 2011 at 09:45:09 -0400:
>> On 07/19/2011 03:49 AM, Ivan Zhakov wrote:
>> > On Mon, Jul 18, 2011 at 23:50, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
>> >> Ivan notes on IRC that this fixes issues with non-ASCII lock tokens.
>> >> (At least, he reports errors in 'svn ls -v' over http://; but for me
>> >> 'ls' and 'info' work fine over svn://.)
>> >>
>> >> Are lock tokens even permitted to be non-ASCII?
>> >> (both backends generate ASCII-only lock tokens (in the same manner))
>> >>
>> >> If they are, are they required to be in UTF-8? Or can they be, say,
>> >> arbitrary octet strings?
>> >>
>> > I didn't find explicit specification for lock token encoding, but it
>> > handled everywhere as normal utf-8 string.
>> >
>> > That's why I decided to implement minimum patch just to decide lock
>> > token returned from hook output to utf-8 string.
>> >
>> > But I'm fine to require lock token to be ASCII only string.
>>
>> Having this as an open question is unacceptable, so we need to make a
>> decision about it. To my knowledge, ASCII is the only thing used for lock
>> tokens in the code we ship. There's no telling what encoding could be in
>
> For the record, our code generates tokens in the form
> "opaquelocktoken:$UUID".
>
>> use if folks are taking advantage of the pre-lock-hook token dictation
>> feature -- Apache will give them a "C" locale environment, but sure anyone
>> who has advanced enough configuration to require taking advantage of this
>> feature is also setting the locale from inside those hooks.
>>
>> FWIW, I think UTF-8 is a fine choice, especially if we've been internally
>> handling this string as UTF-8 already. But let's document the fact
>> somewhere, please (preferably in the pre-lock-hook template comments).
>>
>
> Looking again, svn_fs.h:2059 documents that lock tokens are URI's.
> (and, consequently, are printable ASCII)
>
> We also need them to be XML-safe for their use in ra_dav LOCK responses.
>
> So, I'm thinking that:
>
> * We should validate the supplied lock tokens, not for being well-formed
> UTF-8 as in Ivan's change, but for being ASCII-only.
>
> * We should move the validation to the FS layer.
>
> Thoughts?
>
I'm fine with this approach, but I think we should leave explicit
validation of lock tokens in pre-lock hook handling for better
diagnostic.
--
Ivan Zhakov
Received on 2011-07-19 19:55:16 CEST