Re: request to clarify and improve Subversion property name specification

From: Garret Wilson <garret_at_globalmentor.com>
Date: Mon, 23 Jan 2012 10:23:46 -0800

On 1/23/2012 9:55 AM, Philip Martin wrote:
> So lots of high range Unicode points are allowed.

Yes.

> How will we validate that?

The same way the code (as given to me by others) currently validates the
names---it iterates the characters provided, and if one of them doesn't
meet the definition, it returns false. Basically what goes inside
if(...){} would be changed.
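
To make that concrete, here is a rough sketch in Python (for brevity, not
Subversion's C). The names is_valid_name, ASCII_START and ASCII_REST are
made up for illustration, and the ASCII rule is only my paraphrase of the
current check, not a copy of the real code:

```python
import string

# Assumed paraphrase of the current ASCII-only rule; the real check may differ.
ASCII_START = set(string.ascii_letters + ":_")
ASCII_REST = ASCII_START | set(string.digits + ".")


def is_valid_name(name, start_ok, rest_ok):
    """Walk the characters and fail on the first one that breaks the rule."""
    if not name:
        return False
    if not start_ok(name[0]):
        return False
    for ch in name[1:]:
        if not rest_ok(ch):  # this test is the only part a relaxed rule changes
            return False
    return True


def is_ascii_name(name):
    """Validate against the assumed ASCII-only character sets above."""
    return is_valid_name(name, ASCII_START.__contains__, ASCII_REST.__contains__)
```

The loop stays the same whichever definition is chosen; only the two
predicates passed in would differ.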

> Do we have any suitable code in Subversion?

In my original email I provided the name of the method that is currently
providing the arbitrary restriction. The if(...){} block would change
to relax its current restrictions. I don't see what is difficult about
it, although perhaps I'm being naive. However, noting that SVN+DAV works
just fine with this relaxed restriction, and that JavaHL works just fine
/reading/ values with relaxed restriction, my best guess is that all you
have to do is change a few lines in that method and things will all work
nicely.

> Do we write an XML validator?

Nowhere was there ever a hint of XML validation. In fact, I wasn't even
proposing verification of XML well-formedness. There is no XML markup
involved. I'm simply proposing we use the same definition that XML does
of a name.

The definition of a name is conceptually a set of characters. Think of
it as a regular expression. Currently Subversion uses something like
/[a-zA-Z:_][a-zA-Z0-9\.:_]*/. I'm simply proposing we relax this using
XML's "regular expression" instead of the one we use now. There is no
XML involved. We are simply re-using a definition from their specification.
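
As an illustration only, XML's "regular expression" can be written down
directly. This sketch is in Python and assumes the Name production from
XML 1.0 (Fifth Edition); the identifiers XML_NAME and is_xml_name are
hypothetical, and nothing here is taken from the Subversion source:

```python
import re

# NameStartChar ranges from the XML 1.0 (Fifth Edition) Name production.
_NAME_START = (
    ":A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
# NameChar adds "-", ".", digits, U+00B7 and two combining/connector ranges.
_NAME_CHAR = _NAME_START + r"\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"

XML_NAME = re.compile("[" + _NAME_START + "][" + _NAME_CHAR + "]*")


def is_xml_name(name):
    """Return True if the whole string matches XML's definition of a Name."""
    return XML_NAME.fullmatch(name) is not None
```

No XML parser or markup is involved; it is a character-set check and
nothing more.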

Currently there are at least two "official" Subversion clients. One is
using XML's definition of a property name. Another is ("for now" the
code says) using another definition. Whatever we do, I would propose
they both use the same definition. I would vote for XML's definition.

> Use some other existing validator? Do we have to extract UTF8
> multibyte characters first?

We would have to interpret the incoming bytes as UTF-8 and parse
them accordingly before validating the characters, yes. In fact, this
should be happening anyway. Remember that clients such as Subclipse and
TortoiseSVN are already /reading/ these property name values as UTF-8,
so the code that validates them should be interpreting them as UTF-8 as
well.
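
A minimal sketch of that decode-then-validate order, again in Python
rather than C. The helpers decode_name and is_valid_name_bytes are
hypothetical, and the character-level validator is passed in so that this
does not assume either definition of a name:

```python
def decode_name(raw):
    """Decode raw bytes strictly as UTF-8; return None if they are malformed."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


def is_valid_name_bytes(raw, is_valid_name):
    """Reject undecodable input first, then apply the character-level check."""
    name = decode_name(raw)
    return name is not None and is_valid_name(name)
```

Decoding strictly means malformed byte sequences are rejected before any
character rule is consulted, so the validator only ever sees real code
points.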

> I thought you were proposing to write the code?

I'm fine with that as well. Looks like I would have to add a few lines
to decode UTF-8 (surely such code already exists in the Subversion
codebase somewhere) and change a few if(...){} statements. I should be
able to handle that. I would imagine it will take more effort on my part
to get permission to change the code than to actually write it.

>> Basically I'm proposing that we set
>> publicly what constitutes a valid Subversion name, and then make
>> whatever code changes are needed to conform. A test suite comes to
>> mind as a tool to assist in this, but that's another subject
>> altogether.
> Subversion has a testsuite.

Either 1) the test suite does not cover property name validity, or 2)
the DAV+SVN client isn't run through the test suite, because the DAV+SVN
client doesn't comply with the property name validation present behind JavaHL.

Garret
Received on 2012-01-23 19:24:24 CET

This is an archived mail posted to the Subversion Dev mailing list.