Oh, and in case I wasn't clear, I'm referring to a Subversion 
repository, not a local copy. And I'm referring to the top-most API. If 
some of the lower layers are more restrictive than the top-most API, 
then they should use some encoding scheme (what, I don't care) to shield 
this platform-specific restriction from the top-level API---which is 
what I thought Daniel was saying at first.
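
To make the encode()/decode() black box from my earlier message (quoted 
below) concrete, here is a minimal sketch of one possible scheme. The class 
name PathnameCodec and the isSafe() rule are purely hypothetical: I don't 
know the real set of valid Subversion pathname characters (that is exactly 
what I'm asking), so the sketch assumes a deliberately conservative set of 
ASCII letters, digits, '-', '_', '.', and '/', and escapes every other 
UTF-8 byte as %hh.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class PathnameCodec {

    // ASSUMPTION: stand-in for the real "valid pathname" rule, which this
    // thread has not yet established. Only these ASCII bytes pass through.
    public static boolean isSafe(int b) {
        return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
                || (b >= '0' && b <= '9')
                || b == '-' || b == '_' || b == '.' || b == '/';
    }

    // Requirement 1: the result contains only "safe" bytes plus %hh escapes.
    // Note that '%' itself is not safe, so it is escaped as %25.
    public static String encode(String input) {
        StringBuilder out = new StringBuilder();
        for (byte raw : input.getBytes(StandardCharsets.UTF_8)) {
            int b = raw & 0xFF;
            if (isSafe(b)) {
                out.append((char) b);
            } else {
                out.append('%').append(String.format("%02X", b));
            }
        }
        return out.toString();
    }

    // Requirement 2: decode(encode(s)) gives back s, assuming s is
    // well-formed Unicode (no unpaired surrogates).
    public static String decode(String encoded) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '%') {
                // The two characters after '%' are the hex value of one byte.
                bytes.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                bytes.write(c);
            }
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "dir/foo bar:*\n.txt";
        String encoded = encode(original);
        System.out.println(encoded);
        System.out.println(decode(encoded).equals(original));
    }
}
```

With this assumed rule, "foo bar" becomes "foo%20bar". The point is that 
only isSafe() would need to change once the actual definition of a valid 
pathname is known; the rest of the scheme is indifferent to it.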
Garret
On 1/20/2012 7:28 PM, Garret Wilson wrote:
> On 1/20/2012 7:00 PM, Daniel Shahaf wrote:
>> Garret Wilson wrote on Fri, Jan 20, 2012 at 18:18:24 -0800:
>>> On 1/20/2012 6:14 PM, Daniel Shahaf wrote:
>>>> You don't care what FS backend the server runs. All you care is
>>>> that the endpoint of svn_ra_open4() implements the Subversion RA
>>>> API properly. Normal Subversion servers use svn_fs.h which in turn
>>>> presents the same API _regardless of which backend is used_. I'll
>>>> spell it out: the notion of 'valid pathname in a Subversion
>>>> filesystem' does not depend on the FS backend in use.
>>> All that is good news. So I guess the important question is: what
>>> spells out "the notion of 'valid pathname on a Subversion
>>> filesystem'"? Is it "any valid Unicode code point?" What I'm getting
>> See my previous reply.
>
> Right. So your previous reply said that a "valid pathname" is the same 
> on all platforms, and that the underlying implementation will take 
> care of the details. I'm asking what are the rules for a "valid 
> pathname". I'm glad that these rules are the same across all 
> platforms, but I don't know what the rules are. In other words, what 
> goes in the following function?
>
> boolean isValidSubversionPathname(String pathname);
>
>
>>
>>> at is that I need to know which characters, if any, I need to encode
>>> before passing them to Subversion. If Subversion supports any
>>> Unicode character, I can just pass the path decoded and sleep
>>> soundly at night. If not, I need to know which ones to decode and
>>> which ones to pass through.
>> Err, that depends on what API layer you're working with.  (For example:
>> svn_fs.h is perfectly happy with :,*,\n as part of the basename, but
>> libsvn_wc on windows, and the mergeinfo logic, aren't.)
>
> Oh, that's bad news. In your previous reply you said, "the notion of 
> 'valid pathname in a Subversion
> filesystem' does not depend on the FS backend in use." Now you seem to 
> say "whether some pathname is valid or not depends on whether you're 
> on Windows or some other platform." (Even worse, you seem to be 
> saying that the notion of "valid pathname" isn't even consistent 
> across the API.)
>
>> And 'what to encode/decode' is a rather vague question.  I'm not sure if
>> it means "Does `svn info uri:///foo bar` == `svn info uri:///foo%20bar`?"
>> or something else.  Can you be more concrete?
>
> It doesn't matter. It's some black box that works like this:
>
> String encode(String input);
> String decode(String output);
>
> I can come up with a thousand ways to encode/decode. I can use %hh. I 
> can use ^0xhh. The only two requirements are that 1) encode() provides 
> me with a string guaranteed to be a valid pathname, and 2) decode() 
> will take the encoded string and give me back the decoded string I 
> started with.
>
> But to meet requirement #1, I have to know which characters are 
> considered valid and which aren't. That's what I don't know, and 
> that's what I'm asking:
>
>  1. Does the API guarantee that a "valid pathname" (whatever that is)
>     is the same across all platforms? I thought you said yes, but now
>     it seems you're saying no. (If you say "no", then there's no point
>     in answering question 2, because we're stuck---I can write code
>     that may work with one repository on one platform, but suddenly
>     fail when I move the same data to another platform.)
>  2. What is the definition of "valid pathname"? Is it any Unicode
>     character? Is it only XML name characters? Is it any Unicode
>     character except control characters and NULL (\u0000)?
>
> Sorry if I'm not clear. It's a very simple question, and I hope I'm 
> not making it more complicated than it is.
>
> Think about it this way: pretend you have an XML document with the 
> element <a-b>. You walk the DOM of that document on Windows, and it 
> works fine. But when you try to process the DOM on a Mac, it breaks, with your 
> XML processor saying, "sorry, an XML name cannot have a '-' 
> character". That will never happen. Why? Because (these are analogous 
> questions to the ones above concerning Subversion):
>
>  1. The XML specification guarantees that all XML processors agree on
>     what an XML name is.
>  2. Specifically, an XML name is composed of a NameStartChar followed
>     by any NameChar, as defined here:
>     http://www.w3.org/TR/REC-xml/#NT-Name
>
> Does that make sense? Can we answer those same two questions 
> concerning Subversion pathnames?
>
> Garret
>
Received on 2012-01-21 04:35:53 CET