[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svn_client_args_to_target_array and peg revisions -> no good

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Tue, 02 Jun 2009 13:38:22 +0100

Stefan Sperling wrote:
> On Sat, May 30, 2009 at 11:51:09AM +0100, Stefan Sperling wrote:
> > I'll need more time diving into the matter to pin-point the exact
> > cause of the problem.
>
> Still talking to myself here, but just following up with latest
> thoughts in case anyone is listening.

Hi Stefan. I'm listening!

Your analysis (below) looks spot on, and I agree that replacing the
string data format with a (path, rev) structure would be good. See the
thread ending with this message
<http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=40874&orderBy=createDate&orderType=desc> in which Troy was starting to do just that. Troy, ping?

However, I think first it would be sensible to fix the problem with "svn
add" by simply making it parse peg revisions*, and preferably check that
no peg was specified. That should be sufficient to fix the reported bug.

The API change could be done as a separate internal improvement.

(* I believe all commands should accept peg-rev syntax even where peg
revs don't make sense, like the quote you found in the book suggests.
Otherwise the command-line syntax would have ugly consistencies. For
example, a sequence of commands specifying the same local target such as
"svn add $1; svn commit $1" would not work when the target contains
"@".)

- Julian

> Q: Why does issue #3416 happen?
>
> A: Well, let's look at one case at a time.
>
> First case:
>
> $ svn add 'dir/@File.txt'
> svn: warning: 'dir_at_File.txt' not found
>
> This happens because svn_client_args_to_target_array() splits peg
> revisions off all path arguments passed on the command line.
> 'dir/@File.txt' gets split into
> target_path = 'dir/'
> peg_rev = '@File.text'
> The function then canonicalizes the target path, so we get:
> target_path = 'dir'
> peg_rev = '@File.text'
> Then the two get concatenated again, resulting in 'dir_at_File.txt',
> the string which eventually ends up going to 'svn add' unmodified.
> 'svn add' does not try to parse peg revisions in its arguments, so
> it tries to add a file called 'dir_at_File.txt' which isn't what the
> user wanted.
>
> Second case (the user tries to escape the first @ by appending another):
>
> $ svn add 'dir/@File.txt@'
> svn: warning: 'dir/@File.txt@' not found
> In this case, svn_client_args_to_target_array() splits the path up as:
> target_path = 'dir/@File.text'
> peg_rev = '@'
> Then the two get concatenated again, resulting in 'dir/@File.txt@',
> the string which eventually ends up going to 'svn add' unmodified.
> But 'svn add' does not try to parse peg revisions in its arguments,
> so even the escaping @ is just being ignored, which isn't what the
> user wanted.
>
> Q: OK well, so why does svn_client_args_to_target_array() concatenate
> these strings again after having split them up? It looks like
> doing so messes things up!
>
> A: Part of svn_client_args_to_target_array()'s job is to canonicalize
> input paths. It concatenates the target and the peg_rev again because
> when the paths end up in the argument list of the subcommand function,
> many subcommands will try to parse peg revisions again in a very similar
> way to what svn_client_args_to_target_array() already did.
> So svn_client_args_to_target_array() was written to accommodate for
> subcommands still doing their own peg-rev parsing. But not all subcommands
> actually do their own peg-rev parsing, e.g. 'svn add' does not.
>
> This is why we are seeing problems.
> !! There seems to be no single point of responsibility for peg-rev
> !! parsing in our code. Instead, peg-rev parsing is done in a rather
> !! ad-hoc and uncoordinated fashion. This is the root cause of #3416.
>
> Q: Right. What function(s) should do the parsing then?
>
> A: It seems like there is no consensus on this currently. One approach
> would be to parse peg-revs only for subcommands that can interpret them,
> and just interpret the '@' character as a literal '@' for all other
> subcommands, saving the user from having to escape '@' characters in
> arguments for those subcommands. But svn_client_args_to_target_array()
> currently breaks this approach, because it always runs before every
> subcommand and parses peg revisions in every argument path.
>
> The other approach would be to always do peg-rev parsing in all subcommands,
> and require users to always escape @ characters in all their filenames by
> appending an '@' to the filename, no matter which subcommand is being
> called. Some subcommands such as add currently prevent such escaping
> from happening, because they don't do any peg-rev parsing at all and
> just take all '@' characters literally.
>
> The svnbook suggests that the second approach is what should be
> happening:
> http://svnbook.red-bean.com/nightly/en/svn.advanced.pegrevs.html
> The perceptive reader is probably wondering at this point whether the
> peg revision syntax causes problems for working copy paths or URLs that
> actually have at signs in them. After all, how does svn know whether
> news_at_11 is the name of a directory in my tree or just a syntax for
> “revision 11 of news”? Thankfully, while
> svn will always assume the latter, <--- this statement is currently wrong!
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> there is a trivial workaround. You need only append an at sign
> to the end of the path, such as news_at_11@. svn cares only about the last
> at sign in the argument, and it is not considered illegal to omit a
> literal peg revision specifier after that at sign. This workaround even
> applies to paths that end in an at sign—you would use filename@@ to talk
> about a file named filename@.
>
> Q: Wouldn't it make more sense to do peg-rev parsing just once, and
> pass the parsed information to subcommands in a structured form that
> does not require subsequent parsing to get at peg-rev information?
> That way we'd have a clear point of responsibility for peg-rev parsing.
>
> A: Yes! I'm thinking that such an approach would be the most appropriate
> fix for issue #3416. Instead of passing targets as pure C strings
> with embedded peg-rev information, we'd pass a struct such as:
>
> struct { const char *path, svn_revision_t peg_rev} svn_target_t;
>
> svn_client_args_to_target_array() would create an array of such
> target structs. Subcommands would stop doing their own parsing.
> Those interested in peg revisions would refer to the 'peg_rev' field,
> while the other subcommands would just use the 'path' field.
>
> And an empty 'path' field would be an error. Conceptionally, this already
> is an error. But because of the way svn_client_args_to_target_array()
> splits and concatenates strings it currently is not an error.
> Which is why this works:
> $ svn add '@File.txt'
> A @File.txt
> svn_client_args_to_target_array() splits this up as:
> target = ''
> peg_rev = '@File.txt'
> Then the two get concatenated again, resulting in '@File.txt',
> the string which eventually ends up going to 'svn add' unmodified.
>
> Does anyone see anything wrong with the above observations and the
> suggested approach for a fix?
>
> Thanks,
> Stefan

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2358763
Received on 2009-06-02 17:02:31 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.