
Re: Filenames with trailing newlines wreak havoc

From: Ben Reser <ben_at_reser.org>
Date: Wed, 27 Mar 2013 10:01:01 -0700

On Wed, Mar 27, 2013 at 9:12 AM, Julian Foad <julianfoad_at_btopenworld.com> wrote:
> So there is a compromise between the theoretical
> independence which allows FS to have a different definition of valid
> filenames, and the practical issue that if we actually do make it
> different from the repos layer's definition then it's more work to
> maintain.

I think we need to recognize that using libsvn_fs alone is entirely
theoretical.

* How many people are really going to take on a dependency on the
entire Subversion setup just to use libsvn_fs? We don't ship
it separately. It's designed around our needs.
* If you are inclined to use libsvn_fs then you're inclined to accept the
same limitations that our client library has, because you probably
want to interoperate with normal SVN clients. Great examples of
this are svk and git-svn, both of which talk our wire protocols (svk
actually using our repos and ra layers) without using our client or wc
library or formats at all.

If people really were running around putting control characters in
their repository file names then I don't think it would have taken
this long for this issue to come up. FSFS has been the primary
filesystem for how long now? If there were real, active use cases that
depended on the FS being more liberal than our other layers, then I
firmly believe we'd have run into this before.

> Something like the proposal above, to reject LF only at the FS (or FSFS)
> layer, and all control characters in libsvn_repos, sounds good to me.
> We should write unit tests to ensure that the FS layer works
> properly with all other control characters.
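
For concreteness, the two-tier check in the quoted proposal might be sketched roughly as follows. This is only an illustration: the function names are hypothetical and are not Subversion APIs, and "control character" is assumed here to mean the ASCII range 0x00-0x1F plus 0x7F.

```python
# Hypothetical sketch of the quoted proposal; none of these names
# exist in Subversion.

def is_control_char(ch: str) -> bool:
    """True for ASCII control characters (assumed: 0x00-0x1F and 0x7F)."""
    code = ord(ch)
    return code < 0x20 or code == 0x7F


def fs_layer_accepts(name: str) -> bool:
    """FS-level rule from the proposal: reject only LF."""
    return "\n" not in name


def repos_layer_accepts(name: str) -> bool:
    """Repos-level rule from the proposal: reject every control character."""
    return not any(is_control_char(ch) for ch in name)
```

Under this sketch a name such as "a\tb" passes the FS check but fails the repos check, which is exactly the gap between the two layers that the proposal would leave in place.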

I really don't like this proposal. I think a major reason why we're
in this situation is that we've been inconsistent. Why does fsfs not
handle newlines in filenames? We could have just as easily used some
other delimiter, or had an encoding to allow entirely arbitrary node
names. The answer in my opinion is that we thought control characters
were not allowed because the client library rejected them. If we were
building a new file system right now without this having come up we
might even make the same mistake. There is a huge benefit to
consistency. I wish the code reuse argument were realistic, but it's
not. So I think the bigger benefit here is to be consistent.
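
The point about delimiters and encodings can be illustrated with a toy line-oriented listing. This is a minimal sketch and not the actual FSFS on-disk format: the function names are hypothetical, and percent-encoding via Python's urllib.parse is just one assumed choice of encoding.

```python
from urllib.parse import quote, unquote

# Toy format for illustration only; this is NOT how FSFS stores entries.

def write_listing_raw(names):
    """One name per line, with newline as the record delimiter."""
    return "".join(name + "\n" for name in names)


def read_listing_raw(data):
    """Split on the delimiter; the final empty piece is the terminator."""
    return data.split("\n")[:-1]


def write_listing_encoded(names):
    """Percent-encode each name so the delimiter cannot occur inside it."""
    return "".join(quote(name, safe="") + "\n" for name in names)


def read_listing_encoded(data):
    """Split on the delimiter, then decode each name."""
    return [unquote(line) for line in data.split("\n")[:-1]]


names = ["foo\n", "bar"]
raw_round_trip = read_listing_raw(write_listing_raw(names))
encoded_round_trip = read_listing_encoded(write_listing_encoded(names))
```

With the raw format, the name with a trailing newline comes back as two entries ("foo" and an empty name), so the listing no longer matches what was written. With the encoded format the same names round-trip unchanged, at the cost of an encode and decode step on every read and write.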
Received on 2013-03-27 18:01:36 CET

This is an archived mail posted to the Subversion Dev mailing list.
