
Filename encoding in working copy (was: Check-out fails with LANG=C)

From: Vincent Lefevre <vincent-svn_at_vinc17.net>
Date: Fri, 19 Jul 2013 17:27:11 +0200

[Cc to the dev@ list]

On 2013-07-19 16:50:49 +0200, Stefan Sperling wrote:
> On Fri, Jul 19, 2013 at 04:32:50PM +0200, Vincent Lefevre wrote:
> > [...]
> >
> > Actually I think that the encoding needs to be stored somewhere in the
> > working copy. Otherwise even if the user never changes the encoding,
> > problems may occur, and this is also true with the current behavior.
> > Indeed it was said in the past that USB keys were supported. So, move
> > a USB key to a different computer, where the encoding specified by the
> > environment is different... and see what happens if you try to do an
> > "svn update"...
> Simply storing the encoding doesn't really solve anything. Sure, it
> remembers the LC_CTYPE value at the time the working copy was created.
> But then... what?

At least it would work better. And in some cases, one wouldn't notice
any problem. For instance, a repository (and therefore the corresponding
working copies too) may contain mostly filenames with only US-ASCII
characters, and when moving a USB key to another computer, one may be
interested only in a part of the working copy with such filenames.
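
To illustrate the point about US-ASCII filenames, here is a minimal
sketch in Python (not Subversion code; the function name is made up for
this example). It relies on the fact that a pure US-ASCII name has the
same byte sequence in every ASCII-compatible encoding, such as UTF-8 or
ISO-8859-1, so only the remaining names depend on the locale:

```python
def split_by_ascii(names):
    """Split raw on-disk filenames (bytes) into two lists.

    safe:  pure US-ASCII names, identical in any ASCII-compatible encoding
    risky: names with bytes >= 0x80, whose meaning depends on the encoding
    """
    safe, risky = [], []
    for raw in names:
        # bytes.isascii() requires Python 3.7 or later.
        (safe if raw.isascii() else risky).append(raw)
    return safe, risky
```

A working copy whose "risky" list is empty for the subtree of interest
would behave the same under both locales.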

> We also need to specify some new behaviour that increases user
> convenience for such a new feature to have any value.
> For this, we need answers to questions like:
> How can the client detect whether the stored encoding name matches
> the on-disk encoding? What does it do when they differ? How can users
> re-encode filenames in the working copy when on-disk encoding has changed?

I don't understand what you mean here. If the chosen encoding for the
filenames is stored in the working copy, then the on-disk encoding has
been chosen by Subversion (at checkout time). So, initially it matches the
stored encoding. This normally doesn't change... unless the user has
explicitly chosen to re-encode the filenames globally with some tool.
In such a case, the user also needs to modify the stored encoding, and
Subversion can provide a command for that. Subversion can also provide
a command to re-encode the filenames of a working copy and update the
stored encoding.
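
A rough sketch of what such a re-encoding command could do, in Python
and purely as an illustration: no such Subversion command exists today,
and the names reencode_name and plan_renames are hypothetical. The idea
is to decode each on-disk name with the stored encoding and encode it
again with the target encoding. Names that cannot be represented in the
target encoding raise an error instead of being silently mangled:

```python
import os


def reencode_name(raw: bytes, stored: str, target: str) -> bytes:
    """Convert one raw filename from the stored encoding to the target one.

    Raises UnicodeDecodeError or UnicodeEncodeError when the name is not
    valid in the stored encoding or not representable in the target.
    """
    return raw.decode(stored).encode(target)


def plan_renames(root: bytes, stored: str, target: str):
    """Yield (old_path, new_path) pairs for every name that would change.

    The tree is walked bottom-up, so entries inside a directory are
    listed before the directory itself and the pairs can be applied in
    order. Passing a bytes root makes os.walk return raw bytes names.
    """
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for raw in dirnames + filenames:
            new = reencode_name(raw, stored, target)
            if new != raw:
                yield os.path.join(dirpath, raw), os.path.join(dirpath, new)
```

A real command would then rename each pair and, only after all renames
succeed, update the encoding stored in the working copy.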

The main problem is when a working copy is moved to another machine
on which the chosen encoding is not supported. But
I don't think Subversion can do anything about it. The user needs to
make sure that the chosen encoding is supported on the new machine.
This should be the case for ASCII (+ escaping mechanism) and UTF-8.
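
For concreteness, a small Python sketch of both ideas. The escaping
scheme shown here (percent-escaping the UTF-8 bytes) is only an
assumption made for the example, since no particular mechanism has been
specified, and encoding_supported checks Python's own codec registry,
not what Subversion or the operating system supports:

```python
import codecs


def encoding_supported(name: str) -> bool:
    """Return True if this Python installation knows the named encoding."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


def escape_to_ascii(name: str) -> str:
    """Hypothetical escaping: '%' and every non-ASCII UTF-8 byte become %XX."""
    out = []
    for byte in name.encode("utf-8"):
        if byte < 0x80 and byte != 0x25:  # 0x25 is '%', escaped to stay reversible
            out.append(chr(byte))
        else:
            out.append("%%%02X" % byte)
    return "".join(out)


def unescape_from_ascii(escaped: str) -> str:
    """Inverse of escape_to_ascii."""
    raw = bytearray()
    i = 0
    while i < len(escaped):
        if escaped[i] == "%":
            raw.append(int(escaped[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(escaped[i]))
            i += 1
    return raw.decode("utf-8")
```

With such a scheme, every name on disk is pure US-ASCII and can be
mapped back to the original name on any machine.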

> I'm very much interested in enhancements in this area, but at this
> point we don't need to rehash all the problems there are. We need
> to design solutions. This discussion needs a change of direction towards
> being more constructive, or it will die with no results. The discussion
> is also increasingly off-topic for the users@ list.
> In other words, I'm happy to continue this discussion on the dev@ list
> and review your proposed design specs and patches there.

Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Received on 2013-07-19 17:27:44 CEST

This is an archived mail posted to the Subversion Users mailing list.