RE: Unversioned files with invalid UTF-8 sequence in name confuse svn
From: Markus Schaber <m.schaber_at_codesys.com>
Date: Tue, 1 Mar 2016 14:07:19 +0000
Hi, Brane and Vincent,
From: Branko Čibej [mailto:brane_at_apache.org]
I guess we would need some "change locale" operation, which would at least update the saved locale in the .svn directory.
(Updating the actual on-disk filenames could be left to the tools the user uses to also update his other filenames...)
> > Currently you can't avoid the problem: if the user has used UTF-8 then
Python actually adopted a workaround to this problem called "surrogate escaping".
This mechanism is applied to filenames and similar "byte strings" when communicating with the outside world. Its only purpose is to carry the contents of the 8-bit string from one OS interface to another, with little interpretation or processing in between.
Basically, each invalid byte (one that cannot be decoded to the internal Unicode representation) is mapped to a lone surrogate code point, which is converted back to the original byte on the output side.
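As a minimal sketch of the round trip described above, here is how Python's "surrogateescape" error handler behaves on a filename that is not valid UTF-8. The sample byte string is made up for illustration.

```python
# A byte string that is not valid UTF-8 (0xE9 is Latin-1 "e acute").
raw = b"caf\xe9.txt"

# Decoding with errors="surrogateescape" maps each undecodable byte
# 0xNN to the lone surrogate U+DC00 + 0xNN instead of raising.
name = raw.decode("utf-8", errors="surrogateescape")
assert name == "caf\udce9.txt"

# Encoding with the same handler restores the original bytes exactly.
roundtrip = name.encode("utf-8", errors="surrogateescape")
assert roundtrip == raw

# A strict encode refuses the lone surrogate, so the escaped name
# cannot leak out as (invalid) UTF-8 by accident.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

os.fsdecode() and os.fsencode() apply the same handler to filenames on POSIX systems.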
A solution like this could help SVN deal with miscoded filenames, and would allow e.g. an "svn rm" or "svn mv" on them.
When adopting such a solution, it should be strictly restricted to local filenames (the RA layers should refuse them), and I guess we could get away with not even allowing them to enter the local working copy database.
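A rough sketch of what "refusing" such names could look like, written in Python for brevity. The function name is_transmittable is hypothetical and is not part of any Subversion API. It only assumes that escaped names are held as strings containing lone surrogates.

```python
def is_transmittable(name: str) -> bool:
    """Return True if the name is clean Unicode that may leave the local machine.

    Hypothetical helper: a strict UTF-8 encode fails on any lone
    surrogate, which is exactly what an escaped invalid byte becomes.
    """
    try:
        name.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


good = "caf\u00e9.txt"                                    # proper Unicode
bad = b"caf\xe9.txt".decode("utf-8", "surrogateescape")   # escaped byte
```

With a check like this at the boundary, escaped names would stay usable for purely local operations while never reaching the repository.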
For screen output, we could translate them to escape sequences like \x1A, so "svn status" could work...
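To illustrate the screen-output idea, a small sketch that turns surrogate-escaped bytes back into visible \xNN escapes. The name display_name is made up for this example, and the exact escape format is an assumption.

```python
def display_name(name: str) -> str:
    """Render a surrogate-escaped filename so it is safe to print.

    Hypothetical helper: lone surrogates U+DC80..U+DCFF stand for the
    original bytes 0x80..0xFF and are shown as \\xNN escapes.
    """
    out = []
    for ch in name:
        cp = ord(ch)
        if 0xDC80 <= cp <= 0xDCFF:
            out.append("\\x%02X" % (cp - 0xDC00))
        else:
            out.append(ch)
    return "".join(out)


escaped = b"caf\xe9.txt".decode("utf-8", "surrogateescape")
shown = display_name(escaped)
```

Python's built-in "backslashreplace" handler would print the surrogate itself (\udce9) rather than the original byte, which is why the sketch does the mapping by hand.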
I'm not sure whether it's worth the work to support basically broken environments; on the other hand, the Python guys did go that way.
> You might as well say that Unix (Linux) is broken and should be fixed (with
All recent Linux installations I saw had UTF-8 as their encoding (independent of the language / country settings actually in use). And I don't see any valid reason to use anything else nowadays, except for keeping compatibility with existing installations...
This is an archived mail posted to the Subversion Dev mailing list.