"Cooke, Mark" <mark.cooke_at_siemens.com> writes:
> Quick Summary: subversion (both TortoiseSVN and the command-line
> client provided by TSVN) is changing certain characters whilst using
> Basic Authentication (over https, from Windows XP) to apache 2.2 (on
> Windows Server 2003). So far I have confirmed this for the UK
> keyboard `£` (SHIFT-3):
>
>> When using a browser, I get the following for <shift>-1
>> through <shift>-0 on my UK keyboard (bounded by '[]'):
>>
>> 2012-04-17 16:03:09.734000 : svntest [!"£$%^&*()]
>>
>> ...but when I use the svn command line client I log instead:
>>
>> 2012-04-17 16:01:52.124000 : svntest [!"œ$%^&*()]
>>
>> Note that the `£` is now different. I think that this explains
>> the `Password Mismatch` error?
>
> Philip Martin has already responded (thanks!) with:
>
>> Non-ascii passwords are a problem for HTTP because there is
>> no standard for encoding the password before constructing the
>> digest, nor is there a standard for the client to tell the
>> server which encoding it used. Because there is no standard
>> clients tend to do different things. Some clients will
>> convert the password to UTF-8, some clients will convert to
>> some other encoding, and some clients will leave it in whatever
>> encoding the user entered.
>
> ...which helps to explain the problem (except we are using `basic`
> plain text, not digest) but I cannot believe that we are the only
> subversion users with this problem, what about other users with
> non-latin character sets (Russia, Israel etc)?
You have exactly the same problem with basic auth: there is no standard
for encoding non-ASCII passwords. It's generally possible to adjust the
password storage on the server so that any given client works, but it
is not possible to get all clients to work.
Suppose I have a password consisting of a single '£' character. In
ISO-8859-1 that is the single byte 0xA3, in UTF-8 that is two bytes 0xC2
0xA3. If I combine that with a username pm2 the basic auth token is
given by
$ echo -n pm2:£ | base64
In ISO-8859-1 this gives 'cG0yOqM=' while in UTF-8 it gives 'cG0yOsKj'.
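
The two tokens can be reproduced without depending on the terminal's
encoding. A minimal Python sketch, where the helper name basic_token is
made up for illustration and the encoding is passed explicitly instead
of being taken from the environment:

```python
import base64

def basic_token(user: str, password: str, encoding: str) -> str:
    """Return the base64 part of a Basic auth header, encoding
    'user:password' with the given character encoding."""
    raw = f"{user}:{password}".encode(encoding)
    return base64.b64encode(raw).decode("ascii")

# "\u00a3" is the pound sign, written as an escape so the example does
# not depend on the encoding of this source file.
print(basic_token("pm2", "\u00a3", "iso-8859-1"))  # cG0yOqM=
print(basic_token("pm2", "\u00a3", "utf-8"))       # cG0yOsKj
```
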
When you store the password on the server in an htpasswd file you choose
to store either the literal password or a password hash. If you store
the password literally as the line
pm2:£
you have to choose which encoding to store it in. If you use the one-byte
form the 'cG0yOqM=' auth token will work, if you use the two-byte form
the 'cG0yOsKj' auth token will work. If you use some other form, such
as UTF-16, then neither of those tokens will work.
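
The literal-password case amounts to a byte-for-byte comparison. The
sketch below is only an illustration of that idea, not how Apache
implements the check; password_matches is a hypothetical helper, and
UTF-16 is taken here as little-endian without a byte order mark:

```python
import base64

def password_matches(token: str, stored_password: bytes) -> bool:
    """Decode a Basic auth token and compare its password bytes with
    the bytes stored on the server (illustrative only)."""
    _user, _sep, password = base64.b64decode(token).partition(b":")
    return password == stored_password

pound = "\u00a3"  # the pound sign
for encoding in ("iso-8859-1", "utf-8", "utf-16-le"):
    stored = pound.encode(encoding)
    print(encoding,
          password_matches("cG0yOqM=", stored),   # ISO-8859-1 token
          password_matches("cG0yOsKj", stored))   # UTF-8 token
```

Each stored form accepts at most one of the two tokens, and the UTF-16
form accepts neither.
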
It's more usual to store password hashes but the same problem occurs.
If you store the password hash, using a salt AA, it's typically
$ mkpassword £ AA
In ISO-8859-1 this leads to the line
pm2:AACiVWnPwZTeE
and the 'cG0yOqM=' token will work. In UTF-8 it leads to the line
pm2:AAzOZFufPfaOQ
and the 'cG0yOsKj' token will work.
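
The same effect shows up with any hash, because the hash is computed
over bytes rather than characters. The sketch below does not reproduce
the salted crypt entries above; it swaps in the unsalted {SHA} format
(base64 of the SHA-1 digest) that htpasswd files also accept, purely
because it can be computed with the Python standard library. The helper
name sha_entry is made up for illustration:

```python
import base64
import hashlib

def sha_entry(user: str, password: str, encoding: str) -> str:
    """Build an htpasswd-style '{SHA}' line from the password bytes
    produced by the given character encoding."""
    digest = hashlib.sha1(password.encode(encoding)).digest()
    return f"{user}:{{SHA}}{base64.b64encode(digest).decode('ascii')}"

latin1_line = sha_entry("pm2", "\u00a3", "iso-8859-1")
utf8_line = sha_entry("pm2", "\u00a3", "utf-8")
print(latin1_line)
print(utf8_line)
print(latin1_line != utf8_line)  # same password, different stored lines
```
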
A client like curl does no password encoding conversion, so the command
$ curl http://... -u pm2:£
will send the token 'cG0yOqM=' when running in an ISO-8859-1 environment
and the token 'cG0yOsKj' when running in UTF-8. Only one of these will
work depending on how the htpasswd file is set up.
A client like svn converts passwords from the command line or
keyboard to UTF-8 so the command
$ svn cat http://... --username pm2 --password £
will always send the 'cG0yOsKj' auth token. This will work if the
htpasswd file has been set up for UTF-8, and it will work whatever
environment is being used by the client, but it will fail if the
htpasswd file has not been set up for UTF-8.
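
The difference between the two client behaviours can be sketched as
follows. This is a simplified model, not code from curl or Subversion;
the function names are hypothetical, and the bytes typed by the user
are passed in explicitly together with the name of the environment's
encoding:

```python
import base64

def token_passthrough(user: str, typed: bytes) -> str:
    """curl-like: use the password bytes exactly as the environment
    delivered them, with no conversion."""
    raw = user.encode("ascii") + b":" + typed
    return base64.b64encode(raw).decode("ascii")

def token_utf8(user: str, typed: bytes, env_encoding: str) -> str:
    """svn-like: decode the typed bytes from the environment's encoding
    and re-encode them as UTF-8 before building the token."""
    converted = typed.decode(env_encoding).encode("utf-8")
    return token_passthrough(user, converted)

latin1_input = b"\xa3"      # the pound sign typed in ISO-8859-1
utf8_input = b"\xc2\xa3"    # the pound sign typed in UTF-8

print(token_passthrough("pm2", latin1_input))         # cG0yOqM=
print(token_passthrough("pm2", utf8_input))           # cG0yOsKj
print(token_utf8("pm2", latin1_input, "iso-8859-1"))  # cG0yOsKj
print(token_utf8("pm2", utf8_input, "utf-8"))         # cG0yOsKj
```
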
Other clients such as TSVN or web browsers may behave like curl, or they
may behave like svn, or they may do something else. By adjusting the
setup on the server you can generally get any given client to work in
any given encoding, but there is no way to get all clients to work in
all encodings.
It gets even more complicated when you consider password caching: the
passwords that Subversion stores are in UTF-8 and Subversion assumes
that they are still UTF-8 when retrieved. However if the password store
is shared with other clients, say a web browser, then those other
clients may have stored non-UTF-8 passwords and this will cause
Subversion to send non-UTF-8 auth tokens. That works only if the server
is set up so that non-UTF-8 tokens work.
--
Philip
Received on 2012-04-18 13:56:01 CEST