Nathan Hartman wrote on Mon, Nov 11, 2019 at 10:38:31 -0500:
> Speaking of testing, is this related to last week's issue, SVN-2079,
> which is unresolved ("utf8_tests.py should be made non-iso8859-1
> specific")? And if one is solved, would that (help) solve the
> other?
The two issues are related, yes: both of them are about conversion from the
internal encoding (UTF-8) to the encoding used by stdout and argv. In #2079,
we could do something along the lines of:
1. Check whether "en_US.ISO-8859-1" is a valid locale name.
2. From Python, create a file whose name contains some non-ASCII characters.
3. Add and commit that file.
4. Verify that 'svnadmin dump', 'svn ls', etc. give the expected results.
5. Do the same thing with another encoding.
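Steps 1 and 2 above could be sketched roughly as follows. This is only an illustration, not code from the test suite: the helper names locale_is_available and create_non_ascii_file are made up for this example, and it assumes a POSIX filesystem that accepts arbitrary bytes in file names.

```python
import locale
import os


def locale_is_available(name):
    """Return True if setlocale() accepts `name` on this system (step 1)."""
    saved = locale.setlocale(locale.LC_ALL)  # query only, changes nothing
    try:
        locale.setlocale(locale.LC_ALL, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_ALL, saved)  # restore the caller's locale


def create_non_ascii_file(directory, stem, encoding):
    """Create an empty file named `stem` + ".txt", encoded in `encoding` (step 2).

    A bytes path is used so that the on-disk name is exactly the requested
    encoding, independent of Python's own filesystem encoding.
    """
    path = os.path.join(os.fsencode(directory), stem.encode(encoding) + b".txt")
    with open(path, "wb"):
        pass
    return path
```

A test would then do something like "if locale_is_available("en_US.ISO-8859-1"): create_non_ascii_file(wc_dir, "caf\u00e9", "iso-8859-1")" before running the add/commit and verification steps under that locale, and repeat with a second locale/encoding pair for step 5.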
> From the issue [#807]:
>
> > Right now, if a log message contains characters that cannot be
> > represented in the client's locale, that log message will simply show
> > up as:
> >
> > "[unconvertible log msg]"
> >
> > Graceful degradation would be nice here :-).
>
> Questions:
>
> Is this still the case?
>
> If we're not sure, how can this be tested? (i.e., how to create and
> commit a log message that will cause this to manifest?)
Well, I'm sure there are better ways, but I just did this:
.
% svnadmin create r
% vim -b r/db/revprops/0/0
.
and manually added an svn:log property with a value that's invalid UTF-8 [svn:*
properties must use UTF-8 with LF line endings]:
.
% xxd r/db/revprops/0/0 | vipe
00000000: 4b20 380a 7376 6e3a 6461 7465 0a56 2032  K 8.svn:date.V 2
00000010: 370a 3230 3139 2d31 312d 3131 5431 363a  7.2019-11-11T16:
00000020: 3038 3a30 312e 3334 3437 3434 5a0a 4b20  08:01.344744Z.K
00000030: 370a 7376 6e3a 6c6f 670a 5620 330a ffff  7.svn:log.V 3...
                                             ^^^^
00000040: ff0a 454e 440a                           ..END.
          ^^
%
You can confirm it's invalid:
.
% iconv -f utf8 < r/db/revprops/0/0 > /dev/null
iconv: illegal input sequence at position 62
zsh: exit 1 iconv -f utf8 < r/db/revprops/0/0 > /dev/null
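The same check can be reproduced without touching a repository. The sketch below rebuilds the bytes shown in the hex dump and asks Python where UTF-8 decoding fails. The K/V/END layout is inferred from the dump above only, and serialize_props and first_invalid_utf8_offset are names invented for this example.

```python
def serialize_props(props):
    """Serialize a {bytes: bytes} dict in the K/V/END layout seen in the dump."""
    out = bytearray()
    for key, value in props.items():
        out += b"K %d\n%s\n" % (len(key), key)      # key length, then key
        out += b"V %d\n%s\n" % (len(value), value)  # value length, then value
    out += b"END\n"
    return bytes(out)


def first_invalid_utf8_offset(data):
    """Return the offset of the first byte that is not valid UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


revprops = serialize_props({
    b"svn:date": b"2019-11-11T16:08:01.344744Z",
    b"svn:log": b"\xff\xff\xff",
})
print(first_invalid_utf8_offset(revprops))  # prints 62
```

The offset agrees with the position iconv reports: the first 0xff byte sits right after the "V 3" line.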
'svn log' gives:
.
% svn log file://$PWD/r
------------------------------------------------------------------------
r0 | (no author) | 2019-11-11 16:08:01 +0000 (Mon, 11 Nov 2019) | 1 line
?\FF?\FF?\FF
------------------------------------------------------------------------
%
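For illustration, this kind of per-byte escaping can be mimicked in Python with a custom decode-error handler. This is not Subversion's implementation, just a sketch that reproduces the output format seen above. The handler name "escape_like_svn_log" and the function display_log_message are hypothetical.

```python
import codecs


def _escape_invalid_bytes(exc):
    """Decode-error handler: render each undecodable byte as ?\\XX (uppercase hex)."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    # Return the replacement text and the position at which decoding resumes.
    return "".join("?\\%02X" % b for b in bad), exc.end


codecs.register_error("escape_like_svn_log", _escape_invalid_bytes)


def display_log_message(raw):
    """Decode a log message, escaping invalid bytes and keeping the rest."""
    return raw.decode("utf-8", errors="escape_like_svn_log")


print(display_log_message(b"\xff\xff\xff"))  # prints ?\FF?\FF?\FF
```

Valid text around the bad bytes survives, so b"fix \xff bug" would come out as "fix ?\FF bug" rather than being replaced wholesale.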
So the invalid bytes now come out escaped, not as "[unconvertible log msg]".
I think we can close #807 as "Fixed at some point"?
Thanks for bringing it up!
Daniel
(who wonders how many years ago the issue was fixed…)
Received on 2019-11-11 17:30:24 CET