
Re: Are log messages Unicode?

From: Barry Scott <barry_at_barrys-emacs.org>
Date: Wed, 16 Jul 2008 22:41:13 +0100

My user says that the repos was created by cvs2svn and wonders if it is
the source of the bad log entry.

Barry

On Jul 13, 2008, at 22:37, Neels Janosch Hofmeyr wrote:

> Hi list, long time no see :)
>
> Daniel Shahaf wrote:
>> Karl Fogel wrote on Mon, 7 Jul 2008 at 12:15 -0400:
>>> "Ben Collins-Sussman" <sussman_at_red-bean.com> writes:
>>>> On Sun, Jul 6, 2008 at 5:23 AM, Barry Scott <barry_at_barrys-emacs.org> wrote:
>>>>> Using the svn_client API, is it possible for a client to write
>>>>> non-UTF-8 log messages?
>>>>> Clearly if this happened it would be a bug in the client given the
>>>>> above statement.
>>>> I don't recall the details, but it's actually the *programmers'*
>>>> burden to convert paths and log messages from native locale to UTF8
>>>> (and back again). If you read the svn APIs, you'll notice that every
>>>> path and log message passed into APIs (or passed around between APIs)
>>>> is presumed to *already* be UTF8. So if you're writing your own
>>>> client, it's your job to convert user input to UTF8 before passing to
>>>> svn_client_*(). Look at the commandline client to see how it's doing
>>>> that; I believe there are a number of convenience routines in
>>>> libsvn_subr to help with conversion.
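
The conversion step described above can be sketched in a few lines. This is a minimal illustration in Python, not the libsvn_subr routines: the helper name `to_utf8` and the Latin-1 default for the native encoding are assumptions made for the example.

```python
def to_utf8(raw: bytes, native_encoding: str = "latin-1") -> bytes:
    """Re-encode user input from a native encoding to UTF-8.

    Hypothetical helper: both the name and the Latin-1 default are
    assumptions for illustration, not part of any Subversion API.
    """
    # Decoding is strict by default, so for encodings in which some byte
    # sequences are invalid this raises UnicodeDecodeError instead of
    # guessing. (Latin-1 maps every byte, so it never raises.)
    text = raw.decode(native_encoding)
    return text.encode("utf-8")


# A log message typed in a Latin-1 locale, converted before it would be
# handed to an API that presumes UTF-8.
message = to_utf8(b"r\xe9serv\xe9")
```

The point of the sketch is only that the caller, not the library, knows the native encoding, so the caller has to do the conversion before the API call.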
>>> I think Barry's asking if the client and/or server do any validation.
>>> That is, if the programmer supplies a non-UTF8 log message, our client
>>> libraries should reject it; and if such a log message were to reach the
>>> repository (perhaps because someone wrote their own client software from
>>> scratch), the repository should reject it too.
>>>
>>> I don't know whether we do such validation or not, but agree we should.
>>>
>>
>> Since r31614 (Neels' fix of issue #1796) we do UTF-8 validation of log
>> messages in libsvn_repos. It has not been backported to 1.5.x.
>
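
For readers unfamiliar with what such validation involves, here is a minimal sketch in Python. It is not the code from r31614; the function names `first_invalid_utf8_offset` and `check_log_message` are hypothetical, and it checks only UTF-8 well-formedness.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the offset of the first byte that breaks UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first offending byte.
        return exc.start
    return None


def check_log_message(data: bytes) -> None:
    """Reject a log message that is not valid UTF-8 (hypothetical check)."""
    offset = first_invalid_utf8_offset(data)
    if offset is not None:
        raise ValueError(
            f"log message is not valid UTF-8 (byte offset {offset})"
        )
```

Rejecting the input with an error, rather than repairing it, matches the behaviour asked for earlier in the thread: the client or repository should refuse a non-UTF8 log message.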
> Quoting message "[PATCH] issue 1796: ..." from 03 Jun 2008 by me:
>
> "
> The subversion server and client do not validate props in places where
> they should:
> - where the server receives props from a client out there. (#1796)
> - where the server reads props from the repository file system.
> - where the svn client reads props from a server out there.
> (Approval by kfogel)
>
> [My] patch starts by fixing the specific problems of issue 1796, only:
> - where the server receives props from a client out there. (#1796)
> , and limited only to the log message prop (SVN_PROP_REVISION_LOG).
> "
>
> I am still intending to continue on these issues... (I have been
> diverted because of the social shock following a recent unexpected
> death in my close family)
>
> I am still at the point where I am trying to find out
>
> - the best place to validate props being read from the repository file
> system by the server;
>
> - how to write a unit test on whether the server validates props read
> from the file system (the code that writes *to* the file system now
> validates props; so, how do I get *unvalidated* props written to the
> file system in the first place?);
>
> - the best place to validate props in the client, reading from a
> server out there;
>
> - how to write a unit test on whether the client validates props read
> from a server out there;
>
> - which other props need to be validated;
>
> - what the formats for these other props are (are they, by chance, all
> UTF8 & LF? That would be nice.).
>
> Since other/more people are taking interest in these issues, maybe it
> would make sense to file separate issues in the issue tracker for the
> remaining two cases:
>
> - where the server reads props from the repository file system.
> - where the svn client reads props from a server out there.
>
>>
>> The cmdline client also does some conversions; in my case, it
>> dropped the bytes it couldn't understand:
>>
>> % svn ci iota -F dump-fragment.txt
>> Sending iota
>> Transmitting file data .
>> Committed revision 2.
>>
>> # It should have failed. Let's see...
>> % xxd ../../repos1/db/revprops/0/2
>> ...
>> 00000a0: 370a 7376 6e3a 6c6f 670a 5620 3130 310a  7.svn:log.V 101.
>> 00000b0: 4269 7462 7563 6b65 7420 7273 6572 7620  Bitbucket rserv 
>> 00000c0: 2064 6576 2f6e 756c 6c0a 436c 6173 7365   dev/null.Classe
>> ...
>>
>> # Ah, but that's not the log message I specified!
>> % xxd dump-fragment.txt
>> 0000040: 380a 0a4b 2037 0a73 766e 3a6c 6f67 0a56  8..K 7.svn:log.V
>> 0000050: 2031 3031 0a42 6974 6275 636b 6574 2072   101.Bitbucket r
>> 0000060: e973 6572 76e9 20e0 2064 6576 2f6e 756c  .serv. . dev/nul
>> # It dropped these bytes:                         ^    ^ ^
>>
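
The difference between the two dumps can be reproduced outside Subversion. The sketch below, in Python, takes the bytes shown in dump-fragment.txt and decodes them three ways. It does not show how the cmdline client actually handled them, which the thread does not establish, and reading the original as Latin-1 is an assumption.

```python
# Bytes of the log message as they appear in dump-fragment.txt above.
raw = b"Bitbucket r\xe9serv\xe9 \xe0 dev/nul"

# Lossy handling: silently skip anything that is not valid UTF-8.
lossy = raw.decode("utf-8", errors="ignore")

# Strict handling: refuse the input instead of altering it.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Assumption: the message was originally written in Latin-1.
as_latin1 = raw.decode("latin-1")

print(repr(lossy))
print(strict_ok)
print(as_latin1)
```

The lossy result has the same shape as the committed revprop (`rserv`, then two spaces, then `dev/nul`), which is consistent with the three non-UTF-8 bytes having been dropped rather than converted.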
>>> Barry, got time to test/trace it?
>
> Hm, that's not nice. Silently dropped bytes aren't good. The user
> should at least be informed about what's happening...
>
> --
> Neels Hofmeyr -- elego Software Solutions GmbH
> Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
> phone: +49 30 23458696 mobile: +49 177 2345869 fax: +49 30 23458695
> http://www.elegosoft.com | Geschäftsführer: Olaf Wagner | Sitz: Berlin
> Handelsreg: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194
>

Received on 2008-07-16 23:41:46 CEST

This is an archived mail posted to the Subversion Dev mailing list.