Karl Fogel wrote:
> Neels Janosch Hofmeyr <neels_at_elego.de> writes:
>> So, the svn client does try to convert the incoming log message to UTF-8
>> and complains if it can't.
>> This completes the picture where the client behaves perfectly, but the
>> server accepts anything it is given and writes it to the repository.
>> Thus, if someone used a forged client, they could "corrupt" the
>> repository, which may print unreadable logs due to unexpected character
>> sequences. The user's shell display might "crash" when trying to print
>> paranormal character sequences from another dimension.
> Hrm? The top seems to contradict the bottom... An introductory summary
> in your mail would help a bit :-).
> Your transcript shows the client attempting to convert various
> bytestrings to UTF8, to create a log message to send *to* the
> repository. Your transcript doesn't show any instances of the client
> attempting to convert data it received *from* the repository, which is
> the scenario you're talking about in your final paragraph above.
> My memory is that when the client receives a log message from the
> repository, the client attempts to convert it to UTF-8 and complains (in
> a terminal-safe way) if it can't. If that's not the case, we should fix
> it. However, this seems to have nothing to do with what you showed in
> your mail.
> Am I misunderstanding something?
Oh, you're right. I never said anything about *that*.
All along, I have been talking about the user giving the svn client a log
message which is sent to the server. If the log message given by the user
is invalid (not UTF-8 with LF line endings), then the normal svn client
normalises it, or complains if it can't. In that case the server receives
only valid log messages.
But if the user forges her client to skip normalising, then enters an
invalid log message, which is sent in its invalid form to the server,
then the server happily writes the invalid log message to the repository.
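To make "normalising" concrete, here is a minimal sketch of the kind of
check I mean. It is written in Python purely for illustration and is not
Subversion's actual code; the function name normalise_log_message is made
up for this example. It accepts a byte string only if it is valid UTF-8,
and converts CRLF and bare CR line endings to LF.

```python
def normalise_log_message(raw: bytes) -> bytes:
    """Return raw as UTF-8 with LF line endings, or raise ValueError.

    Hypothetical helper for illustration only, not a Subversion API.
    """
    try:
        # Strict decoding rejects any byte sequence that is not valid UTF-8.
        text = raw.decode("utf-8")
    except UnicodeDecodeError as err:
        raise ValueError(f"log message is not valid UTF-8: {err}") from None
    # Convert CRLF first, then any remaining bare CR, to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.encode("utf-8")
```

A server running something like this on every incoming log message would
reject the forged client's input instead of storing it.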
This is what I've been trying to establish with all of my previous
mails. Now, I notice that I omitted one direction of data transfer in my
descriptions: the data flow from the server back to the client. So, let
me complete the picture:
Say that an invalid log message has been written to the repository. If,
then, the normal (unforged) svn client is invoked with, e.g., `svn log
<file>', the invalid log message is read from the repository and passed
to the client, which does no normalising or checking whatsoever and
prints the invalid characters to the screen directly.
(I repeat, this happens when using the normal svn client without any
malicious modifications. I only forged the part where the user gives a
To confirm this, look in the same test logs that prove the point that
the server accepts invalid log messages. You can see that `svn log'
produces those same invalid characters.
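For comparison, the "terminal-safe" behaviour Karl describes could look
roughly like the sketch below. Again, this is Python for illustration
only, not what the svn client does; terminal_safe is a made-up name. The
idea is to escape invalid bytes and control characters before printing,
rather than writing them to the screen as they are.

```python
def terminal_safe(raw: bytes) -> str:
    """Render raw for display, escaping anything unsafe to print.

    Hypothetical helper for illustration only, not a Subversion API.
    """
    # Bytes that are not valid UTF-8 become visible \xNN escapes.
    text = raw.decode("utf-8", errors="backslashreplace")
    out = []
    for ch in text:
        if ch in "\n\t" or ch.isprintable():
            out.append(ch)
        else:
            # Control characters (e.g. ESC) are shown as escapes too.
            out.append(ch.encode("unicode_escape").decode("ascii"))
    return "".join(out)
```

With this, a stored log message containing an escape sequence or a stray
0xff byte would show up as literal text such as \x1b or \xff.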
So, right now, there is only *one* place where props get
normalised/checked for consistency:
- where the svn client receives a log message from the user
The places where checking of the props is apparently missing are:
- where the server receives props from a client out there.
- where the server reads props from the repository file system.
- where the svn client reads props from a server out there.
The place where I don't yet know what happens is:
- where the client receives any svn:prop other than a commit log from
I'd just like to ask: would it be considered too much overhead to check
all svn:props for valid UTF-8 and LF line endings in all of the places
discussed? If both the client and the server check every time, then each
prop is checked at least twice for a given operation.
Neels Hofmeyr -- elego Software Solutions GmbH
Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
phone: +49 30 23458696 mobile: +49 177 2345869 fax: +49 30 23458695
http://www.elegosoft.com | Geschäftsführer: Olaf Wagner | Sitz: Berlin
Handelsreg: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194
Received on 2008-05-26 23:01:37 CEST