Re: [Issue 1796] defective or malicious client can corrupt repository log messages

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Fri, 01 Aug 2008 10:20:13 +0100

On Fri, 2008-08-01 at 02:17 +0200, Neels Janosch Hofmeyr wrote:
> A number of weeks ago, there was a discussion on validation of the
> commit log messages on their journey from client to server and back.
>
> It was said that:
>
> Neels Janosch Hofmeyr wrote:
> > So, right now, there is only *one* place where props get
> > normalised/checked for consistency:
> > - where the svn client receives a log message from the user
> >
> > The places where checking the props is supposedly missing are:
> > - where the server receives props from a client out there.
> > - where the server reads props from the repository file system.
> > - where the svn client reads props from a server out there.
>
> The first of the latter three has been fixed (issue 1796).
> The last two are still lurking.
>
> Since then, I've had a discussion with stsp on the implications of
> fixing these latter two.
>
> Imagine that someone has a repository containing log messages with CR or
> non-UTF8 sequences. Then, *we* come along and make the server validate
> log messages read from the file system, plus make the client validate
> log messages received from the server. In effect, the user isn't able to
> simply *look* at the log message anymore.
>
> It struck us as a rather dumb situation, and I have since come to the
> opinion that the part of a log message's journey going in the direction
> towards the user should not have prohibitive log message validation.

Yes, that's sensible.

> If everyone agrees, the question remains whether server and client
> should print out a warning along with passing an inconsistent log
> message towards the user.
>
> What do you think?

A couple of rules of thumb should help.

It's the server's job to maintain the integrity of the historical data.
It doesn't have to do this by checking the data every time it reads it,
and doing so could be inefficient. It would be sufficient to have a way
to check it occasionally, maybe off-line or at scheduled maintenance
intervals, such as in the "svnadmin verify" command or in an external
tool that scans a dump-file.
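
As a rough illustration of such an occasional off-line check, here is a
minimal Python sketch. It assumes the log messages have already been
extracted from the repository or a dump-file into a mapping of revision
number to raw bytes; the function names check_log_message and
scan_log_messages are hypothetical and are not part of Subversion or of
"svnadmin verify".

```python
def check_log_message(raw: bytes) -> list[str]:
    """Return a list of problems found in one stored log message."""
    problems = []
    # Carriage returns are one of the inconsistencies discussed in this thread.
    if b"\r" in raw:
        problems.append("contains a carriage return (CR)")
    # Log messages are expected to be UTF-8; report where decoding fails.
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        problems.append(f"not valid UTF-8 at byte offset {err.start}")
    return problems


def scan_log_messages(messages: dict[int, bytes]) -> dict[int, list[str]]:
    """Check every revision's log message; report only those with problems.

    `messages` maps revision number to the raw log message bytes
    (a hypothetical input format chosen for this sketch).
    """
    report = {}
    for revision, raw in sorted(messages.items()):
        problems = check_log_message(raw)
        if problems:
            report[revision] = problems
    return report
```

The scan only reports; it never rewrites or rejects anything, so it can
run at maintenance time without affecting normal reads.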

The client side should validate historical data only to the extent
needed to do its job: so, if it needs to convert the log message from
UTF-8 to Latin-1 and finds the message isn't valid UTF-8, that's a case
where it's reasonable to throw an error or a warning. Throwing a warning
when it doesn't need to would be annoying and a waste of effort. The
user would say, "Why is Subversion complaining to me when I read this
log message? I don't have a problem reading this even though it's got a
funny character in it. Subversion should be telling the administrator
instead if there's a problem with the repository."
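
The client-side rule can be sketched in the same way: validate only as
a side effect of work the client has to do anyway. This is a minimal
Python sketch, not the Subversion client's actual code path; the
function log_message_for_display and its target_encoding parameter are
hypothetical names introduced for illustration.

```python
from typing import Optional


def log_message_for_display(raw: bytes,
                            target_encoding: Optional[str] = None) -> bytes:
    """Return the log message bytes to show to the user.

    With no target encoding the stored bytes are passed through untouched,
    so no validation happens and no warning is raised. Only when a
    conversion is actually required does invalid data surface as an error.
    """
    if target_encoding is None:
        # Nothing to convert: hand the message over as-is.
        return raw
    # Conversion needs valid UTF-8; this raises UnicodeDecodeError otherwise.
    text = raw.decode("utf-8")
    # Raises UnicodeEncodeError if a character has no target representation.
    return text.encode(target_encoding)
```

For example, calling it with the bytes b"caf\xe9" and no target
encoding returns them unchanged, while asking for "latin-1" on the same
bytes raises UnicodeDecodeError, because the input is not valid UTF-8
and the conversion cannot be done.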

- Julian

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-08-01 11:20:43 CEST
