
Re: FSFS format7 and compressed XML bundles

From: Vincent Lefevre <vincent-svn_at_vinc17.net>
Date: Tue, 5 Mar 2013 16:47:01 +0100

Hi Julian,

On 2013-03-05 13:30:28 +0000, Julian Foad wrote:
> Vincent Lefevre wrote:
>
> > On 2013-03-01 14:58:10 +0000, Philip Martin wrote:
> >> A server-side solution is difficult.  Suppose the client has some
> >> uncompressed content U which it compresses to C and sends to the server.
> >> The server can uncompress C to get U but unless the compression scheme
> >> has a canonical compressed form, with no other forms allowed, the server
> >> cannot avoid storing C because there is no guarantee that C can be
> >> reconstructed from U.
> >
> > This is not specific to server side. Even on the client side, the
> > reconstruction may not be always possible, e.g. if the system is
> > upgraded or if NFS is used. And the compression level may need to
> > be detected or provided in some way.
>
> Hi Vincent.  I'm not sure you understood Philip's point.

What I meant should be clearer from what follows. My point is that
whether this is done entirely on the server side (a bad solution,
IMHO) or on the client side (see below for why), the problems are similar.

> His point is (correct me if I'm wrong) that Subversion's design
> requires that during a checkout or update, the server must
> reconstruct a file containing exactly the same bit pattern that the
> client sent when committing the file.  Compression schemes in
> general don't guarantee that expanding and then compressing will
> produce the same compressed bit pattern, even if you take care to
> use the same "compression level".  Therefore, the server cannot
> simply expand the data before storing it and then re-compress it
> during checkout or update, because, although the resulting
> compressed file would be a valid representation of the user's data,
> it would not satisfy Subversion's own requirement that the bit
> pattern be identical to what was sent by the client during the
> commit.
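
To make the round-trip problem concrete, here is a minimal sketch in
Python. It uses zlib purely as a stand-in, since the thread does not
name a particular compression scheme; the sample data and the helper
name roundtrip_is_bit_identical are hypothetical. It shows that two
compressed forms can expand to the same content U and still differ
as bit patterns.

```python
import zlib


def roundtrip_is_bit_identical(compressed: bytes, level: int) -> bool:
    """Expand `compressed`, recompress at `level`, and compare the bytes.

    Hypothetical helper: it mimics a server that stores only the
    expanded content U and tries to rebuild C on checkout.
    """
    content = zlib.decompress(compressed)
    return zlib.compress(content, level) == compressed


# U: the uncompressed content (made-up sample data).
U = b"<doc>" + b"<p>some repetitive XML text</p>" * 200 + b"</doc>"

# C: what the client committed, compressed at level 9.
C = zlib.compress(U, 9)

# Another valid compressed form of the same content, at level 1.
C_fast = zlib.compress(U, 1)

# Both forms expand to the same content.
same_content = zlib.decompress(C) == zlib.decompress(C_fast) == U

# Rebuilding with the wrong level does not give back the committed bytes.
rebuilt_with_wrong_level = roundtrip_is_bit_identical(C, 1)
```

Even with the right level, a match is only likely when the same
library version and settings are used on both ends, which is the
guarantee that is missing in general.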

You say that the server expands the data before storing it. This is
for a server-side only solution, I assume. But even if there were
no problems with the construction/reconstruction, it would be a
bad solution, IMHO. Indeed, for a commit, it is the client that is
supposed to expand the data before sending the diff to the server,
and for an update, it is the client that is supposed to recompress
the data before storing it in the WC. Actually, the server doesn't
need to know how the file was compressed; it just needs to record
information about the compression (without needing to know what
this information means exactly).

> That point _is_ specific to a server-side solution.  With a
> client-side solution, the user's word processor may not mind if a
> versioning operation such as a commit (through a decompressing
> plug-in) followed by checkout (through a re-compressing plug-in)
> changes the bit pattern of the compressed file, so long as the
> uncompressed content that it represents is unchanged.

I disagree. The word processor may not mind (in theory, because
in practice, there may be bugs that depend on the bit pattern,
and it would be bad to expose the user to this kind of bug and to
non-deterministic behavior), but for the user this may be important.
For instance, a different bit pattern will break a possible signature
on the compressed file.
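
As an illustration of the signature problem, here is a small Python
sketch. A SHA-256 digest of the compressed file stands in for a real
signature, and zlib stands in for the actual compression scheme; both
are assumptions made for the example, as are the sample data and the
names used.

```python
import hashlib
import zlib


def digest(data: bytes) -> str:
    """SHA-256 of the file bytes, used here as a stand-in for a signature."""
    return hashlib.sha256(data).hexdigest()


# Made-up document content.
content = b"<doc>" + b"<p>paragraph</p>" * 100 + b"</doc>"

# File as committed by the user (compressed at level 9).
committed = zlib.compress(content, 9)

# File as rebuilt on checkout by a hypothetical re-compressing plug-in
# that happens to use a different level.
checked_out = zlib.compress(content, 1)

# The uncompressed content is unchanged...
same_content = zlib.decompress(committed) == zlib.decompress(checked_out)

# ...but anything computed over the compressed bytes no longer matches.
same_digest = digest(committed) == digest(checked_out)
```

Here same_content holds while same_digest does not, so a signature
made over the committed file would not verify against the file that
comes back from the checkout.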

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Received on 2013-03-05 16:47:35 CET

This is an archived mail posted to the Subversion Dev mailing list.