Re: caching proxies and SVN network perf

From: Brian Behlendorf <brian_at_collab.net>
Date: 2000-10-24 14:07:53 CEST

On Tue, 24 Oct 2000, Greg Stein wrote:
> Not a problem at all... There are two parts here:
>
> 1) Each version of a file has its own URL. Therefore, an HTTP GET of that
> URL means that the cache can retain copies of the individual version.
>
> 2) For a diff (say, from 5 to 7), SVN asks for v7 of the resource and
> appends a header to the request stating "I have v5 and understand <these>
> diff formats." The server can then return a diff from v5 to v7.
>
> The trick is to include a Vary: header which refers to those extra diff
> headers. The cache keys its values based on the URL and the contents of the
> headers listed in the Vary: header.
>
> The next person to ask for v7, with the v5 listed in the diff header will
> see the document returned from the cache.
>
> [ It would be nice to give a concrete example here, but like I said: I need
> to really dig into the diff-draft and concretely explore how this will be
> done. ]
>
> IOW, any HTTP/1.1 caching proxy that properly obeys the Vary: header (per
> the HTTP spec) *will* cache diff responses. The cache can also hold
> individual versions of a file since each has a unique URL.

Yes, of course. My point in the last message was that another client
asking for the diff between v6 and v7 will require another long-distance
request to the remote server, because the proxy isn't going to know how to
decompose the cached diff(v5,v7) into diff(v5,v6) and diff(v6,v7); in
fact, even if it had both diff(v5,v7) and diff(v5,v6), it couldn't produce
diff(v6,v7). So I really don't know, statistically, how much the proxies
will save. They'll save a lot if people always ask for single-version
increments, but I don't know how common that is versus multi-version
increments.
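
To make the hit/miss behaviour concrete, here is a rough sketch of how a
Vary-aware cache would key its entries. This is a toy model, not a real
proxy: the header names (X-Have-Version, X-Accept-Diff), the URL layout and
the VaryCache class are all made up for illustration, since the actual
diff headers haven't been pinned down yet.

```python
# Toy model of a cache that keys on URL plus the request headers named in
# the response's Vary: header. All names here are hypothetical.

class VaryCache:
    def __init__(self):
        self.vary_by_url = {}  # url -> header names listed in Vary:
        self.store = {}        # (url, header values) -> cached body

    def _key(self, url, request_headers):
        # HTTP header names are case-insensitive, so normalise them.
        headers = {k.lower(): v for k, v in request_headers.items()}
        names = self.vary_by_url.get(url, ())
        return (url, tuple(headers.get(n, "") for n in names))

    def put(self, url, request_headers, vary, body):
        # Remember which request headers this URL's responses vary on.
        self.vary_by_url[url] = tuple(sorted(n.lower() for n in vary))
        self.store[self._key(url, request_headers)] = body

    def get(self, url, request_headers):
        # Returns None on a miss, i.e. the proxy must go to the origin.
        return self.store.get(self._key(url, request_headers))


cache = VaryCache()
url = "/repo/foo.c/v7"  # hypothetical per-version URL
vary = ["X-Have-Version", "X-Accept-Diff"]

have_v5 = {"X-Have-Version": "5", "X-Accept-Diff": "unified"}
have_v6 = {"X-Have-Version": "6", "X-Accept-Diff": "unified"}

# First client has v5 and asks for v7; the proxy stores the diff response.
cache.put(url, have_v5, vary, "diff(v5,v7)")

hit = cache.get(url, have_v5)   # second client with v5: served from cache
miss = cache.get(url, have_v6)  # client with v6: nothing usable in cache
```

In this model the client holding v6 misses even though a v5-to-v7 diff for
the same URL is sitting in the cache, which is the case I'm worried about.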

> Now, does akamai simply cache stuff out of the blue? No idea. It seems that
> people may need to have a biz relationship with akamai first. *shrug* My
> point wasn't to provide a concrete example, but to point out that a caching
> network *could* create some scaling benefits for SVN repositories.

For the read-only case, yes. Developers who want to read from and commit
to the same resources will still need to use the "main" svn server.

> > BTW, replicated repositories is something we (collab.net) are *very*
> > interested in helping see happen.
>
> Read-only copies shouldn't be too difficult. When somebody does a commit, we
> just redirect to the master. Having multiple masters that must resolve
> conflicts between them... icky. That will be a bitch. Although, I bet there
> is a lot of theory out there on how to do this, so it might be a "simple"
> reduction of theory to practice. Theoretically. :-)

http://www.bitkeeper.com/. But let's not worry about that for now....

Brian
Received on Sat Oct 21 14:36:12 2006
