
Re: Last-Modified HTTP header in GET responses

From: Johan Corveleyn <jcorvel_at_gmail.com>
Date: Mon, 2 Sep 2019 16:20:16 +0200

On Fri, Jan 15, 2016 at 1:58 PM Ivan Zhakov <ivan_at_visualsvn.com> wrote:
> On 7 January 2016 at 10:34, Ivan Zhakov <ivan_at_visualsvn.com> wrote:
> > On 6 January 2016 at 08:14, Greg Stein <gstein_at_gmail.com> wrote:
> >> Personally, I'd be more interested in the effects on the network and its
> >> caching ability. Do we really need to save CPU/IO on the server? Today's
> >> servers seem more than capable, and are there really svn servers out in the
> >> wild getting so crushed, that this is important? It seems that as long as
> >> proxies/etc can properly cache the results, and (thus) avoid future touches
> >> on the backend server, then we're good to go.
> >>
> >> If the patch doesn't affect the caching (which it sounds like "no"), then
> >> just go with it. Sure, it is neat to look at syscalls, but... why? I don't
> >> understand the need to examine/profile. Educate me?
> >>
> >
> > The patch should not affect HTTP caching for two reasons:
> > 1. Browsers and proxies support ETag and use it instead of the
> > Last-Modified header.
> > 2. The ETag and Last-Modified headers are used only for cache
> > re-validation once max-age has expired. But Subversion sets max-age=1
> > week for resources with a specific revision in the URL
> > (http://server/!svn/rvr/1/path). max-age=0 is only used for public
> > URLs without a revision (i.e. http://server/path).
> >
> > As far as I know, proxy usage is limited to public servers with anonymous
> > access, since caching of HTTP responses with Authorization is
> > prohibited by RFC.
> >
> > Anyway, I agree that trading bandwidth usage to save some CPU/IO on the
> > server doesn't make sense, but the Last-Modified case is different:
> > the Subversion server is wasting 10%+ of its resources to produce an
> > unused header.
> >
> > I don't have access to svn.apache.org server performance stats, but I
> > suppose it's a pretty busy server and the Infra team would welcome any
> > Subversion server performance improvements.
> >
> Committed in r1724790.
>
> --
> Ivan Zhakov

A bit late perhaps, but apparently this change (removing the
Last-Modified header from GET responses) broke a specific use case at
my company (we just upgraded our SVN server from 1.9 to 1.10, bringing
along this particular change):

- We use Apache Ivy (http://ant.apache.org/ivy/) for dependency
management of our Java applications.

- Third-party jar files are kept in our svn repo under
/trunk/ivyrepository (and branched / tagged in release branches, so we
have completely reproducible builds, even if our third-party jars or
their dependency structures change on trunk).

- We use Ivy's "URL Resolver" [1], which downloads the files with
regular GET requests (and HEAD requests to check the up-to-dateness of
the cache on the client). We effectively use SVN in this case as a
"regular" file server (which coincidentally has branches and tags so
we can resolve against the correct tree when making a build).

This last part now fails, i.e. Ivy's URLResolver no longer detects
that a file has changed. It used to compare its own "last-mod time of
the file on disk in the cache" with the Last-Modified header, which
works fine with all kinds of file servers, and worked with SVN < 1.10.
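
To make that comparison concrete, here is a rough Python sketch of the
check as described above. It is not Ivy's actual code: the function name
cached_copy_is_stale and the plain-dict representation of the response
headers are assumptions made for illustration only.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def cached_copy_is_stale(headers, cached_mtime):
    """Compare a response's Last-Modified header with a cached file's mtime.

    headers      -- dict of HTTP response headers (hypothetical input format)
    cached_mtime -- modification time of the cached file, in seconds since epoch

    Returns True if the server copy is newer, False if it is not, and
    None if the server sent no Last-Modified header (nothing to compare).
    """
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    value = normalized.get("last-modified")
    if value is None:
        return None

    remote = parsedate_to_datetime(value)
    if remote.tzinfo is None:
        # Treat a date without usable zone information as UTC.
        remote = remote.replace(tzinfo=timezone.utc)
    return remote.timestamp() > cached_mtime
```

Without the header, this check returns None. A client that reads "cannot
tell" as "unchanged" would keep using its cached file, which matches the
behavior we see since the upgrade.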

I think it's unfortunate that we broke compatibility here (even if
this isn't usage by a normal svn client) for the sake of some relatively
small performance / load gain on the server. If we could get the old
behavior back with some Apache directive, that would be fine,
but there is no such option at the moment.

Also: if the Last-Modified header had been removed only for the
"internal GET urls" (like http://server/!svn/rvr/1/path), for
optimizing checkout (as executed by normal svn clients), that would
have been understandable. But why remove it for the "external GET
urls" (http://svnserver/path) as well? Those have nothing to do with
checkout load, those are only used by browsers or "tools using SVN as
a glorified file server" :-).
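
As a sketch of the distinction I mean, in Python for brevity: the "!svn"
marker is taken from the example URLs above, and the function names are
hypothetical. The real mod_dav_svn is written in C and decides this on
the resolved resource, not on a URL string, so this only illustrates the
policy, not an implementation.

```python
from urllib.parse import urlsplit

# Assumption: internal URLs are recognized by this path segment, as in
# the example http://server/!svn/rvr/1/path.
SPECIAL_SEGMENT = "!svn"


def is_internal_svn_url(url):
    """Return True if the URL path contains the special '!svn' segment."""
    segments = urlsplit(url).path.split("/")
    return SPECIAL_SEGMENT in segments


def should_send_last_modified(url):
    """Hypothetical policy: keep Last-Modified only for external URLs."""
    return not is_internal_svn_url(url)
```

Under this policy, checkouts by normal svn clients (internal URLs) would
keep the performance gain, while browsers and tools fetching
http://svnserver/path would get the header back.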

I am by no means an expert in HTTP standards, and various online
sources give different recommendations for these headers (ETag,
Last-Modified, ... request headers for conditional GETs, ...). But we
found an old discussion thread on the "dev_at_rapidsvn.tigris.org"
mailing list from 15 years ago, discussing "a very basic idea: let
mod_dav_svn set the Last-Modified HTTP header ..." [2]. Perhaps the
feature dates from back then (indicating that it wasn't an accidental
feature)?

Anyway, how about bringing this feature back in some form?
- Revert r1724790?
- or only for "external GET urls"?
- or only if some Apache directive is set?

Thoughts?

[1] http://ant.apache.org/ivy/history/2.5.0-rc1/resolver/url.html
[2] https://dev.rapidsvn.tigris.narkive.com/oRlt6xsW/last-modified-http-header-from-mod-dav-svn

-- 
Johan
Received on 2019-09-02 16:20:39 CEST

This is an archived mail posted to the Subversion Dev mailing list.
