On 08/03/2011 10:40 AM, Daniel Shahaf wrote:
> cmpilato_at_apache.org wrote on Wed, Aug 03, 2011 at 14:00:39 -0000:
>> + (As an aside, Serf's potential as a platform for future
>> + improvement remains unproven and doubtful. For example, HTTPv2
>> + removes canonical resource URLs, which works against the caching
>> + proxy concept that seems to be the strongest argument in favor of
>> + Serf's approach. But that's not strictly germane here.)
>
> That sounds odd for caching not to be taken into consideration in
> HTTPv2's design. And glancing at the httpv2 design notes suggests that
> it was explicitly a goal.
>
> Are you saying that somehow HTTPv2 actually made the cacheability
> situation worse in some cases? Or just that it doesn't make the
> situation as good as it promised to?

As I dig into this a bit, I realize that the situation isn't quite as bad as
I originally thought. But that's only because of a coding oversight on my
part. (Happy accident?) I'll explain below. Note that I'm assuming that
cacheability works best when the RA layers use a single canonical URL to
fetch a given resource.

Let's first talk about the cost of addressing any particular server
node@revision resource. In HTTPv1, clients couldn't just calculate the URL
of a resource -- they had to negotiate with the server using WebDAV/DeltaV
abstractions. Multiple roundtrips per calculation ... performance shutdown
... you get the picture. mod_dav_svn helps here by transmitting in its
update-style REPORT responses a "version resource URL", which the client
caches via the davprops store in the working copy to avoid future costly
lookups. HTTPv2 facilitates client-side construction of resource URLs
without server negotiation, therefore has no need of the davprops persistent
cache mechanism, and as such the code doesn't use the davprops stuff at all
when HTTPv2 is active.
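
A minimal sketch of what client-side URL construction could look like, in
Python for brevity. The `!svn/rvr/<REV>/<PATH>` layout is an assumption taken
from a reading of the HTTPv2 design notes, and the helper name and repository
root are hypothetical; this is not what ra_serf actually does.

```python
from urllib.parse import quote


def build_rev_root_url(repos_root, rev, path):
    """Build a URL for path@rev locally, with no server round trip.

    Hypothetical helper. The layout <repos_root>/!svn/rvr/<REV>/<PATH> is an
    assumption for illustration, not a confirmed part of the protocol.
    """
    rel = quote(path.strip("/"))  # percent-encode, keeping "/" separators
    return f"{repos_root.rstrip('/')}/!svn/rvr/{rev}/{rel}"


# The repository root below is made up for the example.
print(build_rev_root_url("http://svn.example.com/repos", 3, "/trunk/file.txt"))
```

Note that the revision in the result is whatever revision the client happens
to be asking about, which is not necessarily the revision in which the file
last changed.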

The second factor of interest here is the canonical URL issue. If I have a
file that was created in revision 1 and remains unchanged henceforward, a
client can address that file via any number of URLs. After all, file@1 ==
file@2 == file@3 == ..., right? mod_dav_svn again tries to help here by
normalizing the version resource URL that it sends to the client for a given
resource based on the created-path and created-rev of the resource. So no
matter which version of our file we're talking about, mod_dav_svn will
report its version resource URL as:

   .../!svn/ver/<CREATED-REV>/<CREATED-PATH>
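
The normalization described above can be sketched as follows. This is a toy
model, not mod_dav_svn's code: the `CHANGED_IN` history table and both
function names are hypothetical, copies and renames are ignored (so
created-path is assumed to equal the requested path), and the `...` prefix is
left elided exactly as in the template.

```python
# Toy history (hypothetical): path -> revisions in which that path changed.
CHANGED_IN = {"trunk/file.txt": [1]}


def created_rev(path, rev):
    """Return the latest revision <= rev in which path changed."""
    candidates = [r for r in CHANGED_IN[path] if r <= rev]
    if not candidates:
        raise LookupError(f"{path} does not exist at r{rev}")
    return max(candidates)


def version_resource_url(path, rev):
    """Map path@rev onto one canonical URL keyed by created-rev."""
    # created-path is assumed equal to path (no copies or renames modelled).
    return f".../!svn/ver/{created_rev(path, rev)}/{path}"


# file@1, file@2 and file@3 all collapse to the same URL.
for rev in (1, 2, 3):
    print(rev, version_resource_url("trunk/file.txt", rev))
```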

Here's where I think the current code falls short. While the update process
still pays attention to the canonical version resource URL transmitted by
the server (that was the happy accident ... ra_serf *could* be ignoring that
today in favor of self-constructed URLs), that URL isn't cached in the WC
any longer. This means that future (non-update-style) operations performed
by the client will be addressing the resources by some self-constructed,
probably-non-canonical URL. Stuff still works, of course, but this eats at
the cache-friendliness.
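
To illustrate the cache-friendliness point, here is a toy caching proxy keyed
purely by URL. Everything in it is hypothetical: the proxy class, the fetch
function, and in particular the self-constructed `!svn/rvr/<REV>/...` URLs,
which are only an assumed shape for what a client might build on its own.

```python
class ToyCachingProxy:
    """A cache keyed only by URL, counting hits and misses."""

    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url, fetch):
        if url in self.store:
            self.hits += 1
        else:
            self.misses += 1
            self.store[url] = fetch(url)  # go to the origin server
        return self.store[url]


def origin_fetch(url):
    # The file is unchanged since r1, so every URL yields the same bytes.
    return b"same file contents"


def replay(urls):
    """Send the requests through a fresh proxy; return (hits, misses)."""
    proxy = ToyCachingProxy()
    for url in urls:
        proxy.get(url, origin_fetch)
    return proxy.hits, proxy.misses


canonical = [".../!svn/ver/1/trunk/file.txt"] * 3
self_built = [f".../!svn/rvr/{rev}/trunk/file.txt" for rev in (1, 2, 3)]

print("canonical :", replay(canonical))
print("self-built:", replay(self_built))
```

With the canonical URL, the second and third requests are served from the
cache. With three different URLs for the same content, every request goes
back to the origin.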

Does that help to explain things better?

-- C-Mike

[SIDEBAR: It just occurred to me that the server is still transmitting
HTTPv1-style version resource URLs, not HTTPv2-style URLs ... I guess that's
a separate issue, though.]

--
C. Michael Pilato <cmpilato_at_collab.net>
CollabNet <> www.collab.net <> Distributed Development On Demand
Received on 2011-08-04 16:58:30 CEST