On Tue, Jun 5, 2012 at 11:55 PM, Justin Erenkrantz
<justin_at_erenkrantz.com> wrote:
> On Tue, Jun 5, 2012 at 8:28 AM, Mark Phippard <markphip_at_gmail.com> wrote:
>> Keep in mind that this is not about server load, they are looking at
>> this from bandwidth. So a cache in front of the server would not help
>> them at all. Eclipse.org has a 10MB Internet link that is almost
>> always saturated.
>>
>> Also, if it was not clear, Subversion is not involved here. This is a
>> plain Apache server.
>
> IMO, having a 10Mbps link (if that is indeed the case) is probably the
> root of the problem...that's ridiculously underprovisioned for a
> public site. Any type of update checks for a heavily used product no
> matter what the underlying protocol is would saturate that once they
> get enough users. An easy thing for them to have done (for example)
> is to shove the check on their mirrors or a CDN (hi Fastly!) or
> something similar.
It is possible I mis-reported the number. Eclipse already uses
mirrors extensively, and the product has built-in support to use the
mirrors for the checks. I believe the main issue for the Eclipse
servers is in servicing all of the development builds and
infrastructure, as these do not use the mirrors. That said, bandwidth
is always an issue, as you can see from all of the blog posts from
the webmaster:
http://eclipsewebmaster.blogspot.com/search?q=bandwidth
> This goes back to having a simple protocol which is the antithesis of
> the REPORT call in ra_neon. Having a simple straightforward protocol
> allows you to easily drop in caches - REPORT won't let you do that as
> the responses are too specific to the user to permit any type of
> caching which will limit your options when you do hit load.
They are already serving virtually the entire site out of a memory cache:
http://eclipsewebmaster.blogspot.com/search?q=cache
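As an aside, Justin's point about cacheability is easy to see in
miniature. Here is a toy sketch (the URLs and response bodies are
invented, and the per-user request body is folded into the cache key
for simplicity) of why a shared cache collapses a user-agnostic GET
to a single origin hit but gets no reuse out of per-user,
REPORT-style responses:

[[[
# Toy illustration: a shared cache can only reuse a response when the
# same cache key always maps to the same bytes.  Everything below is
# invented for illustration, not real eclipse.org or Subversion URLs.

cache = {}

def fetch(method, url, make_body):
    """Serve from the shared cache when possible, else hit the 'origin'."""
    key = (method, url)              # roughly what a plain HTTP cache keys on
    if key not in cache:
        cache[key] = make_body(url)  # one origin hit, shared by later clients
    return cache[key]

# Simple, user-agnostic GET: every client asks for the same thing,
# so a single cached entry serves them all.
for user in ("alice", "bob", "carol"):
    fetch("GET", "/updates/check.xml", lambda u: "latest=4.2")
print("origin hits, simple protocol:", len(cache))    # -> 1

# REPORT-style exchange whose answer depends on each client's state:
# no two clients share an entry, so the cache buys nothing.
cache.clear()
for user in ("alice", "bob", "carol"):
    fetch("REPORT", "/svn/!update?base=" + user, lambda u: "delta for " + u)
print("origin hits, per-user protocol:", len(cache))  # -> 3
]]]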
This issue was purely about using the available bandwidth better.
What I found interesting is that when you look at these simple small
requests down at the packet level and add them all up (when there are
millions of them every day), they suck up a surprising amount of the
available bandwidth. We have done good work with Serf and HTTPv2 to
eliminate a lot of the smaller requests we used to do. Hopefully this
is motivation for removing more of them, such as the ones Mike Pilato
mentioned. I always thought of all those HEAD and PROPFIND requests
from the point of view of a client on a high-latency connection, and
how they made things slower than they needed to be. I just thought it
was interesting to see how these small requests can add up on the
backend to consume a significant share of the packets that can be
sent/received in any one day. It is just food for thought.
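Just to put some very rough numbers on it, here is a back-of-envelope
sketch in Python. Every figure in it is an assumption for
illustration -- I have no measurements from the eclipse.org servers:

[[[
# Back-of-envelope sketch of how small requests add up on a link like
# the one discussed above.  Every number here is an assumption for
# illustration, not a measurement from eclipse.org.

REQUESTS_PER_DAY = 5_000_000   # assumed: "millions of these every day"
REQUEST_BYTES    = 500         # assumed size of one small request (headers only)
RESPONSE_BYTES   = 800         # assumed size of one small response
PER_EXCHANGE_OVERHEAD = 400    # rough TCP/IP + ACK overhead; ignores keep-alive

bytes_per_exchange = REQUEST_BYTES + RESPONSE_BYTES + PER_EXCHANGE_OVERHEAD
bytes_per_day      = REQUESTS_PER_DAY * bytes_per_exchange

LINK_MBPS = 10   # the figure being discussed above, if it is accurate
link_bytes_per_day = LINK_MBPS * 1_000_000 / 8 * 86_400

print("small-request traffic: %.1f GB/day" % (bytes_per_day / 1e9))
print("share of a %d Mbps link: %.1f%%"
      % (LINK_MBPS, 100.0 * bytes_per_day / link_bytes_per_day))
]]]

Obviously the real numbers depend entirely on the request mix, but it
shows how quickly millions of small exchanges eat into a 10 Mbps
link.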
--
Thanks
Mark Phippard
http://markphip.blogspot.com/
Received on 2012-06-06 17:19:23 CEST