(For Brian B.)
Well, before the server can send any updates at all, the client has to
communicate the revision numbers of all the files being updated. If
you update from the top of the APR tree, that's every file in APR :-)
Unfortunately, because revision numbers are per-file, CVS has to list
each one -- basically, it spits the contents of its CVS/Entries files
at the server. That's probably what's causing the delay. Building the
content to be shipped in /tmp first on the server side doesn't help
either.
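To make the per-file cost concrete, here is a rough sketch (not CVS's
actual client code) of a client turning its Entries files into a state
report. The file names, revisions, and the helper names parse_entries
and client_state_report are made up for illustration; only the
slash-separated layout of an Entries line is taken from CVS.

```python
def parse_entries(text):
    """Return {filename: revision} for the file lines of one Entries file."""
    revisions = {}
    for line in text.splitlines():
        # File lines look like /name/revision/timestamp/options/tagdate.
        # Directory lines start with "D" and carry no revision, so skip them.
        if not line.startswith("/"):
            continue
        fields = line.split("/")
        revisions[fields[1]] = fields[2]
    return revisions


def client_state_report(entries_by_dir):
    """Build one (path, revision) pair per file in the working copy."""
    report = []
    for directory, text in sorted(entries_by_dir.items()):
        for name, rev in sorted(parse_entries(text).items()):
            report.append((f"{directory}/{name}", rev))
    return report


# Hypothetical working copy: two directories, two files.
sample = {
    "apr": "/Makefile.in/1.12/Sat Oct 14 10:00:00 2000//\nD/include////\n",
    "apr/include": "/apr.h/1.40/Sat Oct 14 10:00:00 2000//\n",
}

for path, rev in client_state_report(sample):
    print(path, rev)
```

The report has exactly one entry per file, so its size (and the time
spent before the first update arrives) grows with the size of the tree
rather than with the number of files that actually changed.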
I doubt that making CVS parallelize the disk and network I/O is
practical. That's a lot of work to fix something we're hoping to make
obsolete anyway.
[Just a note: In Subversion, the client also has to communicate its
local state to the server, but does so efficiently by expressing a
single base version and the places where it differs from that base.
It doesn't have to list each file. And the server will not be
building no trees in /tmp no way. :-) ]
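A minimal sketch of the "base version plus differences" idea follows.
It is not Subversion's real wire format; compact_report, the paths,
and the revision numbers are all hypothetical, and the base is simply
assumed to be the revision of the top of the working copy.

```python
def compact_report(base_rev, working_copy):
    """Describe a working copy as a base revision plus the exceptions.

    working_copy maps each path to the revision the client has locally.
    Only paths whose revision differs from base_rev are listed.
    """
    differences = {
        path: rev for path, rev in working_copy.items() if rev != base_rev
    }
    return base_rev, differences


# Hypothetical working copy: 500 files at revision 100, one left at 98.
working_copy = {f"src/file{i}.c": 100 for i in range(500)}
working_copy["src/file7.c"] = 98

base, diffs = compact_report(100, working_copy)
print(base, diffs)
```

Here the client describes 500 files with one base number and a single
exception, so the size of the report tracks how mixed the working copy
is, not how many files it contains.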
-K
Brian Behlendorf <brian@collab.net> writes:
> I've noticed this happens too - and fwiw, I am responsible for the
> admin of apache.org, so I've got an interest in fixing this. =)
>
> Given that we have a few people here who are familiar with CVS internals,
> let me ask - before a cvs update starts updating the remote client, there
> is always this delay. What's it doing? Is it recursing through the
> client and server trees, noting where the deltas are first, before sending
> any data? It appears to be building a huge dir in /tmp for the
> specific process, building up the actual content to be shipped; I presume
> this means recursing through the whole tree, which for something like apr
> can be kinda big (and locus's I/O is pretty impacted). Karl/Jim, does
> this make sense? If so, is there a way to have that streamed, so that
> network I/O doesn't have to wait for all that disk I/O to complete first?
>
> Brian
Received on Sat Oct 21 14:36:12 2006