This is mainly to Mike Pilato, but I thought others might be
interested too:
Mike,
As we discussed, I'm working on efficiency issues in the new
[un]deltification scheme. In fs-test.c, I've tried several variations
in large_file_integrity(). The big data point is that if you set the
`filesize' to 102400, the retrievals of old revisions are quick, but
at 102401 (svndiff window size + 1), they get very *slow*, and
fibonaccically slower as the revisions get older. :-)
The test passes in both cases.
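To make the boundary concrete, here is a minimal sketch of the window arithmetic. It is not code from Subversion: `window_count()` is a hypothetical helper, and the window size of 102400 is only inferred from the "svndiff window size + 1" remark above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a Subversion function): how many windows of
   `window_size` bytes are needed to cover `filesize` bytes. */
size_t window_count(size_t filesize, size_t window_size)
{
    assert(window_size > 0);
    /* Round up: a single byte past a window boundary needs a new window. */
    return (filesize + window_size - 1) / window_size;
}
```

Under that assumption, a filesize of 102400 fits in one window, 102401 needs two, and ((3 * 102400) + 1) needs four.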
(As an aside, when I set filesize to ((3 * svndiff_window_size) + 1),
the test uses up an outrageous amount of memory and in fact hung my
machine. Yes, there is something nonlinear going on here.)
BUT, with it set to 102400, I've noticed a couple of weird things
going on:
1. rep_read_range() sometimes gets called with an `offset' parameter
of 102400. I don't yet know why that's happening; it shouldn't,
right? The highest offset in that file would be 102399.
2. rep_read_range() is also sometimes skipping the first window in a
delta because the window is "irrelevant" by virtue of
reconstructing text entirely before the requested range. This
might be related to the requests for offset 102400, since
obviously no window in the delta should be reconstructing
anything beyond that.
3. Oh wait, as I was typing this mail, I just figured out what's
happening with points 1 and 2 above. In get_file_digest() in
fs-test.c, we're requesting 100000 bytes at a time. So the first
request is fully satisfied, the next request is partially
satisfied, but then there's a third request due to the
conditional structure of the loop there, and that third request
will always start from one byte past the end of the file, and
return (of course) 0 bytes. Well, that explains that. Please
ignore all three points here.
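The effect described in point 3 can be reproduced with a small simulation. This is only a sketch: `sim_read()` and `count_requests()` are hypothetical stand-ins, not the actual code in fs-test.c, and the loop shape (keep requesting until a read returns 0 bytes) is an assumption about what get_file_digest() does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a ranged read: returns how many bytes are
   available starting at `offset`, capped at `len`. */
size_t sim_read(size_t filesize, size_t offset, size_t len)
{
    assert(len > 0);
    if (offset >= filesize)
        return 0;                       /* at or past end of file */
    size_t remaining = filesize - offset;
    return remaining < len ? remaining : len;
}

/* Issue `chunk`-byte requests until one returns 0 bytes.  Stores the
   offset of each request in `offsets` (up to `max` entries) and
   returns the total number of requests issued. */
size_t count_requests(size_t filesize, size_t chunk,
                      size_t *offsets, size_t max)
{
    size_t offset = 0, n = 0, got;
    do {
        got = sim_read(filesize, offset, chunk);
        if (n < max)
            offsets[n] = offset;
        n++;
        offset += got;
    } while (got > 0);                  /* only a 0-byte read ends the loop */
    return n;
}
```

With a filesize of 102400 and a chunk of 100000, this issues three requests, at offsets 0, 100000, and 102400. The last one starts one byte past the end of the file and returns 0 bytes.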
Dang, I can't believe I wasted all that time figuring out something so
obvious.
So that leaves me with only the data point mentioned first, about how
efficiency drops dramatically the moment we add a new window.
By the way, Ben and I lost a lot of time today because newton finally
failed (hardware failure, it looks like). You know how it's been
going down daily recently, for no reason. Well, today it started
going down hourly :-). After it couldn't even *boot up* a fresh
install of the latest stable FreeBSD, we decided to junk it and make
galois (which had been sitting idle) the new newton etc. Everything
is back online now; hopefully we won't be dropping off the net every
night.
Will be looking more at the fs stuff over the next couple of days. We
should talk on the phone too, since I won't be physically in the
office tomorrow.
Hope you're feeling better!
-K
Received on Sat Oct 21 14:36:43 2006