Sebastian Tusk <sebastian.tusk@gmx.net> writes:
> The file to commit is named test. The function performed is guessed.
>
> activity                                       function?           time
> ------
> read 512b blocks from "test"
> write 512b blocks to "test.svn_base.tmp"       COPY?               3m10s
> ------
> read 102400b blocks from "test.svn-base"
> write 4096b blocks to "temp.tmp"
> occasionally other reads from "test.svn-base"  COPY? DIFF?         2m23s
> ------
> read 512b blocks from "test.svn-base"          HASHING?            1m47s
> ------
> read 4096b blocks from "tempfile"              TRANSFER TO SERVER  7m8s
> ------
> read 512b blocks from "test.svn-base"          HASHING?            1m16s
> ------
> read 512b blocks from "test"
> read 512b blocks from "test.svn-base"          COMPARE?            7m46s
> ------
> read 512b blocks from "test.svn-base"
> write 512b blocks to "test.tmp.tmp"
> occasionally other reads from "test.svn-base"  COPY?               3m12s
> ------
> read 512b blocks from "test"
> read 512b blocks from "test.tmp"               COMPARE?            7m42s
>
> Those are all the activities that consume significant time. All reads
> and writes are over the full 680MB file. Only during TRANSFER TO SERVER
> is there any noticeable server activity. Altogether the commit command
> uses more than 13 million read or write operations.
>
> Are all these activities necessary?
As I understand it, the commit will:
- copy (i.e. read/write) the file to a temporary text-base so that it
  won't subsequently change during the rest of the commit.
- read the temporary text-base to get a checksum. I suppose we could
  combine this with the copy above, but that's quite complex and would
  involve modifying apr_file_copy (or not using it).
- read the temporary text-base to calculate the delta.
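
To illustrate what combining the copy with the checksum would look like, here is a minimal sketch in Python rather than against APR. The function name, the block size and the choice of MD5 are assumptions for illustration; this is not how apr_file_copy or the commit code is written.

```python
import hashlib


def copy_with_checksum(src_path, dst_path, block_size=1 << 20):
    """Copy src_path to dst_path in one pass and return the hex MD5
    digest of the bytes that were copied.

    Hypothetical helper: the name, the 1 MiB default block size and the
    use of MD5 are illustrative assumptions.
    """
    digest = hashlib.md5()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break  # end of file
            # Feed the same block to both the hash and the copy, so the
            # source is read only once.
            digest.update(block)
            dst.write(block)
    return digest.hexdigest()
```

Done this way, the separate read of the temporary text-base for the checksum goes away, at the cost of no longer using a generic file-copy routine.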
Once the commit has succeeded, the post-commit processing will:
- read the temporary text-base to calculate the checksum. There is a
  comment about whether we could reuse the previous checksum. Since
  the file is in the .svn area, this should be possible.
- read the working file and the temporary text-base to determine
  whether the working file has changed. This is necessary to determine
  whether the text-time in the entries file should be set to match the
  working file.
I make that 5 reads, but you seem to be getting more; I don't know why.
> Especially those after the transfer
> to the server is finished. Is it necessary to access two files in
> parallel in such small blocks? Using larger read/write blocks might
> improve the performance significantly without too much hassle.
Using BUFSIZ is probably wrong, particularly if it is 512.
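
For comparison, here is a rough sketch of a two-file compare that reads in larger blocks. It is Python, not the working-copy code, and the function name and the 1 MiB default are assumptions chosen for illustration.

```python
import os


def files_identical(path_a, path_b, block_size=1 << 20):
    """Return True if the two files have identical contents.

    Hypothetical helper: the name and the 1 MiB default block size are
    illustrative assumptions, not values taken from Subversion.
    """
    # Files of different sizes cannot be identical, so no data is read.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if block_a != block_b:
                return False  # first differing block ends the compare
            if not block_a:
                return True  # both files hit end of file together
```

Taking 680MB to mean 680 * 2^20 bytes, one pass over the file is about 1.39 million reads with 512-byte blocks, but only 680 reads with 1 MiB blocks.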
--
Philip Martin
Received on Thu Mar 9 21:30:38 2006