
Re: export, checkout, commit performance

From: Ivan Zhakov <chemodax_at_gmail.com>
Date: 2006-03-10 18:29:56 CET

On 3/10/06, Sebastian Tusk <sebastian.tusk@gmx.net> wrote:
> I repeated the test from my original post.
>
> Add/commit of a single file named "test".
> server: stock 1.3.0 svnserve, Windows XP, NTFS
> client: trunk build, Windows XP, NTFS
>
> 1. copy, read test, write .svn\tmp\text-base\test.svn-base.tmp (512b blocks)
> stack: apr_file_copy <- svn_io_copy_file <- svn_wc_transmit_text_deltas
> <- svn_client__do_commit
>
> 2. transmit, read .svn\tmp\text-base\test.svn-base (100kb blocks)
> stack: svn_txdelta__vdelta <- compute_window <- svn_txdelta_next_window
> <- svn_txdelta_send_txstream <- svn_wc_transmit_text_deltas <-
> svn_client__do_commit
>
> 3. hashing, read .svn\tmp\text-base\test.svn-base (512b blocks)
> stack: svn_io_file_checksum <- svn_wc_transmit_text_deltas <-
> svn_client__do_commit
>
> 4. hashing, read .svn\tmp\text-base\test.svn-base (512b blocks)
> stack: svn_io_file_checksum <- svn_wc_process_committed2
>
> 5. compare, read test and .svn\tmp\text-base\test.svn-base simultaneous
> (512b blocks)
> stack: contents_identical_p <- svn_io_files_contents_same_p <-
> compare_and_verify <- svn_wc__versioned_file_modcheck <-
> log_do_committed <- start_handler <- expat_start_handler <- doContent <-
> contentProcessor <- XML_ParseBuffer <- XML_Parse <- svn_xml_parse <-
> run_log <- svn_wc__run_log <- svn_wc_process_committed2
>
>
> Most obvious problem is that test.svn-base gets hashed two times. Is the
> comparison (point 5) necessary?
Steps (2) and (3) can easily be merged using the new stream
translation API, with a change like this (ignore the code style, it
is only for testing!):
Index: subversion/libsvn_wc/adm_crawler.c
===================================================================
--- subversion/libsvn_wc/adm_crawler.c (revision 18809)
+++ subversion/libsvn_wc/adm_crawler.c (working copy)
@@ -714,8 +714,9 @@
   apr_file_t *localfile = NULL;
   apr_file_t *basefile = NULL;
   const char *base_digest_hex = NULL;
- unsigned char digest[APR_MD5_DIGESTSIZE];
-
+ unsigned char *digest;
+ svn_stream_t *local_stream;
+
   /* Make an untranslated copy of the working file in the
      administrative tmp area because a) we want this to work even if
      someone changes the working file while we're generating the
@@ -823,9 +824,11 @@

   /* Create a text-delta stream object that pulls data out of the two
      files. */
+ local_stream = svn_stream_checksummed(svn_stream_from_aprfile(localfile, pool),
+ &digest, NULL, pool);
   svn_txdelta(&txdelta_stream,
               svn_stream_from_aprfile(basefile, pool),
- svn_stream_from_aprfile(localfile, pool),
+ local_stream,
               pool);

   /* Pull windows from the delta stream and feed to the consumer. */
@@ -834,18 +837,11 @@

   /* Close the two files */
   SVN_ERR(svn_io_file_close(localfile, pool));
+ SVN_ERR(svn_stream_close(local_stream));

   if (basefile)
     SVN_ERR(svn_wc__close_text_base(basefile, path, 0, pool));

- /* ### This is a pity. tmp_base was created with svn_io_copy_file()
- above, which uses apr_file_copy(), which probably called
- apr_file_transfer_contents(), which ran over every byte of the
- file and therefore could have computed a checksum effortlessly.
- But we're not about to change the interface of apr_file_copy(),
- so we'll have to run over the bytes again... */
- SVN_ERR(svn_io_file_checksum(digest, tmp_base, pool));
-
   /* Close the file baton, and get outta here. */
   return editor->close_file
     (file_baton, svn_md5_digest_to_cstring(digest, pool), pool);
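
The idea behind the patch, independent of the Subversion API, can be
sketched as a reader that hashes whatever it hands to its caller, so the
consumer (here the delta transmission) and the checksum share one pass
over the file. This is only a sketch under assumptions: checksum_reader_t,
checksum_reader_init, checksum_reader_read, consume_all and fnv1a are
hypothetical names, and FNV-1a stands in for MD5 purely to avoid external
dependencies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define FNV_OFFSET_BASIS 2166136261u
#define FNV_PRIME 16777619u

/* Hypothetical stand-in for a checksumming stream: wraps a FILE and
   folds every byte returned to the caller into a running hash.
   FNV-1a replaces MD5 here only to keep the sketch self-contained. */
typedef struct {
  FILE *file;
  uint32_t hash;
} checksum_reader_t;

void checksum_reader_init(checksum_reader_t *r, FILE *file)
{
  assert(r != NULL && file != NULL);
  r->file = file;
  r->hash = FNV_OFFSET_BASIS;
}

/* Read up to len bytes into buf and update the hash with exactly the
   bytes that were returned. */
size_t checksum_reader_read(checksum_reader_t *r, void *buf, size_t len)
{
  size_t n = fread(buf, 1, len, r->file);
  const unsigned char *p = buf;
  size_t i;

  for (i = 0; i < n; i++) {
    r->hash ^= p[i];
    r->hash *= FNV_PRIME;
  }
  return n;
}

/* Reference implementation: hash a whole memory buffer at once. */
uint32_t fnv1a(const void *data, size_t len)
{
  const unsigned char *p = data;
  uint32_t hash = FNV_OFFSET_BASIS;
  size_t i;

  for (i = 0; i < len; i++) {
    hash ^= p[i];
    hash *= FNV_PRIME;
  }
  return hash;
}

/* Stand-in for the consumer (the delta transmission in the patch):
   drains the reader in 512-byte blocks and returns the byte count. */
size_t consume_all(checksum_reader_t *r)
{
  unsigned char block[512];
  size_t total = 0;
  size_t n;

  while ((n = checksum_reader_read(r, block, sizeof block)) > 0)
    total += n;
  return total;
}
```

Once consume_all() returns, r.hash already covers the whole file, so no
second read of the data is needed to obtain the checksum.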

I also noticed that there is an additional checksum check when
non-fulltext deltas are sent. It can be avoided as well if it is moved
to just before the editor->close_file() call. This means the checksum
would be checked after transmission, but before closing the file. I
consider this acceptable.
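
The ordering described above (verify after transmission, but before
closing the file) might look roughly like the following. This is a
sketch, not the libsvn_wc code: verify_then_close, close_file_fn and
count_close are hypothetical names, and "expected" is assumed to be the
recorded checksum while "actual" is the digest gathered during
transmission.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DIGEST_SIZE 16 /* size of an MD5 digest in bytes */

/* Hypothetical callback type standing in for editor->close_file(). */
typedef int (*close_file_fn)(void *baton, const unsigned char *digest);

/* Compare the digest computed during transmission against the expected
   one, and only then close the file.
   Returns -1 on mismatch (close_file is not called); otherwise returns
   whatever close_file returns. A NULL expected digest skips the check. */
int verify_then_close(const unsigned char *expected,
                      const unsigned char *actual,
                      close_file_fn close_file,
                      void *baton)
{
  assert(actual != NULL && close_file != NULL);

  if (expected != NULL && memcmp(expected, actual, DIGEST_SIZE) != 0)
    return -1;

  return close_file(baton, actual);
}

/* Example callback: counts how often the file was closed. The baton is
   assumed to point at an int counter. */
int count_close(void *baton, const unsigned char *digest)
{
  (void)digest;
  ++*(int *)baton;
  return 0;
}
```

With this ordering a mismatch is still reported before the file is
closed, but the data has already gone over the wire by then, which is
the trade-off mentioned above.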

--
Ivan Zhakov
Received on Fri Mar 10 18:32:25 2006

This is an archived mail posted to the Subversion Dev mailing list.
