
Re: export, checkout, commit performance

From: Ivan Zhakov <chemodax_at_gmail.com>
Date: 2006-03-13 13:30:13 CET

On 3/10/06, Ivan Zhakov <chemodax@gmail.com> wrote:
> On 3/10/06, Sebastian Tusk <sebastian.tusk@gmx.net> wrote:
> > I repeated the test from my original post.
> >
> > Add/commit of a single file named "test".
> > server: stock 1.3.0 svnserve, Windows XP, NTFS
> > client: trunk build, Windows XP, NTFS
> >
> > 1. copy, read test, write .svn\tmp\text-base\test.svn-base.tmp (512b blocks)
> > stack: apr_file_copy <- svn_io_copy_file <- svn_wc_transmit_text_deltas
> > <- svn_client__do_commit
> >
> > 2. transmit, read .svn\tmp\text-base\test.svn-base (100kb blocks)
> > stack: svn_txdelta__vdelta <- compute_window <- svn_txdelta_next_window
> > <- svn_txdelta_send_txstream <- svn_wc_transmit_text_deltas <-
> > svn_client__do_commit
> >
> > 3. hashing, read .svn\tmp\text-base\test.svn-base (512b blocks)
> > stack: svn_io_file_checksum <- svn_wc_transmit_text_deltas <-
> > svn_client__do_commit
> >
> > 4. hashing, read .svn\tmp\text-base\test.svn-base (512b blocks)
> > stack: svn_io_file_checksum <- svn_wc_process_committed2
> >
> > 5. compare, read test and .svn\tmp\text-base\test.svn-base simultaneous
> > (512b blocks)
> > stack: contents_identical_p <- svn_io_files_contents_same_p <-
> > compare_and_verify <- svn_wc__versioned_file_modcheck <-
> > log_do_committed <- start_handler <- expat_start_handler <- doContent <-
> > contentProcessor <- XML_ParseBuffer <- XML_Parse <- svn_xml_parse <-
> > run_log <- svn_wc__run_log <- svn_wc_process_committed2
> >
> >
> > Most obvious problem is that test.svn-base gets hashed two times. Is the
> > comparison (point 5) necessary?
> Steps (2) and (3) can easily be merged using the new stream translation
> API, with a change like this (ignore the code style, it is only for
> testing!):
> Index: subversion/libsvn_wc/adm_crawler.c
> ===================================================================
> --- subversion/libsvn_wc/adm_crawler.c (revision 18809)
> +++ subversion/libsvn_wc/adm_crawler.c (working copy)
> @@ -714,8 +714,9 @@
> apr_file_t *localfile = NULL;
> apr_file_t *basefile = NULL;
> const char *base_digest_hex = NULL;
> - unsigned char digest[APR_MD5_DIGESTSIZE];
> -
> + unsigned char *digest;
> + svn_stream_t *local_stream;
> +
> /* Make an untranslated copy of the working file in the
> administrative tmp area because a) we want this to work even if
> someone changes the working file while we're generating the
> @@ -823,9 +824,11 @@
>
> /* Create a text-delta stream object that pulls data out of the two
> files. */
> + local_stream =
> + svn_stream_checksummed(svn_stream_from_aprfile(localfile, pool),
> + &digest, NULL, pool);
> svn_txdelta(&txdelta_stream,
> svn_stream_from_aprfile(basefile, pool),
> - svn_stream_from_aprfile(localfile, pool),
> + local_stream,
> pool);
>
> /* Pull windows from the delta stream and feed to the consumer. */
> @@ -834,18 +837,11 @@
>
> /* Close the two files */
> SVN_ERR(svn_io_file_close(localfile, pool));
> + SVN_ERR(svn_stream_close(local_stream));
>
> if (basefile)
> SVN_ERR(svn_wc__close_text_base(basefile, path, 0, pool));
>
> - /* ### This is a pity. tmp_base was created with svn_io_copy_file()
> - above, which uses apr_file_copy(), which probably called
> - apr_file_transfer_contents(), which ran over every byte of the
> - file and therefore could have computed a checksum effortlessly.
> - But we're not about to change the interface of apr_file_copy(),
> - so we'll have to run over the bytes again... */
> - SVN_ERR(svn_io_file_checksum(digest, tmp_base, pool));
> -
> /* Close the file baton, and get outta here. */
> return editor->close_file
> (file_baton, svn_md5_digest_to_cstring(digest, pool), pool);
>
Committed in r18867. Thanks, Sebastian. It would be great if you could
do the same analysis for other operations such as checkout and update.
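
For readers who want the idea behind the patch without the Subversion
types, here is a minimal sketch in plain C. It is not the committed
code: checksummed_reader, checksummed_read and fnv1a_update are
hypothetical names, and a 32-bit FNV-1a hash stands in for MD5 so the
example needs only the standard library. The point is that the
checksum is updated as a side effect of each read, so the consumer of
the data (the delta generator in the patch) never triggers a second
pass over the file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define FNV1A_OFFSET 2166136261u
#define FNV1A_PRIME  16777619u

/* Fold len bytes into a running 32-bit FNV-1a hash (stand-in for MD5). */
static uint32_t fnv1a_update(uint32_t hash, const void *data, size_t len)
{
    const unsigned char *p = data;
    size_t i;

    for (i = 0; i < len; i++) {
        hash ^= p[i];
        hash *= FNV1A_PRIME;
    }
    return hash;
}

/* Hypothetical wrapper: a file plus the checksum of all bytes read so far. */
typedef struct {
    FILE *fp;
    uint32_t hash;
} checksummed_reader;

static void checksummed_reader_init(checksummed_reader *r, FILE *fp)
{
    assert(r != NULL && fp != NULL);
    r->fp = fp;
    r->hash = FNV1A_OFFSET;
}

/* Read up to len bytes and update the checksum with exactly the bytes
   that were returned to the caller. Returns the number of bytes read. */
static size_t checksummed_read(checksummed_reader *r, void *buf, size_t len)
{
    size_t n;

    assert(r != NULL && buf != NULL);
    n = fread(buf, 1, len, r->fp);
    r->hash = fnv1a_update(r->hash, buf, n);
    return n;
}
```

Once the consumer has read to end of file through checksummed_read(),
r.hash covers the whole file, which is the role digest plays after the
stream is closed in the patch above.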

>
> I should also mention that there is an additional checksum check when
> non-fulltext deltas are sent, which can also be avoided by moving it
> before the editor->close_file() call. This means the checksum would be
> checked after transmission, but before closing the file. I consider
> this acceptable.
>
After a more detailed analysis I found that this is impossible:
text_editor wants the base checksum BEFORE receiving the delta. But we
can calculate the checksum while copying the file to the temporary base.
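
A rough sketch of what "checksum during copy" could look like, again
in plain C and under the same assumptions as before: copy_with_checksum
and hash_bytes are hypothetical names, not Subversion or APR functions,
and FNV-1a stands in for MD5. Every block is hashed right after it is
written, so the checksum is ready as soon as the copy finishes, before
any delta is sent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define HASH_OFFSET 2166136261u
#define HASH_PRIME  16777619u

/* Fold len bytes into a running 32-bit FNV-1a hash (stand-in for MD5). */
static uint32_t hash_bytes(uint32_t hash, const void *data, size_t len)
{
    const unsigned char *p = data;
    size_t i;

    for (i = 0; i < len; i++) {
        hash ^= p[i];
        hash *= HASH_PRIME;
    }
    return hash;
}

/* Copy src to dst in a single pass, hashing each block as it goes.
   Returns 0 on success and stores the checksum in *hash_out;
   returns -1 on a read or write error and leaves *hash_out untouched. */
static int copy_with_checksum(FILE *src, FILE *dst, uint32_t *hash_out)
{
    unsigned char buf[4096];
    uint32_t hash = HASH_OFFSET;
    size_t n;

    assert(src != NULL && dst != NULL && hash_out != NULL);

    while ((n = fread(buf, 1, sizeof buf, src)) > 0) {
        if (fwrite(buf, 1, n, dst) != n)
            return -1;
        hash = hash_bytes(hash, buf, n);
    }
    if (ferror(src))
        return -1;

    *hash_out = hash;
    return 0;
}
```

The trade-off is that the copy routine now knows about checksums, which
is the interface change the removed "### This is a pity" comment in the
patch was reluctant to make in apr_file_copy().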

Step (5) looks odd to me. We check for modifications in order to decide
on the right timestamp. But IMHO the presence of the temporary file
already means the text was modified. Is that correct?

--
Ivan Zhakov
Received on Mon Mar 13 13:31:09 2006

This is an archived mail posted to the Subversion Dev mailing list.
