[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Flushing directory log accumulator during pool cleanup (Re: svn commit: r23342 - trunk/subversion/libsvn_wc)

From: David Glasser <glasser_at_davidglasser.net>
Date: 2007-11-02 19:55:54 CET

Super-late commit review!

In r23342, a log accumulator was added to the directory baton; instead
of making one "log" file per entry, there's one big log accumulator in
each directory baton, which is flushed at various times (before
descending into subtrees, etc) and executed at "close_directory" time.

That's fine. Except there's one issue here: this revision also adds a
flush_log call to the cleanup_dir_baton APR pool cleanup handler. And
this can be called when Subversion is dying due to an error or signal.
This means that something like merge_file could have only added some
of its loggy entries to the accumulator, but the log can be executed
anyway! This causes the problem Hyrum reported at

  http://svn.haxx.se/dev/archive-2007-11/0078.shtml

I think we should just remove flush_log from the cleanup handler so
that the cleanup handler only runs logs that are guaranteed to be
"complete". (Honestly, I'm not sure how good an idea it is to be
running logs during cleanup anyway... seems like a lot of work to be
doing after an error is found.) I am going to do that and see if the
tests still pass. I've noted the particular line in the diff below.

(Completely orthogonally, it doesn't look like the any of the file or
directory baton pools are ever cleared/destroyed until the main edit
baton is. Is that acceptable memory usage?)

--dave

On 2/4/07, lundblad@tigris.org <lundblad@tigris.org> wrote:
> Author: lundblad
> Date: Sun Feb 4 12:09:32 2007
> New Revision: 23342
>
> Log:
> For checkout/update etc., avoid creating one working copy 'log' file per
> versioned file by accumulating log entries in memory, flushing it
> when closing the directory and before going into subdirectories.
>
> For large working copies, this saves a lot of creation/reading/removing small
> files, which turns out to be rather expensive, especially on remote
> filesystems.
>
> Suggested by: Henner Zeller <hzeller@google.com>
> me
>
>
> * subversion/libsvn_wc/update_editor.c
> (struct dir_baton): Add log_accum field.
> (make_dir_baton): Initialize the log_accum field.
> (flush_log): New function.
> (cleanup_dir_baton): Flush the current log before running logs.
> (open_directory, add_directory): Flush parent directory's log before
> opening the new directory.
> (close_directory): Write to the accumulated log of the dir baton instead
> of a local strinbuf. Flush the accumulated log before running the logs.
> (close_file): Write to the accumulated log of the parent directory instead
> of creating our own log file.
>
>
> Modified:
> trunk/subversion/libsvn_wc/update_editor.c
>
> Modified: trunk/subversion/libsvn_wc/update_editor.c
> URL: http://svn.collab.net/viewvc/svn/trunk/subversion/libsvn_wc/update_editor.c?pathrev=23342&r1=23341&r2=23342
> ==============================================================================
> --- trunk/subversion/libsvn_wc/update_editor.c (original)
> +++ trunk/subversion/libsvn_wc/update_editor.c Sun Feb 4 12:09:32 2007
> @@ -162,6 +162,9 @@
> /* The current log file number. */
> int log_number;
>
> + /* The current log buffer. */
> + svn_stringbuf_t *log_accum;
> +
> /* The pool in which this baton itself is allocated. */
> apr_pool_t *pool;
> };
> @@ -235,6 +238,27 @@
> return entry->url;
> }
>
> +/* Flush accumulated log entries to a log file on disk for DIR_BATON and
> + * increase the log number of the dir baton.
> + * Use POOL for temporary allocations. */
> +static svn_error_t *
> +flush_log(struct dir_baton *db, apr_pool_t *pool)
> +{
> + if (! svn_stringbuf_isempty(db->log_accum))
> + {
> + svn_wc_adm_access_t *adm_access;
> +
> + SVN_ERR(svn_wc_adm_retrieve(&adm_access, db->edit_baton->adm_access,
> + db->path, pool));
> + SVN_ERR(svn_wc__write_log(adm_access, db->log_number, db->log_accum,
> + pool));
> + db->log_number++;
> + svn_stringbuf_setempty(db->log_accum);
> + }
> +
> + return SVN_NO_ERROR;
> +}
> +
> /* An APR pool cleanup handler. This runs the log file for a
> directory baton. */
> static apr_status_t
> @@ -244,23 +268,27 @@
> svn_error_t *err;
> apr_status_t apr_err;
> svn_wc_adm_access_t *adm_access;
> + apr_pool_t *pool = apr_pool_parent_get(db->pool);
>
> - /* If there are no log files to write, return immediately. */
> - if (db->log_number == 0)
> - return APR_SUCCESS;
> -
> - err = svn_wc_adm_retrieve(&adm_access, db->edit_baton->adm_access,
> - db->path, apr_pool_parent_get(db->pool));
> -
> - if (! err)
> + err = flush_log(db, pool);

The above line is the one that concerns me.

> + if (! err && db->log_number > 0)
> {
> - err = svn_wc__run_log(adm_access, NULL, apr_pool_parent_get(db->pool));
> -
> + err = svn_wc_adm_retrieve(&adm_access, db->edit_baton->adm_access,
> + db->path, pool);
> +
> if (! err)
> - return APR_SUCCESS;
> + {
> + err = svn_wc__run_log(adm_access, NULL, pool);
> +
> + if (! err)
> + return APR_SUCCESS;
> + }
> }
> -
> - apr_err = err->apr_err;
> +
> + if (err)
> + apr_err = err->apr_err;
> + else
> + apr_err = APR_SUCCESS;
> svn_error_clear(err);
> return apr_err;
> }
> @@ -363,6 +391,7 @@
> d->add_existed = FALSE;
> d->bump_info = bdi;
> d->log_number = 0;
> + d->log_accum = svn_stringbuf_create("", pool);
>
> apr_pool_cleanup_register(d->pool, d, cleanup_dir_baton,
> cleanup_dir_baton_child);
> @@ -513,8 +542,6 @@
> return SVN_NO_ERROR;
> }
>
> -
> -
> struct file_baton
> {
> /* The global edit baton. */
> @@ -1100,6 +1127,9 @@
> struct dir_baton *db = make_dir_baton(path, eb, pb, TRUE, pool);
> svn_node_kind_t kind;
>
> + /* Flush the log for the parent directory before going into this subtree. */
> + SVN_ERR(flush_log(pb, pool));
> +
> /* Semantic check. Either both "copyfrom" args are valid, or they're
> NULL and SVN_INVALID_REVNUM. A mixture is illegal semantics. */
> if ((copyfrom_path && (! SVN_IS_VALID_REVNUM(copyfrom_revision)))
> @@ -1299,6 +1329,9 @@
>
> svn_wc_adm_access_t *adm_access;
>
> + /* Flush the log for the parent directory before going into this subtree. */
> + SVN_ERR(flush_log(pb, pool));
> +
> /* kff todo: check that the dir exists locally, find it somewhere if
> its not there? Yes, all this and more... And ancestor_url and
> ancestor_revision need to get used. */
> @@ -1419,9 +1452,6 @@
> to deal with them. */
> if (regular_props->nelts || entry_props->nelts || wc_props->nelts)
> {
> - /* to hold log messages: */
> - svn_stringbuf_t *entry_accum = svn_stringbuf_create("", db->pool);
> -
> if (regular_props->nelts)
> {
> /* If recording traversal info, then see if the
> @@ -1478,25 +1508,23 @@
> adm_access, NULL,
> NULL /* use baseprops */,
> regular_props, TRUE, FALSE,
> - db->pool, &entry_accum),
> + db->pool, &db->log_accum),
> _("Couldn't do property merge"));
> }
>
> - SVN_ERR(accumulate_entry_props(entry_accum, NULL,
> + SVN_ERR(accumulate_entry_props(db->log_accum, NULL,
> adm_access, SVN_WC_ENTRY_THIS_DIR,
> entry_props, pool));
>
> - SVN_ERR(accumulate_wcprops(entry_accum, adm_access,
> + SVN_ERR(accumulate_wcprops(db->log_accum, adm_access,
> SVN_WC_ENTRY_THIS_DIR, wc_props, pool));
> -
> - /* Write our accumulation of log entries into a log file */
> - SVN_ERR(svn_wc__write_log(adm_access, db->log_number, entry_accum, pool));
> }
>
> - /* Run the log. */
> + /* Flush and run the log. */
> + SVN_ERR(flush_log(db, pool));
> SVN_ERR(svn_wc__run_log(adm_access, db->edit_baton->diff3_cmd, db->pool));
> db->log_number = 0;
> -
> +
> /* We're done with this directory, so remove one reference from the
> bump information. This may trigger a number of actions. See
> maybe_bump_dir_info() for more information. */
> @@ -2503,7 +2531,6 @@
> svn_wc_notify_state_t content_state, prop_state;
> svn_wc_notify_lock_state_t lock_state;
> svn_wc_adm_access_t *adm_access;
> - svn_stringbuf_t *log_accum;
>
> if (fb->skipped)
> {
> @@ -2531,11 +2558,7 @@
> SVN_ERR(svn_wc_adm_retrieve(&adm_access, eb->adm_access,
> parent_path, pool));
>
> - /* Accumulate log commands in this buffer until we're ready to
> - write the log. */
> - log_accum = svn_stringbuf_create("", pool);
> -
> - SVN_ERR(merge_file(log_accum,
> + SVN_ERR(merge_file(fb->dir_baton->log_accum,
> &content_state,
> &prop_state,
> &lock_state,
> @@ -2551,12 +2574,6 @@
> fb->last_changed_date,
> pool));
>
> - /* Write our accumulation of log entries into a log file */
> - SVN_ERR(svn_wc__write_log(adm_access, fb->dir_baton->log_number,
> - log_accum, pool));
> -
> - fb->dir_baton->log_number++;
> -
> /* We have one less referrer to the directory's bump information. */
> SVN_ERR(maybe_bump_dir_info(eb, fb->bump_info, pool));
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: svn-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: svn-help@subversion.tigris.org
>
>

-- 
David Glasser | glasser_at_davidglasser.net | http://www.davidglasser.net/
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Fri Nov 2 19:56:07 2007

This is an archived mail posted to the Subversion Dev mailing list.