[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Is mod_dav_svn safe for use in a threaded MPM?

From: Malcolm Rowe <malcolm-svn-dev_at_farside.org.uk>
Date: 2007-04-17 11:00:19 CEST

On Mon, Apr 16, 2007 at 02:22:18PM -0700, Eric Gillespie wrote:
> I've updated the patch to account for Malcolm's and Ben's
> comments. The activity file is created as a temporary file and
> then renamed into place with the MD5-encoded file name, and
> read_txn (renamed :) handles ESTALE. The transaction name is now
> terminated with a newline.
>

Thanks - nearly there, see comments below.

> [[[
> Stop using apr_dbm to manage the DAV activities database which maps
> DAV activity ids to Subversion transaction names. Instead, use a
> simple directory structure, with activity ids as file names and
> transaction names as file contents.
>
> * subversion/mod_dav_svn/activity.c
> (ACTIVITY_DB): Change to 'dav/activities.d' so as not to conflict
> with the DBM file 'dav/activities'.
> (escape_activity): New helper to escape activity for safe use as a
> file name.
> (read_txn): New helper to return the txn from an activity file.
> (dav_svn__get_txn): Make this a simple wrapper around _read_txn .

"read_txn", now.

> (dav_svn__delete_activity): Use read_txn to get the transaction
> name; delete the file from the activities directory.
> (dav_svn__store_activity): Create the activities directory if it
> does not exist and create the activity file.
> ]]]
>
> Index: subversion/mod_dav_svn/activity.c
> ===================================================================
> --- subversion/mod_dav_svn/activity.c (revision 24594)
> +++ subversion/mod_dav_svn/activity.c (working copy)
> @@ -16,11 +16,16 @@
> * ====================================================================
> */
>
> -#include <apr_dbm.h>
> +#include <errno.h>
>
> +#include <apr_md5.h>
> +
> #include <httpd.h>
> #include <mod_dav.h>
>
> +#include "svn_error.h"
> +#include "svn_io.h"
> +#include "svn_md5.h"
> #include "svn_path.h"
> #include "svn_fs.h"
> #include "svn_repos.h"
> @@ -28,59 +33,111 @@
> #include "dav_svn.h"
>
>
> -#define ACTIVITY_DB "dav/activities"
> +#define ACTIVITY_DB "dav/activities.d"
>
>
> -const char *
> -dav_svn__get_txn(const dav_svn_repos *repos, const char *activity_id)
> +static const char *
> +escape_activity(const char *activity_id, apr_pool_t *pool)
> {

Doc-comment for this function would be good :-)

> - apr_dbm_t *dbm;
> - apr_status_t status;
> - const char *pathname;
> - apr_datum_t key;
> - apr_datum_t value;
> - const char *txn_name = NULL;
> + unsigned char digest[APR_MD5_DIGESTSIZE];
> + apr_md5(digest, activity_id, strlen(activity_id));
> + return svn_md5_digest_to_cstring_display(digest, pool);
> +}
>
> - pathname = svn_path_join(repos->fs_path, ACTIVITY_DB, repos->pool);
> - status = apr_dbm_open(&dbm, pathname, APR_DBM_READONLY,
> - APR_OS_DEFAULT, repos->pool);
> - if (status != APR_SUCCESS)
> +
> +/* FSFS transaction ids:
> + 19 bytes for svn_revnum_t (room for 32 or 64 bit values)
> + + 1 byte for '-'
> + + 5 bytes (up to 99999)
> + + 1 terminating null / newline
> + = 26 bytes
> + BDB transaction ids are base-36, bounded by bdb MAX_KEY_SIZE, but
> + realistically, 26 bytes ought to be enough.
> + */
> +#define TXN_LEN 26
> +
> +static const char *
> +read_txn(const dav_svn_repos *repos, const char *pathname)

Ditto.

> +{
> + apr_file_t *activity_file;
> + apr_pool_t *iterpool = svn_pool_create(repos->pool);
> + apr_size_t len = TXN_LEN;
> + svn_error_t *err = SVN_NO_ERROR;
> + char *txn_name = apr_palloc(repos->pool, len);;
> + int i;
> +
> + /* Try up to 10 times to read the txn name, retrying on ESTALE
> + (stale NFS file handle because of dav_svn__store_activity
> + renaming the activity file into place).
> + */
> + for (i = 0; i < 10; i++)
> {
> - /* ### let's just assume that any error means the DB doesn't exist,
> - ### therefore, the activity/transaction doesn't exist */
> - return NULL;
> - }
> + svn_error_clear(err);
> + svn_pool_clear(iterpool);
>
> - key.dptr = (char *)activity_id;
> - key.dsize = strlen(activity_id) + 1; /* null-term'd */
> - if (apr_dbm_exists(dbm, key))
> - {
> - status = apr_dbm_fetch(dbm, key, &value);
> - if (status != APR_SUCCESS)
> + err = svn_io_file_open(&activity_file, pathname, APR_READ,
> + APR_OS_DEFAULT, iterpool);
> + if (err && err->apr_err != APR_SUCCESS)
> {
> - /* ### again: assume failure means it doesn't exist */
> - apr_dbm_close(dbm);
> +#ifdef ESTALE
> + if (APR_TO_OS_ERROR(err->apr_err) == ESTALE)
> + continue;
> +#endif
> + /* ### let's just assume that any error means the
> + ### activity/transaction doesn't exist */
> return NULL;

You just leaked 'err'.

> }
> - txn_name = apr_pstrdup(repos->pool, value.dptr); /* null-term'd */
> - apr_dbm_freedatum(dbm, value);
> +
> + err = svn_io_read_length_line(activity_file, txn_name, &len, NULL);

'len' needs to be initialised immediately before this statement, in case
the first iteration returned ESTALE and borked the original value.

> + if (err && err->apr_err != APR_SUCCESS)
> + {
> +#ifdef ESTALE
> + if (APR_TO_OS_ERROR(err->apr_err) == ESTALE)
> + continue;
> +#endif
> + return NULL;

'err' leaked again.

> + }
> +
> + err = svn_io_file_close(activity_file, iterpool);
> + if (err && err->apr_err != APR_SUCCESS)
> + {
> +#ifdef ESTALE
> + if (APR_TO_OS_ERROR(err->apr_err) != ESTALE)
> +#endif
> + return NULL;

Why have different behaviour for ESTALE/non-ESTALE? While I can't think
of a valid non-ESTALE return, it seems like we should treat them the
same - whether that means returning what we have or returning NULL, I'm
not sure. (especially as ESTALE means that someone just overwrote the
transaction name, so we know we have stale data).

> + }
> +
> + break;
> }
> + svn_pool_destroy(iterpool);
>
> - apr_dbm_close(dbm);
> + if (err)
> + {
> + svn_error_clear(err);
> + return NULL;
> + }
>
> return txn_name;
> }
>
>
> +const char *
> +dav_svn__get_txn(const dav_svn_repos *repos, const char *activity_id)
> +{
> + const char *pathname;
> + pathname = svn_path_join_many(repos->pool, repos->fs_path, ACTIVITY_DB,
> + escape_activity(activity_id, repos->pool),
> + NULL);
> + return read_txn(repos, pathname);
> +}
> +
> +
> dav_error *
> dav_svn__delete_activity(const dav_svn_repos *repos, const char *activity_id)
> {
> dav_error *err = NULL;
> - apr_dbm_t *dbm;
> apr_status_t status;
> const char *pathname;
> - apr_datum_t key;
> - apr_datum_t value;
> svn_fs_txn_t *txn;
> const char *txn_name;
> svn_error_t *serr;
> @@ -89,23 +146,12 @@
> 404. If the transaction is not present or is immutable, return a
> 204. For all other failures, return a 500. */
>
> - /* Open the activities database. */
> - pathname = svn_path_join(repos->fs_path, ACTIVITY_DB, repos->pool);
> - status = apr_dbm_open(&dbm, pathname, APR_DBM_READWRITE,
> - APR_OS_DEFAULT, repos->pool);
> - if (status != APR_SUCCESS)
> - return dav_new_error(repos->pool, HTTP_NOT_FOUND, 0,
> - "could not open activities database.");
> -
> - /* Get the activity from the activity database. */
> - key.dptr = (char *)activity_id;
> - key.dsize = strlen(activity_id) + 1; /* null-term'd */
> - status = apr_dbm_fetch(dbm, key, &value);
> - if (status == APR_SUCCESS)
> - txn_name = value.dptr;
> - else
> + pathname = svn_path_join_many(repos->pool, repos->fs_path, ACTIVITY_DB,
> + escape_activity(activity_id, repos->pool),
> + NULL);
> + txn_name = read_txn(repos, pathname);
> + if (txn_name == NULL)
> {
> - apr_dbm_close(dbm);
> return dav_new_error(repos->pool, HTTP_NOT_FOUND, 0,
> "could not find activity.");
> }
> @@ -131,7 +177,7 @@
> err = dav_svn__convert_err(serr, HTTP_INTERNAL_SERVER_ERROR,
> "could not open transaction.",
> repos->pool);
> - goto cleanup;
> + return err;
> }
> }
> else
> @@ -142,24 +188,19 @@
> err = dav_svn__convert_err(serr, HTTP_INTERNAL_SERVER_ERROR,
> "could not abort transaction.",
> repos->pool);
> - goto cleanup;
> + return err;
> }
> }
> }
> -
> +
> /* Finally, we remove the activity from the activities database. */
> - status = apr_dbm_delete(dbm, key);
> + status = apr_file_remove(pathname, repos->pool);
> if (status)
> {
> err = dav_new_error(repos->pool, HTTP_INTERNAL_SERVER_ERROR, 0,
> "unable to remove activity.");
> - goto cleanup;
> }

I'm curious why (here and elsewhere) we're calling direct into APR
rather than using the normal Subversion calls - I'd have expected
something like this:

err = svn_io_remove_file(pathname, repos->pool);
if (err)
  {
    svn_error_clear(err);
    err = dav_new_error(repos->pool, HTTP_INTERNAL_SERVER_ERROR, 0,
                        "unable to remove activity.");
  }

... which might be more verbose, but makes it clear that we're reporting
one error in place of another (presumably for some reason?).

And elsewhere, we get an APR status for one of the APR file functions
and then wrap it into an svn_error_t* - seems like it'd be easier just
to call the svn_io_xxx function in the first place.

>
> - cleanup:
> - apr_dbm_freedatum(dbm, value);
> - apr_dbm_close(dbm);
> -
> return err;
> }
>
> @@ -169,40 +210,72 @@
> const char *activity_id,
> const char *txn_name)
> {
> - apr_dbm_t *dbm;
> + const char *final_path, *tmp_path;
> + svn_error_t *err;
> apr_status_t status;
> - const char *pathname;
> - apr_datum_t key;
> - apr_datum_t value;
> + apr_file_t *activity_file;
>
> - pathname = svn_path_join(repos->fs_path, ACTIVITY_DB, repos->pool);
> - status = apr_dbm_open(&dbm, pathname, APR_DBM_RWCREATE,
> - APR_OS_DEFAULT, repos->pool);
> - if (status != APR_SUCCESS)
> + /* Create activities directory if it does not yet exist. */
> + err = svn_io_make_dir_recursively(svn_path_join_many(repos->pool,
> + repos->fs_path,
> + ACTIVITY_DB,
> + NULL),
> + repos->pool);
> + if (err != NULL)
> + return dav_svn__convert_err(err, HTTP_INTERNAL_SERVER_ERROR,
> + "could not open files.",

Seems like an odd error message to choose?

> + repos->pool);
> +
> + final_path = svn_path_join_many(repos->pool, repos->fs_path, ACTIVITY_DB,
> + escape_activity(activity_id, repos->pool),
> + NULL);
> + err = svn_io_open_unique_file2(&activity_file, &tmp_path, final_path,
> + ".tmp", svn_io_file_del_none, repos->pool);
> + if (err)
> {
> - svn_error_t *serr = svn_error_wrap_apr(status, "Can't open activity db");
> + svn_error_t *serr = svn_error_quick_wrap(err, "Can't open activity db");
> + return dav_svn__convert_err(serr, HTTP_INTERNAL_SERVER_ERROR,
> + "could not open files.",
> + repos->pool);
> + }
>
> + if (apr_file_printf(activity_file, "%s\n", txn_name) < 1)

Why '< 1'? Why not check the length against what we expect?

Looks like apr_file_printf() doesn't actually check that we wrote as
many bytes as were requested - it just assumes that if apr_file_puts()
didn't return an error, it must have written everything. Yuck.

Separately, I think it'd be a good idea to write the original activity
name after that transaction name so that we have the option to enumerate
them later if we choose to (and also to aid in debugging).

Perhaps just build up a string with apr_psprintf, and then pass it to
svn_io_file_write_full()? That fixes the above problems too.

> + {
> + svn_error_t *serr =
> + svn_error_wrap_apr(status, "Can't write to activity db");
> +
> + /* Try to remove the tmp file, but we already have an error... */
> + svn_error_clear(svn_io_remove_file(tmp_path, repos->pool));

Need to try closing the temporary file first - you can't remove open files
on Windows.

> +
> return dav_svn__convert_err(serr, HTTP_INTERNAL_SERVER_ERROR,
> - "could not open dbm files.",
> + "could not write files.",
> repos->pool);
> }
>
> - key.dptr = (char *)activity_id;
> - key.dsize = strlen(activity_id) + 1; /* null-term'd */
> - value.dptr = (char *)txn_name;
> - value.dsize = strlen(txn_name) + 1; /* null-term'd */
> - status = apr_dbm_store(dbm, key, value);
> - apr_dbm_close(dbm);
> + status = apr_file_close(activity_file);
> if (status != APR_SUCCESS)
> {
> svn_error_t *serr =
> svn_error_wrap_apr(status, "Can't close activity db");

(This is an example of where it might make more sense just to call
svn_io_file_close() directly - it'll do the work of putting the filename
into the message).

>
> + svn_error_clear(svn_io_remove_file(tmp_path, repos->pool));
> +
> return dav_svn__convert_err(serr, HTTP_INTERNAL_SERVER_ERROR,
> - "could not close dbm files.",
> + "could not close files.",
> repos->pool);
> }
>
> + err = svn_io_file_rename(tmp_path, final_path, repos->pool);
> + if (err)
> + {
> +
> + svn_error_clear(svn_io_remove_file(tmp_path, repos->pool));
> +
> + return dav_svn__convert_err(err, HTTP_INTERNAL_SERVER_ERROR,
> + "could not close files.",

We're renaming, not closing - "replace files", maybe?

> + repos->pool);
> + }
> +
> return NULL;
> }
>

Regards,
Malcolm

  • application/pgp-signature attachment: stored
Received on Tue Apr 17 11:01:51 2007

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.