[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Make pre-commit hook have access to lock tokens

From: Karl Fogel <kfogel_at_red-bean.com>
Date: Tue, 19 Aug 2008 12:08:06 -0400

"Chia-liang Kao" <clkao_at_clkao.org> writes:
> Thanks for the review. Here's the updated patch and log. Comments
> regarding the review are at the end.
>
> [[[
> Make pre-commit hook take lock tokens associated with the commit from stdin.
>
> * subversion/include/svn_fs_private.h
> Include svn_fs.h, as we need svn_fs_access_t.
> (svn_fs_access_get_lock_tokens): New.

As we discussed in IRC just now, there is some confusion between log
message and patch, and internally within the patch, as to whether this
is 'svn_fs_access_get_lock_tokens' or 'svn_fs__access_get_lock_tokens'.
Should be the latter, of course.

> * subversion/include/svn_fs.h
> * subversion/libsvn_fs/access.c
> Document that lock tokens cannot contain newlines
> (svn_fs_access_add_lock_token2): New. Like svn_fs_access_add_lock_token but
> takes the path as well.
> (svn_fs_access_get_lock_tokens): New.
>
> * subversion/libsvn_ra_local/ra_plugin.c
> (svn_ra_local__get_commit_editor): Call svn_fs_access_add_lock_token2
> instead, with the full paths in repository.
>
> * subversion/libsvn_repos/hooks.c
> (lock_token_content): New, for turning path -> lock token hashs into
> serialized format used in pre-commit hook.
> (svn_repos__hooks_pre_commit): If there are lock tokens, make the stdin of
> the hook script take a series of lock tokens associated with this commit.
>
> * subversion/libsvn_repos/repos.c: Document the new pre-commit feature that
> lock tokens are passed in from stdin.
>
> * subversion/mod_dav_svn/repos.c
> (get_resource): Call svn_fs_access_add_lock_token2 intead.
>
> * subversion/mod_dav_svn/version.c
> (dav_svn__push_locks): Call svn_fs_access_add_lock_token2 instead.
>
> * subversion/svnserve/serve.c
> (add_lock_tokens): Call svn_fs_access_add_lock_token2 instead.
>
> ]]]
>
>
>>> + svn_stringbuf_appendstr(lock_str,
>>> + svn_stringbuf_createf(pool, "%s|%s\n",
>>> + svn_path_uri_autoescape(path, pool),
>>> + token));
>>> + }
>>
>> What happens with a path that contains a "|" (which is perfectly legal
>> in Subversion)?
>
> That was what the uri-escaping intended to avoid.

Sounds good, I just missed that (oops).

>> You call the handle 'stdin_handle', but what it will actually get filled
>> with is some tmp file handle. That could confuse the reader. Maybe
>> just call it hook_input, or something else that won't make the reader
>> think it's a handle to stdin itself?
>
> Well, all other functions in hook.c are calling it stdin_handle, I
> guess I just didn't want to be special. ;) But I am happy to change
> if it feels confusing.

No, let's go for consistency. If we change its name, we should change
it in all the functions, and in a separate commit just for that.

>>> --- subversion/mod_dav_svn/repos.c (revision 32317)
>>> +++ subversion/mod_dav_svn/repos.c (local)
>>> @@ -1783,8 +1783,13 @@
>>> }
>>>
>>> do {
>>> - serr = svn_fs_access_add_lock_token(access_ctx,
>>> - list->locktoken->uuid_str);
>>> + /* Note the path/lock pairs are only for lock token checking in access,
>>> + and the relative path is not actually accurate as it contains the !svn bits.
>>> + However we are only using the tokens for the access control. */
>>> +
>>> + serr = svn_fs_access_add_lock_token2(access_ctx, relative,
>>> + list->locktoken->uuid_str);
>>> +
>>
>> Does this mean that the paths seen by the hook will be different
>> depending on the RA layer? That the "!svn" bits will still be in the
>> paths here? That would be bad; a hook is an interface, and the data we
>> feed that interface has to be consistent across all RA layers.
>
> No, the codepath never actually invokes the hook, it's the MERGE
> request that would, only the tokens are used.

Okay. The last sentence of the comment is ambiguous; it should say

  "However, we're using only the tokens anyway (for access control)."

I know that seems the same, but the way English binding works out, it
changes the meaning a bit :-).

Continuing on to the patch:

> === subversion/include/private/svn_fs_private.h
> ==================================================================
> --- subversion/include/private/svn_fs_private.h (revision 32395)
> +++ subversion/include/private/svn_fs_private.h (local)
> @@ -20,6 +20,8 @@
> #ifndef SVN_FS_PRIVATE_H
> #define SVN_FS_PRIVATE_H
>
> +#include "svn_fs.h"
> +
> #ifdef __cplusplus
> extern "C" {
> #endif /* __cplusplus */

I don't think it's a big deal, but this header addition isn't mentioned
in the log message. Yes, I'm an evil, evil nitpicker :-).

> @@ -38,6 +40,15 @@
> */
> #define SVN_FS__TXN_MAX_LEN 220
>
> +/** Retrieve the lock-tokens associated in the context @a access_ctx.
> + * The tokens are in a hash keyed with * <tt>const char *</tt> tokens,
> + * and with * <tt>const char *</tt> values for the paths associated.
> + *
> + * @since New in 1.6. */
> +
> +apr_hash_t *
> +svn_fs__access_get_lock_tokens(svn_fs_access_t *access_ctx);
> +

There are some extra "*"s in your doc string, I think.

> #ifdef __cplusplus
> }
> #endif /* __cplusplus */
> === subversion/include/svn_fs.h
> ==================================================================
> --- subversion/include/svn_fs.h (revision 32395)
> +++ subversion/include/svn_fs.h (local)
> @@ -471,12 +471,25 @@
> svn_fs_access_t *access_ctx);
>
>
> -/** Push a lock-token @a token into the context @a access_ctx. The
> - * context remembers all tokens it receives, and makes them available
> - * to fs functions. The token is not duplicated into @a access_ctx's
> - * pool; make sure the token's lifetime is at least as long as @a
> - * access_ctx. */
> +/** Push a lock-token @a token associated with path @path into the
> + * context @a access_ctx. The context remembers all tokens it
> + * receives, and makes them available to fs functions. The token and
> + * path are not duplicated into @a access_ctx's pool; make sure the
> + * token's lifetime is at least as long as @a access_ctx.
> + *
> + * @since New in 1.6. */
> +
> svn_error_t *
> +svn_fs_access_add_lock_token2(svn_fs_access_t *access_ctx,
> + const char *path,
> + const char *token);
> +/**
> + * Same as svn_fs_access_add_lock_token2(), but with @a path set to value 1.
> + *
> + * @deprecated Provided for backward compatibility with the 1.1 API.
> + */
> +
> +svn_error_t *
> svn_fs_access_add_lock_token(svn_fs_access_t *access_ctx,
> const char *token);

A thought: does this mean that if a caller uses the old
svn_fs_access_add_lock_token() API, but *also* uses the new
svn_fs__access_get_lock_tokens() API, that the values from the latter
will be unusuable, because they'll be 1? If so, maybe the doc string
for svn_fs__access_get_lock_tokens() should point this out, as well as
the doc string for svn_fs_access_add_lock_token().

Or maybe svn_fs__access_get_lock_tokens() should look for the special
value 1 (yuck, sigh) and return NULL in that case? (This would need to
be documented too, of course.)

> @@ -1794,7 +1807,7 @@
> * string by the client), and is required to make further use of the
> * lock (including removal of the lock.) A lock-token can also be
> * queried to return a svn_lock_t structure that describes the details
> - * of the lock.
> + * of the lock. lock-tokens must not contain any newline character.
> *
> * Locks are not secret; anyone can view existing locks in a
> * filesystem. Locks are not omnipotent: they can broken and stolen

This change is not mentioned in the log message either. This is
actually not nit-picking: as a reader of the change, I'm wondering what
the motivation behind the above doc string tweak is.

Actually, it would be even better to say in the doc string itself why
the no-newline requirement is imposed: talk about the pre-commit format.

> === subversion/libsvn_fs/access.c
> ==================================================================
> --- subversion/libsvn_fs/access.c (revision 32395)
> +++ subversion/libsvn_fs/access.c (local)
> @@ -77,11 +77,23 @@
>
>
> svn_error_t *
> +svn_fs_access_add_lock_token2(svn_fs_access_t *access_ctx,
> + const char *path,
> + const char *token)
> +{
> + apr_hash_set(access_ctx->lock_tokens,
> + token, APR_HASH_KEY_STRING, (void *) path);
> + return SVN_NO_ERROR;
> +}

Why is the '(void *)' cast necessary?

> +svn_error_t *
> svn_fs_access_add_lock_token(svn_fs_access_t *access_ctx,
> const char *token)
> {
> - apr_hash_set(access_ctx->lock_tokens,
> - token, APR_HASH_KEY_STRING, (void *) 1);
> + return svn_fs_access_add_lock_token2(access_ctx, (const char *) 1, token);
> +}

Yup. Should be 'svn_fs__access_add_lock_token', as we discussed.

> - return SVN_NO_ERROR;
> +apr_hash_t *
> +svn_fs_access_get_lock_tokens(svn_fs_access_t *access_ctx) {
> + return access_ctx->lock_tokens;
> }

Gosh, that one was easy :-).

> === subversion/libsvn_ra_local/ra_plugin.c
> ==================================================================
> --- subversion/libsvn_ra_local/ra_plugin.c (revision 32395)
> +++ subversion/libsvn_ra_local/ra_plugin.c (local)
> @@ -649,11 +649,13 @@
> hi = apr_hash_next(hi))
> {
> void *val;
> - const char *token;
> + const char *path, *token;
> + const void *key;
>
> - apr_hash_this(hi, NULL, NULL, &val);
> + apr_hash_this(hi, &key, NULL, &val);
> + path = svn_path_join(sess->fs_path->data, (const char *)key, pool);
> token = val;
> - SVN_ERR(svn_fs_access_add_lock_token(fs_access, token));
> + SVN_ERR(svn_fs_access_add_lock_token2(fs_access, (const char *)path, token));
> }
> }
> }

'path' is already a 'const char *'. Why is the second cast necessary?

> === subversion/libsvn_repos/hooks.c
> ==================================================================
> --- subversion/libsvn_repos/hooks.c (revision 32395)
> +++ subversion/libsvn_repos/hooks.c (local)
> @@ -28,6 +28,7 @@
> #include "svn_utf.h"
> #include "repos.h"
> #include "svn_private_config.h"
> +#include "private/svn_fs_private.h"

Change not mentioned in doc string (but it's obvious what the change is
for, so this is really just a matter of taste, I admit).

> @@ -367,7 +368,39 @@
> return SVN_NO_ERROR;
> }
>
> +/* Turns the lock_tokens hash into the file handle in the format
> + documented in the pre-commit template */
>
> +static svn_error_t *
> +lock_token_content(apr_file_t **handle, apr_hash_t *lock_tokens, apr_pool_t *pool)
> +{

Can we have a real doc string please? It's fine to refer to the
pre-commit template, but a doc string needs to say exactly what happens
with all the arguments. Among other things, this should document the
format of 'lock_tokens' (referring out to svn_fs_access_get_lock_tokens
or some other function is fine; the main thing is that a reader of the
code does not have to spend time wondering).

> + svn_stringbuf_t *lock_str = svn_stringbuf_create("LOCK-TOKENS:\n", pool);
> + apr_hash_index_t *hi;
> +
> + for (hi = apr_hash_first(pool, lock_tokens); hi;
> + hi = apr_hash_next(hi))
> + {
> + void *val;
> + const char *path, *token;
> +
> + apr_hash_this(hi, (void *)&token, NULL, &val);
> + path = val;
> + svn_stringbuf_appendstr(lock_str,
> + svn_stringbuf_createf(pool, "%s|%s\n",
> + svn_path_uri_autoescape(path, pool),
> + token));
> + }
> +
> + svn_stringbuf_appendcstr(lock_str, "\n");
> + SVN_ERR(create_temp_file(handle,
> + svn_string_create_from_buf(lock_str, pool),
> + pool));
> +
> + return SVN_NO_ERROR;
> +}

So, hmm, should we worry about the fact that this isn't done streamily?
That is, we put the whole list in a string and then put the string in a
tmp file. Does this open up a D.o.S attack on the server? Is there any
way to do it streamily?

(I should have asked this before, sorry.)

> svn_error_t *
> svn_repos__hooks_pre_commit(svn_repos_t *repos,
> const char *txn_name,
> @@ -383,14 +416,29 @@
> else if (hook)
> {
> const char *args[4];
> + svn_fs_access_t *access_ctx;
> + apr_file_t *stdin_handle = NULL;
>
> args[0] = hook;
> args[1] = svn_path_local_style(svn_repos_path(repos, pool), pool);
> args[2] = txn_name;
> args[3] = NULL;
>
> - SVN_ERR(run_hook_cmd(SVN_REPOS__HOOK_PRE_COMMIT, hook, args, NULL,
> - pool));
> + SVN_ERR(svn_fs_get_access(&access_ctx, repos->fs));
> + if (access_ctx)
> + {
> + apr_hash_t *lock_tokens = svn_fs__access_get_lock_tokens(access_ctx);
> + if (apr_hash_count(lock_tokens)) {
> + SVN_ERR(lock_token_content(&stdin_handle, lock_tokens, pool));
> + }
> + }
> +
> + if (!stdin_handle)
> + SVN_ERR(svn_io_file_open(&stdin_handle, SVN_NULL_DEVICE_NAME,
> + APR_READ, APR_OS_DEFAULT, pool));
> +
> + SVN_ERR(run_hook_cmd(SVN_REPOS__HOOK_PRE_COMMIT, hook, args,
> + stdin_handle, pool));
> }

Looks good, except for above streaminess concern.

> === subversion/libsvn_repos/repos.c
> ==================================================================
> --- subversion/libsvn_repos/repos.c (revision 32395)
> +++ subversion/libsvn_repos/repos.c (local)
> @@ -381,6 +381,15 @@
> "# [1] REPOS-PATH (the path to this repository)" NL
> "# [2] TXN-NAME (the name of the txn about to be committed)" NL
> "#" NL
> +"# [STDIN] LOCK-TOKENS ** the lock tokens are passed via STDIN." NL
> +"#" NL
> +"# If the STDIN contains a line matching LOCK-TOKENS and a newline," NL

No, it's "LOCK-TOKENS:" followed by a newline (document the colon).

Might be better just to write:

   '... "LOCK-TOKENS:\n" (where "\n" denotes a single newline) ...'

> +"# the lines following it are used as the lock tokens for this commit," NL
> +"# until it reaches a line containing only a newline character." NL
> +"#" NL
> +"# Each line of lock token is of the format: URI-escaped path, a '|'" NL
> +"# character, the lock token string, and a new line character." NL
> +"#" NL

It might be good to just show an example here.

> === subversion/mod_dav_svn/repos.c
> ==================================================================
> --- subversion/mod_dav_svn/repos.c (revision 32395)
> +++ subversion/mod_dav_svn/repos.c (local)
> @@ -1783,8 +1783,13 @@
> }
>
> do {
> - serr = svn_fs_access_add_lock_token(access_ctx,
> - list->locktoken->uuid_str);
> + /* Note the path/lock pairs are only for lock token checking in access,
> + and the relative path is not actually accurate as it contains the !svn bits.
> + However we are only using the tokens for the access control. */
> +
> + serr = svn_fs_access_add_lock_token2(access_ctx, relative,
> + list->locktoken->uuid_str);
> +

See my above comment about the "However..." comment.

> --- subversion/mod_dav_svn/version.c (revision 32395)
> +++ subversion/mod_dav_svn/version.c (local)
> @@ -1181,12 +1181,13 @@
>
> for (hi = apr_hash_first(pool, locks); hi; hi = apr_hash_next(hi))
> {
> - const char *token;
> + const char *path, *token;
> + const void *key;
> void *val;
> - apr_hash_this(hi, NULL, NULL, &val);
> - token = val;
> + apr_hash_this(hi, &key, NULL, &val);
> + path = key, token = val;
>
> - serr = svn_fs_access_add_lock_token(fsaccess, token);
> + serr = svn_fs_access_add_lock_token2(fsaccess, path, token);
> if (serr)
> return dav_svn__convert_err(serr, HTTP_INTERNAL_SERVER_ERROR,
> "Error pushing token into filesystem.",

Looks good.

> --- subversion/svnserve/serve.c (revision 32395)
> +++ subversion/svnserve/serve.c (local)
> @@ -1125,7 +1125,7 @@
> sb, conn, pool);
>
> token = token_item->u.string->data;
> - SVN_ERR(svn_fs_access_add_lock_token(fs_access, token));
> + SVN_ERR(svn_fs_access_add_lock_token2(fs_access, path, token));
> }

Looks good.

-Karl

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-08-19 18:08:26 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.