[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH v3] Reject checkouts to existing directory

From: Patrick Steinhardt <ps_at_pks.im>
Date: Mon, 21 Nov 2016 11:57:01 +0100

On Sat, Nov 19, 2016 at 05:53:26AM +0000, Daniel Shahaf wrote:
> Patrick Steinhardt wrote on Wed, Nov 09, 2016 at 12:54:08 +0100:
> > +++ b/subversion/svn/checkout-cmd.c
> > @@ -61,6 +61,144 @@
> > Is this the same as CVS? Does it matter if it is not?
> > */
> >
> > +static svn_error_t *
> > +listing_cb(void *baton,
> > + const char *path,
> > + const svn_dirent_t *dirent,
> > + const svn_lock_t *lock,
> > + const char *abs_path,
> > + const char *external_parent_url,
> > + const char *external_target,
> > + apr_pool_t *scratch_pool)
> > +{
> > + size_t *count = (size_t *) baton;
> > +
> > + (*count)++;
> > +
>
> Would it make sense, at this point, to raise an error
> (SVN_ERR_CEASE_INVOCATION or SVN_ERR_ITER_BREAK) if count>1, so as to
> not iterate the remaining dirents?

Yes, it does make sense indeed.

>
> > + return SVN_NO_ERROR;
> > +}
> > +
> > +/* Check if the target directory which should be checked out to
> > + * is a valid target directory. A target directory is valid if we
> > + * are sure that a checkout to the directory will not create any
> > + * unexpected situations for the user. This is the case if one of
> > + * the following criteria is true:
> > + *
> > + * - the target directory does not exist
> > + * - the target directory is empty
> > + * - the target directory is the same repository with the same
> > + * relative URL as the one that is to be checked out (e.g. we
> > + * resume a checkout)
> > + * - the repository that is to be checked out is empty
> > + */
> > +static svn_error_t *
> > +verify_checkout_target(const char *repo_url,
> > + const char *target_dir,
> > + const svn_opt_revision_t *peg_revision,
> > + const svn_opt_revision_t *revision,
> > + svn_client_ctx_t *ctx,
> > + apr_pool_t *pool)
> > +{
> > + svn_node_kind_t kind;
> > + apr_pool_t *scratch_pool;
> > + const char *absolute_path;
> > + size_t count = 0;
> > + svn_wc_status3_t *status;
> > + svn_error_t *error;
>
> Variables of type 'svn_error_t *' are conventionally named 'err'.

Okay.

> > + svn_boolean_t empty;
> > +
> > + scratch_pool = svn_pool_create(pool);
> > +
> > + SVN_ERR(svn_dirent_get_absolute(&absolute_path,
> > + target_dir,
> > + pool));
> > +
> > + SVN_ERR(svn_io_check_path(absolute_path,
> > + &kind,
> > + pool));
> > +
> > + /* If the directory does not exist yet, it will be created
> > + * anew and is thus considered valid. */
> > + if (kind == svn_node_none)
> > + return SVN_NO_ERROR;
> > +
> > + /* Checking out to a file or symlink cannot work. */
> > + if (kind != svn_node_dir)
> > + return svn_error_createf(
> > + SVN_ERR_ILLEGAL_TARGET,
> > + NULL,
> > + _("Rejecting checkout to existing %s '%s'"),
> > + svn_node_kind_to_word(kind),
>
> Would using 'kind' through %s cause a problem for translators?
> It would result in a nominative English noun being used in a dative
> local-language context.

Is there any recommendation what to use instead here?

> > + absolute_path);
>
> Paths in error messages should be converted to local style
> (with svn_dirent_local_style()).

Okay.

> > + error = svn_wc_status3(&status,
> > + ctx->wc_ctx,
> > + absolute_path,
> > + pool,
> > + scratch_pool);
> > +
> > + /* If the remote repository UUID and working copy UUID are the
> > + * same and if the relative paths inside the repository match,
> > + * we assume that the command is a resumed checkout. */
> > + if (error == SVN_NO_ERROR)
> > + {
> > + if (repo_relpath
> > + && !strcmp(status->repos_uuid, repo_uuid)
> > + && !strcmp(status->repos_relpath, repo_relpath))
> > + return SVN_NO_ERROR;
> > + }
> > + else
> > + {
> > + svn_error_clear(error);
>
> Shouldn't you check for a specific error code here? I.e.,
>
> if (err == SVN_NO_ERROR)
> ⋮
> else if (err && err->apr_err == SVN_ERR_WC_NOT_WORKING_COPY)
> svn_error_clear();
> else
> SVN_ERR(err)
>
> (possibly a few more error codes in the second branch)?
>
> To catch the case that ABSOLUTE_PATH denotes a wedged working copy,
> for example.

Yeah, I was reasoning about the same thing here while I wrote
this. It probably makes sense to do.

> > + SVN_ERR(svn_client_list4(repo_url,
> > + pool));
> > +
> > + /* If the remote respotiory has more than one file (that is, it
>
> Typo for "repository".
>
> > + * is not an empty root directory only), we refuse to check out
> > + * into a non-empty target directory. Otherwise Subversion
> > + * would create tree conflicts. */
> > + if (count > 1)
> > + return svn_error_createf(
> > + SVN_ERR_ILLEGAL_TARGET,
> > + NULL,
> > + _("Rejecting checkout of non-empty repository into non-empty directory '%s'"),
>
> I think it would be more accurate to s/repository/repository URL/
> or s/repository/repository directory/.
>
> > + absolute_path);
>
> Again svn_dirent_local_style().
>
> > +
> > + svn_pool_clear(scratch_pool);
> > +
>
> This pool usage is not idiomatic.
>
> Does this function (verify_checkout_target()) return a value to its
> caller? If it does, then let its signature have two pools. If it does
> not, then let it take a single 'apr_pool_t *scratch_pool' argument, use
> that pool for both pool arguments of callees, and leave that pool for
> its (verify_checkout_target()'s) caller to clear. (In this case,
> probably at apr_terminate().)

No, `verify_checkout_target()` does not return a value, it only
indicates an error if the target is not valid. The reason why I
created the scratch pool was that I have to pass both a result
and scratch pool to `svn_client_get_repos_root`, where I wasn't
exactly sure which semantics one may expect when passing the same
pool as both result and scratch pool to a callee.

So if I understand it correctly I can expect
`svn_client_get_repos_root` to behave correct in that case, so I
can pass a scratch pool twice?

> The rationale for this is (quoting HACKING):
>
> Functions should not create/destroy pools for their operation; they
> should use a pool provided by the caller. Again, the caller knows
> more about how the function will be used, how often, how many times,
> etc. thus, it should be in charge of the function's memory usage.
>
> >
>
> As to the actual business logic, it seems okay to me (meaning I'm +0),
> however, I'll leave it for others to review it in detail.
>
> Thanks for the detailed comments and log message.
>
> Cheers,
>
> Daniel
>
> P.S. Two more minor points: One, listing_cb() might grow a "This
> implements the `svn_client_list_func2_t' interface." docstring. Two,
> I found the variable name "absolute_path" somewhat opaque; I would
> suggest a name based on the semantics, rather than the data type, e.g.,
> "target" or "candidate_target_directory".

Thanks for your review.

Regards
Patrick

Received on 2016-11-21 11:57:09 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.