[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Issue #1935: svn status too verbose with svn:externals definitions

From: <kfogel_at_collab.net>
Date: 2005-08-25 03:49:08 CEST

This is looking pretty good, IMHO. Below are some comments, mostly
about minor matters. However, there is one C89 compliance issue, one
interesting pool allocation point, and one correctness issue having to
do with UTF8 versus native encoding.

Scott Palmer <scott.palmer@2connected.org> writes:
> [[[
> Fix issue #1935: svn status too verbose with svn:externals definitions
>
> Unless the verbose option (-v) is specified, the message "Performing
> status on external item at ..." is suppressed. When verbose and
> quiet (-q) options are specified together the message will be printed
> only when there will be relevant status output for that external
> folder.
>
> * subversion/clients/cmdline/cl.h
> (svn_cl__status_baton): New. Replaces status_baton in status-cmd.c.
> (svn_cl__notify_baton): New. Replaces notify_baton in notify.c.
> The above structures were needed to communicate information about
> the command line options to svn_wc_notify_status_external in notify.c
> and to provide a place to remember information about the message that
> will be accessible should the code in status-cmd.c determine that the
> message is needed when verbose and quiet options are specified
> together.
>
> While I generally didn't like to tie these two structures together or
> duplicate the printing of the same message in two places, it appears
> to be the simplest, most straightforward method to handle issue
> #1935. Other possible implementations may require API changes that
> should probably be considered outside the scope of this single issue.

Those explanations should really be comments at the appropriate places
in the code, I think (for example, at each duplicate message, and/or
at the new structures). Readers of the code won't necessarily look at
this log message, but they'll see code comments, and anyone changing
(say) one of the messages needs to know to change the other as well.

> * subversion/clients/cmdline/notify.c
> Modified processing of svn_wc_notify_status_external to suppress
> output unless -v is specified (without -q). If the -q option was
> specified then save enough information to regenerate the message from
> status_cmd.c

This is a good, concise description, and in this case it's probably
enough to enable a clueful reader to understand the code change.
However, when we commit this, we'll need to make it follow the log
message guidelines. This is not merely procedure for procedure's
sake; if this log message followed those guidelines, it would be even
*more* helpful to those trying to understand the change, e.g., me :-).

The guidelines are at:

   http://subversion.tigris.org/hacking.html#log-messages

They're pretty easy to grok, I think; let us know if anything there is
unclear.

> * subversion/clients/cmdline/status-cmd.c
> If there is status to print and -q was specified with -v, print
> the svn_wc_notify_status_external message from here, since this is
> the point where we know that relevant status output for the external
> folder will happen. There does not appear to be a straightforward
> way to eliminate printing this message from two possible locations
> (here and from notify.c)
> ]]]

(Same comment about the log message.)

Yeah, I tried to find another way myself and was unable to avoid a
large, complex change, so I sympathize.

> Index: subversion/clients/cmdline/cl.h
> ===================================================================
> --- subversion/clients/cmdline/cl.h (revision 15888)
> +++ subversion/clients/cmdline/cl.h (working copy)
> @@ -150,7 +150,38 @@
> svn_client_ctx_t *ctx;
> } svn_cl__cmd_baton_t;
>
> +struct svn_cl__notify_baton;
> +struct svn_cl__status_baton
> +{
> + /* These fields all correspond to the ones in the
> + svn_cl__print_status() interface. */
> + svn_boolean_t detailed;
> + svn_boolean_t show_last_committed;
> + svn_boolean_t skip_unrecognized;
> + svn_boolean_t repos_locks;
> + apr_pool_t *pool;
>
> + svn_boolean_t had_print_error; /* To avoid printing lots of errors if we get
> + errors while printing to stdout */
> + svn_boolean_t xml_mode;
> + struct svn_cl__notify_baton *nb;
> +};

svn_cl__status_baton should get a general comment explaining what it's
for. Ideally, the old 'struct status_baton' of which this is the heir
would have had such a doc string already, but anyway it becomes even
more important when the struct is pulled out of a status-specific
file.

In general, a lot of the log message's explanatory text about the
situation really belongs near these two structures.

When documenting an individual field (e.g., 'had_print_error'), don't
just state its purpose, state its precise semantics. For example,
"/* TRUE if blah blah blah, FALSE otherwise. */" It's fine to also
describe the purpose, if that's not clear from the semantics, just
don't leave the semantics out.

> +/* Baton for notify and friends. */
> +struct svn_cl__notify_baton
> +{
> + svn_boolean_t received_some_change;
> + svn_boolean_t is_checkout;
> + svn_boolean_t is_export;
> + svn_boolean_t suppress_final_line;
> + svn_boolean_t sent_first_txdelta;
> + svn_boolean_t in_external;
> + svn_boolean_t had_print_error; /* Used to not keep printing error messages
> + when we've already had one print error. */
> + int extern_path_prefix_length; /* Used to recreate external folder name */
> + struct svn_cl__status_baton *sb;
> +};
> +

Same comment about individual fields.

Where it says "Baton for notify and friends", it's not clear what kind
of entity "notify" is. Is it a function? A subcommand? A general
concept? :-) I think a more verbose doc string for the overall
structure would be helpful.

I realize you inherited that comment from the old code. It was too
terse even in its original context; it should have said "notify()"
with parens, probably. But in its new context, it's positively
cryptic, because the function notify() is not local to svn_cl.h.

> /* Declare all the command procedures */
> svn_opt_subcommand_t
> svn_cl__add,
> Index: subversion/clients/cmdline/notify.c
> ===================================================================
> --- subversion/clients/cmdline/notify.c (revision 15888)
> +++ subversion/clients/cmdline/notify.c (working copy)
> @@ -33,27 +33,12 @@
>
> #include "svn_private_config.h"
>
> -
> -/* Baton for notify and friends. */
> -struct notify_baton
> -{
> - svn_boolean_t received_some_change;
> - svn_boolean_t is_checkout;
> - svn_boolean_t is_export;
> - svn_boolean_t suppress_final_line;
> - svn_boolean_t sent_first_txdelta;
> - svn_boolean_t in_external;
> - svn_boolean_t had_print_error; /* Used to not keep printing error messages
> - when we've already had one print error. */
> -};

By the way, the reason I'm not complaining about the fact that most of
the individual fields don't (didn't) have descriptions is that their
names make their purposes and semantics obvious. But my feeling is,
if the name wasn't enough -- and someone obviously felt it wasn't in
the case of 'had_print_error' -- then the very first thing to describe
is the semantics, because that's what someone debugging a problem woud
need to know most urgently.

> /* This implements `svn_wc_notify_func2_t'.
> * NOTE: This function can't fail, so we just ignore any print errors. */
> static void
> notify (void *baton, const svn_wc_notify_t *n, apr_pool_t *pool)
> {
> - struct notify_baton *nb = baton;
> + struct svn_cl__notify_baton *nb = baton;
> char statchar_buf[5] = " ";
> const char *path_local;
> svn_error_t *err;
> @@ -281,10 +266,25 @@
> break;
>
> case svn_wc_notify_status_external:
> - if ((err = svn_cmdline_printf
> - (pool, _("\nPerforming status on external item at '%s'\n"),
> - path_local)))
> - goto print_error;
> + if (nb->sb->detailed) /* -v, --verbose */
> + {
> + if (nb->sb->skip_unrecognized) /* -q, --quiet */
> + {
> + nb->extern_path_prefix_length = strlen(path_local);
> + }
> + else
> + {
> + nb->extern_path_prefix_length = 0;
> + if ((err = svn_cmdline_printf
> + (pool, _("\nPerforming status on external item at '%s'\n"),
> + path_local)))
> + goto print_error;
> + }
> + }
> + else
> + {
> + nb->extern_path_prefix_length = 0;
> + }
> break;

<nod> Okay, I see the logic there. It makes me think that
'extern_path_prefix_length' is perhaps too implementation-oriented a
name, and that 'extern_path_directory_length' (along with detailed
documentation in the structure, of course) might be better.

Formatting nit: space before paren in funcall: "strlen (...)".

And, obviously, this is one of the places where you should point out
that the same message is printed in status-cmd.c:print_status().

It's a problem that you're calculating the length based on a path in
native encoding, but then later in print_status() you're printing the
path as UTF8. See my comments later on that.

> case svn_wc_notify_status_completed:
> @@ -391,7 +391,7 @@
> svn_boolean_t suppress_final_line,
> apr_pool_t *pool)
> {
> - struct notify_baton *nb = apr_palloc (pool, sizeof (*nb));
> + struct svn_cl__notify_baton *nb = apr_palloc (pool, sizeof (*nb));
>
> nb->received_some_change = FALSE;
> nb->sent_first_txdelta = FALSE;
> @@ -400,6 +400,8 @@
> nb->suppress_final_line = suppress_final_line;
> nb->in_external = FALSE;
> nb->had_print_error = FALSE;
> + nb->extern_path_prefix_length = 0;
> + nb->sb = NULL;
>
> *notify_func_p = notify;
> *notify_baton_p = nb;
> Index: subversion/clients/cmdline/status-cmd.c
> ===================================================================
> --- subversion/clients/cmdline/status-cmd.c (revision 15888)
> +++ subversion/clients/cmdline/status-cmd.c (working copy)
> @@ -22,6 +22,7 @@
>
> /*** Includes. ***/
>
> +#include "svn_cmdline.h"
> #include "svn_wc.h"
> #include "svn_client.h"
> #include "svn_error.h"
> @@ -32,26 +33,8 @@
>
> #include "svn_private_config.h"
>
> -
> -
> /*** Code. ***/
>
> -struct status_baton
> -{
> - /* These fields all correspond to the ones in the
> - svn_cl__print_status() interface. */
> - svn_boolean_t detailed;
> - svn_boolean_t show_last_committed;
> - svn_boolean_t skip_unrecognized;
> - svn_boolean_t repos_locks;
> - apr_pool_t *pool;
> -
> - svn_boolean_t had_print_error; /* To avoid printing lots of errors if we get
> - errors while printing to stdout */
> - svn_boolean_t xml_mode;
> -};
> -
> -
> /* Prints XML target element with path attribute TARGET, using POOL for
> temporary allocations. */
> static svn_error_t *
> @@ -119,9 +102,17 @@
> const char *path,
> svn_wc_status2_t *status)
> {
> - struct status_baton *sb = baton;
> + struct svn_cl__status_baton *sb = baton;
> svn_error_t *err;
> -
> +
> + if (sb->nb->extern_path_prefix_length > 0)
> + {
> + err = svn_cmdline_printf
> + (sb->pool, _("\nPerforming status on external item at '%s'\n"),
> + apr_pstrndup (sb->pool, path, sb->nb->extern_path_prefix_length));
> + sb->nb->extern_path_prefix_length = 0;
> + }
> +

Same point about mentioning that this message is also printed in
notify.c:notify(), of course.

Somehow it bugs me that we have to dup part of the string every time
to print this out. But I agree it would be over-optimization to keep
an svn_stringbuf_t in the structure, solely for the purpose of
avoiding these little pool allocations.

Someday, someone's going to show up on the list asking why they're
running out of memory, and it's going to turn out that they have
65,000,000 svn:externals properties set, and we're going to feel dumb.
Oh well.

One thing that would be helpful, as long as you're moving status_baton
out to svn_cl.h, is to document its 'pool' field. Specifically,
whether that gets cleared at certain points during a status run, or
not. It's a great example of why it's important for the semantics of
every field to be documented: if the pool parameter had been
adequately documented in the first place, I wouldn't be sitting here
wondering whether someone might run out of memory someday... :-)

Inspection of the code reveals that, unfortunately, sb->pool is never
cleared right now. On the other hand, it also looks like
print_status() could clear it without harm. That's the only place the
pool is used, and it's used only for ephemeral function calls anyway.
Since your change adds more allocation burden to that pool, I think it
would be appropriate to have a pool clear be part of this change.

One problem in the current code is that while print_status()
implements the 'svn_wc_status_func2_t' interface, it does not say so
explicitly. We should always be saying what interfaces we implement;
in most of the code, we live up to this ideal, but apparently we
forgot in this case.

So the comment above print_status() should really read:

   /* A callback function for printing STATUS for PATH.
      This implements the 'svn_wc_status_func2_t' interface. */

Why am I bothering to say this, when it's not part of your change?
Well...

print_status() just calls svn_cl__print_status_xml() and
svn_cl__print_status() to do the real work. Both of those functions
take their 'path' arguments in UTF8, as does print_status() itself,
since it implements a public interface and Subversion public
interfaces always take paths in UTF8 unless otherwise stated.

So, you can't just print out 'path'; you have to convert it to native
encoding and print that instead. And this solves the problem that
back in notify.c, you calculate 'extern_path_prefix_length' based on
'path_local', because now the calculation will match what's being
printed later.

Do remember to document in the structure that
'extern_path_prefix_length' is the length in native encoding, not in
UTF8 as might be normally expected.

> if (sb->xml_mode)
> err = svn_cl__print_status_xml (path, status, sb->pool);
> else
> @@ -156,7 +147,7 @@
> apr_pool_t * subpool;
> int i;
> svn_opt_revision_t rev;
> - struct status_baton sb;
> + struct svn_cl__status_baton sb;
> svn_revnum_t repos_rev = SVN_INVALID_REVNUM;
>
> SVN_ERR (svn_opt_args_to_target_array2 (&targets, os,
> @@ -169,7 +160,10 @@
> if (! opt_state->xml)
> svn_cl__get_notifier (&ctx->notify_func2, &ctx->notify_baton2, FALSE,
> FALSE, FALSE, pool);
> -
> + struct svn_cl__notify_baton *nb = ctx->notify_baton2;
> + sb.nb = nb;
> + nb->sb = &sb;
> +
> /* Add "." if user passed 0 arguments */
> svn_opt_push_implicit_dot_target(targets, pool);

We write C the old-fashioned way around here -- no declarations after
code :-). The 'svn_cl__notify_baton *nb' should be up in the
declaration block at the beginning of the function.

Have you verified that the change passes 'make check', by the way?

-Karl

-- 
www.collab.net  <>  CollabNet  |  Distributed Development On Demand
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Thu Aug 25 04:48:47 2005

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.