[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] #2850 Retrieve arbitrary revprops from log (long)

From: Eric Gillespie <epg_at_pretzelnet.org>
Date: 2007-09-18 05:06:25 CEST

"David Glasser" <glasser@davidglasser.net> writes:

> > Index: subversion/include/svn_repos.h
> > /**
> > - * Same as svn_repos_get_logs4(), but with @a receiver being a
> > - * @c svn_log_message_receiver_t instead of @c svn_log_message_receiver2_t.
> > - * Also, @a omit_log_text is set to @c FALSE.
> > + * Same as svn_repos_get_logs4(), but with @a receiver being a @c
> > + * svn_log_message_receiver_t instead of @c svn_log_entry_receiver_t.
>
> Inconsistent grammar (*a* svn_log_message_receiver_t vs
> svn_log_entry_receiver_t) carried over from original.

Fixed.

> > Index: subversion/include/svn_types.h
> > ===================================================================
> > --- subversion/include/svn_types.h (revision 26570)
> > +++ subversion/include/svn_types.h (working copy)
> > @@ -586,15 +586,9 @@
> > /** The revision of the commit. */
> > svn_revnum_t revision;
> >
> > - /** The author of the commit. */
> > - const char *author;
> > + /** The hash of revision properties, or NULL if none. */
>
> Perhaps insert "requested" before "revision"?

Fixed.

> > +/* XXX Hmm, new svn_compat.h header for these? */
> > +
>
> Yeah, that might be a good idea.

I turned this into a TODO(epg), and I'll do it as a follow-up.

> > Index: subversion/include/svn_client.h
> > /**
> > * Similar to svn_client_log4(), but using @c svn_log_message_receiver_t
> > - * instead of @c svn_log_message_receiver2_t. Also, @a
> > - * include_merged_revisions and @a omit_log_text are set to @c FALSE
> > + * instead of @c svn_log_entry_receiver_t. Also, @a
> > + * include_merged_revisions is set to @c FALSE and @a revprops is
>
> Perhaps "contains" instead of "is" at the end here?

To me, "is" means that's all there is, but "contains" may mean it
contains other things as well.

> > Index: subversion/include/svn_ra.h
> > + * @note Pre-1.5 servers do not support custom revprop retrieval; if @a
> > + * revprops lists any but svn:author, svn:date, and svn:log, an @c
> > + * SVN_ERR_RA_NOT_IMPLEMENTED error is returned.
>
> How about "if @a revprops is NULL or contains a revprop other than svn:author,
> svn:date, or svn:log, an ..."? (This makes it clear that (a) NULL is
> bad and (b) listing a subset of the three standard ones is OK.)

Fixed.

> > Index: subversion/libsvn_subr/compat.c
> > +svn_compat_log_revprops_clear(apr_hash_t **revprops)
> > +{
> > + if (*revprops)
> > + {
> > + apr_hash_set(*revprops, SVN_PROP_REVISION_AUTHOR,
> > + APR_HASH_KEY_STRING, NULL);
> > + apr_hash_set(*revprops, SVN_PROP_REVISION_DATE,
> > + APR_HASH_KEY_STRING, NULL);
> > + apr_hash_set(*revprops, SVN_PROP_REVISION_LOG,
> > + APR_HASH_KEY_STRING, NULL);
> > + /* XXX bother with this? */
> > +/* if (apr_hash_count(*revprops) == 0) */
> > +/* *revprops = NULL; */
>
> I think leaving this out is good, since the semantics of NULL vs empty
> are rather different in most places (though admittedly not in the
> couple of places where this function is used).

I was modelling this after changed_paths, which always either is
NULL or has elements. However, none of that code has to go to
unusual lengths to maintain this, as we would in this case.
Also, it's not documented that way:

 * If @a log_entry->changed_paths is non-@c NULL, then it contains as keys
 * every path committed in @a log_entry->revision; the values are
 * (@c svn_log_changed_path_t *) structures.

That doesn't say that changed_paths is never an empty array. I'd
rather leave it lax, but I wanted to check with others first.

> > log_wrapper_callback(void *baton,
> > svn_log_entry_t *log_entry,
> > apr_pool_t *pool)
> > {
> > struct log_wrapper_baton *lwb = baton;
> > + const char *author, *date, *message;
> >
> > + svn_compat_log_revprops_out(&author, &date, &message, log_entry->revprops);
>
> Move the declaration of the variables into the conditional, and remove
> this duplicate call to svn_compat_log_revprops_out.

Heh, hmm. Fixed.

> > Index: subversion/libsvn_client/log.c
> > +pre_15_receiver(void *baton, svn_log_entry_t *log_entry, apr_pool_t *pool)

> It might be better to make the fallback implementation use
> svn_ra_rev_proplist for any request which requires non-standard
> revprops instead of potentially making multiple svn_ra_rev_prop
> requests. The tradeoff is minimizing the number of round-trip
> requests vs. minimizing the amount of unrequested data sent; I'm not
> sure what the best alternative in practice would be.

I wrote it to minimize the amount of data rather than the number
of requests, since property values may be arbitrarily large. It
may be that most revprop values in the wild are small, but I was
afraid of the diabolical behavior in the other case. I'm willing
to go either way.

> > + /* Caller requested custom revprops from apre-1.5 server;
>
> Missing space between "a" and "pre".

Fixed.

> This conditional is always true (since we're in a "if (!pre_15_server
> && ...)" block). Actually, it looks like the logic for this section
> is broken.
>
> (I pause to note that wow, the behavior of this codepath (for
> something like "svn log -rCOMMITTED:42 file1 file2") is actually
> really bizarre.)
>
> It looks like in your version, when doing compatability mode,
> essentially nothing gets returned after the first iteration through
> the loop; I think you're missing a block that says "if pre_15_server
> call svn_ra_get_log2 with pre_15_receiver".
>
> Also, I think that the pre_15_server boolean declaration can be moved
> into the "if (start_is_local || end_is_local)" block.

It's completely broken, you're right. I guess we have no tests
covering this case, or they're broken. I'll fix that, too.

> > Index: subversion/libsvn_ra_serf/log.c
> > + /* pre-1.5 compatibility */
> > + svn_boolean_t seen_revprop_element;
> > + svn_boolean_t want_author;
> > + svn_boolean_t want_custom_revprops;
>
> Reverse the order of the above two fields, since want_author,
> want_date, and want_message are three of a kind.

Done.

> > + else if (state == REVPROP_VALUE &&
> > + strcmp(name.name, "revprop-value") == 0)
> > + {
>
> Possibly sanity-check here that info->revprop_name is non-NULL?

Hm? No version of the server sends revprop-name elements but not
revprop-value elements.

> (Also, I noticed that in the output of "svn log --xml" revprop names
> are attributes on an element, whereas here they are elements
> alternating with the values; any reason not to make them attributes in
> the DAV request too?)

In both cases, I copied and pasted from nearby code. I see now
that copyfrom info is attributes, so I can change that if you
like. I am no DAV expert and have no opinion. I suppose there's
an argument to be made for not transmitting the same word twice
and going with an attribute, but then we should have used a
better format than XML in the first place...

> > Index: subversion/libsvn_ra_neon/log.c
> > ===================================================================
> > --- subversion/libsvn_ra_neon/log.c (revision 26570)
> > +++ subversion/libsvn_ra_neon/log.c (working copy)
> > @@ -56,13 +56,21 @@
> >
> > /* Information about each log item in turn. */
> > svn_log_entry_t *log_entry;
> > + /* Place to hold revprop name until we get the revprop-value element. */
> > + char *revprop_name;
> > + /* pre-1.5 compatibility */
> > + svn_boolean_t seen_revprop_element;
> > + svn_boolean_t want_author;
> > + svn_boolean_t want_custom_revprops;
>
> Reverse order here too?

Done.

> > + case ELEM_revprop_value:
> > + if (! lb->log_entry->revprops)
> > + lb->log_entry->revprops = apr_hash_make(lb->subpool);
>
> Check lb->revprop_name != NULL?

Same as for serf.

> > - if (omit_log_text)
> > + if (revprops)
> > {
> > + lb.want_author = lb.want_date = lb.want_message = FALSE;
> > + lb.want_custom_revprops = FALSE;
> > + for (i = 0; i < revprops->nelts; i++)
> > + {
> > + char *name = APR_ARRAY_IDX(revprops, i, char *);
> > + svn_stringbuf_appendcstr(request_body, "<S:revprop>");
> > + svn_stringbuf_appendcstr(request_body, name);
>
> Reading this makes me wonder if we need to validate the property name
> before putting it into XML output...

I don't know where it's documented (though svnbook implies it),
but I thought property names were fine here.
neon/props.c:append_setprop doesn't seem to do anything special.

> > Index: subversion/mod_dav_svn/reports/log.c
> > @@ -78,6 +81,7 @@
> > apr_pool_t *pool)
> > {
> > struct log_receiver_baton *lrb = baton;
> > + svn_boolean_t no_custom_revprops = lrb->requested_custom_revprops;
>
> OK, so this line is entirely correct, but at first glance it looks
> like you've written
>
> svn_boolean_t no_foo = lrb->foo;
>
> Maybe throw in a comment that makes it clear that there's no missing !
> ?

Done.

> > + /* If this is still FALSE after the loop, we haven't seen either of
> > + the revprop elements, meaning a pre-1.5 client; we'll return the
> > + standard author/date/log revprops. */
> > + seen_revprop_element = FALSE;
>
> Why not initialize at the declaration?

I wanted this close to the loop body.

> > + /* XXX can i remove this comment in a follow-up? */
> > /* ### todo: okay, now go fill in svn_ra_dav__get_log() based on the
> > syntax implied below... */

I'll go ahead and remove this...

> > + else if (strcmp(child->name, "revprop") == 0)
> > + {
> > + /* XXX should we raise an error if all-revprops and revprop
> > + elements are given? */
>
> Sure!

Which error should I raise? SVN_ERR_RA_DAV_MALFORMED_DATA?

> Index: subversion/tests/cmdline/log_tests.py

> > + def ignore(self, *args, **kwargs):
> > + del self.cdata[:]
>
> Maybe it's more efficient or something, but "self.cdata = []" is more
> clear to me.

That's the idiomatic way to clear a list. self.cdata = [] is
entirely different:

  some_obj.some_list = [3]
  foo = some_obj.some_list
  ...
  some_obj.clear_list()
  # foo is [3] in your case, should be []

Granted, that's not going to happen in this case, but they remain
different operations, and list-clearing is what I'm after here.

> > + def author_end(self):
> > + return self.svn_prop('author')
>
> svn_prop doesn't seem to return anything...

It's a habit; fixed.

> I would refactor this test into three tests (with a common setup
> function): one which should work against any server, one for 1.5
> servers, and one for older server, and then use conditional Skips to
> only run two of them.

I pointed only you at the diff rather than sending it to the list
because it wasn't ready for publication yet ;->. You've caught
me in the middle of reworking the test. We no longer have
separate pre- and post- 1.5 behavior, thanks to you suggestion of
requesting the revprops, so no need for separate tests.

> > Index: subversion/tests/cmdline/svntest/main.py
> > ===================================================================
> > --- subversion/tests/cmdline/svntest/main.py (revision 26584)
> > +++ subversion/tests/cmdline/svntest/main.py (working copy)
> > @@ -941,6 +941,7 @@
> > print 'EXCEPTION: %s: %s' % (ex.__class__.__name__, ex_args)
> > else:
> > print 'EXCEPTION:', ex.__class__.__name__
> > + traceback.print_exc(file=sys.stdout)
>
> Please commit this, but separately!
>
> > except KeyboardInterrupt:
> > print 'Interrupted'
> > sys.exit(0)

Heh, more of the "not-ready" diff :). This change is always in
all my working copies; I just carefully exclude it from diffs and
commits. I've been meaning to ask about making the change for a
while, but haven't gotten around to it, because what I really
want to do is drop all this exception trapping in the first place.

> > Index: subversion/libsvn_repos/log.c
> > ===================================================================
> > --- subversion/libsvn_repos/log.c (revision 26570)
> > +++ subversion/libsvn_repos/log.c (working copy)
> > @@ -38,9 +38,9 @@
> > const char *path,
> > svn_revnum_t rev,
> > svn_boolean_t discover_changed_paths,
> > - svn_boolean_t omit_log_text,
> > + apr_array_header_t *revprops,
> > svn_boolean_t descending_order,
> > - svn_log_message_receiver2_t receiver,
> > + svn_log_entry_receiver_t receiver,
> > void *receiver_baton,
> > svn_repos_authz_func_t authz_read_func,
> > void *authz_read_baton,
> > @@ -830,22 +830,14 @@
> > svn_revnum_t rev,
> > svn_fs_t *fs,
> > svn_boolean_t discover_changed_paths,
> > - svn_boolean_t omit_log_text,
> > + apr_array_header_t *revprops,
> > svn_repos_authz_func_t authz_read_func,
> > void *authz_read_baton,
> > apr_pool_t *pool)
> > {
> > - svn_string_t *author, *date, *message;
> > apr_hash_t *r_props, *changed_paths = NULL;
> > + svn_boolean_t get_revprops = TRUE, censor_message = FALSE;
>
> The censor_message-based logic here is wrong. The security policy is
> that partially-readable revisions allow only svn:author and svn:date,
> not that they disallow only svn:log. See
> svn_repos_fs_revision_proplist, for example.

Fixed.

> (Random notes while I'm at it: perhaps this function should be using
> the new svn_repos_check_revision_access; and also, this security
> policy should perhaps be documented with
> svn_repos_check_revision_access or svn_repos_revision_access_level_t.)

Ah, neat. Separate change, though.

> > + {
> > + FILE *fp = fopen("/tmp/svnservelog", "a");
> > + fprintf(fp, "send %s\n", name);
> > + fclose(fp);
> > + }
>
> This doesn't have an XXX; make sure to remember to remove it.

Hehe, fixed.

> > Index: subversion/libsvn_ra_svn/protocol
> > ===================================================================
> > --- subversion/libsvn_ra_svn/protocol (revision 26570)
> > +++ subversion/libsvn_ra_svn/protocol (working copy)
> > @@ -342,14 +342,17 @@
> > log
> > params: ( ( target-path:string ... ) [ start-rev:number ]
> > [ end-rev:number ] changed-paths:bool strict-node:bool
> > - ? limit:number
> > - ? include-merged-revisions:bool omit-log-text:bool )
> > + ? limit:number
> > + ? include-merged-revisions:bool
> > + all-revprops | revprops
> > + ? ( revprop:string ... ) )
> > Before sending response, server sends log entries, ending with "done".
> > If a client does not want to specify a limit, it should send 0 as the
> > limit parameter.
> > log-entry: ( ( change:changed-path-entry ... ) rev:number
> > - [ author:string ] [ date:string ] [ message:string ]
> > - ? has-children:bool invalid-revnum:bool )
> > + [ author:string ] [ date:string ] [ message:string ]
> > + ? has-children:bool invalid-revnum:bool
> > + revprop-count:number rev-props:proplist )
>
> You should probably document whether or not author/date/message show
> up in the rev-props as well. (I think they don't?)

They don't, right. Fixed.

> > Index: subversion/svn/log-cmd.c
> > + if (!opt_state->xml)
> > + {
> > + if (opt_state->all_revprops)
> > + return svn_error_create(SVN_ERR_CL_ARG_PARSING_ERROR, NULL,
> > + _("'with-all-revprops' option only valid in"
> > + " XML mode"));
> > + if (opt_state->revprop_table != NULL)
> > + return svn_error_create(SVN_ERR_CL_ARG_PARSING_ERROR, NULL,
> > + _("'with-revprop' option only valid in"
> > + " XML mode"));
> > + }
> > +
>
> Perhaps check here to disallow use of both --with-all-revprops and
> --with-revprop simultaneously?

I prefer commands that allow later options to override earlier
ones, so if I append --with-all-revprops to a command that
already has some --with-revprop options, it just works. I'm not
sure off-hand if we have any kind of precedence for that in svn.

> > /* Before allowing svn_opt_args_to_target_array() to canonicalize
> > all the targets, we need to build a list of targets made of both
> > ones the user typed, as well as any specified by --changelist. */
> > @@ -547,6 +572,31 @@
> > if (! opt_state->incremental)
> > SVN_ERR(svn_cl__xml_print_header("log", pool));
> >
> > + if (opt_state->all_revprops)
> > + revprops = NULL;
> > + else if (opt_state->revprop_table != NULL)
> > + {
> > + apr_hash_index_t *hi;
> > + revprops = apr_array_make(pool,
> > + apr_hash_count(opt_state->revprop_table),
> > + sizeof(char *));
> > + for (hi = apr_hash_first(pool, opt_state->revprop_table);
> > + hi != NULL;
> > + hi = apr_hash_next(hi))
> > + {
> > + char *property;
> > + apr_hash_this(hi, (void *)&property, NULL, NULL);
>
> Here (or actually, better in parse_revprop), use
> svn_prop_name_is_valid to make sure the property name is legal.

Ah, I didn't know about that one. Its documentation confirms
what I thought about property names, so we need take no special
care in the DAV code. It looks like only propedit-cmd.c uses
this right now; I'll add it to parse_revprop and some other
commands in a separate change, before this one.

> Also, it appears that the user can type
>
> $ svn log --xml --with-revprop foo=bar
>
> and it will be interpreted as '--with-revprop foo'. Perhaps check to
> make sure that the *value* in the hash is "" and error otherwise?

Sure, OK, done.

> (Also, huh, there's no "make an array from the keys of this hash"
> function?)

I don't know, but now we can't use it anyway ;->.

> > Index: subversion/svnserve/serve.c
> > + return svn_error_create(SVN_ERR_RA_SVN_MALFORMED_DATA, NULL,
> > + _("Log revprop entry not astring"));
>
> Missing space between "a" and "string".

Fixed.

> In general I like the feature though :)

Whew! I'll fix the client compat logic tomorrow afternoon/
evening, and maybe some other things depending on your response,
and post a new diff then.

Thanks.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Tue Sep 18 05:06:40 2007

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.