[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svn commit: r25138 - in branches/merge-sensitive-log/subversion: include include/private libsvn_fs libsvn_fs_base libsvn_fs_fs libsvn_fs_util

From: Hyrum K. Wright <hyrum_wright_at_mail.utexas.edu>
Date: 2007-05-30 03:41:23 CEST

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Daniel Rall wrote:
> On Thu, 24 May 2007, hwright@tigris.org wrote:
> ...
>> Add a new API used to determine all of the mergeinfo for each path, and its
>> children, in a set of paths.
>>
>> * subversion/libsvn_fs/fs-loader.h
>> (root_vtable_t): Add get_children_mergeinfo().
>
> +1 on this API being part of the vtable. It keeps merge info
> backend-independent.
>
> This name is awfully close to get_mergeinfo_for_children()...I'd much
> prefer this (and its wrapper APIs) be called get_mergeinfo_for_tree().

Good suggestion. Implemented in r25189.

>> * subversion/libsvn_fs/fs-loader.c
>> (svn_fs_get_children_mergeinfo): New function.
>>
>> * subversion/libsvn_fs_base/tree.c,
>> subversion/libsvn_fs_fs/tree.c
>> (root_vtable): Add svn_fs_mergeinfo__get_children_mergeinfo().
>>
>> * subversion/libsvn_fs_util/mergeinfo-sqlite-index.c
>> (get_mergeinfo_for_children): New function.
>> (svn_fs_mergeinfo__get_children_mergeinfo): New function.
>>
>> * subversion/include/svn_fs.h
>> (svn_fs_get_children_mergeinfo): New prototype.
>>
>> * subversion/include/private/svn_fs_mergeinfo.h
>> (svn_fs_mergeinfo__get_children_mergeinfo): New prototype.
> ...
>> --- branches/merge-sensitive-log/subversion/include/private/svn_fs_mergeinfo.h (original)
>> +++ branches/merge-sensitive-log/subversion/include/private/svn_fs_mergeinfo.h Thu May 24 21:27:21 2007
> ...
>> +/* Get the combined mergeinfo for each one of PATHS (an array of
>> + absolute-in-the-fs paths) under ROOT and return it in *MERGEINFO,
>> + mapping char * paths to mergeinfo hashs. */
>
> Aren't we supposed to be getting the merge info for children of PATHS
> too?
>
> This doc string should indicate that the API conforms to the
> get_children_mergeinfo() interface.

The intent is to get the mergeinfo for the children of PATHS also. The
first thing get_mergeinfo_for_tree() does is call get_mergeinfo() for
PATHS. We then go through this business of querying the database for
each path's childrens' mergeinfo.

>> +svn_error_t *
>> +svn_fs_mergeinfo__get_children_mergeinfo(apr_hash_t **mergeinfo,
>> + svn_fs_root_t *root,
>> + const apr_array_header_t *paths,
>> + apr_pool_t *pool);
>> +
> ...
>> --- branches/merge-sensitive-log/subversion/include/svn_fs.h (original)
>> +++ branches/merge-sensitive-log/subversion/include/svn_fs.h Thu May 24 21:27:21 2007
>> @@ -1214,6 +1214,26 @@
>> svn_boolean_t include_parents,
>> apr_pool_t *pool);
>>
>> +/** Retrive combined mergeinfo for multiple nodes, and their children.
> ^^^^^^^
> "Retrieve". The svn_fs_mergeinfo__get_children_mergeinfo()
> implementation doesn't appear to actually return merge info for PATHS,
> only for their children.
>
>> + *
>> + * @a mergeinfo is filled with mergeinfo for each of the @a paths and
>> + * their children, stored as a mergeinfo hash. It will never be @c NULL,
>> + * but may be empty.
>> + *
>> + * @a root indicate the revision root to use when looking up paths.
>> + *
>> + * @a paths indicate the paths you are requesting information for.
>> + *
>> + * Do any necessary temporary allocation in @a pool.
>> + *
>> + * @since New in 1.5.
>> + */
>> +svn_error_t *
>> +svn_fs_get_children_mergeinfo(apr_hash_t **mergeinfo,
>> + svn_fs_root_t *root,
>> + const apr_array_header_t *paths,
>> + apr_pool_t *pool);
>> +
> ...
>> --- branches/merge-sensitive-log/subversion/libsvn_fs_util/mergeinfo-sqlite-index.c (original)
>> +++ branches/merge-sensitive-log/subversion/libsvn_fs_util/mergeinfo-sqlite-index.c Thu May 24 21:27:21 2007
> ...
>> +/* Get the mergeinfo for all of the children of PATH in REV. Return the
>> + results in PATH_MERGEINFO. PATH_MERGEINFO should already be created
>> + prior to calling this function, but it's value may change as additional
>> + mergeinfos are added to it. */
>> +static svn_error_t *
>> +get_mergeinfo_for_children(sqlite3 *db,
>> + const char *path,
>> + svn_revnum_t rev,
>> + apr_hash_t **path_mergeinfo,
>> + apr_pool_t *pool)
>> +{
>> + sqlite3_stmt *stmt;
>> + int sqlite_result;
>> + char *like_path;
>> +
>> + /* Get all paths under us. */
>> + SQLITE_ERR(sqlite3_prepare(db,
>> + "SELECT MAX(revision), path "
>> + "FROM mergeinfo_changed "
>> + "WHERE path LIKE ? AND revision <= ?;",
>> + -1, &stmt, NULL), db);
>
> Do you need to sort by path (as we do elsewhere)? Perhaps it doesn't
> matter, since we're returning the merge info in a hash.
>
> How do we handle the case where PATH is something like
> "/branches/1.1", and the DB contains a path like "/branches/1.1.1"
> (which we want to filter out, I think)?

Would something like '/branches/1.1/%' work? We don't have to worry
about '/branches/1.1', because we've already retrieved its mergeinfo, so
the only paths we are interested in would be ones which are children of
'/branches/1.1'.

>> + like_path = apr_psprintf(pool, "%s%%", path);
>
> Alternately, could you inline the % SQL wildcard character into the
> SQL statement to avoid the sprintf()?

I don't think so. The % needs to be inside the quote marks, which
sqlite adds as part of parameter binding.

>> + SQLITE_ERR(sqlite3_bind_text(stmt, 1, like_path, -1, SQLITE_TRANSIENT), db);
>> + SQLITE_ERR(sqlite3_bind_int64(stmt, 2, rev), db);
>> +
>> + sqlite_result = sqlite3_step(stmt);
>> + while (sqlite_result != SQLITE_DONE)
>> + {
>> + sqlite_int64 lastmerged_rev;
>> + const char *merged_path;
>> +
>> + if (sqlite_result == SQLITE_ERROR)
>> + return svn_error_create(SVN_ERR_FS_SQLITE_ERROR, NULL,
>> + sqlite3_errmsg(db));
>> +
>> + lastmerged_rev = sqlite3_column_int64(stmt, 0);
>> + merged_path = sqlite3_column_text(stmt, 1);
>> +
>> + /* If we've got a merged revision, go get the merge info from the db */
>> + if (lastmerged_rev > 0)
>> + {
>> + apr_hash_t *db_mergeinfo;
>> +
>> + SVN_ERR(parse_mergeinfo_from_db(db, merged_path, lastmerged_rev,
>> + &db_mergeinfo, pool));
>> +
>> + SVN_ERR(svn_mergeinfo_merge(path_mergeinfo, db_mergeinfo, pool));
>> + }
>> +
>> + sqlite_result = sqlite3_step(stmt);
>> + }
>> +
>> + SQLITE_ERR(sqlite3_finalize(stmt), db);
>> +
>> + return SVN_NO_ERROR;
>> +}
> ...
>> +/* Get all the mergeinfo for all the children for each path. Return a hash of
>> + mergeinfo hashes, keyed on the input paths. */
>
> This information should live only in the doc string in
> svn_fs_mergeinfo.h (which I think it is redundant with).
>
>> +svn_error_t *
>> +svn_fs_mergeinfo__get_children_mergeinfo(apr_hash_t **mergeinfo,
>> + svn_fs_root_t *root,
>> + const apr_array_header_t *paths,
>> + apr_pool_t *pool)
>> +{
>> + apr_hash_t *mergeinfo_str;
>> + svn_revnum_t rev;
>> + sqlite3 *db;
>> + int i;
>> +
>> + SVN_ERR(svn_fs_mergeinfo__get_merge_info(&mergeinfo_str, root, paths, TRUE,
>> + pool));
>
> We never use "mergeinfo_str" after retrieving it, making this either
> a) a bug or b) unnecessary code. I'm guessing (a), and that we
> intended to include the direct or inherited merge info for PATHS in
> our returned MERGEINFO.

You guess correctly (as noted above). Fixed in r25198.

I think there is some opportunity for optimization here (in terms of
parsing, unparsing, and maintaining sqlite connections), but that's
not a priority now.

>> + /* This is an inlining of svn_fs_revision_root_revision(), which we cannot
>> + call from here, because it would be a circular dependency. */
>
> "create a circular dep" (I know, cut and paste :).
>
> As also use this code snippet in svn_fs_mergeinfo__get_mergeinfo(), I
> think we should create a macro for it in this .c file.

Good idea. Done in r25189.

>> + rev = root->is_txn_root ? SVN_INVALID_REVNUM : root->rev;
> ...
>> + for (i = 0; i < paths->nelts; i++)
>> + {
>> + const char *path = APR_ARRAY_IDX(paths, i, const char *);
>> + const char *mergeinfo_path_str = apr_hash_get(*mergeinfo, path,
>> + APR_HASH_KEY_STRING);
>
> Level of indentation is off here. I'd name this "path_mergeinfo_str"
> for consistency with other variables.

Done in r25189.

>> + apr_hash_t *path_mergeinfo;
>> +
>> + if (mergeinfo_path_str)
>> + SVN_ERR(svn_mergeinfo_parse(&path_mergeinfo, mergeinfo_path_str, pool));
>> + else
>> + path_mergeinfo = apr_hash_make(pool);
>> +
>> + SVN_ERR(get_mergeinfo_for_children(db, path, rev, &path_mergeinfo,
>> + pool));
>> +
>> + apr_hash_set(*mergeinfo, path, APR_HASH_KEY_STRING, path_mergeinfo);
>> + }
> ...
>> +}
>
> Hmmm. This is leaving dealing with merge info inheritance up to the
> callers. While this isn't necessarily inconsistent with existing
> merge info API usage, because of the type of data returned by this
> API, this needs to be very clearly documented. Also, are we are
> including the direct or inherited merge info for PATHS in our returned
> MERGEINFO? The doc string makes it look like "no", which is
> surprising...

I've updated the doc string in r25189, does that help clarify things?

> As a novice user, I might have expected this API to list the merge
> info for every child path explicitly (which clearly isn't great in
> terms of performance and scalability).

Listing the mergeinfo to each child explicitly probably isn't feasible,
for the reasons you mention. I hope that by changing the name of the
function, and by updating the doc strings, it clarifies things a bit.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGXNZDCwOubk4kUXwRAhHlAJ4hfG4C4o+IlZOHW8il0wFc+/fbUQCfWjgZ
v1NJcZGV2CfvwwuLljkzZ+c=
=9CqF
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Wed May 30 03:39:01 2007

This is an archived mail posted to the Subversion Dev mailing list.