[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

is_packed(noderev); revision root ID caching (was: svn commit: r1029340 - in /subversion/branches/performance/subversion/libsvn_fs_fs: fs_fs.c id.c id.h)

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Sun, 31 Oct 2010 17:09:17 +0200

I confirm that this fixes the test failures I reported yesterday.

Review below.

stefan2_at_apache.org wrote on Sun, Oct 31, 2010 at 13:40:12 -0000:
> Author: stefan2
> Date: Sun Oct 31 13:40:12 2010
> New Revision: 1029340
>
> URL: http://svn.apache.org/viewvc?rev=1029340&view=rev
> Log:
> Fix handling of cached revision root IDs. Issue discussed here:
> http://svn.haxx.se/dev/archive-2010-10/0497.shtml
>

> The /trunk code caches them for a single session (i.e. server
> request) only and assumes that the respective revision files
> won't get packed while the request is being processed.
>

In other words, if there is a long-running request, and packing is run while the
request is in progress, there is a window where that request will
mis-compute the offsets?

> But now, the cache lives as long as the server process and
> therefore, we must detect whether the respective revision
> got packed (or, theoretically, unpacked) after we cached its
> root ID. If that state changed, the file offset needs to be
> updated. Simply re-read the whole ID in that case.
>

I've fixed more than one of these problems. Usually the solution was to
keep using the revfile-relative offsets, and only at the point where
they are to be used, to translate the (rev,offset) pair into
a (packfile, new offset) pair. (This is done at the very point the rev
file is being opened, to account for the possibility that it becomes
packed /just as/ we attempt to open it.) It seems that you went with
"if I cached (rev,offset) and now it's packed, give up". Why this
difference?

Tangentially, another place where the supposedly-missing "offset +=
get_rev_offset_in_pack_file()" might have a noticeable effect is if the
repository is already packed when the server is started; but I presume
you've tested your code on packed repositories.

> * subversion/libsvn_fs_fs/id.h
> (svn_fs_fs__is_packed, svn_fs_fs__set_packed): new functions
> to store the "rev is packed" state
>
> * subversion/libsvn_fs_fs/id.c
> (id_private_t): add new element
> (svn_fs_fs__is_packed, svn_fs_fs__set_packed): implement
> (svn_fs_fs__id_txn_create, svn_fs_fs__id_rev_create, svn_fs_fs__id_copy,
> svn_fs_fs__id_parse): handle the new element as well
>
> * subversion/libsvn_fs_fs/fs_fs.c
> (svn_fs_fs__rev_get_root): require "is packed" flag to match
> before returning a cached root ID; store that flag in cached ID
>
> Modified:
> subversion/branches/performance/subversion/libsvn_fs_fs/fs_fs.c
> subversion/branches/performance/subversion/libsvn_fs_fs/id.c
> subversion/branches/performance/subversion/libsvn_fs_fs/id.h
>
> Modified: subversion/branches/performance/subversion/libsvn_fs_fs/fs_fs.c
> URL: http://svn.apache.org/viewvc/subversion/branches/performance/subversion/libsvn_fs_fs/fs_fs.c?rev=1029340&r1=1029339&r2=1029340&view=diff
> ==============================================================================
> --- subversion/branches/performance/subversion/libsvn_fs_fs/fs_fs.c (original)
> +++ subversion/branches/performance/subversion/libsvn_fs_fs/fs_fs.c Sun Oct 31 13:40:12 2010
> @@ -2953,7 +2953,8 @@ svn_fs_fs__rev_get_root(svn_fs_id_t **ro
>
> SVN_ERR(svn_cache__get((void **) root_id_p, &is_cached,
> ffd->rev_root_id_cache, &rev, pool));
> - if (is_cached)
> + if (is_cached &&
> + is_packed_rev(fs, rev) == svn_fs_fs__is_packed(*root_id_p))

Bug: you are comparing a boolean to a tristate.

Even if both of them were tristates, I still think you would have had to
check that they are both true or both false (but not both unknown).

> return SVN_NO_ERROR;
>
> /* we don't care about the file pointer position */
> @@ -2967,6 +2968,9 @@ svn_fs_fs__rev_get_root(svn_fs_id_t **ro
>
> SVN_ERR(get_fs_id_at_offset(&root_id, apr_rev_file, fs, rev, root_offset,
> pool));
> + svn_fs_fs__set_packed(root_id, is_packed_rev(fs, rev)
> + ? svn_tristate_true
> + : svn_tristate_false);
>
> SVN_ERR(svn_file_handle_cache__close(revision_file));
>
>
> Modified: subversion/branches/performance/subversion/libsvn_fs_fs/id.c
> URL: http://svn.apache.org/viewvc/subversion/branches/performance/subversion/libsvn_fs_fs/id.c?rev=1029340&r1=1029339&r2=1029340&view=diff
> ==============================================================================
> --- subversion/branches/performance/subversion/libsvn_fs_fs/id.c (original)
> +++ subversion/branches/performance/subversion/libsvn_fs_fs/id.c Sun Oct 31 13:40:12 2010
> @@ -34,6 +34,7 @@ typedef struct {
> const char *txn_id;
> svn_revnum_t rev;
> apr_off_t offset;
> + svn_tristate_t is_packed;
> } id_private_t;
>

(I'll refer to this point later)

>
> @@ -84,6 +85,24 @@ svn_fs_fs__id_offset(const svn_fs_id_t *
> }
>
>
> +svn_tristate_t
> +svn_fs_fs__is_packed(const svn_fs_id_t *id)
> +{
> + id_private_t *pvt = id->fsap_data;
> +
> + return pvt->is_packed;
> +}
> +

I think these should be in the svn_fs_fs__id_ namespace:

        svn_fs_fs__id_is_packed
        svn_fs_fs__id_set_packed

for consistency with the rest of the file (and svn), and because
"__is_packed" seems to me to name function that should talk about
revisions, not about id's.

> +void
> +svn_fs_fs__set_packed(const svn_fs_id_t *id,
> + svn_tristate_t is_packed)
> +{
> + id_private_t *pvt = id->fsap_data;
> +
> + pvt->is_packed = is_packed;
> +}
> +

None of the other id_private_t members has a setter function, why
should this one be an exception?

> +
> svn_string_t *
> svn_fs_fs__id_unparse(const svn_fs_id_t *id,
> apr_pool_t *pool)
>
> Modified: subversion/branches/performance/subversion/libsvn_fs_fs/id.h
> URL: http://svn.apache.org/viewvc/subversion/branches/performance/subversion/libsvn_fs_fs/id.h?rev=1029340&r1=1029339&r2=1029340&view=diff
> ==============================================================================
> --- subversion/branches/performance/subversion/libsvn_fs_fs/id.h (original)
> +++ subversion/branches/performance/subversion/libsvn_fs_fs/id.h Sun Oct 31 13:40:12 2010
> @@ -49,6 +49,12 @@ svn_revnum_t svn_fs_fs__id_rev(const svn
> ID. */
> apr_off_t svn_fs_fs__id_offset(const svn_fs_id_t *id);
>
> +/* Access the "is_packed" flag of the ID. */
> +svn_tristate_t svn_fs_fs__is_packed(const svn_fs_id_t *id);
> +
> +/* Set the "is_packed" flag of the ID. */
> +void svn_fs_fs__set_packed(const svn_fs_id_t *id, svn_tristate_t is_packed);
> +

Two issues here:

* I think these need more docs. The other functions (which read, e.g.,
  "access the 'node id' portion of the ID") include by reference the
  docs in the 'structure' file. For the 'packed?' member, however,
  there is no such concept as a "packed node-revision" -- only a "packed
  revision file" --- so you have to document it explicitly.

* Design issue: is "is this ID packed?" a valid question to ask?

  What exactly do you mean by this? Are you referring to whether the
  (revision,offset) at which the noderev header appears is a packed
  revision? Or to whether the data and props representations live in
  packed revisions?

  The second question is not a good one because the data can live in
  a packed rev while the props in a non-packed rev. The first question,
  I'm not sure that's an inherent property of the noderev->id itself: if
  the cache needs to cache the value of is_packed_rev(id->revision) for
  validation upon get(), it can do that without storing its bookkeeping
  in the id_private_t itself.
  
  (But see my first response; I think it's just fine to return the
  offset-within-revision regardless of whether that revision is or was
  packed, and leave it to whoever uses the offset to open the pack file
  instead of the revision file, should that be needed.)

> /* Convert ID into string form, allocated in POOL. */
> svn_string_t *svn_fs_fs__id_unparse(const svn_fs_id_t *id,
> apr_pool_t *pool);
>
>

Thanks for the quick fix,

Daniel
Received on 2010-10-31 16:12:04 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.