On Sun, Oct 11, 2015 at 3:29 PM, Bert Huijben <bert_at_qqmail.nl> wrote:
> Why a uuid and not just some int64 atomic incrementation value that is
> cached ir similar.
>
You are right, anything "unique enough" will do
and the code can easily be changed to using
and i64 later. I'll do that right after my commit
queue for revprop handling runs empty.
UUIDs are just handy and implementing a
process-wide shared i64 counter requires some
extra coding.
> Generating a uuid is not something that is not measurable performance
> wise. It might take some expensive interproces communication or even
> hardware access.
>
OTOH, the UUID generation is not time-critical.
It will only happen if a repos layer operation
explicitly allows cached revprops and then it
will only happen once for the whole e.g. report.
-- Stefan^2.
>
> Bert
> ------------------------------
> From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
> Sent: 11-10-2015 14:25
> To: Subversion Development <dev_at_subversion.apache.org>
> Subject: Efficient revprop access in lib_repos
>
> [This is the rationale and additional documentation to an upcoming
> set of commits.]
>
> When changing a revprop, we provide the following visibility guarantees:
>
> 1. A request that ends before the set_revprop started, sees the old value
> (duh!).
> 2. A request that starts after the set_revprop completes, sees the new
> value.
> 3. A request that starts before the set_revprop completes, may see the old
> or
> the new value and may not be consistent about it.
>
> The reporter in lib_repos exploits 3. by keeping a hash of the dates
> and authors for all revisions it encountered so far. Not only saves
> that 50+% of revprop lookups but also guarantees consistent properties
> for all reported nodes - at the expense of ~80 bytes/reported revision.
>
> I'd like to expand on that by giving the FS API users more control on
> whether they need to get the latest revprop state (e.g. at the beginning
> of a lib_repos report) or not (e.g. follow-up requests during the same
> lib_repos report). That will allow the FS layer to read whole revprop
> packs and simply deliver their contents during 'svn log' instead of
> re-opening the same files over and over again.
>
> I'll commit a patch set that
>
> * introduces the notion of a barrier in svn_fs_revision_prop2 and
> svn_fs_revision_proplist2 (default: read latest from disk) plus
> a new explicit barier function svn_fs_refresh_rev_props,
> * updates the lib_repos queries to only use one barrier, and
> * implements them in the FS backends.
>
> Note that it is perfectly legal for an FS to ignore the extra flag
> and always fetch the data from disk. So, this how our backends will
> implement it:
>
> * BDB will ignore the flag and always deliver data as today.
>
> * FSFS will use a *temporary*, svn_fs_t-local revprop cache, keyed
> by a UUID. Whenever there is a barrier, the UUID gets cleaned and
> any cache lookup would miss (it should not even try to use the cache).
> Upon the first non-barrier request, a new UUID gets created and the
> cache will be populated whenever we read revprops from disk.
>
> * FSX will only check for a new revprop generation at a barrier.
> That eliminates the requirement to keep the revprop gen file open,
> IOW, revprop caching works with the "normal" file open/read/close
> pattern.
>
> As a result in FSFS, we get most of the benefits of revprop caching,
> fewer file operations and lower CPU load, without introducing complex
> cache invalidation schemes. This is significant in packed repos
> but will also benefit non-packed revs in some operations.
>
> I'll roughly commit the changes in the following order:
>
> * minor cleanup of the FSFS code to keep the relevant change small
> * update FS vtable, defaulting to current behavior
> * implement the new behavior in FSFS
> * bump FS API
> * update lib_repos queries one-by-one for review and testability
> * implement the new behavior in FSX
>
> -- Stefan^2.
>
Received on 2015-10-11 15:41:37 CEST