[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [RFC] Rationalize WC internal API to read pristine and working versions

From: Greg Stein <gstein_at_gmail.com>
Date: Thu, 17 Mar 2011 15:22:42 -0400

On Mar 17, 2011 1:37 PM, "Julian Foad" <julian.foad_at_wandisco.com> wrote:
>
> I'd like to rationalize this set of internal libsvn_wc functions:

Is this something that needs to be fixed for 1.7?

>
> svn_wc__get_pristine_contents(path)
> svn_wc__get_ultimate_base_contents(path)
>
> svn_wc__get_pristine_text_status(path)
>
> svn_wc__text_base_path_to_read(path)
> svn_wc__ultimate_base_text_path(path)
> svn_wc__ultimate_base_text_path_to_read(path)
>
> svn_wc__get_ultimate_base_checksums(path)
> svn_wc__get_working_checksums(path)
>
> svn_wc__get_actual_props(path)
> svn_wc__get_pristine_props(path)
>
> They're a really rather random set of getter functions over the possible
> space of information getting, and they have inconsistent names. I have
> contributed to the mess when I added functions into the set.
>
> Another problem with them is that each one retrieves only a tiny
> isolated bit of information about the node, and yet requires the WC to
> first determine whether a versioned node exists at that path and which
> WC it belongs to, open the DB, convert abspath to relpath, and so on.
>
> This situation could be improved in one of two directions. Either have
> single-datum getters like this but have each one take a reference to an
> internal "versioned node" object rather than (db context, abspath):
>
> svn_wc__get_props(versioned_node_object)

No. Then you would need a getter to *get* that versioned object. The
indirection doesn't buy you much over the standard db/abspath pair already
used everywhere.

>
> or have each function return a set of data about the versioned node it
> finds:
>
> svn_wc__get_base_node_info(path)
>
> I think the best thing to do at this point is the latter, like with the
> svn_wc__db_read_info except without so many parameters :-) Have a small
> group of functions to access the state of the base node at a given path,
> maybe like this:
>
> read_base_node(path) => kind, props
> read_base_file(path) => text-stream, checksum, size
> read_base_dir(path) => children, depth
> read_base_link(path) => target
>
> and an identical set of functions that access the whole state of the
> topmost pristine version[2]:
>
> read_pris_node(path) => kind, props
> read_pris_file(path) => text-stream, checksum, size
> read_pris_dir(path) => children, depth
> read_pris_link(path) => target

Call it "topmost" rather than "pris".

>
> and some similar functions that access the state of the current working
> node:
>
> read_curr_node(path) => kind, props

This is already known as "actual", and I see no reason to change that now.

>
> but not read_curr_file/dir/link(), for reasons mentioned below.
>
> The important principles here are:
>
> * The function names follow a consistent pattern. The actual names
> are subject to bikeshedding of course [1].
>
> * Each function returns a group of closely related information, and
> each output parameter is optional. When we read "a node", a single
> function should fetch everything the caller wants to know about the
> generic node. When we read "a file", a single function should fetch
> everything the caller wants to know about the file, and so on.
>
> The exact set of info returned by each class of function is to be
> decided. For example, we would need to decide whether read_*_file()
> should return both SHA-1 and MD-5 checksums or just the SHA-1, and the
> decision would depend partly on how commonly a caller wants the MD-5.
>
> * Reading the topmost pristine version is semantically identical to
> reading the base version, so the set of APIs is identical. (If we were
> to try to push this principle outside the realm of simple getter
> functions, we would find some semantic differences, of course.)
>
> * Reading a (base or topmost) pristine node is semantically similar to
> reading the current working version. However, here the practical
> differences are enough that we may not wish to provide the same APIs.
> For example, "get the checksum" would be a slow or deferred operation
> with the working file because it's not pre-calculated, and the disk
> state can be different from the metadata state, and reading the
> file/dir/link can be done by the caller without going through the WC
> API, and so on.
>
> Thoughts?

"Why?"

-g

>
> - Julian
>
>
> [1] We've so overloaded the word "working" that I think in this case
> it's better to use some other word such as "actual" or "current". And
> we've overloaded the word "base" so that I felt compelled to invent this
> horrible "ultimate base" phrase to distinguish it from existing
> functions that used "base" to mean "the topmost pristine version". We
> can get rid of that "ultimate base" phrase now.
>
> [2] Just to make sure we're understanding each other, the "topmost"
> pristine version is the "base" version in simple cases but is different
> when there are scheduled copies and so on. It's what was often called
> "base" or "the normal base" in WC-1.
>
>
>
> ### Maybe rationalize these too.
>
> svn_wc__has_magic_property
> svn_wc_has_binary_prop
> svn_wc__internal_propget
> svn_wc__internal_propdiff
> svn_wc__internal_transmit_prop_deltas
> svn_wc_get_prop_diffs2
> svn_wc_prop_get2
> svn_wc_props_modified_p2
> svn_wc_transmit_prop_deltas2
>
>
Received on 2011-03-17 20:23:16 CET

This is an archived mail posted to the Subversion Dev mailing list.