[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Pristine store - using it

From: Julian Foad <julian.foad_at_wandisco.com>
Date: Mon, 08 Mar 2010 18:22:17 +0000

On Fri, 2010-03-05, Greg Stein wrote:
> Sigh.
>
> Really, if it was that easy, it would have been fixed a while ago.
> Welcome to libsvn_wc.

Heh. (I wasn't careful enough. But I don't feel bad for trying an easy
fix. There are lots of things that could have easily been fixed a long
time ago and still haven't been.)

I see two Wrong Ways to do this:

(1) Pass the path of the temporary file (through whatever contorted code
and data flows exist) along to install_committed_file().

(2) Ensure that the temporary file has a derived name, so that the same
name can be derived again within (). The name could be based on the
working file path/name/version (as it is in WC-1), or could be based on
the file's SHA-1 checksum (like it will be when properly in the pristine
store) if that is available.

... and one Right Way:

(3) As soon as all the content is written to the temp file, move it
fully into the pristine store, named by its checksum. Then later, in
install_committed_file(), make that pristine text become "this node's
base" by writing its checksum into the node's entry in the DB.

Method (3) is right because the new pristine store can contain both the
old and the new text base simultaneously. We can install a new pristine
text that nothing yet refers to, which can be done regardless of whether
and when the (update) operation completes, and then later make something
refer to it, thus simplifying the loggy requirements.

However, that does beg the question of how and when we will delete the
old text from the store.

If we are going delete it by garbage collection, we must either ensure
that the temporary pristine is known to be "referenced" before it is
actually recorded as the base of any node, or otherwise ensure that no
"garbage collection" can happen while the WC is in such an intermediate
state.

Alternatively, if we are going to delete the old one at the time when
the new one "replaces" it, then we have to "unreference" the old one at
that time, but that's OK as we will have the old SHA1 checksum available
up until that time.

Does this make sense so far?

I'll look at the rest of this later...

> Also take particular notice of process_committed_leaf(). It falls into
> this same mess. It is able to "guess" the filename/checksum, but it
> *does* have to derive it. There is no solid dataflow from delta
> transmission to final installation.
>
> I changed process_committed_leaf() a while back to figure out the
> checksum, and it *can* do that (that's what the SVN_DEBUG section
> finally proves). Because it has a checksum there, it also means that
> the delta transmission could install a pristine, and
> process_committed_leaf can reference it out of the storage.
>
> You may also notice the API gap between transmit and the commit
> finalization. Remember when I said the transmit_text_delta function is
> bogus and should go? This is part of the reason why. The dataflow and
> *assumptions* made are ridiculous, creating offhand dependencies
> across the codebase.
>
> The simplest answer here is: study the use of text_base_path() across
> the codebase before trying to change it. I did this long ago and hit
> the wall of "easy" changes. You're in the hard-to-fix area, so changes
> need to be made with *very* particular care.
>
> -g

Thanks. Will investigate.

- Julian

> On Fri, Mar 5, 2010 at 10:22, Julian Foad <julian.foad_at_wandisco.com> wrote:
> > I (Julian Foad) wrote:
> > [...]
> >> I've converted svn_wc__open_writable_base() to use a generic temp dir
> >> and unique file name, rather than a WC-1-specific path, and made all
> >> three places use it for their WC-1 temp base file, it all still works.
> >>
> >> OK, that bit works. That's a bit neater. Committed in r919413.
> >
> > A problem:
> >
> > In doing this, I made the temporary text-bases live at arbitrary paths
> > rather than the special paths where they lived before.
> >
> > There is still one code path that looks on disk at the special text-base
> > temp path, to see if there is a file there. After this change, there
> > probably won't ever be. The code path is in
> > workqueue.c:install_committed_file() - see it calling
> > svn_wc__text_base_path(..., tmp=TRUE, ...) to find the special path.
> >
> > There is still one code path that *puts* a file at that special place.
> > It is adm_crawler.c:svn_wc__internal_transmit_text_deltas().
> >
> > I'm looking to see whether this is a real problem. It's remotely
> > possible that the two remaining code paths (above) always go together
> > and thus no problem, but that's unlikely. As the code paths are a little
> > hard to follow, I may just end up assuming that there is a problem, and
> > go on to fix it.
> >
> > My first thought on how to solve it is to somehow communicate the info
> > that install_committed_file() needs in another way.
> >
> > (And, for completeness, change svn_wc__internal_transmit_text_deltas()
> > to use an arbitrary path, and delete the "tmp=TRUE" option of
> > svn_wc__text_base_path().)
> >
> > - Julian
> >
> >
> >
Received on 2010-03-08 19:22:56 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.