Greg Stein wrote on Sun, 5 Oct 2008 at 09:02 -0700:
> On Sun, Oct 5, 2008 at 8:22 AM, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
> >> In RTC: if the content is never downloaded, or partially downloaded,
> >> then the row will have NULL for the STORED_SIZE. When you read the
> >> database, you'll immediately know that you don't have valid content.
> >> If some content is present, you could actually checksum it to find
> >> that you have the content, but it just didn't get recorded into the
> >> database properly.
> > And even if the checksum doesn't match, you can assume that the
> > start-of-file that you have locally *is* valid, and download just the
> > rest of the file... (if ra supported that)
> Yah. That would work 99% of the time. Then there is the 1% where
> *whatever* is sitting there is just flat out wrong, so downloading
> "the rest" just gives you so much garbage. So then if you're *doubly*
> smart, then you just download the leader part. :-P
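The resume-or-restart decision above can be sketched as follows. This is an illustration only, not libsvn_ra code: the checksum is a stand-in (FNV-1a, not Subversion's real pristine checksum), and both function names are hypothetical. It assumes a prefix checksum was recorded alongside the partial content, which the current schema doesn't provide:

```c
/* Sketch only: the "resume vs. restart" decision for a partially
 * downloaded text base.  FNV-1a stands in for the real checksum;
 * the function names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy checksum over the locally-present prefix. */
static uint64_t
fnv1a(const unsigned char *buf, size_t len)
{
  uint64_t h = 14695981039346656037ULL;
  size_t i;
  for (i = 0; i < len; i++)
    {
      h ^= buf[i];
      h *= 1099511628211ULL;
    }
  return h;
}

/* Return the offset to resume downloading from: the full prefix length
 * if its checksum matches what was recorded, otherwise 0 (refetch
 * everything -- the "1% garbage" case above). */
static size_t
resume_offset(const unsigned char *local, size_t local_len,
              uint64_t recorded_prefix_checksum)
{
  if (local_len > 0
      && fnv1a(local, local_len) == recorded_prefix_checksum)
    return local_len;   /* prefix is valid: fetch only the rest */
  return 0;             /* prefix untrusted: fetch the whole file */
}
```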
> >> Shoot... if you think the
> >> text base is *that* fragile, then you're gonna have to worry about
> >> race conditions. "Oh hey. The text base checksum still matches. Good.
> >> <CORRUPTION> Now, let me go ahead and use it." Ooops.
> >> So yeah. If the stored size matches what is on disk, then I'm prepared
> >> to trust it.
> > I think current libsvn_wc doesn't do even that (it just trusts the bases
> > blindly?). But, since the size check costs nothing, I see no reason not
> > to do it.
> Well, checking the size *does* cost a stat(). If you look at the mode
> argument to svn_wc__db_pristine_check(), you'll see different ways to check for
> the presence of a pristine file. One of the modes won't even touch the
> file -- it just relies on what is in SQLite. The "single" and "multi"
> modes are good for "svn update" where you want to ask "hey. do I have
> this file?" ... you can get a "no" answer really fast.
Actually, I assumed that since we read the file (from disk) already, the
cost of the stat() of the same file would be negligible.
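The size check being discussed amounts to a single stat() against the recorded value. A minimal sketch (this is not svn_wc__db_pristine_check(); the function name and the "-1 means NULL STORED_SIZE" convention are assumptions for illustration):

```c
/* Sketch only: the cheap "does the on-disk size match the recorded
 * size" test.  One stat() answers it; reading or checksumming the
 * file would cost far more.  Not the real pristine-check API. */
#include <assert.h>
#include <sys/stat.h>

/* Return 1 if PATH exists and its size equals RECORDED_SIZE, else 0.
 * A recorded size of -1 stands in for a NULL STORED_SIZE row. */
static int
pristine_size_matches(const char *path, long recorded_size)
{
  struct stat st;

  if (recorded_size < 0)        /* never (fully) downloaded */
    return 0;
  if (stat(path, &st) != 0)     /* file missing */
    return 0;
  return (long)st.st_size == recorded_size;
}
```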
> >> > I thought we wouldn't need to scan because each dir would know what wc
> >> > it is part of. (If we're going to put metadata at wc-root by default,
> >> How would it know that?
> >> In the default case, a versioned dir will NOT have a .svn
> >> subdirectory. Also, there will be no central datastore. Everything
> >> will be wc-root. So given an arbitrary directory, how do you determine
> >> whether it is versioned at ALL, let alone where the wcroot is? Answer:
> >> you traverse up the directory.
> >> During a single svn command, you might hit some file in a subdirectory
> >> as one of the arguments. The fact that it is "under" one of the
> >> previously-discovered wcroots is NOT sufficient. You still have to
> >> scan upwards to find a switched root, or an svn:external or somesuch.
> > If you have .svn dirs: store the URL given to 'svn checkout' (or the key
> > to the WORKING_COPY table).
> Assume we don't. There will only be one .svn directory, at the wcroot.
I assumed the in-tree .svn dirs would just point to the wc root
(basically 'ln -s ../../../../ .svn'). Are they really that bad?
They would save all these upwards-scans-in-the-fs all the time.
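The upwards-scan-in-the-fs being debated is essentially this loop: walk toward the filesystem root until a directory containing a ".svn" entry is found. A self-contained sketch under stated assumptions (fixed-size buffers, '/' separators, and none of the switched-root/external handling the real libsvn_wc code would need; the function names are hypothetical):

```c
/* Sketch only: the upward scan for a wcroot.  Pure string/stat logic;
 * real code must also stop at switched roots and svn:externals. */
#include <assert.h>
#include <string.h>
#include <sys/stat.h>

/* Return 1 if DIR contains a ".svn" entry. */
static int
has_dot_svn(const char *dir)
{
  char probe[1024];
  struct stat st;

  if (strlen(dir) + sizeof("/.svn") >= sizeof(probe))
    return 0;
  strcpy(probe, dir);
  strcat(probe, "/.svn");
  return stat(probe, &st) == 0;
}

/* Copy the nearest enclosing wcroot of START into WCROOT (size LEN);
 * return 1 on success, 0 if no ".svn" is found before the root. */
static int
find_wcroot(const char *start, char *wcroot, size_t len)
{
  char cur[1024];
  char *slash;

  if (strlen(start) >= sizeof(cur))
    return 0;
  strcpy(cur, start);
  for (;;)
    {
      if (has_dot_svn(cur))
        {
          if (strlen(cur) >= len)
            return 0;
          strcpy(wcroot, cur);
          return 1;
        }
      slash = strrchr(cur, '/');
      if (slash == NULL || slash == cur)
        return 0;               /* reached the top without finding .svn */
      *slash = '\0';            /* step up one directory */
    }
}
```

With pointer-to-wcroot .svn dirs (or a stored URL/key) in every versioned dir, this loop collapses to a single lookup, which is the saving being argued for.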
> That may point to a central datastore, or it may be the datastore for
> the WC. I'm also in favor of an option to completely eliminate the
> .svn subdir, but then you can't move your WC around (since the central
> datastore identifies it by absolute (unchanging) path).
> > If have central datastore: do the upwards-scan in the datastore (not in
> > the filesystem). (Requires extending the DB to list switched/external
> > dirs.)
> True. And if we don't find it, then default to a scan for the wcroot.
> Note that it's possible to have *some* WCs not recorded in the central
> datastore because they're configured to be wcroot-based-datastores.
Yes (but see above about the scan).
> >> >> And note that we can't just have db.wc_root ... any given operation
> >> >> could cover *multiple* wc roots (think of svn:externals or switched
> >> >> subdirs).
> >> >
> >> > open_many() allows that too, no?
> >> Correct. open_many() is intended to (in one pass) find all the common
> >> pdh structures. Future requests into the API might need to look up
> >> more, but that depends on how you use the API.
> >> There is an open question: maybe just open a db handle, and then as
> >> you start using it, it will open datastores as necessary, rather than
> >> trying to do all that up front. It might simplify the API. (I don't
> >> have a good argument right now for opening all the datastores up
> >> front)
> > Verifying that all of them exist and are readable?
> Yah. I guess you can get an "early out" before starting any real work.
> > I don't see much difference, unless you also want to close them as soon
> > as we don't need them (otherwise they'll all be opened soon after the
> > start). And then, how do we know that a certain datastore isn't needed
> > (at least for now)?
> No plans to close them...
Then in the typical case we'll have all of them open shortly after we
start. So I'd say, just open them all right away, and save us from the
bookkeeping of opening them lazily.
> > Also, what is 'all datastores' in the metadata-in-.svn case? I'm sure
> > you didn't mean to open all of them recursively, but that's what we'll
> > need to be able to answer the same questions.
> Yeah... there will be a lot of open databases. I'm hoping that will be okay...
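The "open datastores as necessary" alternative amounts to a per-wcroot cache inside the db handle: each datastore is opened on first use and reused afterwards. A sketch with stand-in types (the struct and function names are hypothetical, and malloc stands in for a real sqlite3_open of the per-wcroot database):

```c
/* Sketch only: open-on-demand datastore handles.  The types and names
 * are stand-ins; a real open (e.g. of a SQLite db) would replace the
 * malloc below. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STORES 16

typedef struct datastore_t { char wcroot[256]; } datastore_t;

typedef struct wc_db_t
{
  datastore_t *stores[MAX_STORES];
  int n_stores;
  int n_opens;                  /* counts real "open" operations */
} wc_db_t;

/* Return the datastore for WCROOT, opening it on first request and
 * returning the cached handle on every later one. */
static datastore_t *
db_get_store(wc_db_t *db, const char *wcroot)
{
  int i;
  datastore_t *ds;

  for (i = 0; i < db->n_stores; i++)
    if (strcmp(db->stores[i]->wcroot, wcroot) == 0)
      return db->stores[i];     /* already open: no extra cost */

  if (db->n_stores == MAX_STORES)
    return NULL;
  ds = malloc(sizeof(*ds));
  strncpy(ds->wcroot, wcroot, sizeof(ds->wcroot) - 1);
  ds->wcroot[sizeof(ds->wcroot) - 1] = '\0';
  db->stores[db->n_stores++] = ds;
  db->n_opens++;                /* a real open would happen here */
  return ds;
}
```

Opening everything up front is just calling this for every known wcroot at db-open time; the trade-off in the thread is early failure detection versus many long-lived open databases.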
Received on 2008-10-05 18:30:54 CEST