Greg Stein wrote on Sun, 5 Oct 2008 at 09:02 -0700:
> On Sun, Oct 5, 2008 at 8:22 AM, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
> >> In RTC: if the content is never downloaded, or partially downloaded,
> >> then the row will have NULL for the STORED_SIZE. When you read the
> >> database, you'll immediately know that you don't have valid content.
> >> If some content is present, you could actually checksum it to find
> >> that you have the content, but it just didn't get recorded into the
> >> database properly.
> > And even if the checksum doesn't match, you can assume that the
> > start-of-file that you have locally *is* valid, and download just the
> > rest of the file... (if ra supported that)
> Yah. That would work 99% of the time. Then there is the 1% where
> *whatever* is sitting there is just flat out wrong, so downloading
> "the rest" just gives you so much garbage. So then if you're *doubly*
> smart, then you just download the leader part. :-P
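The resume-or-restart decision above can be sketched as follows. This is an illustration only, not libsvn_ra code: the checksum is a stand-in (FNV-1a, not Subversion's real pristine checksum), and both function names are hypothetical. It assumes a prefix checksum was recorded alongside the partial content, which the current schema doesn't provide:

```c
/* Sketch only: the "resume vs. restart" decision for a partially
 * downloaded text base.  FNV-1a stands in for the real checksum;
 * the function names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy checksum over the locally-present prefix. */
static uint64_t
fnv1a(const unsigned char *buf, size_t len)
{
  uint64_t h = 14695981039346656037ULL;
  size_t i;
  for (i = 0; i < len; i++)
    {
      h ^= buf[i];
      h *= 1099511628211ULL;
    }
  return h;
}

/* Return the offset to resume downloading from: the full prefix length
 * if its checksum matches what was recorded, otherwise 0 (refetch
 * everything -- the "1% garbage" case above). */
static size_t
resume_offset(const unsigned char *local, size_t local_len,
              uint64_t recorded_prefix_checksum)
{
  if (local_len > 0
      && fnv1a(local, local_len) == recorded_prefix_checksum)
    return local_len;   /* prefix is valid: fetch only the rest */
  return 0;             /* prefix untrusted: fetch the whole file */
}
```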
> >> Shoot... if you think the
> >> text base is *that* fragile, then you're gonna have to worry about
> >> race conditions. "Oh hey. The text base checksum still matches. Good.
> >> <CORRUPTION> Now, let me go ahead and use it." Ooops.
> >> So yeah. If the stored size matches what is on disk, then I'm prepared
> >> to trust it.
> > I think current libsvn_wc doesn't do even that (it just trusts the bases
> > blindly?). But, since the size check costs nothing, I see no reason not
> > to do it.
> Well, checking the size *does* cost a stat(). If you look at the mode
> argument to svn_wc__db_pristine_check(), you'll see different ways to check for
> the presence of a pristine file. One of the modes won't even touch the
> file -- it just relies on what is in SQLite. The "single" and "multi"
> modes are good for "svn update" where you want to ask "hey. do I have
> this file?" ... you can get a "no" answer really fast.
Actually, I assumed that since we read the file (from disk) already, the
cost of the stat() of the same file would be negligible.
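The size check being discussed amounts to a single stat() against the recorded value. A minimal sketch (this is not svn_wc__db_pristine_check(); the function name and the "-1 means NULL STORED_SIZE" convention are assumptions for illustration):

```c
/* Sketch only: the cheap "does the on-disk size match the recorded
 * size" test.  One stat() answers it; reading or checksumming the
 * file would cost far more.  Not the real pristine-check API. */
#include <assert.h>
#include <sys/stat.h>

/* Return 1 if PATH exists and its size equals RECORDED_SIZE, else 0.
 * A recorded size of -1 stands in for a NULL STORED_SIZE row. */
static int
pristine_size_matches(const char *path, long recorded_size)
{
  struct stat st;

  if (recorded_size < 0)        /* never (fully) downloaded */
    return 0;
  if (stat(path, &st) != 0)     /* file missing */
    return 0;
  return (long)st.st_size == recorded_size;
}
```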
> >> > I thought we wouldn't need to scan because each dir would know what wc
> >> > it is part of. (If we're going to put metadata at wc-root by default,
> >> How would it know that?
> >> In the default case, a versioned dir will NOT have a .svn
> >> subdirectory. Also, there will be no central datastore. Everything
> >> will be wc-root. So given an arbitrary directory, how do you determine
> >> whether it is versioned at ALL, let alone where the wcroot is? Answer:
> >> you traverse up the directory.
> >> During a single svn command, you might hit some file in a subdirectory
> >> as one of the arguments. The fact that it is "under" one of the
> >> previously-discovered wcroots is NOT sufficient. You still have to
> >> scan upwards to find a switched root, or an svn:external or somesuch.
> > If you have .svn dirs: store the URL given to 'svn checkout' (or the key
> > to the WORKING_COPY table).
> Assume we don't. There will only be one .svn directory, at the wcroot.
I assumed the in-tree .svn dirs would just point to the wc root
(basically 'ln -s ../../../../ .svn'). Are they really that bad?
They would save all these upwards-scans-in-the-fs all the time.
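The upwards-scan-in-the-fs being debated is essentially this loop: walk toward the filesystem root until a directory containing a ".svn" entry is found. A self-contained sketch under stated assumptions (fixed-size buffers, '/' separators, and none of the switched-root/external handling the real libsvn_wc code would need; the function names are hypothetical):

```c
/* Sketch only: the upward scan for a wcroot.  Pure string/stat logic;
 * real code must also stop at switched roots and svn:externals. */
#include <assert.h>
#include <string.h>
#include <sys/stat.h>

/* Return 1 if DIR contains a ".svn" entry. */
static int
has_dot_svn(const char *dir)
{
  char probe[1024];
  struct stat st;

  if (strlen(dir) + sizeof("/.svn") >= sizeof(probe))
    return 0;
  strcpy(probe, dir);
  strcat(probe, "/.svn");
  return stat(probe, &st) == 0;
}

/* Copy the nearest enclosing wcroot of START into WCROOT (size LEN);
 * return 1 on success, 0 if no ".svn" is found before the root. */
static int
find_wcroot(const char *start, char *wcroot, size_t len)
{
  char cur[1024];
  char *slash;

  if (strlen(start) >= sizeof(cur))
    return 0;
  strcpy(cur, start);
  for (;;)
    {
      if (has_dot_svn(cur))
        {
          if (strlen(cur) >= len)
            return 0;
          strcpy(wcroot, cur);
          return 1;
        }
      slash = strrchr(cur, '/');
      if (slash == NULL || slash == cur)
        return 0;               /* reached the top without finding .svn */
      *slash = '\0';            /* step up one directory */
    }
}
```

With pointer-to-wcroot .svn dirs (or a stored URL/key) in every versioned dir, this loop collapses to a single lookup, which is the saving being argued for.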
> That may point to a central datastore, or it may be the datastore for
> the WC. I'm also in favor of an option to completely eliminate the
> .svn subdir, but then you can't move your WC around (since the central
> datastore identifies it by absolute (unchanging) path).
> > If have central datastore: do the upwards-scan in the datastore (not in
> > the filesystem). (Requires extending the DB to list switched/external
> > dirs.)
> True. And if we don't find it, then default to a scan for the wcroot.
> Note that it's possible to have *some* WCs not recorded in the central
> datastore because they're configured to be wcroot-based-datastores.
Yes (but see above about the scan).
> >> >> And note that we can't just have db.wc_root ... any given operation
> >> >> could cover *multiple* wc roots (think of svn:externals or switched
> >> >> subdirs).
> >> >
> >> > open_many() allows that too, no?
> >> Correct. open_many() is intended to (in one pass) find all the common
> >> pdh structures. Future requests into the API might need to look up
> >> more, but that depends on how you use the API.
> >> There is an open question: maybe just open a db handle, and then as
> >> you start using it, it will open datastores as necessary, rather than
> >> trying to do all that up front. It might simplify the API. (I don't
> >> have a good argument right now for opening all the datastores up
> >> front)
> > Verifying that all of them exist and are readable?
> Yah. I guess you can get an "early out" before starting any real work.
> > I don't see much difference, unless you also want to close them as soon
> > as we don't need them (otherwise they'll all be opened soon after the
> > start). And then, how do we know that a certain datastore isn't needed
> > (at least for now)?
> No plans to close them...
Then in the typical case we'll have all of them open shortly after we
start. So I'd say, just open them all right away, and save us from the
bookkeeping of opening them lazily.
> > Also, what is 'all datastores' in the metadata-in-.svn case? I'm sure
> > you didn't mean to open all of them recursively, but that's what we'll
> > need to be able to answer the same questions.
> Yeah... there will be a lot of open databases. I'm hoping that will be okay...
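The "open datastores as necessary" alternative amounts to a per-wcroot cache inside the db handle: each datastore is opened on first use and reused afterwards. A sketch with stand-in types (the struct and function names are hypothetical, and malloc stands in for a real sqlite3_open of the per-wcroot database):

```c
/* Sketch only: open-on-demand datastore handles.  The types and names
 * are stand-ins; a real open (e.g. of a SQLite db) would replace the
 * malloc below. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STORES 16

typedef struct datastore_t { char wcroot[256]; } datastore_t;

typedef struct wc_db_t
{
  datastore_t *stores[MAX_STORES];
  int n_stores;
  int n_opens;                  /* counts real "open" operations */
} wc_db_t;

/* Return the datastore for WCROOT, opening it on first request and
 * returning the cached handle on every later one. */
static datastore_t *
db_get_store(wc_db_t *db, const char *wcroot)
{
  int i;
  datastore_t *ds;

  for (i = 0; i < db->n_stores; i++)
    if (strcmp(db->stores[i]->wcroot, wcroot) == 0)
      return db->stores[i];     /* already open: no extra cost */

  if (db->n_stores == MAX_STORES)
    return NULL;
  ds = malloc(sizeof(*ds));
  strncpy(ds->wcroot, wcroot, sizeof(ds->wcroot) - 1);
  ds->wcroot[sizeof(ds->wcroot) - 1] = '\0';
  db->stores[db->n_stores++] = ds;
  db->n_opens++;                /* a real open would happen here */
  return ds;
}
```

Opening everything up front is just calling this for every known wcroot at db-open time; the trade-off in the thread is early failure detection versus many long-lived open databases.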
Received on 2008-10-05 18:30:54 CEST