On Sun, Oct 5, 2008 at 8:22 AM, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
>...
>> In RTC: if the content is never downloaded, or partially downloaded,
>> then the row will have NULL for the STORED_SIZE. When you read the
>> database, you'll immediately know that you don't have valid content.
>> If some content is present, you could actually checksum it to find
>> that you have the content, but it just didn't get recorded into the
>> database properly.
>
> And even if the checksum doesn't match, you can assume that the
> start-of-file that you have locally *is* valid, and download just the
> rest of the file... (if ra supported that)
Yah. That would work 99% of the time. Then there is the 1% where
*whatever* is sitting there is just flat out wrong, so downloading
"the rest" just gives you so much garbage. So then if you're *doubly*
smart, then you just download the leader part. :-P
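
To make the RTC case quoted above concrete, here is a rough sketch of the "content is there but never got recorded" recovery. Everything in it is an assumption for illustration: the table layout, the column names, the use of SHA-1, and the helper names are hypothetical, not the actual wc-ng schema or API.

```python
import hashlib
import os
import sqlite3


def open_db():
    # Hypothetical schema: one row per pristine text. STORED_SIZE stays
    # NULL until the content is known to be completely on disk.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE pristine (checksum TEXT PRIMARY KEY, stored_size INTEGER)"
    )
    return db


def recover_if_complete(db, checksum, path):
    """Return True if the row records (or can now record) valid content."""
    row = db.execute(
        "SELECT stored_size FROM pristine WHERE checksum = ?", (checksum,)
    ).fetchone()
    if row is None:
        return False              # never heard of this pristine
    if row[0] is not None:
        return True               # already recorded as complete
    if not os.path.exists(path):
        return False              # never downloaded
    with open(path, "rb") as f:
        data = f.read()
    if hashlib.sha1(data).hexdigest() != checksum:
        return False              # partial (or wrong) content on disk
    # The content is fine; it just never got recorded. Record it now.
    db.execute(
        "UPDATE pristine SET stored_size = ? WHERE checksum = ?",
        (len(data), checksum),
    )
    return True
```

Note that a checksum mismatch here only says "not usable as-is". Whether the bytes on disk are a valid prefix is a separate question, which is the 1% problem above.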
>...
>> Shoot... if you think the
>> text base is *that* fragile, then you're gonna have to worry about
>> race conditions. "Oh hey. The text base checksum still matches. Good.
>> <CORRUPTION> Now, let me go ahead and use it." Ooops.
>>
>> So yeah. If the stored size matches what is on disk, then I'm prepared
>> to trust it.
>
> I think current libsvn_wc doesn't do even that (it just trusts the bases
> blindly?). But, since the size check costs nothing, I see no reason not
> to do it.
Well, checking the size *does* cost a stat(). If you look at the mode
argument to svn_wc__db_pristine_check(), you'll see different ways to check for
the presence of a pristine file. One of the modes won't even touch the
file -- it just relies on what is in SQLite. The "single" and "multi"
modes are good for "svn update" where you want to ask "hey. do I have
this file?" ... you can get a "no" answer really fast.
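
A rough sketch of the trade-off, to show where the stat() comes in. This is NOT the real svn_wc__db_pristine_check(): the mode names "db_only" and "stat", the table layout, and the function signature are all made up for illustration.

```python
import os
import sqlite3


def open_db():
    # Hypothetical schema; stored_size is NULL for incomplete content.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE pristine (checksum TEXT PRIMARY KEY, stored_size INTEGER)"
    )
    return db


def pristine_check(db, checksum, path, mode):
    """Return True if the pristine is considered present under `mode`."""
    row = db.execute(
        "SELECT stored_size FROM pristine WHERE checksum = ?", (checksum,)
    ).fetchone()
    if row is None or row[0] is None:
        # Fast "no": the database alone answers, the file is never touched.
        return False
    if mode == "db_only":
        return True               # trust SQLite, no filesystem access
    if mode == "stat":
        try:
            # One stat() per file: does the on-disk size match the record?
            return os.stat(path).st_size == row[0]
        except OSError:
            return False          # recorded, but missing on disk
    raise ValueError("unknown mode: %r" % (mode,))
```

Either way, a missing row or a NULL size gives the "no" without any filesystem access, which is the case "svn update" cares about.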
>...
>> > I thought we wouldn't need to scan because each dir would know what wc
>> > it is part of. (If we're going to put metadata at wc-root by default,
>>
>> How would it know that?
>>
>> In the default case, a versioned dir will NOT have a .svn
>> subdirectory. Also, there will be no central datastore. Everything
>> will be wc-root. So given an arbitrary directory, how do you determine
>> whether it is versioned at ALL, let alone where the wcroot is? Answer:
>> you traverse up the directory tree.
>>
>> During a single svn command, you might hit some file in a subdirectory
>> as one of the arguments. The fact that it is "under" one of the
>> previously-discovered wcroots is NOT sufficient. You still have to
>> scan upwards to find a switched root, or an svn:external or somesuch.
>
> If you have .svn dirs: store the URL given to 'svn checkout' (or the key
> to the WORKING_COPY table).
Assume we don't. There will only be one .svn directory, at the wcroot.
That may point to a central datastore, or it may be the datastore for
the WC. I'm also in favor of an option to completely eliminate the
.svn subdir, but then you can't move your WC around (since the central
datastore identifies it by absolute (unchanging) path).
> If you have a central datastore: do the upwards-scan in the datastore (not in
> the filesystem). (Requires extending the DB to list switched/external
> dirs.)
True. And if we don't find it, then default to a scan for the wcroot.
Note that it's possible to have *some* WCs not recorded in the central
datastore because they're configured to be wcroot-based-datastores.
> If you don't have .svn dirs, and don't have a central datastore: First
> time, scan up the filesystem, no other way. Second time, do you have
> the result of scanning of the parent? If so, you just have to check if
> you are an 'exception' to the parent (such as: completely unrelated,
> independent wc; switched; external; etc.), and wouldn't have to do
> a complete scan.
"Second time [within the process execution]". When you're in the same
process run, yah. Any upward scan will stop when it finds a wcroot, or
it finds a directory that has been seen before.
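
Roughly like this. It is only a sketch under the assumption that a wcroot is marked by a .svn subdirectory and that the "seen" cache lives for one process run; the function name and the cache shape are made up for illustration.

```python
import os


def find_wcroot(path, seen):
    """Scan upward from `path` for a wcroot.

    `seen` maps directory -> wcroot (or None if unversioned) and is
    meant to be shared across lookups within one process run.
    """
    visited = []
    result = None
    cur = os.path.abspath(path)
    while True:
        if cur in seen:
            # Stop early: an earlier scan already passed through here.
            result = seen[cur]
            break
        visited.append(cur)
        if os.path.isdir(os.path.join(cur, ".svn")):
            result = cur          # found a wcroot
            break
        parent = os.path.dirname(cur)
        if parent == cur:
            break                 # hit the filesystem root: not versioned
        cur = parent
    # Remember the answer for every directory walked through.
    for d in visited:
        seen[d] = result
    return result
```

Because the cache is keyed on the exact directories walked, a nested root (an svn:external, say) that sits below an already-known wcroot is still found: its own directories have not been seen yet, so the scan reaches its .svn first.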
>...
>> >> And note that we can't just have db.wc_root ... any given operation
>> >> could cover *multiple* wc roots (think of svn:externals or switched
>> >> subdirs).
>> >
>> > open_many() allows that too, no?
>>
>> Correct. open_many() is intended to (in one pass) find all the common
>> pdh structures. Future requests into the API might need to look up
>> more, but that depends on how you use the API.
>>
>> There is an open question: maybe just open a db handle, and then as
>> you start using it, it will open datastores as necessary, rather than
>> trying to do all that up front. It might simplify the API. (I don't
>> have a good argument right now for opening all the datastores up
>> front)
>
> Verifying that all of them exist and are readable?
Yah. I guess you can get an "early out" before starting any real work.
> I don't see much difference, unless you also want to close them as soon
> as we don't need them (otherwise they'll all be opened soon after the
> start). And then, how do we know that a certain datastore isn't needed
> (at least for now)?
No plans to close them...
> Also, what is 'all datastores' in the metadata-in-.svn case? I'm sure
> you didn't mean to open all of them recursively, but that's what we'll
> need to be able to answer the same questions.
Yeah... there will be a lot of open databases. I'm hoping that will be okay...
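
For what it's worth, the two approaches can live in one handle. Below is a minimal sketch, assuming each datastore is a SQLite file; the class, the method signatures, and the use of a read-write URI to force an error on a missing file are my assumptions for illustration, not the proposed API.

```python
import pathlib
import sqlite3


class WcDb:
    """One handle; datastores are opened on first use and never closed."""

    def __init__(self):
        self._stores = {}  # resolved datastore path -> sqlite3 connection

    def datastore(self, path):
        # Lazy open: connect the first time this datastore is needed.
        resolved = pathlib.Path(path).resolve()
        key = str(resolved)
        conn = self._stores.get(key)
        if conn is None:
            # mode=rw makes SQLite fail instead of creating a missing file.
            conn = sqlite3.connect(resolved.as_uri() + "?mode=rw", uri=True)
            self._stores[key] = conn
        return conn

    def open_many(self, paths):
        # Optional up-front pass: the "early out" before any real work.
        for p in paths:
            self.datastore(p)

    def open_count(self):
        return len(self._stores)
```

Callers that want the early out call open_many() first and get sqlite3.OperationalError for a missing datastore; everyone else just calls datastore() as they go.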
>...
Cheers,
-g
Received on 2008-10-05 18:03:09 CEST