Re: pristine store design

From: Greg Stein <gstein_at_gmail.com>
Date: Wed, 17 Feb 2010 14:59:14 -0500

On Wed, Feb 17, 2010 at 08:09, Bert Huijben <bert_at_qqmail.nl> wrote:
>...
>> [ ] It is cheap to tee to a checksumming stream when reading a
>> stored pristine. The I/O is much slower anyway.
> When we are reading the complete stream: Yes.
> I would skip verification if we only do a partial read (if possible).

That's my belief, but testing can/should confirm. I imagine an SSD
"disk" and a good OS buffer cache might actually be pretty damned
fast, which would then lead to a checksum being a "noticable"
percentage of reading the stream.

Of course, we're usually doing *something* with that stream as we read
it, and that invariably is to write it somewhere else (temp file,
network, etc).

>> [ ] It is expensive to *always* want to tee to a checksumming stream,
>> because that defies speed gain when wanting to read just a subsection
>> of a pristine, and defies using OS copy without de-/translation to put
>> a pristine's copy into the WC.
> See previous answer

I doubt we read portions all that often. My naive approach would have
pristine_read() return a checksumming stream. *IF* the stream is read
to completion, then it verifies the result. If the stream is closed
early (due to partial-read), then we just don't validate.

That said... if "always checksumming" proves to be a bit expensive
overall, then we may want to be strategic about *when* to do a
checksum. That would imply a flag to pristine_read() asking it to do a
checksum as we read. (and for known partial-reads, we just set the
flag to FALSE).

>> [ ] We don't care about optimizing for reading subsections of a pristine.
>> We "always" read the whole pristine anyway.
> Not sure.
> We certainly don't read the entire file if we are only trying to check if a
> file has changed or not. (1st byte in a 2GB file has changed?)

Right. I think this kind of goes to "we may want to be strategic about
when to do a checksum verification"

>> [ ] If the checksum is already known and the data exists in the local
>> file system, there is no need to verify the checksum.

I'm assuming you mean copying from a source in the local file system,
and writing that into the pristine store.

Yes, if we have a checksum for that file, then we can just use it
since we (invariably) *just* computed it.

> That depends. If we are going to read it anyway...
> In most cases I agree. The filesystem also has verifications on bit errors.
> But you also have sed -i <.....>

Not all filesystems do. That is why we record checksums for repository
contents. That "one bit error every five years" needs to be
detectable.

>> [ ] If the checksum is already known and the data exists in the local
>> file system, we still have to verify the checksum again via a tempfile
>> unless that local file already *is* a tempfile (and is thus reasonably
>> protected from accidental modification before we finish our copy).
> Agree. Any process can change files in the WC at any time. Even if we
> checksummed and then moved we have a chance of failure.

Yup. Note that we pretty much *always* copy a source file into a temp
area. During that copy, we can checksum and (de)translate the thing.
These kinds of file copies occur all over the WC.

>...
>> [ ] We are totally fine with just checking size (and maybe mtime) of
>> pristines locally, no need to *always* do checksumming when reading.
> +1.
> Just check where it doesn't impact performance. (E.g. incoming updates that
> read the whole file anyway)

I think a flag to pristine_read() may be the way to go.

>> Pristine check.
>> Do we need a "fast" check? Note that it doesn't need to say how it's
>> implemented; just say whether you think we need a presence check that
>> tries
>> to be as fast as possible by assuming the pristine store is consistent.
>> Greg said there is a use case for that lurking, so I suggest keeping it in
>> the design doc but marking it for post-1.7. Yes?

Sure. I think the question that comes us is, "am I missing this
pristine?" Would have to dig more to remember where/why that comes
up.

>> Write.
>> Julian said: there possibly should be only _write(), a simple API that
> takes
>> care of the rest internally.
>> Greg said: _write() has to go away, it has no way of telling whether the
>> write stream was interrupted by error or ended successfully.
>> Greg also said: It is fine to write directly to a pristine file location.
>> Which are/should be true?

Write does have to go away. No getting around that, and I won't
support further changes to the stream API to "fix" that issue. We've
already made the streams more complicated than I think they should be
(the recent: reset, mark, seek, and line functions). They used to just
be read/write/close. Grr.

>> [ ] We will validate pristines when reading anyway, so we can neglect
>> being paranoid about writing to pristine files without losing safety.
> We should not write faulty files. If somebody opens the stream he assumes
> the file is ok and on some OSs he keeps a lock on the file while he as
> reading so nobody can repair it. (And on other OSs the file should never
> change while it is open)

The schema is designed to *allow* for faulty files. The faulty file
will NOT be used because SIZE==null in the database.

If the file is successfully written, *then* we store a value into
SIZE, indicating the file is present and correct.

Of course, if you're overwriting a pristine file, then you better
store SIZE==null before you begin writing.

>> [ ] It is ok to open a write stream to the final location of the
>> pristine file for a given checksum, no need to store in a
>> pristine-tempfile first.
> I think we should say no here, to guarantee a stable behavior. If the file
> is in the pristine store it should be OK. We can't trust the initial write
> to always provide the full file and a valid hash of this file.
> We should calculate ourselves if it is possible that we receive an
> incomplete file.

The schema allows for this, so it can/should be just fine.

> And calculating ourselves defines that we don't know the final location.

Usually, we assemble a file in the temp area that is detranslated. At
that time, we can compute the checksum. And then we install the file
as a pristine.

We might have a file incoming from the repository with a known
checksum. We can spool that directly into the pristine store.

>...
>> [ ] However, we do not provide an API function for that. The only way
>> to create a pristine (for now) will be via _tempdir() and _install().
> +1

I see no reason to limit it this way.

>> [ ] But we will likely see a need at some point, upon which such API
>> will be added. We will change _write() so that it can tell the
>> difference between successful and unsuccessful stream closure.

-1

>...
>> [ ] Callers can also pass any other file locations to _install(), to
>> facilitate OS-copying/moving into the pristine store without de-/
>> translation (when the checksum is known beforehand).
> At which API level do we trust the callers?

This would be a wc_db API, and (thus) trustable within WC.

The APIs for pristine management provided as the WC public API... none
beyond "read", I think.

>
>> [ ] ^ ... copying ...
>> [ ] ^ ... moving ...
> There are not that many operations where this is actually used. Maybe after
> a commit? The pristine store is mostly read-only and no apis should deliver
> the file locations; just streams.

Absolutely never a path. That killed us in wc-1.

>> Separate pristine store?
>...
> So: design needed

Yah. Post-1.7, I say. For now, the pristine store and wc.db are tied
on a 1:1 basis, at the root of each working copy.

>...
>> SHA1/MD5 compat.
>> [ ] When talking to an old server, we use only MD5 checksums to reference
>> the pristine in the pristine store, nevermind hash collisions.

If we don't have a SHA1 handy, then I don't mind this. But I think we
should store stuff using *only* MD5 or *only* SHA1. Mixing them could
end up with duplicates. Not sure if that is a problem, but it feels
like it could be.

>...
>> [ ] When talking to an old server, we use SHA1 to index the pristine
>> store and also store the MD5 checksum for server comm. We create the
>> SHA1 checksum locally after having verified the MD5 checksum.
> +1. This is the original WC-NG plan

I wouldn't go *that* far. There was definitely some hand-waving going
on. The MD5 in the editor API is a severely complicating item. I had
hoped to fix that particular problem with Ev2.

>> [ ] When talking to a 1.7 server, we will receive and use only SHA1
>> checksums and don't store any MD5 checksums.
>
> There is currently no delta editor upgrade for 1.7, so we still have to send
> MD5 to our 1.7 server as it is now. And to wrap old editors with new editors
> (and reversed) we probably have to send both hashes whenever possible.
> (The repository filesystem already stores SHA1 hashes since 1.6 or ?)

I think we need to keep MD5 checksums around, in addition to the SHA1
for compat with the Editor v1 APIs. Sigh.

>...

Cheers,
-g
Received on 2010-02-17 20:59:50 CET

This message: [ Message body ]
Next message: Greg Stein: "Subversion is now an official ASF project!"
Previous message: Greg Stein: "Re: [RFC] Comments on wc-metadata.sql PRISTINE and"
In reply to: Bert Huijben: "RE: pristine store design"
Next in thread: Neels J Hofmeyr: "Re: pristine store design"
Reply: Neels J Hofmeyr: "Re: pristine store design"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]