Re: Compressed Pristines (Summary)
From: Ashod Nakashian <ashodnakashian_at_yahoo.com>
Date: Sat, 31 Mar 2012 09:16:33 -0700 (PDT)
----- Original Message -----
I hadn't known that Fossil uses SQLite, although I was familiar with it (if nothing else, because SQLite uses it!).
So it's fair to say I'm ignorant of the details, but I must say this: a repository, exactly like Git's pack files, doesn't necessarily need good (or any) support for deletion. This is a critical point, and I can see why it might not be obvious at first. At least one person (sorry, I don't readily have a name) raised a possible reinventing-the-wheel flag regarding the proposed pack format and Git's pack file. Git's pack *is* the repository. Technically, a repository needs to at least emulate the behavior of deletes, but deletion performance is not an issue for it, because repositories almost never support arbitrary deletions (that is, history rewriting by removing historical versions of files). Git has some support for history rewriting (typically via rebase), but even then it can defer housekeeping to git-gc, which is a slow and, by definition, offline operation. Compare this with our case where we keep the latest revision (known to us) only and all
If we find some way around this requirement (say, by doing things "quick'n'dirty" and deferring cleanups), then we can probably reuse one of the numerous existing archive formats, be it Git's pack or any other. But that comes at a cost and, unlike Git, we don't have any advantages to offer in return. Git can keep deleted items until git-gc is invoked; should we support something similar, we would need to be consistent and probably support arbitrary revision history, which is out of scope. SQLite (which internally uses a b-tree pointing to fixed-size pages that overflow into linked lists) is designed for fast additions/modifications/deletions of typically tiny data (a row is reasonably assumed to be -much- less than a page in most cases) and *without* promising a compact footprint, which we dearly care about. We would be doing the same operations on KBytes' worth of data for each entry. This is something that we must certainly research more with actual data. However in my mind our
I just wanted to make this clear, to be sure we're not talking at cross purposes at this point.
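
As a possible starting point for the "research with actual data" mentioned above, here is a minimal sketch of one measurement, using only the Python standard library. It stores KByte-sized blobs in an SQLite table, deletes half of them, and records the file size at each step. The table name `pristine`, the file name, and the entry count and size are assumptions made purely for illustration; they are not part of any proposed design, and random bytes stand in for real (already compressed) pristine data.

```python
import os
import sqlite3
import tempfile


def measure_footprint(n_entries=200, entry_size=8 * 1024):
    """Return the DB file size after insert, after delete, and after VACUUM."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "pristines.db")  # hypothetical file name
        conn = sqlite3.connect(path)
        try:
            # Make the usual default explicit: freed pages stay in the file.
            conn.execute("PRAGMA auto_vacuum = NONE")
            conn.execute(
                "CREATE TABLE pristine (id INTEGER PRIMARY KEY, data BLOB)"
            )

            # Incompressible payloads, each larger than one default-size
            # page, so that SQLite has to use overflow pages for them.
            rows = [(i, os.urandom(entry_size)) for i in range(n_entries)]
            conn.executemany("INSERT INTO pristine VALUES (?, ?)", rows)
            conn.commit()
            after_insert = os.path.getsize(path)

            # Delete every other entry, as if those pristines were dropped.
            conn.execute("DELETE FROM pristine WHERE id % 2 = 0")
            conn.commit()
            after_delete = os.path.getsize(path)

            # VACUUM rebuilds the database file without the free pages.
            conn.execute("VACUUM")
            after_vacuum = os.path.getsize(path)
        finally:
            conn.close()
    return after_insert, after_delete, after_vacuum


if __name__ == "__main__":
    inserted, deleted, vacuumed = measure_footprint()
    print("after insert:", inserted)
    print("after delete:", deleted)
    print("after vacuum:", vacuumed)
```

With auto-vacuum off, the file is not expected to shrink after the delete, only after the explicit VACUUM, which rewrites the whole database. That is the same kind of deferred, offline cleanup as git-gc, so a measurement like this only shows where the cost lands; it says nothing yet about how a purpose-built pack format would compare.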
Thanks Stephan! I'm excited about the prospects myself.
-Ash
This is an archived mail posted to the Subversion Dev mailing list.