Mark Phippard wrote:
> On Jan 21, 2008 2:07 PM, C. Michael Pilato <cmpilato_at_collab.net> wrote:
>> Mark Phippard wrote:
>>>> Node-origins in FSFS can be done by writing a single file for each node-id
>>>> which contains the origin value into a 'node-origins' subdirectory. We'd
>>>> probably want to shard that in some fashion -- we have around 5000 origins
>>>> to store for the Subversion source code repository today.
>>> If we have to create an object in the file system for every node in
>>> the repository, is this potentially a problem waiting to happen? Are
>>> we going to be greatly increasing the size of the repository and the
>>> demands on the server file system? An svn import of thousands of
>>> files would have to create thousands of these items too, right?
>> Greatly increasing repository size? Not likely. We're talking about a
>> mapping of node-ids (a single base36 number) to node-revision-ids (one of
>> those dotted triplet thingies).
> But suppose the minimum file/block size is 4 KB. 100k nodes would be
> 400 MB of disk space added to the repository.
Well, I suppose that could be a problem. I mean, we could do other things
here -- for example, we could keep a maximum of 36 of these items at a time
in a file that used the same hash-on-disk format that the revprops items use
and was named NODE-ID[:-1] (the node-id minus its last base36 digit). Or we
could use a fixed-record-size file named the same way. Or ... there are lots
of ways to do this.
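
To put rough numbers on the grouping idea, here is a minimal Python sketch.
It is an illustration only, not anything that exists in the FSFS code. The
helper names (to_base36, bucket_name, disk_usage) are made up, node-ids are
assumed to be handed out contiguously from 0, every file is assumed to
occupy exactly one 4 KB block, and putting single-character node-ids in a
bucket named "0" is just one possible answer for the case where
NODE-ID[:-1] would be empty.

```python
import math

BLOCK_SIZE = 4 * 1024  # assumed minimum allocation unit (the 4 KB figure above)
BASE36_DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_base36(n):
    """Render a non-negative integer as a base36 string, node-id style."""
    if n == 0:
        return "0"
    digits = []
    while n:
        n, rem = divmod(n, 36)
        digits.append(BASE36_DIGITS[rem])
    return "".join(reversed(digits))


def bucket_name(node_id):
    """Name of the file holding node_id's origin record: NODE-ID[:-1].

    A single-character node-id would give an empty name, so those ids
    share a bucket called "0" (an assumption made for this sketch).
    """
    return node_id[:-1] or "0"


def disk_usage(node_count, per_file):
    """Worst-case bytes used if each file takes one whole block."""
    files = math.ceil(node_count / per_file)
    return files * BLOCK_SIZE


if __name__ == "__main__":
    nodes = 100_000
    print("one file per node:", disk_usage(nodes, 1), "bytes")
    print("36 records per file:", disk_usage(nodes, 36), "bytes")
    buckets = {bucket_name(to_base36(n)) for n in range(nodes)}
    print("distinct bucket files:", len(buckets))
```

Under those assumptions, 100k nodes come to roughly 400 MB with one file
per node and roughly 11 MB with 36 records per file.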
>> And yes, an import of 2000 files/dirs would create 2000 of these records.
>> But if in the next commit you changed all 2000 of those things and
>> committed, no new records would be created -- they only show up for brand
>> new lines of history.
> I did grok that part. I assume tagging would not create new files either?
>>> Do we have an official goal to remove SQLite as a dependency now?
>> I'd certainly like to see it happen, but that's just my personal preference.
> Concerns about SQLite specifically? Just a desire to avoid
> dependencies? If we eventually need to add caches to the repository,
> it does not seem like a bad choice.
Fear of the unknown mixed with a desire to reduce dependencies. But those
are mild feelings -- I was fine with having SQLite in place until Glasser
decided that it wasn't going to cut it for the merge tracking stuff. With
that bit removed, we were left with no need at all for SQLite in BDB, and
only this silly little origins index in FSFS. Seems harsh to add a new
dependency for what is now just a trivial little index. But then, maybe
this is exactly the kind of thing SQLite is perfect for. *shrug* No strong
C. Michael Pilato <cmpilato_at_collab.net>
CollabNet <> www.collab.net <> Distributed Development On Demand
Received on 2008-01-21 21:11:45 CET