
Are skels a db-specific thing?

From: Eric W. Sink <eric_at_sourcegear.com>
Date: 2001-08-22 21:59:23 CEST

Pre-message summary: I'm responding to a fragment of an email
from Greg Stein from eight months ago. This topic is completely
non-urgent, and I know you're all busy with M3. If responses are
slow to come, I will certainly understand.

Back in April, Greg Stein wrote:

> After 1.0, I will *help* with replacing the DB backend. I'd like to
> see a SQL backend in there. Some others want a pure-text backend. It
> should all be possible. Our interfaces between the FS core and the
> databases feels pretty trim at the moment, but we'll just have to
> see. (I believe we need to make skel's a DB-specific thing, which
> means the impact could actually be pretty large).

The last sentence in parens is the one which interests me at the
moment. I'd like to understand this issue a little better.

Subversion's use of DB appears to be a very, very simple model. There
is no apparent notion of columns within the tables, as one might expect
in a SQL database. Rather, each "record" is a key and a bunch of bytes.
More specifically, the "bunch of bytes" is a skel.

An alternative implementation to replace DB could simply provide an
alternate way of storing data of identical complexity: Just keep
track of keys and skels.

Slightly more complex of course is the fact that there are five tables.
But this is pretty simple as well.

So, ignoring transactions for a moment, I could implement a replacement
for DB as follows:

        create five directories, one for each of the tables

        store every record as a file in the proper directory.
        the filename is the key, and the contents of the file are
        the skel

        everywhere I see a DB call I would replace it with a simple
        file IO call
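
A minimal sketch of the directory-per-table idea above, in Python. The
table names are placeholders (the message only says there are five), and
hex-encoding the key to get a safe filename is an assumption of this
sketch, not something the message specifies. Keys are assumed non-empty.

```python
import os

# Placeholder names: the message says only that there are five tables.
TABLES = ("table1", "table2", "table3", "table4", "table5")


def create_store(root):
    """Create one directory per table under root."""
    for table in TABLES:
        os.makedirs(os.path.join(root, table), exist_ok=True)


def _record_path(root, table, key):
    """Map (table, key) to a file path; the key is hex-encoded (assumption)."""
    if table not in TABLES:
        raise KeyError(table)
    return os.path.join(root, table, key.encode("utf-8").hex())


def put(root, table, key, skel):
    """Store the skel bytes as the contents of the file named by the key."""
    with open(_record_path(root, table, key), "wb") as f:
        f.write(skel)


def get(root, table, key):
    """Return the skel bytes stored under the key."""
    with open(_record_path(root, table, key), "rb") as f:
        return f.read()
```

Nothing here is atomic or transactional, which matches the "ignoring
transactions" simplification in the text.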

Like I said, I'm ignoring transactions for the moment, to see if I
understand the data model. So now I'll ask: Am I missing something
here?

If the above is true, then the same thing can be done for a SQL
backend:
        
        create five SQL tables
        
        each table will have two columns:

                a key
                a blob

And this would work. However, it would not allow us to take
advantage of the querying capabilities of a SQL db. We would be
using a SQL db in exactly the same fashion as we use Berkeley db.
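
As an illustration, the two-column layout could be sketched with
Python's built-in sqlite3 module. SQLite stands in for "a SQL db" here
purely for convenience; the table names and the column names rec_key
and skel are hypothetical.

```python
import sqlite3

# Placeholder names: the message says only that there are five tables.
TABLES = ("table1", "table2", "table3", "table4", "table5")


def create_tables(conn):
    """Create five tables, each with a key column and a blob column."""
    for table in TABLES:
        conn.execute(
            f'CREATE TABLE IF NOT EXISTS "{table}" '
            "(rec_key TEXT PRIMARY KEY, skel BLOB NOT NULL)"
        )


def put(conn, table, key, skel):
    """Insert or overwrite the skel stored under key."""
    if table not in TABLES:
        raise KeyError(table)
    conn.execute(
        f'INSERT OR REPLACE INTO "{table}" (rec_key, skel) VALUES (?, ?)',
        (key, skel),
    )


def get(conn, table, key):
    """Return the skel bytes for key, or None if there is no such record."""
    if table not in TABLES:
        raise KeyError(table)
    row = conn.execute(
        f'SELECT skel FROM "{table}" WHERE rec_key = ?', (key,)
    ).fetchone()
    return None if row is None else bytes(row[0])
```

The database sees each skel only as opaque bytes, so the only lookup it
can do efficiently is by key.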

In fact, we could get pluggable DB-replacements by creating a very
simple abstraction API. Something like the following would be
approximately sufficient (hand-wave, hand-wave):
        create/delete a db
        begin/abort/commit txn
        put(table, key, skel)
        skel = get(table, key)
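
One hedged way to picture that abstraction, in Python: an abstract
interface plus a toy in-memory implementation. The class and method
names are invented for this sketch. Creating and deleting a db are
left to object construction and disposal, and only one transaction may
be open at a time.

```python
from abc import ABC, abstractmethod


class SkelStore(ABC):
    """Hypothetical pluggable backend: transactions plus put/get of skels."""

    @abstractmethod
    def begin(self): ...

    @abstractmethod
    def commit(self): ...

    @abstractmethod
    def abort(self): ...

    @abstractmethod
    def put(self, table, key, skel): ...

    @abstractmethod
    def get(self, table, key): ...


class MemoryStore(SkelStore):
    """Toy backend that keeps records in a dict keyed by (table, key)."""

    def __init__(self):
        self._committed = {}
        self._pending = None  # writes staged by the open transaction

    def begin(self):
        if self._pending is not None:
            raise RuntimeError("transaction already open")
        self._pending = {}

    def commit(self):
        if self._pending is None:
            raise RuntimeError("no open transaction")
        self._committed.update(self._pending)
        self._pending = None

    def abort(self):
        # Discard staged writes; committed data is untouched.
        self._pending = None

    def put(self, table, key, skel):
        if self._pending is None:
            raise RuntimeError("put outside a transaction")
        self._pending[(table, key)] = skel

    def get(self, table, key):
        if self._pending is not None and (table, key) in self._pending:
            return self._pending[(table, key)]
        return self._committed.get((table, key))
```

A file-based or SQL-based backend would implement the same five
methods, and the rest of the code would not need to know which one is
in use.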

Obviously, another way of doing this would be to replace all of the
implementation in libsvn_fs. This might allow us to make better use
of the storage facilities of the underlying DB. Instead of storing
pairs of keys and skels, each key would correspond to a collection
of columns. The individual atoms in the skels would be placed in
individual columns. This would allow us to query against those
columns, but that's about the only difference. (I admit that it's
a big difference.)
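
To make the contrast concrete, here is a small sketch with Python's
sqlite3 module in which the fields of a record live in their own
columns. The table name and the columns kind and author are invented
for illustration and are not Subversion's actual schema.

```python
import sqlite3


def build_db():
    """Create an in-memory table with one column per (hypothetical) atom."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records "
        "(rec_key TEXT PRIMARY KEY, kind TEXT, author TEXT)"
    )
    conn.executemany(
        "INSERT INTO records (rec_key, kind, author) VALUES (?, ?, ?)",
        [("1", "file", "alice"), ("2", "dir", "bob"), ("3", "file", "bob")],
    )
    return conn


def keys_by_author(conn, author):
    """Query on a column, which a key/blob layout cannot do directly."""
    cur = conn.execute(
        "SELECT rec_key FROM records WHERE author = ? ORDER BY rec_key",
        (author,),
    )
    return [row[0] for row in cur]
```

With the key/blob layout, answering the same question would mean
fetching every record and parsing each skel in application code.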

It looks to me like some implementations of libsvn_fs will want
to use skels. DB is obviously an example. I suspect that an
implementation on plain text files would be another example. Why
reinvent this particular wheel?

However, a SQL-based store is an example where it would *probably*
make more sense to design a mapping which does not use skels.

Disclaimers: I'm just trying to understand the design. No criticism
is implied toward anyone or anything.

So Greg, my question is:

Am I capturing the issues which prompted you to make the remark
about having skels be DB-specific?

Received on Sat Oct 21 14:36:36 2006
