On Thu, 5 Apr 2001, Deven T. Corzine wrote:
>On Wed, 4 Apr 2001, Chip Salzenberg wrote:
>> According to Deven T. Corzine:
>> > On Sun, 1 Apr 2001, Chip Salzenberg wrote:
>> > > OK, I'm going to fix the db problem by eliminating db. :-)
>> > If the idea is to have a text-based format that's more transparent so that
>> > one can identify corruption visually, wouldn't it work as well to have
>> > mechanisms to dump the DB database into one or more such formats? I see
>> > the value in having a readable format, but is it necessarily preferable as
>> > the native format?
>> I think so. I like having all my normal tools (find, grep, etc.) work,
>> without a lot of translation rigamarole.
>I understand that, and there's certainly something to be said for it. At
>the same time, if you're needing to fall back on general-purpose tools like
>that, does that suggest that the application-specific tools are lacking?
>Text-based formats definitely have a convenience factor and a transparency
>that's nice, but you're also taking a performance/efficiency hit, to some
>degree or another. Is it possible to have your cake and eat it too?
I should introduce myself, and then I have some comments.
===== Introduction =====
I've been lurking on the list for, oh, 3 days now. I work for Informix
Software, and one of the areas I deal with is Open Source software.
Specifically, I look after the DBD::Informix module that works with Perl
and DBI. I also tend to get involved in other database-related open
source projects (PHP, Tcl/Tk, Python, ...). I'm afraid I'm only likely
to be able to lurk on this group rather than contributing much more than
the occasional idea or gob of information. If it is relevant, I can
test on Sparc Solaris (currently 7) and/or Linux (currently RH6.2)
without having to negotiate with anyone, and I can sometimes find more
obscure machines to work on -- ask and I'll see what can be done.
===== Commentary =====
I'm not sure how Berkeley DB handles this stuff, but if you were using
Informix C-ISAM as the data access mechanism, then you could have your
cake and eat it. Specifically, C-ISAM stores the data in a .dat file,
and each record has a 'currency marker' at the end. Active records have
a newline '\n' and deleted records have a NUL '\0'. The access
information -- the indexes -- is stored in a separate .idx file, which
contains lots of binary information. If you (a) chose a character
representation for each field and (b) were able to stick with
fixed-length records, then you could have your cake and eat it -- the .dat
files would be legible (the '\0' would be a mild nuisance, but SVN would
seldom be deleting records), and the .idx files would not need to be
viewed. I suspect (b) is too stringent a requirement. C-ISAM does
support variable-length records, but you cannot index the
variable-length portion, and the variable-length data is stored in
the index file, not the data file (don't ask why -- I don't know, and it
stinks). More seriously, C-ISAM is a commercial product, which rules it
out for this project.
However, if the storage manager were designed to handle it, then it
would be possible to have a readable text representation of the data and
a not necessarily readable fast-access mechanism, which I think counts
as having your cake and eating it in this context.
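
To make the layout concrete, here is a minimal sketch in Python of a
data file with fixed-length character records, each ending in a marker
byte ('\n' for active, '\0' for deleted), plus a separate lookup
structure. It only illustrates the idea described above and is not
C-ISAM's actual file format or API; the field widths, the function
names, and the use of an in-memory dict in place of a binary .idx file
are all assumptions made for the example.

```python
import io

# Hypothetical layout: two fixed-width character fields plus a marker byte.
WIDTHS = (8, 20)              # assumed field widths, not taken from C-ISAM
RECLEN = sum(WIDTHS) + 1      # +1 for the trailing marker byte
ACTIVE, DELETED = b"\n", b"\0"


def pack(fields):
    """Pad each field to its fixed width and append the 'active' marker."""
    body = b"".join(f.encode("ascii").ljust(w) for f, w in zip(fields, WIDTHS))
    if len(body) != RECLEN - 1:
        raise ValueError("fields do not fit the fixed-width layout")
    return body + ACTIVE


def append(dat, index, fields):
    """Write a record at the end of the data file; index it by first field."""
    dat.seek(0, io.SEEK_END)
    offset = dat.tell()
    dat.write(pack(fields))
    index[fields[0]] = offset   # the dict stands in for the .idx file


def fetch(dat, index, key):
    """Use the index to seek straight to a record and split its fields."""
    dat.seek(index[key])
    rec = dat.read(RECLEN)
    out, pos = [], 0
    for w in WIDTHS:
        out.append(rec[pos:pos + w].decode("ascii").rstrip())
        pos += w
    return tuple(out)


def delete(dat, index, key):
    """Mark a record deleted by overwriting its marker byte with NUL."""
    offset = index.pop(key)
    dat.seek(offset + RECLEN - 1)
    dat.write(DELETED)
```

With records written this way, every active record is one line of
plain text, so find and grep work directly on the data file, while
lookups go through the index rather than scanning the file.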
Jonathan Leffler (Jonathan.Leffler@Informix.com) #include <disclaimer.h>
Guardian of DBD::Informix v1.00.PC1 -- http://www.perl.com/CPAN
"I don't suffer from insanity; I enjoy every minute of it!"
Received on Sat Oct 21 14:36:28 2006