Bdb strings anomaly

From: Vadim Chekan <kot.begemot_at_gmail.com>
Date: Mon, 21 Dec 2009 22:31:03 -0800

Hi all,

Out of curiosity I wrote a script that dumps the Subversion BDB tables
and found an interesting anomaly in the "strings" table.
Every string there has a duplicate record with an empty value.
It is my understanding that "strings" allows duplicate keys so that very
large content can be stored in chunks under the same key. That's fine. But
why does every small string (such as a file name) have a duplicate key? It
looks like a bug to me.
This bug does not prevent normal operation, because the chunks are
concatenated when read and an empty value does no harm, but from a
performance point of view, having twice as many entries in the btree is
not good.
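
To illustrate why the empty duplicate is harmless on read, here is a minimal
sketch. It is not Subversion's actual code: read_string is a hypothetical
helper, and the table is assumed to be an array of [key, value] pairs in
cursor order rather than a real BDB cursor.

```ruby
# Hypothetical helper: rebuild a string by concatenating every value
# stored under one key, in the order the records appear.
# `records` is an array of [key, value] pairs (an assumption standing in
# for a BDB cursor over a table that allows duplicate keys).
def read_string(records, key)
  records.select { |k, _| k == key }.map { |_, v| v }.join
end

# The "strings" records from the dump, minus the next-key bookkeeping entry.
records = [
  ['0', ''],
  ['0', '((test.txt 5 1.0.1))'],
  ['1', ''],
  ['1', 'aaa'],
]

puts read_string(records, '1')
```

The empty chunk contributes nothing to the joined result, so the reader
sees the same string either way; the only cost is the extra record.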

Here is what I'm talking about:
=========== nodes ================
k:'0.0.0' v:'((dir 1 / 0 1 0) 0 0 )'
k:'0.0.1' v:'((dir 1 / 5 0.0.0 1 1 1 0 1 0) 0 1 0)'
k:'1.0.1' v:'((file 9 /test.txt 0 1 0 1 0 1 0) 0 1 1)'
k:'next-key' v:'2'
=========== strings ================
k:'0' v:''
k:'0' v:'((test.txt 5 1.0.1))'
k:'1' v:''
k:'1' v:'aaa'
k:'next-key' v:'2'
=========== revisions ================
k:'1' v:'(revision 1 0)'
k:'2' v:'(revision 1 1)'

Note the keys in the "strings" table: an empty value is repeated for every string.
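
The same check can be done mechanically. The following is only a sketch
under my own assumptions: empty_duplicate_keys is a hypothetical helper, and
the input is the dump above typed in as [key, value] pairs, not data read
from the repository.

```ruby
# Hypothetical helper: return the keys that have more than one record,
# at least one of which holds an empty value.
def empty_duplicate_keys(records)
  records.group_by { |k, _| k }
         .select { |_, pairs| pairs.size > 1 && pairs.any? { |_, v| v.empty? } }
         .keys
end

# The "strings" table exactly as it appears in the dump.
dump = [
  ['0', ''],
  ['0', '((test.txt 5 1.0.1))'],
  ['1', ''],
  ['1', 'aaa'],
  ['next-key', '2'],
]

p empty_duplicate_keys(dump)  # prints ["0", "1"]
```

For this dump both real strings are affected, while next-key has a single
record and is not reported.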

My environment:
svn, version 1.6.5 (r38866)
Linux ubuntu 2.6.31-17-generic #54-Ubuntu SMP Thu Dec 10 16:20:31 UTC
2009 i686 GNU/Linux

Here is the script:
===========================================================
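#!/usr/bin/ruby
# Dump every key/value pair from the tables of a Subversion BDB repository.
require 'bdb'

# Open the Berkeley DB environment of the repository (path relative to cwd).
$env = BDB::Env.open('repo3/db', BDB::INIT_MPOOL, 0)

# Print every record of one table, duplicate keys included.
def list_content(file, db_type)
    puts "=========== #{file} ================"
    db = $env.open_db(db_type, file)
    begin
        db.each do |k, v|
            puts "k:'#{k}' v:'#{v}'"
        end
    ensure
        # Close the table even if iteration raises.
        db.close
    end
end

# Btree tables (checksum-reps is left out of the list).
%w(changes copies nodes node-origins miscellaneous representations
   strings transactions).each { |f| list_content(f, BDB::BTREE) }

# Recno tables.
%w(revisions uuids).each { |f| list_content(f, BDB::RECNO) }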
===========================================================

-- 
From RFC 2631: In ASN.1, EXPLICIT tagging is implicit unless IMPLICIT
is explicitly specified
Received on 2009-12-22 17:18:14 CET

This is an archived mail posted to the Subversion Dev mailing list.
