Hi all,
Out of curiosity I wrote a script that dumps the Subversion BDB tables
and found an interesting anomaly in the "strings" table:
every string there has a duplicate entry with an empty value.
It is my understanding that "strings" allows duplicate keys so that very
large content can be stored in chunks under the same key. That's fine. But
why does every small string (such as a file name) have a duplicate key? It
looks like a bug to me.
This bug does not prevent normal functioning, because the chunks are
concatenated when read and an empty value does no harm, but from a
performance point of view, having twice as many entries in the btree is
not good.
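To illustrate why the empty chunk is harmless on read, here is a minimal
sketch that does not touch BDB at all. RECORDS and read_string are
hypothetical names of mine; the array only mimics a table with duplicate
keys, and the real read path in Subversion may work differently.

```ruby
# Hypothetical in-memory stand-in for the "strings" table: an ordered list
# of [key, value] records in which the same key may appear several times.
RECORDS = [
  ['0', ''],
  ['0', '((test.txt 5 1.0.1))'],
  ['1', ''],
  ['1', 'aaa']
].freeze

# Concatenate every value stored under +key+, in stored order.
def read_string(records, key)
  records.select { |k, _| k == key }.map { |_, v| v }.join
end

puts read_string(RECORDS, '0')
puts read_string(RECORDS, '1')
```

The empty chunk contributes nothing to the joined result, so the reader
gets the same string either way; the cost is only the extra btree entry.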
Here is what I'm talking about:
=========== nodes ================
k:'0.0.0' v:'((dir 1 / 0 1 0) 0 0 )'
k:'0.0.1' v:'((dir 1 / 5 0.0.0 1 1 1 0 1 0) 0 1 0)'
k:'1.0.1' v:'((file 9 /test.txt 0 1 0 1 0 1 0) 0 1 1)'
k:'next-key' v:'2'
=========== strings ================
k:'0' v:''
k:'0' v:'((test.txt 5 1.0.1))'
k:'1' v:''
k:'1' v:'aaa'
k:'next-key' v:'2'
=========== revisions ================
k:'1' v:'(revision 1 0)'
k:'2' v:'(revision 1 1)'
Note the "strings" table: an empty value is repeated for every string.
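For a larger repository, the same check can be done mechanically on the
dump output. This is only a sketch: it assumes the k:'...' v:'...' line
format printed by my script, and DUMP, parse_dump and
keys_with_empty_duplicate are names made up for this example.

```ruby
# Dump lines copied from the "strings" section above.
DUMP = <<~TEXT
  k:'0' v:''
  k:'0' v:'((test.txt 5 1.0.1))'
  k:'1' v:''
  k:'1' v:'aaa'
  k:'next-key' v:'2'
TEXT

# Parse "k:'...' v:'...'" lines into [key, value] pairs; other lines are skipped.
def parse_dump(text)
  text.each_line
      .map { |line| line.match(/\Ak:'(.*?)' v:'(.*)'\s*\z/) }
      .compact
      .map { |m| [m[1], m[2]] }
end

# Keys that occur more than once and have at least one empty value.
def keys_with_empty_duplicate(pairs)
  pairs.group_by(&:first)
       .select { |_, recs| recs.size > 1 && recs.any? { |_, v| v.empty? } }
       .keys
end

p keys_with_empty_duplicate(parse_dump(DUMP))
```

For the sample above it reports keys 0 and 1, and leaves out "next-key",
which has a single entry.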
My environment:
svn, version 1.6.5 (r38866)
Linux ubuntu 2.6.31-17-generic #54-Ubuntu SMP Thu Dec 10 16:20:31 UTC
2009 i686 GNU/Linux
Here is the script:
===========================================================
#!/usr/bin/ruby
require 'bdb'

# Open the repository's BDB environment (arguments: home, flags, mode).
$env = BDB::Env.open('repo3/db', BDB::INIT_MPOOL, 0)

# Print every key/value pair of one table, duplicates included.
def list_content(file, db_type)
  puts "=========== #{file} ================"
  db = $env.open_db(db_type, file)
  db.each do |k, v|
    puts "k:'#{k}' v:'#{v}'"
  end
  db.close
end

# checksum-reps (not included in the list below)
%w(changes copies nodes node-origins miscellaneous representations
   strings transactions).
  each { |f| list_content(f, BDB::BTREE) }
%w(revisions uuids).
  each { |f| list_content(f, BDB::RECNO) }
===========================================================
--
From RFC 2631: In ASN.1, EXPLICIT tagging is implicit unless IMPLICIT
is explicitly specified
Received on 2009-12-22 17:18:14 CET