
Re: Bdb strings anomaly

From: Vadim Chekan <kot.begemot_at_gmail.com>
Date: Tue, 22 Dec 2009 09:26:46 -0800

Yes, I'll try to figure it out. I just wanted to make sure that it
wasn't intentional for whatever reason.
I've already looked at the BDB strings implementation and could not see
anything suspicious. Now it's time for the debugger :)
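
For reference, here is a minimal sketch of the behaviour described in the
quoted report below: rows sharing a key are concatenated on read, so the
extra empty row is harmless but still costs one btree entry per key. It
models the "strings" table as a plain array instead of opening BDB, and
the helper names (read_string, row_stats) are made up for illustration;
they are not Subversion functions.

```ruby
# Rows copied from the "strings" dump in the report (the 'next-key'
# bookkeeping row is left out). Duplicate keys are allowed, and each
# key starts with an empty value.
ROWS = [
  ['0', ''],
  ['0', '((test.txt 5 1.0.1))'],
  ['1', ''],
  ['1', 'aaa'],
].freeze

# Hypothetical helper: rebuild one string by concatenating, in order,
# every row stored under the given key.
def read_string(rows, key)
  rows.select { |k, _| k == key }.map { |_, v| v }.join
end

# Hypothetical helper: per key, count all rows and the empty ones.
def row_stats(rows)
  rows.group_by { |k, _| k }.transform_values do |pairs|
    { rows: pairs.size, empty: pairs.count { |_, v| v.empty? } }
  end
end

puts read_string(ROWS, '0') # the empty row does not change the result
row_stats(ROWS).each do |key, s|
  puts "key #{key}: #{s[:rows]} rows, #{s[:empty]} empty"
end
```

With the sample above, both keys report two rows of which one is empty,
which is the 1 + 1 case for small strings.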

Vadim

On Tue, Dec 22, 2009 at 8:37 AM, C. Michael Pilato <cmpilato_at_collab.net> wrote:
> IIRC, it's not really 2x nodes -- it's one extra row for each string key.
> So if a directory listing would normally consume a single string row, yes,
> there's 1 + 1 = 2 (or, 2x) rows used.  But if a file's contents would consume
> 10 string rows, then it's still just the 1 additional empty row.  THAT
> SAID, it does certainly seem inefficient.
>
> Wanna dive into the code and work up a patch?
>
>
> Vadim Chekan wrote:
>> Hi all,
>>
>> Out of curiosity I wrote a script which dumps the Subversion BDB tables
>> and found an interesting anomaly in the "strings" table.
>> Every string there has a duplicate key with an empty value.
>> It is my understanding that "strings" allows duplicate keys in order
>> to store very large content in chunks under the same key. That's fine.
>> But why does every small string (like a file name) have a duplicate
>> key? Looks like a bug to me.
>> This bug does not prevent normal functioning, because the chunks are
>> concatenated when read and an empty value does no harm, but from a
>> performance point of view, having 2x nodes in the btree is not good.
>>
>> Here is what I'm talking about:
>> =========== nodes  ================
>> k:'0.0.0' v:'((dir 1 / 0  1 0) 0  0 )'
>> k:'0.0.1' v:'((dir 1 / 5 0.0.0 1 1 1 0 1 0) 0  1 0)'
>> k:'1.0.1' v:'((file 9 /test.txt 0  1 0 1 0 1 0) 0  1 1)'
>> k:'next-key' v:'2'
>> =========== strings  ================
>> k:'0' v:''
>> k:'0' v:'((test.txt 5 1.0.1))'
>> k:'1' v:''
>> k:'1' v:'aaa'
>> k:'next-key' v:'2'
>> =========== revisions  ================
>> k:'1' v:'(revision 1 0)'
>> k:'2' v:'(revision 1 1)'
>>
>> Pay attention to the "strings" keys: an empty value is repeated for every string.
>>
>> My environment:
>> svn, version 1.6.5 (r38866)
>> Linux ubuntu 2.6.31-17-generic #54-Ubuntu SMP Thu Dec 10 16:20:31 UTC
>> 2009 i686 GNU/Linux
>>
>> Here is the script:
>> ===========================================================
>> #!/usr/bin/ruby
>> require 'bdb'
>>
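>> # Open the repository's BDB environment (memory pool only).
>> # The arguments are positional: home, flags, mode.
>> $env = BDB::Env.open('repo3/db', BDB::INIT_MPOOL, 0)
>>
>> # Dump every key/value pair of one table.
>> def list_content(file, db_type)
>>     puts "=========== #{file}  ================"
>>     db = $env.open_db(db_type, file)
>>     db.each do |k,v|
>>         puts "k:'#{k}' v:'#{v}'"
>>     end
>>
>>     db.close
>> end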
>>
>> # checksum-reps
>> %w(changes copies nodes node-origins miscellaneous representations
>> strings transactions).
>>     each{|f| list_content(f, BDB::BTREE) }
>>
>> %w(revisions uuids).
>>    each{|f| list_content(f, BDB::RECNO) }
>> ===========================================================
>>
>>
>
>
> --
> C. Michael Pilato <cmpilato_at_collab.net>
> CollabNet   <>   www.collab.net   <>   Distributed Development On Demand
>
>

-- 
From RFC 2631: In ASN.1, EXPLICIT tagging is implicit unless IMPLICIT
is explicitly specified
Received on 2009-12-22 18:27:23 CET

This is an archived mail posted to the Subversion Dev mailing list.