
svn/fs.py and buffering

From: <harry_at_hjackson.org>
Date: Thu, 28 Jun 2012 18:53:33 +0100

Hi all,

I just ran into an interesting "feature" while using svndbadmin. The
subprocess.Popen call in fs.py is unbuffered, and if you run strace on it you
can see thousands of read system calls, each fetching one character at a time.
I'm not sure whether this was a design decision, but it certainly hurts
performance for me when using this command. On a small test repo with the
viewvc database fully purged and no buffering:

time /usr/lib/viewvc/bin/svndbadmin -v update /usr/local/vault/
real 3m47.653s
user 1m6.610s
sys 2m15.869s

With buffering at 4096:
real 1m50.753s
user 0m25.862s
sys 1m1.334s

Note that this is a very small svn repo with only 36 revisions. The following
diff against

subversion/bindings/swig/python/svn/fs.py

shows the simple change needed to get the local system's default buffer
size, 4096 in my case.

117c117
<     p = _subprocess.Popen(cmd, stdout=_subprocess.PIPE,
---
>     p = _subprocess.Popen(cmd, bufsize=-1, stdout=_subprocess.PIPE,
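
To illustrate what the bufsize argument changes, here is a minimal sketch that reads the same child output once unbuffered (bufsize=0) and once with the default buffer (bufsize=-1). It is not the code from fs.py: the child command is a hypothetical stand-in for "diff", the helper name read_lines is made up for this example, and it assumes Python 3's subprocess module. Both runs should return identical data; only the number of underlying read calls differs.

```python
import subprocess
import sys

# Hypothetical child command standing in for "diff": prints 2000 short lines.
CHILD = [sys.executable, "-c", "for i in range(2000): print('line', i)"]


def read_lines(bufsize):
    """Run the child and collect its stdout line by line with the given bufsize."""
    p = subprocess.Popen(CHILD, bufsize=bufsize, stdout=subprocess.PIPE)
    # iter(callable, sentinel) keeps calling readline() until it returns b"" (EOF).
    lines = list(iter(p.stdout.readline, b""))
    p.stdout.close()
    p.wait()
    return lines


# bufsize=0: the pipe is unbuffered, so readline() needs many tiny reads.
unbuffered = read_lines(0)
# bufsize=-1: use the platform's default buffer size, so reads are batched.
buffered = read_lines(-1)

print(len(unbuffered), len(buffered), unbuffered == buffered)
```

Running each variant under strace should show the difference in read call counts, while the returned lines stay the same.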

At first the only reason I could think of for leaving it unbuffered was
large binary files with no newline character, i.e. the risk of slurping a
large file into RAM and blowing something up. But this is running "diff", and
for binary files it only reports whether they differ. For example, two
one-megabyte binary files that differ give me:

$ diff test.img test2.img
Binary files test.img and test2.img differ

So is it safe to add buffering here?
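
On the memory concern, a rough sketch of how a reader can keep memory bounded even when the output contains no newline at all: read from the buffered pipe in fixed-size chunks instead of by line. This is an illustration under assumptions, not what fs.py does. The child command is a hypothetical stand-in that emits 1 MiB without a newline, the helper name count_bytes_chunked is made up, and it assumes Python 3.

```python
import subprocess
import sys

SIZE = 1024 * 1024

# Hypothetical child: writes 1 MiB of data with no newline character.
CHILD = [sys.executable, "-c",
         "import sys; sys.stdout.write('x' * (1024 * 1024))"]


def count_bytes_chunked(chunk_size=4096):
    """Consume the child's stdout in chunks of at most chunk_size bytes."""
    p = subprocess.Popen(CHILD, bufsize=-1, stdout=subprocess.PIPE)
    total = 0
    largest = 0
    while True:
        chunk = p.stdout.read(chunk_size)  # returns b"" at end of stream
        if not chunk:
            break
        total += len(chunk)
        largest = max(largest, len(chunk))
    p.stdout.close()
    p.wait()
    return total, largest


total, largest = count_bytes_chunked()
print(total, largest)
```

With this pattern the reader never holds more than one chunk at a time, regardless of how large the stream is or whether it contains newlines.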
-- 
Harry
Received on 2012-06-28 20:38:35 CEST

This is an archived mail posted to the Subversion Users mailing list.
