RE: Branching slow 1.8.11 https

From: Bert Huijben <bert_at_qqmail.nl>
Date: Tue, 31 Mar 2015 15:19:47 +0200

> -----Original Message-----
> From: Johan Corveleyn [mailto:jcorvel_at_gmail.com]
> Sent: Tuesday, 31 March 2015 14:13
> To: users_at_subversion.apache.org
> Cc: Bert Huijben; Philip Martin; Ben Reser
> Subject: Re: Branching slow 1.8.11 https
>
> On Tue, Mar 31, 2015 at 2:19 AM, Johan Corveleyn <jcorvel_at_gmail.com>
> wrote:
> > On Sun, Mar 29, 2015 at 7:57 PM, Johan Corveleyn <jcorvel_at_gmail.com> wrote:
> >> On Sat, Mar 28, 2015 at 5:09 PM, Bert Huijben <bert_at_qqmail.nl> wrote:
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Johan Corveleyn [mailto:jcorvel_at_gmail.com]
> >>>> Sent: Friday, 27 March 2015 22:03
> >>>> To: users_at_subversion.apache.org
> >>>> Subject: Branching slow 1.8.11 https
> >>>>
> >>>> Does the following ring a bell for someone?
> >>>>
> >>>> Recently upgraded our server (on Solaris 10 SPARC) from 1.5.4 to
> >>>> 1.8.11 (CollabNet package). Some time after that, we discovered that
> >>>> branching was very slow. I'm talking about pure server-side branching
> >>>> ('svn copy $URL/trunk $URL/branches/br1'). I'm testing with a 1.8.11
> >>>> client (tried both from same machine as the server, and from another
> >>>> machine on the LAN (100 Mbit)).
> >>>>
> >>>> - Branching trunk (containing many directories and files): 6-8 minutes
> >>>> - Branching a subfolder of trunk: 20-30 seconds (still very slow)
> >>>> - Branching a single file is fast (< 0.5s or so).
> >>>>
> >>>> So it seems the performance degrades depending on the depth or size of
> >>>> the tree.
> >>>>
> >>>> Now, it gets more interesting:
> >>>> - The resulting rev file on the server is always very small (as it
> >>>> should be, it contains only a lightweight 'copy' of the trunk node).
> >>>> - Our repos is currently served via https (Apache 2.2.29).
> >>>> - Branching with file:/// urls is fast (branching trunk takes 0.6s).
> >>>> - When starting an svnserve instance serving the same repository, and
> >>>> branching with svn:// urls, it's fast as well (also 0.6s).
> >>>> - We reproduced it on a copy of the production repo.
> >>>> - Experimenting with the test copy, we found that
> >>>> $repos/dav/activities.d contains ~2000 files. When we clear that
> >>>> directory, the branching times go down by more than half (~2 minutes
> >>>> for trunk, ~10s for subdir of trunk --- i.e. still slow, but it
> >>>> definitely has an impact).
> >>>> - With a 1.7 client connecting with neon, the problem is the same.
> >>>> - During the 'svn copy', an httpd child consumes a lot of cpu (around
> >>>> half a core).
> >>>> - There is no authz configured for this repo (SVNPathAuthz off).
> >>>> - Backend is still in 1.5 format (we have not run svnadmin upgrade
> >>>> yet, a dump+load is planned in a couple of weeks).
> >>>>
> >>>> So it seems clearly mod_dav_svn related (and not for instance related
> >>>> to the FSFS backend).
> >>>>
> >>>> I don't think we have anything special in our httpd config:
> >>>> [[[
> >>>> <Location /test_svn>
> >>>> SVNInMemoryCacheSize 131072
> >>>> SVNCacheFullTexts on
> >>>> SVNCacheTextDeltas on
> >>>> SSLRequireSSL
> >>>> AuthName "TEST Subversion Repository"
> >>>> AuthType Basic
> >>>> AuthBasicProvider ldap
> >>>> AuthBasicAuthoritative off
> >>>> AuthLDAPURL "ldap://redacted:389"
> >>>> AuthLDAPBindDN "redacted"
> >>>> AuthLDAPBindPassword redacted
> >>>> Require ldap-group redacted
> >>>> DAV svn
> >>>> SVNPath /path/to/test_repos
> >>>> SVNPathAuthz off
> >>>> </Location>
> >>>> ]]]
> >>>>
> >>>> Any ideas?
> >>>> Why the cpu usage by the server, what's it doing?
> >>>> What is the dav/activities.d directory for? How come it contains so
> >>>> many files? Is it ok to purge the old files from that directory?
> >>>
> >>> Httpd's mod_dav was updated in some recent version to do a full lock
> >>> traversal on copies and moves. I think we already applied some
> >>> optimizations, but the real fix would be that mod_dav shouldn't do this
> >>> work (which our repos layer already does).
> >>>
> >>> I'm not sure in which release we applied the first set of optimizations.
> >>>
> >>
> >> Thanks for refreshing my memory.
> >>
> >> So the problem is known as issue #4531 (server-side copy (over dav)
> >> uses too much memory) [1]. The memory usage issue has been fixed in
> >> SVN 1.8.11 and 1.7.19 (see CHANGES), but a performance problem remains
> >> (copy is no longer O(1), but depends on the size of the tree being
> >> copied). That's a direct violation of one of Subversion's "old selling
> >> points" vs. CVS: that branching / tagging is O(1). Branching / tagging
> >> taking several minutes brings back "fond memories" from CVS' days.
> >>
> >> As Philip pointed out in his last comment on #4531 [2]: "This issue is
> >> related to a change in mod_dav in 2.2.25 to fix PR54610 which
> >> added a walk over the copy source looking for lock tokens." (also
> >> released in 2.4.5; so both httpd 2.2.25+ and 2.4.5+ are affected --
> >> older httpd's won't have this problem I guess).
> >>
> >> Again quoting Philip: "Apache knows in advance that the walk is
> >> redundant in cases such as Subversion's URL-to-URL copy but Subversion
> >> cannot avoid the read access. We should attempt to fix mod_dav to
> >> avoid the walk where possible."
> >>
> >> So my hope rests with Philip and others who might have the necessary
> >> knowledge to fix this in mod_dav. It's really not acceptable that
> >> branching / tagging (or I'm guessing also: moving a large tree with a
> >> server-side move) takes several minutes.
> >>
> >> [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4531
> >> [2] http://subversion.tigris.org/issues/show_bug.cgi?id=4531#desc12
> >
> > I think I've found a workaround: it seems the tree walk by mod_dav is
> > avoided when the request has a header Depth with value 0. I've tried
> > adding
> >
> > <If "%{REQUEST_METHOD} == 'COPY'">
> > RequestHeader set Depth 0
> > </If>
> >
> > to the Location block of SVN, and the copy is fast again! And the good
> > thing is: it's still a fully recursive copy :-) (otherwise it wouldn't
> > be much of a workaround).
> >
> > 'svn copy' time for a very large tree (artificially generated with
> > ~50000 folders and ~250000 files) is now down to 1.5 seconds (still
> > three times slower than the same via file:/// or svn://, but good
> > enough, and not O(sizeof(tree)) anymore).
> >
> > Is this workaround safe? Thoughts?
> > It might even be something that can be exploited by our client, when
> > 'svn copy'ing ... (though a "normal" server-side fix for this problem,
> > within the normal workings of mod_dav, would of course be better
> > still).
>
> It seems this workaround is pretty OK for now (apparently the Subversion
> code on the server ignores the Depth: 0 header for COPY requests, so the
> copy is handled like a normal recursive copy).
>
> Bert suggested on IRC to make the setting of the header also dependent
> on the User-Agent string.
>
> For completeness: I'm now no longer seeing the 1.5 seconds time for
> copying over dav. Today it's more like 0.5 - 0.7 seconds, i.e. the
> same as with file:/// and svn://. Maybe something was slowing down my
> network temporarily yesterday evening.
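
For anyone who wants to try the config-file route with the User-Agent restriction mentioned above, a minimal sketch follows. It assumes httpd 2.4 (for <If> and its expression syntax), that mod_headers is loaded, and that Subversion clients send a User-Agent beginning with "SVN/". That pattern is an assumption, not something confirmed in this thread, so check it against your own access log first.

```apache
<Location /test_svn>
  # ... existing DAV svn / auth directives stay as they are ...

  # Assumption: Subversion clients identify themselves as "SVN/<version> ...".
  # Only for those clients, set Depth: 0 on COPY requests, which (per the
  # observations quoted above) seems to avoid the mod_dav tree walk.
  <If "%{REQUEST_METHOD} == 'COPY' && %{HTTP_USER_AGENT} -strmatch 'SVN/*'">
    RequestHeader set Depth 0
  </If>
</Location>
```

COPY requests from other WebDAV clients are left untouched and keep whatever Depth header they sent.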

[[[
Index: subversion/mod_dav_svn/repos.c
===================================================================
--- subversion/mod_dav_svn/repos.c (revision 1670075)
+++ subversion/mod_dav_svn/repos.c (working copy)
@@ -4447,6 +4447,14 @@
       return NULL;
     }
 
+  if (params->root->info->r->method_number == M_COPY
+      && params->root->info->repos->is_svn_client)
+    {
+      /* We don't need to check if there are locks on the MOD_DAV level,
+         as we can handle that far more efficient on the FS level */
+      depth = 0;
+    }
+
   ctx.params = params;
 
   ctx.wres.walk_ctx = params->walk_ctx;
]]]

This patch implements the same hack at the mod_dav_svn level, without requiring a config file change.

        Bert
Received on 2015-03-31 15:21:07 CEST

This is an archived mail posted to the Subversion Users mailing list.
