On Mon, Dec 17, 2012 at 1:29 PM, Branko Čibej <brane_at_wandisco.com> wrote:
> Resent to the correct list.
>
> On 17.12.2012 11:41, Branko Čibej wrote:
> > On 17.12.2012 11:25, Daniel Shahaf wrote:
> >> stefan2_at_apache.org wrote on Mon, Nov 26, 2012 at 00:21:26 -0000:
> >>> Author: stefan2
> >>> Date: Mon Nov 26 00:21:26 2012
> >>> New Revision: 1413451
> >>>
> >>> URL: http://svn.apache.org/viewvc?rev=1413451&view=rev
> >>> Log:
> >>> On the cache-server branch.
> >>>
> >>> * BRANCH-README: add
> >>>
> >>> Added:
> >>> subversion/branches/cache-server/BRANCH-README
> >>>
> >>> Added: subversion/branches/cache-server/BRANCH-README
> >>> URL: http://svn.apache.org/viewvc/subversion/branches/cache-server/BRANCH-README?rev=1413451&view=auto
> >>> ==============================================================================
> >>> --- subversion/branches/cache-server/BRANCH-README (added)
> >>> +++ subversion/branches/cache-server/BRANCH-README Mon Nov 26 00:21:26 2012
> >>> @@ -0,0 +1,109 @@
> >>> +Goal
> >>> +====
> >>> +
> >>> +Provide a stand-alone executable that will provide a svn_cache__t
> >>> +implementation based on a single shared memory. The core data
> >>> +structure and access logic can be taken from / shared with today's
> >>> +membuffer cache. The latter shall remain available as it is now.
> >> memcached solves the problem you're stating above, and it's an
> >> independent third-party project.
>
The ineffectiveness of our use of memcached in 1.6 is what
prompted the development of membuffer in the first place.
Even setting aside the relevant APR bug that got fixed only
recently, memcached has fundamental limitations compared to
a SHM-based implementation:
* Instead of reading directly addressable memory, memcached
requires inter-process calls over TCP/IP. That translates into
~10us latency. The performance numbers I found (~200k
requests/s on larger SMP machines) are an order of magnitude
lower than what membuffer achieves with a single thread.
* More critically, memcached does not support partial data
access, e.g. reading or adding a single directory entry. Every
such operation has to transfer the whole directory, which gives
1.6-esque O(n^2) instead of O(n) runtime for large folders.
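
To make the O(n^2) vs. O(n) point concrete, here is a toy model in
Python. It is not Subversion code: the function names, the dict used
as a stand-in for the remote cache, and the "entries transferred"
metric are all assumptions made up for this sketch. It only counts how
many directory entries have to be moved when n entries are added one
at a time.

```python
def add_entries_whole_value(n):
    """Cache that only supports whole-value get/set (memcached-style).

    Returns the number of directory entries that had to be transferred.
    """
    store = {"dir": []}                   # stand-in for the remote cache
    transferred = 0
    for i in range(n):
        entries = list(store["dir"])      # get: fetch the full directory
        transferred += len(entries)
        entries.append(f"file{i}")
        store["dir"] = entries            # set: send the full directory back
        transferred += len(entries)
    return transferred


def add_entries_partial(n):
    """Cache that allows adding a single entry in place."""
    directory = []                        # stand-in for directly addressable memory
    transferred = 0
    for i in range(n):
        directory.append(f"file{i}")      # touch only the new entry
        transferred += 1
    return transferred
```

In this model, step i of the whole-value variant moves 2i+1 entries,
so adding n entries moves n^2 entries in total, while the partial
variant moves n.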
> >> Your solution is specific to
> >> Subversion (it's in libsvn_subr and is not in the public API). If
> >> you're solving the same problem memcached does, why does your solution
> >> need to be specific to svn? Should it be a standalone tool that
> >> Subversion interfaces to as an optional dependency, and any other
> >> memcached consumer can switch to too?
>
The cache process does not need to be SVN-specific. Other tools
might, in theory, use it as well. However, the server will only
offer a negotiation and locking API, while the actual data access
is done by the client (it's shared memory, after all).
So, 3rd party tools using the SVN cache server would need to
talk to svn_cache__t, for instance.
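
A rough Python sketch of that split, purely for illustration: the
CacheServer class, negotiate(), client_put() and client_get() are
hypothetical names, not anything on the branch, and a threading.Lock
stands in for whatever cross-process locking a real server would have
to provide. The point is only that the server hands out the segment
name and a lock, and the client then reads and writes the memory
itself.

```python
import threading
from multiprocessing import shared_memory


class CacheServer:
    """Hypothetical cache process: owns the segment, hands out name + lock."""

    def __init__(self, size):
        self._shm = shared_memory.SharedMemory(create=True, size=size)
        # Assumption: a real server would need a cross-process lock here.
        self._lock = threading.Lock()

    def negotiate(self):
        # All a client learns: where the memory is and how to lock it.
        return self._shm.name, self._lock

    def shutdown(self):
        self._shm.close()
        self._shm.unlink()


def client_put(name, lock, offset, data):
    shm = shared_memory.SharedMemory(name=name)   # attach to the existing segment
    try:
        with lock:
            # Direct write into shared memory; no call into the server.
            shm.buf[offset:offset + len(data)] = data
    finally:
        shm.close()


def client_get(name, lock, offset, length):
    shm = shared_memory.SharedMemory(name=name)
    try:
        with lock:
            # Copy the bytes out so nothing references the buffer after close().
            return bytes(shm.buf[offset:offset + length])
    finally:
        shm.close()
```

Usage would be along the lines of: create CacheServer(64), call
negotiate() once, then call client_put() / client_get() with the
returned name and lock, and shutdown() at the end.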
> >> I don't mean to discourage you from doing this work; I just wonder
> >> whether the non-public parts of libsvn_subr is the right place for
> >> it to live in.
> > I've been wondering about all this caching, actually. There's memcached,
> > as Daniel mentions, and there's redis, and a bunch of other caching
> > solutions that have different strengths and weaknesses. Yet here we are,
> > reinventing the wheel (and if I read the mails on the topic correctly,
> > having lots of fun while doing that).
>
I'm not reinventing the wheel. I'm constructing a new one because
the old one does not fit.
> > It would be much better if fsfs could be configured to use one of
> > several caching servers and then the administrator would worry about the
> > rest. I think it's perfectly fine to require one of them.
>
Well then, I've got good news for you, sir. SVN has supported
memcached since 1.6. Simply configure it properly.
> > I realize it's too late to do this for 1.8. But I doubt rolling our own
> > cache server makes any kind of sense.
>
It does. Proven with SVN 1.6.
-- Stefan^2.
Received on 2012-12-20 02:08:42 CET