On Fri, Nov 7, 2014 at 6:04 PM, Evgeny Kotkov <evgeny.kotkov_at_visualsvn.com> wrote:
> Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com> writes:
> > Let's not repeat the revprop caching debacle. In Berlin this year, you
> > told us that you had identified issues with it and decided to disable it
> > in VisualSVN. Had you told us before 1.8, we might have found that the
> > underlying infrastructure is too restrictive. If not, you definitely
> > have found the init race under Windows; it's manifest in the Apache error
> > logs. Everything that happened this summer wrt. revprop caching
> > could have been dealt with before 1.8.0.
> Not quite so. Ivan was on vacation and completely offline at the moment
> we (Sergey Raevskiy <sergey.raevskiy_at_visualsvn.com> and I) found and
> the problems with revprop caching. This occurred while we were
> a repository replication solution for Subversion 1.8, which happens to
> revprop modifications. We stumbled upon these problems on 19-20 August
> and the report followed immediately.
> Undoubtedly, this could not have happened prior to 1.8.0.
Good to hear from you again!
I do remember when, during the Sheffield hackathon, Ben pointed me
to your post and I was like "Of course that deployment is not supported,
that's documented in the release notes. Look here ... WTF?!" Which was
the exact moment when things got hectic.
BTW, I don't want to shift blame for what happened, it is still my
OTOH, there are a few things to be learned from it:
* The feature came with some restrictions, i.e. single machine,
consistent cache settings in all server processes on that machine.
-> Restrictions must be identified and communicated early on.
-> Work towards removing those restrictions.
* A tendency to make the feature work instead of fail,
i.e. the feature was considered optional and not hardened.
-> Destructive testing, thinking of ways to break things
-> Review & explanation such that people may formulate their own ideas
* The underlying communication mechanism had "deteriorated" over time,
starting with plain APR SHM (did not work on Windows), then switching
to mmap + locking + all sorts of complications.
-> SHM is probably not portable, don't use it.
-> Avoid platform-specific code and data
-> If a part of your design goes through multiple adapting iterations,
re-evaluate at the next higher design level.
* Little feedback on the feature,
apart from "I'm getting svnadmin warnings on r/o repositories"
-> Make it verifiable and testable that the feature is actually active.
-> Actively seek feedback.
-> Give feedback asap.
Wrt. log addressing, I think everyone involved so far did a pretty
good job at addressing these issues to the degree they apply.
For instance, the initial f7 design restrictions were discussed in
Berlin and were resolved soon after. It's not always a
pleasant process and I hope we get better at doing it in the future,
but it is the Right Thing to Do.
Received on 2014-11-07 19:52:39 CET