[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Implement major FSFS performance related changes in the experimental FSX format

From: Ivan Zhakov <ivan_at_visualsvn.com>
Date: Thu, 18 Sep 2014 16:26:09 +0400

On 11 September 2014 18:03, C. Michael Pilato <cmpilato_at_collab.net> wrote:

> Ivan: As a reviewer of the code in question, there's really no way
> you could have established (for yourself) a legitimate objection
> about the "general quality" of the code without first spotting
> some specific things that bothered you. Of course, if while
> reviewing the code you spotted so many such specific things
> that you lost count or couldn't keep track, that general
> concern would emerge.

This is a case when the code contains so many complications and
unclear design and implementation issues that nobody can effectively
review it. To show this I'll just retell the story.

1. We have discussed on Berlin 2013 hackathon that the log-addressing
feature should be implemented in an experimental fs-type [1].
stefan2 expressed that while he is confident that FSFSv7 is solid code,
it's also quite critical and could easily take a year or more to fully
stabilize. Attendees felt that perhaps it would be best to introduce FSFSv7
as a new, experimental fs-type. Stefan said he had been thinking
about the same thing himself, even considering a different name for
his implementation.

Then the plan was changed but there are no any footprints on the
dev list about the reasons to change the direction.

2. The log-addressing feature was implemented in the branch. The branch
was proposed for review in [2].

At this point I volunteered to review the code. My initial feeling
was that I'm about '-0.5' to merge this branch. The reasons were
the following:
1) I do not think that the value of the log-adressing feature worth
   the code burden it produces.
2) I think that code is over-complicated and too error-prone.

I raised my concerns in that thread.

3. I suggested to use a 3-vote policy for this branch but this suggestion
was not actually implemented.

The formal vote thread was not created. There is a single informal '+1'
to merge this branch. There are also statements that two other committers
have started the review but there are no any footprints in the dev list
that they have even finished it.

Then the log-addressing feature was pushed to the trunk without any notice
in the dev list.

I'm not a big fan of the 3-vote policy for branches but I do not understand
why this policy was not actually applied for this particular branch.

4. On the Berlin 2014 hackathon serious design flaw and significant
performance degradation have found in the log-addressing feature.

I'm talking about storing indexes in separate files. This is a big issue
and it is not a surprise that this issue was not found before merging
to trunk. I treat this as a sign that nobody other that an sole author
understands what actually happens in the code.

5. Accidentally, significant new issues were detected in the revprop-caching
feature (see [3], [4], [5] etc).

It does matter because the revprop-caching was implemented with the same
level of overconfidence as the discussed log-addressing feature. We have
got enormous number of issues in the revprop-caching feature.

The frightening thing is that the already released feature was
disabled in the trunk and we are going to disable it in the next
patch release. This is an exceptional case. I do not remember
other cases when the released feature becomes disabled in the patch

Once again. This is an exceptional case and I do not see any reasons
to be carefree and say something like 'bugs always happen'.

6. At this point I have switched to '-1' about the log-addressing feature.

It's obvious that we will be unable to easily disable the log-addressing
feature as we have done for the revprop-caching. And we can get many
repositories corrupted. And the format can be changed (as it happened
at the Berlin hackathon when the feature is supposed to be properly
reviewed and already merged to trunk). And we will be obliged to support
repositories with the current log-addressing format forever or ask users
to dump repositories using svn 1.9 and load using svn 1.10.

These are the main reasons why I'm vetoing this feature from being
released in its current state. Also I still think that there are
no performance benefits for the majority of users.

This is the full story. I personally think that this is far enough
for the veto.

The current question is the following: is it a legitimate veto or not?

I think it is fully legitimate. The topic is discussed for the long
time but nobody wants to put an additional '+1' for the discussed
code (formally or informally). I think this happens because of the fact
that the code is too big and too overcomplicated. So we should not
release it in order to prevent the possible disaster.

Do we really need to find the particular bugs in this situation?

What will happen if data corruption or similar issues will be found?

How we will be able to proof that there are no more other
critical issues (when the found ones will be fixed)?

Once again, I know that bugs always happen. But we are talking
about significant *repository* format change. We should
be very sure that everything is fine and the new format is stable
(with all the implementation details).

[1] http://wiki.apache.org/subversion/Berlin2013#FSFSv7_Branch_Reintegration

[2] dev_at_s.a.o thread "Log-addressing branch ready for review"

[3] dev_at_s.a.o thread "[RFC] Revision property caching and named atomics"

[4] dev_at_s.a.o thread "named_atomic tests failures on Windows"

[5] http://svn.haxx.se/dev/archive-2014-08/0007.shtml

Ivan Zhakov
Received on 2014-09-18 14:27:02 CEST

This is an archived mail posted to the Subversion Dev mailing list.