[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svn commit: r1637184 - in /subversion/trunk/subversion: libsvn_fs_fs/ tests/libsvn_fs/

From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
Date: Fri, 7 Nov 2014 00:44:07 +0100

[This goes mostly to Ivan but the last part goes to Brane. Don't want to
sub-thread here.]

On Thu, Nov 6, 2014 at 8:24 PM, Branko Čibej <brane_at_wandisco.com> wrote:

> On 06.11.2014 20:03, ivan_at_apache.org wrote:
> > Author: ivan
> > Date: Thu Nov 6 19:03:31 2014
> > New Revision: 1637184
> >
> > URL: http://svn.apache.org/r1637184
> > Log:
> > Make FSFSv7 repositories always use consistent addressing mode, instead
> of
> > saving revision number from which logical addressing was enabled.
>
>
> Congrats on finally doing something concrete about FSFSv7.
>

+1 on the meta point.

It looks a bit like a rushed commit, though (partly corrected by the
r1637186 commit). Documentation update is missing:

structure:Filesystem format options
    * update addressing mode description

fs_fs.c:read_format()
    * update docstring

transaction.c:store_l2p_index_entry()
    * update docstring

However.
>
> This is a really major functional change. I would have expected some
> discussion on dev@ about why you think this is necessary, or even
> useful, before you committed this. Specifically:
>
> > From the performance point of view there will be no big benefits to
> enable
> > log addressing for an existing repository, because the existing old part
> > of the repository will remain to be addressed physically.
>
> I disagree with your assessment. Certainly, as long as there are "live"
> delta chains in the repository that reach all the way into
> physically-indexed content, there will less performance benefit from
> logical addressing than in a "pure" FSFSv7. But this state will not
> persist "forever", certainly not for actively changed content.
>
> The second problem is your assumption that dumping and loading a
> repository is desired and/or cheap. That really depends on the size of
> the repository (and other factors). By making a dump/load cycle
> mandatory for upgrading a repository to log-addressing mode, you have
> effectively killed this feature. Without any discussion, I might add.
> The fact that logical addressing can be enabled on an existing
> repository without requiring a costly upgrade has been one of the
> *major* points of FSFSv7.
>

I'm -.5 on removing the "mixed addressing" feature for exactly the
points made by Brane. If it would involve massive complications
in other parts of FSFS or otherwise require lots of additional code,
I would be much more willing to accept that restriction.

Right now, I only see one benefit of not having mixed mode repos:
The txn folder does need to be renamed. That is a trivial change and
only of interest for people writing low-level replicators. Both companies
that I'm aware of offering those know about the change.

Apart from that, you create the opportunity to create sharded f7
repositories (create as f6 and immediately upgrade) that will never
use log addressing but still gets the other format improvements.

>
> > On the other hand, consistent addressing allows us to omit some tricky
> code.
>

I completely disagree. It only removed the ~200 LOC (including
the follow-up commit) of the TXN upgrade logic. All other changes
are trivial. And the code in question was heavily commented. If
there were questions / more clarification needed, I'd be happy to
improve it. Maybe we would actually find a loophole ...

This argument is of the same ilk as "code churn." If tricky code were a
> problem, we'd never have brought Subversion to where it is now.
>
> > This also fixes problems with long living svn_fs_t instances during the
> hot
> > repository upgrade in background.
>
> Which problems? You can't just say "fixes problems" without pointing
> them out. As far as I can see, the hot upgrade code is pretty solid.
>

If there were problems with the upgrade code, I'd really like to
know them because I've spent many days trying to come up
with sequences involving mixtures of old and new servers etc,
that could potentially corrupt the repository.

There already is a thread on this and I'd be more than happy to
continue the discussion there.

> Furthermore, hot upgrades are a new feature in FSFSv7: So, if you have
> an issue with that, by all means propose that we forbid hot upgrades, as
> that will not be a regression compared to FSFSv6. Do not rip out a
> completely orthogonal, important feature instead.
>

Brane, you might confuse this with 'svnadmin pack'. The thing
about upgrade is that we have a nasty FS API limitation that
is a missing concept of read sessions / locks. Everyone can
open a repo, keep the svn_fs_t, upgrade the repo and then
*write* to it using the old FS pointer with the old format info.
IOW, we must make sure that writers can happily ignore any
format info as long as they are not trying to write in an even
newer format than specified on disk.

I guess, we never told the users that these hot upgrades are
a risky operation and should be avoided when feasible. We
should make up leeway in 1.9 release notes.

The solution in FSFS is to make the write access miss some
essential data (global ids can no longer be parsed etc.). In
previous releases, that failure can be anything (parser error,
file not found, ...) or may not occur at all (when it is safe to
ignore the new format unless someone actually makes use
of its features). I'm 80% confident that we actually covered
all bases in all previous releases but I will not guarantee it.

For f7, the renamed txn folder makes any write attempts
fail with some "file not found". Attempts to read log. addr.
revs as <f7 fails with a parser error for unpacked and a
"file not found" for packed revs. Moreover, SVN 1.9 cannot
commit to formats > 7 as it re-reads the actual format file
before starting the commit.

-- Stefan^2.
Received on 2014-11-07 00:46:54 CET

This is an archived mail posted to the Subversion Dev mailing list.