[PATCH] Introduce per-instance filesystem UUIDs

From: Evgeny Kotkov <evgeny.kotkov_at_visualsvn.com>
Date: Fri, 8 Aug 2014 21:24:12 +0400

Hi,

I would like to propose a patch for the problem discussed in
http://svn.haxx.se/dev/archive-2014-04/0245.shtml

Please see the details below.

Log message:
[[[
Avoid shared data clashes between repositories duplicated using 'hotcopy',
see http://svn.haxx.se/dev/archive-2014-04/0245.shtml

This is not a "scalability issue" (as stated in the referenced thread), but
rather a full-fledged problem. We have an ability to share data between
different objects pointing to same filesystems. The sharing works within
a single process boundary; currently we share locks (svn_mutex__t objects)
and certain transaction data. Accessing shared data requires some sort of
a key and currently we use a filesystem UUID for this purpose. However,
this is *not* good enough for at least a couple of scenarios.

Filesystem UUIDs aren't really unique for every filesystem an end user might
have, because they get duplicated during hotcopy or naive copy (copy-paste).
Whenever we have two filesystems with the same UUIDs open within a single
process, the shared data starts clashing and things can get pretty ugly.
For example, one can experience random errors with parallel commits to two
repositories with the same UUID (hosted by Apache HTTP Server). Another
example was recently mitigated by http://svn.apache.org/r1589653 — we did
encounter a deadlock within nested 'svnadmin freeze' commands executed for
two repositories with the same UUID.

Errors that I witnessed include (but might not be limited to):

- Cannot write to the prototype revision file of transaction '392-ax'
because a previous representation is currently being written by this
process (SVN_ERR_FS_CORRUPT)

- Can't unlock unknown transaction '392-ax' (SVN_ERR_FS_CORRUPT)

- Recursive locks are not supported (SVN_ERR_RECURSIVE_LOCK)
# This used to be deadlock prior to http://svn.apache.org/r1591919

Fix the issue by introducing a concept of "instance UUIDs" on the FS layer.
Basically, this gives us an ability to distinguish filesystem duplicates or
near-duplicates produced via our API (svn_fs_hotcopy3(), to be exact). We
can now have different filesystems with the same "original" UUID, but with
different instance UUIDs. With this concept, it is rather easy to get rid
of the shared data clashes described above.

* subversion/libsvn_fs_fs/fs.h
(SVN_FS_FS__MIN_INSTANCE_UUID_FORMAT): New.
(fs_fs_data_t.instance_uuid): New.

* subversion/libsvn_fs_fs/fs_fs.c
  (read_format, svn_fs_fs__write_format): Support the new 'instance-uuid'
    format option.
  (svn_fs_fs__open): Initialize the instance UUID when it is present and
    fallback to the original UUID of a filesystem whenever the instance
    UUID is absent.
  (upgrade_body, svn_fs_fs__create): Generate a new instance UUID when
    necessary.

* subversion/libsvn_fs_fs/hotcopy.c
(hotcopy_create_empty_dest): Unconditionally generate a new instance UUID.

* subversion/libsvn_fs_fs/fs.c
(fs_serialized_init): Use a combination of two UUIDs as the shared data
key (see the huge comment block about why we concatenate them).

* subversion/tests/cmdline/svnadmin_tests.py
  (check_hotcopy_fsfs_fsx): Allow different 'instance-uuid' format options
    when comparing the db/format contents.
  (freeze_freeze): Do not change UUID of hotcopy for new formats supporting
    instance UUIDs. For new formats, 'svnadmin freeze A (svnadmin freeze B)'
    should not deadlock or error out with SVN_ERR_RECURSIVE_LOCK even if 'A'
    and 'B' share the same UUID.

* subversion/tests/libsvn_fs_fs/fs-fs-pack-test.c
  (write_format): Rework this helper. Before this changeset, we were always
    rewriting 'db/format' contents with predefined values. What we really
    want here is to switch the given filesystem to a sharded layout with a
    specific number of revisions per shard. We might as well achieve this
    by patching the 'layout' format option and doing so would not somehow
    affect the new 'instance-uuid' format option. Implement this logic,
    rename the function into ...
  (patch_format): ...this, and, finally ...
  (create_packed_filesystem): ...update the only caller.
]]]

Are there any objections to this change?

Regards,
Evgeny Kotkov

text/plain attachment: instance-uuid-v1.patch.txt

Received on 2014-08-08 19:25:02 CEST

This message: [ Message body ]
Next message: Branko Čibej: "Re: [PATCH] Introduce per-instance filesystem UUIDs"
Previous message: Stefan Fuhrmann: "Re: svn commit: r1616696 - /subversion/branches/1.8.x/STATUS"
Next in thread: Branko Čibej: "Re: [PATCH] Introduce per-instance filesystem UUIDs"
Reply: Branko Čibej: "Re: [PATCH] Introduce per-instance filesystem UUIDs"
Reply: Stefan Fuhrmann: "Re: [PATCH] Introduce per-instance filesystem UUIDs"
Reply: Ivan Zhakov: "Re: [PATCH] Introduce per-instance filesystem UUIDs"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]