
Re: RFE: pack revprops shards

From: Hyrum K. Wright <hyrum_at_hyrumwright.org>
Date: Fri, 24 Apr 2009 15:44:53 -0500

Just playing devil's advocate on this issue. I'm not against it by
any means, but have a few questions.

On Apr 24, 2009, at 3:11 PM, Blair Zajac wrote:

> One could also have the revprops stored as single files if they are
> modified after packing. The lookup code would first try to open the
> single revprop file and if it doesn't exist, then it goes to the
> packaged file. The packer could allow for multiple repackings.

In repositories which have a large number of modified-then-packed
revprops, this leads to much more storage, and double the I/O. I'm
kinda wary of this.
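
For concreteness, here is a minimal sketch of the lookup Blair describes. Everything about the on-disk layout is an assumption for illustration only: the 1000-revision shard size, the `<shard>.pack` file name, and the use of JSON as a stand-in for the real revprop encoding are hypothetical and are not the FSFS format.

```python
import json
import os

SHARD_SIZE = 1000  # revisions per shard (assumed for this sketch)


def shard_of(rev):
    return rev // SHARD_SIZE


def single_path(root, rev):
    # Hypothetical location of a per-revision override file.
    return os.path.join(root, "revprops", str(shard_of(rev)), str(rev))


def pack_path(root, rev):
    # Hypothetical location of the packed shard holding this revision.
    return os.path.join(root, "revprops", f"{shard_of(rev)}.pack")


def read_revprops(root, rev):
    """Try the single (post-pack) file first, then fall back to the pack."""
    try:
        with open(single_path(root, rev), encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        pass
    with open(pack_path(root, rev), encoding="utf-8") as f:
        return json.load(f)[str(rev)]


def write_revprops(root, rev, props):
    """Write an override file; the stale copy stays behind in the pack."""
    path = single_path(root, rev)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(props, f)
```

In this sketch the cost I'm wary of is visible: every edited revprop exists twice on disk until the shard is repacked, and every read of an unedited revprop pays for a failed open before it reaches the pack.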

> Mark Phippard wrote:
>> I see a couple of options:
>> 1) Start storing revprops in SQLite. We are already using it for
>> rep-sharing. This removes the need to pack the revprops and possibly
>> even opens the door for future features users have asked for, such as
>> being able to query revprops.

I think this is the best long-term solution. The implementation
should be pretty straightforward; it's just a matter of somebody
picking it up.
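
To make the idea concrete, a rough sketch of what a SQLite-backed revprop store could look like, using Python's standard sqlite3 module. The table name, column names, and helper functions are all made up for illustration; this is not an existing or planned Subversion schema.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical revprop database."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS revprops ("
        " revision INTEGER NOT NULL,"
        " name TEXT NOT NULL,"
        " value TEXT NOT NULL,"
        " PRIMARY KEY (revision, name))"
    )
    return db


def set_revprop(db, rev, name, value):
    # One row per property, so an edit touches a single row
    # instead of rewriting a whole shard.
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT OR REPLACE INTO revprops (revision, name, value)"
            " VALUES (?, ?, ?)",
            (rev, name, value),
        )


def get_revprops(db, rev):
    rows = db.execute(
        "SELECT name, value FROM revprops WHERE revision = ?", (rev,)
    )
    return dict(rows.fetchall())


def revisions_by_author(db, author):
    """Example of the kind of query Mark mentions."""
    rows = db.execute(
        "SELECT revision FROM revprops"
        " WHERE name = 'svn:author' AND value = ? ORDER BY revision",
        (author,),
    )
    return [r[0] for r in rows]
```

With a layout like this there is nothing to pack, and a question such as "which revisions did this author commit" becomes a single SELECT.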

>> 2) Allow packing of revprops, but issue an error if an attempt is
>> made to edit a packed revision. This seems like a pretty small need.
>> Perhaps rev 0 could always be unpacked since there are some tools
>> that store things in the revprops of rev 0.

That's a possibility; it'd be kinda like a permanently disabled
pre-revprop-change hook. However, I can see the complaints from folks
who take this step and then want to disable it. Also, the idea of
special-casing a particular revision raises red flags in my
sense-o-meter.
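
The check itself would be small. A sketch, assuming the filesystem layer tracks the lowest revision that has not been packed yet (called min_unpacked_rev here); the exception and function names are hypothetical, and the rev 0 exemption is exactly the special case I'm uneasy about:

```python
class PackedRevpropError(Exception):
    """Raised when an edit targets a revision whose revprops are packed."""


def check_revprop_edit(rev, min_unpacked_rev, always_unpacked=frozenset({0})):
    """Raise PackedRevpropError if the revprops of rev are packed.

    min_unpacked_rev: lowest revision not yet packed (assumed to be
    tracked by the filesystem layer).
    always_unpacked: revisions kept out of packs, here rev 0 as Mark
    suggests.
    """
    if rev in always_unpacked:
        return
    if rev < min_unpacked_rev:
        raise PackedRevpropError(
            f"revprops of r{rev} are packed and read-only"
        )
```

Note that nothing in this sketch offers a way back: once a shard is packed, the only answer to an edit is an error.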

>> On Fri, Apr 24, 2009 at 3:35 PM, Osvaldo Pinali Doederlein
>> <osvaldo_at_visionnaire.com.br> wrote:
>>> I started this RFE in the Subversion blog: "Packing of the
>>> /db/revprops shards. These are still accumulating hundreds of
>>> thousands of TINY files (avg 150 bytes) in my poor Windows server
>>> (NTFS really doesn't like small files)... with packing, each of
>>> these 1000 prop files would be replaced by a single ~150Kb blob."
>>> Answer from Hyrum Wright: "Revprops are mutable, and as such their
>>> size may change. Modifying a packed revprop would cause the entire
>>> shard to be rewritten, not just the modified value. Aside from the
>>> performance issues, this also causes race conditions when multiple
>>> revprops are being edited at the same time. All of these concerns
>>> mean that packing of revprops probably won't happen any time soon.
>>> What might happen is a migration of revprops to a better storage
>>> mechanism, such as sqlite, though there are no current plans for
>>> that."
>>> Even though revprops packing (with an ideal behavior) is not easy to
>>> implement, it's still a highly desirable feature so I propose
>>> opening a bug to track it. I have some suggestions that may even
>>> make this viable for a 1.6.x update:
>>> - Yes revprops are mutable, but in practice they are mostly read-only
>>> data, particularly for old revisions. It's not uncommon for revprop
>>> changes to be blocked (e.g. SourceForge.net did that until
>>> recently). In many companies this is also mandated. In such cases,
>>> revprops ARE read-only, so revprops could be packed trivially (with
>>> the same simple file layout used by packed revs).
>>> - So, in a first attempt we could have a simple implementation of
>>> packed revprops, with the following constraint: Once a specific
>>> revprops shard is packed, any further attempt to change any of
>>> those revprops will be refused. The admin could use hooks to make
>>> the whole thing smoother, e.g. making sure that complete shards are
>>> only packed if older than a month, or disallowing revprops updates
>>> even in non-packed revisions so the rule is simpler and there are
>>> no update failures.
>>> - Additionally, the update of packed revprops could be supported
>>> (with the same simple storage format) even though it COULD be an
>>> expensive operation (lock the entire shard or even the entire repo,
>>> rewrite it completely). It's a reasonable option if, for somebody's
>>> repo, revprop updates are permitted but not very common. For some
>>> users (like my company) that don't make heavy use of revprops,
>>> those packed shards weigh in at the low hundreds of Kb, so the
>>> brute-force update would still run in a split second with NO
>>> usability disadvantage at all.
>>> - This initial design wouldn't require a new repository format. We
>>> could just have an option like "svnadmin pack --revprops", so by
>>> default (with format=5) only revs are packed, but one could
>>> optionally pack the revprops too. The fsfs layer would have to
>>> detect if a revprops shard is packed, but this is necessary anyway
>>> (just like with packed revs) because the most recent shard is
>>> typically not packed. If a future version of SVN introduces a
>>> smarter storage for packed revprops that can handle frequent
>>> updates with low overhead and no coarse-grained locking, that would
>>> be accommodated and distinguished by a different repository format.
>>> A+
>>> Osvaldo
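
The brute-force update in Osvaldo's third bullet (take a lock, rewrite the whole shard) might look roughly like the sketch below. It is only an illustration: the pack is a JSON file standing in for whatever the real format would be, the function name is invented, and the in-process threading.Lock is a placeholder for a real shard-wide or repository-wide write lock.

```python
import json
import os
import tempfile
import threading

# Placeholder for a real shard-wide or repository-wide write lock.
_shard_lock = threading.Lock()


def rewrite_packed_revprop(pack_file, rev, name, value):
    """Change one property by rewriting the entire packed shard."""
    with _shard_lock:
        # Read the whole shard, however small the change is.
        with open(pack_file, encoding="utf-8") as f:
            shard = json.load(f)
        shard[str(rev)][name] = value
        # Write a complete new copy next to the old one, then swap it
        # into place so readers never see a half-written pack.
        directory = os.path.dirname(os.path.abspath(pack_file))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(shard, f)
        os.replace(tmp, pack_file)
```

The lock serializes concurrent edits, which addresses the race condition, at the price of rewriting the full shard for every single-property change.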
> ------------------------------------------------------
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1897254
> To unsubscribe from this discussion, e-mail: [users-unsubscribe_at_subversion.tigris.org].

Received on 2009-04-24 22:45:52 CEST

This is an archived mail posted to the Subversion Users mailing list.