Re: revprop packing for fsfs

From: Hyrum K. Wright <hyrum_at_hyrumwright.org>
Date: Wed, 29 Apr 2009 18:33:37 -0500

On Apr 29, 2009, at 5:19 PM, Paul Querna wrote:

> On Tue, Apr 28, 2009 at 11:58 PM, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
>> Peter Samuelson wrote on Tue, 28 Apr 2009 at 16:50 -0500:
>>>
>>> [Paul Querna]
>>>> Last time this was discussed was in October 2008[3], when hwright had
>>>> packed revprops, but later reverted the work in r33724, as revprops
>>>> are mutable.
>>>
>>> Is there a reason to try too hard to avoid allocation holes in the
>>> revprops file? Seems to me you could just align each revision to, say,
>>> 16 or 32 bytes, and if a propedit operation wants to make it too big
>>> to fit, you move it to the end.
>>>
>>> Maybe increment a "hole counter" and if that gets to be too large
>>> a fraction of shard size, take a lock and repack the file.
>>>
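
For illustration, the aligned-slot idea above might look roughly like the
following in-memory sketch. The class and method names are hypothetical, the
"hole counter" is tracked in bytes relative to the file size (an assumption;
the mail speaks of a fraction of shard size), and nothing here is taken from
the actual FSFS code.

```python
ALIGN = 32  # slot granularity in bytes (the mail suggests 16 or 32)


def padded(n):
    """Round n up to the next multiple of ALIGN."""
    return (n + ALIGN - 1) // ALIGN * ALIGN


class PackedRevprops:
    """In-memory model of one packed revprops file (hypothetical names)."""

    def __init__(self):
        self.data = bytearray()  # stands in for the packed file
        self.index = {}          # rev -> (offset, length, capacity)
        self.hole_bytes = 0      # the "hole counter", in bytes

    def _append(self, rev, blob):
        cap = padded(len(blob))
        offset = len(self.data)
        self.data += blob + b"\0" * (cap - len(blob))
        self.index[rev] = (offset, len(blob), cap)

    def set(self, rev, blob):
        if rev in self.index:
            offset, _, cap = self.index[rev]
            if len(blob) <= cap:
                # Fits in the existing slot: overwrite in place.
                self.data[offset:offset + cap] = blob + b"\0" * (cap - len(blob))
                self.index[rev] = (offset, len(blob), cap)
                return
            # Too big: abandon the old slot (now a hole) and move to the end.
            self.hole_bytes += cap
        self._append(rev, blob)

    def get(self, rev):
        offset, length, _ = self.index[rev]
        return bytes(self.data[offset:offset + length])

    def needs_repack(self, max_fraction=0.25):
        """True once holes exceed max_fraction of the file size."""
        return bool(self.data) and self.hole_bytes / len(self.data) > max_fraction
```

A caller would check needs_repack() after each set() and, if it returns
true, take the write lock and rebuild the file. The sketch ignores
concurrency and crash safety entirely.
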
>>
>> How can you do the "repacking" step atomically?
>
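
One common answer to the atomicity question, sketched here only as an
illustration, is to write the compacted data to a temporary file in the same
directory and then rename it over the original. The function name and the
".repack-" prefix are hypothetical; the sketch assumes the caller already
holds a write lock and that the rename is atomic on the underlying
filesystem.

```python
import os
import tempfile


def repack_atomically(path, live_records):
    """Write live_records (an iterable of bytes) to a new file, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".repack-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for record in live_records:
                tmp.write(record)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk first
        os.replace(tmp_path, path)  # readers see either the old or the new file
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This only covers swapping the file itself. It does not address how an
offset index would be kept in step with the new file, which is part of what
makes the flat-file approach harder than it looks.
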
> And this gets to the root of the problem. Either we spend time making
> a half-complete database equivalent for storing props, or just keep it
> simple and use a library for it (SQLite, whatever).
>
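
To make the library option concrete, here is a minimal sketch of revprops
kept in SQLite, using Python's standard sqlite3 module. The table layout and
the function names are assumptions made for illustration; they are not the
schema of any patch in this thread.

```python
import sqlite3


def open_revprops(path):
    """Open (or create) a revprops database; the schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS revprop ("
        " revision INTEGER NOT NULL,"
        " name TEXT NOT NULL,"
        " value BLOB NOT NULL,"
        " PRIMARY KEY (revision, name))"
    )
    return db


def set_revprop(db, revision, name, value):
    """Set one property; SQLite handles block reuse and atomicity."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT OR REPLACE INTO revprop (revision, name, value)"
            " VALUES (?, ?, ?)",
            (revision, name, value),
        )


def get_revprops(db, revision):
    """Return all properties of a revision as a {name: value} dict."""
    rows = db.execute(
        "SELECT name, value FROM revprop WHERE revision = ?", (revision,)
    )
    return dict(rows.fetchall())
```

For example, calling set_revprop(db, 5, "svn:log", b"edited") on an existing
revision overwrites the old log message in a single transaction, with no
hole tracking or repacking logic on our side.
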
>>> This would only be problematic if some automated process were to edit
>>> a lot of revprops from comparatively old revisions. (Which I honestly
>>> don't think is a case worth optimizing for.)
>>>
>>>> My question for the list: Is the design of using SQLite as the primary
>>>> data storage for revprops acceptable?
>>>
>>> A flat file seems preferable, unless there are serious problems with
>>> that which I haven't seen.
>>>
>
> Block re-use and inline editing seem deceptively simple. If you don't
> need partial file editing or block reuse (i.e., you could use an
> append-only schema), then yeah, a flat file seems much more viable, but
> those things are just a deep hole you could avoid by using a pre-built
> library like SQLite (or BerkeleyDB for that matter).
>
> I'm hacking up a patch using SQLite for storing the revprops right
> now; hopefully I'll be able to post it in a few days, once I get some
> more hacking time in on it.

How about the attached? I'm testing it now, but it looks to cut revprop
storage space by ~90%. There are still a few kinks, but if you've got
some time to test it, I'd appreciate it.

-Hyrum

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1987190

Received on 2009-04-30 01:34:37 CEST

This is an archived mail posted to the Subversion Dev mailing list.
