
Re: svn commit: r35086 - trunk/subversion/libsvn_wc

From: Greg Stein <gstein_at_gmail.com>
Date: Fri, 9 Jan 2009 10:32:11 -0800

On Fri, Jan 9, 2009 at 09:33, Blair Zajac <blair_at_orcaware.com> wrote:
> Greg Stein wrote:
>> If you can show that the performance of selecting N rows from a
>> properties table and constructing a prop hash is the same or faster
>> than deserializing a BLOB from the same row as the node... then, sure.
>> Maybe that makes sense.
>> Also note that these tables are *private* to the WC. We do not have to
>> make any concessions for external scripts, which should be using WC
>> APIs anyways. As such, we are optimizing for overall performance. If
>> we can get clarity at a small perf cost, then sure... we'd do that.
>> But for the moment, I believe that a BLOB is going to be significantly
>> faster than N rows. I'm happy to see somebody demonstrate otherwise
>> tho...
> Isn't this premature optimization? Do we know how much slower using a BLOB
> will be than a separate table? Sure, it will be slower, but how much?
> From what I understand when designing schemas, you start off with a
> normalized one, then denormalize it as necessary for performance.

You can start anywhere you'd like.

Hyrum and I both agree that the BLOB usage looks to be faster and is
better aligned with our expected access patterns (i.e. queries for
multiple prop values occur more often than a query for a single prop;
think "svn:eol-style" and "svn:keywords" on every non-binary file).

You can call it premature optimization, or you can call it experience.
But our gut feeling says "use a BLOB". If your gut says otherwise,
then explain (or demo with code) to Hyrum and me that we're mistaken.

With the blob-based schema, we pull information on a node, and can
tell *without a second SELECT statement* whether any properties exist,
and what their names/values are.
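
For anybody who wants to try the comparison, here is a rough sketch of
the two layouts using Python's sqlite3 module. Everything in it is made
up for illustration: the table names (node_blob, node_norm, prop_norm),
the columns, and the use of JSON as a stand-in for whatever
serialization the WC would really use. It is not the actual wc schema.
It only shows that the BLOB layout answers "does this node have props,
and what are they?" with one SELECT, while the separate-table layout
needs a second one.

```python
import json
import sqlite3


def make_db():
    """Create an in-memory DB holding both hypothetical layouts."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        -- BLOB layout: props are serialized into the node's own row.
        CREATE TABLE node_blob (path TEXT PRIMARY KEY, kind TEXT, props BLOB);
        -- Normalized layout: one row per property in a separate table.
        CREATE TABLE node_norm (path TEXT PRIMARY KEY, kind TEXT);
        CREATE TABLE prop_norm (path TEXT, name TEXT, value TEXT,
                                PRIMARY KEY (path, name));
    """)
    return db


def add_node(db, path, kind, props):
    """Store the same node in both layouts (props is a dict of str -> str)."""
    # JSON is only a stand-in serialization; NULL means "no properties".
    blob = json.dumps(props).encode("utf-8") if props else None
    db.execute("INSERT INTO node_blob VALUES (?, ?, ?)", (path, kind, blob))
    db.execute("INSERT INTO node_norm VALUES (?, ?)", (path, kind))
    db.executemany("INSERT INTO prop_norm VALUES (?, ?, ?)",
                   [(path, name, value) for name, value in props.items()])


def read_blob(db, path):
    """One SELECT: node info and the full prop hash come from the same row."""
    kind, blob = db.execute(
        "SELECT kind, props FROM node_blob WHERE path = ?", (path,)).fetchone()
    props = json.loads(blob.decode("utf-8")) if blob is not None else {}
    return kind, props


def read_norm(db, path):
    """Two SELECTs: one for the node, a second for its N property rows."""
    (kind,) = db.execute(
        "SELECT kind FROM node_norm WHERE path = ?", (path,)).fetchone()
    rows = db.execute(
        "SELECT name, value FROM prop_norm WHERE path = ?", (path,)).fetchall()
    return kind, dict(rows)
```

Both readers return the same (kind, props) pair for a given path, so
wrapping each in a timing loop over a realistic number of nodes would
be one way to put numbers behind either gut feeling.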

Received on 2009-01-09 19:32:28 CET

This is an archived mail posted to the Subversion Dev mailing list.