On Fri, Jan 9, 2009 at 09:33, Blair Zajac <blair_at_orcaware.com> wrote:
> Greg Stein wrote:
>...
>> If you can show that the performance of selecting N rows from a
>> properties table and constructing a prop hash is the same or faster
>> than deserializing a BLOB from the same row as the node... then, sure.
>> Maybe that makes sense.
>>
>> Also note that these tables are *private* to the WC. We do not have to
>> make any concessions for external scripts, which should be using WC
>> APIs anyways. As such, we are optimizing for overall performance. If
>> we can get clarity at a small perf cost, then sure... we'd do that.
>> But for the moment, I believe that a BLOB is going to be significantly
>> faster than N rows. I'm happy to see somebody demonstrate otherwise
>> tho...
>
> Isn't this premature optimization? Do we know how much slower using a BLOB
> will be than a separate table? Sure, it will be slower, but how much?
>
> From what I understand when designing schemas, you start off with a
> normalized one then denormalize it as necessary for performance.

You can start anywhere you'd like.

Hyrum and I both agree that the BLOB usage looks to be faster and is
better aligned with our expected access patterns (i.e. queries for
multiple prop values occur more often than a query for a single prop;
think "svn:eol-style" and "svn:keywords" on every non-binary file).

You can call it premature optimization, or you can call it experience.
But our gut feeling says "use a BLOB". If your gut says otherwise,
then explain (or demo with code) to Hyrum and me that we're mistaken.

With the blob-based schema, we pull information on a node, and can
tell *without a second SELECT statement* whether any properties exist,
and what their names/values are.
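
To make the comparison concrete, here is a minimal sketch of the two
access patterns using Python's sqlite3 module. Everything in it is an
assumption for illustration: the table and column names (node_blob,
node_norm, node_props) are hypothetical, and JSON stands in for
whatever serialization the WC would really use. It shows the shape of
the queries only and says nothing about which one is faster.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Blob-based (hypothetical): serialized props live in the node's row.
    CREATE TABLE node_blob (path TEXT PRIMARY KEY, kind TEXT, props BLOB);
    -- Normalized (hypothetical): one row per property in its own table.
    CREATE TABLE node_norm (path TEXT PRIMARY KEY, kind TEXT);
    CREATE TABLE node_props (path TEXT, name TEXT, value TEXT,
                             PRIMARY KEY (path, name));
""")

props = {"svn:eol-style": "native", "svn:keywords": "Id"}
conn.execute("INSERT INTO node_blob VALUES (?, ?, ?)",
             ("trunk/README", "file", json.dumps(props).encode("utf-8")))
conn.execute("INSERT INTO node_norm VALUES (?, ?)", ("trunk/README", "file"))
conn.executemany("INSERT INTO node_props VALUES (?, ?, ?)",
                 [("trunk/README", n, v) for n, v in props.items()])


def read_node_blob(path):
    """One SELECT: node info and all props arrive in the same row."""
    kind, blob = conn.execute(
        "SELECT kind, props FROM node_blob WHERE path = ?",
        (path,)).fetchone()
    # A NULL blob means "no properties"; no further query is needed.
    if blob is None:
        return kind, {}
    return kind, json.loads(blob.decode("utf-8"))


def read_node_normalized(path):
    """Two SELECTs: one for the node, one for its N property rows."""
    (kind,) = conn.execute(
        "SELECT kind FROM node_norm WHERE path = ?", (path,)).fetchone()
    rows = conn.execute(
        "SELECT name, value FROM node_props WHERE path = ?",
        (path,)).fetchall()
    return kind, dict(rows)
```

Both functions return the same (kind, props) pair for a given path;
wrapping each in a timing loop over a realistically sized table would
be one way to demonstrate the difference either way.
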
Cheers,
-g
Received on 2009-01-09 19:32:28 CET