[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

RE: repository GUIDs

From: John Barstow <John_Barstow_at_gfsg.co.nz>
Date: 2002-12-12 22:16:36 CET

> It feels right, but sometimes it's necessary to cluster an index on an
> Alternative Key or indeed even on a non-unique column rather than the
> Primary Key for various reasons. The reasons in this case are space
> optimization related. Our funky base-whatever IDs are much shorter than
> the number of bytes a GUID has. Monotonically increasing integers also
> use a B+-tree index more efficiently.

A few counter-points here.

Monotonically increasing integers are very difficult to handle in
distributed scenarios. There are inevitably key conflicts whenever inserts
occur.

Not all indexes need to be implemented as B+ trees. I don't have my purple
book handy (the movers are supposed to bring it next week), but there are a
number of indexing schemes that work just fine for GUIDs. I'm thinking
bitmap indices off the top of head.

Monotonically increasing integers as primary keys create insertion hot-spots
in many database implementations, since new records are always inserted at
the end.

Index selectivity is likely to be low anyways. There are likely to be very
few repositories in the majority of installations, meaning that a query
optimizer is unlikely to use the RepositoryID index at all. It's much more
likely to use criteria such as revision number, date ranges, or even
properties to select an index, since those tend to be highly selective.
Even in the case of thousands of repositories, revision number is probably
more selective than repository ID - in such cases, you're likely to have
millions of records in the linked tables.

"Premature optimization is the root of all evil." Don't recall who said it,
but it's true
even for databases. Dropping a poorly selected primary key got me a 40%
performance gain once - it was actually 40% slower to use the index than to
just scan all the records, and we had millions of records.

John C Barstow

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Thu Dec 12 22:11:00 2002

This is an archived mail posted to the Subversion Dev mailing list.