
Subject: Backends issues overview and SQL backend features

From: Jan Horák <horak.honza_at_gmail.com>
Date: Mon, 04 Jan 2010 17:28:40 +0100

Hi, while preparing the analysis and design of an SQL backend, I have
written up some FSFS and BDB issues, followed by some features that can
be expected from an SQL backend. I would appreciate any reaction to the
following points (sorry for the length; I tried to keep it as short as
possible).

First, the FSFS and BDB issues:
------
* BDB is not portable

* a repository can live on a network file system only with the FSFS
backend

* the repository is generally not human-readable

* FSFS doesn't suffer from portability issues

* adding new indexes to FSFS is not easy

* indexing itself is a relatively complicated feature in FSFS

* any change to the data schema involves relatively many changes in
the code of FSFS and BDB

* in FSFS, finalizing a commit and checking out the HEAD revision are
slow

* generally, no serious reliability problems are known any more.

Now the expected features of the SQL backend:
------
* we cannot expect much better performance, but it can offer a lot of
new possibilities; some of them follow

* Some users just want it - SVN doesn't really need an SQL backend (FSFS
rocks), but it would make a lot of people feel better if their
repositories could be stored in a regular SQL server. Something they
understand and feel comfortable with.

* David Weintraub wrote: In large corporate environments, this can be a
selling point. Typical Pointy Headed Manager's Comment: "SQL! That means
it must be good."

* Mark Phippard: I can picture a large hosting site like SourceForge
using a clustered SQL repository that is front-ended by a large number
of load-balanced Apache servers and getting very good response times.
Since you would get a robust client/server architecture for free with
most SQL engines, it offers a lot more possibilities for intelligently
and safely splitting the workload across machines.

* Adding new indexes is simple

* caching and indexing themselves are better; how much this influences
the overall speed is an open question.

* the SQL engine can be changed without very large changes in the code

* Because DAGs are hard to implement in SQL, we cannot expect operations
to be faster than, or even as fast as, in the FSFS or BDB backends.

* Similarly, we can expect worse results for the database size (indexes
have to be stored alongside the data)

* We can expect a faster HEAD revision checkout and commit finalization
than FSFS offers, so it would be more suitable for large installations
where many readers access the repository.

* Kevin Broderick wrote: … Many, if not all organizations already have
databases of some sort (or of multiple sorts) in place. That usually
implies that the infrastructure around the database - network, server,
backups, admin tools, monitoring, etc. also in place ...

* As mentioned before, an SQL backend would be well suited to larger
installations, and storing a large amount of data in a powerful
database (e.g. Oracle) would not be a problem.

* platform-independent

* easily accessible over the network

* human-readable without the need for db_dump utilities

* Dominic Anello wrote: it can offer a more robust query interface into
the repository (we can use queries like "Where were all modifications to
somefile.h made" or "What tags have been made off of the project-2.3.1
branch" without using the log).
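
To make the points about simple indexes and a richer query interface a
bit more concrete, here is a minimal sketch in Python using the
standard sqlite3 module. Everything in it is an assumption made for
illustration: the tables (revisions, changes), their columns, the index
name, the helper functions and the sample rows are hypothetical and are
not a proposed schema for the backend.

```python
import sqlite3

# Hypothetical schema for illustration only; not a proposed backend design.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revisions (
        rev    INTEGER PRIMARY KEY,
        author TEXT NOT NULL,
        date   TEXT NOT NULL
    );
    CREATE TABLE changes (
        rev           INTEGER NOT NULL REFERENCES revisions(rev),
        path          TEXT NOT NULL,
        action        TEXT NOT NULL,  -- 'A' (add), 'M' (modify), 'D' (delete)
        copyfrom_path TEXT            -- source path when the entry is a copy
    );
    -- Adding an index is a single statement (helps exact-path lookups).
    CREATE INDEX changes_path_idx ON changes(path);
""")

# Made-up sample history.
conn.executemany("INSERT INTO revisions VALUES (?, ?, ?)", [
    (1, "alice", "2010-01-01"),
    (2, "bob",   "2010-01-02"),
    (3, "alice", "2010-01-03"),
    (4, "carol", "2010-01-04"),
    (5, "bob",   "2010-01-05"),
])
conn.executemany("INSERT INTO changes VALUES (?, ?, ?, ?)", [
    (1, "/trunk/somefile.h", "A", None),
    (2, "/trunk/somefile.h", "M", None),
    (3, "/branches/project-2.3.1", "A", "/trunk"),
    (4, "/branches/project-2.3.1/somefile.h", "M", None),
    (5, "/tags/project-2.3.1-rc1", "A", "/branches/project-2.3.1"),
])


def modifications_to(conn, filename):
    """'Where were all modifications to <filename> made?'"""
    return conn.execute(
        "SELECT c.rev, c.path, r.author "
        "FROM changes c JOIN revisions r ON r.rev = c.rev "
        "WHERE c.action = 'M' AND (c.path = ? OR c.path LIKE ?) "
        "ORDER BY c.rev",
        (filename, "%/" + filename),
    ).fetchall()


def tags_made_from(conn, branch_path):
    """'What tags have been made off of <branch_path>?'"""
    rows = conn.execute(
        "SELECT path FROM changes "
        "WHERE action = 'A' AND copyfrom_path = ? AND path LIKE '/tags/%' "
        "ORDER BY rev",
        (branch_path,),
    )
    return [row[0] for row in rows]


print(modifications_to(conn, "somefile.h"))
print(tags_made_from(conn, "/branches/project-2.3.1"))
```

With this sample data the first call lists the modifications in r2 (on
trunk) and r4 (on the branch), and the second returns the single tag
copied from the branch. A real backend would also have to model node
history (the DAG), which this sketch deliberately leaves out.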

As I said, any reaction will be appreciated.

-- 
Regards
Honza Horák
E-mail: horak.honza_at_gmail.com
Received on 2010-01-04 17:29:18 CET

This is an archived mail posted to the Subversion Dev mailing list.