On Thursday, 1 April 2010, Greg Stein wrote:
> 2010/3/31 Jan Horák <horak.honza_at_gmail.com>:
> > 30.3.2010 13:55, Philipp Marek wrote:
> >> * Furthermore, how about allowing the plain data to reside in files?
> >> Would make the database much smaller, and then these data blocks
> >> could possibly be shared among multiple repositories.
> >> (Really easy, too, if they're named by their SHA1, for example).
> >> That should allow for zero-copy IO, too (at least for sending data).
> > The question is how much faster it would be. I would like to make a
> > simple test to simulate this soon and estimate the percentage
> > difference.
> My gut says "not that much faster". In most scenarios, the network
> bandwidth between the client/server will be the bottleneck. Reading
> the data off a disk (rather than from a DB) is not going to make the
> WAN connection any faster.
> On a LAN, you might have enough network bandwidth to see bottlenecks
> on the server's I/O channel, but really... I remain somewhat doubtful.
It's not about the raw speed alone.
Of course, having to tell the database which BLOBs to fetch (they might well
be stored out-of-line anyway, e.g. with PostgreSQL [pg_largeobject, TOAST]),
reading them from a socket into a buffer, and writing that buffer to another
socket, with protocol headers added and stripped at every step, *has* to be
slower than a single zero-copy sendfile() from the filesystem.
It's a bit about latency (which may not be a strong argument, as the
database has to be queried anyway), and about CPU load.
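To illustrate the CPU-load point: the following is a minimal Python sketch of the zero-copy path, using os.sendfile() (the wrapper for sendfile(2)) so that file pages go from the page cache straight to the socket without ever being copied into a user-space buffer. The function name and setup are hypothetical, not anything from Subversion itself.

```python
import os
import socket

def send_blob(conn: socket.socket, path: str) -> int:
    """Stream a file to a client via sendfile(2).

    The kernel copies pages from the page cache directly to the
    socket; no user-space buffer, no per-chunk read()/write()
    round trips as in the database-fetch-then-forward path.
    """
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        sent = 0
        while sent < size:
            # sendfile may send less than requested; loop until done.
            sent += os.sendfile(conn.fileno(), f.fileno(), sent, size - sent)
    return sent
```

Compare this with the database path, where each block is read from the DB socket into a buffer and then written out again, touching every byte in user space at least twice.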
For small setups the HTTP server and the database will sit on the same host;
if we can improve performance by 10%, that means roughly 10% of admins won't
need to buy a faster/larger server.
> I'd go with the "store content in the database" until performance
> figures (or a DBA) demonstrate it is a problem.
Received on 2010-04-02 07:35:57 CEST