Re: functions that would help TSVN

From: Stefan Sperling <stsp_at_elego.de>
Date: Tue, 1 Mar 2011 12:45:37 +0100

On Mon, Feb 28, 2011 at 08:35:44PM +0100, Stefan Küng wrote:
> One other thing I'd like to discuss: currently all svn functions use
> streams and provide the data in callbacks to save memory. While I
> fully understand that, I'd like to have at least the
> svn_client_proplist() function to also provide all results in one
> (big) memory hunk. Because right now, to save memory and to avoid
> timeout problems in the callback, svn_client_proplist() does a db
> query for each and every folder and then calls the callback function
> for every folder separately.
> But that is painfully slow if there are hundreds of folders in a
> working copy - one db query for every folder!

This is only true for the BASE tree. For the working tree, we already
use a single query to pull all properties out of the database at once
(see svn_wc__prop_list_recursive, called by svn_client_proplist).
The plan is to use this approach for the BASE tree, too.

> Since most UI clients need all the data in memory anyway, I'd like
> to have a separate svn_client_proplist() API that does *one* db
> query and returns all the results in one go.
> There are several reasons:
> * as mentioned, most UI clients will need all data in memory anyway.
> For example in TSVN I just add the data in the callback to one big
> list/vector/map and start using that data after the function
> returns.

I don't think we need a separate function that does the allocations
on behalf of the callback.
The callback is free to store the data in any way it wants.
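
To illustrate, here is a minimal sketch of a receiver that copies every
property it is handed into one caller-owned array, so the client has
everything in memory once the call returns. The receiver signature and
the names prop_entry, prop_store, collect_receiver and prop_store_clear
are hypothetical and simplified for illustration; they are not the
actual Subversion declarations.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One collected property; every string is a private copy. */
typedef struct {
    char *path;
    char *name;
    char *value;
} prop_entry;

/* Caller-owned accumulator, handed to the receiver as its baton. */
typedef struct {
    prop_entry *items;
    size_t count;
    size_t capacity;
} prop_store;

/* Duplicate a NUL-terminated string; returns NULL if malloc fails. */
char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Hypothetical receiver: appends one property to the store.
 * Returns 0 on success, -1 if memory could not be allocated. */
int collect_receiver(void *baton, const char *path,
                     const char *name, const char *value)
{
    prop_store *store = baton;
    prop_entry e;

    assert(store != NULL);

    /* Grow the array geometrically when it is full. */
    if (store->count == store->capacity) {
        size_t new_cap = store->capacity ? store->capacity * 2 : 16;
        prop_entry *grown = realloc(store->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        store->items = grown;
        store->capacity = new_cap;
    }

    e.path = copy_string(path);
    e.name = copy_string(name);
    e.value = copy_string(value);
    if (e.path == NULL || e.name == NULL || e.value == NULL) {
        free(e.path);
        free(e.name);
        free(e.value);
        return -1;
    }

    store->items[store->count++] = e;
    return 0;
}

/* Release everything the store owns and reset it to empty. */
void prop_store_clear(prop_store *store)
{
    size_t i;
    for (i = 0; i < store->count; i++) {
        free(store->items[i].path);
        free(store->items[i].name);
        free(store->items[i].value);
    }
    free(store->items);
    store->items = NULL;
    store->count = 0;
    store->capacity = 0;
}
```

The client decides on the container and the growth policy here; the
library only has to hand over one item at a time.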

> * it is much faster (and I mean *much* faster here, from several
> seconds or even minutes down to a few milliseconds or maybe two or
> three seconds)
> * in case there's not enough RAM available: I can always tell users
> to install more RAM to get it working. But there's no way to make it
> faster with the current callback implementation - there just are no
> faster hard drives or much faster processors.

If the callback takes care of allocations, it can fail more gracefully
than the libraries can. E.g. the callback could decide to cancel the
operation, or to display data it's already got, free some memory, and
continue.
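
As a rough sketch of that idea, the receiver below tracks a memory
budget and asks the caller to stop once the next item would exceed it,
leaving the data collected so far intact. The return codes, the
budget_baton type and the walk_props driver are hypothetical stand-ins
for illustration only; the real libraries report cancellation through
their own error mechanism.

```c
#include <stddef.h>
#include <string.h>

enum { RECV_CONTINUE = 0, RECV_CANCEL = 1 };

/* Receiver state: how much memory has been used and how much is allowed. */
typedef struct {
    size_t bytes_used;
    size_t byte_budget;
    size_t accepted;
} budget_baton;

/* Hypothetical receiver: accepts an item only if it fits the budget.
 * Assumes bytes_used never exceeds byte_budget. */
int budget_receiver(void *baton, const char *name, const char *value)
{
    budget_baton *b = baton;
    /* Bytes needed to keep both strings, including terminators. */
    size_t needed = strlen(name) + strlen(value) + 2;

    if (needed > b->byte_budget - b->bytes_used)
        return RECV_CANCEL;   /* stop cleanly, keep what we have */

    b->bytes_used += needed;
    b->accepted++;
    return RECV_CONTINUE;
}

/* Stand-in for a library routine that streams items to a receiver and
 * stops as soon as the receiver asks it to. Returns the number of
 * items the receiver accepted before stopping. */
size_t walk_props(const char *const names[], const char *const values[],
                  size_t n,
                  int (*receiver)(void *, const char *, const char *),
                  void *baton)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (receiver(baton, names[i], values[i]) != RECV_CONTINUE)
            break;
    }
    return i;
}
```

A client could react to RECV_CANCEL by displaying what it has, freeing
memory, and restarting from where it stopped, none of which a library
allocating on the client's behalf could decide for it.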

> * the chance that there's not enough RAM available is very small:
> assuming a million properties, each property using 1kb will result
> in 1GB of RAM - most computers today have 3GB, new ones have 4GB and
> more. So even in such rare situations with *huge* working copies the
> chance of too little RAM is very small.

Some operating systems still have resource limits that are lower than that.

> So: for UI clients please provide fast APIs that use more RAM - keep
> the existing APIs that use as little memory as possible for those
> clients who need those.

The libraries provide great flexibility with just one API.
The existing API already gives you the option of using memory the way
you want. So I don't see a reason to add a special-purpose API that
does the allocation on behalf of the callback.
Received on 2011-03-01 12:46:25 CET
