On 31 January 2016 at 14:03, Stefan Fuhrmann <stefan2_at_apache.org> wrote:
> Hi there,
>
> When the server needs to transmit the list of changed paths
> in a revision, its memory usage is O(#changes), i.e. practically
> unbound. The problems are:
>
> * FS and repos API require a full collection of all changes
> Most consumers simply scan that data once. So, they can
> just as well work on a one change at a time basis as a
> callback.
>
> * The data types are different, so we end up with two copies.
> A revised svn_fs_path_change_t with identical
> svn_repo_path_change3_t can fix this. Minor adjustments
> to the element data types further improve efficiency.
>
> I've played with a revised version of svn_fs_paths_changed
> and svn_repos_get_logs to use callbacks for the individual
> path changes. Effectively, the FS layer can now pump the
> info through the repos / authz filter into the RA layer.
Here is some feedback.
Making FS API streamy is a good goal. But as far I remember repos
layer still has to store all changed path in hashtable for
authorization purposes.
Also I think we should not use callbacks to deliver data from the FS
layer. Currently FS API is passive and I think it should remain the
same: FS API users may invoke FS function from callback and this will
require FS implementation to be fully reentrant (to avoid problems
like fixed in r1718167 [1])
Instead of callbacks we should create svn_fs_changed_paths_t and some
kind of iterator of it. As a nice benefit FS API user may request only
interesting information by providing NULL for some arguments (like
WC-NG API).
I also noticed that svn_fs_nodeid_t member is removed from svn_fs_path_change3_t
in r1727822. Are you sure about this change? Some FSFS bugs are not a
reason to drop this kind of information. We should fix nodeid in
changed path and maybe eventually replace it with svn_fs_node_t.
[1] http://svn.haxx.se/dev/archive-2015-12/0006.shtml
--
Ivan Zhakov
Received on 2016-02-03 13:36:30 CET