Greg Stein <gstein_at_gmail.com> writes:
> Note: beyond the review below, I'm registering a general concern
> around this notion of "hey! let's drop off this data into some global
> variables^W^Woh, I mean the wc state database... and then some time
> later, we'll perform some operation around them".
>
> This is *so* ripe for danger, I'm depressed.
>
> Honestly. This "temporary table" is just another name for "global
> variable". "Results are collected in a table". Really? REALLY? How is
> that different from a global? Functions going this way and that,
> dropping off rows in the table as a *side effect*, and then along some
> other code path later in the control flow, we decide to pick them back
> up and run with them.
>
> Stinky.
I started doing this for revert. In the database a recursive revert is
something like:
DELETE FROM nodes WHERE local_relpath LIKE ... AND op_depth > ...
DELETE FROM actual_node WHERE local_relpath LIKE ...
It doesn't make sense to do that node-by-node, for a copied tree it's
wrong to revert a parent before a child or a child before a parent.
It's also painfully slow to do it node-by-node. Try a recursive propset
on a tree followed by a revert, that's a database only operation and the
database change happens in seconds as a single transaction, but it
crawls on a network disk when done node-by-node.
Given that the recursive operation is a much better way to do it, how
does notification work? The caller can't simply do it after the delete,
as once the rows have been deleted it's too late to determine which rows
were affected. The delete function has to somehow tell the caller which
rows were affected. It could allocate memory and build a list of rows
before deleting them, essentially that would be duplicating a chunk of
the database into memory (revert needs more than just the local_relpath,
it needs the conflict file names as well). During the subsequent walk
over the working files it is necessary to query the "list" for a given
path, so the "list" needs to be some sort of tree or table.
Doing all that memory management and indexing by hand is silly when we
have a database to do it for us. The temporary tables are not exactly
global, they are per-connection to the database. The function that
populates the table and the function that queries the table are both
called from within the same public svn_wc_xxx interface. So the table
doesn't have any significance outside that function.
What alternative implementation do you suggest?
--
Philip
Received on 2011-04-06 16:55:34 CEST