
Re: changelist feature -- keep it? tweak it? scrap it?

From: Jonathan Gilbert <o2w9gs702_at_sneakemail.com>
Date: 2007-01-17 17:49:13 CET

At 09:15 PM 1/16/2007 -0600, Karl Fogel wrote:
>> To keep track of changelists, one needs to do some bookkeeping... just
>> associate a name with a list of files. You're advocating that every
>> single client app create some sort of private database to do this?
>> But wait... we already *have* a database of sorts... the entries file!
>> It's sitting right there, in front of every app. It's utterly
>> trivial to store a label in an svn_entry_t.
>> So that's exactly the API I wrote: (1) an API to add/remove a
>> changelist label from a path (by adding/removing it from the
>> underlying svn_entry_t). (2) an API to search the working copy for
>> all paths matching a changelist. Basic database functionality,
>> nothing more.
>And really poor database functionality, from an indexing standpoint.
>Even though users almost always deal with changesets by name,
>SVN has to reach them backwards: to reconstruct changelist X,
>we ask every file if it's in X, instead of asking X what files are in it!
>In large working copies, this isn't great... :-)

Why not change the current API, which does "set changelist for file X to
Y", into two functions: "add file X to changelist Y" and "remove file X
from changelist Y"? The current implementation would simply return an error
when trying to add a file that is already in one changelist to another one,
but it leaves the door open for changes to the way the data is actually
stored.

Similarly, while certain operations naturally strobe all of the files
anyway as part of their normal processing and thus have virtually no
overhead in asking each file X if it is a part of changelist Y, it's
certainly true that some operations would like to be able to enumerate the
files specific to a changelist as quickly as possible. If a function is
added to the API to "list all files that are part of changelist Y", it can
be initially implemented as a recursive scan, the way it currently has to
be done anyway, but then applications that are built using it will
automatically experience a performance gain if/when the database structure
is reworked.
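
To make the shape of this concrete, here is a rough sketch in Python rather
than against the real libsvn_wc code. Every name in it (ScanBackend,
IndexedBackend, ChangelistError, add, remove, list_paths) is made up for
illustration and is not an existing Subversion API. Both classes expose the
same three operations; the first keeps one label per file and answers "list"
by asking every file, the way it has to be done today, while the second is a
hypothetical reworked store that asks the changelist for its files. A caller
written against the three operations cannot tell which one it is using.

```python
class ChangelistError(Exception):
    """Raised when a path already belongs to a different changelist."""


class ScanBackend:
    """One label per file (path -> changelist name), like a per-entry field."""

    def __init__(self):
        self._label = {}

    def add(self, path, name):
        current = self._label.get(path)
        if current is not None and current != name:
            raise ChangelistError(f"{path} is already in {current!r}")
        self._label[path] = name

    def remove(self, path, name):
        # Only clear the label if the path really is in this changelist.
        if self._label.get(path) == name:
            del self._label[path]

    def list_paths(self, name):
        # Ask every file whether it is in the changelist.
        return sorted(p for p, n in self._label.items() if n == name)


class IndexedBackend:
    """Hypothetical reworked store: changelist name -> set of paths."""

    def __init__(self):
        self._members = {}

    def add(self, path, name):
        # Keep the same one-changelist-per-file rule as ScanBackend.
        for other, paths in self._members.items():
            if other != name and path in paths:
                raise ChangelistError(f"{path} is already in {other!r}")
        self._members.setdefault(name, set()).add(path)

    def remove(self, path, name):
        self._members.get(name, set()).discard(path)

    def list_paths(self, name):
        # Ask the changelist what files are in it.
        return sorted(self._members.get(name, ()))
```

If the one-changelist-per-file restriction were lifted later, only the check
inside add would change; the three entry points and their callers would stay
the same.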

By designing the API carefully, the back-end implementation details can be
completely masked, and the door kept open for changing the implementation
to better support the end goal feature set. Isn't that one of the main
goals of putting implementation into a library? :-) (the other one being

>The issue is not that other clients are hindered, nor is it that the
>current library *code* is too complex. The issue is about APIs:
>every time we add something, we're stuck with it, and my instinct
>is that down the road we're going to regret these interfaces --
>"regret" in the sense that, as clients figure out what they really
>want from changelists, the APIs we added now will turn out to
>be slightly off from what's needed. Sure, we can add whatever
>the clients need later, and just have some old cruft in our APIs
>that no one uses. Small prices, paid repeatedly, add up, that's all.

This is what I'm talking about above :-) The lesson isn't that we should
put off implementing anything until we know exactly how it needs to be
implemented, but rather that we simply need to spend a bit more time
thinking about the interface we give, so that UI authors aren't tempted to
take shortcuts and the implementation behind it stays flexible enough to be
improved upon later.

>(I think your point about the library changes being really tiny cuts
>both ways. If they're small, then the benefits of supplying them
>for all clients can't be that large either.)

As was mentioned in another rebuttal, it's not about the size of the
changes, but about where they're placed. Even though the functionality is
simple, virtually *every* client is going to want a piece of the action,
and if they all make their own databases, then they won't be talking to one
another. That Eclipse user is going to be pissed off when he discovers that
all the changelists he set up in the nice Eclipse GUI don't exist as far as
the command-line client is concerned :-)

>If we absolutely *must* provide library-level support, I wish we'd do it
>in the configuration area code instead of libsvn_wc, so that we could
>at least index changesets in a conceptually appropriate way, with both
>proper performance and proper semantics.

Moving the existing API to the configuration area code won't improve its
interface. Subversion currently has no centralized concept of working
copies on a system -- they exist on their own, like little islands in the
rest of the filesystem, no? Given this, does it not make sense that
changelists, since they are not managed by the server, would be WC-specific?

Perhaps I'm just misunderstanding and it would make perfect sense to put
WC-specific functionality into the configuration area code. I don't know
the Subversion code-base all that well...

Jonathan Gilbert

Received on Wed Jan 17 17:50:12 2007

This is an archived mail posted to the Subversion Dev mailing list.