
Re: RFC: Revision indexes for 1.1

From: Branko Čibej <brane_at_xbc.nu>
Date: 2004-04-20 01:39:44 CEST

kfogel@collab.net wrote:

>Greg Hudson's comments echo mine, so I won't repeat those here.
>
>
Then my reply to his post is addressed to you, too. :-)

>Just a few other questions, though:
>
>Branko Čibej <brane@xbc.nu> writes:
>
>
>>3.1 DB schema changes
>>
>>The filesystem grows a new table, "revpropindex", with the following schema:
>>
>> (PROP-NAME PROP-VALUE) -> REVNUM
>>
>>Non-unique indexes are allowed.
>>
>>
>
>Is there a reason not to do non-uniqueness like this:
>
> (PROP-NAME PROP-VALUE) -> (REVNUM1, REVNUM2, REVNUM3 ...)
>
>and retain unique indexes?
>
>
Mike answered that one.

>>No other changes are necessary. For forward compatibility, servers that
>>do not implement revision indexes will ignore this table; for backward
>>compatibility, if the table does not exist in the repository, revision
>>indexing and search operations are disabled. The dumpfile format does
>>not change, as the contents of the revision index can be reconstructed
>>from revprop data.
>>
>>
>
>If the feature is allowed to be unavailable anyway, then I'd prefer
>not to have the special rule that "svn:date" and "svn:name" are always
>indexed. Why not let them be controlled in the same way as everything
>else?
>
>
This feature is allowed to be unavailable for 1.x backward compatibility
only. All repositories created by 1.1 and later would have to index
dates and labels. Also I don't expect someone to downgrade the server to
1.0.x yet keep the repository in the 1.1 format, but we have to cover
that possibility because of our compatibility guarantees. (They'd still
have to reload the repo at the next upgrade, to regenerate the index.)

It does make sense to not change the dump format, though, since all the
information in the index is redundant, almost by definition.

>>3.2 Multiple indexes per revision
>>
>>The values of properties that allow multiple keys per single revision
>>are represented in a newline-terminated list, one value per line (like
>>the svn:ignore property on directories). Each value is added as a
>>separate key to the index.
>>
>>
>
>I didn't understand this, sorry. Can you maybe give an example, or
>try saying it another way? (Probably this is somehow related to my
>earlier question about unique indexes.)
>
>
I hope I answered this adequately in my other reply.
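
For readers without the other reply at hand, here is a minimal sketch of
how section 3.2 could be read. It assumes an in-memory dictionary as a
stand-in for the "revpropindex" table; the names index_revprop and
multi_value are hypothetical and not part of any Subversion API.

```python
from collections import defaultdict

# Hypothetical stand-in for the "revpropindex" table:
# (PROP-NAME, PROP-VALUE) -> set of REVNUMs, i.e. a non-unique index.
revpropindex = defaultdict(set)


def index_revprop(revnum, propname, propvalue, multi_value=False):
    """Add index entries for one revision property (illustrative only)."""
    if multi_value:
        # Newline-terminated list: every non-empty line becomes its own key.
        values = [line for line in propvalue.split("\n") if line]
    else:
        values = [propvalue]
    for value in values:
        revpropindex[(propname, value)].add(revnum)


# Revision 10 carries two names, revision 12 shares one of them.
index_revprop(10, "svn:name", "release-1.0\ncandidate\n", multi_value=True)
index_revprop(12, "svn:name", "candidate\n", multi_value=True)
```

With this reading, one revision contributes several keys, and one key
can map to several revisions.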

>>3.4 FS/Repos API changes
>>
>>When opening an existing repository, the FS layer must not error out if
>>the revpropindex table does not exist.
>>
>>The repos layer grows a new function,
>>
>> svn_repos_revision_search(propname, propvalue)
>>
>>which returns a list of revision numbers. The list can be empty. No
>>error is returned if a property is not indexed or revision indexing is
>>not enabled in the repository (i.e., if the repository schema version is
>>older than the server version).
>>
>>
>
>Why is it better to return an empty list than an
>SVN_ERR_UNSUPPORTED_FEATURE error or something like that?
>
>
For one thing, I think it's completely irrelevant (from the client's
perspective) whether the feature is unsupported or the list of matches
is actually empty. On the other hand it will probably turn out that the
RA implementation can't avoid throwing the error. This is both an
implementation detail and a UI issue.
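
To make the "empty list, no error" behaviour concrete, here is a rough
sketch. It is not the proposed C function: revision_search is a
hypothetical Python stand-in, and passing None for the index is just one
assumed way of modelling a repository without the revpropindex table.

```python
def revision_search(index, propname, propvalue):
    """Return a sorted list of matching revision numbers.

    `index` maps (propname, propvalue) to a set of revision numbers, or is
    None when the repository has no index table (assumed representation).
    An unindexed property and a missing table both yield an empty list.
    """
    if index is None:
        return []
    return sorted(index.get((propname, propvalue), ()))
```

The caller cannot tell "unsupported" from "no matches" here, which is
exactly the trade-off under discussion.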

>>The propset, propchange and propdel repos-level wrappers must maintain
>>the revpropindex table (optimization hint: when changing multi-value
>>properties, only values deleted from or added to the list need to be
>>processed).
>>
>>The function svn_repos_dated_revision changes: first, it calls
>>svn_repos_revision_search("svn:date", timestamp). If this returns a
>>non-empty list, it returns the oldest revision from this list. Otherwise
>>it performs the current binary search. (The binary-search implementation
>>must stay for backward compatibility. It can be removed in 2.0.)
>>svn_repos_committed_info and svn_repos_history get similar changes.
>>
>>
>
>Ooooh. I have a feeling if I understood your earlier thing about
>"properties that allow multiple keys per single revision", I'd
>understand this part :-(.
>
>
Nah, actually I screwed up this thing about date-based searching, and
multi-indexes have no bearing on it...
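
Separately from the date-search point, the optimization hint quoted
above (only process values added to or deleted from a multi-value
property) amounts to a set difference. The sketch below assumes the same
dictionary-shaped index as a stand-in for the table; the function name
update_multi_value_index is hypothetical.

```python
def update_multi_value_index(index, revnum, propname, old_value, new_value):
    """Apply a multi-value revprop change to the index, touching only the delta.

    `index` maps (propname, value) to a set of revision numbers.
    Returns the (removed, added) value sets for inspection.
    """
    old = {line for line in (old_value or "").split("\n") if line}
    new = {line for line in (new_value or "").split("\n") if line}

    removed = old - new
    added = new - old

    for value in removed:
        revs = index.get((propname, value))
        if revs is not None:
            revs.discard(revnum)
            if not revs:
                # Drop keys that no longer refer to any revision.
                del index[(propname, value)]

    for value in added:
        index.setdefault((propname, value), set()).add(revnum)

    return removed, added
```

Values present both before and after the change are never touched.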

>>4. Implementing revision names
>>
>>Using the mechanism described above, we can add symbolic names to a
>>revision or a set of revisions. To do this we introduce a new revision
>>property, "svn:name", that contains a newline-separated list of symbolic
>>names assigned to a revision. The values are non-unique: that is, a
>>single symbolic name can group several distinct revisions.
>>
>>
>
>If we're calling these "labels", then let's use "svn:label".
>
>
I'm only calling them "labels" here because of the earlier proposals. My
candidate would be "symbolic revision names", mainly because I expect
this would stop the flow of "how can I set a label on a single file"
type of questions. But that's a bikeshed.

>>While the existing "prop(get|set|edit) --revprop" functionality is
>>sufficient for setting and maintaining revision names, it is not really
>>useful. I propose the following changes to the UI:
>>
>>4.1 Extend the format of the "-r" command-line option
>>
>>Currently the -r command-line option accepts a revision number or a date
>>(range):
>>
>> -r revnum|{date}[:revnum|{date}]
>>
>>The {date} specifier is internally converted to a revision number. We
>>add another specifier, [labelname], that is also converted to a revision
>>number.
>>
>>
>
>This sounds great (exactly the way CVS does it too).
>
>
Yes, and it's also nice and logical (funny that, coming from CVS :-)

>>Note: Since label values are non-unique, a [label] specifier can refer
>>to a list of revision numbers. Such lists are useless for "svn update" or
>>"svn export"; however, "svn merge" could be extended to handle
>>multi-revision merges (cherry-picking, right?). We should support an
>>analogous format, "-r revnum,revnum,..." for specifying an explicit list
>>of revision numbers; this is also needed for defining multi-revision labels.
>>
>>
>
>Yes, and even before that's supported, we can have the [labelname]
>expansion be comma-separated when multiple revisions come back. That
>way the -r option will give a syntax error for the stuff it can't
>handle yet.
>
>
Indeed. I'd thought about saying this too, but in fact it's not
efficient for the actual implementation to do text-based replacement of
the option values. And cmdline isn't the only client, of course. There's
some magic to be done in the svn_client API.
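
As an illustration of resolving specifiers structurally instead of by
text replacement, here is a small sketch. It assumes the three forms
mentioned in section 4.1 (revnum, {date}, [labelname]); the names
parse_revision_spec, resolve and lookup_label are hypothetical and do
not correspond to the real svn_client API.

```python
import re

# One specifier: a revision number, a {date}, or a [labelname].
_SPEC = re.compile(
    r"^(?:(?P<revnum>\d+)|\{(?P<date>[^{}]+)\}|\[(?P<label>[^\[\]]+)\])$"
)


def parse_revision_spec(text):
    """Parse one specifier into a (kind, value) tuple."""
    match = _SPEC.match(text)
    if match is None:
        raise ValueError("unrecognized revision specifier: %r" % text)
    if match.group("revnum") is not None:
        return ("number", int(match.group("revnum")))
    if match.group("date") is not None:
        return ("date", match.group("date"))
    return ("label", match.group("label"))


def resolve(spec, lookup_label):
    """Turn a parsed specifier into a list of revision numbers.

    `lookup_label` is a caller-supplied function from label name to an
    iterable of revision numbers (for example a server-side index query).
    """
    kind, value = spec
    if kind == "number":
        return [value]
    if kind == "label":
        return sorted(lookup_label(value))
    raise NotImplementedError("date resolution is outside this sketch")
```

A label that maps to several revisions comes back as a list, so each
caller can decide whether a multi-revision result is acceptable.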

>>4.2 svn label [-r revnum/range/list] label-name
>>
>>Adds a label to the specified revision(s). All forms of the -r option
>>are supported (including label specifiers, of course). The default is to
>>label HEAD.
>>
>>4.3 svn labeldel [-r revnum/range/list] label-name
>>
>>Remove a label from the specified revision(s). If -r is not specified,
>>remove all instances of the label.
>>
>>
>
>I'm not quite as opposed as Greg Hudson to having new commands for
>this, but would like to first do without and see whether or not it's a
>problem for people. 'svn propset --revprop -rN svn:label VALUE' isn't
>so hard, especially for an early adopter. It feels premature to add
>dedicated subcommands for workflow-specific uses of properties, before
>we've had a chance to see how often and in what way the properties
>actually get used.
>
>
The first use I see for multi-revision labels is for marking
patch-release merge candidates. It would probably make sense to add a
"labelget" command that returns a list of revision numbers associated
with a label, too -- helpful until multi-version merges are implemented,
and probably useful in any case (and incidentally it's yet another
operation that can't easily be simulated by the current revprop commands).

>>All these functions need equivalents in the client library; the RA layer
>>only has to expose svn_repos_revision_search. "svn label" and "svn
>>labeldel" can be implemented as simple revprop manipulations, although
>>implementing them on the server would make multi-revision labeling faster.
>>
>>
>
>This optimization is independent of the command set, if we implement
>the '-rN,M,...' syntax.
>
>
Yes.

>>5. Future notes
>>
>>Currently no history is recorded about revprop changes. This is an
>>oversight that makes Subversion behave slightly at cross-purposes with
>>configuration management philosophy. Unfortunately, in order to record
>>historical changes to revprops, a slightly more drastic change is needed
>>not just to the schema and API, because these changes would have to be
>>recorded in a new kind of transaction. Thus this kind of history
>>tracking cannot be implemented before 2.0.
>>
>>
>
>I agree, and wonder how high priority this should be.
>
>
Currently it's an inconvenience, but not a showstopper; after all it can
always be simulated with a pre-revprop-change hook. But I'd not like to
see 2.0 without this.

>Look at it this way: assume that *every* change is versioned in the
>sense that
>
> - it can be rolled back to any previous point
> - it has some metadata (a log message) associated with it
>
>
Ah no, the first but not the second -- you don't have to associate a log
message with every change in the repository; log messages are associated
with commits, not metadata changes. But you should be able to see, for
example, what the _original_ log message for a revision was, and also
replay metadata changes (this is indispensable for asynchronous
repository replication, for example).

>...then a change to metadata itself must be rollbackable and have
>associated metametadata. You can see how this begins to stack up.
>
>
Modify that to "a change to mutable metadata", and the termination
condition is part of the statement.

>It's not totally impossible to implement the infinite tower, it's just
>a pain. Subversion has chosen to "bottom out" at the first level --
>the metadata associated with a commit is not versioned, it's just
>metadata. Is that at odds with CM philosophy, or is it more that if
>one wants something versioned, one should put it under version
>control?
>
>
Not all metadata is associated with a commit (ACLs are an example). But
apart from that, yes, if the metadata associated with a commit is
mutable -- as it is in Subversion -- then it is indeed at odds with CM
philosophy if changes to that metadata aren't tracked. Imagine someone
changing the value of svn:author; don't you agree that it is useful to know
what the original value had been, and who made the change, and when?

There's a nice paper that discusses some of these issues at

    http://www.accurev.com/accurev/info/timesafe.htm

Anyway, I think we can postpone this part of the discussion until the
2.0 design phase (or at least take it to another thread).

-- Brane

P.S.: For the record, I'd consider AccuRev to be our real competition
going forward, rather than CVS (or BitKeeper). It has a totally
outstanding SCM model.

Received on Tue Apr 20 01:39:35 2004

This is an archived mail posted to the Subversion Dev mailing list.
