Re: expanding custom keywords in dump

From: Nico Kadel-Garcia <nkadel_at_gmail.com>
Date: Sun, 2 Feb 2014 10:43:02 -0500

Branko, have I ever mentioned how nice it is to see employees of the
companies producing the core source code doing thoughtful posts? It's
one of the main reasons I've appreciated Subversion over the years.

On Sun, Feb 2, 2014 at 1:41 AM, Branko Čibej <brane_at_wandisco.com> wrote:
> On 02.02.2014 04:14, Nico Kadel-Garcia wrote:
>> On Sat, Feb 1, 2014 at 1:07 AM, Ben Reser <ben_at_reser.org> wrote:
>>
>>> Branko gave a perfectly reasonable answer. Beyond that I honestly don't get
>>> the point of these two emails. FreeBSD uses keywords because as an open source
>>> project they ship source. Even more importantly they have downstream projects
>>> (e.g. Apple uses their find command
>>> http://opensource.apple.com/source/shell_cmds/shell_cmds-175/find/main.c ). I
>>> can't think of a better way of tracking versioning for them once the source
>>> leaves their version control system and potentially goes into another. Yes
>>> there are all sorts of annoying bits about this.
>> If your build system has to rely on source control based fields,
>> process the code in your build system. Putting the keyword processing
>> in the source control itself certainly dates back to RCS and CVS, and
>> has been the bane of comparing source trees in working copies, and of
>> actually reviewing the source control that will be used for
>> building software. It profoundly interferes with generating
>> replicatable code in multiple build or test environments.
>
> You're totally missing the point of this thread. No-one said anything
> about build systems; the original poster's requirement is tracking
> upstream versions of files, not complete source trees. No amount of
> *.h.in magic is going to do that.

Build systems are the strongest reason, I think, to *want* the
keywords. If you're in a working copy of a source tree, or a source
controlled website, bulky nest of configuration files, or other such
environment, you've usually got access to the source control logs. If
you then need to include the "Author" or "Date" or "URL" for the
source, that's understandable. I've worked with that. I applied that
in particular to a build system for "kickstart" files, where I wanted
the "HeadURL" and "Revision" for each script added to the "pre" and
"post" sections to have that keyword content, and the entire generated
kickstart files to themselves be timestamped and have their own,
individual "HeadURL" and "Revision".

I wound up tweaking the system to do just what I said: prepend the
relevant information to each "pre" and "post" script when it was
added to the master file, and put a header on top of the kickstart
files to denote the assembled file build and SVN Revision and HeadURL.
Then I committed *that*. That way, no keywords, and when I created new
branches and variants of the original code, I didn't have to remember
to hand-set "svn:keywords" or to inject a "pre-commit" file to ensure
consistent use in new files.

Doing it as a post-checkout process frees up the developer to use more
arbitrary processing. It does take time to learn and to write tools to
match.
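
A minimal sketch of that kind of assembly step, in Python, might look
like the following. It is an illustration only, not the tool described
above: the Script fields, the assemble_kickstart name, and the comment
layout are all assumptions, and the URL and revision values are
expected to be collected at build time (for example from "svn info"
output).

```python
from dataclasses import dataclass


@dataclass
class Script:
    section: str   # "pre" or "post"
    head_url: str  # repository URL of the script (hypothetical field)
    revision: int  # last-changed revision of the script
    body: str      # script text as committed, with no keywords in it


def assemble_kickstart(base, scripts, head_url, revision, built_at):
    """Return kickstart text with provenance comments added at build time."""
    # Header describing the assembled file itself.
    lines = [
        f"# Assembled: {built_at}",
        f"# HeadURL: {head_url}",
        f"# Revision: {revision}",
        base.rstrip("\n"),
    ]
    # Each script gets its own provenance comments prepended.
    for s in scripts:
        lines += [
            "",
            f"%{s.section}",
            f"# HeadURL: {s.head_url}",
            f"# Revision: {s.revision}",
            s.body.rstrip("\n"),
            "%end",
        ]
    return "\n".join(lines) + "\n"
```

The generated file is what gets committed, so the stored scripts never
need "svn:keywords" set on them.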

> Obviously, in an ideal world, one would be able to migrate these kind of
> metadata between different VCS. Likewise obviously, the world is not
> ideal, and in-file keywords are a reasonable alternative if no other
> tooling can be devised. In the case of Subversion->Perforce migration,
> one could argue that it's Perforce's fault for not having an equivalent
> to Subversion's properties that could store the source repo metadata.

The in-file keywords are an easy first approach, and I admit they're
common. The problems occur over time, and with advanced use. It's
fairly unreasonable to expect Perforce tooling to be able to support a
3rd party add-on, such as the "FreeBSD" keyword, that is not in
Subversion's main codeline.

> Compared to inventing a separate-but-parallel database for maintaining
> these metadata, and all the surrounding tooling that this implies,
> expanded keywords in the files themselves appear positively benign,
> especially when they're not going to change except from further upstream
> imports.

Well, that's just what the FreeBSD people did by adding a new keyword.
They used the Subversion repository metadata. That data is clearly
derivable at working copy creation time. That's how the keywords get
filled in anyway by the keyword expansion, so it's not a metadata
invention or storage problem there. By generating those keyword
fill-ins as a post-export or post-checkout process, arbitrary keywords
can be created and inserted on a consistent basis *without* the
confusion and inconsistencies inevitable in directly keyword expanding
the source control file content. The developer is already using a
"separate database", namely the Subversion metadata that is not
inherent to the file content. Doing the expansion afterwards means a
gain in flexibility, and easier maintenance of which files are
"keyword enabled" and which are not.

What's lost is the "automagic" nature of the keyword expansion in the
primary working copy. And that's always been problematic when
stressed, even if it's made some code easier to read for new users.
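
As a rough sketch of such a post-export fill-in step, assuming the
metadata has already been gathered into a dictionary by the build
tooling: the fill_keywords name and the dictionary layout are
hypothetical, and this is not how Subversion itself implements
expansion.

```python
import re

# Matches the unexpanded "$Name$" form and the expanded "$Name: value $" form.
KEYWORD = re.compile(r"\$([A-Za-z]+)(?::[^$\n]*)?\$")


def fill_keywords(text, metadata):
    """Expand only the keywords named in metadata; leave the rest untouched."""
    def replace(match):
        name = match.group(1)
        if name not in metadata:
            # Unknown keyword: keep the original text exactly as it was.
            return match.group(0)
        return f"${name}: {metadata[name]} $"
    return KEYWORD.sub(replace, text)
```

Because the set of names lives in the build tooling rather than in
the version control system, a custom keyword needs no support from
the repository or from any downstream system.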

I'm a strong promoter of the same approach to end-of-line handling.
Working environment processing and re-normalization of upstream source
is safer done on a case-by-case basis, as a build or deployment
process, while preserving the contents of the actual repository in a
pristine state. Keywords demand checkout post-processing, anyway.
Keeping it in the build or deployment system, as part of generating
the modified files *after* checkout, can allow far more flexibility.
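
For end-of-line handling, the deployment-side step can be very small.
A sketch, assuming the build tooling decides which files to convert
and which style the target wants (the normalize_eol name is
hypothetical):

```python
def normalize_eol(data: bytes, eol: bytes = b"\n") -> bytes:
    """Convert CRLF, lone CR, and LF line endings to a single style."""
    # First reduce every style to LF, then expand to the requested style.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified if eol == b"\n" else unified.replace(b"\n", eol)
```

This runs on the exported copy only, so the repository content stays
byte-for-byte what was committed.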

> Last but not least ... Subversion does not expand keywords unless
> explicitly told to. This was a conscious decision we made to discourage
> exactly the kind of abuse you're griping against. But you'll have a hard
> time to find a single VCS glove that fits all potential users' feet.

Amen!!! I'm very, very glad that Subversion did *not* enable keywords
by default. Some of the worst difficulties I've had were when people
started enabling keywords erratically. Files get duplicated or added
without keywords enabled, then later keywords are enabled, and the use
of the "svn diff" command winds up burdened by spurious keyword
differences. And the "diffs" between two developers' working copies of
the same code lead to complex and unstable editing of local "diff"
commands and diff outputs, and truly horrendous nightmares of "patch"
changes to negotiate the inconsistent use.
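
To illustrate the kind of workaround this forces, here is a sketch of
a comparison that collapses expanded keywords before diffing. The
function names are hypothetical, and it only handles the simple
"$Name: value $" form on a single line.

```python
import difflib
import re

# Expanded form only: "$Name: value $" on one line.
EXPANDED = re.compile(r"\$([A-Za-z]+):[^$\n]*\$")


def collapse_keywords(text):
    """Turn "$Name: value $" back into "$Name$"."""
    return EXPANDED.sub(r"$\1$", text)


def keyword_blind_diff(a, b):
    """Unified diff of two texts, ignoring differences in keyword values."""
    a_lines = collapse_keywords(a).splitlines(keepends=True)
    b_lines = collapse_keywords(b).splitlines(keepends=True)
    return list(difflib.unified_diff(a_lines, b_lines, "a", "b"))
```

Every developer comparing trees has to carry something like this
around, which is exactly the burden that keeping expansion out of the
working copy avoids.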

I've got a long tale of woe about Perforce source code management of a
Linux kernel that I'll tell sometime, with a different story, if
people like.
Received on 2014-02-02 16:43:34 CET

In reply to: Branko Čibej: "Re: expanding custom keywords in dump"
