On 01/02/2012 02:52 AM, Alan Barrett wrote:
> On Sun, 01 Jan 2012, Mark Mielke wrote:
>>> Another idea is to change the revprop's value in the pre-commit or 
>>> post-commit hook: [...]
>>
>> This is what we've been doing for about two years. It has the 
>> consequence that tools don't automatically match unique identifier to 
>> commit as they no longer match.
>
> If your third party tools can't extract the unique ID from svn:author 
> = "Display Name <uniqueid_at_domain>" then perhaps the problem lies at 
> least as much in your third party tools as in subversion.
I wonder if you thought this through before posting. :-)
You are saying that if I make up an essentially arbitrary scheme, such 
as "Display Name <uniqueid_at_domain>", and you have a tool which is 
unaware of my scheme, and therefore your tool fails to match users in 
the region because of my scheme - that your tool has the problem? 
Despite the documentation for Subversion never mentioning or even 
suggesting a convention that you should be responsible for understanding?
No.
The convention must be defined in the Subversion book, and it must be 
part of the release notes so that third party tools adhere to the 
convention.
Otherwise, only extremely casual interpretation can be done of the 
field. For example, it can be treated as a unique identifier - but more 
like a "foreign key" unique identifier in the sense that it is a key in 
some domain, but not necessarily a domain I know about or am an 
authority for. This is why tools such as FishEye provide a "committer 
mapping" that is precisely this. It allows me to code on a 
per-repository basis each of the committer values that I want to 
associate with my own FishEye account. This is really horrible for 
dozens of repositories and thousands of users. Every user having to 
input their own mappings? Yuck, yuck, yuck.
If, instead, a convention was defined such that (and just hand waving 
here, I'm not really attached to these details):
     svn:author => unique identifier
     svn:author-name => Mark Mielke
     svn:author-email => mark_at_mark.mielke.cc
Then tools could make much more intelligent decisions on what to do or 
show. They could use svn:author as the mapping key, but show name and 
email in "svn log" or graphical browsers.
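
As a rough sketch of how a tool might consume such a convention (Python, purely illustrative): the function name `display_author` and the dict-of-revprops input are made up for this example, and svn:author-name and svn:author-email are the hand-waved properties from above, not anything Subversion defines today.

```python
def display_author(revprops):
    """Return a human-readable author string for one revision.

    revprops is assumed to be a dict of revision property name -> value.
    svn:author-name and svn:author-email are hypothetical properties.
    """
    key = revprops.get("svn:author", "")
    name = revprops.get("svn:author-name")
    email = revprops.get("svn:author-email")
    if name and email:
        return "%s <%s>" % (name, email)
    if name:
        return "%s (%s)" % (name, key)
    # Old revisions, or a server unaware of the convention:
    # fall back to the bare mapping key.
    return key
```

A tool would still use svn:author itself as the cross-reference key, and only call something like this when rendering output.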
The above model is a simple solution to the problem: more data is stored 
for every commit, and that data can be used by downstream tools. It has 
the benefit that the data is static, which is sometimes good. In a large 
project, there is normally turnover, and accounts that exist or are 
active in one year are not necessarily the same as the ones active in 
another year. Taking a snapshot of the data at the time of commit 
gives a permanent record of sorts. ClearCase is a system which does 
it this way. Its event history records, which track such things as object 
creation (the closest match to svn:author), have username, domain 
(NIS - old school), and fullname.
The other alternative is for a Subversion client to be able to lookup 
details for svn:author by asking the server using a published protocol. 
This model would allow the server to implement these queries 
transparently using LDAP lookups or similar depending on the 
requirements of the project. This stores less data for every commit, and 
allows for dynamic updates. It would allow for "Mark Mielke" to become 
"Mielke, Mark" with a server side configuration, but in contrast to the 
previous method, it would not allow for a snapshot of history to be taken. 
It would be a requirement that the identity management system used on 
the server would always have a record for me even after I am gone - or, 
alternatively, that the detail would become more vague over time. I 
disappear, and my account disappears - so it is left with only a unique 
identifier which might not be enough information.
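
To make the trade-off concrete, here is a minimal sketch of the lookup model (Python, illustrative only). The `directory` argument is a stand-in for whatever the server would query, such as LDAP, and `lookup_author` is a made-up name; no such Subversion protocol exists.

```python
def lookup_author(directory, author_key):
    """Resolve an svn:author key against a live identity directory.

    directory is assumed to map unique id -> {"name": ..., "email": ...}.
    It is a hypothetical stand-in for a server-side LDAP query.
    """
    record = directory.get(author_key)
    if record is None:
        # The account is gone: only the unique identifier is left.
        return {"id": author_key, "name": None, "email": None}
    return {
        "id": author_key,
        "name": record.get("name"),
        "email": record.get("email"),
    }
```

Renaming "Mark Mielke" to "Mielke, Mark" in the directory changes what every old revision displays, and deleting the record leaves just the bare identifier.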
In our particular case, we value all three of: 1) unique identifiers to 
be able to do cross referencing of reports between tools, 2) display of 
humanly readable names in output such as "svn log" or annotations in 
FishEye, ViewVC, Eclipse, or whatever tool the user is using, and 3) 
permanent historical record for auditing purposes.
Our exact compromise for the last three years is:
1) original svn:author value arrives on the server as "1234567" - a 
corporate unique identifier
2) pre-commit re-writes svn:author to "Full Name (<original svn:author 
value>)"
3) pre-commit adds <company>:gid as "<original svn:author value>"
Then, as I mentioned, various other tools such as FishEye have explicit 
mappings from "Mark Mielke (1234567)" => "1234567" for each Subversion 
repository. We're primarily a ClearCase and Perforce shop right now - 
but even so, I have several Subversion repository mappings of this form. 
It works. It just sucks.
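
For illustration, the rewrite and the reverse mapping that each tool has to be taught look roughly like this (Python sketch). The function names are made up, the `company` argument stands in for the elided property prefix, and the code only computes values; how the hook actually writes revprops is left out.

```python
import re

# Matches the rewritten form "Full Name (uid)"; the uid is the last
# parenthesised group at the end of the string.
_REWRITTEN = re.compile(r"^(?P<name>.*) \((?P<uid>[^()]+)\)$")


def rewritten_revprops(uid, full_name, company):
    """Return the revprops the pre-commit step would end up storing.

    company is a hypothetical placeholder for the property prefix.
    """
    return {
        "svn:author": "%s (%s)" % (full_name, uid),
        "%s:gid" % company: uid,
    }


def extract_uid(author):
    """Recover the unique identifier from a rewritten svn:author value.

    Values that were never rewritten are returned unchanged.
    """
    match = _REWRITTEN.match(author)
    return match.group("uid") if match else author
```

A tool that does not know about this local scheme has no reason to apply `extract_uid`, which is why every mapping has to be configured by hand.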
For svn:author to have structure - either internally using punctuation 
such as Unix gecos, or separated out as separate attributes - and for 
tools to all honour this structure - would be far better. As 
Subversion is already well established, separate attributes are probably 
the best approach as it would enable forwards and backwards 
compatibility for uses of svn:author implemented by the Subversion code 
base itself. Tools that know how to access and do intelligent things 
with the new fields could feel free to do so. Users of tools that do not 
do intelligent things with the new fields could point to the 
Subversion release notes and Subversion book and say "this new attribute 
svn:author-name should be recognized by your tool". The change can then 
make the tool roadmap, and we can all be happy.
-- 
Mark Mielke<mark_at_mielke.cc>
Received on 2012-01-02 09:35:22 CET