[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Participating in the Subversion project.

From: <kfogel_at_collab.net>
Date: 2005-07-23 06:24:13 CEST

This email began as a response to one of the Subversion project's
Google Summer of Code students (code.google.com/summerofcode.html),
but as I was writing it I thought it might be of wider interest,
because it discusses how to contribute code in general.

For the complete context, read the archives of the 'soc' mailing list
@subversion.tigris.org:

   http://subversion.tigris.org/servlets/SummarizeList?listName=soc

Here's some basic background:

Brian Davis's project for Summer of Code is to write a
Jabber-notifying post-commit system for Subversion. After he had
started, he discovered the commitmessage.tigris.org project, which is
building a unified framework for post-commit actions in SVN and CVS.
We all agreed that it made most sense for him to incorporate his work
into that project.

Now, this is a little tricky: commitmessage.tigris.org is a separate
project from Subversion, yet part of the point of Summer of Code is
for the student to actually work with the designated mentoring
organization -- which in this case is the Subversion project. So we
have an unusual situation going on here. Brian submitted his code to
commitmessage.tigris.org, which is entirely maintained by one person,
Stephen Haberman. However, his mentoring organization is the
Subversion project, which has thirty to eighty developers (depending
on how you count), and naturally, the dynamics of the two projects are
significantly different. Google's Summer of Code program is more
geared toward the second kind: it's meant to give students a chance to
work with large teams and their collaborative mechanisms.

Which leads us to this mail.

Below, I review Brian's initial code submission to commitmessage. I
review it according to the conventions of Subversion. This is partly
because we are the mentoring organization, but also partly because we
believe these standards are useful across open source projects in
general. I'm pretty sure Stephen Haberman would agree with most of
what I write below; I've CC'd dev@commitmessage.tigris.org in case he
has anything to add and modify.

By the way, folks, don't think from reading this mail that Brian has
been delinquent or something. Quite the contrary! He was quick to
dive in and get coding, constructively responsive to feedback, and in
general just the sort of student Google SoC was meant to uncover.

The rest of this mail is addressed to Brian.

Brian Davis <brian.william.davis@gmail.com> writes:
> Here's the link to the relevant message on commitmessage's side of
> things:
>
> http://commitmessage.tigris.org/servlets/ReadMsg?list=dev&msgNo=30
>
> The thread is located here:
>
> http://commitmessage.tigris.org/servlets/BrowseList?listName=dev&by=thread&from=345347&to=345347&first=1&count=9
>
> Apologies for the duplication,

No need to apologize, we're here to help :-). Here beginneth the
review.

In addition to the above message link, you also pointed us to

   http://commitmessage.tigris.org/servlets/ReadMsg?list=dev&msgNo=28

...which contains two attachments. We'll start with that message,
since 28 precedes 30.

The two attachments, it turns out, are the same thing in two different
forms. As you wrote in the body of the email:

> Anyhow, I'm attaching both the diff and the fully modified file
> (im.py, although I've added the '-jabber' suffix to avoid
> confusing myself). Let me know what you think.

Okay, so a few comments about this. First, attaching twice only
confuses people. "Which attachment should I use?" they'll wonder.
You might think "It doesn't matter, since they give the same thing,"
but in fact it *does* matter. I can show you instances where people
included the so-called "same" data twice, except there were subtle
differences between the two! The fact that this happens occasionally
means that no one can ever be sure it's *not* going to happen -- even
if you know would never do that, your reader can't count on that, and
so the reader will waste time worrying. Spare them the headache, by
just offering the changes once, as a diff (specifically, use "unified
diff" format, the output of 'diff -u', 'cvs diff -u', and 'svn diff').

Secondly, if you *do* attach a file that already exists in the
project, don't change its name to a name that could plausibly
represent a file that you might be *adding* to the project. If
circumstances force you to attach an entire modified file, then give
it a name like "foo-MODIFIED_BY_BRIAN.py" or something similarly
obvious. I'm sure you can see why this is desirable. Yes, you did
describe in the body of your email what you were doing, and that
helped, but a week later when I'm looking at the message again and I
just scroll directly down to see the attachments, I have to do a
double-take, wonder whether you're adding "im-jabber.py", realize I
can't tell, and go re-read your message to figure it all out.

Thirdly, when you generate a diff, generate it from the top of the
source tree, so that the full (relative) path to the modified file is
visible in the diff headers. In this case, that would have been
"commitmessage/views/im.py". The standard way to achieve this is
simply to edit the relevant file directly, and use 'cvs diff -u' from
the top of the tree to generate a diff. (Obviously, if commitmessage
were using SVN instead of CVS, that would just be 'svn diff').
Remember, if you need to retain a pristine copy of the original file
for testing purposes, you can always use 'cvs up -p' to retrieve it.

Fourthly, when you post a patch against a file, say what revision of
the file you based it on. Again, if you use 'cvs diff -u', this will
happen automatically. Watch what happens when I add the line "fish\n"
to commitmessage/views/im.py and run 'cvs diff -u' from the top of the
source tree:

   $ cvs diff -u
   Index: commitmessage/views/im.py
   ===================================================================
   RCS file: /cvs/commitmessage/commitmessage/views/im.py,v
   retrieving revision 1.4
   diff -u -r1.4 im.py
   --- commitmessage/views/im.py 15 Mar 2004 03:49:11 -0000 1.4
   +++ commitmessage/views/im.py 23 Jul 2005 04:11:57 -0000
   @@ -4,6 +4,7 @@
    # Copyright 2002-2004 Stephen Haberman
    #
    
   +fish
    """Provides an IM-related L{commitmessage.model.View}s"""
    
    import time

   $

Fifthly, and of VAST importance: Write a log message. I mean, this is
really key. For Subversion's guidelines on writing log messages, see

   http://subversion.tigris.org/hacking.html#log-messages

I don't know if commitmessage.tigris.org has similar standards, but I
can see (by perusing the project's ChangeLog and a generated CVS-based
changelog produced via http://www.red-bean.com/cvs2cl/) that Stephen
Haberman has been scrupulous about logging his changes. Without a log
message, your change is much harder to review.

Next I'll go into some specific comments about your changes, basing
them on the diff. Following that I'll have more comments about things
you *didn't* change (such as the installation doc, hint hint :-) ).

In the code comments that follow, please remember that I am not an
expert in commitmessage. Stephen will no doubt have much more
penetrating questions to ask.

> *** im.py 2005-07-13 20:37:22.000000000 -0700
> --- im-jabber.py 2005-07-13 20:37:50.000000000 -0700
> ***************
> *** 3,6 ****
> --- 3,9 ----
> # commitmessage
> # Copyright 2002-2004 Stephen Haberman
> + #
> + # Jabber extension added (7/5/2005) by Brian Davis
> + # email: brian.william.davis@gmail.com
> #

In Subversion at least, we discourage putting individual authornames
into files, see

   http://subversion.tigris.org/hacking.html#other-conventions

where it says:

   We have a tradition of not marking files with the names of
   individual authors (i.e., we don't put lines like "Author: foo" or
   "@author foo" in a special position at the top of a source
   file). This is to discourage territoriality even when a file has
   only one author, we want to make sure others feel free to make
   changes. People might be unnecessarily hesitant if someone appears
   to have staked a personal claim to the file.

I don't know if commitmessage follows this convention, of course; I
hope they do, though :-).

Note that Stephen Haberman's name is at the top of the file as the
copyright holder, not as the author. It is legally necessary to list
the copyright holder, but as far as recording who made what changes,
leave that to the version control system logs.

> ***************
> *** 8,21 ****
>
> import time
> -
> from StringIO import StringIO
> - from toc import TocTalk, BotManager
> - import msnp
> -

This is unexpected -- you're adding functionality, yet you remove a
bunch of imports that were already there. Why? (If there were a log
message, I'd probably know why :-) ).

> from commitmessage.model import View
>
> class IMView(View):
> """An abstract class to hold the message-generation logic for IM clients"""
> !
> # Borrowed from email.py
> def _printFiles(self, text, action):
> --- 11,20 ----
>
> import time
> from StringIO import StringIO
> from commitmessage.model import View
>
> class IMView(View):
> """An abstract class to hold the message-generation logic for IM clients"""
> !
> # Borrowed from email.py
> def _printFiles(self, text, action):

Here you made a functionally irrelevant whitespace change. It is only
a distraction when reading the diff. Before submitting a diff, you
should go over it to make sure it includes only relevant changes.
(It's okay to clean up whitespace and other formatting
inconsistencies, if that's what you want to do, but post that as a
separate change, don't mix it in with your substantive change.)

By the way, the reason 'diff -u' is preferable to 'diff -C' (i.e.,
"context diff" format, which is what you apparently generated), is
that in unified diff, the old and new lines are adjacent to each other
in each hunk, so it's very easy to see what changed without skipping
around in the diff.

> ***************
> *** 33,107 ****
> def _generateMessage(self):
> text = StringIO('')
> !
> text.write(self.model.user)
> text.write('\n')
> text.write(self.model.log)
> text.write('\n')
> !
> self._printFiles(text, 'added')
> self._printFiles(text, 'removed')
> self._printFiles(text, 'modified')
> !
> text.seek(0)
> return text.read()
> !
> class MSNView(IMView):
> """Sends out commit messages on MSN
> !
> This view has three properties:
> !
> ! - passport - the MSN passport to login with
> ! - password - the password for the MSN passport
> ! - to - a comma-separated list of other MSN passports to send the message to
> """
>
> ! def execute(self):
> ! """Sends the IM with the commit message"""
> !
> ! class MsnListener(msnp.SessionCallbacks):
> ! def __init__(self, message):
> ! self._message = message
> ! self._done = False
> ! def chat_started(self, chat):
> ! chat.send_message(self._message)
> ! self._done = True
> !
> ! listener = MsnListener(self._generateMessage())
> ! msn = msnp.Session(listener)
> !
> ! msn.login(self.passport, self.password)
> !
> ! for name in [name.strip() for name in self.to.split(',')]:
> ! msn.start_chat(name)
> ! listener._done = False
> !
> ! while not listener._done:
> ! msn.process(chats=True)
> ! time.sleep(1)
> !
> class AIMView(IMView):
> """Sends out commit messages on AIM
> !
> This view has three properties:
> !
> ! - screenname - the AIM screenname to login with
> ! - password - the password for AIM screenname
> ! - to - a comma-seperated list of other AIM screennames to send the message to
> """
> !
> def execute(self):
> ! """Sends the IM with the commit message"""
> !
> ! bm = BotManager()
> ! bot = TocTalk(self.screenname, self.password)
> !
> ! bm.addBot(bot, "bot")
> ! time.sleep(4)
> !
> ! message = self._generateMessage()
> !
> ! names = [name.strip() for name in self.to.split(',')]
> ! for name in names:
> ! print name
> ! bot.do_SEND_IM(name, message)
>
> --- 32,163 ----
> def _generateMessage(self):
> text = StringIO('')
> !
> text.write(self.model.user)
> text.write('\n')
> text.write(self.model.log)
> text.write('\n')
> !
> self._printFiles(text, 'added')
> self._printFiles(text, 'removed')
> self._printFiles(text, 'modified')
> !
> text.seek(0)
> return text.read()
> !
> !
> class MSNView(IMView):
> """Sends out commit messages on MSN
> !
> This view has three properties:
> !
> ! - passport - the MSN passport to login with
> ! - password - the password for the MSN passport
> ! - to - a comma-separated list of other MSN passports to send the message to
> """
> +
> + try:
> + import msnp
> +

Ahh, now I understand what was going on with the removal of those
imports, okay.

> + def execute(self):
> + """Sends the IM with the commit message"""
> +
> + class MsnListener(msnp.SessionCallbacks):
> + def __init__(self, message):
> + self._message = message
> + self._done = False
> + def chat_started(self, chat):
> + chat.send_message(self._message)
> + self._done = True
> +
> + listener = MsnListener(self._generateMessage())
> + msn = msnp.Session(listener)
> +
> + msn.login(self.passport, self.password)
> +
> + for name in [name.strip() for name in self.to.split(',')]:
> + msn.start_chat(name)
> + listener._done = False
> +
> + while not listener._done:
> + msn.process(chats=True)
> + time.sleep(1)
> + except:
> + pass

Okay, it appears that you have made a change that is totally unrelated
to the addition of Jabber functionality. You've increased the
robustness (or, put another way, increased the fault tolerance) of
commitmessage as a whole, by having it not error if it can't find the
'msnp' module. I'm not sure whether this is a good or bad change --
Stephen can analyze it better -- but I am pretty sure that it doesn't
belong in a code change that's supposed to be about Jabber :-).

> !
> class AIMView(IMView):
> """Sends out commit messages on AIM
> !
> This view has three properties:
> !
> ! - screenname - the AIM screenname to login with
> ! - password - the password for AIM screenname
> ! - to - a comma-seperated list of other AIM screennames to send the message to
> """
> !
> ! try:
> ! from toc import TocTalk, BotManager

*nod* Same thing going on here...

> !
> ! def execute(self):
> ! """Sends the IM with the commit message"""
> !
> ! bm = BotManager()
> ! bot = TocTalk(self.screenname, self.password)
> !
> ! bm.addBot(bot, "bot")
> ! time.sleep(4)
> !
> ! message = self._generateMessage()
> !
> ! names = [name.strip() for name in self.to.split(',')]
> ! for name in names:
> ! print name
> ! bot.do_SEND_IM(name, message)
> ! except:
> ! pass

And same comments apply here, of course.

> !
> ! class JabberView(IMView):
> ! """Sends out commit messages on Jabber

Whoo-hoo! Now we get to the meat of it...

> !
> ! This view has two properties:
> !
> ! - jid - the Jabber ID to login with
> ! - password - the password for the Jabber account
> ! - resource - the Jabber resource of the client (should probably be 'svn' by default)
> !

Why do you say "two" properties when there appear to be three?

> ! NOTE: A 'to' argument is not necessary, as the view will send notifications to
> ! everybody on the JID's roster.
> !
> ! NOTE: Perhaps it would also be a good idea to set the Jabber client's presence
> ! to reflect the current revision/datetime of last commit...?
> ! """
> !

You use the prefix "NOTE: " to label two completely different kinds of
entities. The first is a note about how the program works -- it's an
aside to the person reading this documentation, telling them something
they might want to know. The second is basically a "TODO" item;
perhaps it should be filed as an issue in commitmessage's issue
tracker, or perhaps it's enough to mark it as "TODO" here, but in any
case it shouldn't be treated the same way as the preceding "NOTE".

> def execute(self):
> ! """Sends the notification with the commit message"""
> !
> ! try:
> ! import xmpp
> !
> ! # JIDs are compounded with an '@', similar to email addresses
> ! userName, server = self.jid.split('@')
> !
> ! # Set up the client and connect to the host server
> ! jc = xmpp.Client(server)
> ! jc.connect()
> !
> ! # Authorize the client
> ! jc.auth(userName, self.password, 'svn')
> !
> ! # Set the client's initial presence notification
> ! jc.sendInitPresence()
> !
> ! # Generate the message
> ! message = self._generateMessage()
> !
> ! # Send the notification to each individual subscribed to this JID
> ! roster = jc.getRoster()
> ! for name in roster.getItems():
> ! jc.send(xmpp.Message(name, message))
>
> + except:
> + pass

Okay, I see how you ended up increasing fault-tolerance for every
view. You realized when you made your change that 'xmpp' might not be
available. You didn't want that to be an error condition in your new
code, and you extended that reasoning to the other views.

The right way to do this is to first submit a patch conditionalizing
the existing views, then submit a separate patch adding the Jabber
view. Two logically separate changes ==> two patches.

With all these 'except: pass' bits, I wonder if it might be good to
log an error somewhere? Or does commitmessage not have a logging
system? Silent failure is usually frowned on, but maybe it's right
thing here, I don't know.

Okay, on to the things that were not in your patch:

As we already discussed on the soc list, you made changes to a
configuration file, and posted the changed configuration file in
emails, one of which was

   http://commitmessage.tigris.org/servlets/ReadMsg?list=dev&msgNo=30

Whatever changes you made should simply have been part of your
original patch. After all, ./commitmessage.conf in the top of the
commitmessage source tree is a versioned file, and it's supposed to
contain examples of all the configurations, so a reasonably skilled
user can set things up just by tweaking that conf file. So the tweaks
you made to that file are part of your submission, and should have
been in the same email with the code changes, and in 'diff -u' format
of course.

Actually, what I just said could be a little inaccurate: it may be
that the *actual* tweaks you had to make to commitmessage.conf to test
your Jabber functionality are not the same tweaks you would want to
put in that conf file for general distribution. That's fine; in that
case, the latter are what you want to submit when you post, of course.
If anything about the former are important for reporting your testing
setup, you can just describe them in the body of your email.

I diffed your commitmessage.conf against the one in the distribution,
but unfortunately your changes were a mixture of the general (that is,
stuff that should go into the main distribution's conf file) and the
particular (stuff related to your particular testing setup).

Am I correct in thinking that the only "documentation" changes you
needed to make would be in that commitmessage.conf file, and that the
stuff under www/docs/ is not View-specific and therefore is unaffected
by the addition of Jabber functionality?

Looking over the other documentation in the project... I think
INSTALL.txt has a typo:

> Once you have all of these requirements, you can start installing
> commitmessage.
>
> 1. Download and extract commitmessage.tar.gz to a convenient location
>
> 2. Customize the commitmessage-2.0/commitmessage.conf

Oops, why is that "-2.0" there? That should probably be fixed. It's
a two-way street, this dependence on the commitmessage framework to
satisfy the Google SoC completion requirements :-). You get the
benefit of all that code... But then we're going to treat the rest of
that project as (in some limited sense) your purview as well. Fixing
this typo in INSTALL.txt, if it is a typo, would be a separate patch
of course, as it is unrelated to the Jabber stuff.

Also, for your changes to work, there is a (new?) dependency on the
'xmpp' module. You need to tell the user a) that that module is
needed for Jabber, and b) how to get that module. Whether INSTALL.txt
is the most appropriate place for this, I don't know, maybe Stephen
can say. But in any case, this dependency needs to be documented.

Can you make a repost, addressing all the concerns above?

Note that I've set the "Mail-followup-to:" header on this mail to

   users@subversion.tigris.org

because if anyone else follows up, that's probably where we want the
reply to go. However, you (Brian) should probably follow up to
dev@commitmessage.tigris.org and/or soc@subversion.tigris.org
(depending on what you're writing, of course).

Thanks,
-Karl

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org
Received on Sat Jul 23 07:17:17 2005

This is an archived mail posted to the Subversion Users mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.