On Fri, Dec 3, 2010 at 10:44 AM, Jim Jenkins <jej_at_homrichberg.com> wrote:
> I'm planning to use Hooks to add OCR scanning for select documents going into
> a SVN repo. I'm not really sure where to start so I'm hoping someone here can tell
> me if it's possible and even suggest how best to proceed.
I'm going to take a slightly different approach. Pre-commit hooks are not
what you want.
1. A pre-commit hook should only be used if the developer has some way of
fixing an issue. A good pre-commit hook is to make sure all files that end
in *.sh have the property svn:eol-style set to "LF". If a developer doesn't
set this, and the pre-commit hook fails, the developer can easily fix the
problem and recommit the file.
2. The user is left twiddling their thumbs while a hook runs, even a post-commit
hook. If you have a hook that takes a few minutes to run, users will get
impatient. They may simply not bother committing changes they should until
they have a big horking commit which they'll do at the end of the day and
leave.
3. Changing committed files on a commit is very difficult. You, after
all, don't have access to the client's workspace, so you'll have to emulate
their checkout, so you can make your changes and do a commit. Of course that
means that your pre-commit hook will fire off once more, so you'll have to
have some mechanism in place letting your pre-commit hook know to not do
whatever it was supposed to do in the first place.
4. Also, it's a bad idea to change a commit on a user. As Ulrich
Eckhardt pointed out, your user's client doesn't know that the files they
just committed were changed. Besides, what if your pre-commit hook introduced an
error as a side effect? I once wrote a pre-commit hook in
ClearCase to automatically expand RCS keywords. On occasion, the pre-commit
hook expanded a sprintf statement or something like that, and the developer
was furious because their program worked, and I botched it up.
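
To make point 1 concrete, here is a rough sketch of the kind of check I mean, not a finished hook. It assumes the hook reads the transaction with `svnlook changed` and `svnlook propget`; the function names are my own invention, and the property lookup is passed in as a callable so the logic can be tried without a repository.

```python
import subprocess


def changed_paths(svnlook_changed_output):
    """Return added or modified file paths from `svnlook changed` output.

    Each line is a two-character status, two spaces, then the path.
    Deleted paths and directories (trailing slash) are skipped.
    """
    paths = []
    for line in svnlook_changed_output.splitlines():
        status, path = line[:2], line[4:]
        if "D" not in status and path and not path.endswith("/"):
            paths.append(path)
    return paths


def scripts_missing_lf(paths, get_eol_style):
    """Return the *.sh paths whose svn:eol-style is not "LF".

    get_eol_style is any callable mapping a path to the property value,
    or None when the property is unset.
    """
    return [p for p in paths if p.endswith(".sh") and get_eol_style(p) != "LF"]


def svnlook_eol_style(repos, txn, path):
    """Hypothetical helper: read svn:eol-style for a path in a transaction."""
    result = subprocess.run(
        ["svnlook", "propget", "-t", txn, repos, "svn:eol-style", path],
        capture_output=True,
        text=True,
    )
    # Treat any failure (for example, property not set) as "no value".
    return result.stdout.strip() if result.returncode == 0 else None
```

In a real hook you would feed `changed_paths` the output of `svnlook changed -t TXN REPOS`, print the offending paths to stderr, and exit non-zero so the commit is rejected. The developer sets the property and recommits, which is exactly the kind of fix a pre-commit hook should ask for.
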
I would instead think of your committed files as "source" code, and of your
OCR scans as "compiled" code.
What you probably want, although you really don't compile, is a continuous
build server that takes the committed files, and creates the needed OCR
scans of these files, and stores them where they can be referenced. The
storage area does not have to be Subversion (and in fact, I would argue that
Subversion is not your ideal storage area).
Take a look at Hudson. It's a powerful continuous build server and is very
flexible in its setup. With Hudson, you could automatically do the scans
after a commit, and then email the user if the scan failed for some reason.
It is possible to only have Hudson scan the files that were changed (since
Hudson knows which files were committed). And, it is possible to have Hudson
FTP or store the changed OCR files onto another server (or to simply keep
the scanned archive on Hudson itself).
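
As a rough illustration of the build step, here is a sketch of a script such a job could run over the files changed in a commit. Everything specific in it is an assumption on my part: the list of scan file extensions, the function names, the archive layout, and the use of the `tesseract` command line tool as the OCR engine (you haven't said which OCR software you use). The OCR call is passed in as a callable so you can swap in your own.

```python
import subprocess
from pathlib import Path

# Assumed set of file types worth scanning; adjust to your documents.
SCAN_SUFFIXES = {".tif", ".tiff", ".png", ".jpg"}


def ocr_targets(changed, workspace, archive):
    """Pair each changed scan with an output base path in the archive.

    changed holds paths relative to the checkout; non-scan files are ignored.
    """
    targets = []
    for rel in changed:
        if Path(rel).suffix.lower() in SCAN_SUFFIXES:
            source = Path(workspace) / rel
            out_base = (Path(archive) / rel).with_suffix("")
            targets.append((source, out_base))
    return targets


def run_ocr(targets, ocr=None):
    """Run OCR on every target and return the sources that failed."""
    ocr = ocr or tesseract
    failed = []
    for source, out_base in targets:
        out_base.parent.mkdir(parents=True, exist_ok=True)
        if not ocr(source, out_base):
            failed.append(source)
    return failed


def tesseract(source, out_base):
    """Assumed OCR engine: `tesseract IMAGE OUTPUTBASE` writes OUTPUTBASE.txt."""
    return subprocess.run(["tesseract", str(source), str(out_base)]).returncode == 0
```

If `run_ocr` returns any failures, the script can exit non-zero so the build is marked as failed, and the build server's own notification can then email whoever made the commit. Nothing here touches the repository, so the user's commit finishes immediately and their working copy stays in sync.
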
It'll take a bit of tweaking, but so would trying this in Subversion. And,
you and your users would be much happier with this arrangement.
--
David Weintraub
qazwart_at_gmail.com
Received on 2010-12-06 16:57:56 CET