
Re: Using Hooks To OCR Documents

From: Ulrich Eckhardt <ulrich.eckhardt_at_dominolaser.com>
Date: Mon, 6 Dec 2010 09:18:56 +0100

On Friday 03 December 2010, Jim Jenkins wrote:
> I'm planning to use Hooks to add OCR scanning for select documents going
> into a SVN repo.

I assume that you know how to OCR the docs, so this is just about SVN
integration.

> Basically I'd like to have every commit to an SVN repo stop at the
> pre-commit (or another more suitable) hook so the submitted files can be
> inspected and if needed run through a command line OCR engine. We are
> dealing with "image" based PDF files so these would be sent off to the
> OCR engine and a "test+image" PDF would be returned. The new PDF would
> replace the original before being sent on it's way into the SVN repo.

I guess you meant "text+image" there, right? Anyway, what you want to do is
possible, and you might be able to use the pre-commit hook for that, but you
shouldn't. A hook should never modify a commit on the server, because the
client has no way of knowing about it: it is never notified that the content
stored in the repository differs from what it sent, so its working copy
silently goes out of sync with the repository.

Suggestions:
1. Trigger a process that OCRs the PDF in question and then, in a second
commit, either replaces the one in the repository or adds the OCRed version
next to it. You could also batch this process, e.g. run it once a night.
2. Simply reject the commit from a pre-commit hook if the file has not been
OCRed already. This makes it the user's responsibility to run the OCR on the
file before committing it.
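
As a rough illustration of suggestion 2, here is a minimal sketch of a
pre-commit hook in Python. It relies on "svnlook changed" and "svnlook cat",
which are standard. Everything else is an assumption on my side: in
particular, has_text_layer() is a hypothetical helper that treats a PDF as
OCRed if pdftotext (from poppler, assumed to be installed on the server)
extracts any text from it. Your OCR tool may offer a better check.

```python
import os
import subprocess
import sys
import tempfile


def changed_pdfs(changed_output):
    """Return the paths of added or modified PDFs from `svnlook changed` output."""
    paths = []
    for line in changed_output.splitlines():
        # Each line holds two status columns, padding, then the path,
        # e.g. "A   trunk/scan.pdf" or "_U  trunk/notes.txt".
        status, path = line[:1], line[4:]
        if status in ("A", "U") and path.lower().endswith(".pdf"):
            paths.append(path)
    return paths


def svnlook(subcommand, repos, txn, *args):
    """Run an svnlook subcommand against the pending transaction."""
    cmd = ["svnlook", subcommand, "-t", txn, repos, *args]
    return subprocess.run(cmd, check=True, capture_output=True).stdout


def has_text_layer(pdf_bytes):
    """Hypothetical check: a PDF counts as OCRed if pdftotext finds any text."""
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
        tmp.write(pdf_bytes)
    try:
        text = subprocess.run(["pdftotext", tmp.name, "-"],
                              check=True, capture_output=True).stdout
    finally:
        os.unlink(tmp.name)
    return bool(text.strip())


def main(repos, txn):
    """Return 0 to accept the commit, 1 to reject it."""
    listing = svnlook("changed", repos, txn).decode("utf-8", "replace")
    missing = [path for path in changed_pdfs(listing)
               if not has_text_layer(svnlook("cat", repos, txn, path))]
    if missing:
        # Subversion passes the hook's stderr back to the committing client.
        sys.stderr.write("Commit rejected, please OCR these files first:\n")
        for path in missing:
            sys.stderr.write("  " + path + "\n")
        return 1
    return 0

# A real hooks/pre-commit script would end with:
#     sys.exit(main(sys.argv[1], sys.argv[2]))
```

Note that the hook only reads the transaction and never changes it, so the
client always knows exactly what ended up in the repository.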

You also mentioned that you only want to scan "select[ed] documents". You
could achieve this with a custom property that you check in one of the
processing steps.
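
A sketch of such a property check, again in Python, could look like the
following. It uses "svnlook propget" on the pending transaction. The
property name "ocr:required" and the accepted values are made up for
illustration, so pick whatever convention suits your team.

```python
import subprocess

# Hypothetical property name, not something Subversion defines itself.
OCR_PROP = "ocr:required"


def propget_command(repos, txn, path, prop=OCR_PROP):
    """Build the svnlook call that reads one property in a transaction."""
    return ["svnlook", "propget", "-t", txn, repos, prop, path]


def wants_ocr(repos, txn, path, run=subprocess.run):
    """Return True if the file is marked for OCR via the custom property."""
    result = run(propget_command(repos, txn, path),
                 capture_output=True, text=True)
    if result.returncode != 0:
        # svnlook fails when the property is not set on this path.
        return False
    return result.stdout.strip().lower() in {"yes", "true", "1"}
```

Users would mark a document with something like
"svn propset ocr:required yes scan.pdf" before committing it.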

Greetings from Hamburg!

Uli

Received on 2010-12-06 09:19:11 CET

This is an archived mail posted to the Subversion Users mailing list.
