Geir,
I looked through some scripts that I wrote to help me sync the GNU
Nano repository and came across a Perl script that might help you
quickly identify all log messages that are not representable in ASCII
(and hence possibly not valid UTF-8).
Attached is the source of the script. To use it, you will need the
libsvn Perl bindings (on Debian, install the `libsvn-perl` package),
and you will need to edit line 20 to change the URL of the Subversion
repository that you wish to examine.
Example output for svn://svn.sv.gnu.org/nano is:
------------------------------------------------------------------------
r619
Added Galician translation by Jacobo Tarr<jtarrio_at_trasno.net>.
------------------------------------------------------------------------
r757
Updated Galician translation; thanks, Jacobo Tarr
------------------------------------------------------------------------
r826
Galician translation brought up to date for 1.1.2 by Jacobo Tarr
------------------------------------------------------------------------
r954
Galician translation update (Jacobo Tarr.
------------------------------------------------------------------------
r958
French translation update (Jean-Philippe Gu곡rd).
------------------------------------------------------------------------
r962
French translation update (Jean-Philippe Gu곡rd).
------------------------------------------------------------------------
r1009
Moved no.po to nn.po.
New Norwegian bokm欠translation, by Stig E Sandoe <stig_at_ii.uib.no>.
Updated Norwegian nynorsk translation, by Kjetil Torgrim Homme
<kjetilho_at_linpro.no>.
------------------------------------------------------------------------
r1013
Moved no.po to nn.po.
New Norwegian bokm欠translation, by Stig E Sand𠼳tig_at_users.sourceforge.net>.
Added missing entries to THANKS.
------------------------------------------------------------------------
r1047
French translation updates (Jean-Philippe Gu곡rd).
------------------------------------------------------------------------
r1070
Norwegian bokm欠translation updates (Stig E Sandoe).
------------------------------------------------------------------------
r1071
Norwegian bokm欠translation updates (Stig E Sand𩮍
------------------------------------------------------------------------
r1072
Norwegian bokm欠translation updates (Stig E Sand𩮍
------------------------------------------------------------------------
r1125
French translation updates (Jean-Philippe Gu곡rd).
------------------------------------------------------------------------
r1133
French translation updates (Jean-Philippe Gu곡rd).
------------------------------------------------------------------------
r1258
French translation update (Jean-Philippe Gu곡rd).
------------------------------------------------------------------------
r1259
Spanish translation updates (Ricardo Javier Cⳤenes Medina).
------------------------------------------------------------------------
r1299
Updated Spanish translation (Ricardo Javier Cⳤenes Medina).
------------------------------------------------------------------------
r1301
Updated French translation (Jean-Philippe Gu곡rd).
------------------------------------------------------------------------
r1500
Updated French translation by Jean-Philippe Gu곡rd.
------------------------------------------------------------------------
r1537
Updated French translation by Jean-Philippe Gu곡rd.
------------------------------------------------------------------------
r1923
Updated French translation by Jean-Philippe Guérard.
------------------------------------------------------------------------
r2102
spell Ulf H峮hammar's name right
------------------------------------------------------------------------
r2373
in do_credits(), display Florian König's name properly in UTF-8 mode;
since we can't dynamically set that element of the array to its UTF-8
equivalent when in UTF-8 mode, we have to use the ISO-8859-1 version and
pass every string in the credits through make_mbstring() to make sure
they're all UTF-8 (sigh)
------------------------------------------------------------------------
r2784
rework the credits handling to display Florian König's name properly
whether we're in a UTF-8 locale or not. This requires a minor hack, but
it's better than requiring a massive function that we only use once
------------------------------------------------------------------------
r2898
Update French manpages by Jean-Philippe Guérard.
------------------------------------------------------------------------
r3924
Update French manpages by Jean-Philippe Guérard.
------------------------------------------------------------------------
r4181
per Jean-Philippe Guérard's updates, in doc/man/fr/*.1,
doc/man/fr/nanorc.5, fix copyright notices; the copyrights are
disclaimed on these translations, but the copyrights of the untranslated
works also apply
------------------------------------------------------------------------
r4182
per Jean-Philippe Guérard's updates, in doc/man/fr/*.1,
doc/man/fr/nanorc.5, fix copyright notices; the copyrights are
disclaimed on these translations, but the copyrights of the untranslated
works also apply
------------------------------------------------------------------------
r4208
in print_opt_full(), use strlenpt() instead of strlen(), so that tabs
are placed properly when displaying translated strings in UTF-8, as
found by Jean-Philippe Guérard
------------------------------------------------------------------------
The corrupted-looking entries are the ones where the log message is
incorrectly stored in ISO-8859-1 rather than UTF-8.
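For illustration only, the distinction above (pure ASCII, valid UTF-8, or neither) can be sketched in a few lines of Python. This is not the attached Perl script; the function names and the sample labels and bytes below are hypothetical, and the sketch assumes you already have each log message as raw bytes.

```python
def classify_log_message(raw: bytes) -> str:
    """Classify raw log-message bytes as 'ascii', 'utf-8', or 'not-utf-8'."""
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Bytes that are neither ASCII nor valid UTF-8, e.g. ISO-8859-1 text.
        return "not-utf-8"


def flag_suspect_messages(messages):
    """Yield (label, classification) for every message that is not pure ASCII."""
    for label, raw in messages:
        kind = classify_log_message(raw)
        if kind != "ascii":
            yield label, kind


# Hypothetical sample data: the same name encoded two different ways.
samples = [
    ("sample-ascii", b"Moved no.po to nn.po."),
    ("sample-latin1", "Gu\u00e9rard".encode("iso-8859-1")),
    ("sample-utf8", "Gu\u00e9rard".encode("utf-8")),
]

if __name__ == "__main__":
    for label, kind in flag_suspect_messages(samples):
        print(label, kind)
```

A message that decodes as UTF-8 is only probably fine: some ISO-8859-1 byte sequences also happen to be valid UTF-8, so anything flagged as non-ASCII is still worth a quick look by eye.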
Received on 2010-12-04 19:27:34 CET