
patchette for bug # 983: cvs2svn.py Non-ASCII characters garbled

From: <PVM_at_capmon.dk>
Date: 2003-03-14 18:57:50 CET

Hi (Karl :-),

About http://subversion.tigris.org/issues/show_bug.cgi?id=983:

Even after registering and logging in I couldn't put my comments directly
in the bug report for some reason, so here you go.

It seems the patch you suggested has made it into 0.19.1, so maybe a
quick note about that in the bug would be in order. Also, if the
--encoding option isn't supplied and cvs2svn.py is run anyway, it fails
with a nasty stack trace like the one below. Here is a patch. I myself
didn't know I was using latin-1, so I suggest mentioning "latin-1" in
the error message. I got that from Greg Stein in the bug report, a
great hint for morons like me... :-)
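
As a rough illustration of the idea (not the patch itself, which follows below), here is a minimal sketch in present-day Python 3, where `bytes.decode()` plays the role of the Python 2 `unicode(s, encoding)` call. The helper names `decode_metadata` and `decode_or_exit` are hypothetical and do not exist in cvs2svn.py; the sketch also reports every field that fails rather than only the last one, which is a design choice of this example.

```python
import sys


def decode_metadata(fields, encoding):
    """Decode each raw byte string in `fields` with `encoding`.

    Returns (decoded, failures): `decoded` maps field name to text,
    `failures` lists (name, raw_bytes) pairs that could not be decoded.
    """
    decoded, failures = {}, []
    for name, raw in fields.items():
        try:
            decoded[name] = raw.decode(encoding)
        except UnicodeDecodeError:
            failures.append((name, raw))
    return decoded, failures


def decode_or_exit(fields, encoding):
    """Like decode_metadata, but print a hint and exit on any failure."""
    decoded, failures = decode_metadata(fields, encoding)
    for name, raw in failures:
        sys.stderr.write('*Error* %s %r contained an invalid character. '
                         'Try e.g. --encoding=latin-1\n' % (name, raw))
    if failures:
        sys.exit(1)
    return decoded
```

Calling `decode_or_exit({'author': author, 'log': log}, encoding)` would then either return the decoded strings or stop with the hint instead of a traceback.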

It is the first code I've ever written in Python, so...

Sincerely,

Peter

Index: cvs2svn.py
===================================================================
--- cvs2svn.py (revision 5334)
+++ cvs2svn.py (working copy)
@@ -452,13 +452,29 @@
     author, log, date = self.get_metadata(c_pool)
 
     # convert locale encoded strings to unicode objects
-    l = unicode(log, ctx.encoding)
-    a = unicode(author, ctx.encoding)
+    # This can fail. Handle errors better! No clean up is done!!
+    failed_item=0
 
+    try:
+      l = unicode(log, ctx.encoding)
+    except UnicodeError:
+      failed_str=log
+      failed_item="log"
+
+    try:
+      a = unicode(author, ctx.encoding)
+    except UnicodeError:
+      failed_str=author
+      failed_item="author"
+
+    if failed_item:
+      print "*Error* %s \"%s\" contained an invalid character. Try e.g. --encoding=latin-1" % (failed_item, failed_str)
+      sys.exit(1)
+
     # put UTF-8 encoded unicode-"strings" into svn filesystem
     fs.change_txn_prop(txn, 'svn:author', a.encode('utf8'), c_pool)
     fs.change_txn_prop(txn, 'svn:log', l.encode('utf8'), c_pool)
-
+
     conflicts, new_rev = fs.commit_txn(txn)
     if conflicts:
       # our commit processing should never generate a conflict. if we *do*

It would be nice to wrap this in a try/except combo and put out a "use
--encoding=latin1" error message instead of the cryptic one below:

committing: Thu Feb 14 19:51:27 2002, over 0 seconds
    changing 1.4 : /trunk/dbInfo.descr
Traceback (most recent call last):
  File "/usr/subversion/cvs2svn/cvs2svn.py", line 801, in ?
    main()
  File "/usr/subversion/cvs2svn/cvs2svn.py", line 798, in main
    util.run_app(convert, ctx, start_pass=start_pass)
  File "/usr/subversion/lib/svn-python/svn/util.py", line 38, in run_app
    return apply(func, (pool,) + args, kw)
  File "/usr/subversion/cvs2svn/cvs2svn.py", line 726, in convert
    _passes[i](ctx)
  File "/usr/subversion/cvs2svn/cvs2svn.py", line 683, in pass4
    c.commit(t_fs, ctx)
  File "/usr/subversion/cvs2svn/cvs2svn.py", line 455, in commit
    l = unicode(log, ctx.encoding)
UnicodeError: ASCII decoding error: ordinal not in range(128)
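
The last line of the traceback suggests the log message was being decoded as ASCII, which rejects any byte above 127. A small sketch in present-day Python 3 reproduces the same class of failure; the sample log text is made up for illustration, and `bytes.decode()` stands in for the Python 2 `unicode(log, ctx.encoding)` call.

```python
# Hypothetical log message containing two non-ASCII characters,
# stored as latin-1 bytes the way a CVS repository might hold it.
log = 'bl\xe5b\xe6r'.encode('latin-1')

# Decoding as ASCII fails on the first byte outside range(128).
try:
    log.decode('ascii')
except UnicodeDecodeError as exc:
    ascii_error = exc

# Decoding with the right encoding succeeds, and the result can then
# be re-encoded as UTF-8 for storage in the svn filesystem.
text = log.decode('latin-1')
utf8 = text.encode('utf-8')
```

With the correct encoding named explicitly, the same bytes decode cleanly, which is why a hint to try `--encoding=latin-1` is more useful than the bare traceback.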

Received on Fri Mar 14 23:07:36 2003

This is an archived mail posted to the Subversion Dev mailing list.
