Re: Cross-Platform Character Encoding Issues

From: Kalin KOZHUHAROV <kalin_at_thinrope.net>
Date: 2005-08-23 03:43:53 CEST

Gustave T. Stresen-Reuter wrote:
> On Aug 22, 2005, at 5:38 PM, Kalin KOZHUHAROV wrote:
>
>>>> "Gustave T. Stresen-Reuter" <tedmasterweb@mac.com> wrote on 08/22/2005
>>>> 11:33:06 AM:
>>>>
>>>>> Specifically, documents checked into Subversion from the Mac (encoded
>>>>> in utf-8 No BOM) and then checked out onto Windows end up incorrectly
>>>>> encoded (accented characters display incorrectly). Likewise, documents
>>>>> created on the windows machines, checked into Subversion and then
>>>>> checked out onto the Mac end up with "gremlins" (characters that don't
>>>>> display but are definitely a part of the document).
>>>>>
>>>>> I've read in several places that documents checked into Subversion are
>>>>> converted to utf-8, but if that were true, why would we end up with
>>>>> these mis-encoded documents? Is it possible that JEdit or some other
>>>>> aspect of Windows is messing with the encoding and if so, what
>>>>> could it possibly be?
>>>>>
>>>>> This is somewhat urgent so any help resolving this issue is greatly
>>>>> appreciated.
>>>>
>>>>
>>>>
>>>> Subversion does not do anything to the contents of your files. The
>>>> lone exception being that you can ask Subversion to do stuff with the
>>>> EOL characters and/or expand specific keywords.
>>
>>
>> Yes, make sure svn:eol-style is "native". If you don't believe
>> subversion (then why do you use it :-), make an MD5 sum of the file
>> before svn add, after svn add, after commit and after checkout on a
>> different platform. All MD5s should be the same. If not, post here
>> which are the different ones.
>
>
> Indeed, it appears to be a problem with Tortoise. Here is the result of
> a file checked out using TortoiseSVN and one using the Subversion
> Windows command line tool:
>
> Mac: MD5 (index.html) = 45f0f1bc8e0571a025e225dea7f7c353
> Win cmd line: MD5 (index-cmd.html) = 45f0f1bc8e0571a025e225dea7f7c353
> Win Tortoise: MD5 (index-win.html) = 5384838ac87ceae69925eae16d9e6a7d
>
> Thanks to everyone for your help. Is this a TortoiseSVN configuration
> issue or a Windows configuration issue???

Well, if you get the correct file with the command-line client, then it
should be a TortoiseSVN issue. Try posting this simple test case to the
TSVN list and see what happens there.
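
If you want to repeat the MD5 comparison on any platform without depending
on a local md5 tool, something like the following Python sketch should do.
The function names are made up for illustration, and the file names are
passed on the command line (for example index.html, index-cmd.html and
index-win.html from the test above). It reads the files in binary mode, so
line endings and encoding are compared byte for byte.

```python
import hashlib
import sys
from pathlib import Path


def md5_hex(data: bytes) -> str:
    """Return the hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def md5_of(path: str) -> str:
    """Return the hex MD5 digest of a file's raw bytes (no decoding)."""
    return md5_hex(Path(path).read_bytes())


def differing(digests: dict) -> dict:
    """Return the entries whose digest differs from the first entry's."""
    if not digests:
        return {}
    reference = next(iter(digests.values()))
    return {name: d for name, d in digests.items() if d != reference}


if __name__ == "__main__":
    # Hypothetical usage: python md5check.py index.html index-cmd.html index-win.html
    digests = {name: md5_of(name) for name in sys.argv[1:]}
    for name, digest in digests.items():
        print(f"MD5 ({name}) = {digest}")
    for name in differing(digests):
        print(f"{name} differs from {sys.argv[1]}")
```

A copy that differs only in line endings (LF versus CRLF) already produces
a different digest, so that is worth ruling out before blaming the
character encoding.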

Some time ago we had very weird problems with Java code in Shift_JIS
(from windoze) and eclipse/subclipse on UTF-8 linux, but since we added
-encoding Shift_JIS to the javac invocation, everything is fine. However,
the files came out with the same MD5 on both windoze and linux (TSVN and
command line), so that problem is unrelated to yours. (I mention it
because I am gathering i18n issues with subversion and related
technology/software.)
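
To illustrate the difference between the two kinds of problem: in the
Shift_JIS case the bytes were identical everywhere, and only the encoding
assumed by the reader differed. A rough Python sketch (the sample text and
function names are invented for illustration, not taken from our code):

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def decode_as(data: bytes, encoding: str) -> str:
    """Decode bytes with the given encoding, replacing undecodable bytes."""
    return data.decode(encoding, errors="replace")


# Hypothetical sample: a short Japanese string stored as Shift_JIS bytes.
SAMPLE_TEXT = "日本語"
sample_bytes = SAMPLE_TEXT.encode("shift_jis")

if __name__ == "__main__":
    # The digest depends only on the bytes, not on how they are read.
    print("MD5:", md5_hex(sample_bytes))
    # Reading with the right encoding recovers the text ...
    print("as Shift_JIS:", decode_as(sample_bytes, "shift_jis"))
    # ... while assuming UTF-8 gives garbage for the very same bytes.
    print("as UTF-8:", decode_as(sample_bytes, "utf-8"))
```

So an identical MD5 with garbled display points at the reading tool's
encoding setting, while a different MD5 means the bytes themselves were
changed somewhere.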

Kalin.

-- 
|[ ~~~~~~~~~~~~~~~~~~~~~~ ]|
+-> http://ThinRope.net/ <-+
|[ ______________________ ]|
Received on Tue Aug 23 03:46:29 2005

This is an archived mail posted to the Subversion Users mailing list.