
Re: [PROPOSAL] Using binary mode in Python open() calls

From: Michael Haggerty <mhagger_at_alum.mit.edu>
Date: 2006-04-18 14:13:59 CEST

Mark Phippard wrote:
> Michael Haggerty <mhagger@alum.mit.edu> wrote on 04/17/2006 04:27:58 PM:
>> Paul Burba wrote:
>> I can think of lots of reasons to *expect* problems switching all file
>> access to binary mode. The fact that no problems turned up in your XP
>> test should be considered to be a lucky coincidence until proven
>> otherwise.
>
> What sort of problems would you expect? In C opening a file in binary
> has no effect on most platforms. In fact, when we were working on the
> C-side of the port, several different committers told us it has no
> effect on any *nix port. Is Python different? If so, then how exactly?

No, the "b"inary option doesn't have any effect on *nix systems.
Problems, if any, would be expected under Windows, old MacOS, CP/M,
ENIAC, etc.

Python allows C's stdio libraries to do text <-> binary translation.
Therefore the situation is no different than that for C.

>> - The contents of the file itself would not be in the correct text
>> format for the local platform. This would make it difficult to look at
>> or process a test's intermediate results using the platform's standard
>> tools.
>
> Other than OS/400, I cannot think of any platform that does any kind of
> translation of the text encoding. What problems are you anticipating here?

I'm just referring to Windows' '\n' -> '\r\n' conversion and old MacOS's
(I believe) '\n' -> '\r'. If the file is opened in binary mode, then
these translations are not done and therefore the files on disk are not
in the platforms' expected text file format. I know from experience
that this confuses many Windows tools (for example, some editors).
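
To make the line-ending difference concrete, here is a minimal sketch. It assumes a modern Python 3, where open() does its own newline handling, and it passes newline="\r\n" explicitly to imitate Windows text mode so the result is the same on any platform. The helper name raw_bytes_written is made up for illustration.

```python
import os
import tempfile


def raw_bytes_written(data, binary, newline=None):
    """Write data to a temporary file and return the exact bytes on disk.

    Hypothetical helper: 'binary' selects "wb" versus "w" mode, and
    'newline' is forwarded to open() in text mode only.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        if binary:
            # Binary mode: bytes are stored exactly as given.
            with open(path, "wb") as f:
                f.write(data.encode("ascii"))
        else:
            # Text mode: each '\n' written is replaced by 'newline'.
            with open(path, "w", newline=newline) as f:
                f.write(data)
        # Read back in binary mode so no translation hides the result.
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


text = "line one\nline two\n"
binary_bytes = raw_bytes_written(text, binary=True)
windows_text_bytes = raw_bytes_written(text, binary=False, newline="\r\n")

print(binary_bytes)
print(windows_text_bytes)
```

The binary-mode file keeps bare '\n' terminators, while the simulated Windows text-mode file has '\r\n'. That is the on-disk difference I would expect some Windows tools to trip over.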

>> > All of these problems are easily avoided by using the 'b' mode with
>> > open().
>>
>> Just a very naive question then: why does the OS400 version of stdio (or
>> Python) do translation when opening a text-mode file? Why not treat
>> text files the same as binary in general on this platform?
>
> OS/400 is EBCDIC native. If you search the dev@ list for EBCDIC you can
> probably find several messages where Paul has explained this in great
> detail for the C-side of the port. Basically, files are tagged with a
> CCSID that indicates their encoding. When opened in text mode, the
> contents are converted to the job CCSID, which is always EBCDIC. For
> Subversion purposes, we want to keep stuff in UTF-8, so opening the
> files in binary mode tells OS/400 not to translate the encoding of the
> content into EBCDIC.
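
As a rough illustration of the encoding issue described above, the sketch below compares the bytes of one string in UTF-8 and in an EBCDIC code page. It assumes Python's "cp037" codec as a stand-in for an EBCDIC job CCSID. It does not model OS/400's file tagging or its actual conversion rules.

```python
# The same text as it would be stored in UTF-8 and in EBCDIC.
# "cp037" is used here only as a representative EBCDIC code page.
text = "abc"

utf8_bytes = text.encode("utf-8")
ebcdic_bytes = text.encode("cp037")

print(utf8_bytes.hex())
print(ebcdic_bytes.hex())

# Reading EBCDIC bytes as if they were UTF-8 (or the reverse) would not
# give back the original text, so a layer that silently converts on
# open would alter data that is meant to stay in UTF-8.
round_trip = ebcdic_bytes.decode("cp037")
```

Even for plain ASCII letters the two byte sequences differ, which is why a text-mode open that converts to the job CCSID changes the file contents as seen by the program.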

Thanks for the explanation.

My point of view is that writing a text file in binary mode is
nonstandard and nonportable, and therefore the burden of proof should be
on the proposer to explain why the change will not be a problem on any
platform. (The fact that a test ran successfully, by itself, is not
such a proof.) Something along the lines of "after this change, files
x, y, and z will be written with different line-end conventions, but
that is not a problem because..." would be much more persuasive.

Michael

Received on Tue Apr 18 14:14:52 2006

This is an archived mail posted to the Subversion Dev mailing list.
