Re: [PROPOSAL] Using binary mode in Python open() calls

From: Mark Phippard <markp_at_softlanding.com>
Date: 2006-04-18 02:33:27 CEST

Michael Haggerty <mhagger@alum.mit.edu> wrote on 04/17/2006 04:27:58 PM:

> Paul Burba wrote:
> I can think of lots of reasons to *expect* problems switching all file
> access to binary mode. The fact that no problems turned up in your XP
> test should be considered to be a lucky coincidence until proven
> otherwise.

What sort of problems would you expect? In C, opening a file in binary mode
has no effect on most platforms. In fact, when we were working on the C-side
of the port, several different committers told us it has no effect on any
*nix port. Is Python different? If so, how exactly?

> - It will not occur to authors of future tests to open text files in
> non-text mode, so future tests will likely be broken on OS400.

We expect this. Paul has full commit rights; once the general concept is
approved, he can fix tests as necessary if/when someone commits changes and
forgets. This is much easier than having to make changes in the WC
whenever we want to run the tests.

> - The contents of the file itself would not be in the correct text
> format for the local platform. This would make it difficult to look at
> or process a test's intermediate results using the platform's standard
> tools.

Other than OS/400, I cannot think of any platform that does any kind of
translation of the text encoding. What problems are you anticipating
here?

> > All of these problems are easily avoided by using the 'b' mode with
> > open().
>
> Just a very naive question then: why does the OS400 version of stdio (or
> Python) do translation when opening a text-mode file? Why not treat
> text files the same as binary in general on this platform?

OS/400 is EBCDIC native. If you search the dev@ list for EBCDIC you can
probably find several messages where Paul has explained this in great
detail for the C-side of the port. Basically, files are tagged with a
CCSID that indicates their encoding. When opened in text mode, the
contents are converted to the job CCSID, which is always EBCDIC. For
Subversion purposes, we want to keep stuff in UTF-8, so opening the files
in binary mode tells OS/400 not to translate the encoding of the content
into EBCDIC.
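
As a minimal sketch of the idea (the helper names below are hypothetical
and are not taken from the Subversion test suite), the pattern is to open
files with the 'b' flag and do the UTF-8 encoding and decoding explicitly,
so that the bytes on disk never pass through a platform text-mode
translation:

```python
import os
import tempfile


def write_utf8(path, text):
    # Hypothetical helper: 'wb' asks the platform to store the bytes as
    # given, with no text-mode translation on the way out.
    with open(path, "wb") as f:
        f.write(text.encode("utf-8"))


def read_utf8(path):
    # Hypothetical helper: 'rb' returns the stored bytes untouched; the
    # decoding step is explicit rather than left to the platform.
    with open(path, "rb") as f:
        return f.read().decode("utf-8")


def roundtrip_demo():
    # Write a short non-ASCII string, then return both the raw bytes on
    # disk and the decoded text, so the two can be compared.
    text = u"caf\u00e9\n"
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_utf8(path, text)
        with open(path, "rb") as f:
            raw = f.read()
        return raw, read_utf8(path)
    finally:
        os.remove(path)
```

Calling roundtrip_demo() should return the UTF-8 bytes exactly as written,
together with the original text. This has only been reasoned about for
ordinary platforms; the OS/400 behaviour is as described above, not
something the sketch itself demonstrates.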

Mark

Received on Tue Apr 18 02:33:57 2006

This is an archived mail posted to the Subversion Dev mailing list.