Re: [PATCH] #7 OS400/EBCDIC Port: Make svn_io_copy_file() CCSID insensitive.

From: Paul Burba <paulb_at_softlanding.com>
Date: 2006-03-10 16:21:50 CET

Julian Foad <julianfoad@btopenworld.com> wrote on 03/10/2006 08:11:35 AM:

> Paul Burba wrote:
> > Julian Foad <julianfoad@btopenworld.com> wrote on 03/07/2006 09:56:00 AM:
> >
> >>(1) [...] The destination should be opened in the same
> >>mode, and have the same CCSID as the source. [...]
> >
> > FWIW, IBM APR does not provide any way to obtain, specify, or otherwise
> > manipulate a file's CCSID. It can be done with lower-level system calls;
> > IBM provides a rather "cumbersome" (as in requiring a new 30+ line
> > function) API (Qp0lSetAttr) to change the CCSID of an IFS file.
>
> OK. Perhaps it's safe to pretty much ignore the CCSIDs, always setting it to
> UTF-8. I still think the CCSID should be copied by any "file copy" function,
> in theory, but, in practical terms, I was thinking that either
>
> * the unwanted change of CCSID when copying a non-UTF8 file could be a
>   problem in some cases, or
>
> * all files being copied by Subversion will already have a UTF-8 CCSID, in
>   which case no translation will happen when copied in text mode, so the
>   patch is not needed
>
> Now I can see that the second isn't true (e.g. "svn add non-UTF8-file"), and
> maybe the first isn't either; I'll leave that to your judgement, as I haven't
> thought through the possible scenarios and don't know the system well enough
> anyway. If it's not a problem in practice, then fair enough.

The svn add scenario is a concern. With my current set of patches, if a
user adds and commits a file with a CCSID of 37 and another user checks
that file out, the second user gets the file's contents correctly in a
byte-for-byte sense, but the file's CCSID will be 1208 in their working
copy. Recall that OS400 V5R4 APR (built with UTF support) assumes binary
files have CCSID 37 and text files have CCSID 1208. But any other utility
or application on V5R4 that *isn't* built with UTF support assumes binary
and text files both have CCSID 37. So there is a fundamental disagreement
between them as to how to encode and tag a text file, but this is a
problem bigger than Subversion; it's just life on the iSeries.

Ideally, if a user commits a CCSID 37 text file and another user checks it
out, the CCSID of the checked out file would be 37. But I don't see
svn_io_copy_file() as the place to solve this - even if it did preserve
CCSID, the svn add/ci/co scenario previously described still has the same
result. Perhaps the ultimate solution is to support some new svn:
properties (e.g. svn:ebcdic) that cause a checked out file to be retagged
with Qp0lSetAttr()...

...which brings us to my cop-out :-) Mark and I have ported Subversion to
the iSeries primarily to run as a server. We don't know of anyone who is
using the client to commit EBCDIC text files, nor has anyone so much as
hinted to us that this is something they want. If a demand for this
arises we'll certainly reconsider beefing up support for it or possibly
get some other folks in the wider OS400 world to contribute some patches.
For now we feel safer knowing that all files created by Subversion (via
apr_file_open and apr_file_copy) are consistently tagged with 1208.

> >>To complete this patch, svn_io_copy_file() needs to say in its doc-string
> >>that it does a "byte-for-byte" copy, as it is no longer just a wrapper
> >>around apr_file_copy(). (Eventually, I hope apr_file_copy() will be fixed
> >>and documented to do the same.)
> >
> > Do we really want to do that? It becomes true for OS400 yes, and it's
> > been implicitly true for *nix, but it still isn't true for CygWin, no? Am
> > I being too much of a stickler here...or did you mean for the doc-string
> > to indicate that there are platform dependent differences?
>
> Sorry, I thought svn_io_copy_file() said it was a wrapper around
> apr_file_copy(), but it doesn't.
>
> We obviously want svn_io_copy_file() to perform a byte-for-byte copy.
> Therefore we want to document it as such, because, as we have seen in APR, if
> we don't then there is confusion about what it should do to text files. The
> fact that it does not actually do so on all platforms then becomes a definite
> bug, rather than just unspecified behaviour as it is at present, and I think
> that's a good thing.
>
> It seems logical to document the intended behavior at the same time as fixing
> the implementation (on OS400) to conform to that behaviour.

Ok, I understand now, I'll tweak the doc-string.
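
For reference, here is a minimal sketch of what "byte-for-byte" means in
this context, written in plain standard C rather than APR. It is not the
patch itself, nor Subversion's or APR's actual implementation, and the name
copy_file_bytes() is hypothetical. Both files are opened in binary mode so
the C library does no text translation; whether a given platform's runtime
also avoids CCSID conversion in binary mode is an assumption here, and the
sketch does nothing about the destination's CCSID tag.

```c
#include <stdio.h>

/* Hypothetical helper, not part of Subversion or APR.
   Copy SRC to DST without interpreting the contents in any way.
   Returns 0 on success, -1 on any open, read, or write error. */
int copy_file_bytes(const char *src, const char *dst)
{
  FILE *in, *out;
  unsigned char buf[4096];
  size_t n;
  int status = 0;

  /* "rb"/"wb": binary mode, so no newline or text-mode translation. */
  in = fopen(src, "rb");
  if (in == NULL)
    return -1;

  out = fopen(dst, "wb");
  if (out == NULL)
    {
      fclose(in);
      return -1;
    }

  /* Move the data in fixed-size chunks, exactly as read. */
  while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
    {
      if (fwrite(buf, 1, n, out) != n)
        {
          status = -1;
          break;
        }
    }

  /* fread() returning 0 can mean EOF or a read error; tell them apart. */
  if (ferror(in))
    status = -1;

  fclose(in);
  /* A failed close on the output can hide a failed flush. */
  if (fclose(out) != 0)
    status = -1;

  return status;
}
```

A caller would use it as copy_file_bytes("from", "to") and treat any
non-zero return as a failed copy.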

> Apart from that, this patch is fine.
>
> Sorry for the delay in responding this time.
>
> - Julian

Thanks for all the help,

Paul B.

Received on Fri Mar 10 16:22:51 2006

This is an archived mail posted to the Subversion Dev mailing list.
