Re: line-ending conversion and keyword substitution

From: Greg Hudson <ghudson_at_mit.edu>
Date: 2001-12-12 01:39:40 CET

On Tue, 2001-12-11 at 17:58, Ben Collins-Sussman wrote:
> Greg, your proposal sounds almost *exactly* like my document.

Intentionally so, but Karl asked me to write it without any references
to previous proposals.

> You only talk about updates/checkouts, however. Is it the commit-side
> of the document that you object to?

Yes. In my proposal, no newline translation happens on commit. So, my
problems with your proposal are:

  1. You define newline-style as the in-repository newline style,
     whereas I define it as the working-directory style.
  2. As a result of (1), you do newline normalization on commit.

> > 1. The bits of the file on disk, before the commit command is run
> > 2. The bits which are committed to the repository
> > 3. The bits of the file on disk, after the commit command is run
> >
> > I assert that (1) and (2) should always be the same, to avoid
> > irrevocably destroying data, even though that means the repository has
> > to store some gratuitous diffs a lot of the time. However, (3) should
> > be different; we should do a keyword substitution on the file after we
> > commit it, since certain keyword tags will become different as a result
> > of the commit. (No compelling reason to do a newline substitution,
> > though.)
>
> Obviously you have a different commit system in mind. But I can't
> wrap my head around it. Maybe you can walk me through an example of
> what you're thinking.

I'm not sure what the sticking point is.

Let's say we have a file in the repository with svn:newline-style=native
(which would be the common case for most text files in a multi-platform
repository). William (a Windows user) and Linus (a Linux user) are each
working on the file. For simplicity, let's assume that this is the only
file in the repository, which starts out empty at rev 1.

First, Linus creates the file with an editor, adds it, and checks it
in. (No newline translation in any of those steps.) In rev 2 in the
repository, the file exists with LF line endings.

Second, William checks out the file. In text-base, it has LF line
endings, but when it is copied to the working directory we translate
those to CRLF. William makes some small edits to the file and checks it
in; there is no newline translation, just a binary diff between
text-base (which has LF endings) and the working copy (which has CRLF
endings). In rev 3 in the repository, the file exists with CRLF line
endings. I'll be up front about this: the binary diff between rev 2 and
rev 3 will be somewhat larger than it has to be as a result. But that's
purely a performance issue, and not one which is likely to bug most
people.
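
To make the first two steps concrete, here is a rough Python sketch of
what I have in mind. The helper names (to_native, checkout, commit) are
made up for illustration and are not Subversion functions, and I am
assuming that only CRLF and LF count as line endings:

```python
# Hypothetical helpers for illustration; not Subversion's real API.
LF, CRLF = b"\n", b"\r\n"

def to_native(data: bytes, native_eol: bytes) -> bytes:
    """Rewrite every CRLF or LF line ending as native_eol."""
    return data.replace(CRLF, LF).replace(LF, native_eol)

def checkout(repo_bytes: bytes, native_eol: bytes):
    """Return (text_base, working): only the working file is translated."""
    return repo_bytes, to_native(repo_bytes, native_eol)

def commit(working_bytes: bytes) -> bytes:
    """No translation: the repository stores exactly what is on disk."""
    return working_bytes

# Linus creates the file with LF endings and checks it in.
rev2 = commit(b"first\nsecond\n")

# William checks out on Windows, edits, and checks in.
text_base, working = checkout(rev2, CRLF)
working = working.replace(b"second", b"2nd")
rev3 = commit(working)
```

After this, text_base still has LF endings while rev3 has CRLF endings,
which is the source of the larger-than-necessary binary diff.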

Third, Linus makes some local modifications to the file (which is still
at rev 2 in his working directory), and then does an "svn update".
Subversion acquires a binary diff between rev 2 and 3 of the file,
newline-translates both the old and new text-base files, and merges the
differences between them into the working copy. Even though rev 2 and 3
have different newline endings in the repository, that doesn't cause any
problems with the merge.

Fourth, William does a diff between rev 2 and rev 3 of the file. The
client acquires the repository contents for revision 2 and 3 (perhaps
using text-base for rev 3 and a binary diff against text-base for rev
2), translates both so that they have CRLF line endings, and invokes
diff between the results of those transforms. Only William's actual
edits show up in the diff, not the newline differences.
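
A small sketch of that diff step, using Python's difflib in place of
whatever diff engine the client would really invoke. The function names
are hypothetical, and I am again assuming that only CRLF and LF count as
line endings:

```python
import difflib

LF, CRLF = b"\n", b"\r\n"

def to_native(data: bytes, native_eol: bytes) -> bytes:
    """Rewrite every CRLF or LF line ending as native_eol."""
    return data.replace(CRLF, LF).replace(LF, native_eol)

def changed_lines(old: bytes, new: bytes):
    """Return the non-equal opcodes between two byte strings, line by line."""
    a = old.splitlines(keepends=True)
    b = new.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [(tag, a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]

rev2 = b"one\ntwo\nthree\n"          # stored with LF endings
rev3 = b"one\r\nTWO\r\nthree\r\n"    # stored with CRLF endings

# Diffing the raw repository bytes flags every line as changed.
raw = changed_lines(rev2, rev3)

# Translating both sides to William's native style first leaves only
# the real edit.
translated = changed_lines(to_native(rev2, CRLF), to_native(rev3, CRLF))
```

Here raw contains a single replacement covering all three lines, while
translated contains only the "two" to "TWO" change.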

As you can see, from the user's perspective, my scheme is identical to
your scheme, with one important exception: suppose we notice at this
point that this file wasn't actually a text file created by Linus with
an editor, but a binary file copied in from some other source, which
Subversion failed to recognize as a binary. In your scheme, as soon as
Linus commits the file, it has been destroyed; we have to go find out
where Linus got it from and get it again, if possible. In my scheme, we
just have to check out the file in rev 2 with some kind of newline
override option, and we have it back. My scheme never destroys data;
your scheme sometimes does.

My keyword translation scheme is very similar to my newline translation
scheme, with one difference: on commit, AFTER the commit has succeeded,
we do a keyword substitution on the working copy, as if we had just
checked out the file at the new revision.
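
As a sketch, assuming CVS-style $Keyword$ markers (the keyword names,
the expanded format, and the helper names below are illustrative
assumptions, not a settled design):

```python
import re

# Matches "$Revision$" as well as an already expanded "$Revision: 2 $".
KEYWORD = re.compile(rb"\$(Revision|Author)(?::[^$\n]*)?\$")

def expand_keywords(data: bytes, values: dict) -> bytes:
    """Rewrite each known keyword with its current value."""
    def substitute(match):
        name = match.group(1)
        value = values.get(name.decode("ascii"))
        if value is None:
            return match.group(0)   # no value known: leave it untouched
        return b"$" + name + b": " + value.encode("utf-8") + b" $"
    return KEYWORD.sub(substitute, data)

def commit(working: bytes, new_rev: int, author: str):
    """Return (stored, new_working).

    The repository gets the on-disk bytes unchanged; only after the
    commit has succeeded is the working copy re-expanded, as if it had
    just been checked out at new_rev.
    """
    stored = working
    new_working = expand_keywords(
        working, {"Revision": str(new_rev), "Author": author})
    return stored, new_working

before = b"# $Revision: 2 $\nprint('hi')\n"
stored, after = commit(before, 3, "william")
```

The stored bytes still say revision 2, matching what was on disk at
commit time, while the working copy now says revision 3.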

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sat Oct 21 14:36:52 2006