
Re: Improved patch format - SoC

From: Nicolás Lichtmaier <nick_at_reloco.com.ar>
Date: 2007-04-01 10:11:04 CEST

> Correction: "unified diff" is in fact a later addition to the set of
> diff formats (the original diff didn't have it), and is by no means
> universally accepted as "the standard". IIRC the GCC project, for
> example, insists that patches be submitted in context-diff, not
> unified-diff format. Reading the diff(1) man page will give you a list of
> different formats that diff can output, and consequently patch can accept.
>

I know that. But "unified diff" is clearly the most widely preferred
format. IMO it's the one that should be used as inspiration for the
new format.

> An improved diff format that aims to be(come) compatible with patch
> should be applicable to at least the original plain diff, context-diff
> and unified-diff; possibly even ed-script diff (see diff -e), though I
> suspect that one loses too much context to be useful for patches.
>
> [...]
>
>
>> Just as Subversion took the best of CVS concepts and created a better
>> version control system, we should take this format and enhance it so
>> that it can serve Subversion needs. Those new needs are three:
>>
>> * The ability to describe tree modifications (renames, deletions, etc.),
>>
>
> This is more complicated than it looks, if you insist on compatibility
> with current diff/patch. Diff is file-based, whereas tree modifications
> are not.
>
> For example, patches are separable: you can take a patch which contains
> diffs of several files, split it apart, and apply each file's hunk
> separately. When you add tree-modification metadata to such patches,
> this is (in general) no longer true; and to be safe, a patch program
> that accepts such an enhanced diff format should be able to warn you
> that you're missing some tree modifications when you apply a patch.
>
> I certainly don't know how to do this by simply extending, e.g., unidiff.
>
> [...]
>

Sorry, perhaps I wasn't clear. I'm not proposing a new format compatible
with unidiff. I'm in fact arguing for the opposite: a new format which
is "inspired" by unidiff's goals, achievements and looks, but completely
incompatible. I would even change the "---" and "+++" markers so as to
make it clear that the new format is not compatible, although it might
look similar. The rationale for not even trying to be partially
compatible is to avoid providing something that sometimes works and
sometimes doesn't.

>> * The patch should convey all the meaning implied in a 'svn merge'
>> operation that is considered reasonable. A patch should be like
>> tearing the merge action into two steps, i.e.: 'diff + patch = merge'.
>>
>
> diff + patch != merge
>
> I don't know how that misconception came about, but merge typically
> looks at three sources, not two (and note that there's a prototype 4-way
> merge in Subversion's source tree).
>

Perhaps there's something I'm not understanding. As I see it, the three
ways are:

1) The common ancestor (which is the "from" revision in the diff)
2) The proposed end result
3) The current local source

All of them are present in a normal diff operation.
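
To illustrate the three inputs listed above, here is a minimal Python
sketch, not Subversion's or patch's actual algorithm. The "diff" step
sees the common ancestor and the proposed result; the "patch" step sees
only the recorded hunks and the current local source. The names
make_hunks and apply_hunks are hypothetical, matching is exact (no
fuzz), and hunks whose context overlaps are not handled.

```python
import difflib


def make_hunks(base, theirs, context=1):
    """Diff step: for each change from base to theirs, record the old
    lines (with surrounding context taken from base) and the new lines."""
    hunks = []
    matcher = difflib.SequenceMatcher(a=base, b=theirs, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        lo = max(i1 - context, 0)
        hi = min(i2 + context, len(base))
        old = base[lo:hi]
        new = base[lo:i1] + theirs[j1:j2] + base[i2:hi]
        hunks.append((old, new))
    return hunks


def apply_hunks(mine, hunks):
    """Patch step: locate each hunk's old lines in the local source and
    replace them; raise if the context cannot be found (a conflict)."""
    result = list(mine)
    for old, new in hunks:
        for start in range(len(result) - len(old) + 1):
            if result[start:start + len(old)] == old:
                result[start:start + len(old)] = new
                break
        else:
            raise ValueError("conflict: hunk context not found")
    return result


base = ["a", "b", "c", "d", "e"]         # 1) common ancestor
theirs = ["a", "B", "c", "d", "e"]       # 2) proposed end result
mine = ["x", "a", "b", "c", "d", "E"]    # 3) current local source

merged = apply_hunks(mine, make_hunks(base, theirs))
```

In this sketch the local changes (the added "x" and the edited last
line) survive, and the proposed change from "b" to "B" is applied at its
shifted position. The ancestor reaches the patch step only as the
context and removed lines inside each hunk.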

>> * Support for binary diffs. Binary diffs would just be diffs created
>> with the binary diff algorithm already present in subversion, and then
>> encoded in base64 to get a text representation. The algorithm name
>> should be stated to allow for change in the future.
>>
> This one is tricky. The fundamental reason for the success of the
> diff/patch pair is that the most common diff formats contain enough
> context to allow inexact patches. Patch is smart enough to find where to
> apply a patch even if a file has been (slightly) modified. To do so, it
> makes a number of assumptions about how text files are organized; these
> assumptions typically work for source code and consistently formatted
> text, but will usually fall down when, for example, your text format is
> one paragraph per line (which is fairly common in word processors and
> the like).
>
> With a generic binary file, you can define context in a similar way
> (though definitely not in combination with a block-copy delta
> algorithm!), but you can't invent good heuristics that would make
> inexact patching work, except in very, very limited cases -- that is,
> when your patch program knows everything about the format of the binary
> file it's handling.
>
> If you don't have inexact patch, then the diff/patch thing becomes
> pretty much useless for code exchange (/and/ for merging).
>

The same as with regular Subversion merges: binary merges are aborted and
left as an exercise for the user. That doesn't make Subversion merging
useless, so I don't see why it would render patch merging useless...

Received on Sun Apr 1 10:11:33 2007

This is an archived mail posted to the Subversion Dev mailing list.