
Re: Preserving MacOS Files in SVN (was: Pre/Post-processing)

From: Steve Sisak <sgs_at_codewell.com>
Date: 2007-10-04 02:20:27 CEST

At 11:08 AM -0400 10/3/07, Mark Phippard wrote:
> > There, done. Any holes?
>I probably should not wade into this discussion as it seems like
>asking any questions gets interpreted as not wanting to see the
>feature added.

Hopefully that's not the case -- if you can identify technical
issues, that would be a good thing.

>I use OSX (not these features in general) and would
>like to see it better supported so that there are no barriers to


>That being said, I do not see where any of the proposed solutions
>really talks about how Subversion actually works and how it would need
>to be modified.

I'm only beginning to dig into that, so your help would be most appreciated.

>Basically, you are talking about the format of the file stored in
>Subversion repository and the format of the file in the WC not being
>the same. Given that, how would you solve the following?

Can I have an acronym expansion on WC? (Assuming Working Cache)

>* client and server communicate with binary deltas. This means that
>the client will have to convert local file to repository
>representation to exchange deltas. Say there are local changes to a
>file, and you are updating from server. The local version has to be
>encoded, receive and apply the deltas, and then be decoded again. Can
>this be done reliably? How would/could conflicts be handled?

The encodings in discussion are just a concatenation of the various
streams (with headers) -- as long as we use the same order of the
segments, the encoding can be done on the fly easily.

>* client uses a heuristic to determine if file is changed locally. In
>this case, to check if a file is modified it would have to be encoded,
>and then if the size is not different, a byte by byte compare would
>have to be performed.

What's wrong with modification time? Mac OS modification time is only
updated if a file (any part) is changed.

In any case, this is no different than having 2 files plus a small
fixed size header. How does the client currently handle a 1 byte
write to the middle of a file (which wouldn't change the length)?
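
To make the question concrete, here is a minimal sketch of the kind of
size-plus-modification-time check being discussed. It is not taken from
the Subversion source; the names Recorded, record and maybe_modified
are hypothetical and exist only for this illustration.

```python
import os
from collections import namedtuple

# Hypothetical record of what the client saw at checkout/commit time.
Recorded = namedtuple("Recorded", "size mtime_ns")

def record(path):
    """Remember the size and modification time of a file."""
    st = os.stat(path)
    return Recorded(st.st_size, st.st_mtime_ns)

def maybe_modified(path, recorded):
    """Cheap check: True if size or mtime differs from the record.

    A same-length write (e.g. 1 byte in the middle of the file) leaves
    the size unchanged, so it is only caught via the modification time.
    """
    st = os.stat(path)
    return st.st_size != recorded.size or st.st_mtime_ns != recorded.mtime_ns
```

If both values match, a client could skip the byte-by-byte compare
entirely; whether that is safe enough is exactly the open question.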

>This will be very slow. Let's assume it is
>worth it, is the process reliable? Does encoding/decoding the same
>file always give the exact same results?

Please take a moment to examine the definition of the AppleSingle and
AppleDouble formats:


I think this will answer your question.

Since there are file system calls to open the resource fork as a flat
file, both formats are essentially the concatenation of a header and
one or two files -- the header contains the extended file attributes
and offsets and sizes of the other two files in the "encoded" file.
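
As a rough illustration of that layout, here is a simplified sketch
based on my reading of the published AppleSingle/AppleDouble format
(26-byte header, then 12-byte entry descriptors, all big-endian). The
constants and the encode/decode helpers are assumptions made for this
example, not code from the submitted patch.

```python
import struct

# Values as I understand them from the format definition (assumption).
APPLESINGLE_MAGIC = 0x00051600
APPLEDOUBLE_MAGIC = 0x00051607
FORMAT_VERSION = 0x00020000
ENTRY_DATA_FORK = 1
ENTRY_RESOURCE_FORK = 2
ENTRY_FINDER_INFO = 9

HEADER_LEN = 26       # magic(4) + version(4) + filler(16) + count(2)
DESCRIPTOR_LEN = 12   # entry id(4) + offset(4) + length(4)

def encode(entries, magic=APPLESINGLE_MAGIC):
    """Concatenate header, descriptors and payloads.

    entries is a list of (entry_id, bytes); the caller fixes the order,
    so the same input always produces the same output bytes.
    """
    header = struct.pack(">II16xH", magic, FORMAT_VERSION, len(entries))
    offset = HEADER_LEN + DESCRIPTOR_LEN * len(entries)
    descriptors = b""
    for entry_id, payload in entries:
        descriptors += struct.pack(">III", entry_id, offset, len(payload))
        offset += len(payload)
    return header + descriptors + b"".join(p for _, p in entries)

def decode(blob):
    """Split an encoded blob back into (magic, [(entry_id, bytes), ...])."""
    magic, _version, count = struct.unpack_from(">II16xH", blob, 0)
    entries = []
    for i in range(count):
        pos = HEADER_LEN + DESCRIPTOR_LEN * i
        entry_id, offset, length = struct.unpack_from(">III", blob, pos)
        entries.append((entry_id, blob[offset:offset + length]))
    return magic, entries
```

Passing APPLEDOUBLE_MAGIC and leaving out the data fork entry gives the
AppleDouble-style header file, with the data fork kept separately.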

The difference is whether the data fork is included or a stand-alone file.

>I am sure the important parts of a file are always exactly the same
>but is the actual binary encoded file always exactly the same?

Since the order of resources in the resource fork is inconsequential,
modifying a resource in the middle of a file might reorder the
contents of the resource fork section of the file, but I believe that
this would always be the result of an actual change.

>If not, you are going to get a lot of false positives from svn
>status, which means those changes will be committed.

If this were to be deemed a serious issue, there's no reason we
couldn't sort the resource fork on the first status call after a

>* an alternative to previous, is that perhaps the decoded version of
>the file is stored in the WC to make these comparisons faster.

Assuming the WC is the local cache, that would be fine, as we're
really just talking about local files.

>But then SVN needs to be enhanced to compare resource forks too.

You can open the resource fork as a file and use normal file compare
-- you just won't see the structure or catch a false positive where a
resource has moved in the file but not changed.

However if the flat comparisons are equal, the resource forks are identical.
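
A flat compare of that kind can be sketched as follows. The function
name streams_equal is hypothetical, and the sketch assumes both
arguments are ordinary blocking binary file objects (for example, the
resource forks of two files opened as flat files).

```python
def streams_equal(a, b, chunk_size=65536):
    """Compare two binary file objects chunk by chunk.

    Returns True only if both streams yield identical bytes and end at
    the same point. Assumes read() returns full chunks until EOF, as
    buffered regular files and io.BytesIO do.
    """
    while True:
        chunk_a = a.read(chunk_size)
        chunk_b = b.read(chunk_size)
        if chunk_a != chunk_b:
            return False
        if not chunk_a:  # both streams ended at the same place
            return True
```

This reports a difference when resources are merely reordered, which
is the false positive mentioned above.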

>It also
>pushes the complications back to the process that talks to the server
>as now an extra encoding step has to happen.

Will need to look at the code -- if you have an abstraction of a file
stream, it could be generated on the fly.
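
Something along these lines is what I have in mind, as a sketch only.
It assumes seekable binary file objects for each fork and the
AppleSingle header layout as I understand it; stream_encoded and its
parameters are hypothetical names, not Subversion APIs.

```python
import io
import struct

def stream_encoded(forks, magic=0x00051600, chunk_size=65536):
    """Yield the encoded file piece by piece without buffering it all.

    forks is a list of (entry_id, seekable binary file object) in the
    order the segments should appear. The header is built first from
    the fork sizes, then each fork is streamed through in chunks.
    """
    sizes = []
    for _, f in forks:
        sizes.append(f.seek(0, io.SEEK_END))  # seek returns the new position
        f.seek(0)
    # 26-byte fixed header, then one 12-byte descriptor per fork.
    offset = 26 + 12 * len(forks)
    header = struct.pack(">II16xH", magic, 0x00020000, len(forks))
    for (entry_id, _), size in zip(forks, sizes):
        header += struct.pack(">III", entry_id, offset, size)
        offset += size
    yield header
    for _, f in forks:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

A delta generator that accepts an iterator of byte chunks could then
consume this directly, so no temporary encoded file is needed.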

>Are all of these problems solvable? Most likely ... there are a lot
>of smart people around. I do not think the solution is nearly as easy
>as it is being portrayed.

We do have an existing implementation of AppleDouble submitted as a
patch, although I have not dived into it too deeply yet.

>I do not think there is any reason to respond to this email point by

Oops. Too late. :-)

>I am not a core SVN coder, I will not be making these changes.
> I think you should just take these things into consideration in the
>proposal and try to address them so we can keep the proposal moving
>forward. Or perhaps step back and think of a different approach that
>makes these things less of an issue.

No problem.

Thanks for the review.

What I'm hoping is that the "encoded" file can be represented as a
stream that is just a concatenation of the (manufactured) header, the
resource fork (read as a byte stream), and the data fork (for

This might generate a false positive for a change if a resource is
modified and changed back, which is no worse than a source file
being reordered, and at least all the data in the file is preserved.

A smarter resource fork compare would be a good thing, but that would
be another case of a smarter XML compare for XML-based files
(structured files where order is not significant) -- I'd consider it
an optimization.

Best wishes,


Received on Thu Oct 4 02:22:59 2007

This is an archived mail posted to the Subversion Dev mailing list.
