I have a suggestion for special file storage that would be
implementable before 1.0 goes out the door. I think that at least the
ability to store symlinks in the repository is something that a lot of
people want. But since the issue is a source of a lot of
controversy, I'll try to keep the scope as tight as possible.
* Scope
I can find at least three different things that different people would
want from special file storage.
** Special files with semantics in repository - not
This is what I would call, for example, symlinks in the repository -
symlinks that work inside the repository in all cases, both when
browsing and when running operations on URLs, and which are handled
meaningfully when branching. I can see many uses for this kind of
thing - but they belong to a different problem.
These are not in scope for this suggestion.
** Special files at import/export time - not
This is a bit akin to making a distribution tarball. Of course it can
often be handy if Subversion could do stuff like this automatically,
but since people's needs differ quite a lot, it may be more trouble
than it is worth. Scripts can easily translate all sorts of special
files into versioned properties - and back to files again when
exporting the data from the repository.
This is something that can be done today already, and probably only
needs a few scripts more. So this isn't in scope, either.
** Special files in a live working copy - yes
Often one encounters the need to have special files inside the working
copy itself. These can of course be generated by a script inside the
working copy - but that creates a problem in the general handling of
the files, since they do not carry over properly between updates and
other operations. It would be a lot easier to have the client actually
handle the files properly.
This is what I wish to solve: making the Subversion client able to
handle a working copy with special files in it.
* Reason
For normal source code storage in a repository, special files are
almost never needed. But source code is not the only thing people want
to store in a repository. Some people want to have their whole
home-directory as a working copy. Some people want to store their
configuration files in a repository. Some people want to have /etc as
a working copy entirely. Some people want to have /dev as a working
copy :-) Source packages from other people might contain symlinks and
preserving those would be necessary for compiling the package.
I have personally run into wanting to store symlinks and named pipes
in a repository.
* Focus
The focus of the effort should be on being able to have a working copy
with all the special files ready directly after checkout - and on
tracking changes when updating that working copy. That is, the client
should be able to create all the different special files - or, put
another way, to replicate or re-create what was put in the
repository.
The ease of adding the special files in the working copy and having
them behave perfectly in all working copy operations is of course
important, but in my opinion the feature is already useful without
that as well.
The types of special files I am considering are symlinks, device files
and named pipes. Hard links in some form could be useful, but they
have a lot more problems as well.
No special semantics should be implemented in the repository for the
files. The files should behave just as regular files do in the
repository - no special handling should happen on cp or any other
operation.
* How
Well, there are of course several ways to do it - but I hope to arrive
at the simplest and easiest-to-implement model here. It can obviously
be expanded to cover more cases post-1.0.
** In the repository
I want all special files to be just simple files inside the
repository: simple text files with certain properties set on
them. This means no schema changes are necessary - nor any changes
to tools like ViewCVS or the web access.
** In the client
In the client, I want them to be as close to just simple files as
possible. This requires dutifully keeping the 'stored representation'
of a file separate from its representation in the working copy. This
is actually done already - for keywords and line-ending
conventions - where the text-base is not identical to the actual file.
What this means in practice is that I want the text-base and the
property files to be exactly as they are for other files - no special
handling there.
Now for the actual handling of the files in the working copy - there's
more than one way to do it, but I'd like to start from the easiest one
possible.
*** Create from scratch method
This I think is the bare-bones implementation. For this method, all of
the important information about a special file is kept in properties -
modifications to the properties are made just as they are with
every other property.
When the working copy is checked out, the file is created according to
the properties, ignoring the actual contents of the file, if any. When
an update comes that modifies the properties, the file is just blown
away and re-created based on the properties. When the file is checked
for modifications, no modifications are ever reported - i.e. the file
is assumed to be the same as the text-base.
The problems are of course obvious - no automatic determination of the
properties, no tracking of local edits whatsoever, and so on. But
this would be enough to get started, I'd say - and perfectly enough
for the few cases I'd need these things in.
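To make the above a bit more concrete, here is a rough sketch of what
the client side could do, written in Python purely for illustration.
The property names (special:type, special:target, special:major,
special:minor) and the function names are made up for this example -
nothing of the sort is defined yet - and the real thing would of
course live inside the working copy library.

```python
import os
import stat

# Hypothetical property names; the proposal does not define any yet.
PROP_TYPE = "special:type"      # "symlink", "fifo", "chardev" or "blockdev"
PROP_TARGET = "special:target"  # where a symlink points
PROP_MAJOR = "special:major"    # device major number
PROP_MINOR = "special:minor"    # device minor number


def recreate_special(path, props):
    """Blow away whatever is at `path` and re-create it from `props`."""
    kind = props[PROP_TYPE]
    if kind not in ("symlink", "fifo", "chardev", "blockdev"):
        raise ValueError("unknown special file type: %r" % (kind,))
    if os.path.lexists(path):
        os.remove(path)  # works for symlinks, FIFOs and device nodes
    if kind == "symlink":
        os.symlink(props[PROP_TARGET], path)
    elif kind == "fifo":
        os.mkfifo(path)
    else:
        mode = stat.S_IFCHR if kind == "chardev" else stat.S_IFBLK
        device = os.makedev(int(props[PROP_MAJOR]), int(props[PROP_MINOR]))
        os.mknod(path, mode | 0o600, device)  # usually needs root


def is_locally_modified(path):
    """In this method the file is always assumed to match the text-base."""
    return False
```

Checkout and update would both end up calling something like
recreate_special(); the contents of the text-base are never looked at.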
*** Generated text method
This is the next phase, requiring a bit more effort. For this method,
the attributes of the special file would no longer be kept in
properties; only a single property detailing the type of the special
file would remain. Instead the contents of the file define the
attributes.
When the working copy is checked out, the file is created according to
the contents of the file in the repository. But this would be a
two-way conversion - from a simple text-format to a special file and
from the special file to a text-format. This means that when the
special file is modified - it would show up as locally modified. And
'svn diff' could show the difference between the text-representations
of the files. Of course, like everywhere, the text-base would contain
just the text-representation.
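As an illustration of the two-way conversion, here is a small Python
sketch. The one-line text format used here ("link <target>", "fifo",
"chardev <major> <minor>", "blockdev <major> <minor>") is only an
assumption made up for the example - the actual formats are yet to be
defined - and so are all the function names.

```python
import difflib
import os
import stat


def special_to_text(path):
    """Return the (hypothetical) text-representation of a special file."""
    st = os.lstat(path)
    if stat.S_ISLNK(st.st_mode):
        return "link %s\n" % os.readlink(path)
    if stat.S_ISFIFO(st.st_mode):
        return "fifo\n"
    if stat.S_ISCHR(st.st_mode) or stat.S_ISBLK(st.st_mode):
        kind = "chardev" if stat.S_ISCHR(st.st_mode) else "blockdev"
        return "%s %d %d\n" % (kind, os.major(st.st_rdev), os.minor(st.st_rdev))
    raise ValueError("not a special file: %r" % (path,))


def text_to_special(text, path):
    """Re-create the special file at `path` from its text-representation."""
    kind, _, rest = text.rstrip("\n").partition(" ")
    if kind not in ("link", "fifo", "chardev", "blockdev"):
        raise ValueError("unknown special file type: %r" % (kind,))
    if os.path.lexists(path):
        os.remove(path)
    if kind == "link":
        os.symlink(rest, path)
    elif kind == "fifo":
        os.mkfifo(path)
    else:
        major, minor = (int(n) for n in rest.split())
        mode = stat.S_IFCHR if kind == "chardev" else stat.S_IFBLK
        os.mknod(path, mode | 0o600, os.makedev(major, minor))  # needs root


def is_locally_modified(path, text_base):
    """Compare the working file's text-representation with the text-base."""
    return special_to_text(path) != text_base


def diff_against_base(path, text_base):
    """Unified diff between the text-base and the working special file."""
    return "".join(difflib.unified_diff(
        text_base.splitlines(True),
        special_to_text(path).splitlines(True),
        "text-base", "working copy"))
```

With something like this, re-pointing a symlink in the working copy
makes is_locally_modified() return true, and the diff is an ordinary
one-line text change.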
How conflicts should be handled is of course yet another issue. Should
there be three text-representation files in the working copy which
could be handled properly - and then 'svn resolve' would substitute
the text-representation with the actual special file? Perhaps.
This would be more work than the simple solution above, but is of
course much better. The problem here is that one has to be careful not
to trigger too many cases where the files would need to be handled
specially - the 'svn:externals' stuff showed pretty well how these
things can escalate.
* What's next
I would welcome any input you can muster on this
matter. I am still very unsure how best to achieve all this. I am
willing to do a lot of the coding, unless I get terribly bogged down
at work again - but I hope to achieve all this with the absolute
minimum of code and special cases.
If work is started based on this, the next step would be defining the
actual formats for the special files and specifying what to support
and what not to. Then things can be added in small patches with tests.
-- Naked
Received on Fri Oct 11 18:43:32 2002