Re: Newbie questions...

From: Blair Zajac <blair_at_orcaware.com>
Date: 2002-07-30 20:41:39 CEST
Alessandro,
Here's a first pass of the answers to your questions. I'd
wait a day or two for feedback from the other developers on
the mailing list before taking these answers as gospel.
Best,
Blair
-- 
Blair Zajac <blair@orcaware.com>
Web and OS performance plots - http://www.orcaware.com/orca/
> ----------------------------------------------------------
> 
> Here are the questions:
> 
> 1) CVS Compatibility. Is it possible to import or use an existing CVS
> repository? How about exporting?
Yes, we have a `cvs2svn.py' Python script that uses the SWIG interface
to the Subversion libraries.  Currently it does not handle branches or
tags.  There is ongoing work on VCP, a repository-to-repository
copying tool, which is being updated to import other repositories into
Subversion.
As for exporting, there are two ways of `exporting' a repository.
The first, `svn export', creates a directory structure of the
repository at a specific revision that does not contain any of
Subversion's private `.svn/' directories, which are equivalent to
the `CVS/' directory in each checked-out CVS directory.  This is a
client-side operation that does not require administrator access to
the repository.
The second way of exporting a repository is to use Subversion's
administrator program `svnadmin dump' to dump the repository.  The
dump format is specific to Subversion and the only program that reads
this dump format is `svnadmin load' to load the dump into a new or
existing repository.  This can be used as a backup method for
Subversion, but the preferred way of doing this is to use Berkeley
DB's hot backup method.
> 
> 2) Disconnected Operations. Is it possible to checkout a project on a laptop
> computer, fly to [your favorite desert island], come back after a few months
> and merge your stuff into the main branch without destroing the repository
> and without losing any data? (resorting to the developer for solving existing
> conflicts is perfectly acceptable).
Yes.  Currently Subversion supports reverting local modifications and
performing diffs between the local modifications and the unmodified
files.  You cannot switch to different revisions of any files or
directories besides the version you checked out.
Subversion does not have local repository copies where the entire
repository exists on the client system, unlike Arch and BitKeeper.
> 
> 3) Renaming. Is it possible to rename a file or a directory and still hold
> all the information about its history?
Yes.  A simple `svn mv OLD_NAME NEW_NAME' works for files and
directories.
> 
> 4) Atomicity. Sensitive operations are atomic? Can you be sure that nothing
> will ever interfere with your commit?
Yes.  We use the Berkeley DB 4.0.14 database as a backend, with its
support for atomic transactions, recoverability, and hot backups.  On
top of this database, Subversion implements a virtual filesystem that
has atomic commits.
Each commit goes through several phases in the server.  First, the
server checks if a hook script named `start-commit' is available.
This script can perform some simple checking on user rights to the
commit over the whole repository.  If this succeeds, then a
transaction is created in the Berkeley DB.  Next, the server checks if
a `pre-commit' script is available and, if so, passes the name of the
transaction to it.  The `pre-commit' script can perform fine-grained
access control.  If this script succeeds, then the transaction is
applied to the repository as a commit; otherwise it is deleted.
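The phases above can be sketched roughly as follows.  This is a toy
model in Python, not the server's actual code: the `Repository' class,
the hook callables and the transaction names are all hypothetical
stand-ins for the `start-commit' and `pre-commit' scripts and the
Berkeley DB transaction.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Transaction:
    name: str
    author: str
    log_message: str
    changes: Dict[str, str]  # path -> new contents


@dataclass
class Repository:
    # Hypothetical stand-ins for the hook scripts; each returns True to allow.
    start_commit: Optional[Callable[[str], bool]] = None
    pre_commit: Optional[Callable[[Transaction], bool]] = None
    # revisions[N] is the full tree at revision N; revision 0 is empty.
    revisions: List[Dict[str, str]] = field(default_factory=lambda: [{}])
    txn_counter: int = 0

    def commit(self, author: str, log_message: str,
               changes: Dict[str, str]) -> Optional[int]:
        """Return the new revision number, or None if a hook refused."""
        # Phase 1: coarse, repository-wide check before any transaction exists.
        if self.start_commit is not None and not self.start_commit(author):
            return None
        # Phase 2: create the transaction.
        self.txn_counter += 1
        txn = Transaction(f"txn-{self.txn_counter}", author,
                          log_message, dict(changes))
        # Phase 3: fine-grained check that can inspect the transaction.
        if self.pre_commit is not None and not self.pre_commit(txn):
            return None  # the transaction is simply discarded
        # Phase 4: apply the whole transaction as one new revision.
        new_tree = dict(self.revisions[-1])
        new_tree.update(txn.changes)
        self.revisions.append(new_tree)
        return len(self.revisions) - 1
```

For example, a `pre_commit' callable such as
`lambda txn: bool(txn.log_message.strip())' would refuse commits with
an empty log message, and a refused commit leaves `revisions'
untouched.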
> 
> 5) Access Control List at the Repository Level. Can you grant a new user the
> access to the CMS without being forced to grant him access to the host
> computer as well?
Yes.  Subversion supports two methods of accessing a repository.  The
first is file-based, where the repository lives on an (ideally) locally
attached filesystem.  Obviously, here you have to have a local account
to access the repository.
The second way to set up a Subversion repository to enable network
access is to use Apache as a front-end to Subversion.  The Subversion
team built an Apache module named mod_dav_svn which sits on top of the
mod_dav module, which comes with Apache.  DAV is short for WebDAV
(Web-based Distributed Authoring and Versioning).  The mod_dav_svn
module talks to a local repository on the server.
In this case, to give an individual access to the repository but not
the whole server, you would use either the AuthUserFile or the
AuthGroupFile option in Apache's configuration file, together with a
separate password file.
You can set up anonymous read-only access to the whole repository and
grant read-write access to select individuals.  That's what we do with
the Subversion source code that is self-hosted in the Subversion
server.
We also supply a `pre-commit' script that checks if a particular user
has write permissions to the files and directories being modified by a
particular commit.
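To give an idea of what such a check involves, here is a minimal
sketch in Python.  It is not the script we ship: the ACL table, its
format and the function names are assumptions made up for this
example, and a real hook would first have to obtain the list of
changed paths from the transaction.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical ACL: path prefix -> users allowed to write beneath it.
WRITE_ACL: Dict[str, Set[str]] = {
    "trunk/": {"alice", "bob"},
    "trunk/docs/": {"carol"},
    "tags/": {"alice"},
}


def can_write(user: str, path: str,
              acl: Dict[str, Set[str]] = WRITE_ACL) -> bool:
    """Decide using the longest prefix that matches the path."""
    matches = [prefix for prefix in acl if path.startswith(prefix)]
    if not matches:
        return False  # paths with no rule are read-only
    return user in acl[max(matches, key=len)]


def denied_paths(user: str, changed_paths: Iterable[str],
                 acl: Dict[str, Set[str]] = WRITE_ACL) -> List[str]:
    """Return the changed paths the user may not modify."""
    return [p for p in changed_paths if not can_write(user, p, acl)]
```

A hook built on this would reject the commit whenever
`denied_paths()' returns a non-empty list.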
> 
> 6) User Authentication. Can you force the developer to use a public-key-based
> authentication for accessing the repository?
No.  We use the Neon HTTP/HTTPS client library and this does not
support client-side certificates.
> 
> 7) Data Integrity. Can the system ensure the data integrity even in the event
> of a serious system crash or others serious accidents?
Yes.  We supply a hot-backup.py script that performs hot backups of
the Subversion repository.  This script runs just after a commit
completes.  Obviously, the repository administrator will have to make
sure that the hot backups are copied to another disk and/or server in
case the entire server goes belly up.
> 
> 8) Data protection. Can the repository be encrypted to ensure its
> confidentiality? Is the crypto system an integral part of the CMS or is it
> borrowed from the underling OS?
No, Subversion does not encrypt the repository.
> 
> 9) Binary files. Can the system actually manage binary files (binary diff and
> binary patch)? How?
Yes.
We use the Vdelta algorithm to compute differences between files.
Vdelta is a binary block-copying diff and compression algorithm -- it
treats all files as a binary stream, not as a series of text lines.
During commit and update, changes to file contents are sent as deltas
in both directions, client to server and server to client, saving
network traffic.  In terms of sending differences over the network,
the Subversion client and server treat *all* files as binary data.
On the server, file contents for revisions other than HEAD are stored
as a delta against a newer revision, much like in CVS; but Vdelta
makes deltified storage of binary files much more efficient.
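To illustrate the general idea of a copy/insert delta over a byte
stream, here is a small Python sketch.  It is NOT Vdelta and not
Subversion's delta format: it uses Python's `difflib' to find the
matching runs, and the ("copy", offset, length) / ("insert", data)
instruction tuples are invented for this example.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def make_delta(source: bytes, target: bytes) -> List[Tuple]:
    """Describe `target` as copies from `source` plus literal inserts."""
    ops: List[Tuple] = []
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Bytes already present in the source: send only a reference.
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            # Bytes that exist only in the target: send them literally.
            ops.append(("insert", target[j1:j2]))
        # "delete" needs no instruction; those source bytes are just skipped.
    return ops


def apply_delta(source: bytes, delta: List[Tuple]) -> bytes:
    """Rebuild the target from the source and the delta instructions."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            _, offset, length = op
            out += source[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)
```

Nothing here looks at line boundaries, so the same code works on text
and binary data alike.  For deltified storage as described above, the
newer revision would play the role of `source' and the older revision
the role of `target'.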
Subversion tries to auto-detect binary files when you add them to the
repository, and marks them with the `svn:mime-type' property set to
`application/octet-stream'.  This is an improvement over CVS, where you
have to remember to `cvs add -kb'.  Files marked as binary are
deliberately *not* contextually merged when receiving changes from the
server.  Side-by-side versions of files are left for the user to
compare.
> 
> 10) Process Control. Is it possible to enforce the user to perform the
> required operations in a given sequence, in a given way or within a given
> time window?
Subversion has the essential mechanisms in place (properties and
hooks), but building a process-control tool on top of that is far from
trivial.  So yes, it's possible, but Subversion currently does not
provide the user-level tools.
> 
> 11) IDE Integration. Can the CMS be integrated in a IDE? Which one?
Yes.  The Emacs vc-mode currently supports Subversion, and people are
working on integrating Subversion into Eclipse.
> 
> 12) GUIs. Does the system have any GUI for simplifing the administrative
> operations or the day-to-day use?
People are working on Subversion GUI clients for the users of
Subversion, but there are no plans yet to build a GUI administrative
interface.  Frankly, there's been no need for one so far and it's
pretty much hands off.
> 
> 13) Web Interface. Can the repository be accessed via web? Read-only or
> read-write?
Yes.  The Subversion repository running through Apache is viewable by
default on the web.  Normal browsers can browse the HEAD revision of
the tree.  The repository also looks like a DAV server, so DAV enabled
clients can read and write to the repository.
> 
> 14) Maturity Level. What maturity level has reached the system? (planning,
> alpha, beta, release, in use, mature...).
We are right now between alpha and beta.
> 
> 15) Remote Access. Can the developer use the CMS through internet?
Yes.
> 
> 16) Server Platform. Which platform is required for the server? Linux/Unix?
> Windows?
The server runs on Windows, all Unix platforms and anywhere Apache
will compile, which includes OS/2 and BeOS.  The only operating system
not currently supported is HP-UX which does not allow mmap'ing the
same file more than once, which is required by Berkeley DB.
> 
> 17) Client Platform. For which platforms are available the clients?
> Linux/Unix only? Windows as well? Mac OS?
All of the above.
> 
> 18) Repository Nature. What is used to hold the data? The regular File System
> of the host machine? A versioning file system? A special file system? A
> database?
Yes, we have a versioning filesystem, but currently it is a *virtual*
filesystem API built on top of a Berkeley DB 4.0.14 database.  Currently
it is not a mountable filesystem, but someday it may be :)
There has been talk of using MySQL or PostgreSQL instead of Berkeley
DB, but there's been no serious work on this.
> 
> 19) Licence. Can the system be used for free even in a commercial project?
Yes.  It is a modified form of the Apache license.  See
http://svn.collab.net/repos/svn/trunk/COPYING
> 
> 20 ) User Tracking / Auditing / Logging. Is it always possible to tell who,
> when and how has changed a file (and a CMS parameter)?
Yes.  This is extensively tracked.  Even if the user does not enter a
commit message (which can be prevented by using a supplied `pre-commit'
script), we know exactly which files were modified by running `svn log
-v'.
> 
> 21) Concurrency. How the system manages the concurrent access to the same
> file? Lock-Modify-Unlock? Checkout-Modify-Merge? Can the policy be chosen by
> the administrator?
The only method we support is Checkout-Modify-Merge.
> 
> 22) Version Recovery. Is it always possible to recover an old version of any
> individual object (and of a whole configuration)?
Yes.  You can bring your local working copy back to any revision of
the repository with a simple `svn update -r REVISION_NUMBER', even if
the object no longer has a visible version in HEAD.
> 
> 23) Version Tracking. Is it always possible to know (to browse) the whole
> history of an individual object (and of a whole configuration)?
Yes, running `svn log FILE_OR_DIR' shows the entire history of the
file or directory.
> 
> 24) Dependency Control. Is it always possible to know which components are
> required to build up a specific configuration of the program?
No.  Subversion does not support dependency control.
> 
> 25) Configuration Management. Is it possible to manage a group of objects as
> a single system? Can a specific configuration be marked (labelled) for a
> better identification? Is the labelling required?
Yes.
The first level of labelling is that the Subversion repository has
global revision numbers that behave in some sense like a lightweight
ubiquitous tag.  You can tell Joe, "Hey, in revision 2449, look at
xyz.c" and Joe can easily get the exact source tree you're looking at.
On top of this, any directory in the Subversion repository can be
copied to another arbitrary name to create a `tag', e.g. `svn cp
trunk tags/version-1.2' would create a directory named `version-1.2'
containing an exact copy of `trunk' at that point in time.  The copied
directory `version-1.2' is a `cheap' copy in the sense that it
contains links to the versions of the files and directories in the
source `trunk' directory.  Updates to the source directory `trunk'
are not reflected in the copy `version-1.2', and those files in the
copy can be modified independently of the files in the original
directory.
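The `cheap' copy idea can be modelled in a few lines of Python.  This
is only a conceptual sketch, not how the Subversion filesystem is
implemented: directories are plain dicts, files are bytes, and the
helper names are made up for the example.  A copy just binds a new
name to the existing node, and a later change copies only the
directories along the modified path.

```python
def get_node(root: dict, path: str):
    """Walk `path` (slash-separated) down from `root`."""
    node = root
    for part in path.split("/"):
        node = node[part]
    return node


def set_node(root: dict, path: str, new_node) -> dict:
    """Return a new root with `path` bound to `new_node`.

    Only the directories along `path` are copied; everything else is
    shared with the old root, which is left unmodified.
    """
    head, _, rest = path.partition("/")
    new_root = dict(root)  # shallow copy: sibling entries are shared
    if rest:
        new_root[head] = set_node(root.get(head, {}), rest, new_node)
    else:
        new_root[head] = new_node
    return new_root


def cheap_copy(root: dict, src: str, dst: str) -> dict:
    """Bind `dst` to the very same node as `src`; no contents are duplicated."""
    return set_node(root, dst, get_node(root, src))


r1 = {"trunk": {"xyz.c": b"v1"}, "tags": {}}
r2 = cheap_copy(r1, "trunk", "tags/version-1.2")  # like `svn cp'
r3 = set_node(r2, "trunk/xyz.c", b"v2")           # a later change on trunk
```

After these three steps, `tags/version-1.2/xyz.c' in `r3' still holds
the old contents while `trunk/xyz.c' holds the new ones, and the tag
directory is the same object that `trunk' was in `r1'.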
We should note here that the term "configuration" is a bit more
general than that.  Roughly, it's a collection of specific versions of
specific objects, and can exist apart from directory/file hierarchy.
In general, a configuration controls version visibility rather than
structure, and the behavior can depend on a lot of parameters, not
just the current revision number.  Subversion only implements a subset
of these features -- i.e., a Subversion directory is a configuration,
but not the other way around.
> 
> 26) Notifications. Is it possible to trigger a notification for every
> meaningful event that can affect the repository and its content?
Yes.  We have start-commit, pre-commit and post-commit scripts.
> 
> 27) Notifications 2. How can be delivered a notification to the human user?
> E-mail only? Other ways?
Of the hook scripts Subversion currently supports, one is run after
the commit and is passed the revision number of the commit that was
just performed.  The Subversion source code comes with a `post-commit'
script that examines the differences applied in the commit and mails
them out to any number of email addresses.
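As a rough idea of what such a script does, here is a sketch in Python
that only builds the message.  It is not the script from the source
tree: the function name, the subject format and the example.com sender
address are assumptions, and the caller is expected to have obtained
the author, log message and diff for the revision already.

```python
from email.message import EmailMessage
from typing import Iterable


def build_commit_mail(revision: int, author: str, log_message: str,
                      diff_text: str, recipients: Iterable[str],
                      sender: str = "svn@example.com") -> EmailMessage:
    """Build (but do not send) a notification mail for one commit."""
    stripped = log_message.strip()
    # Header values must be a single line, so use only the first log line.
    summary = stripped.splitlines()[0] if stripped else "(no log message)"
    msg = EmailMessage()
    msg["Subject"] = f"svn commit: rev {revision} - {summary}"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content(
        f"Author: {author}\n"
        f"Revision: {revision}\n\n"
        f"Log:\n{stripped}\n\n"
        f"{diff_text}"
    )
    return msg
```

The resulting message could then be handed to a local mail server, for
instance with `smtplib.SMTP("localhost").send_message(msg)', assuming
one is listening there.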
> 
> 28) Notifications 3. How can be delivered a notification to a program?
> E-mail? RPC? Others?
Anything that can be run from a script.
> 
> 29) Branching-Merging. Is it possible to branch in any moment, work for a
> while on the branch and merge back without losing any data and without
> crashing the system? (resorting to the human judgement for solving the
> existing conflicts is perfectly acceptable). Is it possible to merge a branch
> with another branch, instead of the original trunk, supposing that the two
> branch have a common ancestor?
Is it possible to branch in any moment? Yes.
Work for a while on the branch and merge back? Yes.
Without crashing the system? Yes.
However, right now Subversion doesn't keep track of merge history.
This is a post-1.0 feature.
With the `svn merge' command you can merge any differences between any
two revisions to any other arbitrary location in the repository.  This
does not require a common ancestor.
> 
> 30) Merging 2. How good is the merging mechanism? Like CVS? Better than CVS?
> Worse?
When a conflict occurs, Subversion does the following:
1) Conflict markers are placed into the file, to visibly demonstrate
   the overlapping areas.  This matches CVS' behavior.
2) Three fulltext files starting with `tmp' are created; these files
   are the original three files that could not be merged together.
   This is better than CVS, because it allows users to directly
   examine all three files, and even use 3rd-party merge tools (as an
   alternative to conflict markers).
3) Another improvement over CVS conflict handling: Subversion will not
   allow you to "accidentally" commit conflict markers, as so often
   happens in CVS.  Unlike CVS, Subversion remembers that a file
   remains in conflict, and requires definite action from the user to
   undo this state before it will allow the item to be committed
   again.
Once we start tracking merge history and learn how to extract partial
change sets, it'll be substantially better.  This is post-1.0.
Received on Tue Jul 30 20:42:24 2002
Maybe in reply to: Alessandro Bottoni: "Newbie questions..."