Re: [PATCH] Discuss character set restrictions in book

From: Charles Bailey <bailey.charles_at_gmail.com>
Date: 2005-02-27 00:25:30 CET

On Wed, 23 Feb 2005 16:32:24 -0500, Charles Bailey
<bailey.charles@gmail.com> wrote:
> Attached is a patch to Book Chapter 3 which expands one of the
> sidebars to discuss how Subversion handles text and path names, and
> therefore what is(n't) allowed in these types of input. I've waffled
> a fair amount over where to put this -- on the one hand, I thought
> it'd be good to have it in a common enough place that new users (like
> me) would see it before finding problems the hard way; on the other,
> it can get to be a fairly picky and/or fragmented topic. I tried to
> find a middle ground. The location early in the book's "detailed"
> description of Subversion seemed reasonable, rather than placing it in
> introductory section, where the detail is less fine, or the developer
> reference, where users might not think to look. I didn't think any of
> the issues merited a full section on their own, so I tried to combine
> (conflate?) them as smoothly as I could into a sidebar. The revised
> version does bury the note about "/trunk" a bit, but it gets detailed
> discussion in chapter 4, and the original sidebar was really just a
> pointer to that discussion.

I apologize for replying to my own patch, but since Peter Lundblad pointed
out that my original posting had unintentionally gone out with
attachments, I'm appending the original patch below my sig. If
proposed book patches now belong in a different forum, please let me
know.
Charles Bailey
Lists: bailey _dot_ charles _at_ gmail _dot_ com
Other: bailey _at_ newman _dot_ upenn _dot_ edu
Explain character set restrictions for text and path names.
* docs/book/book/ch03.xml:
  Expand sidebar to discuss character encoding, and hence restrictions
  on legal characters, for text and path names.
Index: book/ch03.xml
--- book/ch03.xml	(revision 13123)
+++ book/ch03.xml	(working copy)
@@ -312,13 +312,55 @@
-      <title>Repository Layout</title>
+      <title>What's in a name?</title>
+      <para>Subversion tries hard not to limit the type of data you
+      can place under version control.  The contents of files and
+      property values are stored and transmitted as binary data, and
+      the <xref linkend="svn-ch-7-sect-2.3.2"/> tells you how
+      to give Subversion a hint that <quote>textual</quote> operations
+      don't make sense for a particular file.  There are a few places,
+      however, where Subversion places restrictions on information it
+      stores.</para>
-      <para>If you're wondering what <literal>trunk</literal> is all
-        about in the above URL, it's part of the way we recommend
-        you lay out your Subversion repository which we'll talk a lot
-        more about in <xref linkend="svn-ch-4"/>.</para>
+      <para>Subversion handles text internally as UTF-8 encoded
+      Unicode.  As a result, certain items which are
+      inherently <quote>textual</quote>, such as property names, path
+      names, and log messages, can only contain legal UTF-8
+      characters.  This also sets a minimum requirement for use of the
+      <literal>svn:mime-type</literal> property: if a file's contents
+      aren't compatible with UTF-8, you should mark it as a binary
+      file.  Otherwise, Subversion will attempt to merge differences
+      using UTF-8, which is likely to leave garbage in the
+      file.</para>
+      <para>In addition, path names are used as XML attribute values
+      in WebDAV exchanges, as well as in some of Subversion's
+      housekeeping files.  This means that path names can only contain
+      legal XML (1.0) characters.  Subversion also prohibits
+      TAB, CR, and LF in path names, so they aren't broken up
+      in diffs, or in the output of commands like
+      <xref linkend="svn-ch-9-sect-1.2-re-log"/> or
+      <xref linkend="svn-ch-9-sect-1.2-re-status"/>.</para>
+      <para>While it may seem like a lot to remember, in practice
+      these limitations are rarely a problem.  As long as your
+      locale settings are compatible with UTF-8, and you don't use
+      control characters in path names, you should have no trouble
+      communicating with Subversion.  The command line client adds an
+      extra bit of help: it will automatically escape illegal
+      characters as needed in URLs you type to create <quote>legally
+      correct</quote> versions for internal use.</para>
+      <para>Experienced users of Subversion have also developed a set
+      of <quote>best practice</quote> conventions for laying out paths
+      in the repository.  While these aren't strict requirements like
+      the syntax described above, they help to organize frequently
+      performed tasks.  The <literal>/trunk</literal> part of the URL
+      above is one of these conventions; we'll talk a lot more about
+      it and related recommendations in <xref
+      linkend="svn-ch-4"/>.</para>
     <para>Although the above example checks out the trunk directory,
## End of patch ##
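
For anyone who wants to try the rules in the sidebar against their own
file names, here is a rough Python sketch of the three path-name
restrictions the patch describes: valid UTF-8, legal XML 1.0
characters, and no TAB, CR, or LF. It is not taken from Subversion's
source, and the function names (is_xml10_char, path_problems) are made
up for illustration. The character ranges come from the Char production
in the XML 1.0 specification. Subversion's real checks may differ in
detail.

```python
from typing import List


def is_xml10_char(ch: str) -> bool:
    """True if ch matches the Char production of XML 1.0."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def path_problems(raw: bytes) -> List[str]:
    """Hypothetical helper: list the reasons a path name would break
    the rules in the sidebar (an empty list means no problem found)."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return ["not valid UTF-8"]

    problems = []
    for ch in text:
        if ch in "\t\r\n":
            # Legal in XML 1.0, but the sidebar says Subversion
            # prohibits these in path names.
            problems.append("TAB/CR/LF not allowed: U+%04X" % ord(ch))
        elif not is_xml10_char(ch):
            problems.append(
                "not a legal XML 1.0 character: U+%04X" % ord(ch)
            )
    return problems


if __name__ == "__main__":
    for name in (b"trunk/README.txt", b"bad\tname", b"bad\x01name"):
        print(name, path_problems(name))
```

Note that TAB, CR, and LF pass the XML check on their own, which is
why the sidebar lists them as a separate restriction.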
Received on Sun Feb 27 00:26:50 2005

