Re: [PATCH] Discuss character set restrictions in book
From: Charles Bailey <bailey.charles_at_gmail.com>
Date: 2005-02-27 00:25:30 CET
On Wed, 23 Feb 2005 16:32:24 -0500, Charles Bailey
I apologize for replying to my own patch, but since Peter Lundblad pointed
-- 
Regards,
Charles Bailey
Lists: bailey _dot_ charles _at_ gmail _dot_ com
Other: bailey _at_ newman _dot_ upenn _dot_ edu

Explain character set restrictions for text and path names.

* docs/book/book/ch03.xml: Expand sidebar to discuss character encoding,
  and hence restrictions on legal characters, for text and path names.

Index: book/ch03.xml
===================================================================
--- book/ch03.xml (revision 13123)
+++ book/ch03.xml (working copy)
@@ -312,13 +312,55 @@
 </screen>
 
     <sidebar>
-      <title>Repository Layout</title>
+      <title>What's in a name?</title>
+
+      <para>Subversion tries hard not to limit the type of data you
+        can place under version control.  The contents of files and
+        property values are stored and transmitted as binary data, and
+        the <xref linkend="svn-ch-7-sect-2.3.2"/> tells you how
+        to give Subversion a hint that <quote>textual</quote> operations
+        don't make sense for a particular file.  There are a few places,
+        however, where Subversion places restrictions on information it
+        stores.</para>
 
-      <para>If you're wondering what <literal>trunk</literal> is all
-        about in the above URL, it's part of the way we recommend
-        you lay out your Subversion repository which we'll talk a lot
-        more about in <xref linkend="svn-ch-4"/>.</para>
+      <para>Subversion handles text internally as UTF-8 encoded
+        Unicode.  As a result, certain items which are
+        inherently <quote>textual</quote>, such as property names, path
+        names, and log messages, can only contain legal UTF-8
+        characters.  It also provides a minimum requirement for use of the
+        <literal>svn:mime-type</literal> property: if a file's contents
+        aren't compatible with UTF-8, you should mark it as a binary
+        file.  Otherwise, Subversion will attempt to merge differences
+        using UTF-8, which is likely to leave garbage in the
+        file.</para>
 
+      <para>In addition, path names are used as XML attribute values
+        in WebDAV exchanges, as well as in some of Subversion's
+        housekeeping files.  This means that path names can only contain
+        legal XML (1.0) characters.  Subversion also prohibits
+        TAB, CR, and LF in path names, so they aren't broken up
+        in diffs, or in the output of commands like
+        <xref linkend="svn-ch-9-sect-1.2-re-log"/> or
+        <xref linkend="svn-ch-9-sect-1.2-re-status"/>.</para>
+
+      <para>While it may seem like a lot to remember, in practice
+        these limitations are rarely a problem.  As long as your
+        locale settings are compatible with UTF-8, and you don't use
+        control characters in path names, you should have no trouble
+        communicating with Subversion.  The command line client adds an
+        extra bit of help: it will automatically escape illegal
+        characters as needed in URLs you type to create <quote>legally
+        correct</quote> versions for internal use.</para>
+
+      <para>Experienced users of Subversion have also developed a set
+        of <quote>best practice</quote> conventions for laying out paths
+        in the repository.  While these aren't strict requirements like
+        the syntax described above, they help to organize frequently
+        performed tasks.  The <literal>/trunk</literal> part of the URL
+        above is one of these conventions; we'll talk a lot more about
+        it and related recommendations in <xref
+        linkend="svn-ch-4"/>.</para>
+
     </sidebar>
 
     <para>Although the above example checks out the trunk directory,
## End of patch ##
---------------------------------------------------------------------
To unsubscribe, e-mail: email@example.com
For additional commands, e-mail: firstname.lastname@example.org
Received on Sun Feb 27 00:26:50 2005
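
For readers who want to see the three path-name rules from the proposed sidebar in one place (legal UTF-8, legal XML 1.0 characters, and no TAB, CR, or LF), here is a minimal sketch in Python. It is not part of the patch and is not Subversion's own validation code; the helper name check_path_name and its message strings are hypothetical, chosen only for illustration. The character ranges are taken from the "Char" production of the XML 1.0 specification.

```python
import re

# Complement of the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML10_ILLEGAL = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

# Characters the sidebar says are additionally prohibited in path names.
_PATH_FORBIDDEN = frozenset("\t\r\n")


def check_path_name(raw: bytes) -> list[str]:
    """Return a list of problems found in a path name given as raw bytes.

    Hypothetical helper for illustration only; not a Subversion API.
    """
    try:
        text = raw.decode("utf-8")  # strict decoding rejects invalid UTF-8
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8 at byte offset {exc.start}"]

    problems = []
    # TAB, CR, and LF are legal XML characters but still prohibited in paths.
    for ch in sorted(set(text) & _PATH_FORBIDDEN):
        problems.append(f"contains prohibited character {ch!r}")
    # Report the first character outside the XML 1.0 legal ranges, if any.
    match = _XML10_ILLEGAL.search(text)
    if match:
        problems.append(
            f"contains non-XML-1.0 character U+{ord(match.group()):04X}"
        )
    return problems
```

An empty list means the name passes all three checks. Note that the sketch checks a name that is already in UTF-8; converting from the user's locale encoding, which the sidebar also mentions, is a separate step not shown here.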
This is an archived mail posted to the Subversion Dev mailing list.