Hi,
Currently, if a Windows user checks out a directory called "aux",
they get an incomprehensible Subversion error. This is because
"aux", "nul", etc and names containing ":" cannot be used as file or
directory names on Windows. I'd like to make the error messages
explicit about the problem.
Also, Subversion does not support files where the filename contains
control characters (i.e. ASCII or [equivalently] Unicode values less
than 32). Currently adding one of these files silently corrupts the
working copy, and all further Subversion operations fail. I'd like
to put in a sane error here, and not break the working copy. This
isn't a problem on Windows, which doesn't support such files anyway,
but can be a problem on Linux.
Related issues:
Issue 1277 (Windows doesn't support files containing ':')
Issue 1932 (Control characters are not and never will be supported)
Issue 1954 (Control characters should be rejected gracefully)
Issue 1830 (Newlines should be rejected gracefully)
Proposal:
Since there are two different reasons why a path may be invalid,
I propose adding two new functions to check paths and two new error
codes to propogate this information.
We could re-use SVN_ERR_BAD_FILENAME for one or both of the errors.
However, I would prefer to use a new error code for at least the
"not supported on your OS" error so this case can be unambiguously
identified. The way SVN_ERR_BAD_FILENAME is currently used makes
me think it isn't well suited to be used as the "not supported by
Subversion" error, either, but that isn't a strong opinion. I'm
open to suggestions for good names for these errors - I'm
tentatively calling them SVN_ERR_WC_INVALID_PATH_ARCH and
SVN_ERR_WC_INVALID_PATH.
I'm not sure whether it is better for the check functions to throw
the errors themselves or to just return a boolean?
Then we need to add the checks everywhere. In general, commands
that work on repositories directly need to check only that the
filenames are legal in Subversion (i.e. no control characters),
but commands that access files on disk also need to check that
the filenames are valid on disk. More specifically:
- The working-copy library needs checks adding to ensure it only
handles paths that are legal both on the filesystem and in
Subversion. This should catch the update and checkout of an
unsupported filename, which are probably the most common cases,
and also catch "svn add" of a filename containing control
characters.
- The ra client code needs to ensure that paths sent by the server
are legal both on the filesystem and in Subversion. It can
probably delegate this to the working-copy library on checkout or
update, but it probably needs to handle this on export[*].
- It would be nice if the commandline client ensured that URLs are
valid[*] (e.g. when doing "svn mv URL1 URL2"), but this is not
essential so long as the ra libraries check this.
- The ra server code needs to ensure that paths sent by the client
are legal in Subversion[*], but it doesn't matter whether or not
they are valid on the server's OS as they're not going to be put
into normal files. (Note that the DAV protocol doesn't allow
invalid paths to be specified, so the DAV server may already do
this implicitly. The svnserve server probably needs to check
explicitly). It may be possible to delegate this to the fs
library.
- "svnadmin load" should check paths in the dumpfile are valid in
Subversion[*]. It may be possible to delegate this to the fs
library.
- The fs library may need to do some checks if higher levels
don't. [*]
- Probably some more I haven't thought of yet.[*]
We also need lots of testcases for this.[*]
Attached is a FIRST DRAFT of a patch. It doesn't yet do the things
marked with [*] above, and it has had minimal testing.
I also have a release scheduling question. When (if ever) should we
plan on putting this change into the official release? I can see
five options here:
0) Reject this idea totally.
1) Schedule this for 1.2.0
2) Schedule a cut-down version of this for 1.1.x, x>0, that doesn't
include the new public API. This means the is-valid-filename
tests would have to be duplicated as a library-private function
in each libary, and we would have to re-use an existing error
code. Yuck!
3) Delay 1.1.0 until this patch is finished. I do not think this
is a good idea - it will take me a couple of weeks to get this
finished, tested, etc, and it is quite a big change. (I would
like to write this patch myself - it seems about the right level
of difficulty for a Subversion newbie and will help me learn
about all the various parts of Subversion. However, if this
had to go into 1.1.0 then someone else with more free time would
need to take over).
4) Put the new public API in 1.1.0, but do not call it. That makes
it low-risk in 1.1.0, but we do need to get the API design issues
sorted out quickly. Schedule the rest of this change for 1.1.1
or 1.1.2.
My preferred solution is option (4). This change will be very useful
for Windows users, so I'd rather they didn't have to wait until
Subversion 1.2.
Comments on any of the above are welcome. Sorry this mail was so
long!
Thanks & regards,
Jon Foster
--
Log message for the patch (just because people have said it helps
them to review draft patches):
Check paths are valid
* subversion/include/svn_error_codes.h
(SVN_ERR_WC_INVALID_PATH, SVN_ERR_WC_INVALID_PATH_ARCH): New errors
* subversion/include/svn_path.h
* subversion/libsvn_subr/path.c
(svn_path_is_valid_in_svn, svn_path_is_valid_on_disk): New functions
* src-trunk/subversion/libsvn_wc/copy.c
(svn_wc_copy): Check path valid both on disk and in svn
* subversion/libsvn_wc/adm_ops.c
(svn_wc_add): Check path valid both on disk and in svn
* subversion/libsvn_wc/update_editor.c
(add_directory, add_or_open_file): Check path valid both on disk and in svn
* subversion/libsvn_ra_local/ra_plugin.c:
(svn_ra_local_do_check_path): Check path valid in svn
* subversion/libsvn_ra_svn/client.c:
(ra_svn_check_path): Check path valid in svn
* subversion/libsvn_ra_dav/props.c:
(svn_ra_dav_do_check_path): Check path valid in svn
Index: src-trunk/subversion/include/svn_error_codes.h
===================================================================
--- src-trunk/subversion/include/svn_error_codes.h (revision 10797)
+++ src-trunk/subversion/include/svn_error_codes.h (working copy)
@@ -361,6 +361,14 @@
SVN_ERR_WC_CATEGORY_START + 22,
"Path syntax not supported in this context")
+ SVN_ERRDEF (SVN_ERR_WC_INVALID_PATH,
+ SVN_ERR_WC_CATEGORY_START + 23,
+ "Path name is not allowed by Subversion")
+
+ SVN_ERRDEF (SVN_ERR_WC_INVALID_PATH_ARCH,
+ SVN_ERR_WC_CATEGORY_START + 24,
+ "Path name is not allowed by this OS")
+
/* fs errors */
SVN_ERRDEF (SVN_ERR_FS_GENERAL,
Index: src-trunk/subversion/include/svn_path.h
===================================================================
--- src-trunk/subversion/include/svn_path.h (revision 10797)
+++ src-trunk/subversion/include/svn_path.h (working copy)
@@ -327,10 +327,55 @@
*/
svn_boolean_t svn_path_is_single_path_component (const char *name);
+/**
+ * @since New in 1.1.
+ *
+ * Test to see if a path is allowed in a Subversion repository.
+ * Paths containing ASCII control characters (1-31) are forbidden.
+ * Currently, all other paths are allowed.
+ *
+ * This must be OS-independent - it is used to decide what paths
+ * are allowed to be added to a Subversion repository, so the
+ * file either already exists on disk (in the case of svn add)
+ * or may never exist on disk (in the case of svn mv with two
+ * URLs).
+ *
+ * Note that this does NOT guarantee that the path can be checked
+ * out on this particular operating system - for that, use
+ * svn_path_is_valid_on_disk().
+ *
+ * May be called with a path component or a complete path. The
+ * path must not be URL-encoded.
+ *
+ * If path is invalid, return @c FALSE.
+ * If path is valid, return @c TRUE.
+ */
+svn_boolean_t svn_path_is_valid_in_svn (const char *path);
/**
* @since New in 1.1.
*
+ * Test to see if a path is allowed in a Subversion working copy.
+ * This is operating-system dependent - for example Windows
+ * prohibits files and directories called "aux" but Linux allows
+ * them.
+ *
+ * This is a subset of svn_path_is_valid_in_svn - i.e. for any path
+ * that svn_path_is_valid_in_svn returns @c FALSE, this returns
+ * @c FALSE.
+ *
+ * May be called with a path component or a complete path. The
+ * path must not be URL-encoded and (on Windows) must not contain
+ * a drive letter.
+ *
+ * If path is invalid, return @c FALSE.
+ * If path is valid, return @c TRUE.
+ */
+svn_boolean_t svn_path_is_valid_on_disk (const char *path);
+
+/**
+ * @since New in 1.1.
+ *
* Test to see if a backpath, i.e. '..', is present in @a path.
* If not, return @c FALSE.
* If so, return @c TRUE.
Index: src-trunk/subversion/libsvn_subr/path.c
===================================================================
--- src-trunk/subversion/libsvn_subr/path.c (revision 10797)
+++ src-trunk/subversion/libsvn_subr/path.c (working copy)
@@ -678,8 +678,156 @@
return TRUE;
}
+svn_boolean_t
+svn_path_is_valid_in_svn (const char *path)
+{
+ const char *p;
+ for (p = path; *p; p++)
+ {
+ if ((unsigned char)*p < (unsigned char)32u)
+ return FALSE;
+ }
+ return TRUE;
+}
+#ifdef WIN32
+/* Check to see if a path segment matches the Windows reserved names.
+
+ These rules are taken from
+ http://msdn.microsoft.com/library/en-us/fileio/base/naming_a_file.asp
+
+ The following reserved device names cannot be used as the name
+ of a file:
+ CON, PRN, AUX, NUL,
+ COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9,
+ LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9.
+ Also avoid these names followed by an extension (for example,
+ NUL.tx7).
+
+ Windows NT: CLOCK$ is also a reserved device name.
+
+ These names are case-insensetive.
+
+ The path segment passed is defined as:
+
+ /some/path_to/a_file\0
+ ^ ^^ ^^ ^
+ | || || +--segment_end 3
+ | || |+--------segment_start 3
+ | || +--segment_end 2
+ | |+---------segment_start 2
+ | +--segment_end 1
+ +------segment_start 1
+
+ I.e. segment_end always points to a '/' or '\0'.
+ */
+static svn_boolean_t
+path_segment_is_valid_on_disk (const char *segment_start,
+ const char *segment_end)
+{
+ /* We can decide whether or not it is valid based only on
+ the first 7 characters. This avoids dynamic memory allocation
+ and is simpler than trying to look at the original string.
+ 7 characters is "CLOCK$z" (valid), with 6 characters we would
+ be unable to distinguish that from "CLOCK$" (invalid).
+ Note that since using these identifiers with an extension is
+ prohibited, we truncate just before any period. We also
+ take this opertunity to lowercase the string so we don't
+ need to do case-insensetive comparisons later.
+ */
+ char buf[8];
+ apr_size_t i;
+ apr_size_t length = segment_end - segment_start;
+ if (length < 3)
+ return TRUE;
+ if (length > 7)
+ length = 7;
+ for (i = 0; i < length; i++)
+ {
+ if (segment_start[i] == '.')
+ {
+ length = i;
+ break;
+ }
+ buf[i] = apr_tolower(segment_start[i]);
+ }
+ buf[length] = '\0';
+
+ if (length == 3)
+ {
+ if (!strcmp(buf, "aux") || !strcmp(buf, "con")
+ || !strcmp(buf, "nul") || !strcmp(buf, "prn"))
+ return FALSE;
+ }
+ else if (length == 4)
+ {
+ if ((!strncmp(buf, "con", 3) || !strncmp(buf, "lpt", 3))
+ && apr_isdigit(buf[4]))
+ return FALSE;
+ }
+ else if (length == 6)
+ {
+ if (!strcmp(buf, "clock$"))
+ return FALSE;
+ }
+ return TRUE;
+}
+
+/* Copied from apr_c_is_fnchar in apr/file_io/win32/filesys.c and
+ IS_FNCHAR in apr/arch/win32/apr_arch_fileio.c and simplified.
+ There doesn't seem to be a public API for this APR functionality.
+ */
+static const char svn_c_is_fnchar[256] =
+{/* Reject all ctrl codes... */
+ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
+ /* ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? */
+ 1,1,0,1,1,1,1,1,1,1,0,1,1,1,1,0, 1,1,1,1,1,1,1,1,1,1,0,1,0,1,0,0,
+ /* @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ */
+ 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,
+ /* ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ */
+ 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,
+ /* High bit codes are accepted (subject to utf-8->Unicode xlation) */
+ 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
+ 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
+ 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
+ 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
+};
+#define SVN_IS_FNCHAR(c) (svn_c_is_fnchar[(unsigned char)(c)])
+
+#endif /* def WIN32 */
+
svn_boolean_t
+svn_path_is_valid_on_disk (const char *path)
+{
+#ifdef WIN32
+ const char *segment_start = path;
+ const char *p = path;
+ for (; *p; p++)
+ {
+ if (*p == '/')
+ {
+ if (!path_segment_is_valid_on_disk(segment_start, p))
+ return FALSE;
+ segment_start = p + 1;
+ }
+ else if (!SVN_IS_FNCHAR(*p))
+ return FALSE;
+ }
+ if (!path_segment_is_valid_on_disk(segment_start, p))
+ return FALSE;
+ return TRUE;
+#else /* ndef WIN32 */
+ /* TODO FIXME Windows and Linux are not the only operating systems
+ in the world.
+ */
+ /* Allows all paths. For Linux this is right, for other OS this
+ is a best guess and should be fixed.
+ */
+ return svn_path_is_valid_in_svn(path);
+#endif
+}
+
+svn_boolean_t
svn_path_is_backpath_present (const char *path)
{
int len = strlen (path);
Index: src-trunk/subversion/libsvn_wc/copy.c
===================================================================
--- src-trunk/subversion/libsvn_wc/copy.c (revision 10797)
+++ src-trunk/subversion/libsvn_wc/copy.c (working copy)
@@ -380,6 +380,15 @@
SVN_ERR (svn_io_check_path (src_path, &src_kind, pool));
+ if (!svn_path_is_valid_in_svn(dst_basename))
+ return svn_error_createf (SVN_ERR_WC_INVALID_PATH, NULL,
+ _("The path '%s' is not allowed in a Subversion"
+ " repository"), dst_basename);
+ else if (!svn_path_is_valid_on_disk(dst_basename))
+ return svn_error_createf (SVN_ERR_WC_INVALID_PATH_ARCH, NULL,
+ _("The path '%s' is not allowed by your OS"),
+ dst_basename);
+
if (src_kind == svn_node_file)
SVN_ERR (copy_file_administratively (src_path, adm_access,
dst_parent, dst_basename,
Index: src-trunk/subversion/libsvn_wc/adm_ops.c
===================================================================
--- src-trunk/subversion/libsvn_wc/adm_ops.c (revision 10797)
+++ src-trunk/subversion/libsvn_wc/adm_ops.c (working copy)
@@ -879,6 +879,15 @@
apr_uint32_t modify_flags = 0;
svn_wc_adm_access_t *adm_access;
+ if (!svn_path_is_valid_in_svn(path))
+ return svn_error_createf (SVN_ERR_WC_INVALID_PATH, NULL,
+ _("The path '%s' is not allowed in a Subversion"
+ " repository"), path);
+ else if (!svn_path_is_valid_on_disk(path))
+ return svn_error_createf (SVN_ERR_WC_INVALID_PATH_ARCH, NULL,
+ _("The path '%s' is not allowed by your OS"),
+ path);
+
/* Make sure something's there. */
SVN_ERR (svn_io_check_path (path, &kind, pool));
if (kind == svn_node_none)
Index: src-trunk/subversion/libsvn_wc/update_editor.c
===================================================================
--- src-trunk/subversion/libsvn_wc/update_editor.c (revision 10797)
+++ src-trunk/subversion/libsvn_wc/update_editor.c (working copy)
@@ -1013,6 +1013,16 @@
|| ((! copyfrom_path) && (SVN_IS_VALID_REVNUM (copyfrom_revision))))
abort();
+ /* Check path is allowed */
+ if (!svn_path_is_valid_in_svn(db->path))
+ return svn_error_createf (SVN_ERR_WC_INVALID_PATH, NULL,
+ _("Failed to add directory '%s': Path is not"
+ " allowed in a Subversion repository"), db->path);
+ else if (!svn_path_is_valid_on_disk(db->path))
+ return svn_error_createf (SVN_ERR_WC_INVALID_PATH_ARCH, NULL,
+ _("Failed to add directory '%s': Path is not"
+ " allowed by your OS"), db->path);
+
/* There should be nothing with this name. */
SVN_ERR (svn_io_check_path (db->path, &kind, db->pool));
if (kind != svn_node_none)
@@ -1425,10 +1435,26 @@
const svn_wc_entry_t *entry;
svn_node_kind_t kind;
svn_wc_adm_access_t *adm_access;
+ apr_pool_t *subpool;
+ if (!svn_path_is_valid_in_svn(path))
+ return svn_error_createf (SVN_ERR_WC_INVALID_PATH, NULL,
+ adding ? _("Failed to add file '%s': Path "
+ "is not allowed in a Subversion "
+ "repository")
+ : _("Failed to update file '%s': Path "
+ "is not allowed in a Subversion "
+ "repository"), path);
+ else if (!svn_path_is_valid_on_disk(path))
+ return svn_error_createf (SVN_ERR_WC_INVALID_PATH_ARCH, NULL,
+ adding ? _("Failed to add file '%s': Path "
+ "is not allowed by your OS")
+ : _("Failed to update file '%s': Path "
+ "is not allowed by your OS"), path);
+
/* the file_pool can stick around for a *long* time, so we want to use
a subpool for any temporary allocations. */
- apr_pool_t *subpool = svn_pool_create (pool);
+ subpool = svn_pool_create (pool);
/* ### kff todo: if file is marked as removed by user, then flag a
conflict in the entry and proceed. Similarly if it has changed
Index: src-trunk/subversion/libsvn_client/copy.c
===================================================================
--- src-trunk/subversion/libsvn_client/copy.c (revision 10797)
+++ src-trunk/subversion/libsvn_client/copy.c (working copy)
@@ -750,6 +750,7 @@
svn_client_ctx_t *ctx,
apr_pool_t *pool)
{
+ /* ### TODO FIXME make sure dst_path is legal... */
void *ra_baton, *sess;
svn_ra_plugin_t *ra_lib;
svn_node_kind_t src_kind, dst_kind;
Index: src-trunk/subversion/libsvn_ra_local/ra_plugin.c
===================================================================
--- src-trunk/subversion/libsvn_ra_local/ra_plugin.c (revision 10797)
+++ src-trunk/subversion/libsvn_ra_local/ra_plugin.c (working copy)
@@ -656,6 +656,11 @@
svn_fs_root_t *root;
const char *abs_path = sbaton->fs_path;
+ if (path && !svn_path_is_valid_in_svn(path))
+ return svn_error_createf (SVN_ERR_WC_INVALID_PATH, NULL,
+ _("The path '%s' is not allowed in a Subversion"
+ " repository"), path);
+
/* ### Not sure if this counts as a workaround or not. The
session baton uses the empty string to mean root, and not
sure that should change. However, it would be better to use
Index: src-trunk/subversion/libsvn_ra_svn/client.c
===================================================================
--- src-trunk/subversion/libsvn_ra_svn/client.c (revision 10797)
+++ src-trunk/subversion/libsvn_ra_svn/client.c (working copy)
@@ -1089,6 +1089,11 @@
svn_ra_svn_conn_t *conn = sess->conn;
const char *kind_word;
+ if (path && !svn_path_is_valid_in_svn(path))
+ return svn_error_createf (SVN_ERR_WC_INVALID_PATH, NULL,
+ _("The path '%s' is not allowed in a Subversion"
+ " repository"), path);
+
SVN_ERR(svn_ra_svn_write_cmd(conn, pool, "check-path", "c(?r)", path, rev));
SVN_ERR(handle_auth_request(sess, pool));
SVN_ERR(svn_ra_svn_read_cmd_response(conn, pool, "w", &kind_word));
Index: src-trunk/subversion/libsvn_ra_dav/props.c
===================================================================
--- src-trunk/subversion/libsvn_ra_dav/props.c (revision 10797)
+++ src-trunk/subversion/libsvn_ra_dav/props.c (working copy)
@@ -1098,6 +1098,11 @@
svn_error_t *err;
svn_boolean_t is_dir;
+ if (path && !svn_path_is_valid_in_svn(path))
+ return svn_error_createf (SVN_ERR_WC_INVALID_PATH, NULL,
+ _("The path '%s' is not allowed in a Subversion"
+ " repository"), path);
+
/* ### For now, using svn_ra_dav__get_baseline_info() works because
we only have three possibilities: dir, file, or none. When we
add symlinks, we will need to do something different. Here's one
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Thu Sep 2 01:09:02 2004