[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [RFC] Checking paths are legal

From: Barry Scott <barry_at_barrys-emacs.org>
Date: 2004-09-03 21:31:28 CEST

I think you can get the list of DOS device names from QueryDosDevice by
passing NULL as the lpDeviecName.
Then you will get all devices that a windows platform treats as special.
I'd thought that you needed to look in the registry, but I found that
not all the keys in
HKLM:\SYSTEM\CurrentControlSet\Control\Session Manager\DOS Devices
but only some of them prevent you creating a file. For exmaple MAILSLOT
works as a filename.

Barry

On Sep 2, 2004, at 00:08, Jon Foster wrote:

> Hi,
>
> Currently, if a Windows user checks out a directory called "aux",
> they get an incomprehensible Subversion error. This is because
> "aux", "nul", etc and names containing ":" cannot be used as file or
> directory names on Windows. I'd like to make the error messages
> explicit about the problem.
>
> Also, Subversion does not support files where the filename contains
> control characters (i.e. ASCII or [equivalently] Unicode values less
> than 32). Currently adding one of these files silently corrupts the
> working copy, and all further Subversion operations fail. I'd like
> to put in a sane error here, and not break the working copy. This
> isn't a problem on Windows, which doesn't support such files anyway,
> but can be a problem on Linux.
>
> Related issues:
> Issue 1277 (Windows doesn't support files containing ':')
> Issue 1932 (Control characters are not and never will be supported)
> Issue 1954 (Control characters should be rejected gracefully)
> Issue 1830 (Newlines should be rejected gracefully)
>
>
> Proposal:
>
> Since there are two different reasons why a path may be invalid,
> I propose adding two new functions to check paths and two new error
> codes to propogate this information.
>
> We could re-use SVN_ERR_BAD_FILENAME for one or both of the errors.
> However, I would prefer to use a new error code for at least the
> "not supported on your OS" error so this case can be unambiguously
> identified. The way SVN_ERR_BAD_FILENAME is currently used makes
> me think it isn't well suited to be used as the "not supported by
> Subversion" error, either, but that isn't a strong opinion. I'm
> open to suggestions for good names for these errors - I'm
> tentatively calling them SVN_ERR_WC_INVALID_PATH_ARCH and
> SVN_ERR_WC_INVALID_PATH.
>
> I'm not sure whether it is better for the check functions to throw
> the errors themselves or to just return a boolean?
>
> Then we need to add the checks everywhere. In general, commands
> that work on repositories directly need to check only that the
> filenames are legal in Subversion (i.e. no control characters),
> but commands that access files on disk also need to check that
> the filenames are valid on disk. More specifically:
> - The working-copy library needs checks adding to ensure it only
> handles paths that are legal both on the filesystem and in
> Subversion. This should catch the update and checkout of an
> unsupported filename, which are probably the most common cases,
> and also catch "svn add" of a filename containing control
> characters.
> - The ra client code needs to ensure that paths sent by the server
> are legal both on the filesystem and in Subversion. It can
> probably delegate this to the working-copy library on checkout or
> update, but it probably needs to handle this on export[*].
> - It would be nice if the commandline client ensured that URLs are
> valid[*] (e.g. when doing "svn mv URL1 URL2"), but this is not
> essential so long as the ra libraries check this.
> - The ra server code needs to ensure that paths sent by the client
> are legal in Subversion[*], but it doesn't matter whether or not
> they are valid on the server's OS as they're not going to be put
> into normal files. (Note that the DAV protocol doesn't allow
> invalid paths to be specified, so the DAV server may already do
> this implicitly. The svnserve server probably needs to check
> explicitly). It may be possible to delegate this to the fs
> library.
> - "svnadmin load" should check paths in the dumpfile are valid in
> Subversion[*]. It may be possible to delegate this to the fs
> library.
> - The fs library may need to do some checks if higher levels
> don't. [*]
> - Probably some more I haven't thought of yet.[*]
>
> We also need lots of testcases for this.[*]
>
> Attached is a FIRST DRAFT of a patch. It doesn't yet do the things
> marked with [*] above, and it has had minimal testing.
>
>
> I also have a release scheduling question. When (if ever) should we
> plan on putting this change into the official release? I can see
> five options here:
> 0) Reject this idea totally.
> 1) Schedule this for 1.2.0
> 2) Schedule a cut-down version of this for 1.1.x, x>0, that doesn't
> include the new public API. This means the is-valid-filename
> tests would have to be duplicated as a library-private function
> in each libary, and we would have to re-use an existing error
> code. Yuck!
> 3) Delay 1.1.0 until this patch is finished. I do not think this
> is a good idea - it will take me a couple of weeks to get this
> finished, tested, etc, and it is quite a big change. (I would
> like to write this patch myself - it seems about the right level
> of difficulty for a Subversion newbie and will help me learn
> about all the various parts of Subversion. However, if this
> had to go into 1.1.0 then someone else with more free time would
> need to take over).
> 4) Put the new public API in 1.1.0, but do not call it. That makes
> it low-risk in 1.1.0, but we do need to get the API design issues
> sorted out quickly. Schedule the rest of this change for 1.1.1
> or 1.1.2.
> My preferred solution is option (4). This change will be very useful
> for Windows users, so I'd rather they didn't have to wait until
> Subversion 1.2.
>
> Comments on any of the above are welcome. Sorry this mail was so
> long!
>
> Thanks & regards,
>
> Jon Foster
> --
>
>
> Log message for the patch (just because people have said it helps
> them to review draft patches):
>
> Check paths are valid
>
> * subversion/include/svn_error_codes.h
> (SVN_ERR_WC_INVALID_PATH, SVN_ERR_WC_INVALID_PATH_ARCH): New errors
>
> * subversion/include/svn_path.h
> * subversion/libsvn_subr/path.c
> (svn_path_is_valid_in_svn, svn_path_is_valid_on_disk): New
> functions
>
> * src-trunk/subversion/libsvn_wc/copy.c
> (svn_wc_copy): Check path valid both on disk and in svn
>
> * subversion/libsvn_wc/adm_ops.c
> (svn_wc_add): Check path valid both on disk and in svn
>
> * subversion/libsvn_wc/update_editor.c
> (add_directory, add_or_open_file): Check path valid both on disk
> and in svn
>
> * subversion/libsvn_ra_local/ra_plugin.c:
> (svn_ra_local_do_check_path): Check path valid in svn
>
> * subversion/libsvn_ra_svn/client.c:
> (ra_svn_check_path): Check path valid in svn
>
> * subversion/libsvn_ra_dav/props.c:
> (svn_ra_dav_do_check_path): Check path valid in svn
>
>
> Index: src-trunk/subversion/include/svn_error_codes.h
> ===================================================================
> --- src-trunk/subversion/include/svn_error_codes.h (revision 10797)
> +++ src-trunk/subversion/include/svn_error_codes.h (working copy)
> @@ -361,6 +361,14 @@
> SVN_ERR_WC_CATEGORY_START + 22,
> "Path syntax not supported in this context")
>
> + SVN_ERRDEF (SVN_ERR_WC_INVALID_PATH,
> + SVN_ERR_WC_CATEGORY_START + 23,
> + "Path name is not allowed by Subversion")
> +
> + SVN_ERRDEF (SVN_ERR_WC_INVALID_PATH_ARCH,
> + SVN_ERR_WC_CATEGORY_START + 24,
> + "Path name is not allowed by this OS")
> +
> /* fs errors */
>
> SVN_ERRDEF (SVN_ERR_FS_GENERAL,
> Index: src-trunk/subversion/include/svn_path.h
> ===================================================================
> --- src-trunk/subversion/include/svn_path.h (revision 10797)
> +++ src-trunk/subversion/include/svn_path.h (working copy)
> @@ -327,10 +327,55 @@
> */
> svn_boolean_t svn_path_is_single_path_component (const char *name);
>
> +/**
> + * @since New in 1.1.
> + *
> + * Test to see if a path is allowed in a Subversion repository.
> + * Paths containing ASCII control characters (1-31) are forbidden.
> + * Currently, all other paths are allowed.
> + *
> + * This must be OS-independent - it is used to decide what paths
> + * are allowed to be added to a Subversion repository, so the
> + * file either already exists on disk (in the case of svn add)
> + * or may never exist on disk (in the case of svn mv with two
> + * URLs).
> + *
> + * Note that this does NOT guarantee that the path can be checked
> + * out on this particular operating system - for that, use
> + * svn_path_is_valid_on_disk().
> + *
> + * May be called with a path component or a complete path. The
> + * path must not be URL-encoded.
> + *
> + * If path is invalid, return @c FALSE.
> + * If path is valid, return @c TRUE.
> + */
> +svn_boolean_t svn_path_is_valid_in_svn (const char *path);
>
> /**
> * @since New in 1.1.
> *
> + * Test to see if a path is allowed in a Subversion working copy.
> + * This is operating-system dependent - for example Windows
> + * prohibits files and directories called "aux" but Linux allows
> + * them.
> + *
> + * This is a subset of svn_path_is_valid_in_svn - i.e. for any path
> + * that svn_path_is_valid_in_svn returns @c FALSE, this returns
> + * @c FALSE.
> + *
> + * May be called with a path component or a complete path. The
> + * path must not be URL-encoded and (on Windows) must not contain
> + * a drive letter.
> + *
> + * If path is invalid, return @c FALSE.
> + * If path is valid, return @c TRUE.
> + */
> +svn_boolean_t svn_path_is_valid_on_disk (const char *path);
> +
> +/**
> + * @since New in 1.1.
> + *
> * Test to see if a backpath, i.e. '..', is present in @a path.
> * If not, return @c FALSE.
> * If so, return @c TRUE.
> Index: src-trunk/subversion/libsvn_subr/path.c
> ===================================================================
> --- src-trunk/subversion/libsvn_subr/path.c (revision 10797)
> +++ src-trunk/subversion/libsvn_subr/path.c (working copy)
> @@ -678,8 +678,156 @@
> return TRUE;
> }
>
> +svn_boolean_t
> +svn_path_is_valid_in_svn (const char *path)
> +{
> + const char *p;
> + for (p = path; *p; p++)
> + {
> + if ((unsigned char)*p < (unsigned char)32u)
> + return FALSE;
> + }
> + return TRUE;
> +}
>
> +#ifdef WIN32
> +/* Check to see if a path segment matches the Windows reserved names.
> +
> + These rules are taken from
> +
> http://msdn.microsoft.com/library/en-us/fileio/base/naming_a_file.asp
> +
> + The following reserved device names cannot be used as the name
> + of a file:
> + CON, PRN, AUX, NUL,
> + COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9,
> + LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9.
> + Also avoid these names followed by an extension (for example,
> + NUL.tx7).
> +
> + Windows NT: CLOCK$ is also a reserved device name.
> +
> + These names are case-insensetive.
> +
> + The path segment passed is defined as:
> +
> + /some/path_to/a_file\0
> + ^ ^^ ^^ ^
> + | || || +--segment_end 3
> + | || |+--------segment_start 3
> + | || +--segment_end 2
> + | |+---------segment_start 2
> + | +--segment_end 1
> + +------segment_start 1
> +
> + I.e. segment_end always points to a '/' or '\0'.
> + */
> +static svn_boolean_t
> +path_segment_is_valid_on_disk (const char *segment_start,
> + const char *segment_end)
> +{
> + /* We can decide whether or not it is valid based only on
> + the first 7 characters. This avoids dynamic memory allocation
> + and is simpler than trying to look at the original string.
> + 7 characters is "CLOCK$z" (valid), with 6 characters we would
> + be unable to distinguish that from "CLOCK$" (invalid).
> + Note that since using these identifiers with an extension is
> + prohibited, we truncate just before any period. We also
> + take this opertunity to lowercase the string so we don't
> + need to do case-insensetive comparisons later.
> + */
> + char buf[8];
> + apr_size_t i;
> + apr_size_t length = segment_end - segment_start;
> + if (length < 3)
> + return TRUE;
> + if (length > 7)
> + length = 7;
> + for (i = 0; i < length; i++)
> + {
> + if (segment_start[i] == '.')
> + {
> + length = i;
> + break;
> + }
> + buf[i] = apr_tolower(segment_start[i]);
> + }
> + buf[length] = '\0';
> +
> + if (length == 3)
> + {
> + if (!strcmp(buf, "aux") || !strcmp(buf, "con")
> + || !strcmp(buf, "nul") || !strcmp(buf, "prn"))
> + return FALSE;
> + }
> + else if (length == 4)
> + {
> + if ((!strncmp(buf, "con", 3) || !strncmp(buf, "lpt", 3))
> + && apr_isdigit(buf[4]))
> + return FALSE;
> + }
> + else if (length == 6)
> + {
> + if (!strcmp(buf, "clock$"))
> + return FALSE;
> + }
> + return TRUE;
> +}
> +
> +/* Copied from apr_c_is_fnchar in apr/file_io/win32/filesys.c and
> + IS_FNCHAR in apr/arch/win32/apr_arch_fileio.c and simplified.
> + There doesn't seem to be a public API for this APR functionality.
> + */
> +static const char svn_c_is_fnchar[256] =
> +{/* Reject all ctrl codes...
> */
> + 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> + /* ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
> */
> + 1,1,0,1,1,1,1,1,1,1,0,1,1,1,1,0, 1,1,1,1,1,1,1,1,1,1,0,1,0,1,0,0,
> + /* @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _
> */
> + 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,
> + /* ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~
> */
> + 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,
> + /* High bit codes are accepted (subject to utf-8->Unicode xlation)
> */
> + 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
> + 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
> + 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
> + 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
> +};
> +#define SVN_IS_FNCHAR(c) (svn_c_is_fnchar[(unsigned char)(c)])
> +
> +#endif /* def WIN32 */
> +
> svn_boolean_t
> +svn_path_is_valid_on_disk (const char *path)
> +{
> +#ifdef WIN32
> + const char *segment_start = path;
> + const char *p = path;
> + for (; *p; p++)
> + {
> + if (*p == '/')
> + {
> + if (!path_segment_is_valid_on_disk(segment_start, p))
> + return FALSE;
> + segment_start = p + 1;
> + }
> + else if (!SVN_IS_FNCHAR(*p))
> + return FALSE;
> + }
> + if (!path_segment_is_valid_on_disk(segment_start, p))
> + return FALSE;
> + return TRUE;
> +#else /* ndef WIN32 */
> + /* TODO FIXME Windows and Linux are not the only operating systems
> + in the world.
> + */
> + /* Allows all paths. For Linux this is right, for other OS this
> + is a best guess and should be fixed.
> + */
> + return svn_path_is_valid_in_svn(path);
> +#endif
> +}
> +
> +svn_boolean_t
> svn_path_is_backpath_present (const char *path)
> {
> int len = strlen (path);
> Index: src-trunk/subversion/libsvn_wc/copy.c
> ===================================================================
> --- src-trunk/subversion/libsvn_wc/copy.c (revision 10797)
> +++ src-trunk/subversion/libsvn_wc/copy.c (working copy)
> @@ -380,6 +380,15 @@
>
> SVN_ERR (svn_io_check_path (src_path, &src_kind, pool));
>
> + if (!svn_path_is_valid_in_svn(dst_basename))
> + return svn_error_createf (SVN_ERR_WC_INVALID_PATH, NULL,
> + _("The path '%s' is not allowed in a
> Subversion"
> + " repository"), dst_basename);
> + else if (!svn_path_is_valid_on_disk(dst_basename))
> + return svn_error_createf (SVN_ERR_WC_INVALID_PATH_ARCH, NULL,
> + _("The path '%s' is not allowed by your
> OS"),
> + dst_basename);
> +
> if (src_kind == svn_node_file)
> SVN_ERR (copy_file_administratively (src_path, adm_access,
> dst_parent, dst_basename,
> Index: src-trunk/subversion/libsvn_wc/adm_ops.c
> ===================================================================
> --- src-trunk/subversion/libsvn_wc/adm_ops.c (revision 10797)
> +++ src-trunk/subversion/libsvn_wc/adm_ops.c (working copy)
> @@ -879,6 +879,15 @@
> apr_uint32_t modify_flags = 0;
> svn_wc_adm_access_t *adm_access;
>
> + if (!svn_path_is_valid_in_svn(path))
> + return svn_error_createf (SVN_ERR_WC_INVALID_PATH, NULL,
> + _("The path '%s' is not allowed in a
> Subversion"
> + " repository"), path);
> + else if (!svn_path_is_valid_on_disk(path))
> + return svn_error_createf (SVN_ERR_WC_INVALID_PATH_ARCH, NULL,
> + _("The path '%s' is not allowed by your
> OS"),
> + path);
> +
> /* Make sure something's there. */
> SVN_ERR (svn_io_check_path (path, &kind, pool));
> if (kind == svn_node_none)
> Index: src-trunk/subversion/libsvn_wc/update_editor.c
> ===================================================================
> --- src-trunk/subversion/libsvn_wc/update_editor.c (revision 10797)
> +++ src-trunk/subversion/libsvn_wc/update_editor.c (working copy)
> @@ -1013,6 +1013,16 @@
> || ((! copyfrom_path) && (SVN_IS_VALID_REVNUM
> (copyfrom_revision))))
> abort();
>
> + /* Check path is allowed */
> + if (!svn_path_is_valid_in_svn(db->path))
> + return svn_error_createf (SVN_ERR_WC_INVALID_PATH, NULL,
> + _("Failed to add directory '%s': Path
> is not"
> + " allowed in a Subversion
> repository"), db->path);
> + else if (!svn_path_is_valid_on_disk(db->path))
> + return svn_error_createf (SVN_ERR_WC_INVALID_PATH_ARCH, NULL,
> + _("Failed to add directory '%s': Path
> is not"
> + " allowed by your OS"), db->path);
> +
> /* There should be nothing with this name. */
> SVN_ERR (svn_io_check_path (db->path, &kind, db->pool));
> if (kind != svn_node_none)
> @@ -1425,10 +1435,26 @@
> const svn_wc_entry_t *entry;
> svn_node_kind_t kind;
> svn_wc_adm_access_t *adm_access;
> + apr_pool_t *subpool;
>
> + if (!svn_path_is_valid_in_svn(path))
> + return svn_error_createf (SVN_ERR_WC_INVALID_PATH, NULL,
> + adding ? _("Failed to add file '%s':
> Path "
> + "is not allowed in a
> Subversion "
> + "repository")
> + : _("Failed to update file
> '%s': Path "
> + "is not allowed in a
> Subversion "
> + "repository"), path);
> + else if (!svn_path_is_valid_on_disk(path))
> + return svn_error_createf (SVN_ERR_WC_INVALID_PATH_ARCH, NULL,
> + adding ? _("Failed to add file '%s':
> Path "
> + "is not allowed by your OS")
> + : _("Failed to update file
> '%s': Path "
> + "is not allowed by your
> OS"), path);
> +
> /* the file_pool can stick around for a *long* time, so we want to
> use
> a subpool for any temporary allocations. */
> - apr_pool_t *subpool = svn_pool_create (pool);
> + subpool = svn_pool_create (pool);
>
> /* ### kff todo: if file is marked as removed by user, then flag a
> conflict in the entry and proceed. Similarly if it has changed
> Index: src-trunk/subversion/libsvn_client/copy.c
> ===================================================================
> --- src-trunk/subversion/libsvn_client/copy.c (revision 10797)
> +++ src-trunk/subversion/libsvn_client/copy.c (working copy)
> @@ -750,6 +750,7 @@
> svn_client_ctx_t *ctx,
> apr_pool_t *pool)
> {
> + /* ### TODO FIXME make sure dst_path is legal... */
> void *ra_baton, *sess;
> svn_ra_plugin_t *ra_lib;
> svn_node_kind_t src_kind, dst_kind;
> Index: src-trunk/subversion/libsvn_ra_local/ra_plugin.c
> ===================================================================
> --- src-trunk/subversion/libsvn_ra_local/ra_plugin.c (revision 10797)
> +++ src-trunk/subversion/libsvn_ra_local/ra_plugin.c (working copy)
> @@ -656,6 +656,11 @@
> svn_fs_root_t *root;
> const char *abs_path = sbaton->fs_path;
>
> + if (path && !svn_path_is_valid_in_svn(path))
> + return svn_error_createf (SVN_ERR_WC_INVALID_PATH, NULL,
> + _("The path '%s' is not allowed in a
> Subversion"
> + " repository"), path);
> +
> /* ### Not sure if this counts as a workaround or not. The
> session baton uses the empty string to mean root, and not
> sure that should change. However, it would be better to use
> Index: src-trunk/subversion/libsvn_ra_svn/client.c
> ===================================================================
> --- src-trunk/subversion/libsvn_ra_svn/client.c (revision 10797)
> +++ src-trunk/subversion/libsvn_ra_svn/client.c (working copy)
> @@ -1089,6 +1089,11 @@
> svn_ra_svn_conn_t *conn = sess->conn;
> const char *kind_word;
>
> + if (path && !svn_path_is_valid_in_svn(path))
> + return svn_error_createf (SVN_ERR_WC_INVALID_PATH, NULL,
> + _("The path '%s' is not allowed in a
> Subversion"
> + " repository"), path);
> +
> SVN_ERR(svn_ra_svn_write_cmd(conn, pool, "check-path", "c(?r)",
> path, rev));
> SVN_ERR(handle_auth_request(sess, pool));
> SVN_ERR(svn_ra_svn_read_cmd_response(conn, pool, "w", &kind_word));
> Index: src-trunk/subversion/libsvn_ra_dav/props.c
> ===================================================================
> --- src-trunk/subversion/libsvn_ra_dav/props.c (revision 10797)
> +++ src-trunk/subversion/libsvn_ra_dav/props.c (working copy)
> @@ -1098,6 +1098,11 @@
> svn_error_t *err;
> svn_boolean_t is_dir;
>
> + if (path && !svn_path_is_valid_in_svn(path))
> + return svn_error_createf (SVN_ERR_WC_INVALID_PATH, NULL,
> + _("The path '%s' is not allowed in a
> Subversion"
> + " repository"), path);
> +
> /* ### For now, using svn_ra_dav__get_baseline_info() works because
> we only have three possibilities: dir, file, or none. When we
> add symlinks, we will need to do something different. Here's one
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: dev-help@subversion.tigris.org

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Fri Sep 3 21:32:43 2004

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.