[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

[PATCH] Issue #2748: non-UTF-8 filenames in the repository

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Wed, 17 Sep 2008 15:42:21 +0300 (Jerusalem Daylight Time)

Patch for issue #2748 ("clients can create non UTF-8 filenames in the
repository"). It prevents non-UTF-8 paths from entering the repository,
but it doesn't affect existing files in existing repositories, svnsync,
or 'svnadmin load'.

I'm running 'make check' now, and I'll commit later today or tomorrow if
I don't hear anything.

Daniel

[[[
Validate in libsvn_repos that paths added to the repository are valid UTF-8.
This fixes issue #2748.

TODO: update callers of svn_path_check_valid

* subversion/libsvn_repos/commit.c
  (add_directory, add_file):
    Validate incoming paths for UTF-8 using svn_path_check_valid2().

* subversion/include/svn_path.h
  (svn_path_check_valid2): New.
  (svn_path_check_valid): Deprecate.

* subversion/libsvn_subr/path.c
  (utf_impl.h): Include.
  (svn_path_check_valid2): New.
  (svn_path_check_valid): Deprecate.

* subversion/tests/libsvn_repos/repos-test.c
  (prop_validation_commit_with_revprop): Renamed to...
  (commit_path_with_revprops): ... this.
    Also, add KIND parameter, and make PROP_KEY optional.
  (prop_validation):
    Track rename and extension of prop_validation_commit_with_revprop().
  (paths_are_UTF_8): New test.
  (test_funcs): Run paths_are_UTF_8 as PASS.
]]]

Index: subversion/libsvn_subr/path.c
===================================================================
--- subversion/libsvn_subr/path.c (revision 33124)
+++ subversion/libsvn_subr/path.c (working copy)
@@ -31,6 +31,7 @@
 #include "svn_io.h" /* for svn_io_stat() */
 #include "svn_ctype.h"
 
+#include "utf_impl.h" /* for svn_utf__cstring_is_valid() */
 
 /* The canonical empty path. Can this be changed? Well, change the empty
    test below and the path library will work, not so sure about the fs/wc
@@ -1461,10 +1462,20 @@ illegal_path_escape(const char *path, apr_pool_t *
 }
 
 svn_error_t *
-svn_path_check_valid(const char *path, apr_pool_t *pool)
+svn_path_check_valid2(const char *path, svn_boolean_t assume_utf8,
+ apr_pool_t *pool)
 {
   const char *c;
 
+ if (assume_utf8 == FALSE
+ && svn_utf__cstring_is_valid(path) == FALSE)
+ {
+ return svn_error_createf
+ (SVN_ERR_FS_PATH_SYNTAX, NULL,
+ _("Path '%s' is not valid UTF-8"),
+ illegal_path_escape(svn_path_local_style(path, pool), pool));
+ }
+
   for (c = path; *c; c++)
     {
       if (svn_ctype_iscntrl(*c))
@@ -1480,6 +1491,12 @@ svn_error_t *
   return SVN_NO_ERROR;
 }
 
+svn_error_t *
+svn_path_check_valid(const char *path, apr_pool_t *pool)
+{
+ return svn_path_check_valid2(path, TRUE, pool);
+}
+
 void
 svn_path_splitext(const char **path_root,
                   const char **path_ext,
Index: subversion/tests/libsvn_repos/repos-test.c
===================================================================
--- subversion/tests/libsvn_repos/repos-test.c (revision 33124)
+++ subversion/tests/libsvn_repos/repos-test.c (working copy)
@@ -2189,52 +2189,56 @@ reporter_depth_exclude(const char **msg,
 
 
 
-/* Test if prop values received by the server are validated.
- * These tests "send" property values to the server and diagnose the
- * behaviour.
- */
+/* Test validation of properties and path names by the server. */
 
 /* Helper function that makes an arbitrary change to a given repository
  * REPOS and runs a commit with a specific revision property set to a
  * certain value. The property name, type and value are given in PROP_KEY,
  * PROP_KLEN and PROP_VAL, as in apr_hash_set(), using a const char* key.
  *
- * The FILENAME argument names a file in the test repository to add in
- * this commit, e.g. "/A/should_fail_1".
+ * If PROP_KEY is NULL, no revision property is set.
  *
+ * The PATH argument names an object in the test repository to add in
+ * this commit, e.g. "/A/should_fail_1". The object will be a file or
+ * a directory according to KIND being svn_node_file or svn_node_dir.
+ *
  * On success, the given file is added to the repository. So, using
  * the same name multiple times on the same repository might fail. Thus,
  * use different FILENAME arguments for every call to this function
  * (e.g. "/A/f1", "/A/f2", "/A/f3" etc).
  */
 static svn_error_t *
-prop_validation_commit_with_revprop(const char *filename,
- const char *prop_key,
- apr_ssize_t prop_klen,
- const svn_string_t *prop_val,
- svn_repos_t *repos,
- apr_pool_t *pool)
+commit_path_with_revprops(const char *path,
+ svn_node_kind_t kind,
+ const char *prop_key,
+ apr_ssize_t prop_klen,
+ const svn_string_t *prop_val,
+ svn_repos_t *repos,
+ apr_pool_t *pool)
 {
   const svn_delta_editor_t *editor;
   void *edit_baton;
   void *root_baton;
- void *file_baton;
 
   /* Prepare revision properties */
   apr_hash_t *revprop_table = apr_hash_make(pool);
 
+ /* Validate input. */
+ SVN_ERR_ASSERT(kind == svn_node_file || kind == svn_node_dir);
+
   /* Add the requested property */
- apr_hash_set(revprop_table, prop_key, prop_klen, prop_val);
+ if (prop_key)
+ apr_hash_set(revprop_table, prop_key, prop_klen, prop_val);
 
   /* Set usual author and log props, if not set already */
- if (strcmp(prop_key, SVN_PROP_REVISION_AUTHOR) != 0)
+ if (prop_key != NULL && strcmp(prop_key, SVN_PROP_REVISION_AUTHOR) != 0)
     {
       apr_hash_set(revprop_table, SVN_PROP_REVISION_AUTHOR,
                    APR_HASH_KEY_STRING,
                    svn_string_create("plato", pool));
     }
   else
- if (strcmp(prop_key, SVN_PROP_REVISION_LOG) != 0)
+ if (prop_key != NULL && strcmp(prop_key, SVN_PROP_REVISION_LOG) != 0)
       {
         apr_hash_set(revprop_table, SVN_PROP_REVISION_LOG,
                      APR_HASH_KEY_STRING,
@@ -2250,12 +2254,27 @@ static svn_error_t *
 
   SVN_ERR(editor->open_root(edit_baton, 0, pool, &root_baton));
 
- SVN_ERR(editor->add_file(filename, root_baton, NULL,
- SVN_INVALID_REVNUM, pool,
- &file_baton));
+ if (kind == svn_node_file)
+ {
+ void *file_baton;
 
- SVN_ERR(editor->close_file(file_baton, NULL, pool));
+ SVN_ERR(editor->add_file(path, root_baton, NULL,
+ SVN_INVALID_REVNUM, pool,
+ &file_baton));
 
+ SVN_ERR(editor->close_file(file_baton, NULL, pool));
+ }
+ else if (kind == svn_node_dir)
+ {
+ void *dir_baton;
+
+ SVN_ERR(editor->add_directory(path, root_baton, NULL,
+ SVN_INVALID_REVNUM, pool,
+ &dir_baton));
+
+ SVN_ERR(editor->close_directory(dir_baton, pool));
+ }
+
   SVN_ERR(editor->close_directory(root_baton, pool));
 
   SVN_ERR(editor->close_edit(edit_baton, pool));
@@ -2291,8 +2310,8 @@ prop_validation(const char **msg,
 
 
   /* Test an invalid commit log message: UTF-8 */
- err = prop_validation_commit_with_revprop
- ("/non_utf8_log_msg",
+ err = commit_path_with_revprops
+ ("/non_utf8_log_msg", svn_node_file,
              SVN_PROP_REVISION_LOG, APR_HASH_KEY_STRING,
              svn_string_create(non_utf8_string, subpool),
              repos, subpool);
@@ -2311,8 +2330,8 @@ prop_validation(const char **msg,
 
 
   /* Test an invalid commit log message: LF */
- err = prop_validation_commit_with_revprop
- ("/non_lf_log_msg",
+ err = commit_path_with_revprops
+ ("/non_lf_log_msg", svn_node_file,
              SVN_PROP_REVISION_LOG, APR_HASH_KEY_STRING,
              svn_string_create(non_lf_string, subpool),
              repos, subpool);
@@ -2337,6 +2356,71 @@ prop_validation(const char **msg,
 }
 
 
+/* Committing non-UTF-8 paths should fail. */
+static svn_error_t *
+paths_are_UTF_8(const char **msg,
+ svn_boolean_t msg_only,
+ svn_test_opts_t *opts,
+ apr_pool_t *pool)
+{
+ svn_error_t *err;
+ svn_repos_t *repos;
+ const char non_utf8_string[] = { 'a', 0xff, 'b', 0 };
+ apr_pool_t *subpool = svn_pool_create(pool);
+
+ *msg = "test if paths are validated for UTF-8";
+
+ if (msg_only)
+ return SVN_NO_ERROR;
+
+ /* Create a filesystem and repository. */
+ SVN_ERR(svn_test__create_repos(&repos, "test-repo-path-validation",
+ opts, subpool));
+
+
+ /* The next two blocks of code are identical (up to s/file/dir/). */
+
+ /* Commit a non-UTF-8 file. */
+ err = commit_path_with_revprops(non_utf8_string, svn_node_file,
+ NULL, 0, NULL,
+ repos, subpool);
+ if (err == SVN_NO_ERROR)
+ {
+ return svn_error_create(SVN_ERR_TEST_FAILED, err,
+ "Failed to reject a non-UTF-8 file");
+ }
+ else if (err->apr_err != SVN_ERR_FS_PATH_SYNTAX)
+ {
+ return svn_error_create(SVN_ERR_TEST_FAILED, err,
+ "Unexpected error code for non-UTF-8 file");
+ }
+ else
+ svn_error_clear(err);
+
+ /* Commit a non-UTF-8 dir. */
+ err = commit_path_with_revprops(non_utf8_string, svn_node_dir,
+ NULL, 0, NULL,
+ repos, subpool);
+ if (err == SVN_NO_ERROR)
+ {
+ return svn_error_create(SVN_ERR_TEST_FAILED, err,
+ "Failed to reject a non-UTF-8 dir");
+ }
+ else if (err->apr_err != SVN_ERR_FS_PATH_SYNTAX)
+ {
+ return svn_error_create(SVN_ERR_TEST_FAILED, err,
+ "Unexpected error code for non-UTF-8 dir");
+ }
+ else
+ svn_error_clear(err);
+
+ /* Done. */
+ svn_pool_destroy(subpool);
+
+ return SVN_NO_ERROR;
+}
+
+
 
 /* The test table. */
 
@@ -2355,5 +2439,6 @@ struct svn_test_descriptor_t test_funcs[] =
     SVN_TEST_PASS(node_location_segments),
     SVN_TEST_PASS(reporter_depth_exclude),
     SVN_TEST_PASS(prop_validation),
+ SVN_TEST_PASS(paths_are_UTF_8),
     SVN_TEST_NULL
   };
Index: subversion/include/svn_path.h
===================================================================
--- subversion/include/svn_path.h (revision 33124)
+++ subversion/include/svn_path.h (working copy)
@@ -437,16 +437,27 @@ svn_path_is_ancestor(const char *path1, const char
  * a repository. There may be other, OS-specific, limitations on
  * what paths can be represented in a working copy.
  *
- * ASSUMPTION: @a path is a valid UTF-8 string. This function does
- * not check UTF-8 validity.
+ * When @a assume_utf8 is @c TRUE, assume that @a path is a valid
+ * UTF-8 string and do not check UTF-8 validity.
  *
  * Return @c SVN_NO_ERROR if valid and @c SVN_ERR_FS_PATH_SYNTAX if
  * invalid.
  *
- * @since New in 1.2.
+ * @since New in 1.6.
  */
-svn_error_t *svn_path_check_valid(const char *path, apr_pool_t *pool);
+svn_error_t *
+svn_path_check_valid2(const char *path, svn_boolean_t assume_utf8,
+ apr_pool_t *pool);
 
+/**
+ * Similar to svn_path_check_valid2(), but with @a assume_utf8 set to @c TRUE.
+ *
+ * @deprecated Provided for backward compatibility with the 1.5 API.
+ */
+SVN_DEPRECATED
+svn_error_t *
+svn_path_check_valid(const char *path, apr_pool_t *pool);
+
 
 /** URI/URL stuff
  *
Index: subversion/libsvn_repos/commit.c
===================================================================
--- subversion/libsvn_repos/commit.c (revision 33124)
+++ subversion/libsvn_repos/commit.c (working copy)
@@ -295,6 +295,9 @@ add_directory(const char *path,
       (SVN_ERR_FS_GENERAL, NULL,
        _("Got source path but no source revision for '%s'"), full_path);
 
+ /* Validate for UTF-8. */
+ SVN_ERR(svn_path_check_valid2(path, FALSE, pool));
+
   if (copy_path)
     {
       const char *fs_path;
@@ -440,6 +443,9 @@ add_file(const char *path,
       (SVN_ERR_FS_GENERAL, NULL,
        _("Got source path but no source revision for '%s'"), full_path);
 
+ /* Validate for UTF-8. */
+ SVN_ERR(svn_path_check_valid2(path, FALSE, pool));
+
   if (copy_path)
     {
       const char *fs_path;

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-09-17 14:42:44 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.