
Re: [PATCH] UTF-8 fourth round...

From: Marcus Comstedt <marcus_at_mc.pp.se>
Date: 2002-06-29 02:39:15 CEST

Karl Fogel <kfogel@newton.ch.collab.net> writes:

> Um, the log message is what helps us *decide* what parts of the patch
> to apply. We can always edit the log message appropriately if we
> don't apply certain parts of the change... But not if there's nothing
> to edit!
>
> The text "conversion of text strings between internal representation
> (UTF-8) and external representation (locale specific)" is a summary of
> the change, but it isn't a log message. See the HACKING file, and run
> "svn log" for examples of full log messages.
>
> The log message is hugely important for anyone to understand the
> change. It can save the reviewer acres of time. Please (pretty
> please) post yours and/or attach it to issue #494...

Ok, message for patch "svn_utf_2366_libs.patch" (note that some files
occur in more than one item):

Provide UTF-8 conversion infrastructure, and enable use of UTF-8 as
internal representation of pathnames, properties, usernames and error
messages by making conversions where such values are interfaced with
APR/libc.

* configure.in: Added option --enable-utf8 to enable conversions
  to/from UTF-8. When conversions are disabled, only ASCII is allowed
  in strings.
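
To illustrate the "only ASCII is allowed" rule for builds without
--enable-utf8, here is a minimal sketch of the kind of check that implies.
The helper name is_ascii_only is made up for this example and is not taken
from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from the patch: returns 1 if every byte of
   the NUL-terminated string is 7-bit ASCII, 0 otherwise. */
int
is_ascii_only(const char *s)
{
  assert(s != NULL);
  for (; *s != '\0'; s++)
    if ((unsigned char) *s > 0x7f)
      return 0;
  return 1;
}
```

A string that passes this check is byte-for-byte identical in UTF-8 and in
any ASCII-compatible native encoding, so no conversion is needed for it.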

* subversion/libsvn_subr/utf.c: New file; conversion functions for
converting text strings between UTF-8 and system native character
encoding.
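
As a rough picture of what a native-to-UTF-8 conversion does, the sketch
below assumes the native encoding is ISO-8859-1, which is an assumption
made only to keep the example self-contained. The function name
latin1_to_utf8 is hypothetical; the real functions in utf.c handle whatever
encoding the locale specifies and are not shown in this message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a native-to-UTF-8 conversion, assuming the
   native encoding is ISO-8859-1.  Returns a malloc'ed NUL-terminated
   string that the caller must free, or NULL on allocation failure. */
char *
latin1_to_utf8(const char *src)
{
  size_t len;
  char *dst, *out;

  assert(src != NULL);
  len = strlen(src);

  /* Each Latin-1 byte expands to at most two UTF-8 bytes. */
  dst = malloc(2 * len + 1);
  if (dst == NULL)
    return NULL;

  out = dst;
  for (; *src != '\0'; src++)
    {
      unsigned char c = (unsigned char) *src;
      if (c < 0x80)
        *out++ = (char) c;                      /* ASCII: unchanged */
      else
        {
          *out++ = (char) (0xc0 | (c >> 6));    /* lead byte */
          *out++ = (char) (0x80 | (c & 0x3f));  /* continuation byte */
        }
    }
  *out = '\0';
  return dst;
}
```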

* subversion/include/svn_utf.h: New file with prototypes for public
functions in subversion/libsvn_subr/utf.c.

* subversion/libsvn_subr/libsvn_subr.dsp: Add new file utf.c to list
of source files.

* subversion/libsvn_subr/io.c
  (svn_io_check_path, svn_io_open_unique_file, svn_io_copy_file,
   svn_io_append_file, svn_io_copy_dir_recursively,
   svn_io_make_dir_recursively, svn_io_file_affected_time,
   svn_io_file_checksum, svn_io_set_file_read_only,
   svn_io_set_file_read_write, svn_io_set_file_executable,
   svn_string_from_file, svn_io_remove_file, svn_io_remove_dir,
   svn_io_get_dirents, svn_io_run_cmd, svn_io_detect_mimetype):
Convert pathnames from UTF-8 to native character set before calling
apr_stat/apr_file_open/apr_file_copy/apr_file_append/apr_dir_make/
apr_dir_make_recursively/apr_file_attrs_set/apr_file_remove/
apr_dir_open/apr_dir_remove/apr_procattr_dir_set/apr_proc_create.
  (svn_string_from_aprfile): Convert pathname returned by
apr_file_name_get to UTF-8 before using it with svn_error_createf.
  (svn_io_remove_dir, svn_io_get_dirents): Convert pathnames returned
by apr_dir_read to UTF-8.
  (svn_io_run_diff, svn_io_run_diff3): Convert path to diff/diff3
(generated by autoconf) to UTF-8 so that it can be used with
svn_io_run_cmd.
  (svn_io_file_open, svn_io_stat, svn_io_file_rename, svn_io_dir_make,
   svn_io_dir_open, svn_io_dir_remove_nonrecursive, svn_io_dir_read):
New wrappers for APR functions dealing with pathnames that did not
already have a svn_io_*-wrapper.
  (svn_io_file_printf): New function that should be used instead of
apr_file_printf when writing UTF-8 text to a stream with native
character encoding (such as stdout/stderr).
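
The idea behind svn_io_file_printf (format in UTF-8, convert, then write)
can be sketched with plain stdio. This is not the patch's code: it uses
FILE * rather than APR files, assumes the native encoding is ISO-8859-1,
and the names native_fprintf and utf8_to_latin1_inplace are invented for
the example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert UTF-8 to ISO-8859-1 in place.  Code points
   above U+00FF, and malformed sequences, are replaced with '?'. */
void
utf8_to_latin1_inplace(char *s)
{
  unsigned char *in = (unsigned char *) s;
  unsigned char *out = (unsigned char *) s;

  while (*in != '\0')
    {
      if (*in < 0x80)
        *out++ = *in++;
      else if ((*in == 0xc2 || *in == 0xc3) && (in[1] & 0xc0) == 0x80)
        {
          /* Two-byte sequence encoding U+0080..U+00FF. */
          *out++ = (unsigned char) (((*in & 0x03) << 6) | (in[1] & 0x3f));
          in += 2;
        }
      else
        {
          /* Not representable: emit '?' and skip the whole sequence. */
          *out++ = '?';
          in++;
          while ((*in & 0xc0) == 0x80)
            in++;
        }
    }
  *out = '\0';
}

/* Hypothetical analogue of svn_io_file_printf: format UTF-8 text, convert
   it to the native encoding, then write it to STREAM.  Returns 0 on
   success and -1 on failure. */
int
native_fprintf(FILE *stream, const char *format, ...)
{
  va_list ap;
  int len, rc;
  char *buf;

  assert(stream != NULL && format != NULL);

  /* First pass: measure the formatted length. */
  va_start(ap, format);
  len = vsnprintf(NULL, 0, format, ap);
  va_end(ap);
  if (len < 0)
    return -1;

  buf = malloc((size_t) len + 1);
  if (buf == NULL)
    return -1;

  /* Second pass: format for real, then convert and write. */
  va_start(ap, format);
  vsnprintf(buf, (size_t) len + 1, format, ap);
  va_end(ap);

  utf8_to_latin1_inplace(buf);
  rc = (fputs(buf, stream) == EOF) ? -1 : 0;
  free(buf);
  return rc;
}
```

Converting after formatting, rather than converting each argument first,
means the format string and all %s arguments only need to agree on one
encoding (UTF-8) inside the library.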

* subversion/include/svn_io.h: Added prototypes for new functions in
io.c.

* subversion/libsvn_client/externals.c,
  subversion/libsvn_client/checkout.c,
  subversion/libsvn_client/copy.c,
  subversion/libsvn_client/repos_diff.c,
  subversion/libsvn_client/export.c,
  subversion/libsvn_client/update.c, subversion/libsvn_client/add.c,
  subversion/libsvn_client/commit.c, subversion/libsvn_repos/repos.c,
  subversion/libsvn_subr/target.c, subversion/libsvn_wc/merge.c,
  subversion/libsvn_wc/props.c, subversion/libsvn_wc/diff.c,
  subversion/libsvn_wc/copy.c, subversion/libsvn_wc/util.c,
  subversion/libsvn_wc/adm_crawler.c, subversion/libsvn_wc/log.c,
  subversion/libsvn_wc/adm_ops.c, subversion/libsvn_wc/adm_files.c,
  subversion/libsvn_wc/update_editor.c,
  subversion/libsvn_wc/questions.c, subversion/libsvn_wc/translate.c:
Replaced APR calls dealing with pathnames with calls to wrappers.

* subversion/libsvn_client/repos_diff.c
  (temp_file_cleanup_s, temp_file_plain_cleanup_handler,
   temp_file_cleanup_register): Pre-de-UTF pathname of tempfile for
use with apr_file_remove in pool cleanup handler.

* subversion/libsvn_fs/fs.c
  (svn_fs_create_berkeley, svn_fs_open_berkeley,
   svn_fs_berkeley_recover, svn_fs_delete_berkeley): Convert repository
path from UTF-8 to native path before passing it to env->open or
env->remove.

* subversion/libsvn_subr/target.c
  (svn_path_get_absolute): Convert pathname from UTF-8 to native
character set before calling apr_filepath_merge, then convert result
back to UTF-8.
  (svn_path_get_absolute, svn_path_condense_targets,
   svn_path_remove_redundancies): Changed prototype of
svn_path_get_absolute to use const char ** for return parameter.

* subversion/include/svn_path.h: Changed prototype for
svn_path_get_absolute to take const char **.

* subversion/libsvn_client/client.h,
  subversion/libsvn_client/commit_util.c: Changed prototype for
svn_client__condense_commit_items to take const char **, due to
prototype change of svn_path_get_absolute.

* subversion/libsvn_subr/svn_error.c
  (svn_handle_error, svn_handle_warning): Convert text of error
message from UTF-8 to native charset before displaying it.

* subversion/libsvn_client/diff.c
  (display_prop_diffs, diff_file_changed): Use svn_io_file_printf to
print headers and property diffs, so that path names and properties
are converted from internal UTF-8 representation into something
readable.

* subversion/libsvn_wc/props.c
  (append_prop_conflict): Convert property conflict description text
from UTF-8 to native character set before writing it to file.

* subversion/libsvn_client/auth.c: Convert username returned by
apr_get_username to UTF-8.

Message for patch "svn_utf_2366_client.patch":

Adapted the cmdline client, svnadmin and svnlook to the notion that
textual information exchanged with the svn libraries should be UTF-8
encoded.

* subversion/clients/cmdline/props.c
  (svn_cl__print_prop_hash, svn_cl__print_prop_names): Convert
properties from UTF-8 to native character encoding before printing
them to stdout.

* subversion/clients/cmdline/cl.h
  (svn_cl__opt_state_t): Added comments to mark which text strings
will always contain UTF-8 encoded characters, even in the client
layer.
  (svn_cl__args_to_target_array): Added comment about result being
UTF-8 encoded.
  (svn_cl__print_prop_hash, svn_cl__print_prop_names,
   svn_cl__get_trace_update_editor, svn_cl__get_trace_commit_editor):
Changed prototype to return svn_error_t *.

* subversion/clients/cmdline/propdel-cmd.c
  (svn_cl__propdel): Convert property name and pathname from UTF-8 to
native character encoding before printing them to stdout.

* subversion/clients/cmdline/util.c
  (array_push_str_utf8): New function used by svn_cl__parse_num_args
and svn_cl__parse_all_args.
  (svn_cl__parse_num_args, svn_cl__parse_all_args): Convert args to
UTF-8 as they are parsed, using array_push_str_utf8.
  (parse_path): Mark the "path" argument as being UTF-8 encoded.
Convert the revision suffix before passing it to
svn_cl__parse_revision, which works on the native character encoding.
  (svn_cl__args_to_target_array): Convert target names to UTF-8.
  (svn_cl__edit_externally): Convert UTF-8 text to native character
encoding before passing it to the editor, then convert the result
back. Also, convert the UTF-8 pathname generated by
svn_io_open_unique_file before using it.
  (log_msg_baton, svn_cl__make_log_msg_baton): Mark "base_dir" as
being UTF-8 encoded.
  (svn_cl__get_log_message): Convert log message to UTF-8 before
returning it.

* subversion/clients/cmdline/prompt.c
  (svn_cl__prompt_user): Convert prompt from UTF-8 before printing
it. Convert user input to UTF-8 before returning it.

* subversion/clients/cmdline/propget-cmd.c
  (svn_cl__propget): Convert properties from UTF-8 to native character
encoding before printing them to stdout.

* subversion/clients/cmdline/log-cmd.c
  (log_message_receiver): Convert log message text and changed
pathnames from UTF-8 to native character encoding before printing them
to stdout.

* subversion/clients/cmdline/status.c
  (svn_cl__print_status_list): Convert the paths in the status array
from UTF-8 to native character encoding.

* subversion/clients/cmdline/help-cmd.c
  (print_version_info): Convert RA modules description text from UTF-8
to native character encoding before printing it.
  (svn_cl__help): Convert back arguments that have previously been
UTF-8 converted by svn_cl__args_to_target_array.

* subversion/clients/cmdline/propset-cmd.c
  (svn_cl__propset): Convert property value to UTF-8 if taken from a
file (command line arguments are already converted); convert property
name and pathname to native character encoding for status printout.

* subversion/clients/cmdline/proplist-cmd.c
  (svn_cl__proplist): Convert properties from UTF-8 to native
character encoding before printing them to stdout.

* subversion/clients/cmdline/main.c
  (main): Enable LC_CTYPE locale settings. Convert command line
arguments to UTF-8 before using them with svn_error_createf. Convert
--xml-file, -d, --username, --password, and -x parameters to UTF-8
before storing them in opt_state. Convert argument of -F to UTF-8
before using it with svn_string_from_file and svn_wc_entry. Convert
both pathname and contents of --targets to UTF-8.
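
"Enable LC_CTYPE locale settings" presumably comes down to a setlocale
call early in main, since a C program otherwise starts in the "C" locale
and has no way of knowing the user's character encoding. A minimal sketch
using only standard C follows; the wrapper name init_ctype_locale is
hypothetical.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: adopt the user's LC_CTYPE setting from the
   environment and return the name of the locale now in effect.  If the
   environment names an unavailable locale, the current one is kept. */
const char *
init_ctype_locale(void)
{
  const char *name = setlocale(LC_CTYPE, "");
  if (name == NULL)
    name = setlocale(LC_CTYPE, NULL);  /* query only; locale unchanged */
  assert(name != NULL);
  return name;
}
```

Only LC_CTYPE is touched here, so number formatting, collation and the
other locale categories keep their "C" defaults.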

* subversion/clients/cmdline/feedback.c
  (notify): Convert pathname from UTF-8 to native character encoding
before printing it.

* subversion/clients/cmdline/propedit-cmd.c
  (svn_cl__propedit): Convert property name and pathname from UTF-8 to
native character encoding before printing them to stdout.

* subversion/clients/cmdline/info-cmd.c
  (svn_cl__info): Convert all textual information about the target
from UTF-8 to native character encoding before printing it.

* subversion/svnadmin/svnadmin.h
  (shctx_t): Added a comment about the cwd parameter always being
UTF-8 encoded.

* subversion/svnadmin/main.c
  (print_tree): Convert entry names from UTF-8 to native character set
before printing them.
  (main): Enable LC_CTYPE locale settings. Convert repository path,
lscr path, setlog path, deltify path, and any txn names to UTF-8.
Convert txn and revision descriptions to native character encoding
before printing them.

* subversion/svnadmin/shell.c
  (path_stat, compute_new_path, print_dirent): Added comments about
pathname arguments being UTF-8 encoded.
  (print_dirent): Convert entry names from UTF-8 to native character
encoding before printing them.
  (cd, ls): Convert user supplied pathname to UTF-8.
  (display_prompt): Convert name of cwd from UTF-8 to native character
encoding before printing it.
  (svnadmin_run_shell): Handle possible errors from display_prompt.

* subversion/svnlook/main.c
  (svnlook_ctxt_t): Added comment about the txn_name parameter always
being UTF-8 encoded.
  (get_property, print_dirs_changed_tree, print_changed_tree,
   open_writable_binary_file, dump_contents, print_diff_tree,
   print_ids_tree): Added comments about some arguments being UTF-8
encoded.
  (print_dirs_changed_tree, print_changed_tree, print_ids_tree,
   print_tree): Changed prototype to return svn_error_t *. Convert
pathnames to native character encoding before printing them.
  (print_diff_tree): Convert pathnames from UTF-8 before printing
them. Use wrappers for deleting temporary files, since they have
UTF-8 encoded pathnames.
  (open_writable_binary_file): Use wrappers to open file since
pathname is UTF-8 encoded.
  (do_log): Convert log message text from UTF-8 to native character
encoding before printing it.
  (do_author): Convert author name from UTF-8 to native character
encoding before printing it.
  (do_dirs_changed, do_changed, do_tree): Handle possible errors from
functions which have changed their prototype.
  (main): Enable LC_CTYPE locale settings. Convert repository path
and txn name to UTF-8.

*Phew* Sorry if the quality of the comments dropped a little towards
the end, but this thing took 3 hours to write, and it's now like 3 AM
and I'm pretty tired...

  // Marcus

Received on Sat Jun 29 02:44:51 2002

This is an archived mail posted to the Subversion Dev mailing list.