Re: check-mime-type, Windows client, non-ASCII path

From: Eliop <igtorque.eliop_at_googlemail.com>
Date: Thu, 2 Feb 2012 20:12:52 +0100

Hello, Stefan.

On 2 February 2012 10:33, Stefan Sperling <stsp_at_elego.de> wrote:
> On Wed, Feb 01, 2012 at 09:00:39AM +0100, Ignacio González (Eliop) wrote:
> > Clients: Windows-XP, Windows 7, svn 1.6.16 (Spanish)
> > Server: Linux (CentOS), svn 1.6.16 (Spanish)
> >
> > Repository created OK
> > Hundreds of revisions already checked-in OK
> > Hook "check-mime-type" (bash) added in server
> > A couple of revisions checked-in OK
> > New file added with non-ASCII characters -> Problem:
> > Path name (in Windows, client): C:\Usuarios\arenero\Inútil.TXT
> > (note the u with an acute accent: ú)
> >
> > C:\Usuarios\arenero>svn ci acentos -m "Prueba 1"
> > Adding         acentos
> > Adding         acentos\In£til.TXT
> > Transmitting file data .svn: Commit failed (details follow):
> > svn: Commit blocked by pre-commit hook (exit code 1) with output:
> > /opt/csvn/data/repositories/telecontrol/hooks/check-mime-type:
> > `/opt/csvn/bin/svnlook proplist /opt/csvn/data/repositories/arenero -t 44-1e --verbose acentos/In?\195?\186til.TXT' failed with this output:
> > svnlook: Path 'acentos/In?\195?\186til.TXT' does not exist
>
> 195 186 in hex is 0xc3ba
>
> $ echo 0xc3ba | xxd -r | ExplicateUTF8
> The sequence 0xC3     0xBA
>             11000011 10111010
> is a valid UTF-8 character encoding equivalent to UTF32 0x000000FA.
>
> (ExplicateUTF8 is part of the 'unitools' suite).
>
> Written out as UTF-8 in email, unicode code point 0xfa is the character 'ú'.

Right.
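
For anyone following along: svnlook appears to print each non-ASCII byte it cannot represent as `?\` followed by the byte's decimal value, so the two numbers in the error message can be turned back into hex with plain POSIX printf. A minimal sketch (the function name `dec_to_hex` is made up for illustration):

```sh
#!/bin/sh
# Hypothetical helper: print decimal byte values as one hex string.
dec_to_hex() {
  for byte in "$@"; do
    printf '%02x' "$byte"
  done
  printf '\n'
}

# The two escaped bytes from the svnlook error message.
dec_to_hex 195 186
```

This prints c3ba, which is the UTF-8 encoding of U+00FA (ú), matching the ExplicateUTF8 output quoted above.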

> > To help diagnose it, I tried to check out an already existing file with
> > accents in its name
> > (checked in before the Hook "check-mime-type" (bash) was added in the
> > server).
> > Check out fails.
>
> And how exactly does it fail? What's the error message?
> Does it print the same error message as you get with the hook?
>
> Whenever you write a problem report and you describe parts of the
> problem by "X fails" without showing how X fails, recipients of your
> report can only make wild guesses.

Agreed, I forgot to give details for this part.

And I should really have been more careful! What I was trying to do was
check out the file directly, instead of its parent directory. So:

svn co http://localhost/svn/arenero/pru/%fasame.TXT

fails telling me that blah,blah was a file, not a directory, but

svn co http://localhost/svn/arenero/pru/

succeeds.

Stupid, stupid, stupid.

> > Oh, my God.
>
> Don't panic. This is nothing that cannot be fixed.
> You'll just have to figure out where it goes wrong.
>
> You didn't specify what type of server you are running (svnserve or
> mod_dav_svn), so I'm going to guess that you're using mod_dav_svn,
> i.e. an Apache HTTPD server is serving your repositories.

I'm using httpd with mod_dav_svn; specifically, CollabNet Subversion Edge.

> In that case, issue #2487 might be the problem:
> http://subversion.tigris.org/issues/show_bug.cgi?id=2487
> Though this would not explain a failing checkout, only problems
> in the hook script. Does your hook script set any of the LANG, LC_CTYPE
> or LC_ALL environment variables to some value? (If possible, please just
> show us the entire hook script.)

Locale in this Linux server is:

[csvn_at_svn tmp]$ locale
LANG=es_ES.UTF-8
LC_CTYPE="es_ES.UTF-8"
LC_NUMERIC="es_ES.UTF-8"
LC_TIME="es_ES.UTF-8"
LC_COLLATE="es_ES.UTF-8"
LC_MONETARY="es_ES.UTF-8"
LC_MESSAGES="es_ES.UTF-8"
LC_PAPER="es_ES.UTF-8"
LC_NAME="es_ES.UTF-8"
LC_ADDRESS="es_ES.UTF-8"
LC_TELEPHONE="es_ES.UTF-8"
LC_MEASUREMENT="es_ES.UTF-8"
LC_IDENTIFICATION="es_ES.UTF-8"
LC_ALL=
[csvn_at_svn tmp]$

Here's the hook script (note that I have to comment out the line with the
check-mime-type invocation in order to check in new 'accented' files):

[csvn_at_svn tmp]$ cat /opt/csvn/data/repositories/arenero/hooks/pre-commit
#!/bin/sh

# pre-commit
# PRE-COMMIT HOOK

REPOS="$1"
TXN="$2"

# Make sure that the log message contains some text.
SVNLOOK=/opt/csvn/bin/svnlook
$SVNLOOK log -t "$TXN" "$REPOS" | grep "[a-zA-Z0-9]" > /dev/null
if [ $? -ne 0 ]
then
  echo "*** Debe introducir un texto para ***" > /dev/stderr
  echo "*** describir los cambios realizados ***" > /dev/stderr
  exit 1
fi

# Check that every added file has the svn:mime-type property set
# and every added file with a mime-type matching text/* also has
# svn:eol-style set
#/opt/csvn/data/repositories/telecontrol/hooks/check-mime-type "$REPOS" "$TXN" || exit 1

# All checks passed, so allow the commit.
exit 0
[csvn_at_svn tmp]$
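
One thing worth noting about the listing above: the pre-commit script does not set LANG, LC_CTYPE or LC_ALL, and the `locale` output earlier comes from an interactive login shell, not from the hook. Subversion runs hook scripts with an essentially empty environment, so svnlook inside the hook probably runs in the C locale and escapes non-ASCII path bytes (the `?\195?\186` in the error). A minimal sketch of what could be tried near the top of pre-commit, assuming es_ES.UTF-8 (taken from the locale output above) is also usable by the account the server runs as. This is a guess to test, not a confirmed fix:

```sh
#!/bin/sh
# Assumption: the es_ES.UTF-8 locale shown by `locale` above is installed
# and usable by the account that runs the hook.
LANG=es_ES.UTF-8
LC_ALL=es_ES.UTF-8
export LANG LC_ALL

# Child processes started after this point (svnlook, the Perl
# check-mime-type script) inherit the UTF-8 locale.
```

If this changes the error, the escaped path that check-mime-type passes back to svnlook proplist was most likely the culprit.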

And /opt/csvn/data/repositories/telecontrol/hooks/check-mime-type is:

[csvn_at_svn tmp]$ cat /opt/csvn/data/repositories/telecontrol/hooks/check-mime-type
#!/usr/bin/env perl

# ====================================================================
# commit-mime-type-check.pl: check that every added file has the
# svn:mime-type property set and every added file with a mime-type
# matching text/* also has svn:eol-style set. If any file fails this
# test the user is sent a verbose error message suggesting solutions and
# the commit is aborted.
#
# Usage: commit-mime-type-check.pl REPOS TXN-NAME
# ====================================================================
# Most of commit-mime-type-check.pl was taken from
# commit-access-control.pl, Revision 9986, 2004-06-14 16:29:22 -0400.
# ====================================================================
# Copyright (c) 2000-2004 CollabNet. All rights reserved.
#
# This software is licensed as described in the file COPYING, which
# you should have received as part of this distribution. The terms
# are also available at http://subversion.tigris.org/license.html.
# If newer versions of this license are posted there, you may use a
# newer version instead, at your option.
#
# This software consists of voluntary contributions made by many
# individuals. For exact contribution history, see the revision
# history and logs, available at http://subversion.tigris.org/.
# ====================================================================

# Turn on warnings the best way depending on the Perl version.
BEGIN {
  if ( $] >= 5.006_000)
    { require warnings; import warnings; }
  else
    { $^W = 1; }
}

use strict;
use Carp;

######################################################################
# Configuration section.

# Svnlook path.
my $svnlook = "/opt/csvn/bin/svnlook";

# Since the path to svnlook depends upon the local installation
# preferences, check that the required program exists to insure that
# the administrator has set up the script properly.
{
  my $ok = 1;
  foreach my $program ($svnlook)
    {
      if (-e $program)
        {
          unless (-x $program)
            {
              warn "$0: required program `$program' is not executable, ",
                   "edit $0.\n";
              $ok = 0;
            }
        }
      else
        {
          warn "$0: required program `$program' does not exist, edit $0.\n";
          $ok = 0;
        }
    }
  exit 1 unless $ok;
}

######################################################################
# Initial setup/command-line handling.

&usage unless @ARGV == 2;

my $repos = shift;
my $txn = shift;

unless (-e $repos)
  {
    &usage("$0: repository directory `$repos' does not exist.");
  }
unless (-d $repos)
  {
    &usage("$0: repository directory `$repos' is not a directory.");
  }

# Define two constant subroutines to stand for read-only or read-write
# access to the repository.
sub ACCESS_READ_ONLY () { 'read-only' }
sub ACCESS_READ_WRITE () { 'read-write' }

######################################################################
# Harvest data using svnlook.

# Change into /tmp so that svnlook diff can create its .svnlook
# directory.
my $tmp_dir = '/tmp';
chdir($tmp_dir)
  or die "$0: cannot chdir `$tmp_dir': $!\n";

# Figure out what files have been added using svnlook.
my @files_added;
foreach my $line (&read_from_process($svnlook, 'changed', $repos, '-t', $txn))
  {
    # Add only files that were added to @files_added
    if ($line =~ /^A. (.*[^\/])$/)
      {
        push(@files_added, $1);
      }
  }

my @errors;
foreach my $path ( @files_added )
  {
    my $mime_type;
    my $eol_style;

    # Parse the complete list of property values of the file $path to
    # extract the mime-type and eol-style
    foreach my $prop (&read_from_process($svnlook, 'proplist', $repos, '-t',
                                         $txn, '--verbose', $path))
      {
        if ($prop =~ /^\s*svn:mime-type : (\S+)/)
          {
            $mime_type = $1;
          }
        elsif ($prop =~ /^\s*svn:eol-style : (\S+)/)
          {
            $eol_style = $1;
          }
      }

    # Detect error conditions and add them to @errors
    if (not $mime_type)
      {
        push @errors, "$path : svn:mime-type is not set";
      }
    elsif ($mime_type =~ /^text\// and not $eol_style)
      {
        push @errors, "$path : svn:mime-type=$mime_type but svn:eol-style is not set";
      }
  }

# If there are any errors list the problem files and give information
# on how to avoid the problem. Hopefully people will set up auto-props
# and will not see this verbose message more than once.
if (@errors)
  {
    warn "$0:\n\n",
         join("\n", @errors), "\n\n",
         <<EOS;

    Every added file must have the svn:mime-type property set. In
    addition text files must have the svn:eol-style property set.

    For binary files try running
    svn propset svn:mime-type application/octet-stream path/of/file

    For text files try
    svn propset svn:mime-type text/plain path/of/file
    svn propset svn:eol-style native path/of/file

    You may want to consider uncommenting the auto-props section
    in your ~/.subversion/config file. Read the Subversion book
    (http://svnbook.red-bean.com/), Chapter 7, Properties section,
    Automatic Property Setting subsection for more help.
EOS
    exit 1;
  }
else
  {
    exit 0;
  }

sub usage
{
  warn "@_\n" if @_;
  die "usage: $0 REPOS TXN-NAME\n";
}

sub safe_read_from_pipe
{
  unless (@_)
    {
      croak "$0: safe_read_from_pipe passed no arguments.\n";
    }
  print "Running @_\n";
  my $pid = open(SAFE_READ, '-|');
  unless (defined $pid)
    {
      die "$0: cannot fork: $!\n";
    }
  unless ($pid)
    {
      open(STDERR, ">&STDOUT")
        or die "$0: cannot dup STDOUT: $!\n";
      exec(@_)
        or die "$0: cannot exec `@_': $!\n";
    }
  my @output;
  while (<SAFE_READ>)
    {
      chomp;
      push(@output, $_);
    }
  close(SAFE_READ);
  my $result = $?;
  my $exit = $result >> 8;
  my $signal = $result & 127;
  my $cd = $result & 128 ? "with core dump" : "";
  if ($signal or $cd)
    {
      warn "$0: pipe from `@_' failed $cd: exit=$exit signal=$signal\n";
    }
  if (wantarray)
    {
      return ($result, @output);
    }
  else
    {
      return $result;
    }
}

sub read_from_process
{
  unless (@_)
    {
      croak "$0: read_from_process passed no arguments.\n";
    }
  my ($status, @output) = &safe_read_from_pipe(@_);
  if ($status)
    {
      if (@output)
        {
          die "$0: `@_' failed with this output:\n", join("\n", @output), "\n";
        }
      else
        {
          die "$0: `@_' failed with no output.\n";
        }
    }
  else
    {
      return @output;
    }
}
[csvn_at_svn tmp]$

> See the issue link for more information and some workarounds (patches,
> but also an additional apache module you could load).
> A fix has just recently been committed but it is for 1.8. We cannot
> backport it to 1.7 because it requires API changes.

I will give it a try when I understand it :-) I hope to find some free
time soon.

> The character ú is a character which has a diacritic so another
> possible explanation is a problem with NFC/NFD normalisation.
> See http://subversion.tigris.org/issues/show_bug.cgi?id=2464
> This usually happens when MacOS X clients are involved. But in theory any
> Windows or Linux client could cause the same problem depending on how
> tools used on the client machine normalise UTF-8.

Ditto, I'll give it a try.
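
To make the NFC/NFD distinction concrete: the same visible name can be stored as two different byte sequences, and a byte-wise path lookup treats them as different names. A small sketch using printf octal escapes and od (the helper `hex_bytes` is made up for illustration):

```sh
#!/bin/sh
# Hypothetical helper: dump the bytes of a string as one hex string.
hex_bytes() {
  printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
  printf '\n'
}

# NFC: precomposed U+00FA, UTF-8 bytes c3 ba.
nfc=$(printf 'In\303\272til.TXT')
# NFD: plain 'u' followed by combining acute U+0301, UTF-8 bytes cc 81.
nfd=$(printf 'Inu\314\201til.TXT')

hex_bytes "$nfc"
hex_bytes "$nfd"
```

The escaped bytes in the error message (195 186, i.e. c3 ba) are the precomposed form, which suggests that the name in this transaction is already NFC, though I have not verified that on the client side.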

> Can you check if either of these apply?
> If not, we'll need to dig further.

OK, I'll investigate further.
Just to summarize, I have a problem and a no-problem:

Problem: how to use the aforementioned check-mime-type with 'accented' files
checked in from Windows clients.

No-problem: how to check out 'accented' files already in the
repository with a Linux client. "Solved".
Received on 2012-02-02 20:13:26 CET

This is an archived mail posted to the Subversion Users mailing list.
