[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] check-mime-type.pl

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Sat, 12 Sep 2020 03:14:55 +0000

Nathan Hartman wrote on Fri, Sep 11, 2020 at 10:38:24 -0400:
> Subject: [PATCH] Check also property-modified files in hook script
> check-mime-type.pl
>
> * contrib/hook-scripts/check-mime-type.pl:
> (proplist) add property-modified files for checks

There are several changes not mentioned in the log message:

- Script name fix in comments

- Move 'use' before 'BEGIN'
  (incidentally, I'm not actually sure that's a correct change, given
  the BEGIN block implies perls as old as 5.4 are supported.)

- Add docstrings

> +++ b/contrib/hook-scripts/check-mime-type.pl
> @@ -1,18 +1,19 @@
> #!/usr/bin/env perl
>
> # ====================================================================
> -# commit-mime-type-check.pl: check that every added file has the
> -# svn:mime-type property set and every added file with a mime-type
> -# matching text/* also has svn:eol-style set.
> +# check-mime-type.pl: check that every added or property-modified file
> +# has the svn:mime-type property set and every added or property-modified
> +# file with a mime-type matching text/* also has svn:eol-style set.

I don't think it's acceptable for a script called "check-mime-type.pl"
to start enforcing svn:eol-style as well. This violates the principles
of least surprise and of "Do one thing and do it well". Also, the
choice of text/* is a bit arbitrary, and would seem to have both false
positives and false negatives: e.g., application/xml and text/x-diff.

I'm not even sure it's acceptable for this script to start enforcing
svn:mime-type's presence upon propmods, either. It's not exactly
a natural extension that existing users would expect.

> # ====================================================================
> # Copyright (c) 2000-2004 CollabNet. All rights reserved.
> +# Copyright (c) 2014-2020 Apache Software Foundation (ASF).

How did you come up with these numbers? The script received commits in
2013 and 2015, so the end point should be 2013. (Also, the next line
talks of a "COPYING" file which doesn't exist, but that's a preëxisting
problem.)

> @@ -100,19 +101,20 @@ my $tmp_dir = '/tmp';
> -# Figure out what files have added using svnlook.
> -my @files_added;
> +# Figure out what files have been added/property-modified using svnlook.
> +my @paths_to_check;
> +my $props_changed_re = qr/^(?:A |[U_ ]U) (.*[^\/])$/;
> foreach my $line (&read_from_process($svnlook, 'changed', $repos, '-t', $txn))
> {
> - # Add only files that were added to @files_added
> - if ($line =~ /^A. (.*[^\/])$/)
> + # Add only files that were added/property-modified to @paths_to_check
> + if ($line =~ /$props_changed_re/)

Why did you change /^A./ to /^A /?

> Subject: [PATCH] Enhanced platform support of hook script check-mime-type.pl
>
> * contrib/hook-scripts/check-mime-type.pl:
> (svnlook) Cross Platform: sane executable determination

> +++ b/contrib/hook-scripts/check-mime-type.pl
> @@ -28,6 +28,15 @@
>
> use strict;
> use Carp;
> +use Config;
> +use Cwd;

«use Cwd qw(getcwd);» would be better.

> +use Env;

Why?

This syntax (according to «perldoc Env») "ties all existing environment
variables ("keys %ENV") to scalars". This seems like a dangerous thing
to do; in fact, it negates the «use strict» a few lines above.

I suspect this line should be deleted.

> +use File::Which qw(which);
> +use File::Temp qw(tempdir tempfile);
> +
> +my $old_dir;
> +my $os_windows = $^O eq 'MSWin32';
> +my $os_dos = $^O eq 'dos';
>
> # Turn on warnings the best way depending on the Perl version.
> BEGIN {
> @@ -35,6 +44,12 @@ BEGIN {
> { require warnings; import warnings; }
> else
> { $^W = 1; }
> +
> + $old_dir = getcwd;
> +}
> +
> +END {
> + chdir($old_dir);
> }

Why is this necessary?

>
>
> @@ -42,31 +57,58 @@ BEGIN {
> # Configuration section.
>
> # Svnlook path.
> -my $svnlook = "/usr/bin/svnlook";
> +my $svnlook;
>
> # Since the path to svnlook depends upon the local installation
> # preferences, check that the required program exists to insure that
> # the administrator has set up the script properly.
> {
> - my $ok = 1;
> - foreach my $program ($svnlook)
> + my $svnlook_exe = 'svnlook' . $Config{_exe};
> + my @svnlook_paths;
> +
> + if ($os_windows || $os_dos)
> + {
> + # On Windows/DOS the environment is available
> + my $svn_bindir = $ENV{'SVN_BINDIR'};

Shouldn't the condition be «if (defined $ENV{SVN_BINDIR})»?

> + if (defined $svn_bindir and length $svn_bindir)
> + {
> + $svnlook_paths[++$#svnlook_paths] = "$svn_bindir/$svnlook_exe";

Why does this line use «$arrayvar[++$#arrayvar] = $foo» rather than
«push @arrayvar, $foo», which is equivalent and idiomatic?

> + }
> +
> + $svnlook = which($svnlook_exe);
> + if (defined $svnlook and length $svnlook)
> + {
> + $svnlook_paths[++$#svnlook_paths] = $svnlook;
> + }

Why do we bother to have $svnlook tested for existence and
executability? Don't we trust File::Which::which() to work as
advertised?

> + undef $svnlook;
> + }
> +
> + # On Unix the environment is empty, define fallback with absolute path
> + $svnlook_paths[++$#svnlook_paths] = "/usr/bin/$svnlook_exe";
> + foreach my $program (@svnlook_paths)
> {
> if (-e $program)
> {
> unless (-x $program)
> {
> - warn "$0: required program `$program' is not executable, ",
> - "edit $0.\n";
> - $ok = 0;
> + warn "$0: program `$program' is not executable.\n";
> + next;
> }
> + $svnlook = $program;
> + last;
> }
> else
> {
> - warn "$0: required program `$program' does not exist, edit $0.\n";
> - $ok = 0;
> + warn "$0: program `$program' does not exist.\n";
> + next;
> }
> }

What's the point of issuing a warning when there are some more
iterations yet to go?

If I'm reading this correctly, this entire block should be reduced to:

    # Use «||» rather than «//» since the BEGIN block implies versions of Perl that lack «//» are supported.
    my $svnlook = +(defined($ENV{SVN_BINDIR}) ? "$ENV{SVN_BINDIR}/$svnlook_exe" : undef) || which("svnlook") || "/usr/bin/svnlook";
    die … unless -x $svnlook and (-f _ or -l _);

I changed -e to (-f or -l) since -x implies -e.

I also disabled the fallbacks when the envvar is set but doesn't point
to a usable executable. I'm not married to these semantics, but I think
these semantics may be expected _given the particular variable name_:
SVN_BINDIR is the name of the configure AC_SUBST() variable that's
cooked into various scripts installed by install-tools.

> - exit 1 unless $ok;
> + unless (defined $svnlook and length $svnlook)
> + {
> + warn "$0: required program `$svnlook_exe' does not exist or is not executable, maybe have to edit $0.\n";
> + exit 1;
> + }
> }
>
> ######################################################################
> @@ -95,9 +137,9 @@ sub ACCESS_READ_WRITE () { 'read-write' }
> ######################################################################
> # Harvest data using svnlook.
>
> -# Change into /tmp so that svnlook diff can create its .svnlook
> +# Change into temp dir so that svnlook diff can create its .svnlook

When this script was first added, svnlook would create
a $TMPDIR/svnlook.$unique_integer directory (without a leading dot, and
in $TMPDIR rather than in cwd). svnlook doesn't do that nowadays, so
I see no reason to chdir at all — …

> # directory.
> -my $tmp_dir = '/tmp';
> +my $tmp_dir = tempdir( TMPDIR => 1, CLEANUP => 1 );

… in which case, this line — though otherwise correct — should simply be
deleted altogether, and the File::Temp and Cwd uses with it.

> @@ -220,19 +262,66 @@ sub safe_read_from_pipe
> + while ($command = shift)
> + {
> + if ("-m" eq $command)

- This is a general-purpose "run a child process" function, so it can't
  special-case "-m".

- If it could, it would have to check for all four variants: «-m foo»,
  «-mfoo», «--message=foo», and «--message foo».

- It would also have to look for the «--» end-of-options guard, since
  «-m» may occur after one. (This can actually happen in the second
  callsite of read_from_process().)

- None of the callers pass -m in the first place, so this is dead code.

> + {
> + $comment = shift;
> + my ($handle, $tmpfile) = tempfile( DIR => $tmp_dir);
> + print $handle $comment;
> + close($handle);
> +
> + push(@commandline, "--file");
> + push(@commandline, $tmpfile);

There may have been a "--" end-of-options guard earlier on the command
line.

> + }
> + else
> + {
> + # Munge the command to protect it from the command line
> + $command =~ s/\"/\\\"/g;
> + if ($command =~ m"\s") { $command = "\"$command\""; }
> + if ($command eq "") { $command = "\"\""; }

I'm pretty sure this escaping is incomplete. Use a library function
rather than roll your own escaping.

> + if ($command =~ m"\n")
> + {
> + warn "$0: carriage return detected in command - may not work\n";

If it "may not work", then die() rather than warn().

> + }
> + push(@commandline, $command);

> + open(SAFE_READ, "@commandline |")

> They can be applied in sequence:
> svn-patch-01-c73cb4415d.txt
> svn-patch-02-68cc555740.txt
>
> In principle I think the changes are useful, though might change
> behavior in a surprising way for those who are using the hook as it
> stands now.
>

Usefulness aside, I think the first patch is unacceptable because of
backwards compatibility considerations and the second patch would not be
acceptable unless revised.

> Would it make sense to copy check-mime-type.pl, calling it, say,
> check-mime-type-2.pl, and apply the changes to that file, keeping the
> original as it is now?

For the first patch, moving the new behaviours to a new script would
address the backwards compatibility point, as would making the new
behaviours optional and opt-in. However, neither would address the
other review points.

The second patch doesn't appear to raise any backwards compatibility
concerns.

Cheers,

Daniel
Received on 2020-09-12 05:15:09 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.