[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] check-mime-type.pl

From: M. Buecher <maddes+subversion_at_maddes.net>
Date: Mon, 21 Dec 2020 16:22:19 +0100

Hello Daniel,
Dear all,

thanks for your detailed review, Daniel. Finally got some more leisure
time to care about it.
I deceided to concentrate on the first part to handle property-modified
files plus your overall comments.
Left the Windows support aside as this has to be fine-tuned a lot, as I
learned a lot from your comments, thanks.

On 2020-09-12 05:14, Daniel Shahaf wrote:
> Nathan Hartman wrote on Fri, Sep 11, 2020 at 10:38:24 -0400:
>> Subject: [PATCH] Check also property-modified files in hook script
>> check-mime-type.pl
>>
>> * contrib/hook-scripts/check-mime-type.pl:
>> (proplist) add property-modified files for checks
>
> There are several changes not mentioned in the log message:
>
> - Script name fix in comments
>
> - Move 'use' before 'BEGIN'
> (incidentally, I'm not actually sure that's a correct change, given
> the BEGIN block implies perls as old as 5.4 are supported.)
>
> - Add docstrings
>
>> +++ b/contrib/hook-scripts/check-mime-type.pl
>> @@ -1,18 +1,19 @@
>> #!/usr/bin/env perl
>>
>> #
>> ====================================================================
>> -# commit-mime-type-check.pl: check that every added file has the
>> -# svn:mime-type property set and every added file with a mime-type
>> -# matching text/* also has svn:eol-style set.
>> +# check-mime-type.pl: check that every added or property-modified
>> file
>> +# has the svn:mime-type property set and every added or
>> property-modified
>> +# file with a mime-type matching text/* also has svn:eol-style set.
>
> I don't think it's acceptable for a script called "check-mime-type.pl"
> to start enforcing svn:eol-style as well. This violates the principles
> of least surprise and of "Do one thing and do it well". Also, the
> choice of text/* is a bit arbitrary, and would seem to have both false
> positives and false negatives: e.g., application/xml and text/x-diff.
>
> I'm not even sure it's acceptable for this script to start enforcing
> svn:mime-type's presence upon propmods, either. It's not exactly
> a natural extension that existing users would expect.

First I am not the original author of that hook script, just using it
with those modifications with Subversion 1.6 up to 1.14.0 on
Debian/Ubuntu and Windows.
Some people (like me) want to enforce "svn:mime-type" for some
repositories and using that script is optional to admins.
I agree with the original author to check mime-type plus EoL in a single
script, as EoL setting can be important to text files.
Enforcing a policy on creation but not on changes is inconsistent,
that's the reason for this fix.

But I do understand your worries and made the functionalities optional,
setting the default exactly as it acts right now.
Completely removing the EoL check would change its fucntionality after
16 years.
New version of the patch, log and script itself attached. Will change
the Pull Request according to upcoming comments.

Additionally it could be renamed, e.g. to
"check-mime-type-and-text-eol.pl"?

>> #
>> ====================================================================
>> # Copyright (c) 2000-2004 CollabNet. All rights reserved.
>> +# Copyright (c) 2014-2020 Apache Software Foundation (ASF).
>
> How did you come up with these numbers? The script received commits in
> 2013 and 2015, so the end point should be 2013. (Also, the next line
> talks of a "COPYING" file which doesn't exist, but that's a preëxisting
> problem.)

Those lines are inherited from commit-access-control.pl, when it was
copied as a base for check-mime-type.pl.
Copyright lines and license were not updated by CollabNet or ASF despite
new commits since 2004.
I used 2014 as a start as this was the time this patch was posted on the
ASF mailing list.

Maybe coyprigght and license should be adapted to the current version of
commit-access-control.pl either with or without copyright lines.
According to the overall Subversion history it may should be changed to:
# Copyright (c) 2000-2009 CollabNet. All rights reserved.
# Copyright (c) 2010-2020 Apache Software Foundation (ASF).

>> @@ -100,19 +101,20 @@ my $tmp_dir = '/tmp';
>> -# Figure out what files have added using svnlook.
>> -my @files_added;
>> +# Figure out what files have been added/property-modified using
>> svnlook.
>> +my @paths_to_check;
>> +my $props_changed_re = qr/^(?:A |[U_ ]U) (.*[^\/])$/;
>> foreach my $line (&read_from_process($svnlook, 'changed', $repos,
>> '-t', $txn))
>> {
>> - # Add only files that were added to @files_added
>> - if ($line =~ /^A. (.*[^\/])$/)
>> + # Add only files that were added/property-modified to
>> @paths_to_check
>> + if ($line =~ /$props_changed_re/)
>
> Why did you change /^A./ to /^A /?

The RegEx was taken from Leo Davis' mail, but I agree it should be /^A./
for added files and also /^.U/ for property-modified files.

>> Subject: [PATCH] Enhanced platform support of hook script
>> check-mime-type.pl
>>
>> * contrib/hook-scripts/check-mime-type.pl:
>> (svnlook) Cross Platform: sane executable determination
>
>> +++ b/contrib/hook-scripts/check-mime-type.pl
>> @@ -28,6 +28,15 @@
>>
>> use strict;
>> use Carp;
>> +use Config;
>> +use Cwd;
>
> «use Cwd qw(getcwd);» would be better.
>
>> +use Env;
>
> Why?
>
> This syntax (according to «perldoc Env») "ties all existing environment
> variables ("keys %ENV") to scalars". This seems like a dangerous thing
> to do; in fact, it negates the «use strict» a few lines above.
>
> I suspect this line should be deleted.
>
>> +use File::Which qw(which);
>> +use File::Temp qw(tempdir tempfile);
>> +
>> +my $old_dir;
>> +my $os_windows = $^O eq 'MSWin32';
>> +my $os_dos = $^O eq 'dos';
>>
>> # Turn on warnings the best way depending on the Perl version.
>> BEGIN {
>> @@ -35,6 +44,12 @@ BEGIN {
>> { require warnings; import warnings; }
>> else
>> { $^W = 1; }
>> +
>> + $old_dir = getcwd;
>> +}
>> +
>> +END {
>> + chdir($old_dir);
>> }
>
> Why is this necessary?

On Windows a sub script called by the precommit hook can change the
current directory whcih also affects the precommit hook, which had led
to weird results for the following sub scripts by the precommit hook.
This should have been explained in a comment.

>> @@ -42,31 +57,58 @@ BEGIN {
>> # Configuration section.
>>
>> # Svnlook path.
>> -my $svnlook = "/usr/bin/svnlook";
>> +my $svnlook;
>>
>> # Since the path to svnlook depends upon the local installation
>> # preferences, check that the required program exists to insure that
>> # the administrator has set up the script properly.
>> {
>> - my $ok = 1;
>> - foreach my $program ($svnlook)
>> + my $svnlook_exe = 'svnlook' . $Config{_exe};
>> + my @svnlook_paths;
>> +
>> + if ($os_windows || $os_dos)
>> + {
>> + # On Windows/DOS the environment is available
>> + my $svn_bindir = $ENV{'SVN_BINDIR'};
>
> Shouldn't the condition be «if (defined $ENV{SVN_BINDIR})»?
>
>> + if (defined $svn_bindir and length $svn_bindir)
>> + {
>> + $svnlook_paths[++$#svnlook_paths] =
>> "$svn_bindir/$svnlook_exe";
>
> Why does this line use «$arrayvar[++$#arrayvar] = $foo» rather than
> «push @arrayvar, $foo», which is equivalent and idiomatic?

Just because I'm not a Perl expert.
That's why I postponed a patch for Windows support.

>> + }
>> +
>> + $svnlook = which($svnlook_exe);
>> + if (defined $svnlook and length $svnlook)
>> + {
>> + $svnlook_paths[++$#svnlook_paths] = $svnlook;
>> + }
>
> Why do we bother to have $svnlook tested for existence and
> executability? Don't we trust File::Which::which() to work as
> advertised?
>
>> + undef $svnlook;
>> + }
>> +
>> + # On Unix the environment is empty, define fallback with absolute
>> path
>> + $svnlook_paths[++$#svnlook_paths] = "/usr/bin/$svnlook_exe";
>> + foreach my $program (@svnlook_paths)
>> {
>> if (-e $program)
>> {
>> unless (-x $program)
>> {
>> - warn "$0: required program `$program' is not
>> executable, ",
>> - "edit $0.\n";
>> - $ok = 0;
>> + warn "$0: program `$program' is not executable.\n";
>> + next;
>> }
>> + $svnlook = $program;
>> + last;
>> }
>> else
>> {
>> - warn "$0: required program `$program' does not exist, edit
>> $0.\n";
>> - $ok = 0;
>> + warn "$0: program `$program' does not exist.\n";
>> + next;
>> }
>> }
>
> What's the point of issuing a warning when there are some more
> iterations yet to go?
>
> If I'm reading this correctly, this entire block should be reduced to:
>
> # Use «||» rather than «//» since the BEGIN block implies versions
> of Perl that lack «//» are supported.
> my $svnlook = +(defined($ENV{SVN_BINDIR}) ?
> "$ENV{SVN_BINDIR}/$svnlook_exe" : undef) || which("svnlook") ||
> "/usr/bin/svnlook";
> die … unless -x $svnlook and (-f _ or -l _);
>
> I changed -e to (-f or -l) since -x implies -e.
>
> I also disabled the fallbacks when the envvar is set but doesn't point
> to a usable executable. I'm not married to these semantics, but I
> think
> these semantics may be expected _given the particular variable name_:
> SVN_BINDIR is the name of the configure AC_SUBST() variable that's
> cooked into various scripts installed by install-tools.
>
>> - exit 1 unless $ok;
>> + unless (defined $svnlook and length $svnlook)
>> + {
>> + warn "$0: required program `$svnlook_exe' does not exist or is
>> not executable, maybe have to edit $0.\n";
>> + exit 1;
>> + }
>> }
>>
>>
>> ######################################################################
>> @@ -95,9 +137,9 @@ sub ACCESS_READ_WRITE () { 'read-write' }
>>
>> ######################################################################
>> # Harvest data using svnlook.
>>
>> -# Change into /tmp so that svnlook diff can create its .svnlook
>> +# Change into temp dir so that svnlook diff can create its .svnlook
>
> When this script was first added, svnlook would create
> a $TMPDIR/svnlook.$unique_integer directory (without a leading dot, and
> in $TMPDIR rather than in cwd). svnlook doesn't do that nowadays, so
> I see no reason to chdir at all — …
>
>> # directory.
>> -my $tmp_dir = '/tmp';
>> +my $tmp_dir = tempdir( TMPDIR => 1, CLEANUP => 1 );
>
> … in which case, this line — though otherwise correct — should simply
> be
> deleted altogether, and the File::Temp and Cwd uses with it.
>
>> @@ -220,19 +262,66 @@ sub safe_read_from_pipe
>> + while ($command = shift)
>> + {
>> + if ("-m" eq $command)
>
> - This is a general-purpose "run a child process" function, so it can't
> special-case "-m".
>
> - If it could, it would have to check for all four variants: «-m foo»,
> «-mfoo», «--message=foo», and «--message foo».
>
> - It would also have to look for the «--» end-of-options guard, since
> «-m» may occur after one. (This can actually happen in the second
> callsite of read_from_process().)
>
> - None of the callers pass -m in the first place, so this is dead code.
>
>> + {
>> + $comment = shift;
>> + my ($handle, $tmpfile) = tempfile( DIR => $tmp_dir);
>> + print $handle $comment;
>> + close($handle);
>> +
>> + push(@commandline, "--file");
>> + push(@commandline, $tmpfile);
>
> There may have been a "--" end-of-options guard earlier on the command
> line.
>
>> + }
>> + else
>> + {
>> + # Munge the command to protect it from the command line
>> + $command =~ s/\"/\\\"/g;
>> + if ($command =~ m"\s") { $command = "\"$command\""; }
>> + if ($command eq "") { $command = "\"\""; }
>
> I'm pretty sure this escaping is incomplete. Use a library function
> rather than roll your own escaping.
>
>> + if ($command =~ m"\n")
>> + {
>> + warn "$0: carriage return detected in command - may
>> not work\n";
>
> If it "may not work", then die() rather than warn().
>
>> + }
>> + push(@commandline, $command);
> ⋮
>> + open(SAFE_READ, "@commandline |")
>
>> They can be applied in sequence:
>> svn-patch-01-c73cb4415d.txt
>> svn-patch-02-68cc555740.txt
>>
>> In principle I think the changes are useful, though might change
>> behavior in a surprising way for those who are using the hook as it
>> stands now.
>>
>
> Usefulness aside, I think the first patch is unacceptable because of
> backwards compatibility considerations and the second patch would not
> be
> acceptable unless revised.
>
>> Would it make sense to copy check-mime-type.pl, calling it, say,
>> check-mime-type-2.pl, and apply the changes to that file, keeping the
>> original as it is now?
>
> For the first patch, moving the new behaviours to a new script would
> address the backwards compatibility point, as would making the new
> behaviours optional and opt-in. However, neither would address the
> other review points.
>
> The second patch doesn't appear to raise any backwards compatibility
> concerns.
>
> Cheers,
> Daniel

Kind regards
Maddes

Received on 2020-12-21 16:22:41 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.