Re: svn commit: r1807056 - in /subversion/trunk/subversion: include/private/svn_utf_private.h libsvn_client/list.c libsvn_repos/list.c libsvn_subr/utf8proc.c

From: Stefan Fuhrmann <stefan2_at_apache.org>
Date: Sat, 2 Sep 2017 18:06:24 +0200

On 02.09.2017 17:17, Branko Čibej wrote:
> On 02.09.2017 17:12, stefan2_at_apache.org wrote:
>> +svn_boolean_t
>> +svn_utf__fuzzy_glob_match(const char *str,
>> + const apr_array_header_t *patterns,
>> + svn_membuf_t *buf)
>> +{
>> + const char *normalized;
>> + svn_error_t *err;
>> + int i;
>> +
>> + /* Try to normalize case and accents in STR.
>> + *
>> + * If that should fail for some reason, continue with the original STR.
>> + * There is still a fair chance that it matches "*.ext" pattern despite
>> + * being "broken" UTF8. */
>
>
> What evidence do you have for this statement? It is, quite frankly,
> completely bonkers.
>
> "Broken," as you put in quotes, means it's not UTF-8. What kind of UTF-8
> do you think there's a fair chance it'll match then?

I've encountered old repositories where some path names
were apparently never converted to UTF-8 properly and
still contain whatever locale-encoded bytes the client sent.

Those would still match "*.someExtension" and similar
patterns despite not being valid UTF-8. I would like to
cover those as well.

I see 3 options here:

1. Make these cases a non-match, hiding them in the output.
2. Handle these cases in the callers, duplicating that part.
3. Keep it as it is.

-- Stefan^2.
Received on 2017-09-02 18:06:31 CEST

This is an archived mail posted to the Subversion Dev mailing list.