RE: log --search test failures on trunk and 1.8.x

From: Bert Huijben <bert_at_qqmail.nl>
Date: Mon, 22 Apr 2013 11:22:11 +0200

> -----Original Message-----
> From: Stefan Sperling [mailto:stsp_at_elego.de]
> Sent: Monday, 22 April 2013 00:08
> To: Ivan Zhakov
> Cc: Branko Čibej; Bert Huijben; dev_at_subversion.apache.org
> Subject: Re: log --search test failures on trunk and 1.8.x
> On Sun, Apr 21, 2013 at 07:11:14PM +0400, Ivan Zhakov wrote:
> > So I propose the following plan:
> > 1. Make 'log --search' case-sensitive in trunk and 1.8.x.
> > 2. Merge utf8proc stuff to trunk
> > 3. Implement svn_utf__casefold() using utf8proc
> > 4. Implement 'log --isearch' using
> No --isearch please. It did exist on trunk but we made --search
> case-insensitive in r1388530 to avoid having too many options.
> Has anyone tried my APR patch on windows yet? I'd be interested to
> know whether or not that helps... if you are running Windows and
> care about this issue, please let me know if my APR patch helps so
> that I can prepare a fix for APR and SVN 1.8.x. I think we can do
> better than making --search case-sensitive just because of a bug in APR.
> I don't see why we couldn't ship a private and fixed version of
> apr_fnmatch() for use by log --search in 1.8.x, to avoid undefined
> behaviour within tolower() via apr_fnmatch() without requiring future
> APR versions. Would this not be a good way of fixing this?

What would this fix?

This doesn't make apr_fnmatch use proper case folding, as that needs proper
UTF-8 processing and following locale rules.

The hack will make apr_fnmatch apply whatever locale casing and character
encoding rules are current on the platform to the UTF-8 characters in our
log messages (which are known to not always be strict UTF-8). And even then
it ignores multibyte characters, because it hands their bytes to tolower one
at a time.

This is just stacking undefined behaviour.
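
To illustrate the byte-at-a-time problem, here is a rough sketch in Python
rather than C. It only simulates what a per-byte tolower() would do under an
ISO-8859-1 locale; the helper names are made up for this example and nothing
here is APR or Subversion code.

```python
def bytewise_latin1_lower(data: bytes) -> bytes:
    """Simulate calling tolower() on each byte under an ISO-8859-1 locale."""
    # Latin-1 maps every byte to exactly one character, and lowercasing a
    # Latin-1 character always yields another Latin-1 character, so this
    # behaves like a per-byte case mapping.
    return data.decode("latin-1").lower().encode("latin-1")


def casefold_utf8(data: bytes) -> bytes:
    """Decode as UTF-8 first, then apply Unicode case folding."""
    return data.decode("utf-8").casefold().encode("utf-8")


message = "Ärger".encode("utf-8")         # UTF-8 bytes: C3 84 72 67 65 72
mangled = bytewise_latin1_lower(message)  # lead byte C3 is treated as a letter

try:
    mangled.decode("utf-8")
    still_valid_utf8 = True
except UnicodeDecodeError:
    still_valid_utf8 = False

print(mangled, still_valid_utf8)
print(casefold_utf8(message))
```

In this simulation the lead byte of the two-byte sequence is case-mapped on
its own, so the result is no longer valid UTF-8, while decoding first and
then folding gives the expected lowercase text.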

If you first converted the characters back to the system encoding and then
passed them to apr_fnmatch, that would be good enough for me for an 'svn'
implementation, but probably not for our libraries.

Like Ivan suggested: for our libraries we want strict behavior over all
platforms, no undefined behavior.
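
As a sketch of what platform-independent matching could look like, the
snippet below folds both the message and the pattern with Unicode case
folding and only then does a case-sensitive glob match. It is Python, not our
C code: str.casefold() merely stands in for the proposed
svn_utf__casefold(), and search_matches() is a hypothetical name. It also
ignores normalization (composed versus decomposed characters), which a real
implementation would probably have to handle.

```python
from fnmatch import fnmatchcase


def search_matches(message: str, pattern: str) -> bool:
    """Case-insensitive glob match that does not consult the locale."""
    # Fold both sides with Unicode rules, then compare case-sensitively so
    # that no platform-specific case handling is involved in the match.
    return fnmatchcase(message.casefold(), pattern.casefold())


print(search_matches("Fix ÄRGER in parser", "*ärger*"))
print(search_matches("Fix bug in parser", "*ärger*"))
```

The point of folding before matching is that the glob matcher itself never
needs to know anything about case, so its behaviour is the same everywhere.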

Received on 2013-04-22 11:23:19 CEST

This is an archived mail posted to the Subversion Dev mailing list.