At 9:33 AM +0200 7/22/03, Michael Wood wrote:
>On Mon, Jul 21, 2003 at 10:31:44AM -0400, Paul Lussier wrote:
>> In a message dated: Fri, 18 Jul 2003 16:23:22 PDT
>> Justin Erenkrantz said:
>>
>> >Namely, I believe fnmatch doesn't support [xyz]* expressions. I could be
>> >wrong, but that's what the Solaris man page on fnmatch(5) leads me to
>> >believe.
>>
Whoa, there, hold on a bit. Are these things actually patterns, not
regular expressions? The Book says "regular expressions" widely; if
that's incorrect, we need to fix it.
I dunno about *your* man page, but *my* fnmatch(3) page begins "The
fnmatch() function matches patterns according to the rules used by
the shell," a two-part attempt to clearly distinguish which sorts of
expressions it matches: "patterns" and "like the shell." Indeed, I
daresay the "fn" there is meant to imply "file name," making three
separate claims. (Never mind that there are many shells, with some
variation in what they actually match....) If SVN uses fnmatch(),
then it appears to be matching _patterns_, not _regular_expressions_.
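
To make the distinction concrete, here is a small sketch of how the same
text, "[xyz]*", means different things as a pattern and as a regular
expression. It assumes Python's fnmatch module as a stand-in for the C
fnmatch(3) function (the two differ in details, and this is not what SVN
itself runs); the helper names glob_match and regex_match are made up for
illustration.

```python
import fnmatch
import re

# The same text, read two different ways.
PATTERN = "[xyz]*"


def glob_match(name):
    """Shell-style pattern: one of x, y, z, then any run of characters."""
    return fnmatch.fnmatchcase(name, PATTERN)


def regex_match(name):
    """Regular expression: zero or more characters drawn from x, y, z."""
    return re.fullmatch(PATTERN, name) is not None


if __name__ == "__main__":
    for name in ["x", "xray.c", "", "yyy", "abc"]:
        print(repr(name), glob_match(name), regex_match(name))
```

As a pattern, "[xyz]*" accepts "xray.c" and rejects the empty string; as a
regular expression it does the reverse. In this implementation, at least,
the bracket expression is supported in both readings.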
More detail than you ever wanted to know: a language theorist will
tell you that both these things are "regular expressions," or more
precisely "expressions from a regular language." Ultimately, this is
grounded in something called "The Chomsky Hierarchy," a way of
classifying the complexity of languages, defined by the noted
linguist and anti-war demonstrator Noam Chomsky. Somehow, Unix got
into the very messy state that editors (and other programs based on
editors, like grep) were using one kind of regular expressions, while
shells (indeed, "the shell" back then) were using another. POSIX, in
trying to unravel this skein of confusion, invented the term
"pattern" for the sort of regular expression the shells use, and
reserves the phrase "regular expression" for the sort of regular
expression that editors use. As has elsewhere been mentioned, the
convention is that "patterns" are used on file names, while "regular
expressions" are used on (any other kind of) strings. About the same
time, someone (might have been Bill Joy, in writing the csh, one of
the first widely popular "other" shells) invented the term "glob" for
what POSIX called "pattern," again leaving the language-theoretical
term "regular expression" for "that thing the editors do." So now,
instead of one term for two things, we have three, which may not be
entirely an improvement.
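
The language theorist's point, that a shell pattern is just another way of
writing a regular expression, can be sketched with Python's
fnmatch.translate(), which rewrites a pattern into the editor-style
notation. Python's fnmatch module is again only a stand-in for the C
fnmatch(3) function, and glob_as_regex is a made-up helper name.

```python
import fnmatch
import re


def glob_as_regex(pattern):
    """Compile a shell-style pattern by first rewriting it as a regex."""
    # fnmatch.translate() returns regex source text that already
    # anchors the match at the end of the string.
    return re.compile(fnmatch.translate(pattern))


if __name__ == "__main__":
    # The exact regex text varies between Python versions.
    print(fnmatch.translate("*.txt"))
    print(bool(glob_as_regex("*.txt").match("notes.txt")))
```

Going the other way does not work in general: a regular expression such as
"(ab)*" has no equivalent in the basic pattern notation.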
The POSIX terms don't seem entirely to have caught on (for example,
Perl also has "globbing" and "regular expressions," but also has
"patterns" which--oh, dear!--are regular expressions, not globs). I
don't know of anyone who uses "glob" to mean "that thing editors do."
So I suppose we can choose to bow to documented standards (POSIX
patterns and regular expressions), or to bow to the closest approach
to consensus (globs and regular expressions).
But no way, Dr. Noam notwithstanding, should we call this a "regular
expression!"
--
-==-
Jack Repenning
CollabNet, Inc.
8000 Marina Boulevard, Suite 600
Brisbane, California 94005
o: 650.228.2562
c: 408.835-8090
Received on Tue Jul 22 18:38:14 2003