[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: RFC: date parser strawman

From: mark benedetto king <mbk_at_lowlatency.com>
Date: 2003-12-30 22:56:09 CET

On Tue, Dec 30, 2003 at 02:08:21PM -0600, kfogel@collab.net wrote:
> mark benedetto king <mbk@lowlatency.com> writes:
> > +svn_error_t *
> > +svn_parse_date (svn_boolean_t *matched, apr_time_t *val, const char *value,
> > + apr_time_t now, apr_int32_t gmtoff, apr_pool_t *pool);
>
> A couple of interface comments:
>
> It's confusing that the input is named "value" and the output "val".
> Maybe choose clearer names?

Good idea.

>
> The function returns boolean *and* error -- why both? (I'm sure
> there's a good reason, just the doc string should explain the error
> conditions and how they relate to 'matched'.)
>

This is in order to distinguish between parse-errors and internal errors,
like not being able to convert an exploded date into an apr_time_t.

This could be accomplished by peeking at the error type, but I thought
it would be cleaner this way. I tend to think of SVN_ERR as a sort of
ad-hoc structured exception handling mechanism, and try to use it that
way.

> And a documentation comment: Obviously, we'll need to update the book
> and any other place where we document date formats. I understand this
> patch is just about implementation, and that you're intentionally
> waiting on the other stuff. I'm just noting it here so we don't

I'll make the necessary book changes and include them in an updated
patch.

> neglect it. (Glad you got the INSTALL file, though -- I'd totally
> forgotten about that.)
>

ISTR that C. Scott Ananian remembered in his original patch. Otherwise
I might have forgotten as well.

> Code comments below. It's extremely clean & well-designed, was really
> a pleasure to read, by the way.

Thanks, it was a pleasure to write.

 [snip]
> For space, should we match TAB (ascii 9) as well in the second field?
 [snip]
> No "/" permitted, I see. Is it atypical to want that? (More on this
> below.)
>

More on this below.

> > +/* Attempt to match the date-string in VALUE to the provided TEMPLATE,
> > + using the rules defined above. Use GMTOFF is to bias the results
> > + when no timezone indicator is present. On successful match, store
> > + the matched values in EXP, and return TRUE. Otherwise, return
> > + FALSE. */
>
> Second sentence of doc string unclear, extra word somewhere? Also,
> store matched values in "*EXP", not "EXP", right? (I know that's
> nitpicky, but might as well be consistent with elsewhere.)

How about:

"Use GMTOFF to bias the results when no timezone indicator is present
in VALUE."

Will update patch.

> > + char *base = (char *)&ms;
> > +
 [...]
> > + place = (apr_int32_t *)(base + match->offset);
>
> I'm a bit mystified by the casting of &ms to get base, and by the
> pointer arithmetic here. Probably I could puzzle it out, but a
> comment explaining what's going on might be good. My endianness bells
> started ringing when I saw the assignment to place above... (I'm sure
> you've thought of that and there's nothing to worry about, just wonder
> how many other people's bells will go off.)

The offsets that come from APR_OFFSETOF are in bytes. This means that
we need to use a char * in order to get the right results when doing
the pointer arithmetic. I don't think endianness comes into play here;
the pointer to the base of the apr_int32_t should be the same either way.

> > + case NOOP:
> > + continue;
>
> Am I correct in understanding that this permits repeats of NOOP
> characters? For example, a template of "YYYY-MM" would allow
> "2004----01", and "YYYY MM" would allow "2004 01"? (This is a
> good thing, just want to make sure I understand correctly.)

No, because the template pointer is unconditionally incremented
on each iteration of the match loop.

>
> Are we starting out deliberately very strict, with intent to loosen up
> and allow more formats later?
>

Yes (and that's why TABs are not supported in place of spaces). These
formats are essentially the ISO-8601 extended and basic date-time formats,
plus (for completeness and convenience of cut-n-pasting) the format from
"svn log", plus (for less typing) a time-only specification.

> I'm wondering about "YYYY/MM/DD", "YY/MM/DD", for example.
>

YYYY/MM/DD is locale-specific (I think). YY/MM/DD is locale-specific and
ambiguous, since it looks quite a bit like both DD/MM/YY and MM/DD/YY.

I suppose we could just pick one and document it as Our Standard, though.

> > + if (exp.tm_mon == 1
> > + && exp.tm_mday == 29
> > + && (exp.tm_year % 4 != 0
> > + || (exp.tm_year % 100 == 0 && exp.tm_year % 400 != 100)))
> > + return SVN_NO_ERROR;
>
> I'd put more parens for precedence clarity in there, but there are
> others who feel that's weak-kneed and lily-livered. Your call.
>

There are already more parentheses than necessary in that condition. :-)

--ben

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Tue Dec 30 22:56:47 2003

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.