Re: RFC: date parser rewrite

From: mark benedetto king <mbk_at_lowlatency.com>
Date: 2003-12-19 23:20:38 CET

On Fri, Dec 19, 2003 at 02:44:21PM -0600, kfogel@collab.net wrote:
> This thread's been dormant for a bit, so I'm afraid to disturb it :-).
>
> However, eventually someone's going to file it as a 1.0 candidate
> change (probably after they have a patch ready), or move issue #408 to
> be a 1.0 candidate.

That was my plan.

> So, this is an anticipatory post explaining why I hope we do *not*
> tackle this problem for 1.0.
>
> Reason 1: As much as we dislike our current date parser, it hasn't
> caused us many actual problems. It's not like we had a
> bunch of defects filed about date parsing, and then realized
> that we couldn't release 1.0 without fixing them. We have
> issue #408, sure, but that's "date parser rewrite", so it
> would be circular to view it as a justification in itself :-).

It has at least one major defect: it is not thread-safe. It's not even
close. This would be a problem for GUI clients and/or server-side
consumers of our APIs. I suppose that it could be *made* thread-safe;
we could (horrors) protect it with a big mutex, for example.
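
For illustration only, the "big mutex" idea could look roughly like the
sketch below. It is written in Python rather than C for brevity, and
legacy_parse_date is a hypothetical stand-in for the existing parser,
not its real interface; the module-level scratch state is an assumption
meant to mimic the kind of shared state that makes a parser unsafe to
call from several threads at once.

```python
import threading

# Hypothetical stand-in for the existing, non-thread-safe parser.
# It keeps its working state in a module-level dict, so two threads
# calling it at the same time could overwrite each other's fields.
_scratch = {}

def legacy_parse_date(text):
    _scratch["input"] = text
    date_part, _, time_part = text.partition(" ")
    _scratch["fields"] = date_part.split("-") + time_part.split(":")
    return tuple(int(field) for field in _scratch["fields"])

# One process-wide lock: only one thread may be inside the parser.
_parser_lock = threading.Lock()

def parse_date_locked(text):
    """Serialize every call into the legacy parser behind a single lock."""
    with _parser_lock:
        return legacy_parse_date(text)
```

The cost is that all date parsing in the process is serialized, which is
why it counts as a stopgap rather than a fix.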

It also, for example, parses "2003-12-17T17:00:00" differently from
"2003-12-17 17:00:00". I don't know if that's a bug or not, but it's
a bit surprising. (If you're interested, 2003-12-17T17:00:00 comes
out as "Wed Dec 17 10:00:00 UTC 2003". I'm in EST (UTC-5), which doesn't
account for a 7-hour time difference, even if the 'T' were to somehow
magically stand for "this date-time is in UTC". Hmm...perhaps it's
off by twelve hours the wrong way... hard to say).
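
The arithmetic behind those two guesses can be checked with a short
sketch. This only reproduces the offsets discussed above; it does not
call the actual parser, and the fixed UTC-5 zone is an assumption that
ignores daylight saving (which is not in effect in December anyway).

```python
from datetime import datetime, timedelta, timezone

# Assumption: EST modeled as a fixed UTC-5 offset.
EST = timezone(timedelta(hours=-5), "EST")

# What 17:00 local time should be in UTC if the input is read as EST.
local = datetime(2003, 12, 17, 17, 0, 0, tzinfo=EST)
expected_utc = local.astimezone(timezone.utc)

# What the parser reportedly produced for the 'T' form.
observed_utc = datetime(2003, 12, 17, 10, 0, 0, tzinfo=timezone.utc)

# Gap if the 'T' form had been read as already being in UTC.
input_as_utc = datetime(2003, 12, 17, 17, 0, 0, tzinfo=timezone.utc)
gap_if_input_was_utc = input_as_utc - observed_utc

# Gap if the 'T' form had been read as local (EST) time.
gap_if_input_was_est = expected_utc - observed_utc

print(expected_utc.isoformat())
print(gap_if_input_was_utc, gap_if_input_was_est)
```

Read as UTC, the result is 7 hours early; read as EST, it is exactly 12
hours early, which fits the "off by twelve hours" guess.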

> Reason 2: Unlike some of the simple API fixes and whatnot we've been
> bandying about, this actually has the potential to cause
> bugs. It'd need a lot of review, and even then it's still
> possible for there to be subtle misparsings that don't get
> found until after 1.0.

I expect my implementation to be less than 100 lines of code. I think
Greg Hudson would be willing to review it thoroughly, and probably
Brane would too. Yes, they could be fixing other bugs or reviewing
other changes, instead...

> maintain. If we write a new parser, it certainly wouldn't
> differently parse any string that is correctly parsed by the
> old code. So, we can just pass input to the old code first,
> then if it fails, we try the new code. That way we don't
> have to "maintain" the old code, so much as preserve it in
> stasis until the new code matches it in sophistication -- a
> process that can take as long as we're comfortable with,
> since there's no great penalty to keeping the old code
> around.

The semantics of the existing parser are ill-defined at best. It is
unlikely that any implementation that does not have, as a design goal,
to match a superset of the de facto date-language domain will wind up
compatible. If it is axiomatic that "one day" we will remove that code
from the tree, it follows that on that day we will stop parsing some
non-empty set of strings that we used to parse.

When that happens, it is likely that some (possibly not many, granted)
scripts, programs, documentation, printed manuals, user's guides, insider's
guides, etc., will be broken. Undocumented features are still features.

>
> For these reasons, I think it is unnecessary (and even a bit risky) to
> futz with our date parser before 1.0. Let's leave well enough alone.
>

If we decide not to replace the date parser before 1.0, then I suggest
we:
   1.) protect the logic with a big mutex
   2.) follow your suggestion to add support for the formats we want
       to add by wrapping the existing logic
   3.) *Deprecate* all behavior outside the strictly limited input set
       I proposed earlier.

Consider, though: I expect the implementation of (2) to be roughly the
same, in terms of complexity, whether I try to only parse these
"new formats", or whether I try to parse the entire set of formats
covered in the proposal. If step (2) introduces all the instability
that you are hesitant to incur, we need to omit it completely.
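
A rough sketch of steps (2) and (3) together, in Python for brevity:
try the old logic first, fall back to the new code, and warn whenever
the old logic accepts something outside the strict set. Everything here
is hypothetical: old_parse stands in for the existing parser with an
arbitrary loose format, and the regular expression is only a
placeholder for the strictly limited input set from the earlier
proposal, which is not reproduced in this message.

```python
import re
import warnings
from datetime import datetime

# Placeholder grammar for the "strict" set (an assumption, not the proposal).
_STRICT = re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}")

def new_parse(text):
    """New code: accept only the strict forms, treating 'T' and ' ' alike."""
    if not _STRICT.fullmatch(text):
        return None
    try:
        return datetime.strptime(text.replace("T", " "), "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return None

def old_parse(text):
    """Hypothetical stand-in for the existing parser (one loose format)."""
    try:
        return datetime.strptime(text, "%m/%d/%Y")
    except ValueError:
        return None

def parse_date(text):
    """Old logic first, then the new code; warn on non-strict input."""
    result = old_parse(text)
    if result is not None:
        if new_parse(text) is None:
            warnings.warn("deprecated date format: %r" % text,
                          DeprecationWarning, stacklevel=2)
        return result
    return new_parse(text)
```

Note that new_parse is the same size whether it is used only as a
fallback or as the sole parser, which is the point made above about the
complexity of step (2).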

--ben

Received on Fri Dec 19 23:21:19 2003
