
Re: RFC: date parser strawman

From: mark benedetto king <mbk_at_lowlatency.com>
Date: 2003-12-28 18:37:28 CET

On Sun, Dec 21, 2003 at 12:42:24AM -0500, mark benedetto king wrote:
> Here's a quick patch for review.
>

Here's an updated one, with an overhaul^W^Wimprovements based on
feedback from Brane and Greg Hudson. To save bandwidth, I have not
included the diffs for the deleted files.

Remove the getdate.y-based date-parser, replacing it with a simpler
though less powerful one.

* subversion/libsvn_subr/date.c: New file.
* subversion/libsvn_subr/getdate.y: Deleted.
* subversion/libsvn_subr/getdate.cw: Deleted.
* subversion/include/svn_time.h:
  (struct getdate_time): Removed.
  (svn_parse_date): Update prototype and documentation comment to reflect
   the new interface.
* subversion/libsvn_subr/opt.c
  (parse_one_rev): Add pool parameter, and use the new svn_parse_date
   interface.
  (svn_opt_parse_revision): Pass pool into parse_one_rev().
* autogen.sh:
  Remove generation of getdate.c from getdate.y.
* INSTALL:
  Remove bison/yacc dependency, renumber subsequent dependencies.
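
For anyone who wants the gist of the template approach before reading
date.c, here is a minimal standalone sketch in plain C. It is not the
code from the patch: it leaves out APR, the fractional-second fields and
the range checks, and every sketch_* name is hypothetical. It only tries
to show how '[' marks an optional tail of the template and how the 'O'
and 'o' characters feed the timezone offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the patch's match_state. */
struct sketch_date
{
  int year, mon, mday, hour, min, sec;
  int offhours, offminutes;
  int tzind;                    /* 0 if absent, else '+', '-' or 'Z' */
};

/* Return the field that template character T accumulates into,
   or NULL if T is not a digit-accumulating character. */
static int *
sketch_field (struct sketch_date *d, char t)
{
  switch (t)
    {
    case 'Y': return &d->year;
    case 'M': return &d->mon;
    case 'D': return &d->mday;
    case 'h': return &d->hour;
    case 'm': return &d->min;
    case 's': return &d->sec;
    case 'O': return &d->offhours;
    case 'o': return &d->offminutes;
    default:  return NULL;
    }
}

/* Return nonzero if template character T accepts value character V. */
static int
sketch_accepts (char t, char v)
{
  if (t != '\0' && strchr ("YMDhmsOo", t) != NULL)
    return v >= '0' && v <= '9';
  if (t == '+')
    return v == '+' || v == '-';
  return t == v;                /* literal such as '-', ':' or 'T' */
}

/* Match VALUE against TMPL, filling in D.  Return 1 on match, else 0. */
static int
sketch_match (struct sketch_date *d, const char *tmpl, const char *value)
{
  assert (d != NULL && tmpl != NULL && value != NULL);
  memset (d, 0, sizeof (*d));

  for (;;)
    {
      char t = *tmpl++;
      char v = *value;
      int *field;

      if (t == '\0')
        return v == '\0';       /* both strings must be exhausted */
      if (t == ']')
        continue;               /* only a resume marker */
      if (t == '[')
        {
          if (v == '\0')
            return 1;           /* the optional tail is absent */
          if (!sketch_accepts (*tmpl, v))
            {
              /* Resume just after the next ']' in the template. */
              const char *close = strchr (tmpl, ']');
              if (close == NULL)
                return 0;
              tmpl = close + 1;
            }
          continue;
        }
      if (!sketch_accepts (t, v))
        return 0;
      value++;
      field = sketch_field (d, t);
      if (field != NULL)
        *field = *field * 10 + (v - '0');
      else if (t == '+' || t == 'Z')
        d->tzind = v;
    }
}

/* Seconds east of GMT, falling back to LOCAL_GMTOFF when the value
   carried no timezone indicator. */
static int
sketch_gmtoff (const struct sketch_date *d, int local_gmtoff)
{
  int secs = d->offhours * 3600 + d->offminutes * 60;

  switch (d->tzind)
    {
    case '+': return secs;
    case '-': return -secs;
    case 'Z': return 0;
    default:  return local_gmtoff;
    }
}
```

With the template "YYYY-MM-DD[Thh[:mm[:ss]+OO[:oo]", the value
"2003-12-28T18:37:28+01:00" should match with an offset of 3600 seconds,
while a bare "2003-12-28" is accepted at the first '[' and falls back to
the caller's local offset.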

Index: subversion/include/svn_time.h
===================================================================
--- subversion/include/svn_time.h (revision 8109)
+++ subversion/include/svn_time.h (working copy)
@@ -50,17 +50,15 @@
 const char *svn_time_to_human_cstring (apr_time_t when, apr_pool_t *pool);
 
 
-/** Needed by @c getdate.y parser. */
-struct getdate_time {
- time_t time;
- short timezone;
-};
-
-/** The one interface in our @c getdate.y parser; convert
- * human-readable date @a text into a standard C @c time_t. The 2nd
- * argument is unused; we always pass @c NULL.
+/** Convert a human-readable date @a value into an @c apr_time_t, using
+ * @a now as the current time and @a gmtoff as an indicator of
+ * seconds east of GMT, storing the result in @a val. Set @a matched
+ * to indicate whether or not @a value was parsed successfully. Perform
+ * any allocation in @a pool.
  */
-time_t svn_parse_date (char *text, struct getdate_time *now);
+svn_error_t *
+svn_parse_date (svn_boolean_t *matched, apr_time_t *val, const char *value,
+ apr_time_t now, apr_int32_t gmtoff, apr_pool_t *pool);
 
 
 /** Sleep until the next second, to ensure that any files modified
Index: subversion/libsvn_subr/opt.c
===================================================================
--- subversion/libsvn_subr/opt.c (revision 8109)
+++ subversion/libsvn_subr/opt.c (working copy)
@@ -288,25 +288,41 @@
 
 /* Parse one revision specification. Return pointer to character
    after revision, or NULL if the revision is invalid. Modifies
- str, so make sure to pass a copy of anything precious. */
-static char *parse_one_rev (svn_opt_revision_t *revision, char *str)
+ str, so make sure to pass a copy of anything precious. Uses
+ POOL for temporary allocation. */
+static char *parse_one_rev (svn_opt_revision_t *revision, char *str,
+ apr_pool_t *pool)
 {
   char *end, save;
- time_t tm;
 
   if (*str == '{')
     {
+ apr_time_t now;
+ apr_time_exp_t nowexp;
+ svn_boolean_t matched;
+ apr_time_t tm;
+ svn_error_t *err;
+
       /* Brackets denote a date. */
       str++;
       end = strchr (str, '}');
       if (!end)
         return NULL;
       *end = '\0';
- tm = svn_parse_date (str, NULL);
- if (tm == -1)
+ now = apr_time_now();
+ if (apr_time_exp_lt (&nowexp, now) != APR_SUCCESS)
         return NULL;
+ err = svn_parse_date (&matched, &tm, str, now, nowexp.tm_gmtoff,
+ pool);
+ if (err)
+ {
+ svn_error_clear (err);
+ return NULL;
+ }
+ if (!matched)
+ return NULL;
       revision->kind = svn_opt_revision_date;
- apr_time_ansi_put (&(revision->value.date), tm);
+ revision->value.date = tm;
       return end + 1;
     }
   else if (apr_isdigit (*str))
@@ -350,11 +366,11 @@
   /* Operate on a copy of the argument. */
   left_rev = apr_pstrdup (pool, arg);
 
- right_rev = parse_one_rev (start_revision, left_rev);
+ right_rev = parse_one_rev (start_revision, left_rev, pool);
   if (right_rev && *right_rev == ':')
     {
       right_rev++;
- end = parse_one_rev (end_revision, right_rev);
+ end = parse_one_rev (end_revision, right_rev, pool);
       if (!end || *end != '\0')
         return -1;
     }
Index: subversion/libsvn_subr/date.c
===================================================================
--- subversion/libsvn_subr/date.c (revision 0)
+++ subversion/libsvn_subr/date.c (revision 0)
@@ -0,0 +1,230 @@
+#include <stdlib.h> /* abs */
+#include <string.h> /* memset, strchr */
+#include <apr_lib.h>
+#include <svn_time.h>
+
+/* Valid rule actions */
+enum rule_action {
+ ACCUM, /* Accumulate a decimal value */
+ MILLI, /* Accumulate fractional seconds (as microseconds) */
+ TZIND, /* Handle +, -, Z */
+ NOOP, /* Do nothing */
+ SKIPFROM, /* If at end-of-value, accept the match. Otherwise,
+ if the next template character matches the current
+ value character, continue processing as normal.
+ Otherwise, attempt to complete matching starting
+ immediately after the first subsequent occurrence of
+ ']' in the template. */
+ SKIP, /* Ignore this template character */
+ ACCEPT /* Accept the value */
+};
+
+/* How to handle a particular character in a template */
+typedef struct
+{
+ char key; /* The template char that this rule matches */
+ const char *valid; /* String of valid chars for this rule */
+ enum rule_action action; /* What action to take when the rule is matched */
+ int offset; /* Where to store any results of the action,
+ relative to the base of an apr_time_exp_t */
+} rule;
+
+
+/* The parsed values, before localtime/gmt processing */
+typedef struct
+{
+ apr_time_exp_t base;
+ apr_int32_t offhours;
+ apr_int32_t offminutes;
+} match_state;
+
+#define DIGITS "0123456789"
+
+/* A declarative specification of how each template character
+ should be processed, using a rule for each valid symbol. */
+static const rule
+rules[] =
+{
+ { 'Y', DIGITS, ACCUM, APR_OFFSETOF (match_state, base.tm_year) },
+ { 'M', DIGITS, ACCUM, APR_OFFSETOF (match_state, base.tm_mon) },
+ { 'D', DIGITS, ACCUM, APR_OFFSETOF (match_state, base.tm_mday) },
+ { 'h', DIGITS, ACCUM, APR_OFFSETOF (match_state, base.tm_hour) },
+ { 'm', DIGITS, ACCUM, APR_OFFSETOF (match_state, base.tm_min) },
+ { 's', DIGITS, ACCUM, APR_OFFSETOF (match_state, base.tm_sec) },
+ { 'u', DIGITS, MILLI, APR_OFFSETOF (match_state, base.tm_usec) },
+ { 'O', DIGITS, ACCUM, APR_OFFSETOF (match_state, offhours) },
+ { 'o', DIGITS, ACCUM, APR_OFFSETOF (match_state, offminutes) },
+ { '+', "-+", TZIND, 0 },
+ { 'Z', "Z", TZIND, 0 },
+ { ':', ":", NOOP, 0 },
+ { '-', "-", NOOP, 0 },
+ { 'T', "T", NOOP, 0 },
+ { ' ', " ", NOOP, 0 },
+ { '.', ".,", NOOP, 0 },
+ { '[', NULL, SKIPFROM, 0 },
+ { ']', NULL, SKIP, 0 },
+ { '\0', "", ACCEPT, 0 },
+};
+
+/* Return the rule associated with TCHAR, or NULL if there
+ is no such rule */
+static const rule *
+find_rule (char tchar)
+{
+ int i = sizeof (rules)/sizeof (rules[0]);
+ while (i--)
+ if (rules[i].key == tchar)
+ return &rules[i];
+ return NULL;
+}
+
+/* Attempt to match the date-string in VALUE to the provided TEMPLATE,
+ using the rules defined above. Use GMTOFF to bias the results
+ when no timezone indicator is present. On successful match, store
+ the matched values in EXP, and return TRUE. Otherwise, return
+ FALSE. */
+static svn_boolean_t
+template_match (apr_time_exp_t *exp, const char *template, const char *value,
+ apr_int32_t gmtoff)
+{
+ int multiplier = 100000;
+ int tzind = 0;
+ match_state ms;
+ char *base = (char *)&ms;
+
+ memset (&ms, 0, sizeof (ms));
+
+ for (;;)
+ {
+ const rule *match = find_rule (*template++);
+ char vchar = *value++;
+ apr_int32_t *place;
+
+ if (!match || (match->valid && !strchr (match->valid, vchar)))
+ return FALSE;
+
+ place = (apr_int32_t *)(base + match->offset);
+ switch (match->action)
+ {
+ case ACCUM:
+ *place = *place * 10 + vchar - '0';
+ continue;
+ case MILLI:
+ *place += (vchar - '0') * multiplier;
+ multiplier /= 10;
+ continue;
+ case TZIND:
+ tzind = vchar;
+ continue;
+ case SKIP:
+ value--;
+ continue;
+ case NOOP:
+ continue;
+ case SKIPFROM:
+ if (!vchar)
+ break;
+ match = find_rule (*template);
+ if (!strchr (match->valid, vchar))
+ template = strchr (template, ']') + 1;
+ value--;
+ continue;
+ case ACCEPT:
+ break;
+ }
+
+ break;
+ }
+
+ switch (tzind)
+ {
+ case 0:
+ ms.base.tm_gmtoff = gmtoff;
+ break;
+ case '+':
+ ms.base.tm_gmtoff = ms.offhours * 3600 + ms.offminutes * 60;
+ break;
+ case '-':
+ ms.base.tm_gmtoff = -(ms.offhours * 3600 + ms.offminutes * 60);
+ break;
+ }
+
+ *exp = ms.base;
+ return TRUE;
+}
+
+static int
+valid_days_by_month[] = {
+ 31, 29, 31, 30,
+ 31, 30, 31, 31,
+ 30, 31, 30, 31
+};
+
+svn_error_t *
+svn_parse_date (svn_boolean_t *matched, apr_time_t *val, const char *value,
+ apr_time_t now, apr_int32_t gmtoff, apr_pool_t *pool)
+{
+ apr_time_exp_t exp;
+ apr_status_t apr_err;
+
+ *matched = FALSE;
+
+ if (template_match (&exp, /* try ISO-8601 extended, UTC */
+ "YYYY-MM-DD[Thh[:mm[:ss[.u[u[u[u[u[u][Z]",
+ value, gmtoff)
+ || template_match (&exp, /* try ISO-8601 extended, with offset */
+ "YYYY-MM-DD[Thh[:mm[:ss[.u[u[u[u[u[u]+OO[:oo]",
+ value, gmtoff)
+ || template_match (&exp, /* try ISO-8601 basic, UTC */
+ "YYYYMMDD[Thh[mm[ss[.u[u[u[u[u[u][Z]",
+ value, gmtoff)
+ || template_match (&exp, /* try ISO-8601 basic, with offset */
+ "YYYYMMDD[Thh[mm[ss[.u[u[u[u[u[u]+OO[oo]",
+ value, gmtoff)
+ || template_match (&exp, /* try the "svn log" format */
+ "YYYY-MM-DD hh:mm:ss[.u[u[u[u[u[u][ +OO[oo]",
+ value, gmtoff))
+ {
+ exp.tm_year -= 1900;
+ exp.tm_mon -= 1;
+ }
+ else if (template_match (&exp, /* try just a time */
+ "hh:mm[:ss[.u[u[u[u[u[u]",
+ value, gmtoff))
+ {
+ apr_time_exp_t expnow;
+ apr_err = apr_time_exp_tz (&expnow, now, gmtoff);
+ if (apr_err != APR_SUCCESS)
+ return svn_error_wrap_apr (apr_err, "Can't manipulate current date");
+ exp.tm_year = expnow.tm_year;
+ exp.tm_mon = expnow.tm_mon;
+ exp.tm_mday = expnow.tm_mday;
+ }
+ else
+ return SVN_NO_ERROR;
+
+ /* range validation, allowing for leap seconds */
+ if (exp.tm_mon < 0 || exp.tm_mon > 11
+ || exp.tm_mday < 1 || exp.tm_mday > valid_days_by_month[exp.tm_mon]
+ || exp.tm_hour > 23
+ || exp.tm_min > 59
+ || exp.tm_sec > 60
+ || abs (exp.tm_gmtoff) > 86399
+ || (abs (exp.tm_gmtoff/60)%60) > 59)
+ return SVN_NO_ERROR;
+
+ /* february/leap-year day checking. tm_year is bias-1900, so a tm_year
+ congruent to 100 (mod 400) is a calendar year divisible by 400. */
+ if (exp.tm_mon == 1
+ && exp.tm_mday == 29
+ && (exp.tm_year % 4 != 0
+ || (exp.tm_year % 100 == 0 && exp.tm_year % 400 != 100)))
+ return SVN_NO_ERROR;
+
+ apr_err = apr_time_exp_gmt_get (val, &exp);
+ if (apr_err != APR_SUCCESS)
+ return svn_error_wrap_apr (apr_err, "Can't calculate requested date");
+
+ *matched = TRUE;
+ return SVN_NO_ERROR;
+}
Index: INSTALL
===================================================================
--- INSTALL (revision 8109)
+++ INSTALL (working copy)
@@ -154,25 +154,8 @@
       newer. The autogen.sh script knows about that.
 
 
- 4. bison or yacc (Unix only)
+ 4. Neon library 0.24.4 (http://www.webdav.org/neon/)
 
- This is required only if you plan to build from the latest source
- (see section II.B), which you probably want to do. See above.
-
- The reason one of these programs is required is that it will
- generate the code which parses complex date formats, so that
- Subversion can work with dates like "yesterday" and "last month"
- and "four hours ago". Note that most modern Unices come with one
- or the other of these programs, and only one is required.
-
- The reason you don't need one of these programs on a Windows
- platform is that the date parsing file has been pregenerated
- and will automatically be copied into place by the Windows
- Build.
-
-
- 5. Neon library 0.24.4 (http://www.webdav.org/neon/)
-
       The Neon library allows a Subversion client to interact with remote
       repositories over the Internet via a WebDAV based protocol. If you
       want to use Subversion to connect to a server over ra_dav (via a
@@ -197,7 +180,7 @@
       subdirectory beneath wherever "--with-neon" is pointed.
 
 
- 6. Berkeley DB 4.2.52
+ 5. Berkeley DB 4.2.52
 
       Berkeley DB is needed to build a Subversion server, or to access
       a repository on local disk. If you are only interested in
@@ -225,7 +208,7 @@
           http://subversion.tigris.org/servlets/ProjectDocumentList
 
 
- 7. Apache Web Server 2.0.48 or newer
+ 6. Apache Web Server 2.0.48 or newer
           (http://httpd.apache.org/download.cgi)
 
       The Apache HTTP server is required if you wish to offer your
@@ -236,7 +219,7 @@
       done: See section III for details.
 
 
- 8. Python 2.0 (http://www.python.org/)
+ 7. Python 2.0 (http://www.python.org/)
 
       If you want to run "make check" or build from the latest source
       under Unix as described in section II.B and III.D, install
@@ -245,21 +228,21 @@
       system.
 
 
- 9. Visual C++ 6.0 or newer (Windows Only)
+ 8. Visual C++ 6.0 or newer (Windows Only)
 
       To build Subversion under any of the MS Windows platforms, you
       will need a copy of Microsoft Visual C++. You can generate the
       project files using the gen-make.py script.
 
 
- 10. Perl 5.8 or newer (Windows only)
+ 9. Perl 5.8 or newer (Windows only)
 
       To build Subversion under any of the MS Windows platforms, you
       will also need Perl 5.8 or newer to run apr-util's w32locatedb.pl
       script.
 
 
- 11. Libraries for our libraries
+ 10. Libraries for our libraries
 
       Some of the libraries that Subversion depends on themselves have
       optional dependencies that can add features to what Subversion
@@ -321,7 +304,7 @@
       libraries.
 
 
- 12. Building The Documentation
+ 11. Building The Documentation
 
       The master source format for Subversion's documentation is
       Docbook Lite. See doc/book/README for instructions how to
Index: autogen.sh
===================================================================
--- autogen.sh (revision 8109)
+++ autogen.sh (working copy)
@@ -70,23 +70,6 @@
 # any old aclocal.m4 left over from prior build so it doesn't cause errors.
 rm -f aclocal.m4
 
-# Produce getdate.c from getdate.y.
-# Again, this means that "developers" who run autogen.sh need either
-# yacc or bison -- but not people who compile sourceballs, since `make
-# dist` will include getdate.c.
-echo "Creating getdate.c..."
-bison -o subversion/libsvn_subr/getdate.c subversion/libsvn_subr/getdate.y
-if [ $? -ne 0 ]; then
- yacc -o subversion/libsvn_subr/getdate.c subversion/libsvn_subr/getdate.y
- if [ $? -ne 0 ]; then
- echo
- echo " Error: can't find either bison or yacc."
- echo " One of these is needed to generate the date parser."
- echo
- exit 1
- fi
-fi
-
 # Create the file detailing all of the build outputs for SVN.
 #
 # Note: this dependency on Python is fine: only SVN developers use autogen.sh
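
The leap-year test in svn_parse_date works on the bias-1900 tm_year
rather than on the calendar year, which is easy to misread. Here is a
small standalone sketch of the arithmetic, assuming non-negative
tm_year. The function names are hypothetical and this is not code from
the patch. Since 1900 is a multiple of both 4 and 100, the "% 4" and
"% 100" tests are unaffected by the bias; since 1900 % 400 == 300, a
calendar year divisible by 400 corresponds to tm_year % 400 == 100.

```c
#include <assert.h>

/* Ordinary Gregorian leap-year rule on a calendar year. */
static int
sketch_is_leap (int year)
{
  return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
}

/* The same rule expressed on TM_YEAR, i.e. years since 1900.
   Assumes TM_YEAR >= 0: C's % yields a negative remainder for a
   negative operand, which would break the "== 100" comparison. */
static int
sketch_is_leap_biased (int tm_year)
{
  assert (tm_year >= 0);
  return tm_year % 4 == 0
         && (tm_year % 100 != 0 || tm_year % 400 == 100);
}
```

For example, tm_year 100 (the year 2000) is a leap year under this rule,
while tm_year 0 (1900) and tm_year 200 (2100) are not.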

Received on Sun Dec 28 18:37:41 2003

This is an archived mail posted to the Subversion Dev mailing list.
