[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

[PATCH] Optionally use deltas in dumpfiles

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2004-03-28 07:58:50 CEST

The problem: the output of "svnadmin dump" is useful as a
platform-independent backup format, as a way of transporting new
revisions across the net, and for other purposes. But for some
workloads it's too big, because it dumps all the properties and all
the plaintext for each changed node-revision.

The following patch--another thousand-line ghudson weekend code bomb,
because I know you all love them so much--implements a new "svnadmin
dump --deltas" option to cause dumps to be made using deltas.
Although the dump format number has to be bumped to make this happen,
dumps made without the new option will continue to use the old format
number so that they can be read with Subversion 1.0.x.

Non-incremental dumps which start at a rev other than 0 will not use
deltas for the first revision.

After making this change, I'm not a big fan of the way the dumpfile
parser exports the parser interface. I doubt anyone will ever use
that functionality, so it just makes it harder to add new features to
the dump format. (Also, what's the point of exporting
svn_repos_get_fs_build_parser, when the only thing you can do with it
is call svn_repos_parse_dumpstream on it, and doing that is exactly
the same as calling svn_repos_load_fs?)

I'm soliciting comments on this patch before I commit it because it's
a significant change, so let me know what you think. (I'd also like
to look into adding some test cases for the new code, and to get some
performance numbers on a dump of the Subversion repository.) I've
prepared the patch with -uw for clarity, since some code moved due to
changing function names or the introduction of conditionals.

* notes/fs_dumprestore.txt: Document the version 3 dump format.
* subversion/svnadmin/main.c
  (svnadmin__deltas, options, cmd_table, svnadmin_opt_state, main):
   Add new "deltas" option.
  (subcommand_dump): Use new svn_repos_dump_fs2 function; pass true
   for new use_deltas parameter if --deltas specified.
* subversion/include/svn_repos.h
  (SVN_REPOS_DUMPFILE_FORMAT_VERSION): Bump. However, we will
   continue to use the previous version except when doing a delta
   dump.
  (SVN_REPOS_DUMPFILE_PROP_DELTA, SVN_REPOS_DUMPFILE_TEXT_DELTA);
   New headers to signify whether the property and text contents are
   deltas.
  (svn_repos_dump_fs2): New variant of svn_repos_dump_fs with a new
   parameter use_deltas.
  (svn_repos_parse_fns2_t): New variant of svn_repos_parse_fns_t
   with added callbacks delete_node_property and apply_textdelta,
   to handle dumpstreams using deltas.
  (svn_repos_parse_dumpstream2): New variant of
   svn_repos_parse_dumpstream, accepts an svn_repos_parse_fns2_t.
  (svn_repos_get_fs_build_parser2): New variant of
   svn_repos_get_fs_build_parser, yields an svn_repos_parse_fns2_t.
  (svn_repos_dump_fs, svn_repos_parse_fns_t,
   svn_repos_parse_dumpstream, svn_repos_get_fs_build_parser):
   Deprecate.
* subversion/libsvn_repos/load.c
  (fns2_from_fs): Utility function to convert old-style callbacks to
   new-style callbacks, for API transition.
  (parse_property_block): Accept new-style callbacks. For node
   property blocks, recognize "D <keylen>\n<key>\n" to mean a deleted
   property.
  (parse_text_block): Accept new-style callbacks and new parameter
   is_delta. If is_delta is true, treat the text block as svndiff
   data.
  (svn_repos_parse_dumpstream2): Renamed from
   svn_repos_parse_dumpstream. Accept new-style callbacks. If given
   old-style callbacks via deprecated function, reject current-format
   dumpstreams. Recognize delta property and text blocks.
  (svn_repos_parse_dumpstream): Implement deprecated function using
   fns2_from_fns and svn_repos_parse_dumpstream2.
  (delete_node_property, apply_textdelta): New callbacks to handle
   delta property and text blocks.
  (svn_repos_get_fs_build_parser2): Implement using deprecated
   svn_repos_get_fs_build_parser and fns2_from_fns. This is sorta
   backwards, but this way we can convert in the same direction as
   svn_repos_parse_dumpstream2 does.
  (svn_repos_load_fs): Use new parsing functions so that we can load
   dumps which use deltas.
* subversion/libsvn_repos/dump.c
  (write_hash_to_stringbuf): Accept an oldhash parameter; if given,
   don't write unchanged props, and do write deleted props.
  (store_delta): New helper function to compute a delta, store it
   to a temporary file, and then rewind and return the file.
  (edit_baton): Add use_deltas field.
  (dump_node): If use_deltas is set, output delta prop and text
   blocks. Also simplify the determination of whether we must dump
   props and text a bit.
  (get_dump_editor): Add use_deltas flag and store it in the editor
   baton.
  (write_revision_record): Adjust call to write_hash_to_stringbuf.
  (svn_repos_dump_fs2): Renamed from svn_repos_dump_fs. Accept new
   use_deltas flag. If not set, use the old format version so that
   dumps can still be read by svn 1.0.x. Pass use_deltas flag to
   get_dump_editor, except for the first rev of a non-incremental
   dump.
  (svn_repos_dump_fs): Implement deprecated function in terms of
   svn_repos_dump_fs.

Index: notes/fs_dumprestore.txt
===================================================================
--- notes/fs_dumprestore.txt (revision 9220)
+++ notes/fs_dumprestore.txt (working copy)
@@ -199,4 +199,36 @@
 
     This should be used to indicate the UUID of the originating repository.
 
+-------------------------------------
 
+SVN DUMPFILE VERSION 3 FORMAT
+
+This format is equivalent to the VERSION 2 format except for the
+following:
+
+1.) The format starts with the new version number of the dump format
+ ("SVN-fs-dump-format-version: 3\n").
+
+2.) There are two new optional headers for node changes:
+
+[Text-delta: true|false]
+[Prop-delta: true|false]
+
+ The default value for these headers is "false". If the value is
+ set to "true", then the text and property contents will be treated
+ as deltas against the previous contents of the node (as determined
+ by copy history for adds with history, or by the value in the
+ previous revision for changes--just as with commits).
+
+Property deltas have the same format as regular property lists except
+that (1) properties with the same value as in the previous contents of
+the node are not printed, and (2) deleted properties will be written
+out as
+
+D <name length>
+<name>
+
+just as a regular property is printed, but with the "K " changed to a
+"D " and with no value part.
+
+Text deltas are written out as a series of svndiff windows.
Index: subversion/svnadmin/main.c
===================================================================
--- subversion/svnadmin/main.c (revision 9220)
+++ subversion/svnadmin/main.c (working copy)
@@ -154,6 +154,7 @@
   {
     svnadmin__version = SVN_OPT_FIRST_LONGOPT_ID,
     svnadmin__incremental,
+ svnadmin__deltas,
     svnadmin__ignore_uuid,
     svnadmin__force_uuid,
     svnadmin__parent_dir,
@@ -188,6 +189,9 @@
     {"incremental", svnadmin__incremental, 0,
      "dump incrementally"},
 
+ {"deltas", svnadmin__deltas, 0,
+ "use deltas in dump output"},
+
     {"bypass-hooks", svnadmin__bypass_hooks, 0,
      "bypass the repository hook system"},
 
@@ -247,7 +251,7 @@
      "revision trees. If only LOWER is given, dump that one revision tree.\n"
      "If --incremental is passed, then the first revision dumped will be\n"
      "a diff against the previous revision, instead of the usual fulltext.\n",
- {'r', svnadmin__incremental, 'q'} },
+ {'r', svnadmin__incremental, svnadmin__deltas, 'q'} },
 
     {"help", subcommand_help, {"?", "h"},
      "usage: svnadmin help [SUBCOMMAND...]\n\n"
@@ -329,6 +333,7 @@
   svn_boolean_t help; /* --help or -? */
   svn_boolean_t version; /* --version */
   svn_boolean_t incremental; /* --incremental */
+ svn_boolean_t use_deltas; /* --deltas */
   svn_boolean_t quiet; /* --quiet */
   svn_boolean_t bdb_txn_nosync; /* --bdb-txn-nosync */
   svn_boolean_t bdb_log_keep; /* --bdb-log-keep */
@@ -492,9 +497,10 @@
     SVN_ERR (create_stdio_stream (&stderr_stream,
                                   apr_file_open_stderr, pool));
 
- SVN_ERR (svn_repos_dump_fs (repos, stdout_stream, stderr_stream,
+ SVN_ERR (svn_repos_dump_fs2 (repos, stdout_stream, stderr_stream,
                               lower, upper, opt_state->incremental,
- check_cancel, NULL, pool));
+ opt_state->use_deltas, check_cancel, NULL,
+ pool));
 
   return SVN_NO_ERROR;
 }
@@ -919,6 +925,9 @@
       case svnadmin__incremental:
         opt_state.incremental = TRUE;
         break;
+ case svnadmin__deltas:
+ opt_state.use_deltas = TRUE;
+ break;
       case svnadmin__ignore_uuid:
         opt_state.uuid_action = svn_repos_load_uuid_ignore;
         break;
Index: subversion/include/svn_repos.h
===================================================================
--- subversion/include/svn_repos.h (revision 9220)
+++ subversion/include/svn_repos.h (working copy)
@@ -823,7 +823,7 @@
 
 /* The RFC822-style headers in our dumpfile format. */
 #define SVN_REPOS_DUMPFILE_MAGIC_HEADER "SVN-fs-dump-format-version"
-#define SVN_REPOS_DUMPFILE_FORMAT_VERSION 2
+#define SVN_REPOS_DUMPFILE_FORMAT_VERSION 3
 #define SVN_REPOS_DUMPFILE_UUID "UUID"
 #define SVN_REPOS_DUMPFILE_CONTENT_LENGTH "Content-length"
 
@@ -839,6 +839,8 @@
 
 #define SVN_REPOS_DUMPFILE_PROP_CONTENT_LENGTH "Prop-content-length"
 #define SVN_REPOS_DUMPFILE_TEXT_CONTENT_LENGTH "Text-content-length"
+#define SVN_REPOS_DUMPFILE_PROP_DELTA "Prop-delta"
+#define SVN_REPOS_DUMPFILE_TEXT_DELTA "Text-delta"
 
 /** The different "actions" attached to nodes in the dumpfile. */
 enum svn_node_action
@@ -871,10 +873,35 @@
  * against the previous revision (usually it looks like a full dump of
  * the tree).
  *
+ * If @a use_deltas is @c TRUE, output only node properties which have
+ * changed relative to the previous contents, and output text contents
+ * as svndiff data against the previous contents. Regardless of how
+ * this flag is set, the first revision of a non-incremental dump will
+ * be done with full plain text. A dump with @a use_deltas set cannot
+ * be loaded by Subversion 1.0.x.
+ *
  * If @a cancel_func is not @c NULL, it is called periodically with
  * @a cancel_baton as argument to see if the client wishes to cancel
  * the dump.
  */
+svn_error_t *svn_repos_dump_fs2 (svn_repos_t *repos,
+ svn_stream_t *dumpstream,
+ svn_stream_t *feedback_stream,
+ svn_revnum_t start_rev,
+ svn_revnum_t end_rev,
+ svn_boolean_t incremental,
+ svn_boolean_t use_deltas,
+ svn_cancel_func_t cancel_func,
+ void *cancel_baton,
+ apr_pool_t *pool);
+
+
+/**
+ * @deprecated Provided for backward compatibility with the 1.0.0 API.
+ *
+ * Similar to svn_repos_dump_fs2(), but with the @a use_deltas
+ * parameter always set to @c FALSE.
+ */
 svn_error_t *svn_repos_dump_fs (svn_repos_t *repos,
                                 svn_stream_t *dumpstream,
                                 svn_stream_t *feedback_stream,
@@ -924,8 +951,8 @@
                                 apr_pool_t *pool);
 
 
-/** A vtable that is driven by @c svn_repos_parse_dumpstream. */
-typedef struct svn_repos_parse_fns_t
+/** A vtable that is driven by @c svn_repos_parse_dumpstream2. */
+typedef struct svn_repos_parse_fns2_t
 {
   /** The parser has discovered a new revision record within the
    * parsing session represented by @a parse_baton. All the headers are
@@ -968,6 +995,9 @@
                                      const char *name,
                                      const svn_string_t *value);
 
+ /** For a given @a node_baton, delete property @a name. */
+ svn_error_t *(*delete_node_property) (void *node_baton, const char *name);
+
   /** For a given @a node_baton, remove all properties. */
   svn_error_t *(*remove_node_props) (void *node_baton);
 
@@ -982,6 +1012,19 @@
   svn_error_t *(*set_fulltext) (svn_stream_t **stream,
                                 void *node_baton);
 
+ /** For a given @a node_baton, set @a handler and @a handler_baton
+ * to a window handler and baton capable of receiving a delta
+ * against the node's previous contents. A NULL window will be
+ * sent to the handler after all the windows are sent.
+ *
+ * If a @c NULL is returned instead of a handler, the vtable is
+ * indicating that no delta is desired, and the parser will not
+ * attempt to send it.
+ */
+ svn_error_t *(*apply_textdelta) (svn_txdelta_window_handler_t *handler,
+ void **handler_baton,
+ void *node_baton);
+
   /** The parser has reached the end of the current node represented by
    * @a node_baton, it can be freed.
    */
@@ -993,7 +1036,7 @@
    */
   svn_error_t *(*close_revision) (void *revision_baton);
 
-} svn_repos_parser_fns_t;
+} svn_repos_parser_fns2_t;
 
 
 
@@ -1020,8 +1063,8 @@
  * but still allow expansion of the format: most headers are ignored.
  */
 svn_error_t *
-svn_repos_parse_dumpstream (svn_stream_t *stream,
- const svn_repos_parser_fns_t *parse_fns,
+svn_repos_parse_dumpstream2 (svn_stream_t *stream,
+ const svn_repos_parser_fns2_t *parse_fns,
                             void *parse_baton,
                             svn_cancel_func_t cancel_func,
                             void *cancel_baton,
@@ -1045,6 +1088,71 @@
  *
  */
 svn_error_t *
+svn_repos_get_fs_build_parser2 (const svn_repos_parser_fns2_t **parser,
+ void **parse_baton,
+ svn_repos_t *repos,
+ svn_boolean_t use_history,
+ enum svn_repos_load_uuid uuid_action,
+ svn_stream_t *outstream,
+ const char *parent_dir,
+ apr_pool_t *pool);
+
+
+/**
+ * @deprecated Provided for backward compatibility with the 1.0.0 API.
+ *
+ * A vtable that is driven by @c svn_repos_parse_dumpstream. Lacks
+ * the delete_node_property and apply_textdelta callbacks.
+ */
+typedef struct svn_repos_parse_fns_t
+{
+ svn_error_t *(*new_revision_record) (void **revision_baton,
+ apr_hash_t *headers,
+ void *parse_baton,
+ apr_pool_t *pool);
+ svn_error_t *(*uuid_record) (const char *uuid,
+ void *parse_baton,
+ apr_pool_t *pool);
+ svn_error_t *(*new_node_record) (void **node_baton,
+ apr_hash_t *headers,
+ void *revision_baton,
+ apr_pool_t *pool);
+ svn_error_t *(*set_revision_property) (void *revision_baton,
+ const char *name,
+ const svn_string_t *value);
+ svn_error_t *(*set_node_property) (void *node_baton,
+ const char *name,
+ const svn_string_t *value);
+ svn_error_t *(*remove_node_props) (void *node_baton);
+ svn_error_t *(*set_fulltext) (svn_stream_t **stream,
+ void *node_baton);
+ svn_error_t *(*close_node) (void *node_baton);
+ svn_error_t *(*close_revision) (void *revision_baton);
+} svn_repos_parser_fns_t;
+
+
+/**
+ * @deprecated Provided for backward compatibility with the 1.0.0 API.
+ *
+ * Similar to svn_repos_parse_dumpstream2, but uses the more limited
+ * svn_repos_parser_fns_t vtable type.
+ */
+svn_error_t *
+svn_repos_parse_dumpstream (svn_stream_t *stream,
+ const svn_repos_parser_fns_t *parse_fns,
+ void *parse_baton,
+ svn_cancel_func_t cancel_func,
+ void *cancel_baton,
+ apr_pool_t *pool);
+
+
+/**
+ * @deprecated Provided for backward compatibility with the 1.0.0 API.
+ *
+ * Similar to svn_repos_get_fs_build_parser2, but yields the more
+ * limited svn_repos_parser_fns_t vtable type.
+ */
+svn_error_t *
 svn_repos_get_fs_build_parser (const svn_repos_parser_fns_t **parser,
                                void **parse_baton,
                                svn_repos_t *repos,
@@ -1053,6 +1161,8 @@
                                svn_stream_t *outstream,
                                const char *parent_dir,
                                apr_pool_t *pool);
+
+
 /** @} */
 
 #ifdef __cplusplus
Index: subversion/libsvn_repos/load.c
===================================================================
--- subversion/libsvn_repos/load.c (revision 9220)
+++ subversion/libsvn_repos/load.c (working copy)
@@ -76,6 +76,31 @@
 
 /*----------------------------------------------------------------------*/
 
+/** A conversion function between the two vtable types. **/
+static
+svn_repos_parser_fns2_t *fns2_from_fns (const svn_repos_parser_fns_t *fns,
+ apr_pool_t *pool)
+{
+ svn_repos_parser_fns2_t *fns2;
+
+ fns2 = apr_palloc (pool, sizeof (*fns2));
+ fns2->new_revision_record = fns->new_revision_record;
+ fns2->uuid_record = fns->uuid_record;
+ fns2->new_node_record = fns->new_node_record;
+ fns2->set_revision_property = fns->set_revision_property;
+ fns2->set_node_property = fns->set_node_property;
+ fns2->remove_node_props = fns->remove_node_props;
+ fns2->set_fulltext = fns->set_fulltext;
+ fns2->close_node = fns->close_node;
+ fns2->close_revision = fns->close_revision;
+ fns2->delete_node_property = NULL;
+ fns2->apply_textdelta = NULL;
+ return fns2;
+}
+
+
+/*----------------------------------------------------------------------*/
+
 /** The parser and related helper funcs **/
 
 
@@ -172,7 +197,7 @@
 static svn_error_t *
 parse_property_block (svn_stream_t *stream,
                       svn_filesize_t content_length,
- const svn_repos_parser_fns_t *parse_fns,
+ const svn_repos_parser_fns2_t *parse_fns,
                       void *record_baton,
                       svn_boolean_t is_node,
                       apr_pool_t *pool)
@@ -276,6 +301,41 @@
           else
             return stream_malformed (); /* didn't find expected 'V' line */
         }
+ else if ((buf[0] == 'D') && (buf[1] == ' '))
+ {
+ apr_size_t numread;
+ char *keybuf;
+ char c;
+
+ /* Get the length of the key */
+ apr_size_t keylen = (apr_size_t) atoi (buf + 2);
+
+ /* Now read that much into a buffer, + 1 byte for null terminator */
+ keybuf = apr_pcalloc (pool, keylen + 1);
+ numread = keylen;
+ SVN_ERR (svn_stream_read (stream, keybuf, &numread));
+ content_length -= numread;
+ if (numread != keylen)
+ return stream_ran_dry ();
+ keybuf[keylen] = '\0';
+
+ /* Suck up extra newline after key data */
+ numread = 1;
+ SVN_ERR (svn_stream_read (stream, &c, &numread));
+ content_length -= numread;
+ if (numread != 1)
+ return stream_ran_dry ();
+ if (c != '\n')
+ return stream_malformed ();
+
+ /* We don't expect these in revision properties, and if we see
+ one when we don't have a delete_node_property callback,
+ then we're seeing a v3 feature in a v2 dump. */
+ if (!is_node || !parse_fns->delete_node_property)
+ return stream_malformed ();
+
+ SVN_ERR (parse_fns->delete_node_property (record_baton, keybuf));
+ }
       else
         return stream_malformed (); /* didn't find expected 'K' line */
       
@@ -293,7 +353,8 @@
 static svn_error_t *
 parse_text_block (svn_stream_t *stream,
                   svn_filesize_t content_length,
- const svn_repos_parser_fns_t *parse_fns,
+ svn_boolean_t is_delta,
+ const svn_repos_parser_fns2_t *parse_fns,
                   void *record_baton,
                   char *buffer,
                   apr_size_t buflen,
@@ -302,8 +363,20 @@
   svn_stream_t *text_stream = NULL;
   apr_size_t num_to_read, rlen, wlen;
   
+ if (is_delta)
+ {
+ svn_txdelta_window_handler_t wh;
+ void *whb;
+
+ SVN_ERR (parse_fns->apply_textdelta (&wh, &whb, record_baton));
+ if (wh)
+ text_stream = svn_txdelta_parse_svndiff (wh, whb, TRUE, pool);
+ }
+ else
+ {
   /* Get a stream to which we can push the data. */
   SVN_ERR (parse_fns->set_fulltext (&text_stream, record_baton));
+ }
 
   /* If there are no contents to read, just write an empty buffer
      through our callback. */
@@ -384,8 +457,8 @@
 
 /* The Main Parser Logic */
 svn_error_t *
-svn_repos_parse_dumpstream (svn_stream_t *stream,
- const svn_repos_parser_fns_t *parse_fns,
+svn_repos_parse_dumpstream2 (svn_stream_t *stream,
+ const svn_repos_parser_fns2_t *parse_fns,
                             void *parse_baton,
                             svn_cancel_func_t cancel_func,
                             void *cancel_baton,
@@ -409,6 +482,14 @@
      number, and a blank line. */
   SVN_ERR (parse_format_version (linebuf->data, &version));
 
+ /* If we were called from svn_repos_parse_dumpstream(), the
+ callbacks to handle delta contents will be NULL, so we have to
+ reject dumpfiles with the current version. */
+ if (version == SVN_REPOS_DUMPFILE_FORMAT_VERSION
+ && (!parse_fns->delete_node_property || !parse_fns->apply_textdelta))
+ return svn_error_createf (SVN_ERR_STREAM_MALFORMED_DATA, NULL,
+ "Unsupported dumpfile version: %d", version);
+
   /* A dumpfile "record" is defined to be a header-block of
      rfc822-style headers, possibly followed by a content-block.
 
@@ -509,8 +590,14 @@
                                   SVN_REPOS_DUMPFILE_PROP_CONTENT_LENGTH,
                                   APR_HASH_KEY_STRING)))
         {
- /* First, remove all node properties. */
- if (found_node)
+ const char *delta = apr_hash_get (headers,
+ SVN_REPOS_DUMPFILE_PROP_DELTA,
+ APR_HASH_KEY_STRING);
+ svn_boolean_t is_delta = (delta && strcmp (delta, "true") == 0);
+
+ /* First, remove all node properties, unless this is a delta
+ property block. */
+ if (found_node && !is_delta)
             SVN_ERR (parse_fns->remove_node_props (node_baton));
 
           SVN_ERR (parse_property_block (stream,
@@ -526,8 +613,14 @@
                                   SVN_REPOS_DUMPFILE_TEXT_CONTENT_LENGTH,
                                   APR_HASH_KEY_STRING)))
         {
+ const char *delta = apr_hash_get (headers,
+ SVN_REPOS_DUMPFILE_TEXT_DELTA,
+ APR_HASH_KEY_STRING);
+ svn_boolean_t is_delta = (delta && strcmp (delta, "true") == 0);
+
           SVN_ERR (parse_text_block (stream,
                                      svn__atoui64 (valstr),
+ is_delta,
                                      parse_fns,
                                      found_node ? node_baton : rev_baton,
                                      buffer,
@@ -558,6 +651,19 @@
 }
 
 
+svn_error_t *
+svn_repos_parse_dumpstream (svn_stream_t *stream,
+ const svn_repos_parser_fns_t *parse_fns,
+ void *parse_baton,
+ svn_cancel_func_t cancel_func,
+ void *cancel_baton,
+ apr_pool_t *pool)
+{
+ svn_repos_parser_fns2_t *fns2 = fns2_from_fns (parse_fns, pool);
+
+ return svn_repos_parse_dumpstream2 (stream, fns2, parse_baton,
+ cancel_func, cancel_baton, pool);
+}
 
 /*----------------------------------------------------------------------*/
 
@@ -873,6 +979,20 @@
 
 
 static svn_error_t *
+delete_node_property (void *baton,
+ const char *name)
+{
+ struct node_baton *nb = baton;
+ struct revision_baton *rb = nb->rb;
+
+ SVN_ERR (svn_fs_change_node_prop (rb->txn_root, nb->path,
+ name, NULL, nb->pool));
+
+ return SVN_NO_ERROR;
+}
+
+
+static svn_error_t *
 remove_node_props (void *baton)
 {
   struct node_baton *nb = baton;
@@ -901,6 +1021,21 @@
 
 
 static svn_error_t *
+apply_textdelta (svn_txdelta_window_handler_t *handler,
+ void **handler_baton,
+ void *node_baton)
+{
+ struct node_baton *nb = node_baton;
+ struct revision_baton *rb = nb->rb;
+
+ return svn_fs_apply_textdelta (handler, handler_baton,
+ rb->txn_root, nb->path,
+ NULL, nb->md5_checksum,
+ nb->pool);
+}
+
+
+static svn_error_t *
 set_fulltext (svn_stream_t **stream,
               void *node_baton)
 {
@@ -995,6 +1130,33 @@
 
 
 svn_error_t *
+svn_repos_get_fs_build_parser2 (const svn_repos_parser_fns2_t **callbacks,
+ void **parse_baton,
+ svn_repos_t *repos,
+ svn_boolean_t use_history,
+ enum svn_repos_load_uuid uuid_action,
+ svn_stream_t *outstream,
+ const char *parent_dir,
+ apr_pool_t *pool)
+{
+ const svn_repos_parser_fns_t *fns;
+ svn_repos_parser_fns2_t *parser;
+
+ /* Fetch the old-style vtable and baton, convert the vtable to a
+ * new-style vtable, and set the new callbacks. */
+ SVN_ERR (svn_repos_get_fs_build_parser (&fns, parse_baton, repos,
+ use_history, uuid_action, outstream,
+ parent_dir, pool));
+ parser = fns2_from_fns (fns, pool);
+ parser->delete_node_property = delete_node_property;
+ parser->apply_textdelta = apply_textdelta;
+ *callbacks = parser;
+ return SVN_NO_ERROR;
+}
+
+
+
+svn_error_t *
 svn_repos_get_fs_build_parser (const svn_repos_parser_fns_t **parser_callbacks,
                                void **parse_baton,
                                svn_repos_t *repos,
@@ -1041,12 +1203,12 @@
                    void *cancel_baton,
                    apr_pool_t *pool)
 {
- const svn_repos_parser_fns_t *parser;
+ const svn_repos_parser_fns2_t *parser;
   void *parse_baton;
   
   /* This is really simple. */
 
- SVN_ERR (svn_repos_get_fs_build_parser (&parser, &parse_baton,
+ SVN_ERR (svn_repos_get_fs_build_parser2 (&parser, &parse_baton,
                                           repos,
                                           TRUE, /* look for copyfrom revs */
                                           uuid_action,
@@ -1054,7 +1216,7 @@
                                           parent_dir,
                                           pool));
 
- SVN_ERR (svn_repos_parse_dumpstream (dumpstream, parser, parse_baton,
+ SVN_ERR (svn_repos_parse_dumpstream2 (dumpstream, parser, parse_baton,
                                        cancel_func, cancel_baton, pool));
 
   return SVN_NO_ERROR;
Index: subversion/libsvn_repos/dump.c
===================================================================
--- subversion/libsvn_repos/dump.c (revision 9220)
+++ subversion/libsvn_repos/dump.c (working copy)
@@ -33,10 +33,14 @@
 
 /** A variant of our hash-writing routine in libsvn_subr; this one
     writes to a stringbuf instead of a file, and outputs PROPS-END
- instead of END. **/
+ instead of END. If OLDHASH is not NULL, then only properties
+ which vary from OLDHASH will be written, and properties which
+ exist only in OLDHASH will be written out with "D" entries
+ (like "K" entries but with no corresponding value). **/
 
 static void
 write_hash_to_stringbuf (apr_hash_t *hash,
+ apr_hash_t *oldhash,
                          svn_stringbuf_t **strbuf,
                          apr_pool_t *pool)
 {
@@ -55,6 +59,16 @@
 
       /* Get this key and val. */
       apr_hash_this (this, &key, &keylen, &val);
+ value = val;
+
+ /* Don't output properties equal to the ones in oldhash, if present. */
+ if (oldhash)
+ {
+ svn_string_t *oldvalue = apr_hash_get (oldhash, key, keylen);
+
+ if (oldvalue && svn_string_compare (value, oldvalue))
+ continue;
+ }
 
       /* Output name length, then name. */
 
@@ -68,7 +82,6 @@
       svn_stringbuf_appendbytes (*strbuf, "\n", 1);
 
       /* Output value length, then value. */
- value = val;
 
       svn_stringbuf_appendbytes (*strbuf, "V ", 2);
 
@@ -80,10 +93,77 @@
       svn_stringbuf_appendbytes (*strbuf, "\n", 1);
     }
 
+ if (oldhash)
+ {
+ /* Output a "D " entry for each property in oldhash but not hash. */
+ for (this = apr_hash_first (pool, oldhash); this;
+ this = apr_hash_next (this))
+ {
+ const void *key;
+ void *val;
+ apr_ssize_t keylen;
+ int bytes_used;
+
+ /* Get this key and val. */
+ apr_hash_this (this, &key, &keylen, &val);
+
+ /* Only output values deleted in hash. */
+ if (apr_hash_get (hash, key, keylen))
+ continue;
+
+ /* Output name length, then name. */
+
+ svn_stringbuf_appendbytes (*strbuf, "D ", 2);
+
+ sprintf (buf, "%" APR_SSIZE_T_FMT "%n", keylen, &bytes_used);
+ svn_stringbuf_appendbytes (*strbuf, buf, bytes_used);
+ svn_stringbuf_appendbytes (*strbuf, "\n", 1);
+
+ svn_stringbuf_appendbytes (*strbuf, (const char *) key, keylen);
+ svn_stringbuf_appendbytes (*strbuf, "\n", 1);
+ }
+ }
   svn_stringbuf_appendbytes (*strbuf, "PROPS-END\n", 10);
 }
 
 
+/* Compute the delta between OLDROOT/OLDPATH and NEWROOT/NEWPATH and
+ store it into a new temporary file *TEMPFILE. Record the length of
+ the temporary file in *LEN, and rewind the file before returning. */
+static svn_error_t *
+store_delta (apr_file_t **tempfile, svn_filesize_t *len,
+ svn_fs_root_t *oldroot, const char *oldpath,
+ svn_fs_root_t *newroot, const char *newpath, apr_pool_t *pool)
+{
+ const char *tempdir, *name;
+ svn_stream_t *temp_stream;
+ apr_off_t offset = 0;
+ svn_txdelta_stream_t *delta_stream;
+ svn_txdelta_window_handler_t wh;
+ void *whb;
+
+ /* Create a temporary file and open a stream to it. */
+ SVN_ERR (svn_io_temp_dir (&tempdir, pool));
+ SVN_ERR (svn_io_open_unique_file (tempfile, &name,
+ apr_psprintf (pool, "%s/dump", tempdir),
+ ".tmp", TRUE, pool));
+ temp_stream = svn_stream_from_aprfile (*tempfile, pool);
+
+ /* Compute the delta and send it to the temporary file. */
+ SVN_ERR (svn_fs_get_file_delta_stream (&delta_stream, oldroot, oldpath,
+ newroot, newpath, pool));
+ svn_txdelta_to_svndiff (temp_stream, pool, &wh, &whb);
+ SVN_ERR (svn_txdelta_send_txstream (delta_stream, wh, whb, pool));
+
+ /* Get the length of the temporary file and rewind it. */
+ SVN_ERR (svn_io_file_seek (*tempfile, APR_CUR, &offset, pool));
+ *len = offset;
+ offset = 0;
+ SVN_ERR (svn_io_file_seek (*tempfile, APR_SET, &offset, pool));
+ return SVN_NO_ERROR;
+}
+
+
 /*----------------------------------------------------------------------*/
 
 /** An editor which dumps node-data in 'dumpfile format' to a file. **/
@@ -106,6 +186,9 @@
   svn_fs_root_t *fs_root;
   svn_revnum_t current_rev;
 
+ /* True if dumped nodes should output deltas instead of full text. */
+ svn_boolean_t use_deltas;
+
   /* The first revision dumped in this dumpstream. */
   svn_revnum_t oldest_dumped_rev;
 
@@ -223,12 +306,13 @@
            apr_pool_t *pool)
 {
   svn_stringbuf_t *propstring;
- apr_hash_t *prophash;
- svn_filesize_t textlen = 0, content_length = 0;
- apr_size_t proplen = 0, len;
+ svn_filesize_t content_length = 0;
+ apr_size_t len;
   svn_boolean_t must_dump_text = FALSE, must_dump_props = FALSE;
   const char *compare_path = path;
   svn_revnum_t compare_rev = eb->current_rev - 1;
+ svn_fs_root_t *compare_root = NULL;
+ apr_file_t *delta_file = NULL;
 
   /* Write out metadata headers for this file node. */
   if (eb->stream)
@@ -257,9 +341,6 @@
 
   if (action == svn_node_action_change)
     {
- svn_fs_root_t *compare_root;
- svn_boolean_t text_changed = FALSE, props_changed = FALSE;
-
       if (eb->stream)
         SVN_ERR (svn_stream_printf (eb->stream, pool,
                                     SVN_REPOS_DUMPFILE_NODE_ACTION
@@ -270,17 +351,13 @@
                                      svn_fs_root_fs (eb->fs_root),
                                      compare_rev, pool));
       
- SVN_ERR (svn_fs_props_changed (&props_changed,
+ SVN_ERR (svn_fs_props_changed (&must_dump_props,
                                      compare_root, compare_path,
                                      eb->fs_root, path, pool));
       if (kind == svn_node_file)
- SVN_ERR (svn_fs_contents_changed (&text_changed,
+ SVN_ERR (svn_fs_contents_changed (&must_dump_text,
                                           compare_root, compare_path,
                                           eb->fs_root, path, pool));
- if (props_changed)
- must_dump_props = TRUE;
- if (text_changed)
- must_dump_text = TRUE;
     }
   else if (action == svn_node_action_replace)
     {
@@ -345,9 +422,6 @@
         }
       else
         {
- svn_fs_root_t *src_root;
- svn_boolean_t text_changed = FALSE, props_changed = FALSE;
-
           if ((cmp_rev < eb->oldest_dumped_rev)
               && eb->feedback_stream)
             svn_stream_printf
@@ -366,23 +440,19 @@
                                         ": %s\n",
                                         cmp_rev, cmp_path));
 
- SVN_ERR (svn_fs_revision_root (&src_root,
+ SVN_ERR (svn_fs_revision_root (&compare_root,
                                          svn_fs_root_fs (eb->fs_root),
                                          compare_rev, pool));
 
           /* Need to decide if the copied node had any extra textual or
              property mods as well. */
- SVN_ERR (svn_fs_props_changed (&props_changed,
- src_root, compare_path,
+ SVN_ERR (svn_fs_props_changed (&must_dump_props,
+ compare_root, compare_path,
                                          eb->fs_root, path, pool));
           if (kind == svn_node_file)
- SVN_ERR (svn_fs_contents_changed (&text_changed,
- src_root, compare_path,
+ SVN_ERR (svn_fs_contents_changed (&must_dump_text,
+ compare_root, compare_path,
                                               eb->fs_root, path, pool));
- if (props_changed)
- must_dump_props = TRUE;
- if (text_changed)
- must_dump_text = TRUE;
           
           /* ### someday write a node-copyfrom-source-checksum. */
         }
@@ -410,8 +480,21 @@
      property values here. */
   if (must_dump_props)
     {
+ apr_hash_t *prophash, *oldhash = NULL;
+ apr_size_t proplen;
+
       SVN_ERR (svn_fs_node_proplist (&prophash, eb->fs_root, path, pool));
- write_hash_to_stringbuf (prophash, &propstring, pool);
+ if (eb->use_deltas && compare_root)
+ {
+ /* Fetch the old property hash to diff against and output a header
+ saying that our property contents are a delta. */
+ SVN_ERR (svn_fs_node_proplist (&oldhash, compare_root, compare_path,
+ pool));
+ if (eb->stream)
+ SVN_ERR (svn_stream_printf (eb->stream, pool,
+ SVN_REPOS_DUMPFILE_PROP_DELTA ": true\n"));
+ }
+ write_hash_to_stringbuf (prophash, oldhash, &propstring, pool);
       proplen = propstring->len;
       content_length += proplen;
       if (eb->stream)
@@ -426,15 +509,34 @@
     {
       unsigned char md5_digest[APR_MD5_DIGESTSIZE];
       const char *hex_digest;
+ svn_filesize_t textlen;
 
+ if (eb->use_deltas && compare_root)
+ {
+ /* Compute the text delta now and write it into a temporary
+ file, so that we can find its length. Output a header
+ saying our text contents are a delta. */
+ SVN_ERR (store_delta (&delta_file, &textlen, compare_root,
+ compare_path, eb->fs_root, path, pool));
+ if (eb->stream)
+ SVN_ERR (svn_stream_printf (eb->stream, pool,
+ SVN_REPOS_DUMPFILE_TEXT_DELTA
+ ": true\n"));
+ }
+ else
+ {
+ /* Just fetch the length of the file. */
       SVN_ERR (svn_fs_file_length (&textlen, eb->fs_root, path, pool));
+ }
+
       content_length += textlen;
- SVN_ERR (svn_fs_file_md5_checksum (md5_digest, eb->fs_root, path, pool));
- hex_digest = svn_md5_digest_to_cstring (md5_digest, pool);
       if (eb->stream)
         SVN_ERR (svn_stream_printf (eb->stream, pool,
                                     SVN_REPOS_DUMPFILE_TEXT_CONTENT_LENGTH
                                     ": %" SVN_FILESIZE_T_FMT "\n", textlen));
+
+ SVN_ERR (svn_fs_file_md5_checksum (md5_digest, eb->fs_root, path, pool));
+ hex_digest = svn_md5_digest_to_cstring (md5_digest, pool);
       if (hex_digest && eb->stream)
         SVN_ERR (svn_stream_printf (eb->stream, pool,
                                     SVN_REPOS_DUMPFILE_TEXT_CONTENT_CHECKSUM
@@ -463,7 +565,11 @@
     {
       svn_stream_t *contents;
 
+ if (delta_file)
+ contents = svn_stream_from_aprfile (delta_file, pool);
+ else
       SVN_ERR (svn_fs_file_contents (&contents, eb->fs_root, path, pool));
+
       SVN_ERR (svn_stream_copy (contents, eb->stream, pool));
     }
   
@@ -707,6 +813,7 @@
                  svn_stream_t *stream,
                  svn_stream_t *feedback_stream,
                  svn_revnum_t oldest_dumped_rev,
+ svn_boolean_t use_deltas,
                  apr_pool_t *pool)
 {
   /* Allocate an edit baton to be stored in every directory baton.
@@ -723,6 +830,7 @@
   eb->path = apr_pstrdup (pool, root_path);
   SVN_ERR (svn_fs_revision_root (&(eb->fs_root), fs, to_rev, pool));
   eb->current_rev = to_rev;
+ eb->use_deltas = use_deltas;
 
   /* Set up the editor. */
   dump_editor->open_root = open_root;
@@ -784,7 +892,7 @@
                     datevalue);
     }
 
- write_hash_to_stringbuf (props, &encoded_prophash, pool);
+ write_hash_to_stringbuf (props, NULL, &encoded_prophash, pool);
 
   /* ### someday write a revision-content-checksum */
 
@@ -816,12 +924,13 @@
 
 /* The main dumper. */
 svn_error_t *
-svn_repos_dump_fs (svn_repos_t *repos,
+svn_repos_dump_fs2 (svn_repos_t *repos,
                    svn_stream_t *stream,
                    svn_stream_t *feedback_stream,
                    svn_revnum_t start_rev,
                    svn_revnum_t end_rev,
                    svn_boolean_t incremental,
+ svn_boolean_t use_deltas,
                    svn_cancel_func_t cancel_func,
                    void *cancel_baton,
                    apr_pool_t *pool)
@@ -867,9 +976,15 @@
      magic header followed by a dumpfile format version. */
   if (stream)
     {
+ int version = SVN_REPOS_DUMPFILE_FORMAT_VERSION;
+
+ /* If we're not using deltas, use the previous version, for
+ compatibility with svn 1.0.x. */
+ if (!use_deltas)
+ version--;
       SVN_ERR (svn_stream_printf (stream, pool,
                                   SVN_REPOS_DUMPFILE_MAGIC_HEADER ": %d\n\n",
- SVN_REPOS_DUMPFILE_FORMAT_VERSION));
+ version));
       SVN_ERR (svn_stream_printf (stream, pool, SVN_REPOS_DUMPFILE_UUID
                                   ": %s\n\n", uuid));
     }
@@ -879,6 +994,7 @@
     {
       svn_revnum_t from_rev, to_rev;
       svn_fs_root_t *to_root;
+ svn_boolean_t use_deltas_for_rev;
 
       /* Check for cancellation. */
       if (cancel_func)
@@ -914,10 +1030,13 @@
       /* Write the revision record. */
       SVN_ERR (write_revision_record (stream, fs, to_rev, subpool));
 
- /* The editor which dumps nodes to a file. */
- SVN_ERR (get_dump_editor (&dump_editor, &dump_edit_baton,
- fs, to_rev, "/", stream, feedback_stream,
- start_rev, subpool));
+ /* Fetch the editor which dumps nodes to a file. Regardless of
+ what we've been told, don't use deltas for the first rev of a
+ non-incremental dump. */
+ use_deltas_for_rev = use_deltas && (incremental || i != start_rev);
+ SVN_ERR (get_dump_editor (&dump_editor, &dump_edit_baton, fs, to_rev,
+ "/", stream, feedback_stream, start_rev,
+ use_deltas_for_rev, subpool));
 
       /* Drive the editor in one way or another. */
       SVN_ERR (svn_fs_revision_root (&to_root, fs, to_rev, subpool));
@@ -959,3 +1078,19 @@
 
   return SVN_NO_ERROR;
 }
+
+svn_error_t *
+svn_repos_dump_fs (svn_repos_t *repos,
+ svn_stream_t *stream,
+ svn_stream_t *feedback_stream,
+ svn_revnum_t start_rev,
+ svn_revnum_t end_rev,
+ svn_boolean_t incremental,
+ svn_cancel_func_t cancel_func,
+ void *cancel_baton,
+ apr_pool_t *pool)
+{
+ return svn_repos_dump_fs2 (repos, stream, feedback_stream, start_rev,
+ end_rev, incremental, FALSE, cancel_func,
+ cancel_baton, pool);
+}

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sun Mar 28 07:59:14 2004

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.