[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Byte order issue in FSFS 7

From: Philip Martin <philip.martin_at_wandisco.com>
Date: Sun, 02 Mar 2014 01:54:23 +0000

There is a problem with the FNV1a checksums in format 7: the on-disk
representation for big-endian systems, like SPARC, is different from
that of little-endian systems, like x86. Both systems calculate the
same checksum value, however the checksum code calls htonl() before
returning the value to the caller. On SPARC this has no effect and on
x86 it reverses the byte order, changing the value. If we were to write
the resulting memory directly to disk as a block of data this would be
good because the disk representations would be the same, but that is not
what happens. The value is passed to encode_uint() which uses arithmetic
to write out a representation of the bytes. Since the htonl() call
changes the x86 value this means the disk representation changes.

To get the same disk representation on both systems we do one of:
  1/ add a swap on little-endian systems
  2/ remove a swap on little-endian systems
  3/ add a swap on big-endian systems.

Since there is no ready-made swap for big-endian systems I have
discounted 3/. I've written patches for 1/ and 2/; they are more or
less equivalent the difference is the in-memory representation. I
prefer 2/.

[If we ignore exotic systems then htonl() and ntohl() are the same
function, both are noops on SPARC and both reverse on x86, which name to
use where is really just documentation.]

Here's 1/ calling ntohl() before encode_uint() and htonl() on the value
retrieved from disk:

Index: subversion/libsvn_fs_fs/index.c
===================================================================
--- subversion/libsvn_fs_fs/index.c (revision 1573204)
+++ subversion/libsvn_fs_fs/index.c (working copy)
@@ -1691,7 +1691,8 @@ svn_fs_fs__p2l_index_create(svn_fs_t *fs,
                                   encode_int(encoded, rev_diff),
                                   iter_pool));
       SVN_ERR(svn_spillbuf__write(buffer, (const char *)encoded,
- encode_uint(encoded, entry.fnv1_checksum),
+ encode_uint(encoded,
+ ntohl(entry.fnv1_checksum)),
                                   iter_pool));
 
       last_entry_end = entry_end;
@@ -2018,7 +2019,7 @@ read_entry(svn_fs_fs__packed_number_stream_t *stre
   entry.item.revision = *last_revision;
 
   SVN_ERR(packed_stream_get(&value, stream));
- entry.fnv1_checksum = (apr_uint32_t)value;
+ entry.fnv1_checksum = htonl((apr_uint32_t)value);
 
   APR_ARRAY_PUSH(result, svn_fs_fs__p2l_entry_t) = entry;
   *item_offset += entry.size;

Here's 2/ removing htonl() from the checksum return and reversing the
in-memory representation (there is a still an htonl() call inside the
32x4 checksum but that is correct):

Index: subversion/libsvn_subr/checksum.c
===================================================================
--- subversion/libsvn_subr/checksum.c (revision 1573204)
+++ subversion/libsvn_subr/checksum.c (working copy)
@@ -54,12 +54,12 @@ static const unsigned char sha1_empty_string_diges
 
 /* The FNV-1a digest for the empty string. */
 static const unsigned char fnv1a_32_empty_string_digest_array[] = {
- 0x81, 0x1c, 0x9d, 0xc5
+ 0xc5, 0x9d, 0x1c, 0x81
 };
 
 /* The FNV-1a digest for the empty string. */
 static const unsigned char fnv1a_32x4_empty_string_digest_array[] = {
- 0xcd, 0x6d, 0x9a, 0x85
+ 0x85, 0x9a, 0x6d, 0xcd
 };
 
 /* Digests for an empty string, indexed by checksum type */
Index: subversion/libsvn_subr/fnv1a.c
===================================================================
--- subversion/libsvn_subr/fnv1a.c (revision 1573204)
+++ subversion/libsvn_subr/fnv1a.c (working copy)
@@ -109,15 +109,15 @@ finalize_fnv1a_32x4(apr_uint32_t hashes[SCALING],
   if (len)
     memcpy(final_data + sizeof(apr_uint32_t) * SCALING, input, len);
 
- return htonl(fnv1a_32(FNV1_BASE_32,
- final_data,
- sizeof(apr_uint32_t) * SCALING + len));
+ return fnv1a_32(FNV1_BASE_32,
+ final_data,
+ sizeof(apr_uint32_t) * SCALING + len);
 }
 
 apr_uint32_t
 svn__fnv1a_32(const void *input, apr_size_t len)
 {
- return htonl(fnv1a_32(FNV1_BASE_32, input, len));
+ return fnv1a_32(FNV1_BASE_32, input, len);
 }
 
 apr_uint32_t
@@ -157,7 +157,7 @@ svn_fnv1a_32__update(svn_fnv1a_32__context_t *cont
 apr_uint32_t
 svn_fnv1a_32__finalize(svn_fnv1a_32__context_t *context)
 {
- return htonl(context->hash);
+ return context->hash;
 }
 
In both cases we need to change the hard-coded representation of r0
written when the repository is created:

Index: subversion/libsvn_fs_fs/fs_fs.c
===================================================================
--- subversion/libsvn_fs_fs/fs_fs.c (revision 1573204)
+++ subversion/libsvn_fs_fs/fs_fs.c (working copy)
@@ -1274,9 +1274,9 @@ write_revision_zero(svn_fs_t *fs)
                   "\x80\x80\4\1\x1D" /* 64k pages, 1 page using 29 bytes */
                   "\0" /* offset entry 0 page 1 */
                                         /* len, item & type, rev, checksum */
- "\x11\x34\0\xe0\xc6\xac\xa9\x07"
- "\x59\x09\0\xc0\xfa\xf8\xc5\x04"
- "\1\x0d\0\xf2\x95\xbe\xea\x01"
+ "\x11\x34\0\xf5\xd6\x8c\x81\x06"
+ "\x59\x09\0\xc8\xfc\xf6\x81\x04"
+ "\1\x0d\0\x9d\x9e\xa9\x94\x0f"
                   "\x95\xff\3\x1b\0\0", /* last entry fills up 64k page */
                   38,
                   fs->pool));

-- 
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*
Received on 2014-03-02 02:55:00 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.