On 26 June 2014 19:08, Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com> wrote:
>
>
> On Thu, Jun 26, 2014 at 11:52 AM, Ivan Zhakov <ivan_at_visualsvn.com> wrote:
>>
>> On 25 June 2014 22:11, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
>> > Ivan Zhakov wrote on Wed, Jun 25, 2014 at 20:03:11 +0400:
>> >> On 25 June 2014 19:54, Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
>> >> wrote:
>> >> > On Wed, Jun 25, 2014 at 5:46 PM, Ivan Zhakov <ivan_at_visualsvn.com>
>> >> > wrote:
>> >> >>
>> >> >> On 25 June 2014 19:24, Stefan Fuhrmann
>> >> >> <stefan.fuhrmann_at_wandisco.com>
>> >> >> wrote:
>> >> >> > On Wed, Jun 25, 2014 at 4:57 PM, Ivan Zhakov <ivan_at_visualsvn.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> On 21 June 2014 17:15,  <stefan2_at_apache.org> wrote:
>> >> >> >> > Author: stefan2
>> >> >> >> > Date: Sat Jun 21 15:15:30 2014
>> >> >> >> > New Revision: 1604421
>> >> >> >> >
>> >> >> >> > URL: http://svn.apache.org/r1604421
>> >> >> >> > Log:
>> >> >> >> > Append index data to the rev data file instead of using
>> >> >> >> > separate
>> >> >> >> > files.
>
>
>>
>> >> >> >> > >
>> >> >> >> > > > >==============================================================================
>>
>> >> >> >> >--- subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c (original)
>> >> >> >> >+++ subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c Sat Jun 21
>> >> >> >> > 15:15:30
>> >> >> >> > 2014
>> >> >> >> >@@ -1449,15 +1449,35 @@ write_revision_zero(svn_fs_t *fs)
>> >> >> >> >+    SVN_ERR(svn_io_file_create_binary(path_revision_zero,
>> >> >> >> >+                  "PLAIN\nEND\nENDREP\n"
>> >> >> >> >+                  "id: 0.0.r0/2\n"
>> >> >> >> >+                  "type: dir\n"
>> >> >> >> >+                  "count: 0\n"
>> >> >> >> >+                  "text: 0 3 4 4 "
>> >> >> >> >+                  "2d2977d1c96f487abe4a1e202dd03b4e\n"
>> >> >> >> >+                  "cpath: /\n"
>> >> >> >> >+                  "\n\n"
>> >> >> >> >+
>> >> >> >> >+                  /* L2P index */
>> >> >> >> >+                  "\0\x80\x40"          /* rev 0, 8k entries
>> >> >> >> > per page
>> >> >> >> > */
>> >> >> >> >+                  "\1\1\1"              /* 1 rev, 1 page, 1
>> >> >> >> > page in
>> >> >> >> > 1st
>> >> >> >> > rev */
>> >> >> >> >+                  "\6\4"                /* page size: bytes,
>> >> >> >> > count */
>> >> >> >> >+                  "\0\xd6\1\xb1\1\x21"  /* phys offsets + 1 */
>> >> >> >> >+
>> >> >> >> >+                  /* P2L index */
>> >> >> >> >+                  "\0\x6b"              /* start rev, rev file
>> >> >> >> > size
>> >> >> >> > */
>> >> >> >> >+                  "\x80\x80\4\1\x1D"    /* 64k pages, 1 page
>> >> >> >> > using 29
>> >> >> >> > bytes */
>> >> >> >> >+                  "\0"                  /* offset entry 0 page
>> >> >> >> > 1 */
>> >> >> >> >+                                        /* len, item & type,
>> >> >> >> > rev,
>> >> >> >> > checksum */
>> >> >> >> >+                  "\x11\x34\0\xf5\xd6\x8c\x81\x06"
>> >> >> >> >+                  "\x59\x09\0\xc8\xfc\xf6\x81\x04"
>> >> >> >> >+                  "\1\x0d\0\x9d\x9e\xa9\x94\x0f"
>> >> >> >> >+                  "\x95\xff\3\x1b\0\0"  /* last entry fills up
>> >> >> >> > 64k
>> >> >> >> > page
>> >> >> >> > */
>> >> >> >> >+
>> >> >> >> >+                  /* Footer */
>> >> >> >> >+                  "107 121\7",
>> >> >> >> >+                  107 + 14 + 38 + 7 + 1, fs->pool));
>> >> >> >> This constant makes code unreadable, unmaintainable and very
>> >> >> >> error
>> >> >> >> prone.
>> > ...
>> >> I saw it, but it doesn't make code easier to maintain.
>> >
>> > Ivan, that's the third time in as many emails that you say "the new code
>> > is hard to maintain".  Could you please explain *how* you find it hard
>> > to maintain?
>> >
>> > Anyway, that's just me guessing.  Could you please clarify what exactly
>> > makes the code unmaintainable in your opinion?
>> >
>> In my opinion 'code maintainability' is how easy or hard to make
>> change and review the code. This magic constant with a lot 7b encoded
>> numbers are very hard to review and modify if needed in future. Even
>> Stefan2 mentioned that he made slight mistakes when changing that
>> constant that was caught by test suite. But I don't assume that code
>> that passes all test suite doesn't have bugs, while Stefan2 seems to
>> assume. I bet that nobody reviewed "\x11\x34\0\xf5\xd6\x8c\x81\x06"
>> constant in the code above. Even I didn't, because I think it's a
>> waste of time and proper code should be committed or change reverted.
>
>
> Hi Ivan,
>
> I see three alternative ways to code that function
>
> 1. As hard coded string / byte sequence (current implementation).
>   Cons:
> * Hard to write, hard to review by just looking at it (applies to time
>   until initial release only).
>   Pros:
> * Explicitly coded as constant, deterring people from changing it.
> * Independent of other code, i.e. unintended changes to the format /
>   encoding generated by the normal code usually become apparent
>   when running the test suite.
>
> 2. Use our txn code to write r0. This should be simple and might
>   at most require some special ID handling.
>   Cons:
> * Generating incompatible r0s is likely not caught by our test suite
>   (assuming that reader and writer functions are in sync). Basically
>   all the risk that comes with dynamically generating a "constant".
> * Test cases must make sure our backward compat repo creation
>   options create repos that can actually be used by old releases.
>   (we might want explicit test for that anyway, though)
>   Pros:
> * No or very little special code for r0 (not sure, yet).
> * Format / encoding changes don't require new r0 templates.
>
> 3. Write code to "stitch" r0 together, e.g. string_add(md5("END\n")).
>   Cons:
> * No more robust than 1. but requires much more code.
> * May _look_ easy to understand but an actual offline review is still hard.
>   Pros:
> * Widely independent of other code, although not as much as 1.
>
> Do you generally agree with the range of options and their assessment?
Yes, I generally agree with range of options. The only other I have is
do not implement log addressing in first place.
> Which one would you pick and why?
>
It's hard to pick option without looking to code, but I would start
with leaving string constant for revision body and then appending
indexes data using API. I.e. somewhat modified (2).
-- 
Ivan Zhakov
Received on 2014-06-27 12:56:37 CEST