>> Now, in order to distinguish different clusters, parts (1) and (2) remember
>> (like, "lock on") the (PATH, REP_ID) for the first noderev from the current
>> cluster. Iff the PATH changes (this means we have just entered the next
>> cluster), we update the stored (PATH, REP_ID) values. We need to remember
>> the REP_ID in order to reconstruct those delta chains in (2), because they
>> start from this REP_ID.
>>
>> As far as I can tell, the problem lies in the way we handle the first path.
>> For any non-empty path set, the first path always marks the beginning of the
>> corresponding cluster. However, we only remember the PATH part of the (PATH,
>> REP_ID) pair for the first path and omit the REP_ID. In case when the path
>> set consists of *a single cluster* (i.e. all path_order_t objects share the
>> same PATH), phase (2) will end up in an attempt to reconstruct the delta
>> chain starting from the uninitialized REP_ID. There are scenarios that lead
>> to this behavior in fs-fs-pack-test (see create_packed_filesystem() with
>> multiple /iota content tweaks falling into the same cluster) and this can
>> easily happen with real-world repositories.
>
> Well, chances are that PATH mismatches for the first
> non-NULL iteration in (2), which gets REP_ID set.
> However, if there is only one cluster with more than
> a single change in the pack file, the path PATH does
> match and REP_ID will have some random value.
Aha, for some reason I assumed that you need to carry the REP_ID from
(1) to (2) in order to be able to reconstruct the delta chain for this single
cluster (i.e. single path) scenario. That's clearly incorrect and now I
understand it.
> I committed a fix in r1572512. It removes the unnecessary
> write to REP_ID during (1) and makes sure that (PATH, REP_ID)
> get initialized to the first *relevant* entry at the being of (2).
Thank you for fixing this.
Regards,
Evgeny Kotkov
Received on 2014-02-27 12:49:03 CET