On Fri, May 22, 2020 at 12:16 AM Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
>
> Johan Corveleyn wrote on Thu, 21 May 2020 17:53 +0200:
> > Trying to summarize this thread a bit. I apologize in advance if I
> > forgot something, or have misrepresented any of the points that were
> > raised (feel free to correct / add).
> >
>
> I have additions, below.
>
> > Denis summed up the following problems that might happen while
> > 'verify' locks the repcache.db:
> >
> > > 1. a post-commit error "database is locked"
> > > 2. new representations will not be added in the rep-cache.db
> > > 3. deduplication does not work for new data committed at this time
> > > 4. commits work with delays.
> >
> > We have also established that the new tool build-repcache is not
> > suitable for post-factum fixing of 3). It does not reprocess already
> > committed revisions.
>
> Note, however, that (3) is simply an observation that Denis made about
> the semantics of the incumbent code. We do not know of a user who needs
> rep-cache.db entries to be added so soon after the commit that running
> «build-repcache» after «verify» finishes would be too late (or impractical).
>
> > We are currently considering two approaches to address these issues:
> >
>
> Hold on. Those issues are what happens *whenever* the post-commmit FS
> processing INSERTs are starved. They happen when a «verify» starves
> INSERTs, but they are known to have at least one other case: when two
> commits happen in quick succession, the INSERTs that happen during
> post-commit FS processing of the first commit can starve the second
> commit's INSERTs. (Don't we have a jira issue for this variant? I
> thought we did, but my searches came up dry.)
>
> Which is to say, the presentation of the matter as "four issues to be
> fixed" isn't accurate. There are _eight_ issues to consider: (1) to (4)
> when they are caused by a «verify» starving INSERTs, and (1) to (4) when
> they are caused by an «INSERT» starving INSERTs.
>
> Thus, your analysis of the pros and cons is incomplete: it glosses over
> the case of an «INSERT» starving other INSERTs.
Good point. So those "4 problems" are all consequences of "rep-cache
insertion starvation". One possible cause of this might be 'verify',
but there are other known causes, such as 'locked by another commit'.
I'm not sure how likely that last form of starvation is in practice,
though (but if you've seen a jira issue of it, hmmm, maybe).
Denis' patch tries to fix the 'verify' cause of starvation.
Focusing on avoiding / fixing the *consequences* of the starvation
might be another approach, yes.
> Furthermore, over the course of the thread several other ideas have been
> floated, intended to address various subsets of the eight issues. The
> "Perform INSERTs asynchronously" idea, for example (raised by me, but
> inspired by a remark of Brane's), can basically fix all eight issues by
> itself. There was also the idea to stop marshalling post-commit FS
> processing errors to the client, which would fix both variants of (1)
> but none of the other six. I think there were other ideas as well, but
> I've run out of time for this thread for tonight, sorry.
Ack.
So IIUC these are addressing the consequences of the starvation (which
might help regardless of the cause), rather than taking away the
locking-causes.
I think both 'angles' are quite useful. We might even want to combine
... mitigating the consequences, as well as avoiding some of the
causes for the starvation (because even with consequence-mitigation
things might not turn out ideal, or only be mitigated at the cost of
X, so fixing some known causes can still be valuable).
--
Johan
Received on 2020-05-22 10:03:09 CEST