Trying to summarize this thread a bit. I apologize in advance if I
forgot something, or have misrepresented any of the points that were
raised (feel free to correct / add).
Denis summed up the following problems that might happen while
'verify' locks the rep-cache.db:
> 1. a post-commit error "database is locked"
> 2. new representations will not be added in the rep-cache.db
> 3. deduplication does not work for new data committed at this time
> 4. commits work with delays.
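To illustrate problem 1, here is a minimal Python sketch (not Subversion code) of how a long-held lock can surface as "database is locked". It assumes that verify's hold on the database behaves like an open read transaction and that the file uses SQLite's default rollback journal. The simplified table and the function name are made up for the example.

```python
import sqlite3


def commit_while_reader_holds_lock(db_path):
    """Return the writer's error message, or None if the commit succeeded."""
    # Hypothetical, simplified table; not the real rep-cache schema.
    setup = sqlite3.connect(db_path)
    setup.execute(
        "CREATE TABLE IF NOT EXISTS rep_cache "
        "(hash TEXT PRIMARY KEY, revision INTEGER)"
    )
    setup.execute("INSERT OR REPLACE INTO rep_cache VALUES ('aaaa', 1)")
    setup.commit()
    setup.close()

    # Stand-in for 'verify': an explicit transaction that keeps a shared
    # lock on the file until it is committed.
    reader = sqlite3.connect(db_path, isolation_level=None)
    reader.execute("BEGIN")
    reader.execute("SELECT hash FROM rep_cache").fetchone()

    # Stand-in for a commit that adds a new representation. timeout=0
    # means "do not wait for the lock", so a failure shows up at once.
    writer = sqlite3.connect(db_path, timeout=0)
    try:
        writer.execute("INSERT INTO rep_cache VALUES ('bbbb', 2)")
        writer.commit()
        return None
    except sqlite3.OperationalError as err:
        return str(err)
    finally:
        writer.close()
        reader.execute("COMMIT")
        reader.close()
```

Under those assumptions I would expect the function to return an error message rather than None, because the writer cannot get exclusive access while the reader's transaction is still open. With a non-zero timeout the writer would wait instead, which corresponds to problem 4.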
We have also established that the new tool build-repcache is not
suitable for fixing 3) after the fact, because it does not reprocess
already committed revisions.
We are currently considering two approaches to address these issues:
1) Let verify process the repcache entries in small batches, without
holding an SQLite lock (Denis' patch).
pro:
+ Fixes #1 through #4.
con:
- Relies more heavily on SQLite's guarantee that all rows that were
present at the start of 'verify' are still readable and correct after
verify has finished. SQLite might have subtle bugs in this area, and
verify should be as conservative / careful as possible.
2) Shard rep-cache.db to make the locking window smaller (Daniel's proposal).
pro:
+ Fixes #1 through #4 (if the shard size is 'small enough')
con:
- If we ever need atomic operations on the entire repcache, we need to
forbid rep-cache.db shards from using WAL mode and use the ATTACH
DATABASE statement with the master journal (which is rarely used and
is not supported by all journaling modes).
- Requires a format bump, which means it will only work if the admin has
run 'svnadmin upgrade'.
- May not fully fix the problem if the shard size is too large and
verification of a single shard still takes too much time (e.g. because
it's located on a network drive).
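To make the first con of approach 2 more concrete, here is a rough Python sketch of what an atomic operation across shards could look like with ATTACH DATABASE. It is not part of Daniel's proposal: the shard files, the simplified table, the helper names and the "clear everything" operation are all hypothetical.

```python
import sqlite3

# Hypothetical, simplified table; not the real rep-cache schema.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS rep_cache "
    "(hash TEXT PRIMARY KEY, revision INTEGER NOT NULL)"
)


def create_shard(path):
    """Create one shard that uses a rollback journal instead of WAL."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=DELETE")
    conn.execute(SCHEMA)
    conn.commit()
    conn.close()


def clear_all_shards(shard_paths):
    """Delete the rows of every shard in a single transaction."""
    # isolation_level=None: we issue BEGIN/COMMIT ourselves, and ATTACH
    # must run outside of a transaction.
    conn = sqlite3.connect(shard_paths[0], isolation_level=None)
    try:
        names = ["main"]
        # Note: SQLite limits the number of attached databases
        # (10 in a default build), which caps the shard count here.
        for i, path in enumerate(shard_paths[1:], start=1):
            conn.execute(f"ATTACH DATABASE ? AS shard{i}", (path,))
            names.append(f"shard{i}")
        conn.execute("BEGIN IMMEDIATE")
        try:
            for name in names:
                conn.execute(f"DELETE FROM {name}.rep_cache")
            conn.execute("COMMIT")  # one commit covering all shards
        except Exception:
            conn.execute("ROLLBACK")
            raise
    finally:
        conn.close()
```

The point of the sketch is the constraint it exposes: the single COMMIT is only atomic across the attached files when none of them uses WAL mode, which is why the shards would have to be pinned to a rollback journal.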
I'll add one more concern of my own here, regarding the 'sharding' approach:
I'd like to warn against the NIHS (Not Invented Here Syndrome) that comes
peeking around the corner if we say "SQLite might have subtle bugs
that might hurt us if we do X, but rolling our own solution might be
better". Why would "rolling our own solution" like sharding
rep-cache.db be less susceptible to such subtle bugs than SQLite? Okay,
on the one hand SQLite is more complex, because it's generic database
software. But on the other hand it presumably has a lot more users /
audience than just Subversion. I have no clear answer here.
HTH,
--
Johan
Received on 2020-05-21 17:53:36 CEST