
Re: [#4843] svn wc verify -- pristine files consistency check, and possibly repair

From: Julian Foad <julianfoad_at_apache.org>
Date: Fri, 10 Jan 2020 10:27:43 +0000

Filed as http://subversion.apache.org/issue/4843 .

We could start by defining more precisely what it needs to do.

Some aims, highest priority first:
   * check if any pristine file's content is corrupted (according to its
     filename hash)
     - report and rename/delete corrupted pristines
   * check if any pristines are missing (according to wc.db)
   * fetch missing (or corrupted) pristines from the repository
   * verify wc.db 'pristine' table entries against other tables

Checking for content corruption by recalculating the checksums is going
to be slow -- there is no getting away from that -- so this most
important check is probably going to be the last one we run, and we may
choose to make it optional. That's fine.
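
As a rough illustration of the checksum pass described above, here is a
minimal Python sketch. It assumes pristine files are stored uncompressed
under a directory such as '.svn/pristine', in two-character subdirectories,
and are named '<sha1-hex>.svn-base'. The function names and the 'SUFFIX'
constant are made up for this example and are not part of Subversion.

```python
import hashlib
from pathlib import Path

SUFFIX = ".svn-base"  # assumption: suffix used for pristine files on disk


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 of a file, reading it in chunks to bound memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupted(pristine_dir):
    """Return pristine files whose content does not hash to their file name."""
    corrupted = []
    for path in sorted(Path(pristine_dir).rglob("*" + SUFFIX)):
        expected = path.name[: -len(SUFFIX)]  # the hash encoded in the name
        if sha1_of_file(path) != expected:
            corrupted.append(path)
    return corrupted
```

Every byte of every pristine is read once, which is why this pass is the
slow one; the sketch only reports, and leaves rename/delete to the caller.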

We could check quickly:
   * for each pristine file listed in the DB:
     - file is present
     - file size matches the DB
     - file mod-time matches the DB
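
The quick checks above could look roughly like the following Python
sketch. It assumes the caller has already read (sha1, size) pairs from the
'pristine' table in wc.db, and that files live at
'<pristine_dir>/<first two hex digits>/<sha1>.svn-base'. The helper names
are hypothetical. A mod-time comparison would follow the same pattern using
'st_mtime', provided the DB records a value to compare against.

```python
import os


def pristine_path(pristine_dir, sha1_hex):
    """Assumed layout: <pristine_dir>/<first two hex digits>/<sha1>.svn-base"""
    return os.path.join(pristine_dir, sha1_hex[:2], sha1_hex + ".svn-base")


def quick_check(records, pristine_dir):
    """Yield (sha1_hex, problem) for each record that fails a cheap check.

    records: iterable of (sha1_hex, expected_size) pairs taken from the DB.
    Only stat() is used, so no file content is read.
    """
    for sha1_hex, expected_size in records:
        path = pristine_path(pristine_dir, sha1_hex)
        try:
            st = os.stat(path)
        except FileNotFoundError:
            yield sha1_hex, "missing"
            continue
        if st.st_size != expected_size:
            yield sha1_hex, "size mismatch"
```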

The existing 'cleanup' implementation contains a function
'pristine_cleanup_wcroot' which has in its doc string:

[[[
   TODO: Ideas for possible extra clean-up operations:

   * Check and correct all the refcounts. Identify any rows missing
     from the 'pristine' table. [...]

   * Check the checksums. (Very expensive to check them all, so find
     a way to not check them all.)

   * Check for pristine files missing from disk but referenced in the
     'pristine' table.

   * Repair any pristine files missing from disk and/or rows missing
     from the 'pristine' table and/or bad checksums. Generally
     requires contacting the server, so requires support at a higher
     level than this function.

   * Identify any pristine text files on disk that are not referenced
     in the DB, and delete them.
]]]
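
The last item in that list (pristine files on disk that the DB does not
reference) amounts to a set difference. A minimal Python sketch, assuming
the same '<sha1>.svn-base' naming as on disk today and a caller-supplied
collection of checksums read from the 'pristine' table; 'find_orphans' is
a made-up name, and the sketch only lists candidates rather than deleting
them:

```python
from pathlib import Path

SUFFIX = ".svn-base"  # assumption: suffix used for pristine files on disk


def find_orphans(pristine_dir, referenced):
    """Return paths of pristine files whose hash is not in `referenced`.

    referenced: iterable of sha1 hex strings known to the DB.
    """
    on_disk = {
        path.name[: -len(SUFFIX)]: path
        for path in Path(pristine_dir).rglob("*" + SUFFIX)
    }
    unreferenced = on_disk.keys() - set(referenced)
    return sorted(on_disk[sha1] for sha1 in unreferenced)
```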

The refcounts are references within the DB from nodes to the 'pristine'
table. They are enforced by SQLite with 'REFERENCES' clauses in the
schema, though I saw one comment somewhere saying this applies only "in
debug builds", so we might want to double-check.
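
If we did want to double-check the refcounts independently of SQLite's
own enforcement, a recount could be sketched as below. Note that this uses
a deliberately simplified, hypothetical schema (a 'nodes' table with a
'checksum' column and a 'pristine' table with 'checksum' and 'refcount'),
not the real wc.db layout, where more than one table can reference a
pristine.

```python
import sqlite3


def refcount_mismatches(conn):
    """Return (checksum, stored, actual) where the stored refcount is wrong."""
    return conn.execute(
        """
        SELECT p.checksum, p.refcount, COUNT(n.checksum) AS actual
        FROM pristine AS p
        LEFT JOIN nodes AS n ON n.checksum = p.checksum
        GROUP BY p.checksum, p.refcount
        HAVING p.refcount != COUNT(n.checksum)
        ORDER BY p.checksum
        """
    ).fetchall()


def dangling_references(conn):
    """Return checksums used by nodes but missing from the pristine table."""
    return conn.execute(
        """
        SELECT DISTINCT n.checksum
        FROM nodes AS n
        LEFT JOIN pristine AS p ON p.checksum = n.checksum
        WHERE n.checksum IS NOT NULL AND p.checksum IS NULL
        ORDER BY n.checksum
        """
    ).fetchall()
```

Both queries stay inside the DB and never touch the pristine files, so
they would be cheap compared with the checksum pass.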

I am not aware of problems in the consistency of the DB tables, so I
don't think checking that is a priority. Though I don't have hard
evidence, from problems reported over the years I think corrupted and
missing pristine files on disk are the main concern.

- Julian
Received on 2020-01-10 11:27:45 CET

This is an archived mail posted to the Subversion Dev mailing list.
