Re: "Offset too large" error when packing repository in FSFS 7 format

From: Stefan Fuhrmann <stefan2_at_apache.org>
Date: Sat, 27 Aug 2016 10:59:50 +0200

On 22.08.2016 11:38, Radek Krotil wrote:
>
> Thanks Stefan and Daniel for your effort in analyzing this.
> Unfortunately, I missed your replies, as I expected them to be sent to
> my mailbox as well. So let me restart the thread now.
>
No worries, I'm travelling right now, so my replies
will be delayed as well ...
>
> Recently I hit this problem on another production repository from one
> of my customers. The test machine where I work with the repositories
> has a 512 GB SSD drive, so I need to keep the repositories as small as
> possible. I therefore migrate them to the latest format with
> deltification enabled and pack them. I usually take the
> following steps:
> 1) Unpack the zipped repository the customer sends me into a folder
> named after the customer
>
> 2) Rename it to repo-x, where x is the format of the repository
> 3) Dump the repository
>
> 4) Create new repository named repo-1.9
>
> 5) Load the dump to the new repository
>
> 6) Pack the repo-1.9 repository
>
Up to this point, the repo-1.9 should be fine and
any "handling mistakes" should only affect later
revisions.
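
For reference, steps 3 to 6 above can be sketched as a small script. This is only a sketch under assumptions: the repository names `repo-x` and `repo-1.9` are taken from the description, the helper names `migration_commands` and `migrate` are hypothetical, and the exact options used in the original migration are not known, so only the plain `svnadmin dump`, `create`, `load` and `pack` subcommands appear.

```python
import subprocess


def migration_commands(old_repo, new_repo):
    """Return the svnadmin invocations for steps 3 to 6 as argument lists.

    Hypothetical helper: no options beyond the subcommand are assumed.
    """
    return {
        "dump": ["svnadmin", "dump", old_repo],
        "create": ["svnadmin", "create", new_repo],
        "load": ["svnadmin", "load", new_repo],
        "pack": ["svnadmin", "pack", new_repo],
    }


def migrate(old_repo, new_repo):
    """Create the new repository, stream dump into load, then pack."""
    cmds = migration_commands(old_repo, new_repo)
    subprocess.run(cmds["create"], check=True)

    # Stream the dump straight into the load so no dump file hits the disk.
    dump = subprocess.Popen(cmds["dump"], stdout=subprocess.PIPE)
    load = subprocess.run(cmds["load"], stdin=dump.stdout)
    dump.stdout.close()
    if dump.wait() != 0 or load.returncode != 0:
        raise RuntimeError("dump/load cycle failed")

    # Step 6: this is the command that fails in the report.
    subprocess.run(cmds["pack"], check=True)


# Example call (not executed here): migrate("repo-x", "repo-1.9")
```

Keeping the command lists separate from their execution makes it easy to log exactly what was run when reporting a failure.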
>
> 7) Configure Polarion ALM to use the repository
>
> 8) Start Polarion – only at this point is the repository first used by
> Apache and svnserve
>
Assuming that those servers have never seen a
repository called repo-1.9 since they were started,
there should be no caching-related issues.
>
> There is a svnserve process serving all repositories under
> /opt/repositories/, where the customer folders are stored. However,
> the repositories are not accessed until I’m fully done with the
> migration. The test server is dedicated for my use only, so I’m
> confident there are no other users reading/writing the repositories.
>
Sounds good. From what you describe here, I think
your conversion / upgrade process is correct.
>
> On this particular repository, I ran the dump/load cycle twice and in
> both cases it resulted in the svnadmin pack command failure.
>
So, the freshly loaded repository (between steps 5
and 6) already cannot be read. This might be due
either to corrupted data on disk or to a problem in
the reader code.
>
> I’ll re-try to do dump/load on the other repository as well later.
> Svnadmin verify confirmed that the current repository is not corrupted.
>
Questions:
* To what degree is the pack problem reproducible?
   (sometimes / always with the same repo, same /
    different revision, same / different position in
    revision)
* Does 'svnadmin verify' complete w/o error on the
   repo that won't pack a minute later?
* Does retrying the pack result in the same error
   or does it complete the process?
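
One way to collect answers to the questions above is to run verify and then pack twice, recording exit codes and error text. The sketch below assumes it is run against a copy of the freshly loaded repository (a successful pack changes the repository); the helper names `run_step`, `diagnose` and `same_failure` are hypothetical, and only the `svnadmin verify` and `svnadmin pack` subcommands are used.

```python
import subprocess


def run_step(args, runner=subprocess.run):
    """Run one command and keep what is needed for a bug report."""
    result = runner(args, capture_output=True, text=True)
    return {
        "command": " ".join(args),
        "exit_code": result.returncode,
        "stderr": result.stderr.strip(),
    }


def diagnose(repo, runner=subprocess.run):
    """Verify, pack, then retry the pack, in that order."""
    steps = [
        ["svnadmin", "verify", repo],
        ["svnadmin", "pack", repo],
        ["svnadmin", "pack", repo],  # retry: same error, or does it complete?
    ]
    return [run_step(step, runner) for step in steps]


def same_failure(first, second):
    """True if both runs failed with identical exit code and message."""
    return (
        first["exit_code"] != 0
        and first["exit_code"] == second["exit_code"]
        and first["stderr"] == second["stderr"]
    )


# Example call (not executed here):
#   verify, pack1, pack2 = diagnose("repo-1.9")
#   print(verify["exit_code"], same_failure(pack1, pack2))
```

Identical messages on both pack attempts (same revision, same offset) would point to a deterministic problem rather than a transient one.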
>
> The repository contains 334243 revisions total. As suggested by
> Stefan, I did the grep on the problematic repo file. The rev 203908 is
> about 231 MB big. This confirms my suspicion that the problem is
> related to big revision data.
>
Well, that is at least conceivable: Larger revisions
with many changed nodes come with larger index
information. There might be bugs along the lines of
"this time we need one more page than usual".
>
> [root_at_babybear svn]# svnadmin pack repo-1.9/
>
> Packing revisions in shard 203...svnadmin: E160056: Offset 232966338
> too large in revision 203908
>
> [root_at_babybear 203]# grep -oba L2P-INDEX 203908
>
> 231917762:L2P-INDEX
>
OK, that information is helpful: 232966338 is not
a valid data location. I'll try to trace back where
that number might come from.
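
The two numbers quoted above can be compared directly. The sketch below only does the arithmetic; it assumes that the position of the L2P-INDEX marker reported by grep marks where the revision's data ends and its index begins, which is an interpretation and not something the thread confirms. The constant and function names are hypothetical.

```python
REPORTED_OFFSET = 232966338  # from the E160056 error message
L2P_INDEX_START = 231917762  # from 'grep -oba L2P-INDEX 203908'


def overshoot(offset, index_start):
    """How far the offset lies past the assumed end of the data section."""
    return offset - index_start


delta = overshoot(REPORTED_OFFSET, L2P_INDEX_START)
print(delta)             # 1048576
print(delta == 2 ** 20)  # True
```

The reported offset lies exactly 1048576 bytes (2^20, i.e. 1 MiB) past the marker position. Whether that round number is meaningful or a coincidence cannot be told from these two values alone.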

Question:
* What is the exact size of the revision in bytes?
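
The exact size and the marker position can be gathered in one go. The following is a minimal sketch that mimics `grep -oba` for a fixed byte string and reports the file size; the function name `marker_offsets` and its `chunk_size` parameter are hypothetical, and the path in the example call is the revision file name from the report.

```python
import os


def marker_offsets(path, marker=b"L2P-INDEX", chunk_size=1 << 20):
    """Return the byte offsets at which marker occurs in the file.

    The file is read in chunks so that large revision files do not have
    to fit in memory; a short tail is carried over between chunks so a
    marker that straddles a chunk boundary is still found.
    """
    offsets = []
    overlap = len(marker) - 1
    buf = b""
    pos = 0  # file offset of the first byte in buf
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf += chunk
            start = 0
            while True:
                i = buf.find(marker, start)
                if i < 0:
                    break
                offsets.append(pos + i)
                start = i + 1
            # The kept tail is shorter than the marker, so no match
            # can be reported twice.
            keep = buf[-overlap:] if overlap else b""
            pos += len(buf) - len(keep)
            buf = keep
    return offsets


# Example call (not executed here):
#   print(os.path.getsize("203908"), marker_offsets("203908"))
```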
>
> I tried restarting Apache, svnserve, and even the whole box. The problem
> still persists. Unlike the occurrences reported by other users,
> e.g. in https://issues.apache.org/jira/browse/SVN-4588, this does not
> seem to be related to an invalid server cache, because I’m only using
> the svnadmin command to work with the repository.
>
I, too, think that it is a separate problem and would
very much like to track it down.

-- Stefan^2.
>
> Looking forward to further suggestions.
>
> Best regards,
>
> Radek Krotil
>
> On 2016-06-04 18:57 (+0200), Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
> > Stefan Fuhrmann wrote on Sat, Jun 04, 2016 at 08:04:42 -0000:
> > > On 2016-06-03 09:36 (+0200), Radek Krotil <radek.krotil_at_polarion.com> wrote:
> > > > Hello.
> > > >
> > > > Today, I encountered a problem when trying to pack a repository after
> > > > migrating it to the FSFS 7 format by performing a full dump / load sequence.
> > >
> > > I assume you ran 'svnadmin load' onto a repository
> > > that was not accessible to the server at that time,
> > > so no remote user could accidentally write to it?
> >
> > Why would that matter? What could happen if somebody makes a commit or
> > a propedit in parallel to an 'svnadmin load'? A concurrent commit will
> > cause mergeinfo in later revisions to have off-by-one errors,
> > but shouldn't cause FS corruption.
> >
> > > > In short, I get the following error:
> > > > “Packing revisions in shard 5...svnadmin: E160056: Offset 391658998 too
> > > > large in revision 5102”
> > >
> > > This is basically an "invalid access" error message.
> > > Typical causes include repository corruption and
> > > admins tinkering with the repository without informing
> > > the server process. A possibly similar issue:
> > >
> > > https://issues.apache.org/jira/browse/SVN-4588
> > >
> > > In your case, however, the corruption is probably in
> > > the repository itself. Please run 'svnadmin verify' on it.
> > >
> > > > I was not able to understand from the documentation what settings in
> > > > fsfs.conf should be modified to work around this problem. Searching
> > > > the Internet did not shed any light on this either. Is it even possible?
> > >
> > > This is most definitely not a configuration issue like
> > > "your data is too large". Maybe we should prefix the
> > > error message with "invalid access" to prevent
> > > confusion.
> >
> > How about being even more specific:
> >
> > svnadmin: E1600NN: failed to locate representation of %s at revision %ld, offset %lld
> >
> > where %s identifies the origin of the offset value or the object that
> > was expected to be located at that offset.
> >
> > ?
> >
> > Cheers,
> >
> > Daniel
Received on 2016-08-27 10:59:57 CEST
