Re: backing up a repository over a network

From: Bruce Elrick <bruce.elrick_at_entropyreduction.ca>
Date: 2003-06-12 17:55:16 CEST

You should also be aware that any database system will manage its
allocated physical space as it sees fit, so it is not surprising that
the operating system files that provide the physical storage change
whenever any action is performed against the database.

Creating temp tables (or even inserting temporary rows in a fixed table)
then deleting the table (or rows in a fixed table) may cause blocks to
be updated when the tables are created (rows inserted) and then updated
again when the tables are deleted (rows deleted) such that the resultant
blocks are not the same as they originally were. For example, the DB
design might be that every time a block is updated a sequentially
increasing stamp is updated in the block and that stamp is tied to its
logs (log sequence number?).

This meta-data the database uses to manage the physical space is pretty
well guaranteed to change even if the two logical operations are
expected to bring the DB back to its original logical state (rows
inserted then deleted, or table added then dropped).
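
To make that concrete, here is a toy sketch in Python. It is not how BDB
(or any real engine) lays out its pages; ToyPage, ToyDB and the 8-byte
stamp at the front of the page are all made up for illustration. It only
shows that an insert followed by a delete can restore the rows while
leaving the bytes on disk different.

```python
import hashlib
import struct


class ToyPage:
    """Hypothetical fixed-size page: 8-byte stamp (LSN) followed by row data."""
    SIZE = 64

    def __init__(self):
        self.lsn = 0
        self.rows = []

    def to_bytes(self):
        # Serialize the stamp and the rows, then pad to the fixed page size.
        body = b"\x00".join(self.rows)
        raw = struct.pack(">Q", self.lsn) + body
        return raw.ljust(self.SIZE, b"\x00")


class ToyDB:
    """Hypothetical single-page store that stamps the page on every update."""

    def __init__(self):
        self.next_lsn = 1
        self.page = ToyPage()

    def _stamp(self):
        # Every update writes a new, strictly increasing stamp into the page.
        self.page.lsn = self.next_lsn
        self.next_lsn += 1

    def insert(self, row):
        self.page.rows.append(row)
        self._stamp()

    def delete(self, row):
        self.page.rows.remove(row)
        self._stamp()


def digest(page):
    return hashlib.sha256(page.to_bytes()).hexdigest()


db = ToyDB()
db.insert(b"permanent")
rows_before = list(db.page.rows)
hash_before = digest(db.page)

# Two operations that cancel out logically.
db.insert(b"temporary")
db.delete(b"temporary")

rows_after = list(db.page.rows)
hash_after = digest(db.page)

print("logical state unchanged:", rows_before == rows_after)
print("physical page unchanged:", hash_before == hash_after)
```

The rows compare equal before and after, but the page digest does not,
because the stamp in the page header has moved on.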

This is even more the case when you have a server-based RDBMS. BDB is a
library-based one, where the application uses linked library calls that
manage the DB (in other words, the application processes access the DB
files directly). A DB like Oracle or DB2 uses a standalone DB server
process (actually a set of cooperating processes) that manages the
physical store, and applications access it via network connections to
the RDBMS server processes. It is possible that the RDBMS server
processes are updating meta-data in the physical files even when no
application processes are connected to the RDBMS server process (think
of garbage-collection-like operations).

Using something like rsync (or even cp) on the physical files of a DB
when the DB is "up" or "active" will usually require the DB recovery
process to be run on the copy because of those meta-data issues. If the DB is down
(in BDB's case that would be when there are no application processes
with the DB files open, in Oracle's case that would be when the server
engine is stopped) then a file copy will give you a clean copy of the
DB, but as you've observed, there will be meta-data differences that
rsync will have to transmit even when the logical state is the same.
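
As a rough illustration of why rsync still has work to do, the sketch
below compares two copies of a file block by block and reports which
blocks differ. This is a simplification: real rsync uses rolling
checksums and can match blocks at arbitrary offsets, whereas this
hypothetical changed_blocks() helper only compares aligned fixed-size
blocks. The 16-byte block size and the file contents are invented for
the example.

```python
import hashlib


def changed_blocks(old: bytes, new: bytes, block_size: int = 4096):
    """Return the indices of aligned fixed-size blocks that differ."""
    longest = max(len(old), len(new))
    n_blocks = (longest + block_size - 1) // block_size
    changed = []
    for i in range(n_blocks):
        a = old[i * block_size:(i + 1) * block_size]
        b = new[i * block_size:(i + 1) * block_size]
        if hashlib.sha256(a).digest() != hashlib.sha256(b).digest():
            changed.append(i)
    return changed


BLOCK = 16

# Four blocks, each starting with a one-byte "stamp" followed by row data.
old_file = b"".join(bytes([1]) + b"row-data-%02d" % i + b"\x00" * 4
                    for i in range(4))

# Same row data everywhere, but block 2 has had its stamp bumped.
new_file = bytearray(old_file)
new_file[2 * BLOCK] = 3
new_file = bytes(new_file)

print(changed_blocks(old_file, new_file, block_size=BLOCK))
```

Only the block whose stamp changed is reported, even though the row
data in it is byte-for-byte the same as before.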

Cheers...
Bruce

Faheem Mitha wrote:
>
> On 9 Jun 2003, Ben Collins-Sussman wrote:
>
>
>>If you want more detail than this, you need to read the BDB docs at
>>sleepycat.com. :-)
>
>
> That is a very complete description. Thanks for taking the trouble to
> reply in detail. I suppose the database must log every action before it
> performs it. I did not consider the possibility of a journalling
> filesystem style setup, though it makes perfect sense now that I think
> about it, to protect against crashes etc. So the differences in the
> repository I saw were presumably due to the changed logs.
>
> Also, thanks to cmpilato for taking the trouble to reply to my earlier
> questions. Sorry to trouble you guys. :-)
>
> Faheem.
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: dev-help@subversion.tigris.org

Received on Thu Jun 12 17:56:23 2003