Re: How to safely back up an svn repository on ubuntu?

From: Pierre Fourès <pierre.foures_at_gmail.com>
Date: Thu, 14 Jan 2021 10:14:57 +0100

On Wed, 13 Jan 2021 at 23:25, Bo Berglund <bo.berglund_at_gmail.com> wrote:
>
> On Wed, 13 Jan 2021 12:58:59 +0100, Pierre Fourès <pierre.foures_at_gmail.com>
> wrote:
>
> Thanks for your comments Pierre! :)
>

You're welcome.

> So you mean I need to shut down the svn server first?
> I don't know how to do that since svn is kind of integrated with apache on my
> server...

This might look a bit harsh, but I basically shut down the apache
service, rsync the files, and then restart the apache service. Some
years ago, I finally admitted to myself that I can afford not to aim
for high availability, or even good availability, on this service. I
decided to aim for ease of setup, ease of maintenance and reliability
instead.

My current setup runs on an old containerized Debian that I will
upgrade next month, so there is no systemd yet, and it looks like this:
service apache2 stop && rsync params && service apache2 start
which will soon turn into:
systemctl stop apache2 && rsync params && systemctl start apache2

I use && so that the apache server is not restarted if something went
wrong. I have an external probe checking that apache is running on the
server. If apache doesn't restart for whatever reason, I get an email
(and optionally an SMS) telling me something bad happened.
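
A minimal sketch of this chaining, wrapped in a function so the three
steps can be swapped out. The variable names STOP_CMD, SYNC_CMD and
START_CMD, and the rsync paths and host in the defaults, are
hypothetical placeholders and not taken from my real script:

```shell
#!/bin/sh
# Hypothetical defaults; override them to match your own system.
STOP_CMD=${STOP_CMD:-"systemctl stop apache2"}
SYNC_CMD=${SYNC_CMD:-"rsync -avh /srv/svn/ backup@backup-host:/srv/svn-backup/"}
START_CMD=${START_CMD:-"systemctl start apache2"}

# Run the three steps as one chain: each step only runs if the
# previous one succeeded, so a failed rsync leaves apache stopped
# and the external probe is what raises the alarm.
run_backup() {
    $STOP_CMD && $SYNC_CMD && $START_CMD
}
```

Calling run_backup from the cron script returns a non-zero status as
soon as one step fails, and the later steps are skipped.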

This might go against what most would call a nice setup, but I find
it very anti-fragile and simple to handle. At least two things could
go wrong (the rsync and the restart), but because the steps are
chained, I only need to probe the end result. I aim for a fail-fast
and fail-clean approach. I tried to limit how many cogs are at play in
my system, and how many wrong paths it could go down. I will have to
think a little more about systemd, as there are more subtleties.

> OTOH I know exactly when the nightly svnsync is started on the source system to
> copy over the day's changes to the backup server, so I could choose another time
> to run my cron job.
> And this backup svn server on Ubuntu is not used for anything on the synced
> repositories except receiving the backup data.

It looks like you can afford the same simplicity.

>
> Concerning rsync, I have first tested what is happening when I use ordinary file
> system commands towards the nfs share on Synology...
>
> What I found was that the owner/group fields look strange to me.
> Here an example:
>
> $ ls -l ~/www/index.php
> -rw-rw-r-- 1 bosse bosse 177 Jan 13 21:32 index.php
>
> $ cp ~/www/index.php /nfs/backup/
>
> Then:
> $ ls -l /nfs/backup/
> -rw-rw-r-- 1 nobody 4294967294 177 Jan 13 23:04 index.php
>
> Notice how the file ownership has changed in this operation.
> Is rsync doing something else that keeps the ownership the same as on the
> source?
> Or have I set up the wrong kind of options on the mount command for the nfs
> share?

I'm really not an NFS expert, and in order to restrict even further
the cogs at play in my systems, I prefer to use a simple rsync over
ssh/scp instead of mounting an NFS share on the svn server. Basically
all my servers have ssh, and this saves me from having to install (and
understand) all the specifics of NFS. Moreover, I don't even have to
care where my remote server is, as the endpoint is more easily
reachable and the communication is encrypted over the wire. I
just have to put the svn-server public key in the authorized_keys of
the backup server, and manually accept the backup-server fingerprint
on the first rsync I run from the svn-server. Then I'm done setting up
"the share".

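The key setup described above could be sketched as follows, run once
on the svn server. The function name, the key path and the user@host
argument are hypothetical, and the choice of an ed25519 key without a
passphrase (so that cron can use it unattended) is an assumption, not
something my setup prescribes:

```shell
#!/bin/sh
# $1 = path of the private key to create (placeholder)
# $2 = user@backup-host (placeholder)
# First create a key pair with an empty passphrase, then append the
# public key to authorized_keys on the backup server. ssh will ask to
# accept the backup server's fingerprint on this first connection.
setup_backup_key() {
    ssh-keygen -t ed25519 -N "" -f "$1" &&
    ssh-copy-id -i "$1.pub" "$2"
}
```

After that, rsync can be pointed at the same key with something like
-e "ssh -i /path/to/key".
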
With this setup, and possibly a reverse rsync command to fetch the
files back onto the svn-server, I don't really mind the file
ownership. On the remote backup-server, the owner will be the user used
to connect through ssh/scp. On restore, you may configure rsync to set
precisely what you require but, if I recall correctly, the files will
basically get the user:group of whoever runs the rsync command.
If you don't run rsync as root, but as the user meant to own the data,
you should be fine. In our case, that user should be www-data. One easy
way to drop privileges for a command launched from a cron
script is to use two scripts: the launcher, run as root, which
in turn launches the unprivileged script through the command:
su -c "time bash $mypath/$myscript.sh" www-data 2>&1 > $log_file
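
A note on the redirection order in that command, since it is easy to
misread: with "2>&1 > file", stderr is pointed at the current stdout
first (the terminal, or cron's mail), and only then is stdout sent to
the file. The output of "time" is written to stderr, so it follows
the same route. The sketch below shows both orders; the function
names are made up for the demonstration:

```shell
#!/bin/sh
# Stand-in for the real backup script: one line on each stream.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Order used in the command above: stderr keeps going to the original
# stdout, and only stdout lands in the log file.
log_stdout_only() {
    emit 2>&1 > "$1"
}

# Reversed order: stdout goes to the file first, then stderr follows
# it, so both streams end up in the log file.
log_both() {
    emit > "$1" 2>&1
}
```

Which order you want depends on whether errors should reach cron's
mail or the log file. Also, on systems where www-data has a nologin
shell, su may need an explicit shell such as -s /bin/bash; that is an
assumption about your setup, not something I checked on yours.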

There might be many other ways to do this but, to me, this gets the
job done in a way that is very easy to remember, understand and
replicate. My unprivileged script isn't even set as executable. This
prevents me from calling it by mistake, as this script is only
meant to be run by cron (or through deliberate manual execution, like
on the first run).

My current rsync command looks like this:
rsync -avhW $path_to_svn_repos/ $user@$bkp_srv:$path_to_svn_bkp/$rolling_index/

I use the -W flag because I wanted to copy whole files only and not use
the delta algorithm, as the svn repository format is (mostly? fully?)
append-only. I had to check my notes and the man page to recall what
-W was. I guess I will remove this "premature optimisation" (and
extra cog) when I reinstall my svn infrastructure.
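
For the $rolling_index part of the destination, one possible scheme is
sketched below. It is only an illustration: the modulo-based rotation,
the function names and the sample values are assumptions, not the way
my own index is computed:

```shell
#!/bin/sh
# Hypothetical rolling scheme: keep a fixed number of slots and reuse
# them in turn.
# $1 = a day counter (for instance days since the epoch)
# $2 = number of slots to keep
rolling_index() {
    echo $(( $1 % $2 ))
}

# Build the rsync destination; every argument is a placeholder.
# $1 = user, $2 = backup host, $3 = base path, $4 = rolling index
backup_dest() {
    printf '%s@%s:%s/%s/\n' "$1" "$2" "$3" "$4"
}
```

For example, rolling_index "$(( $(date +%s) / 86400 ))" 7 would cycle
through seven slots, one per day, and its result can be passed to
backup_dest to get the last argument of the rsync command.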

> I don't think I will need a rolling scheme here...
>

Then you will have an even simpler and more anti-fragile solution than mine. ;)

Pierre.
Received on 2021-01-14 10:15:35 CET
