[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] svnadmin hotcopy, hot-backup.py race conditions fix

From: Philip Martin <philip_at_codematters.co.uk>
Date: 2003-09-10 01:44:05 CEST

"Vladimir Berezniker" <vladimir@berezniker.com> writes:

> <quote who="Philip Martin">
>> Vladimir Berezniker <vmpn@tigris.org> writes:
>>> * subversion/include/svn_error.h
>>> (SVN_ERR_POOL): Implemented macro to handle subversion errors in
> code that uses subpools.
>> Why is this necessary? If an error is returned then the place that
> handles the error is responsible for clearing the pool, this should take
> care of any subpools.
> I felt that since the subpool is allocated by the function, it should be
> destroyed but the time it is complete. Behavior of the caller is not
> known. Caller is not required to clear its pool if it receives an error.
> Caller could handle the returned error gracefully and continues execution
> without destroying its pool. As a result there will be memory allocated
> but not used.

But it's different from the rest of the Subversion, which makes it
harder to understand and maintain. It can only be justified if there
is a need for this bit of code to be different, and if such a need
exists it should be documented.

On the other hand, if SVN_ERR_POOL is just a good idea in general then
it should be applied to all the Subversion subpool code, not just to
one particular bit. If that is the case it should be a separate
patch, it has no place in this patch.

>> Odd whitespace, this appears in quite a few places.
> I just checked my original email and it seems to have correct whitespace
> when I view it. Maybe its my email client. I'll look more into this.

It appears that the context lines all have an extra leading space. I
see you are using the mail header

Content-Type: text/plain; charset=us-ascii; format=flowed

Is the format=flowed stuff responsible?

>>> + svn_error_t * err = svn_io_files_contents_same_p(&files_match,
> +
> live_log_path,
>>> +
>>> backup_log_path,
>>> + sub_pool); +
> if (err) {
>>> + svn_error_clear(err);
>>> + files_match = FALSE;
>> I'd like a bit of explanation here. Why is it acceptable to ignore
> errors?
> A case when unused log cannot be compared to its copy did not seem to
> qualify as fatal error. A recovery is executed on a copy to verify that it
> is working state. Main case was when there is a new unused log after the
> logs were copied, but before log cleaning is done. I beleive in this case
> an error will be returned indicating that one of the files is missing.
> Maybe a warrning should be issued to a client in such case? How would one
> do that?
> Here is the comment from the updated patch:
> /* If for some reason we failed to compare log files.
> Say file was not copied or unreadable, behave as if log file
> was not copied and keep it. */

I'd prefer it if the code did not need to ignore errors here, perhaps
the algorithm should change. How about keeping a record of the log
files that get copied, then the code could avoid attempting a
comparison of a log file that was not copied.

> [snip code performing unused log deletion]
>> Does this assume that the log files are always located directly within
> the db/ directory? BDB has set_lg_dir that allows a DB admin to change that.
> [snip code doing database and log copying]
>> This appears to assume that the data files are located directly in the
> db/ dir. BDB has set_data_dir that allows the DB admin to change the
> location.
> The answer to both questions is *yes*. However, the code is very easy to
> change to accomodate custom directories. There is one *catch* though.
> BDB does not seem to have either get_lg_dir or get_data_dir, so there is
> no way for the code to know where the logs and data are.

For log files the log_archive function can return absolute paths, that
should handle set_lg_dir. There doesn't appear to be a solution for
the data files.

>>> + if(strlen(path) == ctx->src_len)
>>> + {
>>> + sub_path = &"";
>> Huh? Do you mean sub_path = ""? (I have to guess since path isn't
> documented ;)
> Is there any difference between sub_path = "" and sub_path = &""? Since
> sub_path is a pointer are the two statements not equivalent?

For &"" gcc gives "assignment from incompatible pointer type".

>>> + SVN_ERR(svn_io_check_path(lock_file, &kind, pool));
>>> +
>>> + /* Try to create a lock file if it is missing.
>>> + Ignore creation errors just in case file got created by another
> process
>>> + while we were checking. */
>>> + if(kind == svn_node_none) {
>>> + svn_error_t * err = create_db_logs_lock(repos, pool);
>>> + if(err) {
>>> + svn_error_clear(err);
>>> + }
>> There is a race between svn_io_check_path and create_db_logs
>> _lock.
>> Since you are ignoring errors, why bother with svn_io_check_path at all?
> The db logs lock file will get created once and only for repositories
> created before this patch. So I threw in an optimization that will stop
> the application from recreating an existing lock file every time its run.
> In case of a race when lock file is missing, one of the processes will
> create a lock file and the other(s) will lock it. I do not see anything
> bad happening there.

First, there is no need for "optimization" here, and second, it's not
much of an optimization anyway. Calling svn_io_check_path, which is
essentially stat(), to avoid making an open() call (a call that will
fail as the file exists) doesn't save much, both are system calls that
involve filesystem IO. All you have really done is make the code more
complicated than it needs to be.

Philip Martin
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Wed Sep 10 01:44:57 2003

This is an archived mail posted to the Subversion Dev mailing list.