[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] svnadmin hotcopy, hot-backup.py race conditions fix

From: Vladimir Berezniker <vmpn_at_tigris.org>
Date: 2003-09-10 17:22:34 CEST

<quote who="Philip Martin">
> "Vladimir Berezniker" <vladimir@berezniker.com> writes:
>
>> <quote who="Philip Martin">
>>> Vladimir Berezniker <vmpn@tigris.org> writes:
>>>> * subversion/include/svn_error.h
>>>> (SVN_ERR_POOL): Implemented macro to handle subversion errors in
>> code that uses subpools.
>>> Why is this necessary? If an error is returned then the place that
>> handles the error is responsible for clearing the pool, this should
take
>> care of any subpools.
>> I felt that since the subpool is allocated by the function, it should
be
>> destroyed but the time it is complete. Behavior of the caller is not
known. Caller is not required to clear its pool if it receives an error.
>> Caller could handle the returned error gracefully and continues
execution
>> without destroying its pool. As a result there will be memory
allocated
>> but not used.
>
> But it's different from the rest of the Subversion, which makes it
harder to understand and maintain. It can only be justified if there is a
need for this bit of code to be different, and if such a need exists it
should be documented.
>
> On the other hand, if SVN_ERR_POOL is just a good idea in general then
it should be applied to all the Subversion subpool code, not just to one
particular bit. If that is the case it should be a separate patch, it has
no place in this patch.
>
Removed.

>>> Odd whitespace, this appears in quite a few places.
>> I just checked my original email and it seems to have correct
whitespace
>> when I view it. Maybe its my email client. I'll look more into this.
>
> It appears that the context lines all have an extra leading space. I
see you are using the mail header
>
> Content-Type: text/plain; charset=us-ascii; format=flowed
>
> Is the format=flowed stuff responsible?
>
Based on my quick research looks like it has something to do with line
breaks and last character on the line being space. I will try using my
webmail client to send the next patch, to see if it will do a better job.

>>>> + svn_error_t * err =
>>>> svn_io_files_contents_same_p(&files_match,
>> +
>> live_log_path,
>>>> +
>>>> backup_log_path,
>>>> + sub_pool);
+
>> if (err) {
>>>> + svn_error_clear(err);
>>>> + files_match = FALSE;
>>> I'd like a bit of explanation here. Why is it acceptable to ignore
>> errors?
>> A case when unused log cannot be compared to its copy did not seem to
qualify as fatal error. A recovery is executed on a copy to verify that it
>> is working state. Main case was when there is a new unused log after
the
>> logs were copied, but before log cleaning is done. I beleive in this
case
>> an error will be returned indicating that one of the files is missing.
Maybe a warrning should be issued to a client in such case? How would one
>> do that?
>> Here is the comment from the updated patch:
>> /* If for some reason we failed to compare log files.
>> Say file was not copied or unreadable, behave as if log file
>> was not copied and keep it. */
>
> I'd prefer it if the code did not need to ignore errors here, perhaps
the algorithm should change. How about keeping a record of the log files
that get copied, then the code could avoid attempting a
> comparison of a log file that was not copied.
>
Changed the code to call svn_io_check_path() to check if the copy of a log
file exists and to no longer ignore the errors returned by
svn_io_files_contents_same_p(). The reason I did not create a list of
copied logs is to make svn_fs__clean_logs() independent of
svn_fs_hotcopy_berkeley().

>> [snip code performing unused log deletion]
>>> Does this assume that the log files are always located directly within
>> the db/ directory? BDB has set_lg_dir that allows a DB admin to change
that.
>> [snip code doing database and log copying]
>>> This appears to assume that the data files are located directly in the
>> db/ dir. BDB has set_data_dir that allows the DB admin to change the
location.
>> The answer to both questions is *yes*. However, the code is very easy
to
>> change to accomodate custom directories. There is one *catch* though.
BDB does not seem to have either get_lg_dir or get_data_dir, so there is
>> no way for the code to know where the logs and data are.
>
> For log files the log_archive function can return absolute paths, that
should handle set_lg_dir. There doesn't appear to be a solution for the
data files.
>
I am aware of such log_archive capability, however I see the following
problems with that approach. If I can only accommodate log files and not
data files I would prefer no to do either. But even if there was a
function for getting paths for data files, there exists one more problem.
 If custom paths are specified where should the BDB files be placed in a
copy. My feeling was to put them in default place in /db, while allowing
user to optionally specify where they want the logs and data files for the
copy to be placed. This brings up the need for the copy command to alter
DB_CONFIG file to update the location of the files to their new values.
So I offer another alternative, to enhance subversion to allow
specification of the alternative paths through a /db/config file. I will
be more than willing to create a separate patch to add such
functionality.

>>>> + if(strlen(path) == ctx->src_len)
>>>> + {
>>>> + sub_path = &"";
>>> Huh? Do you mean sub_path = ""? (I have to guess since path isn't
>> documented ;)
>> Is there any difference between sub_path = "" and sub_path = &""? Since
sub_path is a pointer are the two statements not equivalent?
>
> For &"" gcc gives "assignment from incompatible pointer type".
>
Fixed. Brane &#268;ibej was kind enough to explain to me what exactly I
was doing wrong. MS VC6 compiled that code without a single warning, go
figure.

>>>> + SVN_ERR(svn_io_check_path(lock_file, &kind, pool));
>>>> +
>>>> + /* Try to create a lock file if it is missing.
>>>> + Ignore creation errors just in case file got created by another
>> process
>>>> + while we were checking. */
>>>> + if(kind == svn_node_none) {
>>>> + svn_error_t * err = create_db_logs_lock(repos, pool);
>>>> + if(err) {
>>>> + svn_error_clear(err);
>>>> + }
>>> There is a race between svn_io_check_path and create_db_logs
>>> _lock.
>>> Since you are ignoring errors, why bother with svn_io_check_path at
all?
>> The db logs lock file will get created once and only for repositories
created before this patch. So I threw in an optimization that will stop
the application from recreating an existing lock file every time its run.
>> In case of a race when lock file is missing, one of the processes will
create a lock file and the other(s) will lock it. I do not see
anything
>> bad happening there.
>
> First, there is no need for "optimization" here, and second, it's not
much of an optimization anyway. Calling svn_io_check_path, which is
essentially stat(), to avoid making an open() call (a call that will fail
as the file exists) doesn't save much, both are system calls that involve
filesystem IO. All you have really done is make the code more complicated
than it needs to be.
Point taken. Fixed.
>
> --
> Philip Martin
>

I just need to test the changes made. Meanwhile I'll be watching for your
response on BDB logs and data file locations.

Sincerely,
Vladimir Berezniker

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Wed Sep 10 17:23:23 2003

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.