Daniel, thanks so much!  This is the sort of work Subversion really
needs on the repository side...
We'll start looking at these asap; for future reference, it will help
if you can supply a log message for the change (see the HACKING file
for guidelines on this and other stuff).
Thanks,
-Karl
Daniel Berlin <dan@dberlin.org> writes:
> Gee, it looked formatted okay in Emacs (I used svn-dev.el and all).
> I blame something else.
> 
> Anyway, this patch adds a thread that auto-removes log files every minute.
> This is the right interval for my use, which currently consists of lots of
> large CVS repository conversions (done as repeated commits, rather than
> imports).
> 
> The volume of log data depends more on the number of revisions than on
> the size of the repository. In this case, it's a 16 meg CVS repository,
> but it's got 2056 separate "revisions" in it.
> This generates 99 log files (989 meg):
> 
> 989MB 754KB 745B        Log bytes written (772841 bytes).
> 0       Log bytes written since last checkpoint.
> 40500   Total log file writes.
> 899     Total log file write due to overflow.
> 40021   Total log file flushes.
> 99      Current log file number.
> 10388385        Current log file offset.
> 99      On-disk log file number.
> 10388385        On-disk log file offset.
> 1       Max commits in a log flush.
> 1       Min commits in a log flush.
> 36435   Number of log flushes containing a transaction commit.
> 
> 
> The example I took it from in the Berkeley DB docs had the sleep time at 5
> minutes, just for reference.
> 
> This patch also raises the log file buffer from 32k to 256k to improve
> throughput. This was based on statistics showing the 32k buffer was
> impeding performance somewhat. With it at 256k, we don't overflow before
> a commit anywhere near as often (the above is from a 256k log file
> buffer; a 32k buffer would show that we were consistently filling the
> log file buffer multiple times per commit).
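
For a rough sense of how often the 256k buffer still overflows, the two
counters in the statistics above (899 overflow writes out of 40500 total
log writes) can be turned into a ratio. This is only a minimal sketch in
plain C; the function names are made up for illustration and nothing here
calls Berkeley DB.

```c
#include <assert.h>
#include <stdio.h>

/* Fraction of log writes that were forced by a full log buffer.
   Both counters are taken by hand from the log statistics output;
   the function names are hypothetical.  */
static double
overflow_fraction (unsigned long overflow_writes, unsigned long total_writes)
{
  assert (total_writes > 0);
  return (double) overflow_writes / (double) total_writes;
}

/* Print the ratio as a percentage and return the fraction.  */
static double
report_overflow (unsigned long overflow_writes, unsigned long total_writes)
{
  double f = overflow_fraction (overflow_writes, total_writes);

  printf ("%lu of %lu log writes (%.1f%%) were due to overflow\n",
          overflow_writes, total_writes, 100.0 * f);
  return f;
}
```

Calling report_overflow (899, 40500) with the numbers above gives about
2.2%, i.e. most writes are no longer caused by the buffer filling up.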
> 
> The removal thread exits when the fs->env variable whose address it was
> passed goes NULL, when it can't remove a log file, or when the
> log_archive call returns an error.
> 
> The only things changed from the version in the manual are the timeout,
> the formatting, and the check for when the fs->env we were passed goes
> NULL.
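
The removal step described above can be sketched in isolation. This is a
hedged variant, not the code in the patch: the helper name
remove_listed_files is hypothetical, it assumes the list is a
NULL-terminated array in a single malloc'd block (which is how the patch
treats the result of log_archive, freeing only the base pointer), and it
assumes a POSIX-style remove() that sets errno on failure. The two
differences from the patch are that the list is freed on every path and
that the reported error code is taken from errno.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the patch.  LIST is a NULL-terminated
   array of paths held in one malloc'd block.  Remove each file in turn,
   stop at the first failure, and free LIST on every path.
   Returns 0 on success, or a nonzero errno-style code on failure.  */
static int
remove_listed_files (char **list)
{
  char **p;
  int err = 0;

  if (list == NULL)
    return 0;

  for (p = list; *p != NULL; ++p)
    if (remove (*p) != 0)
      {
        /* Assumption: remove() set errno; fall back to EIO if not.  */
        err = (errno != 0) ? errno : EIO;
        fprintf (stderr, "remove %s: %s\n", *p, strerror (err));
        break;
      }

  /* Free only the base pointer; the entries live in the same block.  */
  free (list);
  return err;
}
```

A caller in the thread loop could treat a nonzero return the same way the
patch treats a failed remove, by ending the thread.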
> 
> I've assumed the Berkeley guys knew what they were doing when they wrote
> the example.
> :)
> 
> 
> I think this enhancement might have been on the bite-sized task list; I'm
> not sure.
> 
> 
> Index: ./fs.c
> ===================================================================
> --- ./fs.c
> +++ ./fs.c	Mon Feb  4 11:06:08 2002
> @@ -38,6 +38,42 @@
>  #include "svn_private_config.h"
>  
>  
> +static void *
> +logfile_thread (void *arg)
> +{
> +  DB_ENV **argenv;
> +  DB_ENV *dbenv;
> +  int ret;
> +  char **begin, **list;
> +  argenv = arg;
> +  dbenv = *argenv;
> +  
> +  /* Check once every minute. */
> +  for (;; sleep(60)) 
> +    {
> +      if (!*argenv)
> +        pthread_exit(NULL);
> +      /* Get the list of log files. */
> +      if ((ret = dbenv->log_archive(dbenv, &list, DB_ARCH_ABS)) != 0) 
> +        {
> +          dbenv->err(dbenv, ret, "DB_ENV->log_archive");
> +          pthread_exit (NULL);
> +        }
> +      
> +      /* Remove the log files. */
> +      if (list != NULL) 
> +        {
> +          for (begin = list; *list != NULL; ++list)
> +            if ((ret = remove(*list)) != 0) 
> +              {
> +                dbenv->err(dbenv, ret, "remove %s", *list);
> +                pthread_exit (NULL);
> +              }
> +          free (begin);
> +        }
> +    }
> +}
> +
>  /* Checking for return values, and reporting errors.  */
>  
>  
> @@ -355,7 +393,11 @@
>       from those participating in the deadlock.  */
>    SVN_ERR (DB_WRAP (fs, "setting deadlock detection policy",
>                      fs->env->set_lk_detect (fs->env, DB_LOCK_RANDOM)));
> -
> +  /* For our purposes, 32k is too small of a log file buffer.
> +     Kick it up to 256k to increase throughput.  */
> +  SVN_ERR (DB_WRAP (fs, "setting log file buffer size",
> +		    fs->env->set_lg_bsize (fs->env, 256 * 1024)));
> +  
>    return SVN_NO_ERROR;
>  }
>  
> @@ -441,8 +483,9 @@
>  svn_error_t *
>  svn_fs_open_berkeley (svn_fs_t *fs, const char *path)
>  {
> +  pthread_t ptid;
>    svn_error_t *svn_err;
> -
> +  int ret;
>    SVN_ERR (check_already_open (fs));
>  
>    /* Initialize paths. */
> @@ -459,8 +502,12 @@
>                                       | DB_INIT_MPOOL
>                                       | DB_INIT_TXN),
>                                      0666));
> +  if (svn_err) goto error;  
> +  ret = pthread_create (&ptid, NULL, logfile_thread, (void *)(&fs->env));
> +  if (ret != 0)
> +    svn_err = svn_error_createf (SVN_ERR_BERKELEY_DB, 0, 0, fs->pool, "spawning log file removal thread");
>    if (svn_err) goto error;
> -
> +  
>    /* Open the various databases.  */
>    svn_err = DB_WRAP (fs, "opening `nodes' table",
>                       svn_fs__open_nodes_table (&fs->nodes, fs->env, 0));
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sat Oct 21 14:37:04 2006