
Re: python churning on import...

From: Nathan Kidd <nathan-svn_at_spicycrypto.ca>
Date: 2006-04-03 22:39:11 CEST

Peter Yamamoto wrote:
> A user accidentally imported a few files and folders to the root of our
> repository (e.g. beside trunk, tags, branches instead of under the trunk)...
> the thing is that the svnservice had a python process churning on the
> rev path for over half an hour. A sampling from filemon is shown below.
> The user was using RapidSVN.
>
> I've had checkins take a long time but the checkins were large checkins.
>
> But this was a small "add"... is this file access normal?

If python is running then you must have some custom pre/post commit
hooks enabled. By default Subversion doesn't run any python process.

You need to find out which hook script is causing the problem, but based
on the symptoms you described my guess is that it is a script derived
from check-case-insensitive.py. There are several versions floating
around that recursively walk the tree from the top-most changed path in
a commit, so an import at the repository root makes them walk the entire
repository.

Attached is the case-insensitive.py script I use which has much better
performance because it doesn't do the recursive 'ls'. It is modified
from the file of the same name in the trunk.
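
To illustrate why skipping the recursive walk helps, here is a standalone
sketch of the lookup idea in plain Python 3, without the Subversion
bindings. The names find_clashes and list_dir are hypothetical stand-ins,
not part of the attached script: list_dir is assumed to return the entry
names of one directory as they would look after the commit. Only the
parent directory of each new path is listed, and each directory is listed
at most once.

```python
def canonicalize(name):
    # Same simple rule the hook uses: lower-case the name.
    return name.lower()


def find_clashes(new_paths, list_dir):
    """Return {canonical_path: sorted list of clashing paths}.

    new_paths: repository paths such as '/trunk/README'.
    list_dir:  hypothetical callback returning the entry names of one
               directory, including entries added by the commit.
    """
    index = {}    # dir -> {canonical name -> [pristine names]}
    clashes = {}
    for path in new_paths:
        parent, _, name = path.rpartition('/')
        parent = parent or '/'
        if parent not in index:
            # List this directory once; no recursion into subdirectories.
            by_canon = index[parent] = {}
            for entry in list_dir(parent):
                by_canon.setdefault(canonicalize(entry), []).append(entry)
        names = index[parent].get(canonicalize(name), [])
        if any(n != name for n in names):
            prefix = '' if parent == '/' else parent
            key = prefix + '/' + canonicalize(name)
            clashes[key] = sorted(prefix + '/' + n for n in set(names) | {name})
    return clashes


# Made-up example: someone adds 'Trunk' next to an existing 'trunk'.
tree = {'/': ['trunk', 'tags', 'branches', 'Trunk']}
listed = []

def list_dir(directory):
    listed.append(directory)
    return tree[directory]

print(find_clashes(['/Trunk'], list_dir))  # {'/trunk': ['/Trunk', '/trunk']}
print(listed)                              # ['/']
```

The cost is proportional to the size of the directories that actually
received new entries, not to the size of the tree below them.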

-Nathan

P.S. the history is below in case you wonder what I changed.

Revision: 198 Author: nathan.kidd Date: 6:22:44 PM, January 31, 2006
Message:
[svn] case-insensitive.py improvements
  * encode output as utf-8 so tortoise can properly display output
    (encoding as 'native' meant tortoise showed ????)
  * the command line shows _____ regardless.

----
Modified : /trunk/subversion/hooks/case-insensitive.py
Revision: 197 Author: nathan.kidd Date: 6:15:51 PM, January 31, 2006
Message:
[svn] case-insensitive.py improvements
  * store directory entries by canonicalized filename hash so we can
    later access them directly instead of walking array of all filenames
    in directory
  * output which filename(s) are (existing) and (new)
  * variable/function name clarification
  * general cleanup
----
Modified : /trunk/subversion/hooks/case-insensitive.py
Revision: 196 Author: nathan.kidd Date: 4:03:13 PM, January 30, 2006
Message:
case-insensitive.py ported to use old SVN 1.2.3 Python 2.3.5 bindings
  * with references from http://www.thehirts.net/blog/?p=8
  * and check-case-insensitive.py
  * Not using proper core.run_app() and classes. Future TODO. (Code
    cleanness/maintainability issue)
----
Modified : /trunk/subversion/hooks/case-insensitive.py
Modified : /trunk/subversion/hooks/pre-commit.cmd

#!/usr/bin/python

# A pre-commit hook to detect case-insensitive filename clashes.
#
# What this script does:
# - Detects new paths that 'clash' with existing, or other new, paths.
# - Ignores existing paths that already 'clash'
# - Exits with an error code, and a diagnostic on stderr, if 'clashes'
# are detected.
#
# How it does it:
# - Get a list of changed paths.
# - From that list extract the new paths that represent adds or replaces.
# - For each new path:
#   - Split the path into a directory and a name.
#   - Get the names of all the entries in the version of the directory
#     within the txn. Store each filename with its canonical name as
#     the hash key, and value is list with pristine filename appended.
#   - Compare the canonical new name with each canonical entry name.
#   - If the canonical names match and the pristine names do not match
#     then we have a 'clash'.
#
# Notes:
# - All the paths from the Subversion filesystem bindings are encoded
# in UTF-8 and the separator is '/' on all OS's.
# - The canonical form determines what constitutes a 'clash', at present
# a simple 'lower case' is used. That's probably not identical to the
# behaviour of Windows or OSX, but it might be good enough.
# - Hooks get invoked with an empty environment so this script explicitly
# sets a locale; make sure it is a sensible value.

import sys, locale

#sys.path.append('/usr/local/subversion/lib/svn-python')
SVNLIB_DIR = r"C:/Program Files/Python2.3.5/lib/libsvn"

if SVNLIB_DIR:
  sys.path.insert(0, SVNLIB_DIR)

from svn import core, repos, fs
locale.setlocale(locale.LC_ALL, 'C')

def canonicalize(path):
  return path.decode('utf-8').lower().encode('utf-8')

def get_new_paths(txn_root):
  new_paths = []
  for path, change in fs.paths_changed(txn_root, pool).iteritems():
    if (change.change_kind == fs.path_change_add
        or change.change_kind == fs.path_change_replace):
      new_paths.append(path)
  return new_paths

def split_path(path):
  slash = path.rindex('/')
  if (slash == 0):
    return '/', path[1:]
  return path[:slash], path[slash+1:]

def join_path(dir, name):
  if (dir == '/'):
    return '/' + name
  return dir + '/' + name

def get_repo_dir_entries(path, my_dir_entries, txn_root):
  # Index the directory once: canonical name -> list of pristine names.
  entries = my_dir_entries[path] = {}
  for name in fs.dir_entries(txn_root, path, pool).keys():
    entries.setdefault(canonicalize(name), []).append(name)

# --- Start ---

repo_dir_entries = {} # {dir}{canonicalname}[filenames]
clashes = {} # {dir+canonicalname}{clashedpathnames}

native = locale.getlocale()[1]
if not native: native = 'ascii'

core.apr_initialize()
pool = core.svn_pool_create(None)

repo_path = sys.argv[1].decode(native).encode('utf-8')
txn_name = sys.argv[2].decode(native).encode('utf-8')

repos_handle = repos.open(repo_path, pool)
fs_handle = repos.fs(repos_handle)
txn_handle = fs.open_txn(fs_handle, txn_name, pool)
txn_root = fs.txn_root(txn_handle, pool)

new_paths = get_new_paths(txn_root)
for path in new_paths:
  dir, new_name = split_path(path)
  canonical_name = canonicalize(new_name)
  if (not repo_dir_entries.has_key(dir)):
    get_repo_dir_entries(dir, repo_dir_entries, txn_root)
  for existing_name in repo_dir_entries[dir][canonical_name]:
    if (existing_name != new_name):
      canonical_path = join_path(dir, canonical_name)
      if (not clashes.has_key(canonical_path)):
        clashes[canonical_path] = {}
        clashes[canonical_path]["existing"] = []
      clashes[canonical_path]["new"] = join_path(dir, new_name)
      clashes[canonical_path]["existing"].append(join_path(dir, existing_name))

exitvalue = 0

if (clashes):
  exitvalue = 1
  for canonical_path in clashes.iterkeys():
    output = u'Clash:\n'
    for path in clashes[canonical_path]["existing"]:
      output += u'\'' + str(path).decode('utf-8') + u'\' (existing)\n'
    output += u'\'' + str(clashes[canonical_path]["new"]).decode('utf-8') + u'\' (new)\n'
    sys.stderr.write (output.encode('utf-8', 'replace'))

core.svn_pool_destroy(pool)
core.apr_terminate()

sys.exit(exitvalue)
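
The notes at the top of the script say that a plain lower-case is probably
not identical to what Windows or OS X do. As a rough illustration only, the
Python 3 sketch below compares that rule with one possible stricter
canonical form (Unicode NFC normalization followed by casefold()). The
function names are hypothetical, the stricter form is an assumption for
comparison, and neither is claimed to match the exact rules of any
particular filesystem.

```python
import unicodedata


def canonical_lower(name):
    # The rule used by the hook script: plain lower-casing.
    return name.lower()


def canonical_folded(name):
    # Assumed alternative: normalize to NFC, then apply full case folding.
    return unicodedata.normalize('NFC', name).casefold()


# Precomposed 'e with acute' versus 'e' followed by a combining accent.
composed = 'caf\u00e9.txt'
decomposed = 'cafe\u0301.txt'

for a, b in [('README', 'ReadMe'),
             ('Stra\u00dfe', 'STRASSE'),
             (composed, decomposed)]:
    print(ascii(a), ascii(b),
          canonical_lower(a) == canonical_lower(b),
          canonical_folded(a) == canonical_folded(b))
```

Only the first pair is treated as a clash by plain lower-casing; the
stricter form treats all three pairs as clashes.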

Received on Mon Apr 3 22:39:09 2006

This is an archived mail posted to the Subversion Users mailing list.
