
Re: svn commit: rev 5750 - branches/cvs2svn-kfogel/tools/cvs2svn

From: Daniel Berlin <dberlin_at_dberlin.org>
Date: 2003-04-29 04:57:37 CEST

>
>> ...
>> (If you have any suggestions for a particularly Python-ish way to do
>> it, they are welcome of course. I was just going to use `anydbm' and
>> come up with some simple representation, perhaps involving pickling.)
>
> anydbm should be fine. 'marshal' is just fine and very fast, if you stick
> with native Python datatypes (list, dict, string, integer, etc). You only
> need Pickle if you need to serialize class instances.
or if things are shared, or you have a cyclic structure.

> My understanding is
> that cPickle is nearly as fast as marshal, but I doubt you'll need much
> beyond a "dictionary" that looks like:
>
> { "path1": None, "path2": None }
>
> And you just test with .has_key(path). (I used None in the example cuz you
> don't need a value(?); just the key)
>
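A minimal sketch of the "dictionary of paths" idea backed by a DBM file, written for current Python 3, where `anydbm` was renamed to `dbm`. The helper names `record_paths` and `has_path`, the sample paths, and the temporary file location are illustrative assumptions and are not taken from cvs2svn. DBM values must be bytes or strings, so an empty byte string stands in for `None`.

```python
import dbm
import os
import tempfile


def record_paths(db_file, paths):
    """Store each path as a key; the value is an unused placeholder."""
    db = dbm.open(db_file, "c")  # "c": create the database if it is missing
    try:
        for path in paths:
            db[path.encode("utf-8")] = b""
    finally:
        db.close()


def has_path(db_file, path):
    """Membership test, the modern spelling of .has_key(path)."""
    db = dbm.open(db_file, "r")
    try:
        return path.encode("utf-8") in db
    finally:
        db.close()


# Hypothetical usage with made-up paths in a scratch directory.
workdir = tempfile.mkdtemp()
db_file = os.path.join(workdir, "paths")
record_paths(db_file, ["trunk/foo.c", "trunk/bar.c"])
```

Only the keys carry information here, so nothing needs to be pickled or marshalled at all for this particular use.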
If you use marshal, just make sure you don't have cyclic data structures,
and don't need shared instances to still be shared when you unmarshal.

From the top of marshal:
"/* Write Python objects to files and read them back.
    This is intended for writing and reading compiled Python code only;
    a true persistent storage facility would be much harder, since
    it would have to take circular links and sharing into account. */"
Just to prove the comment is valid:
[dberlin@dberlin Python]$ python
Python 2.2.2 (#1, Feb 24 2003, 19:13:11)
[GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import marshal
>>> a={}
>>> b={}
>>> a["a"]=b
>>> b["a"]=a
>>> marshal.dumps(a)
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
ValueError: object too deeply nested to marshal
>>> marshal.dumps(b)
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
ValueError: object too deeply nested to marshal
>>> import pickle
>>> pickle.dumps(a)
"(dp0\nS'a'\np1\n(dp2\ng1\ng0\nss."
>>> pickle.dumps(b)
"(dp0\nS'a'\np1\n(dp2\ng1\ng0\nss."
>>>
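
The pickle half of that session can be checked as a script. This is a sketch for current Python 3, not for the 2.2 interpreter shown above; the variable names are illustrative. The marshal half is left out on purpose, since its behaviour depends on the interpreter and marshal format version.

```python
import pickle

# Two dicts that refer to each other, as in the session above.
a = {}
b = {}
a["a"] = b
b["a"] = a

# pickle keeps a memo of objects it has already written, so the cycle
# survives the round trip instead of recursing forever.
restored = pickle.loads(pickle.dumps(a))
cycle_preserved = restored["a"]["a"] is restored

# The same memo keeps shared objects shared after loading.
shared = ["x"]
container = {"first": shared, "second": shared}
restored_container = pickle.loads(pickle.dumps(container))
sharing_preserved = restored_container["first"] is restored_container["second"]
```

Both flags come out true: the restored structure points back at itself, and the two keys still refer to one list object rather than two equal copies.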
> Be wary of anydbm. If that defaults to dumbdbm, then you're going to end up
> with the index in memory. Kaboom! Suggestion:
>
> import sys
> import anydbm
>
> if anydbm._defaultmod.__name__ == 'dumbdbm':
>     print 'ERROR: your installation of Python does not contain a proper'
>     print ' DBM module. This script cannot continue.'
>     print ' to solve: see blah blah blah'
>     sys.exit(1)
>
> On my RH 7.2 installation, Python has a BDB module. I'd expect that you'll
> be fine with the DBM restriction.
Python has multiple BDB modules, last I looked. :)
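
For readers on Python 3, here is one possible way to do a similar check without touching the private `_defaultmod` attribute: try to import the disk-backed backends directly and refuse to continue if none is available. The function name `find_disk_backed_dbm` and the candidate list are assumptions made for this sketch; which backends exist depends on how the interpreter was built.

```python
import importlib

# Backends that keep their index on disk; dbm.dumb is deliberately left out.
CANDIDATE_BACKENDS = ("dbm.gnu", "dbm.ndbm")


def find_disk_backed_dbm(candidates=CANDIDATE_BACKENDS):
    """Return the name of the first importable backend, or None."""
    for name in candidates:
        try:
            importlib.import_module(name)
        except ImportError:
            continue
        return name
    return None
```

A script would call `find_disk_backed_dbm()` at startup and exit with an error message when it returns `None`, which mirrors the intent of the quoted check.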

>
> Cheers,
> -g
>
> --
> Greg Stein, http://www.lyra.org/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: dev-help@subversion.tigris.org
>

Received on Tue Apr 29 04:58:26 2003

This is an archived mail posted to the Subversion Dev mailing list.