Before I weigh in with my newbie thoughts, I'll introduce myself. My name
is Scott Reedstrom, and I'm not really a programmer, I just play one at
work. Actually, I'm an EE doing IC design for implantable medical devices.
I've also done IC design and C coding for networking devices. The dirty
little secret for us IC designers is that in this day and age, virtually all
IC designs are done in software and compiled into hardware, so we actually
are primarily programmers. I've been using CVS and PRCS for controlling our
data for years. In this world, our controlled data ranges from short (500
lines) source files to 900 Meg layout data files, so I've managed to stress
a few systems to the breaking point. I've also spent the last two years
developing and rolling out a data management system for our IC data using a
TCL layer over PRCS, so I've had quite a bit of feedback on what does and
doesn't work from a non-programmer user perspective. I'm planning on
helping out, but I'll start by doing a little lurking. But before I lurk,
some thoughts on metadata...

The location of the metadata is a real thorny problem, and one that relates
to the user and use model. The argument for local CVSish directories comes
primarily from the sophisticated programmer who wants all the data
available and wants it now! On the other hand, having the metadata so close
to the file data makes it pretty easy for a less sophisticated user to go in
and mung it up. The ability to copy out a subdirectory to create a new
workspace is only useful if the system is too slow to make this an easy,
normal operation -- it really is a workaround for a slow system, not a good
design goal. In addition, messing up the metadata causes _very_ unexpected
results (I speak from experience...). I'm of the monkey-proof school, and
would argue for making the server keep the metadata. This has several
benefits: It simplifies the clients by pushing the complexity into the
server, where it seems the complexity belongs. It keeps the metadata out of
the ordinary operations. It keeps tools that store data as directories
instead of files (of which there are many in IC design) from getting
confused by spurious data. Most important, it cleaves off the
metadata format from the user space, allowing the operations that use the
metadata to be written to an API and separated from the operating system.

The downsides are speed and scalability. Well designed, the metadata does
not have to be large, and most of the metadata is of a shared nature. A
thousand users of a project don't create a thousand copies of every file.
And modern networks mitigate the speed issues. In fact, having the metadata
at the server makes it easy to answer the metaquestions that always seem to
come up -- who has what, whose data is up to date, etc.
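
To make the server-side idea a bit more concrete, here is a minimal sketch
in Python. It assumes the server tracks only two things: the latest revision
of each path, and which revision each user's workspace holds. Every name in
it (MetadataServer, commit, update, who_has, stale) is hypothetical and made
up for illustration; none of it is an existing CVS, PRCS, or Subversion
interface.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataServer:
    """Hypothetical server-side metadata store; clients hold file data only."""
    head: dict = field(default_factory=dict)        # path -> latest revision number
    workspaces: dict = field(default_factory=dict)  # user -> {path: revision held}

    def commit(self, user, path):
        """Record a new revision of path and mark the committer as current."""
        rev = self.head.get(path, 0) + 1
        self.head[path] = rev
        self.workspaces.setdefault(user, {})[path] = rev
        return rev

    def update(self, user, path):
        """Record that the user's workspace now holds the latest revision."""
        rev = self.head[path]
        self.workspaces.setdefault(user, {})[path] = rev
        return rev

    def who_has(self, path):
        """Answer 'who has what': users whose workspace contains path."""
        return sorted(u for u, files in self.workspaces.items() if path in files)

    def stale(self, user):
        """Answer 'whose data is up to date': paths the user holds at an old revision."""
        held = self.workspaces.get(user, {})
        return sorted(p for p, rev in held.items() if rev < self.head[p])


server = MetadataServer()
server.commit("alice", "chip/top.v")   # alice creates revision 1
server.update("bob", "chip/top.v")     # bob fetches revision 1
server.commit("alice", "chip/top.v")   # alice commits revision 2
holders = server.who_has("chip/top.v")
behind = server.stale("bob")
```

Because the bookkeeping lives in one place, both queries are simple lookups
on the server, and nothing in a user's working directory can be munged to
make them give the wrong answer.
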
Now to reengage my cloak of lurking...

> -----Original Message-----
> From: Wilfredo Sánchez [SMTP:wsanchez@apple.com]
> Sent: Tuesday, August 22, 2000 11:10 PM
> To: kfogel@collab.net
> Cc: dev@subversion.tigris.org
> Subject: Re: Bundles Re: SVN, .SVN, and other meta-data directorys
>
> > I once ran into the zapping problem you mention, with a WebObjects
> > tool (I can't remember which one, but it sure got me good :-) ). With
> > all due respect, I think that's just ill-behaved. We shouldn't design
> > to compensate for a tool that randomly destroys data it doesn't
> > recognize; instead, the tool should be fixed.
>
> Sure, I've made that argument, in fact; but it does go both ways. The
> editor owns that data, and CVS is dropping turds into someone else's data.
> You're just assuming that because it's a directory, that this can be gotten
> away with, and usually, you are correct. But should these tools really
> copy whatever junk happens to magically appear into their bundles? Not
> really; you get .nfs9834876 turds from some NFS clients, for example,
> which you certainly don't want to keep around, etc. So it's not so
> simple.
>
> > Even storing all the data in one SVN/ dir at the top of the tree
> > wouldn't solve this problem, anyway -- what if you used one of those
> > editors on the root of the working tree? The SVN/ dir would still get
> > zapped. So the problem might not occur as often, but it would still
> > happen sometimes.
>
> That's not at all likely. Presumably a project is a directory with
> files in it which the user manages. Bundles are opaque to the user;
> they look like files, but they happen to be implemented as directories.
> Ideally, your revision control system would also recognize them as
> single objects, which is why the t/f thing got invented.
>
> Perhaps I'm looking at this sideways. Yup, I am. The better solution
> is for something less simpleminded than t/f wrappers, but which also can
> treat these things as single objects, and the SVN/ turd problem becomes
> moot in that context. This goes back to how we want to handle
> (encode & store, diff, etc.) other-than-ASCII-file data, a more
> interesting and useful problem to solve.
>
> -Fred
>
> Wilfredo Sánchez, wsanchez@apple.com
> Open Source Engineering Lead
> Apple Computer, Inc., Core Operating System Group
> 1 Infinite Loop, Cupertino, CA 94086, 408.974-5174
Received on Sat Oct 21 14:36:07 2006