[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Managing SQL in the repository

From: m christensen <dfs_at_xmission.com>
Date: 2004-05-30 07:47:02 CEST

Gary Affonso wrote:

>On 5/28/04 2:42 PM, "Brad Rhoads" <brhoads@zethcon.com> wrote:
>
>
>
>>How do we
>>deal with the SQL as it is functionally related to other parts of the
>>system?
>>
>>
>
>We're dealing this this one, too, right now and our solution for Store
>Procedures is this:
>
>1) We require that our developers "script" their schema-related (not data
>related) changes to the script. In short, they're not allowed to simply
>open up an sproc-editor and make a quick change. They have to put the
>change in a "create sproc" script.
>
>We keep these scripts in a directory on our repository.
>
>2) Ensure that the "script" is written such that it drops the old resource
>before adding the new one. In short we disallow the use of alter statements
>for sprocs.
>
>3) We require that there is one file for every sproc on the database.
>
>
>And then we just manage those scripts as if they were standard code.
>
>This gets us all the normal repository abilities (snapshots, branches,
>merges, diffs) on sproc-related SQL resources.
>
>----
>
>Tables are trickier because we can't "drop" the old table before adding new
>one (the data would go away). This means we're forced to use alter
>statements to make table-related mods (instead of the drop/replace scheme we
>use for sprocs).
>
>And that means that a table's given state is a combination of the original
>create-scripts and a handful of alter scripts that have come after its
>original creation. Which means no ability to easily diff different table
>versions.
>
>So for table-related DDL, I'm still not sure how we'll be handling it, I'd
>love to hear some suggestions.
>
>- Gary
>
>
>
First Off I'll ASSUME y'all are not falling for the Micro$oft pitfall of
bastardizing an
official 30 year old standard definition, i.e."SQL" into a closed source
proprietary brand name for
Microsoft SQL Server.

So, assuming we are talking about 'SQL' as a standard language used by
over a dozen RDBMS
systems answers are by definition generalities....

My particular target Database is Oracle, but the issues are common.

#1 Application 'Source' rev. ctl. is fairly straightforward.
#2 RDBMS data and Schema changes are problematic as mentioned above.
#3 Compiled binaries often pull parameters or configuration info from
the database, this means a particular
      Binary is a product of the "Source Code", the database Schema AND
the data in the database.

In the intrest of brevity I'll skip the proofs of the above 3
statements, if the validity of the points are in question
I'd be happy to elaborate, but I'll assume they are a given.

The approach taken will depend on the target of your development effort.
The 2 general targets I see are
a shipped product and incremental in-house system modification.

The way I manage a 'shipped product' development is based on several
general principles and steps.
A. Representative copies of supported legacy databases are maintained.
B. Application source code is managed in a Rev. Ctl. system.
C. Database build and migration scripts are managed as part of the
source tree.
      As such there is a Create table, Alter table or similar 'single
point' which defines every specific table.
       If the table is new, there is a create table statement, if a new
column is needed, the create statement is changed.
        How this applies to alter statement or migration code is a
exercise for the reader.
      The point is there is NOT a stream of schema modifications spread
thru hundreds of SQL 'Patch scripts'.
D. A 'current' database is built 'From Scratch' via the Creation /
Migration scripts for EVERY build of the
     Development, Testing and Production Databases.
E. Application Binaries are built from the source in the repository
against the database built from the same version of
     the build / patch scripts.
F. Those Binaries are checked into the Rev. Ctl Software (Some may not
agree this is required or appropriate, I think is is, IMHO).
G. Those biraries go thru the system testing process and are what gets
released, They are NEVER rebuilt without retesting.

An ongoing development effort has more pitfalls, I'll only detail the
differences from above.

A. Actual backup copies of the Production database are restored as a
starting point for databased used in part D.
B. No Diff.
C. All SQL EVER run against The Production database are checked in as
patches. Both DDL & DML are included.
     In other words, I don't care if it's 'Just a data change' or major
table schema change.
     It's a script, it's checked in, and it has been tested or it
doesn't run in production, Period.
D. ALL Modern Decent Production level RDBMS Systems have a "Point in
Time" recovery capability. IMHO.
     I would never run a Production RDBMS with ongoing development with
this feature turned off.
    As such, I have a backup copy of all files and logs required to
rebuild a full and complete copy of the database
    as of 'May 30 2004 at 15:34:12'.
E. Some SQL will need to be run before the binaries are rebuilt, and
some can only be run after, how these dependencies are
     Tracked and enforced is an excersise for the reader.
     The current "Timestamp" or System Change Number is logged to
identify a specific recoverable point-in-time of the database itself.
     Required pre-build patch scripts are pulled from the repository and
run.
     Binaries are built from a specific rev of source code and the
database at a specific point in time.
     Required post-build patch scripts are pulled from the repository
and run.
  
F. No Diff.
G. No Diff.
H. Production moves also need to execute the SQL PRE & POST build patch
scripts although the binaries are not rebuilt.

It's hard enough to track system schema changes as shown here and in
previous posts, but it is nearly IMPOSSIBLE to
track production database data changes from a live system, after all the
application is constantly changing data by definition.
I think the only real approach is to track overt system changes from the
developers for other purposes but to admit the
only way to get a copy of a database at a specific point is thru system
backups and point-in-time recovery capabilities
of the RDBMS itself.

Marc

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org
Received on Sun May 30 06:40:56 2004

This is an archived mail posted to the Subversion Users mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.