Re: Default folder names compulsory?

From: Ryan Schmidt <subversion-2007b_at_ryandesign.com>
Date: Sun, 13 Jan 2008 06:17:00 -0600

On Jan 13, 2008, at 04:31, an0n1m0us_at_gmail.com wrote:

> At 04:31 PM 13/01/2008, you wrote:
>
>> On Jan 12, 2008, at 23:12, an0n1m0us_at_gmail.com wrote:
>>
>>> At 08:22 AM 12/01/2008, Ryan Schmidt wrote:
>>>
>>>> On Jan 11, 2008, at 05:57, an0n1m0us_at_gmail.com wrote:
>>>>
>>>>> Thanks for this, it's very helpful. I've printed the first two
>>>>> chapters of the book and am RTFMing :) The initial quality of the
>>>>> manual is quite impressive!
>>>>>
>>>>> One statement I can make that might clear up some confusion is
>>>>> that I don't care if the names "trunk", "branches" and "tags" are
>>>>> used in the repository's internals and theory. I was just hoping
>>>>> to ensure the trunk always related to code in the
>>>>> /home/username/ folder.
>>>>
>>>> You can check out a working copy to anywhere you want. Checking out
>>>> to the home folder directly might not be the best idea in the
>>>> world, but you can certainly do it:
>>>>
>>>> svn checkout url://to/something/in/your/repository $HOME
>>>
>>> Ok, that sounds like what I want but I don't think I've explained
>>> the context to you properly.
>>>
>>>> Working copies are meant to be disposable. They can get wedged or
>>>> otherwise into weird states, and the answer if you ask on the
>>>> mailing list will often be "delete your working copy and start
>>>> over." This could be inconvenient if you find yourself needing to
>>>> delete your home folder in order to get back to work.
>>>
>>> Hmmm, sometimes svn "wedges [the code] ... into weird states" ?
>>
>> [the working copy], yes. Working copies can become messed up, and
>> they are considered disposable. Be prepared to be told "delete your
>> working copy and check out a new one."
>
> Hmm, doesn't give me a huge degree of confidence in the reliability
> of svn, however I thank you for your honesty.

In order to ensure the reliability of the repository, the reliability
of the working copy is sometimes sacrificed.

>>> However I think there's a miscommunication re context here.
>>> /home/username/ is on the server, it's the project space, the
>>> project's VWS username. It's not on the developers' local machines.
>>>
>>>>> If this is the case, my scenario would need a situation where any
>>>>> code checked in to the trunk would automatically be outputted to a
>>>>> real folder. If I was to draw up a basic work flow diagram of what
>>>>> I am trying to achieve, would you be able to take a look?
>>>>
>>>> You probably don't want to automatically update users' working
>>>> copies.
>>>
>>> As above, this isn't what I was intending. I was hoping to update a
>>> project's output space on the development server.
>>
>> Ok, thanks, that does make more sense. And that's a perfectly normal
>> thing to do. There's even a FAQ that discusses setting up server-side
>> working copies for the purpose of previewing web sites:
>>
>> http://subversion.tigris.org/faq.html#website-auto-update
>
> Woohooo! Thanks for that.
>
>> I've just never heard of having server-side "users" representing each
>> project. I suppose there's nothing wrong with that.
>
> It doesn't seem to be a problem for us. This set up has been in
> place for the best part of 10 years. The /home/username/ accounts
> on our development server are actually just a convention really as
> customers do not have access to them. On the production server
> though, it's a bit tricky for us to develop a website, store it in
> a user's space and map their URL to it, then when they want to FTP
> in themselves to make a change down the track (which is their
> prerogative though we discourage it if we are maintaining their
> site), point them to a different spot on the server.

When you switch to Subversion, the customer must no longer be allowed
to change files on the production server like this. The customer must
be denied this FTP access. Nothing should be able to release code to
the production server or modify the code there except a checkout or
update or switch from the repository.

You could instead grant your customers write access to your
repository. Use path-based authorization to restrict them to just
their projects. Then you can test their changes and make a new tag
and release it to production. This way their changes are also tracked
just like your developers' changes, and become a part of the
project's history.
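
As a rough sketch of what such path-based authorization could look like, the snippet below writes a small authz file. The customer user names, the group name and the temporary file location are made up for illustration; on a real server you would point AuthzSVNAccessFile (Apache) or authz-db (svnserve) at the file.

```shell
# Hypothetical location; in practice this would sit next to your repository config.
AUTHZ_FILE="${AUTHZ_FILE:-$(mktemp)}"

cat > "$AUTHZ_FILE" <<'EOF'
[groups]
developers = rschmidt, jdoe

# Developers may read and write everywhere; everyone else gets nothing by default.
[/]
@developers = rw
* =

# Each customer may only touch their own project directory.
[/projecta]
customer_a = rw

[/projectb]
customer_b = rw
EOF

echo "Wrote example authz rules to $AUTHZ_FILE"
```

With rules like these, customer_a could check out and commit under /projecta but would be refused anywhere else in the repository.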

> /home/username/ or /home/username/public_html/ may be a strange
> place to store websites but it has worked for us over time. Having
> a separate customer user account and storage folder, on top of
> their website folder, would be less efficient I think, especially
> for backing up and so on. A separate list would have to be
> maintained of which folders belong to which users.
>
> Nevermind.

Your web site likely needs two directories on the server: the working
copy which contains the code, and another directory which contains
perhaps uploaded files (uploaded photos, attachments, other files
that maybe relate to database entries, etc.). These two directories
should ideally be completely separate. You should avoid the
temptation to, for example, have a directory "images" inside your
working copy into which images are uploaded via the web site. This is
contrary to the idea that the working copy must be disposable.
Instead, put the images directory in a project-specific directory
which is stored somewhere centrally on the server, as the web site's
database would also be. On our production server we had something
like this:

/www/projecta/images
/www/projecta/workingcopy/htdocs

The web server pointed at the htdocs directory. Other items could
exist at the same level as the htdocs directory that should not be
served directly via the web server, for example included script
files, password files, etc. Images which users could upload via the
web site would go into the images directory and would usually get a
corresponding database record. Other directories could exist at the
same level as the images directory for other files that the web site
manipulates.
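
To make the layout concrete, here is a small sketch that builds the same directory structure under a scratch directory. WWW_ROOT is an assumption standing in for /www, and the mkdir for the working copy merely stands in for what "svn checkout" would normally create.

```shell
# Hypothetical root; on the real server this would be /www.
WWW_ROOT="${WWW_ROOT:-$(mktemp -d)}"
PROJECT=projecta

# Uploaded files live outside the working copy, so the working copy
# can be deleted and checked out again without losing them.
mkdir -p "$WWW_ROOT/$PROJECT/images"

# Normally "svn checkout" creates this; mkdir is only a stand-in here.
# The web server's document root would point at htdocs.
mkdir -p "$WWW_ROOT/$PROJECT/workingcopy/htdocs"

# Show the resulting directories.
find "$WWW_ROOT/$PROJECT" -type d | sort
```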

>>>> Users should be in control of when and if to update their
>>>> working copies. The book will explain this, I'm sure.
>>>
>>> Yeah I understand working copies on developers machines are best
>>> controlled by developers.
>>>
>>>> It sounds like you're maybe not versioning regular source code
>>>> projects? What are you versioning?
>>>
>>> Website code. Our current workflow is:
>>>
>>> Setup a Virtual Web Server (VWS) account for a customer (aka user,
>>> aka project) in the development server's home directory. Something
>>> like this:
>>>
>>> http://devel.projecta.org/
>>>
>>> would point to
>>>
>>> /home/projecta/public_html/
>>>
>>> on the development server.
>>
>> So then public_html is the working copy, not the home directory
>> itself. That's better.
>
> Well almost all files would be in the public_html folder but there
> are a few files related to the website that are best stored hidden
> from public view. Files such as those that store DSN credentials.
> Ideally we'd like to version those too.

Ok, that's true. I'll just warn you again that it may be messy having
a home directory be a working copy.

>> But you're not accessing it via
>>
>> http://www.example.com/~projecta/
>>
>> ? Then it doesn't even need to be in a particular server-side
>> "user"'s directory.
>
> I think that is true: the subversion repository does not have to be
> stored in a user's home directory. We just felt this might be
> simpler and therefore cleaner.

Working copy. Not repository. We've been talking only about working
copies. The repository (or repositories) will live in another totally
different location on the server's disk (e.g. /repositories), which
should definitely not be under any Apache document root.

> This taps into our other question of whether to use separate
> repositories for each project. It may be a significant overhead but
> on the other hand it could have advantages of keeping project
> information separated from other projects.

I personally like keeping all projects in one repository because it
enables one to share code between projects, and because it's just a
single thing you have to back up. On the other hand, separate
repositories make it easy to archive projects should that need arise.
And you can still share code between repositories by using externals,
for example if you have one repository per project, and then a common
repository for reusable libraries. The problem with that, though, is
that for us, reusable libraries usually started out as code in a
particular project that only later were made reusable. With
everything in one repository, it's easy to "svn cp" the library from
the project to the common library location. With separate
repositories, that's not possible at all.
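
To illustrate the two approaches, here is a sketch with invented URLs and a made-up library called "widgets". It only prints the svn commands unless you set DRY_RUN=0, so treat it as an outline rather than something to paste onto a live server.

```shell
# All URLs and names below are hypothetical.
REPO_URL="${REPO_URL:-http://svn.example.com/repos/main}"
COMMON_URL="${COMMON_URL:-http://svn.example.com/repos/common}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command by default; run it for real only when DRY_RUN=0.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi
}

# One repository: copy the library out of the project, keeping its history.
promote_library() {
    run svn copy "$REPO_URL/projecta/trunk/lib/widgets" \
        "$REPO_URL/common/widgets" -m "Make widgets reusable"
}

# Separate repositories: pull the shared library into a working copy
# directory ($1) by setting svn:externals on it.
share_via_externals() {
    run svn propset svn:externals "widgets $COMMON_URL/widgets/trunk" "$1"
}

promote_library
share_via_externals /www/projectb/workingcopy/lib
```

After setting the property you would still need to commit it and run "svn update" for the external to be fetched.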

> If we open one repository and have each project as a separate
> 'space' (not sure of the svn terminology here) then what is to stop
> other projects from 'spying' on other projects if we open up our
> repository to a web tool like trac?

Path-based authorization would ensure that users can only access the
parts of the repository they're supposed to access. I'm not sure how
that plays into Trac however -- whether Trac honors Subversion's path-
based authorization. You may have to ask the Trac list about that one.

>> All working copies could just live in a central
>> area on the server:
>>
>> /www/projecta
>> /www/projectb
>>
>> etc. Nobody other than the sysadmin will ever need to access these
>> files or know where they are; they will be touched solely by the
>> post-commit script of the repository which will auto-update them
>> following each commit.
>
> Yep this is a possibility I guess. It's just somewhat more
> obfuscated than keeping everything in one spot.

Funny, I find your way more obfuscated, and mine more clear. Also, I
consider my way keeping everything in one spot, and yours spreading
things out. In the end, of course, it's you who has to understand and
be happy with the organization of your server.

>> So it doesn't matter where they live. (They can also
>> stay where they are now, I'm just saying there's no reason to
>> introduce the overhead of creating a new server-side "user" for each
>> new project, and spreading the server-side working copies out in
>> different "user" directories.)
>
> Yep thanks, I think we're 'on the same page' now. The repositories
> can be stored anywhere because they can always be versioned (?) to
> anywhere on the server.

Again: we're talking about working copies, not repositories. And the
working copies can be anywhere on disk, because they internally know
how to connect to the repository, and because you can configure your
web server to point to any directory on disk for web content.

>>> Developers simply log in from their workstations via FTP and work
>>> carte blanche (in reality the only two developers currently sit
>>> next to each other and communicate re which files we touch) on any
>>> file by downloading it to our workstation, modifying, saving the
>>> changes (uploading) and previewing immediately via the above URL.
>>
>> Then your new workflow is that each developer gets a working copy of
>> the project on their own machine, works on it, tests it, and commits
>> it when done. A post-commit script you write takes care of updating
>> the server-side working copies.
>
> That sounds about right. I'm not sure about 'working copies' on the
> server side though. This might be a terminology semantic thing but
> I would like for the post-commit script to update the project/users
> preview copy with the current trunk code on the devel server, I think.

Yes, that's exactly what I've been talking about, and what the FAQ
entry I mentioned above addresses.
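
A minimal post-commit hook along those lines might look like the sketch below. The preview path and the location of the svn binary are assumptions, and the hook runs as whatever user serves the repository, so that user needs write permission on the preview working copy. The FAQ covers the permission side in more detail.

```shell
#!/bin/sh
# Hypothetical post-commit hook. Subversion calls it as:
#   post-commit REPOS-PATH REVISION
# PREVIEW_WC and SVN are assumptions; adjust them for your server.
PREVIEW_WC="${PREVIEW_WC:-/www/projecta/workingcopy}"
SVN="${SVN:-/usr/bin/svn}"

# Bring the preview working copy up to the revision just committed.
update_preview() {
    "$SVN" update --non-interactive --quiet -r "$1" "$PREVIEW_WC"
}

# Only act when called with the two arguments a hook receives.
if [ "$#" -ge 2 ]; then
    update_preview "$2"
fi
```

Saved as hooks/post-commit inside the repository directory and made executable, this would run after every commit.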

> A nightly script could create a staging or production 'branch' for
> moving to the other servers, perhaps.

I don't know if automated branch creation is a good idea, but you
could give it a try if you want.

> I suppose I'm a little confused by the difference between a
> 'working copy' that a developer would use on their machine, and
> what the committed code on the server represents in terms of the
> trunk, branches and tags terminology.

Well, a working copy is a directory you can create from the
repository. Your code is stored in the repository as a database, but
to work on it, you need a working copy. You make changes to files in
the working copy, then commit them to the repository. You can have a
working copy of trunk or of a branch or of a tag or of any other
directory of the repository. When you're done with a working copy,
you can delete it from your disk and the committed code of course
stays safely in the repository.

A working copy can be used for purposes other than working on the
code, for example the server-side working copy used for the web
preview that we've been talking about.

>> Ideally the developer has Apache and all other necessary web software
>> running on his development machine so that he can test the web site
>> before committing.
>
> Yeah this seems to be the model with desktop software but it's not
> how we work. One disadvantage of that approach is decentralising
> the config. Each developer's machine would always have to be
> updated regularly to mirror any changes to the server config and
> that would be over the top administratively I think. FWIW we are
> talking about just two core developers with rare assistance from
> less than a handful of other people. This is not to say that any
> system we set up should not be scalable though.

There are tradeoffs either way. In the company where I worked, we
also decided it was too much work getting apache, php, pear modules,
mysql and so forth all updated and working happily on each user's
Windows desktop. So we used developer working copies stored on the
server. However I used a Mac, and was also a sysadmin, so I
personally did have all the necessary software installed on my Mac
and often used local working copies instead.

>> If this is not possible, the user can certainly
>> make a change, commit it, then check the result on the server, then
>> tweak it and commit again and check again, repeating until it's
>> correct.
>
> Wooohooo, that's what we're after. The auto-update FAQ you
> mentioned above seems like the way to go, I'll definitely read that.
>
>> But this will likely have the effect of breaking the project
>> during the developer's development work. One usually strives to keep
>> the trunk working correctly. You may want to consider using branches
>> for developer work in this case.
>
> Ok, as above, I'm not too sure about where the trunk, branches, and
> especially 'tags' come into this.

"trunk", "branches" and "tags" are just directories in the
repository. Nothing more. Meaning is given to them by convention
only. Read the book.

> In terms of 'breaking' code -> making mistakes that temporarily
> result in the post-commit, auto-update 'preview' version losing
> functionality, that is ok as we expect that once in a while.
>
> I think the model of copying a 'branch' of the development code to
> a staging and/or production server when the development version is
> ready, makes most sense to me. I'm just not 100% sure how this fits
> in with the SVN trunk, branch, tags situation.
>
>> Or, another alternative entirely is to store each developer's working
>> copy (or working copies) on the server, in the developer's
>> public_html directory. This is how the web development shop where I
>> worked did it. So on the server we had:
>>
>> /home/rschmidt/public_html/projecta/trunk
>> /home/jdoe/public_html/projectc/branch_1_0
>>
>> I could then access my working copy of the projecta trunk via
>>
>> http://www.example.com/~rschmidt/projecta/trunk
>>
>> I could work on these files by logging into the server via ssh and
>> working in vi, or I could mount my server-side public_html to my
>> client workstation via SMB or similar and work on it in my favorite
>> GUI editor. When done with a change, I can commit it.
>
> This is an interesting option. First thing I notice is that it
> could make URL paths a PITA. I'll have to give this some thought.
> It would be a substantial change for us but if it's worth doing,
> it's worth doing.

Yes, we had to develop a new standard for URLs within our web
applications, so that it did not matter where the base of the URL
was. I believe the config script loaded at the top of every page
worked out a $BASE_URL variable for this. It took some doing, but I
think it was worth it for us.

>>> When a site is ready to go live, it is manually moved via FTP from
>>> the development server to a production server which is usually
>>> mapped to a corresponding VWS with a URL like:
>>>
>>> http://www.projecta.org/
>>
>> Your new strategy should then involve making a "tag" of the trunk
>> when you're ready to go live with a site. Then, on the production
>> server, you check out a copy of the tag from the repository.
>
> Ah, this is when I realise replying to email as I read it is a bad
> thing :) Ok so "tag"s fit in this way. Hmmm, interesting. I'm not
> sure exactly how we would be doing trunk and branching but that
> might be clearer once I decide on a more definite set up of the
> development side of things.
>
> The only example I have to follow in this whole thing is the
> diagrams I've seen of Firefox development schedules. Like this:
>
> http://www.mozilla.org/roadmap/branching-2004-05-28.png
>
> The ongoing code base they work on is the trunk, whereas releases
> are branches. I'm not sure that this differentiation is necessary
> for web work, or at least in the development workflow I am trying
> to perfect.

I begin to agree that for web sites, release branches are overkill
and a pain. You release so often for web sites, and there's no such
thing as version numbers. I say just commit everything to trunk, and
when you want to release something, copy the trunk to a new tag. Use
a branch if you want to do a major new feature or a reworking of an
existing feature, and these changes will take some time to develop,
during which you are still expected to do regular releases from the
old code. Once the new code is ready, you can merge the changes from
the branch to the trunk, make a tag from the trunk and release it as
usual.
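
Sketched as commands, the tag-and-release step could look like this. The repository URL, the date-based tag name and the production path are invented for the example, and the script only prints the svn commands unless DRY_RUN=0 is set.

```shell
# Hypothetical repository URL; adjust for your own layout.
REPO_URL="${REPO_URL:-http://svn.example.com/repos/projecta}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command by default; run it for real only when DRY_RUN=0.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi
}

# Copy trunk to a new tag. $1 is the tag name.
tag_release() {
    run svn copy "$REPO_URL/trunk" "$REPO_URL/tags/$1" -m "Tag release $1"
}

# Point the production working copy ($2) at the tag ($1).
deploy_release() {
    run svn switch "$REPO_URL/tags/$1" "$2"
}

tag_release 2008-01-13
deploy_release 2008-01-13 /www/projecta/workingcopy
```

The first release to a new production server would be an "svn checkout" of the tag; later releases can use "svn switch" as shown, which only transfers the differences.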

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: users-help_at_subversion.tigris.org
Received on 2008-01-13 13:18:08 CET
