
Converting design doc(s) to HTML.

From: Karl Fogel <kfogel_at_red-bean.com>
Date: 2006-05-23 20:24:04 CEST

Our main design doc is currently in doc/design/*.xml. It consists of
multiple XML files that compile (via the usual DocBook compilation
process) to HTML, PDF, etc.

I would like to convert it to a single HTML file, www/design.html. I
believe that almost everyone reads it as HTML anyway, and that it would
be both more maintainable and more easily referable-into as a single HTML
file. In fact, I already have converted it, and www/design.html is sitting
in my working copy. (I had content changes to make, and felt impeded by
the current format and compilation process, so I pushed stack and did the
conversion first.)

I have not committed the conversion, because maxb prefers that the doc remain
multifile XML. We had a long discussion in IRC about it (involving also madan,
davidjames, rooneg, and malcolmr). Max and I did not persuade each other,
but I promised him I'd post here before committing the conversion.

Below is the IRC transcript. I believe it lays out all the arguments on
both sides. Comments welcome. If no one comments, I will go with the
apparent majority in IRC and commit the conversion (maxb I hope agrees
that that's a fair approach). Note that my content changes are *not* in
this commit; they would of course be a separate commit -- this first
commit would just be the straight conversion.


<kfogel> Last night I manually translated doc/design/*.xml into a
             single HTML page, www/design.html. I think that's what
             it should have been all along. It's more useful to
             people as a single HTML page, and it'll be easier for us
             to maintain that way.

<kfogel> I know you guys were working on some changes to the XML
             files. Do you mind if I commit this change?

<maxb> kfogel: eek!

<kfogel> I think your patches should be portable (with
             hand-tweaking) to the HTML.

<maxb> Why? After all, we can compile the XML to a single HTML
             easily enough

<madan> I would prefer the single page html

<kfogel> maxb: "easily" is relative.

<maxb> kfogel: 'make html' isn't easy?

<kfogel> maxb: what if you don't have the tools installed?

<kfogel> maxb: a compile step is *always* a barrier.

<davidjames> I agree it's easier to maintain as an HTML document

<kfogel> And frankly, who reads that document in anything but HTML
             format anyway?

<davidjames> Also, if it's in www/design.html, we can point users to
             it as http://subversion.tigris.org/design.html

<maxb> These days, libxslt and the xsl stylesheets are packaged
             for just about everybody. Except windows, of course, and
             they don't count :-)

<kfogel> I mean, what is the XML getting us here, except a) an
             extra compile step and possible tool trouble for some
             people, and b) a format with which many writers are less

<kfogel> Exactly.

<kfogel> what davidjames said

<kfogel> maxb: I personally have had trouble building XML->HTML
             docs, and I was *writing a book using XML* at the time.

<kfogel> This process is not as trivial as you are making it out
             to be. You are too good at managing your tool
             environment, you don't realize that others have it harder

<kfogel> wow

<kfogel> all the people I was looking for, in the same place

<kfogel> how nice

<madan> +1 from me, I dont see any specific use of xml... its
             making the change process harder, for what I know

<madan> and all that makefile code to be maintained.... yuk

<kfogel> madan: I applaud your obvious good taste! :-)

<maxb> I am opposed to HTML on the basis that it is far easier
             to slip into inconsistent formatting when you are marking
             up for presentation rather than role

<madan> so make it difficult to change the source doc?

<madan> ;)

<maxb> Also, on the basis that having the design document in
             the same form as the svnbook is nice.

<kfogel> maxb: indeed, you will probably find that some of my
             formatting decisions were presentation-oriented rather
             than role-oriented, when I did the translation. I think
             these are easy to fix & document, and anyway, it's not
             the worst sin in the world.

<kfogel> maxb: why is that "nice"?

<kfogel> What's the actual advantage?

* kfogel turns on all the fire hoses at once :-)

<sussman> fight, fight, fight!

* sussman grabs popcorn

<maxb> Because, we wish the people editing the design document
             would come and help with the svnbook, of course! :-)

* madan ducks the crossfire

<kfogel> maxb: and, uh, how often does that happen?

<kfogel> In fact, who helps edit the design doc at all?

<maxb> No one, at present.

<kfogel> Has the thing not been sitting in stasis since roughly
             the late Jurassic?

<maxb> But, anything which helps train committers to help out
             with the svnbook is good to me.

<kfogel> rooneg: sure! nothing wrong with a big party...

<kfogel> maxb: I highly doubt that the reason people do or don't
             contribute to svnbook has anything to do with the
             formatting. Book tasks tend to be big enough that what's
             daunting is the content itself, not the formatting
             overhead (which is, in relative terms, low).

<clkao> kfogel!

<kfogel> The design doc is different: the formatting overhead is
             higher, relatively speaking.

<kfogel> Here, let me put it this way maxb:

<maxb> I think the design doc is large enough that there is
             value in it being in multiple files.

<kfogel> If the design doc had been www/design.html when you
             arrived at the Subversion project, would it ever have
             occurred to you to change it to multifile XML? Be honest
             with yourself, now :-)...

* kfogel plays his trump card

<maxb> And whilst I agree that there is the issue of learning
             the docbook vocabulary, you can do that by example for
             the most part

* maxb eyes hacking.html with a conspiratorial grin....

<kfogel> The trump card: I think that any argument claiming that
             the current format encourages or even is friendly to
             editing, is obviously wrong, by virtue of the fact that
             the design doc has been essentially unmaintained for
             years. People *are not* editing it. It's a fact. And
             in my case, the reason I haven't been doing so is mainly
             that the XML was too cumbersome.

<maxb> I feel that that reasoning isn't quite true

<kfogel> clkao: hey there!

<maxb> For a very long time, it was in texinfo, with a toolchain
             that was truly hard to come by, and poorly documented

<kfogel> maxb: there's another step that the XML forces on us: we
             have to upload the docs when we update them.

<kfogel> We can't just commit a change and have it be visible.

<maxb> And, by the time I converted it, it was already so out of
             date, that it required a large amount of momentum to
             break into.

<davidjames> There's no need to have a separate build step
             for... HTML! :)

<kfogel> Instead, the design docs -- which are very important --
             are off in the *downloads* section, which is totally
             counterintuitive, and it's hard for us to point people to
             specific parts.

<madan> may I add, that yesterday was the first time in 1.5 years
             that I managed to 'build' the design doc and go through it

<madan> without distractions

<maxb> kfogel: I'm happy to set up regular builds on red-bean

<madan> maxb, its also difficult to edit

<kfogel> maxb: not a solution at all, as you well know

<maxb> And I'm also happy to improve the documentation
             concerning building the stuff

<madan> what are you planning to do about it

<kfogel> maxb: well, I've laid out my reasons.

<maxb> kfogel: "not a solution at all, as you well know" --- I
             really don't know

<kfogel> Wow.

<kfogel> The reasons why that is not a good solution seem so
             obvious that I'm surprised to be laying them out...

<kfogel> Hmmm.

<kfogel> 1. another build process somewhere that can break, and
             only one person who set it up and can debug it on a
             moment's notice.

<maxb> I see this proposal as a technical regression for the
             sake of convenience. As such, all avenues of providing
             equivalent convenience without the regression must be
             analyzed before going ahead with it.

<kfogel> 2. the URL to the doc ends up at red-bean instead of
             tigris.org, OR, we have some complex automated upload
             procedure that is sure to break.

* heisenbug has quit ()

<kfogel> maxb: I am not persuaded at all by these fancy words :-).

<maxb> OK, let me put it another way: I object to adapting
             ourselves to conform to our tools, instead of adapting
             our tools to conform to *us*! :-)

<kfogel> Okay, I feel I've laid out all the reasons. I'm going to
             make the commit, unless you want to veto it, which I
             would of course respect, but then I will immediately post
             explaining why I think it's a good move and calling for a
             formal vote.

<maxb> Hmph.

<kfogel> I'm not sure what else to say :-).

<maxb> Well, could you at least air the issue on the list

<kfogel> I could, if you really want. But it seems like all the
             people who actually ever *work* on the doc are here right
             now, no?

<madan> maxb, why cant we just have it simpler!

<malcolmr> maxb: While I like Docbook and semantic markup over HTML
             any day, is there a real advantage to keeping a
             (mostly-static, smallish) design document in XML format?

<maxb> No one previously has worked on that doc.

<kfogel> aren't you and madan working on it right now?

<kfogel> And, as I said before, I have several times sat down to
             work on it only to be disgusted by dealing with the XML
             overhead and giving up. So I count me as "working on it"

<madan> malcolmr had some comments on the build scripts... maybe
             he could add value too...

<malcolmr> I did?

<madan> yesterday

<maxb> I'd like to have at least some small attempt at
             streamlining the overhead, before we toss the format
             entirely and resort to HTML.

<maxb> I agree: I don't find it a problem - so I may have
             overlooked its implications for others.

<kfogel> No. I mean, that's a nice thought, but many people here
             have a *lot* of experience working with XML/DocBook docs,
             and we know how streamlined it can get -- there's a
             limit. It will never, ever be as easy as HTML.

<madan> malcolmr, about doc/tools/bin/find-xsl.py

<kfogel> I am not making this argument from ignorance, at least

<malcolmr> Oh that, just that it was broken on Gentoo. Max fixed it,
             didn't he?

<maxb> Indeed.

<malcolmr> (I don't even use Gentoo)

<kfogel> Under the current system, could you please send me a URL
             pointing directly to the "bubble-up" explanation? No,
             you can't really. And yet we need that all the time,
             people are always asking for it.

<madan> yeah, he did

<kfogel> Pointing them to a prebuilt HTML page on red-bean is not
             the way to go, because it takes them unnecessarily out of
             the tigris site.

<malcolmr> kfogel: Bad example. I can send you such a URL, but your
             web browser just can't render DocBook.

<kfogel> The style will be different. Or, if we download the
             stylesheet to make it the same, then we commit the
             opposite sin: we have a site that *looks* like it's
             tigris but in fact is red-bean.

* heisenbug (n=ggreif@p5494470B.dip.t-dialin.net) has joined #svn-dev

<maxb> But the only reason it can't be on tigris is because of a
             lack-of-feature in CEE. To which red-bean is our

<kfogel> maxb: so why is that a bad reason?

<kfogel> malcomr: these practicalities are what makes the world go
             round, though.

<malcolmr> Well, yes, I was just being precise :-)

* kfogel rolls eyes :-)

<davidjames> The issue with a prebuilt web page isn't that it's on
             red-bean, it's that, if I make a change to the 'XML'
             file, I have to build the HTML file again to see what the
             changes are -- I can't just open up my HTML file in

<kfogel> exactly!

<davidjames> hacking.html is much bigger than design.html

<kfogel> good point

<davidjames> Imagine if we split up hacking.html into a set of XML
             documents which had to be built

* kfogel names davidjames as new Ambassador to Maxb for the idea

* maxb eyes hacking.html with a conspiratorial grin, again.

<kfogel> Look, I didn't just wake up and decide I wanted HTML
             instead of XML. Rather, I had specific content
             improvements I wanted to make to the design doc.

<kfogel> But if I make them, what good will it do? I'll have to
             work harder to review them (as davidjames pointed out).
             I'll have to work harder to get them "live" on tigris.
             And I'll have to work harder to point other people to

<kfogel> The whole thing is silly. None of these obstructions
             need be there.

* sars is now known as sarnold

<malcolmr> Just to divert the conversation ever-so-slightly, could
             anyone explain to me the reason we litter our HTML with
             title tags that contain the id attribute value?

<malcolmr> I'm sure there's a reason, but it's really annoying.

<rooneg> what is the advantage that docbook is giving us? does
             anyone actually gain from the ability to produce a pdf of
             our design doc?

<davidjames> Maybe we should leave a fork of the design document in
             XML for maxb to maintain separately ;)

<wsanchez> perhaps if you use XHTML, you can have both. :)

<malcolmr> rooneg: Agreed. Good for svnbook, no real advantage for
             the design doc.

<davidjames> It'll be like the SWS (Subversion with space) project

<maxb> I don't especially care about PDF.

<kfogel> malcomr: anchors

<kfogel> malcomr: "#foo" URLs

<kfogel> oh, *title* tags?

<kfogel> dunno

<malcolmr> Yup

<malcolmr> For example, go to
             http://subversion.tigris.org/links.html . Hover over the
             'Mailing list archives' section with

<kfogel> malcomr: the two goals are: "#foo" URLs should work, and
             when you hover your mouse over a section, a little
             windowlet should pop up telling you the ID of that
             section, in case you want to refer someone else to it.
             As long as those two things work, we're fine.

<maxb> Given that the design document is going to grow, I think
             the automatic generation of contents is a valuable
             feature of docbook

<malcolmr> kfogel: title isn't supposed to be misused like that.

<malcolmr> Also, it's annoying.

<malcolmr> Alright, it's annoying me.

<davidjames> I think the 'title' tag enables the windowlet (which pops
             up with the id) on some browsers

<kfogel> malcomr: patches welcome, I just did what worked. I care
             only about the goals, not how they were achieved.

<maxb> I also find that scrolling up and down in a huge document
             is more awkward than navigating by links through a
             chaptered set of pages.

<kfogel> maxb: imho, the FAQ proves that maintaining ToC is
             totally possible.

<malcolmr> Well if your goal is to have an annoying tooltip pop up,
             you succeeded :-)

<kfogel> malcomr: s/annoying// sure

<malcolmr> Heh

<kfogel> It's not annoying to those of us who depend on it to give
             people more specific URLs :-).

<maxb> The FAQ proves very well that maintaining a ToC is
             annoying, yes.

<kfogel> Do you really think the current non-maintainedness of the
             design doc is unrelated to its format?

<maxb> Yes, I do.

<kfogel> oops, gotta go for a bit, someone's here, back later

<maxb> Actually, not unrelated: I credit the current
             non-maintainedness to the fact that it was in a much less
             known format (texinfo) for a long time, and in that time,
             became so stale, that no-one had the energy to launch a
             major restoration of it until after I XML-ized it.

<davidjames> maxb: Isn't this a bikeshed?

<davidjames> I.e., XML vs HTML, it's probably not a big difference
             either way, just a personal preference thing

<malcolmr> There are some differences: tool support, semantics
             vs. presentation (mostly), etc.

* madan_ (n=madan@ has joined #svn-dev

<davidjames> Sure, but, in actual practice, for a doc that is this
             small, there's not a big difference between HTML and XML

<maxb> I don't view it as a bikeshed, because it's not a choice
             between two solely stylistic alternatives - each choice
             has explicit advantages and disadvantages.

<maxb> Problem is, we can't seem to agree on which side the
             advantages outweigh the disadvantages

<maxb> I would also suggest that 140KB and growing is not small

<davidjames> maxb: Well, I think it's pretty much a wash, there are
             advantages to both

<davidjames> maxb: The major advantage for HTML is that I (and kfogel
             and madan) would be more likely to work on it, so that
             multiplies your maintainer base by 4 ;)

<maxb> Yes, there are. I happen to believe that XML has more.

* madan has quit (Read error: 110 (Connection timed out))

* madan_ watches as maxb polices the dev channel and the dev list

<madan_> :)

<maxb> davidjames, madan_ : What is your principal objection to
             DocBook? Is it the need for compilation to preview? Or
             need to learn the vocabulary? Or something else?

<davidjames> maxb: Both

<madan_> more than that

<madan_> difficult to change and review

<maxb> difficult to change? How? Just use a text editor, like

<madan_> <br> and <p> are more understandable than <bookinfo> and

<madan_> and on top of that somebody who wants to read that
             doc.. the new dev

<madan_> shouldnt be inconvenienced...

<madan_> better if the html is readily available

<maxb> Anyway: In conclusion, I'd prefer that the XML->HTML idea
             be aired on the dev list first. I realize given the
             numbers on each side, it looks like a conversion to HTML
             is inevitable, but since this isn't a trivial decision, I
             think it ought to go through a forum that everyone can
             see, not just those who happen to be online at the time.

* madan_ seconds maxb on that

<madan_> but I dont think many ppl are offline atm

<madan_> nevertheless

<kfogel> maxb: okay, I will air on the list first, instead of
             shooting^H^H^H^H^H committing from the hip.

<maxb> There are many who are never on IRC

* Kamesh (n=kamesh@ has joined #svn-dev

<kfogel> true

<maxb> One additional request:

<maxb> If it goes HTML, can it please go to a set of html pages,
             not a monolithic one.

<madan_> NOOOOOOOO

<maxb> Huh? How is *that* objectionable?

<madan_> it wont be searchable

<madan_> after all its not that big

<maxb> And this is why we should use docbook, so we can make
             monolithic and chunked versions :-)

<maxb> Since there are ardent supporters of both forms

<madan_> true

<madan_> but it doesnt really matter for a file this size

<madan_> cmon, madan_

<madan_> oops, I meant maxb

<maxb> It's 140KB already, and likely to grow

<maxb> If searching is an issue, fix up a google-form for it :-)

* madan_ wonders if it is possible to have one html with multiple html files embedded in it

<davidjames> Haha

<kfogel> madan_: with server-side includes, yes

<davidjames> No really, we already have a monolithic HTML file for
             hacking.html and it works fine. 108k

<maxb> I contest that it works fine, and desire to split it.

<kfogel> maxb: I much prefer one file (and wonder how you use your
             editor, that it's a problem :-) )

<madan_> kfogel, so we need a webserver in the codebase?

<madan_> :P

Received on Tue May 23 20:27:49 2006

This is an archived mail posted to the Subversion Dev mailing list.