[Grok-dev] Re: To a multipage html tutorial
Philipp von Weitershausen
philipp at weitershausen.de
Sat Apr 7 06:29:10 EDT 2007
Darryl Cousins wrote:
> I've had a go at scripting the generation of a multi-page html version
> of the grok tutorial.
Great, thanks for looking into this!
> Firstly I tried digging into docutils internals which didn't get me very far
> (though I've learnt some). I found 2 references from docutils mailing list which
> appeared to advise preprocessing the restructured text document. That did get me
> part of the way and I have posted the result of this attempt [1]_
I personally think this is the preferable way; the detour via
LaTeX just adds more and more machinery. On the other hand, the docutils
guys seem to want this functionality (splitting up documents over
several HTML files) as well, and they seem to think it requires a lot of
work in docutils itself[1]. Our use case is probably simple enough,
though, that it works with some home-brew code.
I looked at your code in which you try to chunk up the reST file. You're
looking for lines that start with '====='. This is quite a hack. reST
doesn't mandate that sections must be underlined with '=====' and it
would also fail if there was a really short section heading.
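To make the two failure cases concrete, here is a minimal sketch that compares the startswith('=====') test with a slightly more tolerant check. It assumes a simplified reading of the reST rules: a title is a non-empty line followed by a line made of one repeated punctuation character that is at least as long as the title. The names `is_adornment` and `find_headings` are made up for illustration. This is still only a heuristic (it would misfire inside literal blocks, for example), not a replacement for a real parse.

```python
import string

# reST allows any non-alphanumeric printable 7-bit ASCII character here.
ADORNMENT_CHARS = set(string.punctuation)


def is_adornment(line):
    """Return True if the line is one punctuation character repeated."""
    line = line.rstrip()
    return bool(line) and line[0] in ADORNMENT_CHARS and line == line[0] * len(line)


def find_headings(text):
    """Return (line_index, title, adornment_char) for each underlined title."""
    lines = text.splitlines()
    headings = []
    for i in range(1, len(lines)):
        title = lines[i - 1].rstrip()
        underline = lines[i].rstrip()
        if (title and not is_adornment(title)
                and is_adornment(underline)
                and len(underline) >= len(title)):
            headings.append((i - 1, title.strip(), underline[0]))
    return headings


sample = "Intro\n=====\n\ntext\n\nFAQ\n===\n\nmore\n\nSetup\n-----\n"

# The '=====' test only sees the first heading.
naive = [line for line in sample.splitlines() if line.startswith('=====')]
print(len(naive))

# The tolerant check also finds the short heading and the '-' heading.
for index, title, char in find_headings(sample):
    print(index, title, char)
```

The short "FAQ" heading and the "Setup" heading underlined with dashes are both valid reST, and both are invisible to the '=====' test.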
It's also a hack because docutils provides a DOM-like representation of
a parsed document and a way to only publish parts of the DOM. You could
therefore simply walk each top-level section in the node tree and
publish them individually.
More information is given in the docutils docs[2]. You probably want to
pay close attention to the "Modifying the Document Tree Before It Is
Written" section. Basically, walking a document's sections could look
like this::
    >>> import docutils.core
    >>> source = open('tutorial.txt').read()
    >>> document = docutils.core.publish_doctree(source)
    >>> for node in document:
    ...     if node.tagname == 'section':
    ...         # do something here...
    ...         pass
> What is missing here however is a contents listing of the entire multipage
> document. Back into the docutils internals but without success.
The contents listing is just another node in the document tree and could
likely be rendered separately. The tricky part will be to adjust the
links to the different output pages. Perhaps this is best done in a
post-processing step.
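Such a post-processing step could be sketched roughly as follows. It assumes that the generated HTML writes in-document links as href="#some-id" with double quotes, and that we already know which output page each section id ended up on. The function name `relink`, the mapping `page_for_id` and the page file names are all hypothetical.

```python
import re

# Matches links that point at a fragment within the same file.
HREF_FRAGMENT = re.compile(r'href="#([^"]+)"')


def relink(html, current_page, page_for_id):
    """Point in-document links at the page that now holds the target."""
    def fix(match):
        target = match.group(1)
        page = page_for_id.get(target)
        if page is None or page == current_page:
            # Unknown id, or the target is on this page: leave the link alone.
            return match.group(0)
        return 'href="%s#%s"' % (page, target)
    return HREF_FRAGMENT.sub(fix, html)


# Hypothetical mapping from section ids to the files they were written to.
page_for_id = {'setup': 'setup.html', 'views': 'views.html'}
html = '<a href="#setup">Setup</a> <a href="#views">Views</a>'
print(relink(html, 'setup.html', page_for_id))
```

The same function could be run over the separately rendered contents listing, passing the name of the index page as `current_page`, so that every entry points at the right file.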
> The second attempt I made with ``latex2html``. This actually looks a little more
> hopeful although I've made no attempt at styling. [2]_
>
> Although I produced ``tutorial.tex`` using ``rst2latex`` (also borrowing
> from ``grok2pdf.sh``) I found I needed to do some editing of tutorial.tex to get
> rid of some errors and warnings when running ``latex2html``. The sidenotes are
> lost at this stage (I did get ``\fbox`` to work in a separate text.tex file but
> the resulting image was not very pretty). I think they will need to be rendered
> with inline tex markup rather than use the ``\fbox`` markup.
It should be no problem to add a LaTeX stylesheet with a new definition
of \fbox{} that simply inlines the text or whatever. Manual re-editing
shouldn't be necessary by any means.
It would be a shame to lose the visual sidebars, though.
[1] http://docutils.sourceforge.net/docs/dev/todo.html#document-splitting
[2] http://docutils.sourceforge.net/docs/dev/hacking.html
--
http://worldcookery.com -- Professional Zope documentation and training