[Zope-Checkins]
CVS: Zope/lib/python/third_party/docutils/docs/howto
- i18n.txt:1.1.4.1 rst-directives.txt:1.1.4.1 rst-roles.txt:1.1.4.1
Andreas Jung
andreas at andreas-jung.com
Fri Oct 29 15:08:18 EDT 2004
Update of /cvs-repository/Zope/lib/python/third_party/docutils/docs/howto
In directory cvs.zope.org:/tmp/cvs-serv23727/lib/python/third_party/docutils/docs/howto
Added Files:
Tag: Zope-2_7-branch
i18n.txt rst-directives.txt rst-roles.txt
Log Message:
moved docutils to lib/python/third_party
=== Added File Zope/lib/python/third_party/docutils/docs/howto/i18n.txt ===
================================
Docutils_ Internationalization
================================
:Author: David Goodger
:Contact: goodger at users.sourceforge.net
:Date: $Date: 2004/10/29 19:08:17 $
:Revision: $Revision: 1.1.4.1 $
:Copyright: This document has been placed in the public domain.
.. contents::
This document describes the internationalization facilities of the
Docutils_ project. `Introduction to i18n`_ by Tomohiro KUBOTA is a
good general reference. "Internationalization" is often abbreviated
as "i18n": "i" + 18 letters + "n".
.. Note::
The i18n facilities of Docutils should be considered a "first
draft". They work so far, but improvements are welcome.
Specifically, standard i18n facilities like "gettext" have yet to
be explored.
Docutils is designed to work flexibly with text in multiple languages
(one language at a time). Language-specific features are (or should
be [#]_) fully parameterized. To enable a new language, two modules
have to be added to the project: one for Docutils itself (the
`Docutils Language Module`_) and one for the reStructuredText parser
(the `reStructuredText Language Module`_).
.. [#] If anything in Docutils is insufficiently parameterized, it
should be considered a bug. Please report bugs to the Docutils
project bug tracker on SourceForge at
http://sourceforge.net/tracker/?group_id=38414&atid=422030.
.. _Docutils: http://docutils.sourceforge.net/
.. _Introduction to i18n:
http://www.debian.org/doc/manuals/intro-i18n/
Language Module Names
=====================
Language modules are named using a case-insensitive language
identifier as defined in `RFC 1766`_, converting hyphens to
underscores [#]_. A typical language identifier consists of a
2-letter language code from `ISO 639`_ (3-letter codes can be used if
no 2-letter code exists; RFC 1766 is currently being revised to allow
3-letter codes). The language identifier can have an optional subtag,
typically for variations based on country (from `ISO 3166`_ 2-letter
country codes). If no language identifier is specified, the default
is "en" for English. Examples of module names include ``en.py``,
``fr.py``, ``ja.py``, and ``pt_br.py``.
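As a small illustration of the naming rule above, the following sketch
derives a module file name from a language identifier. The helper name
``module_name_for`` is hypothetical and not part of Docutils; it only
shows the lowercasing and the hyphen-to-underscore conversion.

```python
def module_name_for(language_tag="en"):
    """Return the language module file name for a language identifier.

    Hypothetical helper for illustration, not a Docutils function.
    """
    # Identifiers are case-insensitive; hyphens become underscores so the
    # result is a valid Python module name.
    return language_tag.lower().replace("-", "_") + ".py"


print(module_name_for("pt-BR"))
print(module_name_for())
```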
.. [#] Subtags are separated from primary tags by underscores instead
of hyphens, to conform to Python naming rules.
.. _RFC 1766: http://www.faqs.org/rfcs/rfc1766.html
.. _ISO 639: http://lcweb.loc.gov/standards/iso639-2/englangn.html
.. _ISO 3166: http://www.iso.ch/iso/en/prods-services/iso3166ma/
02iso-3166-code-lists/index.html
Python Code
===========
All Python code in Docutils will be ASCII-only. In language modules,
Unicode-escapes will have to be used for non-ASCII characters.
Although it may be possible for some developers to store non-ASCII
characters directly in strings, it will cause problems for other
developers whose locales are set up differently.
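For instance, a label containing accented characters can be written
with escapes so that the source file itself stays pure ASCII. The
dictionary below is an illustrative fragment, not a copy of any shipped
language module, and it uses Python 3 string syntax (Python 2 code of
the period would add a ``u`` prefix to the literal).

```python
# Illustrative fragment of a language module; the entries are assumptions.
labels = {
    "abstract": "R\u00e9sum\u00e9",   # "e" with acute accent, escaped
    "contents": "Sommaire",
}

# The source text above is ASCII-only, yet the string holds non-ASCII data.
print(len(labels["abstract"]))
print(labels["abstract"].encode("utf-8"))
```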
`PEP 263`_ introduces source code encodings to Python modules,
implemented beginning in Python 2.3. Until PEP 263 is fully
implemented as a well-established convention, proven robust in daily
use, and the tools (editors, CVS, email, etc.) recognize this
convention, Docutils shall remain conservative.
As mentioned in the note above, developers are invited to explore
"gettext" and other i18n technologies.
.. _PEP 263: http://www.python.org/peps/pep-0263.html
Docutils Language Module
========================
Modules in ``docutils/languages`` contain language mappings for
markup-independent language-specific features of Docutils. To make a
new language module, just copy the ``en.py`` file, rename it with the
code for your language (see `Language Module Names`_ above), and
translate the terms as described below.
Each Docutils language module contains three module attributes:
``labels``
This is a mapping of node class names to language-dependent
boilerplate label text. The label text is used by Writer
components when they encounter document tree elements whose class
names are the mapping keys.
The entry values (*not* the keys) should be translated to the
target language.
``bibliographic_fields``
This is a mapping of language-dependent field names (converted to
lower case) to canonical field names (keys of
``DocInfo.biblio_notes`` in ``docutils.transforms.frontmatter``).
It is used when transforming bibliographic fields.
The keys should be translated to the target language.
``author_separators``
This is a list of strings used to parse the 'Authors'
bibliographic field. They separate individual authors' names, and
are tried in order (i.e., earlier items take priority, and the
first item that matches wins). The English-language module
defines them as ``[';', ',']``; semi-colons can be used to
separate names like "Arthur Pewtie, Esq.".
Most languages won't have to "translate" this list.
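To make the three attributes concrete, here is a trimmed sketch of what
such a module might contain, followed by a helper that applies
``author_separators`` in priority order. The dictionary entries are a
small illustrative subset, and ``split_authors`` is a hypothetical
stand-in for the logic of the bibliographic transform, not its actual
code.

```python
# Trimmed, illustrative language-module attributes (not the full en.py).
labels = {"author": "Author", "authors": "Authors", "date": "Date"}
bibliographic_fields = {"author": "author", "authors": "authors",
                        "date": "date"}
author_separators = [";", ","]


def split_authors(text, separators=author_separators):
    """Split an 'Authors' field using the first separator that matches."""
    for separator in separators:
        names = [name.strip() for name in text.split(separator)]
        if len(names) > 1:
            return names
    return [text.strip()]   # no separator found: a single author


print(split_authors("Arthur Pewtie, Esq.; Jane Doe"))
print(split_authors("Me, Myself, I"))
```

Because the semicolon is tried first, the comma inside "Arthur Pewtie,
Esq." is left alone in the first call.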
reStructuredText Language Module
================================
Modules in ``docutils/parsers/rst/languages`` contain language
mappings for language-specific features of the reStructuredText
parser. To make a new language module, just copy the ``en.py`` file,
rename it with the code for your language (see `Language Module
Names`_ above), and translate the terms as described below.
Each reStructuredText language module contains two module
attributes:
``directives``
This is a mapping from language-dependent directive names to
canonical directive names. The canonical directive names are
registered in ``docutils/parsers/rst/directives/__init__.py``, in
``_directive_registry``.
The keys should be translated to the target language. Synonyms
(multiple keys with the same values) are allowed; this is useful
for abbreviations.
``roles``
This is a mapping from language-dependent role names to canonical
role names for interpreted text. The canonical role names are
registered in ``docutils/parsers/rst/states.py``, in
``Inliner._interpreted_roles`` (this may change).
The keys should be translated to the target language. Synonyms
(multiple keys with the same values) are allowed; this is useful
for abbreviations.
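A sketch of such a module for a translation might look like the
following. The German-looking keys are examples chosen for illustration
and are not claimed to match any shipped module; ``canonical_directive``
is a hypothetical lookup helper showing how a local name resolves to a
canonical one.

```python
# Keys are translated; values are canonical names and stay in English.
directives = {
    "achtung": "attention",
    "hinweis": "note",
    "bild": "image",
    "inhalt": "contents",
    "inhaltsverzeichnis": "contents",   # synonym for the same directive
}
roles = {
    "abk": "abbreviation",              # short form as a synonym
    "abkuerzung": "abbreviation",
    "betonung": "emphasis",
}


def canonical_directive(local_name, mapping=directives):
    """Return the canonical name for a local directive name, or None."""
    return mapping.get(local_name.lower())


print(canonical_directive("Inhalt"))
```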
Testing the Language Modules
============================
Whenever a new language module is added or an existing one modified,
the unit tests should be run. The test modules can be found in the
docutils/test directory from CVS_ or from the `latest CVS snapshot`_.
The ``test_language.py`` module can be run as a script. With no
arguments, it will test all language modules. With one or more
language codes, it will test just those languages. For example::
    $ python test_language.py en
    ..
    ----------------------------------------
    Ran 2 tests in 0.095s

    OK
Use the "alltests.py" script to run all test modules, exhaustively
testing the parser and other parts of the Docutils system.
.. _CVS: http://sourceforge.net/cvs/?group_id=38414
.. _latest CVS snapshot: http://docutils.sf.net/docutils-snapshot.tgz
=== Added File Zope/lib/python/third_party/docutils/docs/howto/rst-directives.txt ===
======================================
Creating reStructuredText Directives
======================================
:Authors: Dethe Elza, David Goodger
:Contact: delza at enfoldingsystems.com
:Date: $Date: 2004/10/29 19:08:17 $
:Revision: $Revision: 1.1.4.1 $
:Copyright: This document has been placed in the public domain.
Directives are the primary extension mechanism of reStructuredText.
This document aims to make the creation of new directives as easy and
understandable as possible. There are only a couple of
reStructuredText-specific features the developer needs to know to
create a basic directive.
The syntax of directives is detailed in the `reStructuredText Markup
Specification`_, and standard directives are described in
`reStructuredText Directives`_.
Directives are a reStructuredText markup/parser concept. There is no
"directive" element, no single element that corresponds exactly to the
concept of directives. Instead, choose the most appropriate elements
from the existing Docutils elements. Directives build structures
using the existing building blocks. See `The Docutils Document Tree`_
and the ``docutils.nodes`` module for more about the building blocks
of Docutils documents.
.. _reStructuredText Markup Specification:
../ref/rst/restructuredtext.html#directives
.. _reStructuredText Directives: ../ref/rst/directives.html
.. _The Docutils Document Tree: ../ref/doctree.html
.. contents:: Table of Contents
Define the Directive Function
=============================
The directive function does any processing that the directive
requires. This may require the use of other parts of the
reStructuredText parser. This is where the directive actually *does*
something.
The directive implementation itself is a callback function whose
signature is as follows::
    def directive_fn(name, arguments, options, content, lineno,
                     content_offset, block_text, state, state_machine):
        code...

    # Set function attributes:
    directive_fn.arguments = ...
    directive_fn.options = ...
    directive_fn.content = ...
Function attributes are described below (see `Specify Directive
Arguments, Options, and Content`_). The directive function parameters
are as follows:
- ``name`` is the directive type or name.
- ``arguments`` is a list of positional arguments, as specified in the
``arguments`` function attribute.
- ``options`` is a dictionary mapping option names to values. The
options handled by a directive function are specified in the
``options`` function attribute.
- ``content`` is a list of strings, the directive content. Use the
``content`` function attribute to allow directive content.
- ``lineno`` is the line number of the first line of the directive.
- ``content_offset`` is the line offset of the first line of the
content from the beginning of the current input. Used when
initiating a nested parse.
- ``block_text`` is a string containing the entire directive. Include
it as the content of a literal block in a system message if there is
a problem.
- ``state`` is the state which called the directive function.
- ``state_machine`` is the state machine which controls the state
which called the directive function.
Directive functions return a list of nodes which will be inserted into
the document tree at the point where the directive was encountered.
This can be an empty list if there is nothing to insert. For ordinary
directives, the list must contain body elements or structural
elements. Some directives are intended specifically for substitution
definitions, and must return a list of ``Text`` nodes and/or inline
elements (suitable for inline insertion, in place of the substitution
reference). Such directives must verify substitution definition
context, typically using code like this::
    if not isinstance(state, states.SubstitutionDef):
        error = state_machine.reporter.error(
            'Invalid context: the "%s" directive can only be used '
            'within a substitution definition.' % (name),
            nodes.literal_block(block_text, block_text), line=lineno)
        return [error]
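The calling convention can be tried out without the parser. The sketch
below is self-contained and deliberately avoids importing Docutils:
``StubNode``, ``shout`` and ``call_directive`` are hypothetical
stand-ins, and ``None`` is passed where a real ``state`` and
``state_machine`` would be supplied.

```python
class StubNode:
    """Stand-in for a docutils.nodes element (hypothetical)."""

    def __init__(self, kind, text):
        self.kind = kind
        self.text = text


def shout(name, arguments, options, content, lineno,
          content_offset, block_text, state, state_machine):
    """Toy directive: return the content upper-cased in one paragraph."""
    if not content:
        return []                      # nothing to insert
    return [StubNode("paragraph", " ".join(content).upper())]


# Function attribute; attributes left unset are treated as None.
shout.content = 1


def call_directive(fn, name, content):
    """Simplified caller: read the attributes, then invoke the function."""
    allows_content = getattr(fn, "content", None)
    if content and not allows_content:
        raise ValueError('no content permitted for "%s"' % name)
    return fn(name, [], {}, content, 1, 1, "", None, None)


nodes = call_directive(shout, "shout", ["hello", "world"])
print(nodes[0].kind, nodes[0].text)
```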
Specify Directive Arguments, Options, and Content
=================================================
Function attributes are interpreted by the directive parser (from the
``docutils.parsers.rst.states.Body.run_directive()`` method). If
unspecified, directive function attributes are assumed to have the
value ``None``. Three directive function attributes are recognized:
- ``arguments``: A 3-tuple specifying the expected positional
arguments, or ``None`` if the directive has no arguments. The 3
items in the tuple are:
1. The number of required arguments.
2. The number of optional arguments.
3. A boolean, indicating if the final argument may contain whitespace.
Arguments are normally single whitespace-separated words. The final
argument may contain whitespace when indicated by the value 1 (True)
for the third item in the argument spec tuple. In this case, the
final argument in the ``arguments`` parameter to the directive
function will contain spaces and/or newlines, preserved from the
input text.
If the form of the arguments is more complex, specify only one
argument (either required or optional) and indicate that final
whitespace is OK (1/True); the client code must do any
context-sensitive parsing.
- ``options``: The option specification. ``None`` or an empty dict
implies no options to parse.
An option specification must be defined detailing the options
available to the directive. An option spec is a mapping of option
name to conversion function; conversion functions are applied to
each option value to check validity and convert them to the expected
type. Python's built-in conversion functions are often usable for
this, such as ``int``, ``float``, and ``bool`` (included in Python
from version 2.2.1). Other useful conversion functions are included
in the ``docutils.parsers.rst.directives`` package (in the
``__init__.py`` module):
- ``flag``: For options with no option arguments. Checks for an
argument (raises ``ValueError`` if found), returns ``None`` for
valid flag options.
- ``unchanged_required``: Returns the text argument, unchanged.
Raises ``ValueError`` if no argument is found.
- ``unchanged``: Returns the text argument, unchanged. Returns an
empty string ("") if no argument is found.
- ``path``: Returns the path argument unwrapped (with newlines
removed). Raises ``ValueError`` if no argument is found or if the
path contains internal whitespace.
- ``nonnegative_int``: Checks for a nonnegative integer argument,
and raises ``ValueError`` if not.
- ``class_option``: Converts the argument into an ID-compatible
string and returns it. Raises ``ValueError`` if no argument is
found.
A further utility function, ``choice``, is supplied to enable
options whose argument must be a member of a finite set of possible
values. A custom conversion function must be written to use it.
For example::
    from docutils.parsers.rst import directives

    def yesno(argument):
        return directives.choice(argument, ('yes', 'no'))
For example, here is an option spec for a directive which allows two
options, "name" and "value", each with an option argument::
    directive_fn.options = {'name': directives.unchanged, 'value': int}
- ``content``: A boolean; true if content is allowed. Directive
functions must handle the case where content is required but not
present in the input text (an empty content list will be supplied).
The final step of the ``run_directive()`` method is to call the
directive function itself.
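The way the two specifications are applied can be approximated in a few
lines. This is a simplified sketch, not the ``run_directive()`` code:
``parse_arguments`` and ``convert_options`` are hypothetical names, and
``flag`` and ``nonnegative_int`` are local stand-ins that follow the
behavior described above rather than imports from Docutils.

```python
def parse_arguments(arg_text, spec):
    """Split argument text according to a (required, optional, ws) tuple."""
    required, optional, final_whitespace = spec
    arguments = arg_text.split()
    if len(arguments) < required:
        raise ValueError("%d argument(s) required" % required)
    if len(arguments) > required + optional:
        if not final_whitespace:
            raise ValueError("too many arguments")
        # Keep everything after the last allowed split in the final argument.
        arguments = arg_text.split(None, required + optional - 1)
    return arguments


def flag(argument):
    """Stand-in: accept only options given without a value."""
    if argument and argument.strip():
        raise ValueError("no argument is allowed")
    return None


def nonnegative_int(argument):
    """Stand-in: convert to int and reject negative values."""
    value = int(argument)
    if value < 0:
        raise ValueError("negative value; must be positive or zero")
    return value


def convert_options(raw_options, option_spec):
    """Apply each conversion function to the matching raw option value."""
    converted = {}
    for name, value in raw_options.items():
        if name not in option_spec:
            raise ValueError('unknown option: "%s"' % name)
        converted[name] = option_spec[name](value)
    return converted


spec = {"depth": nonnegative_int, "local": flag}
print(parse_arguments("Table of\nContents", (0, 1, 1)))
print(convert_options({"depth": "2", "local": ""}, spec))
```

With the spec ``(0, 1, 1)`` the whole argument text, newline included,
ends up in a single list item, which is what lets a title or a long URL
span several words or lines.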
Register the Directive
======================
If the directive is a general-use addition to the Docutils core, it
must be registered with the parser and language mappings added:
1. Register the new directive using its canonical name in
``docutils/parsers/rst/directives/__init__.py``, in the
``_directive_registry`` dictionary. This allows the
reStructuredText parser to find and use the directive.
2. Add an entry to the ``directives`` dictionary in
``docutils/parsers/rst/languages/en.py`` for the directive, mapping
the English name to the canonical name (both lowercase). Usually
the English name and the canonical name are the same.
3. Update all the other language modules as well. For languages in
which you are proficient, please add translations. For other
languages, add the English directive name plus "(translation
required)".
If the directive is application-specific, use the
``register_directive`` function::
    from docutils.parsers.rst import directives
    directives.register_directive(directive_name, directive_function)
Examples
========
For the most direct and accurate information, "Use the Source, Luke!".
All standard directives are documented in `reStructuredText
Directives`_, and the source code implementing them is located in the
``docutils/parsers/rst/directives`` package. The ``__init__.py``
module contains a mapping of directive name to module & function name.
Several representative directives are described below.
Admonitions
-----------
Admonition directives, such as "note" and "caution", are quite simple.
They have no directive arguments or options. Admonition directive
content is interpreted as ordinary reStructuredText. The directive
function simply hands off control to a generic directive function::
    def note(*args):
        return admonition(nodes.note, *args)

    note.content = 1
Note that the only thing distinguishing the various admonition
directives is the element (node class) generated. In the code above,
the node class is passed as the first argument to the generic
directive function (early version), where the actual processing takes
place::
    def admonition(node_class, name, arguments, options, content, lineno,
                   content_offset, block_text, state, state_machine):
        text = '\n'.join(content)
        admonition_node = node_class(text)
        if text:
            state.nested_parse(content, content_offset, admonition_node)
            return [admonition_node]
        else:
            warning = state_machine.reporter.warning(
                'The "%s" admonition is empty; content required.'
                % (name), '',
                nodes.literal_block(block_text, block_text), line=lineno)
            return [warning]
Three things are noteworthy in the function above:
1. The ``admonition_node = node_class(text)`` line creates the wrapper
element, using the class passed in from the initial (stub)
directive function.
2. The call to ``state.nested_parse()`` is what does the actual
processing. It parses the directive content and adds any generated
elements as child elements of ``admonition_node``.
3. If there was no directive content, a warning is generated and
returned. The call to ``state_machine.reporter.warning()``
includes a literal block containing the entire directive text
(``block_text``) and the line (``lineno``) of the top of the
directive.
"image"
-------
The "image" directive is used to insert a picture into a document.
This directive has one argument, the path to the image file, and
supports several options. There is no directive content. Here's an
early version of the image directive function::
    def image(name, arguments, options, content, lineno,
              content_offset, block_text, state, state_machine):
        reference = ''.join(arguments[0].split('\n'))
        if reference.find(' ') != -1:
            error = state_machine.reporter.error(
                'Image URI contains whitespace.', '',
                nodes.literal_block(block_text, block_text),
                line=lineno)
            return [error]
        options['uri'] = reference
        image_node = nodes.image(block_text, **options)
        return [image_node]

    image.arguments = (1, 0, 1)
    image.options = {'alt': directives.unchanged,
                     'height': directives.nonnegative_int,
                     'width': directives.nonnegative_int,
                     'scale': directives.nonnegative_int,
                     'align': align}
Several things are noteworthy in the code above:
1. The "image" directive requires a single argument, which is allowed
to contain whitespace (see the argument spec above,
``image.arguments = (1, 0, 1)``). This is to allow for long URLs
which may span multiple lines. The first line of the ``image``
function joins the URL, discarding any embedded newlines. Then the
result is checked for embedded spaces, which are *not* allowed.
2. The reference is added to the ``options`` dictionary under the
"uri" key; this becomes an attribute of the ``nodes.image`` element
object. Any other attributes have already been set explicitly in
the source text.
3. The "align" option depends on the following definitions (which
actually occur earlier in the source code)::
    align_values = ('top', 'middle', 'bottom', 'left', 'center',
                    'right')

    def align(argument):
        return directives.choice(argument, align_values)
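The whitespace handling from point 1 can be isolated into a small
function. ``clean_uri`` is a hypothetical name used only for this
sketch; it raises ``ValueError`` where the directive above returns a
system message.

```python
def clean_uri(argument):
    """Join a URI that was wrapped over several lines; reject spaces."""
    reference = "".join(argument.split("\n"))
    if " " in reference:
        raise ValueError("Image URI contains whitespace.")
    return reference


print(clean_uri("http://example.org/images/\nvery/long/path/picture.png"))
```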
"contents"
----------
The "contents" directive is used to insert an auto-generated table of
contents (TOC) into a document. It takes one optional argument, a
title for the TOC. If no title is specified, a default title is used
instead. The directive also handles several options. Here's an early
version of the code::
    def contents(name, arguments, options, content, lineno,
                 content_offset, block_text, state, state_machine):
        """Table of contents."""
        if arguments:
            title_text = arguments[0]
            text_nodes, messages = state.inline_text(title_text, lineno)
            title = nodes.title(title_text, '', *text_nodes)
        else:
            messages = []
            title = None
        pending = nodes.pending(parts.Contents, {'title': title},
                                block_text)
        pending.details.update(options)
        state_machine.document.note_pending(pending)
        return [pending] + messages

    contents.arguments = (0, 1, 1)
    contents.options = {'depth': directives.nonnegative_int,
                        'local': directives.flag,
                        'backlinks': backlinks}
Aspects of note include:
1. The ``contents.arguments = (0, 1, 1)`` function attribute specifies
a single, *optional* argument. If no argument is present, the
``arguments`` parameter to the directive function will be an empty
list.
2. If an argument *is* present, its text is passed to
``state.inline_text()`` for parsing. Titles may contain inline
markup, such as emphasis or inline literals.
3. The table of contents is not generated right away. Typically, a
TOC is placed near the beginning of a document, and is a summary or
outline of the section structure of the document. The entire
document must already be processed before a summary can be made.
This directive leaves a ``nodes.pending`` placeholder element in
the document tree, marking the position of the TOC and including a
``details`` internal attribute containing all the directive
options, effectively communicating the options forward. The actual
table of contents processing is performed by a transform,
``docutils.transforms.parts.Contents``, after the rest of the
document has been parsed.
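The deferred approach can be illustrated with a toy two-pass model.
Everything in this sketch is hypothetical: the line-based input format,
the ``Pending`` class, and the ``parse`` and ``apply_contents``
functions merely mimic the idea of leaving a placeholder during parsing
and resolving it afterwards; they are not the Docutils classes of
similar name.

```python
class Pending:
    """Placeholder left in the tree; `details` carries options forward."""

    def __init__(self, details):
        self.details = details


def parse(lines):
    """First pass: build a flat tree, deferring the table of contents."""
    tree = []
    for line in lines:
        if line.startswith("contents:"):
            title = line[len("contents:"):].strip() or "Contents"
            tree.append(Pending({"title": title}))
        elif line.startswith("# "):
            tree.append(("section", line[2:]))
        else:
            tree.append(("paragraph", line))
    return tree


def apply_contents(tree):
    """Second pass: replace each placeholder once all sections are known."""
    titles = [item[1] for item in tree
              if isinstance(item, tuple) and item[0] == "section"]
    return [("toc", item.details["title"], titles)
            if isinstance(item, Pending) else item
            for item in tree]


source = ["contents: Overview", "# Intro", "Some text.", "# Usage"]
resolved = apply_contents(parse(source))
print(resolved[0])
```

The placeholder sits at the top of the tree, yet the resolved entry
lists sections that appear after it, which is only possible because the
second pass runs once parsing has finished.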
=== Added File Zope/lib/python/third_party/docutils/docs/howto/rst-roles.txt ===
==================================================
Creating reStructuredText Interpreted Text Roles
==================================================
:Authors: David Goodger
:Contact: goodger at python.org
:Date: $Date: 2004/10/29 19:08:17 $
:Revision: $Revision: 1.1.4.1 $
:Copyright: This document has been placed in the public domain.
Interpreted text roles are an extension mechanism for inline markup in
reStructuredText. This document aims to make the creation of new
roles as easy and understandable as possible.
Standard roles are described in `reStructuredText Interpreted Text
Roles`_. See the `Interpreted Text`_ section in the `reStructuredText
Markup Specification`_ for syntax details.
.. _reStructuredText Interpreted Text Roles: ../ref/rst/roles.html
.. _Interpreted Text:
../ref/rst/restructuredtext.html#interpreted-text
.. _reStructuredText Markup Specification:
../ref/rst/restructuredtext.html
.. contents::
Define the Role Function
========================
The role function creates and returns inline elements (nodes) and does
any additional processing required. Its signature is as follows::
    def role_fn(name, rawtext, text, lineno, inliner,
                options={}, content=[]):
        code...

    # Set function attributes for customization:
    role_fn.options = ...
    role_fn.content = ...
Function attributes are described below (see `Specify Role Function
Options and Content`_). The role function parameters are as follows:
* ``name``: The local name of the interpreted role, the role name
actually used in the document.
* ``rawtext``: A string containing the entire interpreted text input,
including the role and markup. Return it as a ``problematic`` node
linked to a system message if a problem is encountered.
* ``text``: The interpreted text content.
* ``lineno``: The line number where the interpreted text begins.
* ``inliner``: The ``docutils.parsers.rst.states.Inliner`` object that
called ``role_fn``. It contains several attributes useful for error
reporting and document tree access.
* ``options``: A dictionary of directive options for customization
(from the `"role" directive`_), to be interpreted by the role
function. Used for additional attributes for the generated elements
and other functionality.
* ``content``: A list of strings, the directive content for
customization (from the `"role" directive`_). To be interpreted by
the role function.
Role functions return a tuple of two values:
* A list of nodes which will be inserted into the document tree at the
point where the interpreted role was encountered (can be an empty
list).
* A list of system messages, which will be inserted into the document tree
immediately after the end of the current block (can also be empty).
Specify Role Function Options and Content
=========================================
Function attributes are for customization, and are interpreted by the
`"role" directive`_. If unspecified, role function attributes are
assumed to have the value ``None``. Two function attributes are
recognized:
- ``options``: The option specification. All role functions
implicitly support the "class" option, unless disabled with an
explicit ``{'class': None}``.
An option specification must be defined detailing the options
available to the "role" directive. An option spec is a mapping of
option name to conversion function; conversion functions are applied
to each option value to check validity and convert them to the
expected type. Python's built-in conversion functions are often
usable for this, such as ``int``, ``float``, and ``bool`` (included
in Python from version 2.2.1). Other useful conversion functions
are included in the ``docutils.parsers.rst.directives`` package.
For further details, see `Creating reStructuredText Directives`_.
- ``content``: A boolean; true if "role" directive content is allowed.
Role functions must handle the case where content is required but
not supplied (an empty content list will be supplied).
As of this writing, no roles accept directive content.
Note that unlike directives, the "arguments" function attribute is not
supported for role customization. Directive arguments are handled by
the "role" directive itself.
.. _"role" directive: ../ref/rst/directives.html#role
.. _Creating reStructuredText Directives:
rst-directives.html#specify-directive-arguments-options-and-content
Register the Role
=================
If the role is a general-use addition to the Docutils core, it must be
registered with the parser and language mappings added:
1. Register the new role using the canonical name::
    from docutils.parsers.rst import roles
    roles.register_canonical_role(name, role_function)
This code is normally placed immediately after the definition of
the role function.
2. Add an entry to the ``roles`` dictionary in
``docutils/parsers/rst/languages/en.py`` for the role, mapping the
English name to the canonical name (both lowercase). Usually the
English name and the canonical name are the same. Abbreviations
and other aliases may also be added here.
3. Update all the other language modules as well. For languages in
which you are proficient, please add translations. For other
languages, add the English role name plus "(translation required)".
If the role is application-specific, use the ``register_local_role``
function::
    from docutils.parsers.rst import roles
    roles.register_local_role(name, role_function)
Examples
========
For the most direct and accurate information, "Use the Source, Luke!".
All standard roles are documented in `reStructuredText Interpreted
Text Roles`_, and the source code implementing them is located in the
``docutils/parsers/rst/roles.py`` module. Several representative
roles are described below.
Generic Roles
-------------
Many roles simply wrap a given element around the text. There's a
special helper function, ``register_generic_role``, which generates a
role function from the canonical role name and node class::
    register_generic_role('emphasis', nodes.emphasis)
For the implementation of ``register_generic_role``, see the
``docutils.parsers.rst.roles`` module.
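Conceptually, such a helper is a small function factory. The sketch
below is not the Docutils implementation: ``Emphasis``, ``_registry``
and this local ``register_generic_role`` are hypothetical stand-ins
that only show how a role function with the documented signature and
return value can be generated from a name and a node class.

```python
class Emphasis:
    """Stand-in for a node class such as nodes.emphasis (hypothetical)."""

    def __init__(self, rawsource, text, **attributes):
        self.rawsource = rawsource
        self.text = text
        self.attributes = attributes


_registry = {}   # canonical role name -> role function


def register_generic_role(canonical_name, node_class):
    """Create and register a role that wraps the text in `node_class`."""

    def role_fn(name, rawtext, text, lineno, inliner,
                options=None, content=None):
        node = node_class(rawtext, text, **(options or {}))
        return [node], []          # (nodes, system messages)

    _registry[canonical_name] = role_fn
    return role_fn


register_generic_role("emphasis", Emphasis)
nodes, messages = _registry["emphasis"](
    "emphasis", ":emphasis:`word`", "word", 1, None)
print(nodes[0].text, messages)
```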
RFC Reference Role
------------------
This role allows easy references to RFCs_ (Request For Comments
documents) by automatically providing the base URL,
http://www.faqs.org/rfcs/, and appending the RFC document itself
(rfcXXXX.html, where XXXX is the RFC number). For example::
See :RFC:`2822` for information about email headers.
This is equivalent to::
See `RFC 2822`__ for information about email headers.
__ http://www.faqs.org/rfcs/rfc2822.html
Here is the implementation of the role::
    def rfc_reference_role(role, rawtext, text, lineno, inliner,
                           options={}, content=[]):
        try:
            rfcnum = int(text)
            if rfcnum <= 0:
                raise ValueError
        except ValueError:
            msg = inliner.reporter.error(
                'RFC number must be a number greater than or equal to 1; '
                '"%s" is invalid.' % text, line=lineno)
            prb = inliner.problematic(rawtext, rawtext, msg)
            return [prb], [msg]
        # Base URL mainly used by inliner.rfc_reference, so this is correct:
        ref = inliner.rfc_url % rfcnum
        node = nodes.reference(rawtext, 'RFC ' + text, refuri=ref, **options)
        return [node], []

    register_canonical_role('rfc-reference', rfc_reference_role)
Noteworthy in the code above are:
1. The interpreted text itself should contain the RFC number. The
``try`` clause verifies by converting it to an integer. If the
conversion fails, the ``except`` clause is executed: a system
message is generated, the entire interpreted text construct (in
``rawtext``) is wrapped in a ``problematic`` node (linked to the
system message), and the two are returned.
2. The RFC reference itself is constructed from a stock URI, set as
the "refuri" attribute of a "reference" element.
3. The ``options`` function parameter, a dictionary, may contain a
"class" customization attribute; it is passed through to the
"reference" element node constructor.
.. _RFCs: http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?query=rfc&action=Search&sourceid=Mozilla-search
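The validation and URL construction used by the RFC role can also be
exercised on their own. In this sketch ``rfc_uri`` is a hypothetical
helper; the URL template is assembled from the base URL and file-name
pattern given in the text, and a ``ValueError`` takes the place of the
system message.

```python
RFC_URL = "http://www.faqs.org/rfcs/rfc%d.html"   # base URL plus rfcXXXX.html


def rfc_uri(text):
    """Return the reference URI for an RFC number given as text."""
    try:
        rfcnum = int(text)
    except ValueError:
        rfcnum = 0                 # treat non-numeric input as invalid
    if rfcnum <= 0:
        raise ValueError(
            'RFC number must be a number greater than or equal to 1; '
            '"%s" is invalid.' % text)
    return RFC_URL % rfcnum


print(rfc_uri("2822"))
```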