1.1 --- a/docs/encodings.html Sat Sep 08 16:01:41 2007 +0000
1.2 +++ b/docs/encodings.html Sat Sep 08 16:02:18 2007 +0000
1.3 @@ -1,7 +1,7 @@
1.4 +<?xml version="1.0" encoding="iso-8859-1"?>
1.5 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
1.6 <html xmlns="http://www.w3.org/1999/xhtml"><head>
1.7 -
1.8 - <title>Character Encodings</title><meta name="generator" content="amaya 8.1a, see http://www.w3.org/Amaya/" />
1.9 + <title>Character Encodings</title>
1.10 <link href="styles.css" rel="stylesheet" type="text/css" /></head>
1.11 <body>
1.12 <h1>Character Encodings</h1>
1.13 @@ -19,16 +19,16 @@
1.14 users of your application, the text will be in some kind of character
1.15 encoding. For example, in English-speaking environments, the US-ASCII
1.16 encoding is common and contains the basic letters, numbers and symbols
1.17 -used in English, whereas in Western Europe encodings like
1.18 -ISO-8859-1 and ISO-8859-15 are typically used, since they contain
1.19 +used in English, whereas in Western Europe encodings like
1.20 +ISO-8859-1 and ISO-8859-15 are typically used, since they contain
1.21 additional letters and symbols in order to support other languages.
1.22 Often, UTF-8 is used to encode text because it covers most languages
1.23 simultaneously and is therefore flexible enough for many applications.</p>
1.24 <p>When URLs are received in applications, in order for some of the
1.25 request parameters to be interpreted, the situation is a bit more
1.26 awkward. The original text is encoded in US-ASCII but will contain
1.27 -special numeric codes that indicate character values in the
1.28 -original text encoding - see the <a href="parameters.html">description
1.29 +special numeric codes that indicate character values in the
1.30 +original text encoding - see the <a href="parameters.html">description
1.31 of query strings</a> for more information.</p>
1.32 <h2>Recommendations</h2>
1.33 <dl>
1.34 @@ -47,7 +47,7 @@
1.35 <li>If you must include hard-coded messages in your application code,
1.36 make sure to specify the encoding using the <a href="http://www.python.org/peps/pep-0263.html">standard declaration</a>
1.37 at the top of your source file.</li>
1.38 - <li>Remember that the standard library <code>codecs</code>
1.39 + <li>Remember that the standard library <code>codecs</code>
1.40 module contains useful functions to access streams as if Unicode
1.41 objects were being transmitted; for example:</li>
1.42 </ul>
1.43 @@ -73,14 +73,14 @@
1.44 can be used directly with various transaction methods. Here is an
1.45 outline of code which does this:</p>
1.46 <pre>from WebStack.Generic import ContentType<br /><br />class MyResource:<br /><br /> encoding = "utf-8" # We decide on "utf-8" as our chosen<br /> # encoding.<br /> def respond(self, trans):<br /> [Do various things.]<br /><br /> fields = trans.get_fields_from_body(encoding=self.encoding) # Explicitly use the encoding.<br /><br /> [Do other things with the Unicode values from the fields.]<br /><br /> trans.set_content_type(ContentType("text/html", self.encoding)) # The output Web page uses the encoding.<br /><br /> [Produce the response, making sure that self.encoding is used to convert Unicode to raw strings.]</pre>
1.47 -<h3>Use EncodingSelector to Set the Default Encoding</h3><p>An arguably better approach is to use selectors (as described in <a href="selectors.html">"Selectors - Components for Dispatching to Resources"</a>), typically in a "site map" arrangement (as described in <a href="deploying.html">"Deploying a WebStack Application"</a>), specifically using the <code>EncodingSelector</code>:</p><pre>from WebStack.Generic import ContentType<br /><br />class MyResource:<br /><br /> def respond(self, trans):<br /> [Do various things.]<br /><br /> fields = trans.get_fields_from_body() # Encoding set by EncodingSelector.<br /><br /> [Do other things with the Unicode values from the fields.]<br /><br /> trans.set_content_type(ContentType("text/html")) # The output Web page uses the default encoding.<br /><br /> [Produce the response, making sure that self.encoding is used to convert Unicode to raw strings.]<br /><br />def get_site_map():<br /><br /> return EncodingSelector(MyResource(), "utf-8")</pre><h3>Tell Encodings to Other Components</h3>
1.48 +<h3>Use EncodingSelector to Set the Default Encoding</h3><p>An arguably better approach is to use selectors (as described in <a href="selectors.html">"Selectors - Components for Dispatching to Resources"</a>), typically in a "site map" arrangement (as described in <a href="deploying.html">"Deploying a WebStack Application"</a>), specifically using the <code>EncodingSelector</code>:</p><pre>from WebStack.Generic import ContentType<br /><br />class MyResource:<br /><br /> def respond(self, trans):<br /> [Do various things.]<br /><br /> fields = trans.get_fields_from_body() # Encoding set by EncodingSelector.<br /><br /> [Do other things with the Unicode values from the fields.]<br /><br /> trans.set_content_type(ContentType("text/html")) # The output Web page uses the default encoding.<br /><br /> [Produce the response, making sure that self.encoding is used to convert Unicode to raw strings.]<br /><br />def get_site_map():<br /><br /> return EncodingSelector(MyResource(), "utf-8")</pre><h3>Tell Encodings to Other Components</h3>
1.49 <p>When using other components to generate content (see <a href="integrating.html">"Integrating with Other Systems"</a>), it may
1.50 be the case that such components will just write the generated content
1.51 -straight to a normal stream (rather than one wrapped by a <code>codecs</code>
1.52 +straight to a normal stream (rather than one wrapped by a <code>codecs</code>
1.53 module function). In such cases, it is likely that for textual content
1.54 such as XML or related formats (XHTML, SVG, HTML) you will need to
1.55 instruct the component to use your chosen encoding; for example:</p>
1.56 <pre> # In the respond method, xml_document is an xml.dom.minidom.Document object...<br /> xml_document.toxml(self.encoding)</pre>
1.57 <p>This will then generate the appropriate characters in the output <span style="font-style: italic;">and</span> specify the correct encoding
1.58 for the XML document.</p>
1.59 -</body></html>
1.60 \ No newline at end of file
1.61 +</body></html>
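
Note on the touched passages: the lines reflowed above mention two standard library techniques, wrapping a stream with a `codecs` module function and passing the chosen encoding to `toxml`. The following is a minimal sketch of both, not code from WebStack itself. It assumes Python 3, where `toxml(encoding)` returns bytes; in the Python 2 of the page's era it returned a plain string. The names `ENCODING`, `write_text` and `serialise_document` are hypothetical, introduced only for illustration.

```python
import codecs
import io
from xml.dom.minidom import parseString

ENCODING = "utf-8"  # assumed choice, mirroring the encoding used in the page's examples


def write_text(stream, text, encoding=ENCODING):
    """Write Unicode text to a byte stream through a codecs writer."""
    writer = codecs.getwriter(encoding)(stream)
    writer.write(text)


def serialise_document(xml_document, encoding=ENCODING):
    """Serialise a minidom document to bytes with a matching XML declaration."""
    return xml_document.toxml(encoding)


# A plain byte stream standing in for a response stream (hypothetical).
raw = io.BytesIO()
write_text(raw, "caf\u00e9")

# A small document containing a non-ASCII character.
doc = parseString("<p>caf\u00e9</p>")
payload = serialise_document(doc)
```

Here `raw` ends up holding the encoded bytes, and `payload` begins with an XML declaration naming the same encoding, which is the behaviour the page's final paragraph describes.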