# HG changeset patch
# User paulb
# Date 1114540386 0
# Node ID b086884cf0175655d5b729f15fe8f976e8f8e9df
# Parent 7b0c3aae7b35bede26993b72909a022a54609180
[project @ 2005-04-26 18:33:06 by paulb]
Improved URL and path documentation, adding new documents to make the material more readable.

diff -r 7b0c3aae7b35 -r b086884cf017 docs/parameters.html
--- a/docs/parameters.html Mon Apr 25 22:19:20 2005 +0000
+++ b/docs/parameters.html Tue Apr 26 18:33:06 2005 +0000
@@ -1,106 +1,121 @@
- - + Request Parameters and Uploads - + -

Request Parameters and Uploads

- -

Even though it is possible to expose different parts of an application -using different URLs and paths, this usually is only +

Even though it is possible to expose different parts of an +application +using different URLs and paths, this usually +is only enough for applications which model some kind of filesystem or repository. Applications which -involve user input through forms, for example, need to be able to receive -such input by other means, and this is where request parameters come in. For + href="paths-filesystem.html">filesystem or repository. +Applications which +involve user input through forms, for example, need to be able to +receive +such input by other means, and this is where request parameters come +in. For example, when a user fills out a form in a Web browser, the following happens:

The browser collects the values in the form fields and puts them in a - request as request parameters.
The browser collects the values in the form fields and puts them +in a request as request parameters.
The request is sent to the server environment and into the - application.
The application reads the field values using the WebStack API.

Parameter Origins

Request parameters can originate from two sources:

Request headers - parameters are - found here when they are specified in the URL as a "query string".
Request bodies - parameters are - found here when the POST request method is - used.
Request headers - +parameters are found here when they are specified in the URL as a +"query string".
Request bodies - parameters +are found here when the POST request method +is used.

- -

One useful application of parameters transferred in request bodies is the +

One useful application of parameters transferred in request bodies +is the sending or uploading of file contents through such parameters - this is -described in "Request Body Parameters". Another way of uploading content in +described in "Request Body Parameters". Another way of uploading +content in conjunction with the PUT request method is mentioned below.

WebStack API - Getting All Parameters

- -

If the origin of the different parameters received in a request is not -particularly interesting or important, WebStack provides a convenience method +

If the origin of the different parameters received in a request is +not +particularly interesting or important, WebStack provides a convenience +method in transaction objects to get all known parameters from a request:

get_fields: This method returns a dictionary mapping field names to lists of - values for all known parameters. Each value will be a Unicode - object.
- An optional encoding parameter may be used to assist the - process of converting parameter values to Unicode objects - see "Request Body Parameters" and "Character Encodings" for more discussion of - this parameter.; This method returns a dictionary mapping field names to lists of +values for all known parameters. Each value will be a Unicode object.
+An optional encoding parameter may be used to assist the +process of converting parameter values to Unicode objects - see "Request Body Parameters" and "Character Encodings" for more discussion of +this parameter.
get_query_string: This method returns the part of the URL which contains parameter +information. Such information will be "URL-encoded", meaning that +certain characters will have the form %xx where xx +is a two digit hexadecimal number referring to the byte value of the +unencoded character - see "Character +Encodings" for information on how byte values should be +interpreted.
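
The `%xx` form described for get_query_string can be illustrated with Python's standard `urllib.parse` module. This is only a sketch: the query string is a made-up example, UTF-8 is an assumed encoding, the helper name `decode_pairs` is hypothetical, and no WebStack code is involved.

```python
from urllib.parse import unquote_to_bytes

# A made-up query string of the kind get_query_string might return.
query_string = "name=caf%C3%A9&city=Oslo"


def decode_pairs(query, encoding="utf-8"):
    """Split a raw query string and decode each %xx sequence into text."""
    pairs = []
    for item in query.split("&"):
        name, _, value = item.partition("=")
        # Each %xx sequence becomes one byte; the byte string is then
        # interpreted using the (assumed) character encoding.
        pairs.append((unquote_to_bytes(name).decode(encoding),
                      unquote_to_bytes(value).decode(encoding)))
    return pairs


decoded = decode_pairs(query_string)
```

Here `%C3%A9` is two bytes which only become a single character once an encoding is chosen, which is why the byte values need interpreting as a separate step. The sketch deliberately ignores the `+` convention for spaces in form data.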

- -

Generally, it is not recommended to just get all parameters since there -may be some parameters from the request headers which have the same names as -some other parameters from the request body. Consequently, confusion could +

Generally, it is not recommended to just get all parameters since +there +may be some parameters from the request headers which have the same +names as +some other parameters from the request body. Consequently, confusion +could arise about the significance of various parameter values.

Using PUT Requests to Upload Files

- -

When handling requests in your application, instead of treating request as -containers of parameters and using the WebStack API methods to access those -parameters, you can instead choose to read directly from the data sent by the -user and interpret that data in your own way. In most situations, this is not -really necessary - those methods will decode request parameters (for example, -form fields) in a way which is fairly convenient - but when files are being -sent, and when the request method is specified as -PUT, it is necessary to obtain the input stream from the request +

When handling requests in your application, instead of treating +requests as +containers of parameters and using the WebStack API methods to access +those +parameters, you can choose to read directly from the data sent +by the +user and interpret that data in your own way. In most situations, this +is not +really necessary - those methods will decode request parameters (for +example, +form fields) in a way which is fairly convenient - but when files are +being +sent, and when the request method is +specified as +PUT, it is necessary to obtain the input stream from the +request and to read the file contents from that stream.

WebStack API - Reading Directly from Requests

When the request does not contain standard form-encoded parameter -information and instead contains the contents of an uploaded file, methods -like get_fields and get_fields_from_body should be +information and instead contains the contents of an uploaded file, +methods +like get_fields and get_fields_from_body +should be avoided and other methods in the transaction employed.

get_request_stream: This returns the input stream associated with the request. Reading - from this will result in the request body being obtained as a plain - Python string.; This returns the input stream associated with the request. +Reading from this will result in the request body being obtained as a +plain Python string.
get_content_type: This returns a content type object (typically - WebStack.Generic.ContentType) which describes the request - body's contents.; This returns a content type object (typically WebStack.Generic.ContentType) +which describes the request body's contents.
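
Reading an uploaded file from the request stream might look roughly like the following sketch. The `FakeTransaction` class is a hypothetical stand-in written for this example and is not WebStack's transaction class; only the method name get_request_stream comes from the text above. In modern Python the stream yields bytes rather than the plain strings mentioned above.

```python
import io


class FakeTransaction:
    """Hypothetical stand-in for a transaction; not WebStack's own class."""

    def __init__(self, body):
        self._stream = io.BytesIO(body)

    def get_request_stream(self):
        return self._stream


def read_upload(trans, chunk_size=4096):
    """Read the whole request body from the input stream in chunks."""
    stream = trans.get_request_stream()
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:          # an empty read means the body is exhausted
            break
        chunks.append(chunk)
    return b"".join(chunks)


uploaded = read_upload(FakeTransaction(b"file contents sent with PUT"))
```

Reading in chunks avoids asking the stream for an unbounded amount of data in one call; an application handling large uploads would probably write each chunk to a file instead of joining them in memory.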

The purpose and behaviour of PUT request methods is described in the HTTP + href="methods.html">request methods is described in the HTTP specification.

diff -r 7b0c3aae7b35 -r b086884cf017 docs/path-design.html
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/docs/path-design.html Tue Apr 26 18:33:06 2005 +0000
@@ -0,0 +1,42 @@
+ + + + + Path Design and Interpretation + + + + +

Path Design and Interpretation

There are various differing approaches to the problem of +interpreting +paths to resources within Web applications, but these can mostly be +divided +into three categories:

+ + + + + + + + + + + + + + + + + + + +

Approach	Examples
Path as filesystem	WebDAV interface to a repository
Path as resource or service +identifier	A Web shop with very simple paths, eg. `/products`, + `/checkout`, `/orders`
Path as opaque reference	An e-mail reader where the messages already have strange and +unreadable message identifiers

+ +
diff -r 7b0c3aae7b35 -r b086884cf017 docs/path-info-support.html
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/docs/path-info-support.html Tue Apr 26 18:33:06 2005 +0000
@@ -0,0 +1,59 @@
+ + + + + Path Info Support in Server Environments + + + + +

Path Info Support in Server Environments

The following table summarises the support for "path info" within +applications +amongst the supported server environments or frameworks within WebStack:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Framework	Behaviour (Level of Support)
BaseHTTPRequestHandler	Same as path (correct)
CGI	Path beyond resource (correct)
Java Servlet API	Path beyond context (correct)
mod_python	Path beyond resource (correct)
Twisted	Same as path (correct)
Webware	<= 0.8.1: Not supported (needs `ExtraPathInfo` +support) +> 0.8.1: Path beyond context (correct)
WSGI	Path beyond resource (correct)
Zope	Path beyond resource (correct)

+ +
diff -r 7b0c3aae7b35 -r b086884cf017 docs/path-info.html
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/docs/path-info.html Tue Apr 26 18:33:06 2005 +0000
@@ -0,0 +1,144 @@
+ + + + + Paths To and Within Applications + + + + +

Paths To and +Within Applications

One thing to be aware of in the +code of an application is which part +of +a +path refers to the location of the application in a server environment +and +which refers to some resource within the application itself. Consider +this +path:

/folder/application/resource

Let us say that the application +was deployed in a Zope server +instance +inside +folder +and with the name application. +We may +then +say that the path to the application is this: +

/folder/application

Meanwhile, the path within the +application is just this: +

/resource

In WebStack, we refer to this latter case - the path within the +application - as the "path info".

WebStack API - Paths To +Resources Within Applications

On transaction objects, the +following methods exist to inspect paths +to +resources within applications.

get_path_info: This gets the path of a +resource within an application. The path should always contain a +leading / character at the very least.
get_virtual_path_info: This gets the path of a +resource within a part of an application +- the application itself decides the scope of the path and can set the +"virtual path info" using the set_virtual_path_info +method. The path should always contain a leading / +character at the very least.

Choosing the Right Path Value

Given that the path may change depending on where an +application is deployed in a server environment, it may not be very +easy to use when determining which resources are being requested or +accessed within your application. Conversely, given that the "path +info" does not mention the full path to where the resources are, +it may be difficult to use that to provide references or links to those +resources. Here is a summary of how you might use the different path +values:

+ + + + + + + + + + + + + + + + + + + +

Type of information	Possible uses
Path	Building links to +resources within an application - subtract the "path info" from +the end and you should get the location of the application.
Path info	Determining which +resources are being accessed within an application.
Virtual path info	This is an +application-defined version of "path info" and is discussed below.
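
The "subtract the path info from the end" idea in the table can be sketched with plain strings. The function names `application_location` and `link_to` are hypothetical helpers invented for this illustration; in a real application the two inputs would presumably come from a path method without the query string and from get_path_info.

```python
def application_location(path, path_info):
    """Return the part of `path` that precedes `path_info`."""
    if path_info and path.endswith(path_info):
        return path[:len(path) - len(path_info)]
    # If the path does not end with the path info, leave it untouched.
    return path


def link_to(path, path_info, resource):
    """Build a link to another resource within the same application."""
    base = application_location(path, path_info)
    return base.rstrip("/") + "/" + resource.lstrip("/")


location = application_location("/folder/application/resource", "/resource")
link = link_to("/folder/application/resource", "/resource", "/other")
```

Because the location is derived from the request rather than written into the code, the same link-building logic keeps working when the application is deployed somewhere else.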

Using the Virtual Path

Although WebStack sets the "path info" so that applications +know which parts of themselves are being accessed, you may decide +that upon +processing the request, these different parts of your application +should be +presented with different path information. For example, in a +hierarchical +structure of resources, each resource might use the first part of the +"path info" as an input to some kind of processing, but then need to remove the +part it used, passing on a modified path to the other resources. For +such approaches, the "virtual path info" may be used instead, since it +permits modification within an application.

So starting with a virtual path like this (which would be the same +as the "path info")...

/company/department/employee

...a resource might extract company from the start +of the path as follows:

        # Inside a respond method...
        path = trans.get_virtual_path_info()    # get the virtual path
        parts = path.split("/")                 # split the path into components - the first will be empty

Then, having processed the first non-empty part (remembering that +the first part will be an empty string)...

        if len(parts) > 1:                      # check to see how deep we are in the path
            process_something(parts[1])         # process the first non-empty part

...it will reconstruct the path, removing the processed part (but +remembering to preserve a leading / character)...

            trans.set_virtual_path_info("/" + "/".join(parts[2:]))

...and hand over control to another resource which would do the same +thing with the first of the other path components (department +and employee), and so on.

The compelling thing about this strategy is the way that each +resource would only need to take the "virtual path info" into +consideration, and that each resource would believe that it is running +independently from any "parent" resource. Moreover, such resources +could be deployed independently and still operate in the same way +without being "hardcoded" into assuming that they always reside at a +particular level in a resource hierarchy.
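
The fragments above can be assembled into one runnable sketch. Everything here apart from the method names get_virtual_path_info and set_virtual_path_info is an assumption made for illustration: `FakeTransaction` and `LevelResource` are hypothetical classes, and the path `/acme/sales/alice` is a made-up instance of the company/department/employee pattern.

```python
class FakeTransaction:
    """Hypothetical stand-in holding only the virtual path info."""

    def __init__(self, path_info):
        self._virtual_path_info = path_info

    def get_virtual_path_info(self):
        return self._virtual_path_info

    def set_virtual_path_info(self, path):
        self._virtual_path_info = path


class LevelResource:
    """Consumes one path component, then hands over to the next resource."""

    def __init__(self, label, seen, next_resource=None):
        self.label = label
        self.seen = seen
        self.next_resource = next_resource

    def respond(self, trans):
        path = trans.get_virtual_path_info()
        parts = path.split("/")            # the first part will be empty
        if len(parts) > 1 and parts[1]:
            # "Process" the first non-empty part by recording it.
            self.seen.append((self.label, parts[1]))
            # Rebuild the path without that part, keeping the leading "/".
            trans.set_virtual_path_info("/" + "/".join(parts[2:]))
            if self.next_resource is not None:
                self.next_resource.respond(trans)


seen = []
employee = LevelResource("employee", seen)
department = LevelResource("department", seen, employee)
company = LevelResource("company", seen, department)

trans = FakeTransaction("/acme/sales/alice")
company.respond(trans)
remaining = trans.get_virtual_path_info()
```

Each resource looks only at the start of the virtual path, so the `department` resource would behave identically if it were given `/sales/alice` directly without any parent.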

WebStack API - Paths To +Resources Within Applications

On transaction objects, the +following method exists to set virtual paths within applications.

set_virtual_path_info: This sets the virtual path, affecting subsequent calls to the get_virtual_path_info +method. The path should always contain a leading / +character at the very least.

+ +
diff -r 7b0c3aae7b35 -r b086884cf017 docs/paths-filesystem.html
--- a/docs/paths-filesystem.html Mon Apr 25 22:19:20 2005 +0000
+++ b/docs/paths-filesystem.html Tue Apr 26 18:33:06 2005 +0000
@@ -1,8 +1,7 @@
- Treating the Path Like a
-Filesystem
+ Treating the Path Like a Filesystem
@@ -112,6 +111,7 @@
 objects is not the only way to support such hierarchies. We could inspect paths and act dynamically on the supplied information, either choosing to create resources or choosing to handle
-such paths in the same resource.

+such paths in the same resource. See "Paths +To and Within Applications" for some other strategies.

diff -r 7b0c3aae7b35 -r b086884cf017 docs/paths-opaque.html
--- a/docs/paths-opaque.html Mon Apr 25 22:19:20 2005 +0000
+++ b/docs/paths-opaque.html Tue Apr 26 18:33:06 2005 +0000
@@ -1,25 +1,26 @@
- - + Using the Path as an Opaque Reference into an Application - + -

Using the Path as an Opaque Reference into an Application

Since many Web applications have complete control over how paths are -interpreted, the form of the path doesn't necessarily have to follow any -obvious structure as far as users of your application is concerned. Here's an +interpreted, the form of the path doesn't necessarily have to follow +any +obvious structure as far as users of your application are concerned. +Here's an example:

/000251923572ax-0015

- -

However, many would argue that such obscure references, whilst perfectly -acceptable to machines, would make any application counter-intuitive and very -difficult to reference. Sometimes, application developers do not want people +

Many people would argue that such obscure references, whilst +perfectly +acceptable to machines, would make any application counter-intuitive +and very +difficult to reference. However, application developers sometimes +do not want people "bookmarking" resources or functions within an application, and so such concerns don't matter to them.

diff -r 7b0c3aae7b35 -r b086884cf017 docs/paths.html
--- a/docs/paths.html Mon Apr 25 22:19:20 2005 +0000
+++ b/docs/paths.html Tue Apr 26 18:33:06 2005 +0000
@@ -1,176 +1,82 @@
- - + URLs and Paths - + -

URLs and Paths

- -

The URL at which your application shall appear is arguably the first part -of the application's user interface that any user will see. In this context, +

The URL at which your application shall appear is arguably the first +part +of the application's user interface that any user will see. Remember +that a user of your application does not have to be a real person; in +fact, a user can be any of the following things:

A real person entering the URL into a browser's address bar.
A real person linking to your application by writing the URL in a - separate Web page.
A program which has the URL defined within it and which may manipulate - the URL to perform certain kinds of operations.
A program which has the URL defined within it and which may +manipulate the URL to perform certain kinds of operations.

- +

Some application developers have a fairly rigid view of what kind of +information a URL should contain and how it should be structured. In +this guide, we shall look at a number of different approaches.

Interpreting Path Information

- -

What the URL is supposed to do is to say where (on the Internet or on an -intranet) your application resides and which resource or service is being +

What the URL is supposed to do is to say where (on the Internet or +on an +intranet) your application resides and which resource or service is +being accessed. A typical URL looks like this:

http://www.boddie.org.uk/python/WebStack.html

- -

With WebStack, we also talk about a "path" as being just the part of the -URL which refers to the resource or service, ignoring the actual Internet -address, and so these look like this:

In an application the full URL, containing the address of the +machine on which it is running, is not always interesting. In the +WebStack API (and in other Web programming frameworks), we also talk +about "paths" - a path is just the part of the +URL which refers to the resource or service, ignoring the actual +Internet +address, and so the above example would have a path which looks like +this:

/python/WebStack.html

When writing a Web application, most of the time you just need to -concentrate on the path because the address doesn't usually tell you anything +concentrate on the path because the address doesn't usually tell you +anything you don't already know. What you need to do is to interpret the path -specified in the request in order to work out which resource or service the -request is destined for.

- +specified in the request in order to work out which resource or service +the user is trying to access.
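
The relationship between a URL and its path can be shown with Python's standard `urllib.parse` module. This is a minimal sketch operating on the example URL as a plain string; it does not use the WebStack API, where the path would be obtained from a transaction instead.

```python
from urllib.parse import urlsplit

url = "http://www.boddie.org.uk/python/WebStack.html"

parts = urlsplit(url)
address = parts.netloc   # the Internet address of the machine
path = parts.path        # the part referring to the resource or service
```

Within an application the address is normally already known, so only `path` needs interpreting.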

WebStack API - Path Methods in Transaction Objects

- -

WebStack provides the following transaction methods for inspecting path +

WebStack provides the following transaction methods for inspecting +path information:

get_path: This gets the entire path of a resource including parameter - information (as described in "Request - Parameters and Uploads").; This gets the entire path of a resource including parameter +information (as described in "Request +Parameters and Uploads").
get_path_without_query: This gets the entire path of a resource but without any parameter - information.; This gets the entire path of a resource but without any parameter +information.

- +

Query Strings

Sometimes, a "query string" will be provided as part of a URL; for example:

http://www.boddie.org.uk/application?param1=value1

- -

The question mark character marks the beginning of the query string which -contains encoded parameter information; such information and its inspection +

The question mark character marks the beginning of the query string +which +contains encoded parameter information; such information and its +inspection is discussed in "Request Parameters and Uploads".
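
As a rough illustration of how a query string relates to the rest of the URL, the example above can be taken apart with Python's standard `urllib.parse` module. This sketch works on plain strings and is not how WebStack itself exposes the information; the variable names are chosen for this example only.

```python
from urllib.parse import parse_qs, urlsplit

url = "http://www.boddie.org.uk/application?param1=value1"

parts = urlsplit(url)
path_without_query = parts.path   # everything before the question mark
query_string = parts.query        # everything after the question mark

# parse_qs maps each field name to a list of values, since a name
# may legitimately appear more than once in a query string.
fields = parse_qs(query_string)
```

The list-of-values shape mirrors the dictionary described for get_fields, where each field name also maps to a list.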

- -

Paths To and Within an Application

-One thing to be aware of in the code of an application is which part of a -path refers to the location of the application in a server environment and -which refers to some resource within the application itself. Consider this -path:
- -

/folder/application/resource

-Let us say that the application was deployed in a Zope server instance inside -folder and with the name application. We may then -say that the path to the application is this: -

/folder/application

-Meanwhile, the path within the application is just this: -

/resource

- -

WebStack API - Paths To Resources Within Applications

- -

On transaction objects, the following methods exist to inspect paths to -resources within applications.

get_path_info: This gets the path of a resource within an application.
get_virtual_path_info: This gets the path of a resource within a part of an application - - the application itself decides the scope of the path and can set the - "virtual path info" using the set_virtual_path_info - method.

- -

Approaches to Path Interpretation

- -

There are various differing approaches to the problem of interpreting -paths to resources within Web applications, but these can mostly be divided -into three categories:

- - - - - - - - - - - - - - - - - - - - -

Approach	Examples
Path as filesystem	WebDAV interface to a repository
Path as resource or service - identifier	A Web shop with very simple paths, eg. `/products`, - `/checkout`, `/orders`
Path as opaque reference	An e-mail reader where the messages already have strange and - unreadable message identifiers

- -

Path Info Support in Server Environments

- -

The following table summarises the support for paths within applications -amongst the supported server environments or frameworks within WebStack:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Framework	Behaviour/Level of Support
BaseHTTPRequestHandler	Same as path (correct)
CGI	Path beyond resource (correct)
Java Servlet API	Path beyond context (correct)
mod_python	Path beyond resource (correct)
Twisted	Same as path (correct)
Webware	<= 0.8.1: Not supported (needs `ExtraPathInfo` - support) - > 0.8.1: Path beyond context (correct)
WSGI	Path beyond resource (correct)
Zope	Path beyond resource (correct)

More About Paths