|
Unsuitable on Unsuitable: The CGI Interface
Samuel A. Falvo II
kc5tja -at- arrl.net 2010 Jul 10 23:50 PDT
The world-wide web architecture came into existence with the intention of serving static content stored in a hierarchical filesystem. Web applications require a means for dynamically synthesizing their user interfaces. Using a web server's Common Gateway Interface, Unsuitable exploits the opportunity to synthesize the user's interface with each page view. Unsuitable, therefore, possesses full control over what the user sees and can do at all times.

1 Why the Need for blog.fs?
When Tim Berners-Lee first proposed what eventually became the World-Wide Web, he focused primarily on documents and their relationships to each other. While he envisioned
At the end of the day, Unsuitable's job boils down to providing multiple views of a single repository of article content; that is, to serve as the very kind of gateway which Tim only briefly alludes to in his original proposal. As I write this article, three such views exist: RSS feed, front page view (a.k.a.
I originally wanted to implement the blog with a constellation of Forth scripts at different URLs, each serving a particular view. URLs such as http://www.falvotech.com/blog2/index.fs and http://www.falvotech.com/blog2/rss.fs worked as I'd intended them. However, very few web users consider affixing index.html to a malfunctioning URL, and I suspect fewer still would ever consider index.fs for that purpose. I attempted to configure my webserver to map http://www.falvotech.com/blog2 to http://www.falvotech.com/blog2/index.fs internally (preferably without forcing a browser redirect) with only limited success. Finally, I decided the effort wasn't worth it, and that Unsuitable should just emulate a directory, where a single Forth script would parse the remainder of the URL as I intended it [1].
Unsuitable works via the Common Gateway Interface, or CGI, mechanism. The web server understands how to traverse a real filesystem, working on resource components one by one, as indicated in the URL. Once it encounters blog.fs, however, the server realizes it's a CGI handler, and passes all further control to it. Unsuitable, then, parses the rest of the resource locator, and acts according to the client's request, converting content into a form compatible with the file-like interface the web was originally designed for. The following illustration shows which software components react to the different parts of a URL:
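The script's share of that work can be tried on its own, without a web server. The sketch below is not taken from blog.fs: the words append, module-name, and module-file and the buffer name-buf are hypothetical names introduced here, and it assumes Gforth (scan is a Gforth word) and a URL tail that begins with a slash. It cuts the tail at its second slash and builds the name of the Forth file that would handle it, following the m-M.fs naming that blog.fs uses.

```forth
\ Hypothetical sketch, not part of blog.fs; assumes Gforth.
create name-buf 256 allot        \ scratch buffer for the file name (assumed large enough)
variable name-len                \ number of characters currently in name-buf

: append ( c-addr u -- )         \ copy a string onto the end of name-buf
  tuck  name-buf name-len @ +  swap move  name-len +! ;

: module-name ( c-addr u -- c-addr' u' )
  \ Assumes the tail starts with "/". Drop that slash, then keep
  \ everything up to (not including) the next slash, if there is one.
  1 /string  2dup [char] / scan nip - ;

: module-file ( c-addr u -- c-addr' u' )
  \ Build "m-<module>.fs" in name-buf and return its address and length.
  0 name-len !
  s" m-" append  module-name append  s" .fs" append
  name-buf name-len @ ;

s" /articles/1036" module-file type cr   \ prints m-articles.fs
s" /rss" module-file type cr             \ prints m-rss.fs
```

Unlike blog.fs, which builds the name in the dictionary with allot, this version reuses a fixed buffer so it can be called repeatedly at the Gforth prompt.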
The component immediately following the blog.fs name, called the module, indicates which view Unsuitable needs to provide the client. Three views exist at this time: articles, index, and rss.
I could easily have hard-coded these three views and provided dedicated support for them within blog.fs. However, I decided against this approach. Instead, each module indicates a Forth source file to dispatch to, after a small transformation. Besides accidentally offering greater flexibility in case I want to expand Unsuitable in the future, it has the desired effect that unnecessary code never loads, which simplifies memory management.

2 Code Walk-through

The web server always calls the same Forth executable for all application end-points. This executable, blog.fs, must have appropriate execute permissions for your operating system, and at least for Posix-compatible systems, must indicate Gforth as its interpreter.

    #! /usr/bin/env gforth

The web server passes the tail of the URL to the handler via an environment variable named PATH_INFO. For example, when accessing this article, PATH_INFO will contain the string /articles/1036. Forth lacks intrinsic string types, so we store the address and length of the string in &path-info and /path-info, respectively.

    S" PATH_INFO" getenv constant /path-info constant &path-info

When accessing the base URL of the blog installation, with or without the trailing slash of a directory, we must assume the user requests the index page. Note that a PATH_INFO of length one can only occur if the user supplied a slash with nothing after it. The web server will interpret anything else as another resource, and the CGI handler will not trigger.

    : |url|>=2 /path-info 2 u< if s" m-index.fs" included bye then ;
    |url|>=2

At this point in the code, we know the URL contains at least a module name, but modules sometimes consume parameters. We establish the constant module to point to the module name.

    &path-info /path-info + constant end-of-url
    &path-info 1+ constant module

We then establish &parameters to point to the module's parameters, if any exist at all.

    : -eou dup end-of-url >= if r> drop then ;
    : -/ dup c@ [char] / = if r> drop then ;
    : slash begin -eou -/ char+ again ;
    module slash constant &parameters

Knowing the module (spanning from module and consisting of [ &parameters module - ] bytes) allows us to include and dispatch the appropriate Forth module responsible for that endpoint. Given a module name M, we include m-M.fs. Note that we assume modules will self-execute upon inclusion.

    : path [char] m c, [char] - c, ;
    : base module &parameters over - here swap dup allot move ;
    : extension S" .fs" here swap dup allot move ;
    : filename path base extension ;
    : dispatch here filename here over - included ;
    dispatch bye

3 What's Next

In this article, I explained the rationale for using blog.fs to spring-board into other Forth modules, and how it works. Recall that Unsuitable currently defines three views: articles, index, and rss. The behavior of dispatch implies that three other Forth sources must exist: m-articles.fs, m-index.fs, and m-rss.fs. Next time, I'll explain how m-index.fs renders its index page.

4 Complete Source to blog.fs

    #! /usr/bin/env gforth
    S" PATH_INFO" getenv constant /path-info constant &path-info
    : |url|>=2 /path-info 2 u< if s" m-index.fs" included bye then ;
    |url|>=2
    &path-info /path-info + constant end-of-url
    &path-info 1+ constant module
    : -eou dup end-of-url >= if r> drop then ;
    : -/ dup c@ [char] / = if r> drop then ;
    : slash begin -eou -/ char+ again ;
    module slash constant &parameters
    : path [char] m c, [char] - c, ;
    : base module &parameters over - here swap dup allot move ;
    : extension S" .fs" here swap dup allot move ;
    : filename path base extension ;
    : dispatch here filename here over - included ;
    dispatch bye

5 lighttpd.conf Configuration for Unsuitable

The configuration which follows works for the lighttpd web server. If you use Apache or nginx, consult your server's documentation for proper CGI handler configuration.

$HTTP["host"] =~ "falvotech\.com" {
    server.document-root = "/Files/WWW/falvotech.com/htdocs"
    server.errorlog = "/Files/WWW/falvotech.com/error.log"
    accesslog.filename = "/Files/WWW/falvotech.com/access.log"
    $HTTP["url"] =~ "^/blog2" {
        cgi.assign = ( ".fs" => "/usr/bin/gforth" )
    }
}
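One way to check that a configuration like this routes requests as described is a throwaway script placed next to blog.fs. The sketch below is hypothetical: the file name probe.fs and the words show and report are not part of Unsuitable, and it assumes the server sets the standard CGI variables SCRIPT_NAME and PATH_INFO. It prints both, so the part of the URL resolved by the web server and the tail handed to the script can be compared directly.

```forth
#! /usr/bin/env gforth
\ probe.fs: hypothetical test script, not part of Unsuitable.

: show ( c-addr u -- )           \ print NAME=value for one environment variable
  2dup type  [char] = emit  getenv type  cr ;

: report ( -- )
  ." Content-Type: text/plain" cr cr   \ CGI header followed by a blank line
  s" SCRIPT_NAME" show                 \ the part the web server resolved itself
  s" PATH_INFO"   show ;               \ the tail left for the script to parse

report bye
```

Requesting a path such as /blog2/probe.fs/articles/1036 (a made-up URL under the blog2 prefix from the configuration) should then report /articles/1036 for PATH_INFO, if the server behaves as the article describes. The final bye matters because Gforth would otherwise drop into its interactive interpreter after loading the file.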
[1] The authors of Wordpress and Serendipity must have faced similar problems, for these blogs both work the same way. While I chalk my lack of success up to my inexperience in configuring web servers, I feel better knowing web-app authors far smarter than I faced similar defeat.
[2] The web browser also looks at the machine name component, so that it can connect to the server in the first place. |