From: Daniel Barlow <dan <at> telent.net>
Subject: Review of CL web server APIs: (1) Araneida
Newsgroups: gmane.lisp.web
Date: 2003-08-19 13:02:38 GMT

Hidden deep in the middle of the unhelpfully named "where to start"
thread on comp.lang.lisp, there is a post from me from earlier today
saying "wouldn't it be nice if we had a standard API (a la Java
servlets) for Lisp web servers so that Lisp web applications 
could be moved between servers just like CGI scripts can".

http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=87fzjydg1u.fsf%40noetbook.telent.net&prev=/groups%3Fq%3Dlisp%2Bweb%2Bstandard%2Bapi%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3DUTF-8%26selm%3D87fzjydg1u.fsf%2540noetbook.telent.net%26rnum%3D9

Mm, lovely user-readable URLs.  Anyway.  To save you from having to go
and look at that if you're reading this offline, I reproduce the
intended scope here:

 - implement something that reflects the HTTP request/response model 
   fairly closely, rather than requiring users to tie themselves into
   some application server model providing sessions or state.  That
   can be layered on top.

 - HTML generation is likewise out of scope.  Generate your HTML any
   way you choose.

 - URI/URL parsing is probably _in_ scope, however: URIs are used all
   over the place in HTTP; it'd be madness to treat them as text, so
   I'd expect each implementation will have to define a URI object for
   internal use anyway.  If we can get this right, it would make life
   easier to let the client applications use it too.

The first step towards standardizing is to study the existing work,
and I'm going to start with the work I know best, because I wrote it:
Araneida.  I invite authors and users of other CL web servers to
review their tools against similar criteria (configuration, HTTP
request model, URI manipulation tools) and follow up here.  A list of
links to responses will be collected, probably on an ALU wiki page set
up for the purpose.  Very well, then.  Let's go.

* Araneida

Araneida is a single-threaded HTTP server that runs on SBCL (and
probably, with a little massaging, CMUCL) using the builtin
SERVE-EVENT loop to process HTTP requests in the background.  It's
designed for use behind Apache using mod_proxy to forward requests.
Probably the most popular application for Araneida is the CLiki engine.

** Configuration

Telling Araneida which ports it should use and addresses it should
listen on is currently frankly a bit of a crock, and I wouldn't
suggest there's much for a standard webapi to learn from it.

** Request model

The request model in Araneida is recursive, and based on calling
methods of a HANDLER object.  A handler is registered to be
responsible for a certain portion of the URL space, and may (if it is
a dispatching handler; see below) delegate the work of generating a
response to a 'sub-handler' private to it.

When a request is made of the server, it creates a REQUEST object with
accessors for attributes such as the HTTP method, the stream to which
a response should be sent, the headers and body, and the URL
requested.  Then it finds the appropriate handler and invokes

  (HANDLE-REQUEST handler method request)

*** Standard HANDLER

HANDLE-REQUEST in the default handler class invokes methods for several GFs,
whose signatures I paste in here to save me paraphrasing to no purpose.

;;; who does the user say he is?   Is he correct?
(defgeneric handle-request-authentication (handler method request))

;;; is the user allowed to see this resource?
(defgeneric handle-request-authorization (handler method request))
;;;  (default method calls request-authorized-p, request-not-authorized)
(defgeneric request-authorized-p (handler method request))
(defgeneric request-not-authorized (handler method request))

;;; send the resource back, or do whatever else is appropriate at this
;;; stage for the requested method.  If -response returns (values NIL
;;; foo), we send a 404 and log the foo to a log stream, or to the
;;; browser if no log stream is set up
(defgeneric handle-request-response (handler method request))
;;; can be used to write stuff to a log file, or some other cleanup action
;;; that may take place after the response to the client has gone (i.e.
;;; request stream may be closed by now)
(defgeneric handle-request-logging (handler method request))

These are invoked in the appropriate order until one of them returns T
to indicate that it's sent a response.  If nothing does, eventually a
404 response will be returned.

For a simple application with no authentication requirements, it's
usually sufficient just to define a method on handle-request-response 
that writes to the request-stream.
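
To make the stage chain and the simple case concrete, here is a
minimal sketch.  It is not Araneida's source: the HANDLER and REQUEST
classes are empty stand-ins, SEND-NOT-FOUND and WRITE-CRLF-LINE are
hypothetical helpers, and having the handler write the raw status line
itself is an assumption.  The chain follows the sentence above
literally (stop at the first stage that returns true); the real server
may treat the logging stage differently.

```lisp
;;; Stand-ins for Araneida's HANDLER and REQUEST classes (assumed shapes).
(defclass handler () ())
(defclass request ()
  ((stream :initarg :stream :reader request-stream)))

;;; The four stage GFs named in the post.  The default methods do
;;; nothing and return NIL, meaning "no response sent yet".
(defgeneric handle-request-authentication (handler method request)
  (:method ((h handler) method request)
    (declare (ignore method request))
    nil))
(defgeneric handle-request-authorization (handler method request)
  (:method ((h handler) method request)
    (declare (ignore method request))
    nil))
(defgeneric handle-request-response (handler method request)
  (:method ((h handler) method request)
    (declare (ignore method request))
    nil))
(defgeneric handle-request-logging (handler method request)
  (:method ((h handler) method request)
    (declare (ignore method request))
    nil))

;;; Hypothetical helper: write STRING followed by CRLF.
(defun write-crlf-line (string stream)
  (write-string string stream)
  (write-char #\Return stream)
  (write-char #\Linefeed stream))

;;; Hypothetical helper: the fallback when no stage sent a response.
(defun send-not-found (request)
  (let ((s (request-stream request)))
    (write-crlf-line "HTTP/1.0 404 Not Found" s)
    (write-crlf-line "" s)
    t))

;;; Run the stages in order; stop at the first one that returns true.
(defgeneric handle-request (handler method request)
  (:method ((h handler) method request)
    (or (handle-request-authentication h method request)
        (handle-request-authorization h method request)
        (handle-request-response h method request)
        (handle-request-logging h method request)
        (send-not-found request))))

;;; The "simple application": only the response stage is specialized.
(defclass hello-handler (handler) ())

(defmethod handle-request-response ((h hello-handler) method request)
  (declare (ignore method))
  (let ((s (request-stream request)))
    (write-crlf-line "HTTP/1.0 200 OK" s)
    (write-crlf-line "Content-Type: text/plain" s)
    (write-crlf-line "" s)
    (write-string "Hello, world" s))
  t)
```

Calling (handle-request (make-instance 'hello-handler) :get request)
with a REQUEST whose stream is a string output stream is enough to try
this out without a socket.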

[ Aside : It's possible this nested method mess would be more cleanly
expressed as a custom method combination on handle-request.  I'd
welcome ideas along those lines ]
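
One way the method combination idea might look, purely as a sketch:
the names REQUEST-STAGES, STAGED-HANDLE-REQUEST and
MEMBERS-ONLY-HANDLER are made up for illustration, the request is a
plain plist rather than a REQUEST object, and the logging stage is
left out because its interaction with the other stages would need more
thought.

```lisp
;;; Long-form method combination: methods are grouped by qualifier and
;;; the groups are tried in a fixed order; the first method to return
;;; true ends the chain, as with OR.
(define-method-combination request-stages ()
  ((authentication (:authentication))
   (authorization (:authorization))
   (response (:response)))
  `(or ,@(mapcar (lambda (m) `(call-method ,m))
                 (append authentication authorization response))))

(defgeneric staged-handle-request (handler method request)
  (:method-combination request-stages))

;;; Illustrative handler.  REQUEST is a plist such as (:user "dan").
(defclass members-only-handler () ())

(defmethod staged-handle-request :authorization
    ((h members-only-handler) method request)
  (declare (ignore method))
  ;; A true return value means "a response (here a refusal) was sent".
  (unless (getf request :user)
    :not-authorized))

(defmethod staged-handle-request :response
    ((h members-only-handler) method request)
  (declare (ignore method request))
  :page-sent)
```

With this, each stage is an ordinary qualified method on a single
generic function, so the four separate GFs collapse into one.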

*** STATIC-FILE-HANDLER

(defclass static-file-handler (handler)
  ((pathname :initarg :pathname :accessor static-file-pathname
	     :documentation "Root pathname for URI components to merge against")))

This serves static content.
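
The "merge against" in the docstring suggests something like the
following sketch.  RESOLVE-STATIC-PATH is a hypothetical name, not an
Araneida function, and the ".." check is a deliberately crude
assumption about what a real handler would need to reject.

```lisp
;;; Map the part of the URL path below the handler onto the file system
;;; by merging it against the handler's root pathname.
(defun resolve-static-path (root relative-path)
  ;; Crude guard: refuse anything containing "..", so a request cannot
  ;; climb out of ROOT.  This also rejects harmless names like "a..b".
  (when (search ".." relative-path)
    (error "Refusing suspicious path: ~S" relative-path))
  (merge-pathnames relative-path root))
```

For example, (resolve-static-path #p"/var/www/" "css/site.css") names
the file css/site.css under /var/www/.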

*** DISPATCHING-HANDLER

DISPATCHING-HANDLER is used for dividing the URL space up further
within the context of a single handler.  For example, in CLiki we
have CLIKI-HANDLER which deals itself with displaying pages, but
also dispatches to a subhandler for URLs starting with
<cliki-handler-url>/edit/ 

A very important dispatching-handler is the *ROOT-HANDLER*, which is
responsible for the entire URL space and under which all the other
handlers are registered.
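
The recursive delegation can be sketched as below.  The class and
function names (TOY-HANDLER, TOY-DISPATCHING-HANDLER, FIND-CHILD,
RESOLVE) are illustrative only, paths are plain strings rather than
URL objects, and choosing the longest matching prefix is an
assumption.

```lisp
(defclass toy-handler ()
  ((name :initarg :name :reader handler-name)))

;;; CHILDREN is an alist of (prefix . handler), with prefixes relative
;;; to this handler's own position in the URL space.
(defclass toy-dispatching-handler (toy-handler)
  ((children :initarg :children :initform '() :reader handler-children)))

(defun prefix-p (prefix string)
  (let ((diff (mismatch prefix string)))
    (or (null diff) (= diff (length prefix)))))

;;; Return the (prefix . handler) entry with the longest matching prefix.
(defun find-child (handler rest-of-path)
  (let ((best nil))
    (dolist (entry (handler-children handler) best)
      (when (and (prefix-p (car entry) rest-of-path)
                 (or (null best)
                     (> (length (car entry)) (length (car best)))))
        (setf best entry)))))

;;; Find the handler that should answer for REST-OF-PATH.
(defgeneric resolve (handler rest-of-path))

(defmethod resolve ((h toy-handler) rest-of-path)
  (declare (ignore rest-of-path))
  h)

(defmethod resolve ((h toy-dispatching-handler) rest-of-path)
  (let ((child (find-child h rest-of-path)))
    (if child
        ;; Strip the matched prefix and let the sub-handler carry on.
        (resolve (cdr child) (subseq rest-of-path (length (car child))))
        ;; No sub-handler claims the path: this handler answers itself.
        h)))

;;; A tree shaped like the CLiki example.
(defparameter *toy-root*
  (let* ((edit (make-instance 'toy-handler :name :edit))
         (cliki (make-instance 'toy-dispatching-handler
                               :name :cliki
                               :children (list (cons "edit/" edit)))))
    (make-instance 'toy-dispatching-handler
                   :name :root
                   :children (list (cons "cliki/" cliki)))))
```

Here (resolve *toy-root* "cliki/edit/Araneida") ends at the :edit
handler, while (resolve *toy-root* "cliki/Araneida") stops at :cliki,
which displays the page itself.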

*** CLIKI-HANDLER

This is not part of Araneida - just a common add-on :-) 

** URI

Araneida uses a home-grown URL parser.  Salient features:

 - One class per URL scheme.  "The usual" accessors: host, port,
   directory, query info, or whatever makes sense for the scheme.
   Although these have setters, my experience is that these are
   rarely used, and it would have been a better idea to define
   the URL as an immutable object.

 - It represents only absolute URLs, as relative URLs are technically
   meaningless unless in the context of an absolute URL to parse them
   against.

 - MERGE-URL therefore has arguments (BASE-URL STRING).  This is
   backwards from merge-pathnames.  It is arguably the right way
   around anyway, but if I'd known then what I now know about existing
   practice in pathnames I might have done something different.

 - Strings are turned into URLs using PARSE-URLSTRING.  We don't
   currently use a #u"..." reader syntax, but I'd argue strongly that
   a standard cl-webapi ought to.

 - It is a bit flaky in places, sadly.
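
To illustrate the points in this list, here is a toy version.  It
borrows the names PARSE-URLSTRING and MERGE-URL from the text, but the
bodies are not Araneida's code: the HTTP-URL class, its readers, the
default port, and *URL-READTABLE* are all assumptions, and only plain
http://host[:port]/path strings are understood (no query strings, no
".." resolution).

```lisp
;;; Readers only, no setters: the object is effectively immutable.
(defclass http-url ()
  ((host :initarg :host :reader url-host)
   (port :initarg :port :reader url-port)
   (path :initarg :path :reader url-path)))

;;; Turn an absolute http URL string into an HTTP-URL.
(defun parse-urlstring (string)
  (let ((prefix "http://"))
    (unless (and (>= (length string) (length prefix))
                 (string-equal prefix string :end2 (length prefix)))
      (error "This sketch only handles absolute http URLs: ~S" string))
    (let* ((start (length prefix))
           (slash (or (position #\/ string :start start) (length string)))
           (colon (position #\: string :start start :end slash)))
      (make-instance 'http-url
                     :host (subseq string start (or colon slash))
                     :port (if colon
                               (parse-integer string :start (1+ colon)
                                                     :end slash)
                               80)
                     :path (if (< slash (length string))
                               (subseq string slash)
                               "/")))))

;;; Argument order as in the text: base URL first, then the string.
;;; A new object is returned; BASE-URL is never modified.
(defun merge-url (base-url string)
  (let* ((base-path (url-path base-url))
         (new-path
           (if (and (plusp (length string)) (char= (char string 0) #\/))
               string
               ;; Relative: keep the base path up to its last slash.
               (concatenate 'string
                            (subseq base-path
                                    0 (1+ (position #\/ base-path
                                                    :from-end t)))
                            string))))
    (make-instance 'http-url :host (url-host base-url)
                             :port (url-port base-url)
                             :path new-path)))

;;; A possible #u"..." syntax, installed in a private readtable so the
;;; global one is untouched.  It reads as a (PARSE-URLSTRING "...") form.
(defvar *url-readtable* (copy-readtable nil))

(set-dispatch-macro-character
 #\# #\u
 (lambda (stream subchar arg)
   (declare (ignore subchar arg))
   (list 'parse-urlstring (read stream t nil t)))
 *url-readtable*)
```

With *READTABLE* bound to *URL-READTABLE*, the text
#u"http://example.com/" reads as a call to PARSE-URLSTRING, and
(merge-url (parse-urlstring "http://example.com/a/b") "c") yields a
URL whose path is "/a/c".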

* Conclusion

** Bits I personally really like

 - The redispatching of requests.

** Various infelicities that don't really fit anywhere else

 - Dealing with a dispatching handler that also sends responses itself
   can be a bit weird.

 - There is little support currently for parsing request bodies: we
   decode the standard form encoding and return an alist of
   (parameter . value) pairs, but there is no multipart support.  It
   might be better to deal with this using accessors that parse the
   body on demand instead of parsing it up front on every request:
   imagine a complicated form with a several-megabyte file upload,
   then imagine that after parsing it all we return a "not
   authorized" response.  That is a waste of resources and a
   potential DoS attack.
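
The on-demand accessor idea might look like this sketch.  LAZY-REQUEST,
REQUEST-RAW-BODY, REQUEST-BODY-PARAMS and the helper functions are
hypothetical names; the body is held as a string for simplicity, and
percent-decoding is omitted.

```lisp
(defclass lazy-request ()
  ((raw-body :initarg :raw-body :reader request-raw-body)
   ;; :UNPARSED until somebody actually asks for the parameters.
   (params :initform :unparsed)))

;;; Split STRING at every occurrence of the character DELIMITER.
(defun split-string (string delimiter)
  (loop with start = 0
        for end = (position delimiter string :start start)
        collect (subseq string start end)
        while end
        do (setf start (1+ end))))

;;; "a=1&b=2" -> (("a" . "1") ("b" . "2")); no percent-decoding here.
(defun parse-form-body (body)
  (loop for pair in (split-string body #\&)
        for sep = (position #\= pair)
        unless (zerop (length pair))
          collect (cons (subseq pair 0 sep)
                        (if sep (subseq pair (1+ sep)) ""))))

;;; Parse on first use and cache the result, so a request that is
;;; refused before anyone reads its parameters never pays for parsing.
(defgeneric request-body-params (request))

(defmethod request-body-params ((request lazy-request))
  (with-slots (params) request
    (when (eq params :unparsed)
      (setf params (parse-form-body (request-raw-body request))))
    params))
```

An authorization method that rejects the request without calling
REQUEST-BODY-PARAMS leaves the body untouched.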

** More information: http://araneida.telent.net/docs/reference.html

                          - FIN -

Ok, that's Araneida in a nutshell.  Next up?  Someone want to tackle
paserve or cl-modlisp?

-dan

-- 

   http://www.cliki.net/ - Link farm for free CL-on-Unix resources