a pipes-based HTTP server (http://www.happstack.com/)


HTTP 1.1 supports pipelined connections where multiple Requests and Responses can be sent over the same connection. They do not have to be strictly interleaved either -- the client can send 10 Requests back-to-back and then start reading the Responses.

What needs to happen:

  1. there is data available at the socket
  2. we call 'await' to receive some of it
  3. we parse enough of that to extract the Request HTTP headers, but not the message body
  4. we call a user supplied function with a type like 'Request -> IO Response'
  5. that user supplied function can consume the Request message body
  6. it produces a Response which contains headers + a Response message body
  7. go back to step 1
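
The steps above can be sketched in miniature. This is a toy, self-contained simulation, not the real server: an in-memory ByteString stands in for the socket, pipes are skipped entirely, and 'requestLoop', 'contentLength', and the simplified Request/Response types are hypothetical names invented here for illustration.

```haskell
import qualified Data.ByteString.Char8 as B

type Request  = (B.ByteString, B.ByteString)  -- (header block, message body)
type Response = B.ByteString

-- Pull the Content-Length value out of the header block.
contentLength :: B.ByteString -> Int
contentLength hdrs =
  head [ read (B.unpack (B.filter (/= '\r') (B.drop (B.length prefix) l)))
       | l <- B.lines hdrs, prefix `B.isPrefixOf` l ]
  where prefix = B.pack "Content-Length: "

requestLoop :: (Request -> IO Response) -> B.ByteString -> IO ()
requestLoop handler input
  | B.null input = return ()                          -- no more requests
  | otherwise    = do
      -- steps 2-3: parse off the headers (a blank line ends them),
      -- then split off exactly Content-Length bytes of body
      let (hdrs, rest)  = B.breakSubstring (B.pack "\r\n\r\n") input
          (body, rest') = B.splitAt (contentLength hdrs) (B.drop 4 rest)
      -- steps 4-6: hand the Request to the user function, emit its Response
      resp <- handler (hdrs, body)
      B.putStrLn resp
      requestLoop handler rest'                       -- step 7: loop

main :: IO ()
main = do
  -- two pipelined requests, back to back in one stream
  let stream = B.concat
        [ B.pack "POST /a HTTP/1.1\r\nContent-Length: 5\r\n\r\nhello"
        , B.pack "POST /b HTTP/1.1\r\nContent-Length: 3\r\n\r\nbye" ]
  requestLoop (\(_, body) -> return (B.append (B.pack "echo: ") body)) stream
```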

Tricky parts:

  1. 'await' returns a buffer's worth of data. But we may only need to consume a portion of that data; the rest needs to go to the next step. For example, it may return the message headers plus a bit of the message body.

  2. the user function consuming the 'Request' body should only be able to consume the message body, but not any data which comes after that.

  3. after consuming the 'Request' body we may have a half used buffer that contains a portion of the next 'Request'.

  4. if the user supplied handler does not actually use the 'Request' message body (or only consumes part of it), we need to read and discard the unused portion so we can get the next request.

  5. the data coming from the socket may use 'chunked' transfer encoding. We want to decode the chunking and give the 'Request' body a simple stream of data.

  6. The user supplied function has the type:

    'Request -> IO Response'.

    It is not a pipe itself.
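
Tricky parts 1-3 reduce to the same primitive: split a chunk stream at an exact byte count and carry the remainder forward. Here is a minimal sketch of that primitive, with plain lists of strict ByteStrings standing in for a Producer; 'takeBody' is a made-up name, not part of any pipes library.

```haskell
import qualified Data.ByteString.Char8 as B

-- | Split a stream of chunks into exactly n bytes of message body, plus
-- the unconsumed leftovers that belong to the next Request.
takeBody :: Int -> [B.ByteString] -> ([B.ByteString], [B.ByteString])
takeBody 0 chunks = ([], chunks)
takeBody _ []     = ([], [])
takeBody n (c:cs)
  | B.length c <= n = let (body, rest) = takeBody (n - B.length c) cs
                      in (c : body, rest)               -- whole chunk is body
  | otherwise       = let (body, leftover) = B.splitAt n c
                      in ([body], leftover : cs)        -- split a half-used buffer

main :: IO ()
main = do
  let chunks       = [B.pack "hello ", B.pack "world! next-request"]
      (body, rest) = takeBody 12 chunks
  print (B.concat body)  -- "hello world!"
  print (B.concat rest)  -- " next-request"
```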

In the 'Request' type it would be nice if we could just make the request body look like a normal 'Producer':

> data Request = Request
>     { ...
>     , rqBody :: Producer ByteString (ResourceT IO) ()
>     }

Then to consume the body, we can just use runPipe:

> foo = runPipe $ rqBody req >-> fileWriter "message-body.txt"

Likewise, in the 'Response' type it would be nice if the Response body were a 'Producer' as well:

> data Response = Response
>     { ...
>     , rsBody :: Producer ByteString (ResourceT IO) ()
>     }

Then the 'requestLoop' can pull the 'Response' message body right from the producer.

Also, let's say we want to have an echo feature where the server sends the 'Request' body back as the 'Response' body. In theory, I think we can just set 'rsBody' to be the 'rqBody' and then we will echo with minimal buffering...

However, I do not see how to actually implement the code that generates the Producers for 'rqBody' and 'rsBody'.

Perhaps my code structure is all wrong? Or maybe I need some additional primitives that allow you to create a 'Producer' while already inside a 'Pipe' ?

One workaround would be to delay the use of Pipes in my code. For example, in the parsing code I could call the low-level network send/recv functions myself. That would let me create a function like 'requestBodyReader' that works similarly to 'handleReader'. End users could still use pipes, and still benefit from the space usage, etc.
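
A sketch of what that workaround might look like, again self-contained: an IORef over an in-memory chunk list stands in for the real socket recv, and 'mkBodyReader' is a hypothetical name. The point is that the reader it builds refuses to hand out data past the declared body length, which also covers tricky parts 2 and 4.

```haskell
import Data.IORef
import qualified Data.ByteString.Char8 as B

-- | Build a reader limited to n bytes of the stream; it returns Nothing
-- once the body is exhausted, leaving any excess for the next Request.
mkBodyReader :: Int -> IORef [B.ByteString] -> IO (IO (Maybe B.ByteString))
mkBodyReader n source = do
  remaining <- newIORef n
  return $ do
    left   <- readIORef remaining
    chunks <- readIORef source
    case (left, chunks) of
      (0, _)  -> return Nothing                 -- body fully consumed
      (_, []) -> return Nothing                 -- stream ended early
      (_, c:cs)
        | B.length c <= left -> do              -- whole chunk is body
            writeIORef remaining (left - B.length c)
            writeIORef source cs
            return (Just c)
        | otherwise -> do                       -- split; push leftover back
            let (body, rest) = B.splitAt left c
            writeIORef remaining 0
            writeIORef source (rest : cs)
            return (Just body)

main :: IO ()
main = do
  source <- newIORef [B.pack "body", B.pack "1234next"]
  next   <- mkBodyReader 8 source
  Just a <- next
  Just b <- next
  end    <- next
  print (a, b, end)        -- ("body","1234",Nothing)
  leftover <- readIORef source
  print leftover           -- ["next"]
```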

Another option might be to make the handler function that the user supplies be a 'Pipe' itself. But that is probably not a great interface, and it is still not obvious how to prevent the end user from accidentally consuming data beyond the end of the message body.