Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

This looks great. If you are a non-Haskeller or only a semi-Haskeller and are wondering what technical features set this apart from other existing frameworks:

The Snap framework uses a style of I/O called “iteratee I/O”. We have opted for iteratees over handle-based and lazy I/O because iteratee I/O offers better resource management, error reporting, and composability.

If you're interested, you can read some in-depth stuff here: http://okmij.org/ftp/Streams.html (these papers are somewhat famous)

The Snap tutorial has a decent summary of how they work, but it doesn't provide a lot of context for why.

If you are already familiar with functional programming, then you know about foldl, foldr, and left/right recursion. When processing a collection of some kind (like a list), usually you take an enumerator (the fold function or a map function, for example) and pass it an iteratee (the function that will be applied to each element) and then your work gets done and you get a result. Left fold is non-lazy, right fold and map are lazy.

Because Haskell is a lazy language, there is another way to process lists lazily: with explicit recursion to the right. If you are familiar with Scheme or Lisp, it would look kind of like

    (define (myfunc xs)
      (cond ((null? xs) '())
             (else (cons (+ (car xs) 1)
                         (myfunc (cdr xs))))))
which would add 1 to each element of a list, returning a new list.

In Lisp or a non-lazy Scheme, this would blow up in your face for a large list. In Haskell it's no sweat, because the list will only be processed as it is needed, and generally for control structures you have to go out of your way to make it 'space leak' (build up a bunch of unevaluated expressions and then explode when you finally evaluate it.)

If you use foldr or map in Haskell, it works like this. One interesting thing about using non-explicit recursion in Haskell is that if you put a bunch of maps or folds (or with the Stream Fusion library, many different types of list operations) next to each other, the compiler will inline them together and eliminate intermediate list allocations. So you can end up writing your process as a series of discrete logical steps (from list type a, to b, to c, to d) but still get the performance of doing only one traversal. Very nice.

One of the downsides is that you end up with a lazy control structure, which can be problematic when dealing with the real world. For example, when reading from a file handle, someone could move the file or delete it or something. With lazy IO, the programmer is not really controlling when the file will be read by the program. Often, this is not a problem. However, with networking and time-critical applications, it can be a big problem.

Another problem is that if you need to perform allocations (like going from a list of one length to a list of another length) in the middle of your transformations, it gets really tricky if you want to maintain fusion. Normally with explicit right recursion you can just cons together a larger list on the fly, but we can't do fusion on explicit recursion, and compilers are generally bad about figuring out what kinds of allocation our explicitly recursing function does.

We don't want to give up our elegant way of expressing operations of sequences, but we want to know when we are writing and reading from the real world, and we want to do it with determinate constant space. Iteratees let us do that almost as elegantly as the normal list stuff by driving the consuming of the list via the functions that do the actual processing. Sort of like a list unfold, but not quite.

Check out attoparsec-iteratee http://hackage.haskell.org/package/attoparsec-iteratee for an example of iteratee usage: it turns a fast bytestring parser into one that's based on iteratees, which will probably end up being even faster. (correction: which will allow you to process streams in constant space without needing to explicitly manage things.)

(If Dons or Gregory Collins is around to correct the numerous mistakes I'm sure I made in this post, please do so!)



> Check out attoparsec-iteratee http://hackage.haskell.org/package/attoparsec-iteratee for an example of iteratee usage: it turns a fast bytestring parser into one that's based on iteratees, which will probably end up being even faster.

Iteratees don't really make things "faster" (you end up with a sequence of imperative read()/write() calls just like everything else), but they allow you to stream in O(1) space while still being easily composable.

For example, we do gzip, buffering, & chunked transfer encoding using simple function calls, and we don't really have to worry about explicit buffer management or control flow while we're doing it.


Fixed, thanks!





Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: