What exactly are row types? I believe Ermine also features them, but searching d...

mthq · on June 1, 2014

Row types are a way to encode type information of OOP object in a hindley milner type system. It allows you to type check the kind of ducked objects that are often featured in dynamic OOP languages like ruby and javascript. So you may specify a function on a object containing specific fields or methods without specifying a concrete type:

  f :: {left: Number, right: Number} -> Number
  f object = object.a + object.b

All type checked of course.

See here for more info: https://www.cs.cmu.edu/~neelk/rows.pdf

roryokane · on June 2, 2014

I assume you meant to write `object.left + object.right` instead of `a` and `b`, to match the type signature.

tikhonj · on June 1, 2014

Row types refer to row polymorphism. It's a kind of polymorphism that lets you write code that's generic over different records with different fields without using subtyping.

Lets say you want to write a function that takes anything with a .length field:

    maxByLen a b = if a.length > b.length then a else b

How would you come up with the types for this? In a Java-like system of nominative sub-typing, you would make an interface Length and require types to declare it explicitly. However, this is messy, completely unnecessary and messes up type inference, so it's not a good idea.

You could also use structural sub-typing. You could declare the function to have a type like

    byLength : { length : Int } -> { length : Int } -> { length : Int }

where any record with a field length of type Int has the type { length : Int }. In other words, the type { length : Int, name : String } is a subtype of { length : Int } without needing to declare an interface explicitly. This is a definite improvement over the nominative Java case.

There are a couple of problems with this, however. A major one is that it "throws away" type information. In particular, all you know about either argument is that they have length fields. They might otherwise be of different types, with all their other fields completely different. You could compare a Foo and a Bar, as long as they both have lengths. Also, if you return something of the argument type, all you will know about it after that is that it has a length field; you've lost its actual type statically. To cast back up would require a potentially failing runtime check.

Consider that if we have two type Foo { length : Int, name : String} and Bar { length : Int, height : Int }, we can pass one of each into the function. Then which one it returns depends on runtime data: it might be a Foo and it might be a Bar. This means we cannot know the specific return type; all we know now is that it has to have a length field.

This is very different from normal parametric polymorphism which has type variables. While you don't know anything about the variable inside the function, it doesn't throw out type information when its used. For example, imagine this code instead:

    maxBy : (a -> Int) -> a -> a -> a
    maxBy pred a b = if pred a > pred b then a else b

This one can be used the same way by passing in a length function. However, since it doesn't use subtyping, both arguments have to have the same type! This means we can't use it to compare a foo and a bar. But when we do use it on Foos, for example, we know we will get a Foo out because that's what the a variable will be unified to:

    (a -> Int) -> a -> a -> a
    (Foo -> Int) -> Foo -> Foo -> Foo

When we use the function on a Foo, a gets instantiated to Foo giving us the second type.

Row types are just a way to bring this second style of polymorphism to records with different fields. Basically, it allows you to put a type variable representing the "rest" of the record:

    maxByLen : { length : Int | r } -> { length : Int | r } -> { length : Int | r }

Now, when we use this on a Foo, r gets instantiated to { name : String }; when we use it on a Bar, it gets instantiated to { height : Int }. This means we can't mix Foos and Bars and we don't lose type inference.

It also lets us do some other nice things. For example, we could add a field, polymorphically:

    annotate : { length : Int | r } -> { length : Int, lenStr : String | r }

It's also nice to note that we don't lose anything: we can get something very similar to the old behavior by throwing information away explicitly:

    maxBy : { length : Int | a} -> { length : Int | b } -> { length : Int }
    maxBy a b = if a.length > b.length then { length = a.length }
                                       else { length = b.length }

We can get almost exactly the sub-typing behavior by using existential types:

    maxBy : (exists a. { length | a }) -> (exists b. { length | b }) -> 
               (exists c. { length | c })
    maxBy a b = if a.length > b.length then a else b

I'm not sure why we would want this though! (I also don't know if any languages support this sort of syntax exactly, so you might have to jump through some additional hoops to simulate "exists".)

As a final note, I haven't used row polymorphism or existential types very much, so there might be some mistakes above.

It's also worth noting that Ermine has a slightly more general form of row polymorphism, but I'm not sure what the specific differences are.

thinkpad20 · on June 1, 2014

Very interesting writeup! Thanks. This now gives me more ideas to use in the language I'm working on.

I independently had what seems to be a similar idea, which is sort of like "anonymous type classes." For example, Vectors, Lists, Strings, Maps, Sets, etc all have something like a "size" function. So say there exist functions like

    size : Vector a -> Int
    size : String -> Int
    size : List a -> Int
    ...

And say we wanted to make a function like:

    show_size a = show a + " has a size of " + show (size a)

To write this in Haskell, you'd have to create a `Size` type class:

    class Size a where size :: a -> Int

And then write instances for all of them. And you'd have to do the same for `show`. But that's almost as tedious as the Java example. Now a Haskell type class is essentially a set of named function signatures, so instead, you can create a type class "on the fly":

    show_size : a of {size: a -> Int, show: a -> String} -> String

The cool thing here is that you keep the type information, in a similar way to what you were describing:

    foo : a of {f: a -> b -> Int} -> b of {g: b -> Int -> b} -> b
    foo a b = g b (f a b + 10)

So here we know that as long as there exist appropriate `f`s and `g`s, then we can pass any `a` and any `b` in and get a `b` back.

tomp · on June 7, 2014

I reckon this is based on Daan Leijen's awesome (and very readable) paper "Extensible records with scoped labels" [1]. It's also quite simple to implement, I've done it here [2] (in OCaml).

[1] http://research.microsoft.com/apps/pubs/default.aspx?id=6540... [2] https://github.com/tomprimozic/type-systems/tree/master/exte...

CountHackulus · on June 1, 2014

That was an excellent writeup. Just want to say thank you. Somewhat hilariously, this is somewhat similar to some COBOL functionality I've seen.

skybrian · on June 1, 2014

It looks like Elm has this?

http://elm-lang.org/learn/Records.elm

jules · on June 1, 2014

Row types are an alternative to subtyping for record/object types. With subtyping you say that if you have a record Foo and you leave out some fields to get a record Bar then Foo is a subtype of Bar. For example the type {a:int, b:string} is a subtype of {a:int}. This means that if you have a function f : {a:int} -> ResultType, then you can pass a value of type {a:int, b:string} to it, since it has all the required fields. A problem with subtyping for records/objects is that it doesn't work very well with type inference.

With row types you use parametric polymorphism to make the extra fields explicit. You have types of the form {a:int, ...B} where B indicates that the value may have additional fields. In this system you give f the type forall B. {a:int, ...B} -> ResultType.

Row types are more powerful than subtyping in some ways but less powerful in other ways. For example with subtyping you can put two records {a:3, b:"hello"} and {a:5} in the same list. With row types you can't do that (you'd need existential types to do that). It turns out that these restrictions make type inference for row types decidable.