When are higher kinded types useful?

I've been doing dev in F# for a while and I like it. However one buzzword I know doesn't exist in F# is higher-kinded types. I've read material on higher-kinded types, and I think I understand their definition. I'm just not sure why they're useful. Can someone provide some examples of what higher-kinded types make easy in Scala or Haskell, that require workarounds in F#? Also for these examples, what would the workarounds be without higher-kinded types (or vice-versa in F#)? Maybe I'm just so used to working around it that I don't notice the absence of that feature.

(I think) I get that instead of myList |> List.map f or myList |> Seq.map f |> Seq.toList higher kinded types allow you to simply write myList |> map f and it'll return a List. That's great (assuming it's correct), but seems kind of petty? (And couldn't it be done simply by allowing function overloading?) I usually convert to Seq anyway and then I can convert to whatever I want afterwards. Again, maybe I'm just too used to working around it. But is there any example where higher-kinded types really saves you either in keystrokes or in type safety?


Solution 1:

So the kind of a type is its simple type. For instance Int has kind * which means it's a base type and can be instantiated by values. By some loose definition of higher-kinded type (and I'm not sure where F# draws the line, so let's just include it) polymorphic containers are a great example of a higher-kinded type.

data List a = Cons a (List a) | Nil

The type constructor List has kind * -> * which means that it must be passed a concrete type in order to result in a concrete type: List Int can have inhabitants like [1,2,3] but List itself cannot.

I'm going to assume that the benefits of polymorphic containers are obvious, but more useful kind * -> * types exist than just the containers. For instance, the relations

data Rel a = Rel (a -> a -> Bool)

or parsers

data Parser a = Parser (String -> [(a, String)])

both also have kind * -> *.


We can take this further in Haskell, however, by having types with even higher-order kinds. For instance we could look for a type with kind (* -> *) -> *. A simple example of this might be Shape which tries to fill a container of kind * -> *.

data Shape f = Shape (f ())

Shape [(), (), ()] :: Shape []

This is useful for characterizing Traversables in Haskell, for instance, as they can always be divided into their shape and contents.

split :: Traversable t => t a -> (Shape t, [a])

As another example, let's consider a tree that's parameterized on the kind of branch it has. For instance, a normal tree might be

data Tree a = Branch (Tree a) a (Tree a) | Leaf

But we can see that the branch type contains a Pair of Tree as and so we can extract that piece out of the type parametrically

data TreeG f a = Branch a (f (TreeG f a)) | Leaf

data Pair a = Pair a a
type Tree a = TreeG Pair a

This TreeG type constructor has kind (* -> *) -> * -> *. We can use it to make interesting other variations like a RoseTree

type RoseTree a = TreeG [] a

rose :: RoseTree Int
rose = Branch 3 [Branch 2 [Leaf, Leaf], Leaf, Branch 4 [Branch 4 []]]

Or pathological ones like a MaybeTree

data Empty a = Empty
type MaybeTree a = TreeG Empty a

nothing :: MaybeTree a
nothing = Leaf

just :: a -> MaybeTree a
just a = Branch a Empty

Or a TreeTree

type TreeTree a = TreeG Tree a

treetree :: TreeTree Int
treetree = Branch 3 (Branch Leaf (Pair Leaf Leaf))

Another place this shows up is in "algebras of functors". If we drop a few layers of abstractness this might be better considered as a fold, such as sum :: [Int] -> Int. Algebras are parameterized over the functor and the carrier. The functor has kind * -> * and the carrier kind * so altogether

data Alg f a = Alg (f a -> a)

has kind (* -> *) -> * -> *. Alg useful because of its relation to datatypes and recursion schemes built atop them.

-- | The "single-layer of an expression" functor has kind `(* -> *)`
data ExpF x = Lit Int
            | Add x x
            | Sub x x
            | Mult x x

-- | The fixed point of a functor has kind `(* -> *) -> *`
data Fix f = Fix (f (Fix f))

type Exp = Fix ExpF

exp :: Exp
exp = Fix (Add (Fix (Lit 3)) (Fix (Lit 4))) -- 3 + 4

fold :: Functor f => Alg f a -> Fix f -> a
fold (Alg phi) (Fix f) = phi (fmap (fold (Alg phi)) f)

Finally, though they're theoretically possible, I've never see an even higher-kinded type constructor. We sometimes see functions of that type such as mask :: ((forall a. IO a -> IO a) -> IO b) -> IO b, but I think you'll have to dig into type prolog or dependently typed literature to see that level of complexity in types.

Solution 2:

Consider the Functor type class in Haskell, where f is a higher-kinded type variable:

class Functor f where
    fmap :: (a -> b) -> f a -> f b

What this type signature says is that fmap changes the type parameter of an f from a to b, but leaves f as it was. So if you use fmap over a list you get a list, if you use it over a parser you get a parser, and so on. And these are static, compile-time guarantees.

I don't know F#, but let's consider what happens if we try to express the Functor abstraction in a language like Java or C#, with inheritance and generics, but no higher-kinded generics. First try:

interface Functor<A> {
    Functor<B> map(Function<A, B> f);
}

The problem with this first try is that an implementation of the interface is allowed to return any class that implements Functor. Somebody could write a FunnyList<A> implements Functor<A> whose map method returns a different kind of collection, or even something else that's not a collection at all but is still a Functor. Also, when you use the map method you can't invoke any subtype-specific methods on the result unless you downcast it to the type that you're actually expecting. So we have two problems:

  1. The type system doesn't allow us to express the invariant that the map method always returns the same Functor subclass as the receiver.
  2. Therefore, there's no statically type-safe manner to invoke a non-Functor method on the result of map.

There are other, more complicated ways you can try, but none of them really works. For example, you could try augment the first try by defining subtypes of Functor that restrict the result type:

interface Collection<A> extends Functor<A> {
    Collection<B> map(Function<A, B> f);
}

interface List<A> extends Collection<A> {
    List<B> map(Function<A, B> f);
}

interface Set<A> extends Collection<A> {
    Set<B> map(Function<A, B> f);
}

interface Parser<A> extends Functor<A> {
    Parser<B> map(Function<A, B> f);
}

// …

This helps to forbid implementers of those narrower interfaces from returning the wrong type of Functor from the map method, but since there is no limit to how many Functor implementations you can have, there is no limit to how many narrower interfaces you'll need.

(EDIT: And note that this only works because Functor<B> appears as the result type, and so the child interfaces can narrow it. So AFAIK we can't narrow both uses of Monad<B> in the following interface:

interface Monad<A> {
    <B> Monad<B> flatMap(Function<? super A, ? extends Monad<? extends B>> f);
}

In Haskell, with higher-rank type variables, this is (>>=) :: Monad m => m a -> (a -> m b) -> m b.)

Yet another try is to use recursive generics to try and have the interface restrict the result type of the subtype to the subtype itself. Toy example:

/**
 * A semigroup is a type with a binary associative operation.  Law:
 *
 * > x.append(y).append(z) = x.append(y.append(z))
 */
interface Semigroup<T extends Semigroup<T>> {
    T append(T arg);
}

class Foo implements Semigroup<Foo> {
    // Since this implements Semigroup<Foo>, now this method must accept 
    // a Foo argument and return a Foo result. 
    Foo append(Foo arg);
}

class Bar implements Semigroup<Bar> {
    // Any of these is a compilation error:

    Semigroup<Bar> append(Semigroup<Bar> arg);

    Semigroup<Foo> append(Bar arg);

    Semigroup append(Bar arg);

    Foo append(Bar arg);

}

But this sort of technique (which is rather arcane to your run-of-the-mill OOP developer, heck to your run-of-the-mill functional developer as well) still can't express the desired Functor constraint either:

interface Functor<FA extends Functor<FA, A>, A> {
    <FB extends Functor<FB, B>, B> FB map(Function<A, B> f);
}

The problem here is this doesn't restrict FB to have the same F as FA—so that when you declare a type List<A> implements Functor<List<A>, A>, the map method can still return a NotAList<B> implements Functor<NotAList<B>, B>.

Final try, in Java, using raw types (unparametrized containers):

interface FunctorStrategy<F> {
    F map(Function f, F arg);
} 

Here F will get instantiated to unparametrized types like just List or Map. This guarantees that a FunctorStrategy<List> can only return a List—but you've abandoned the use of type variables to track the element types of the lists.

The heart of the problem here is that languages like Java and C# don't allow type parameters to have parameters. In Java, if T is a type variable, you can write T and List<T>, but not T<String>. Higher-kinded types remove this restriction, so that you could have something like this (not fully thought out):

interface Functor<F, A> {
    <B> F<B> map(Function<A, B> f);
}

class List<A> implements Functor<List, A> {

    // Since F := List, F<B> := List<B>
    <B> List<B> map(Function<A, B> f) {
        // ...
    }

}

And addressing this bit in particular:

(I think) I get that instead of myList |> List.map f or myList |> Seq.map f |> Seq.toList higher kinded types allow you to simply write myList |> map f and it'll return a List. That's great (assuming it's correct), but seems kind of petty? (And couldn't it be done simply by allowing function overloading?) I usually convert to Seq anyway and then I can convert to whatever I want afterwards.

There are many languages that generalize the idea of the map function this way, by modeling it as if, at heart, mapping is about sequences. This remark of yours is in that spirit: if you have a type that supports conversion to and from Seq, you get the map operation "for free" by reusing Seq.map.

In Haskell, however, the Functor class is more general than that; it isn't tied to the notion of sequences. You can implement fmap for types that have no good mapping to sequences, like IO actions, parser combinators, functions, etc.:

instance Functor IO where
    fmap f action =
        do x <- action
           return (f x)

 -- This declaration is just to make things easier to read for non-Haskellers 
newtype Function a b = Function (a -> b)

instance Functor (Function a) where
    fmap f (Function g) = Function (f . g)  -- `.` is function composition

The concept of "mapping" really isn't tied to sequences. It's best to understand the functor laws:

(1) fmap id xs == xs
(2) fmap f (fmap g xs) = fmap (f . g) xs

Very informally:

  1. The first law says that mapping with an identity/noop function is the same as doing nothing.
  2. The second law says that any result that you can produce by mapping twice, you can also produce by mapping once.

This is why you want fmap to preserve the type—because as soon as you get map operations that produce a different result type, it becomes much, much harder to make guarantees like this.