Is there a nice way to make function signatures more informative in Haskell?
I realize that this could potentially be considered a subjective or maybe an off-topic question, so I hope that rather than have it closed it would get migrated, maybe to Programmers.
I'm starting to learn Haskell, mostly for my own edification, and I like a lot of the ideas and principles backing the language. I became fascinated with functional languages after taking a language theory class where we played around with Lisp, and I had been hearing a lot of good things about how productive Haskell could be, so I figured I'd investigate it myself. So far, I like the language, except for one thing that I can't just get away from: Those mother effing function signatures.
My professional background is mostly doing OO, especially in Java. Most of the places that I've worked for have hammered in a lot of the standard modern dogmas: Agile, Clean Code, TDD, etc. After a few years of working this way, it has definitely become my comfort zone, especially the idea that "good" code should be self-documenting. I've become used to working in an IDE, where long and verbose method names with very descriptive signatures are a non-issue thanks to intelligent auto-completion and a huge array of analytical tools for navigating packages and symbols; if I can hit Ctrl+Space in Eclipse, then deduce what a method is doing from looking at its name and the locally scoped variables associated with its arguments instead of pulling up the JavaDocs, I'm as happy as a pig in poop.
This is, decidedly, not part of the community best practices in Haskell. I've read through plenty of different opinions on the matter, and I understand that the Haskell community considers its succinctness to be a "pro". I've gone through How To Read Haskell, and I understand the rationale behind a lot of the decisions, but it doesn't mean that I like them; one letter variable names, etc. aren't fun for me. I acknowledge that I'll have to get used to that if I want to keep hacking with the language.
But I can't get over the function signatures. Take this example, as pulled from Learn you a Haskell[...]'s section on function syntax:
bmiTell :: (RealFloat a) => a -> a -> String
bmiTell weight height
    | weight / height ^ 2 <= 18.5 = "You're underweight, you emo, you!"
    | weight / height ^ 2 <= 25.0 = "You're supposedly normal. Pffft, I bet you're ugly!"
    | weight / height ^ 2 <= 30.0 = "You're fat! Lose some weight, fatty!"
    | otherwise = "You're a whale, congratulations!"
I realize that this is a silly example that was only created for the purpose of explaining guards and class constraints, but if you were to examine just the signature of that function, you would have no idea which of its arguments was intended to be the weight or the height. Even if you were to use Float or Double instead of any type, it would still not be immediately discernible.
At first, I thought I would be cute and clever and brilliant and try to spoof it using longer type variable names with multiple class constraints:
bmiTell :: (RealFloat weight, RealFloat height) => weight -> height -> String
This spat out an error (as an aside, if anyone can explain the error to me, I'd be grateful):
Could not deduce (height ~ weight)
from the context (RealFloat weight, RealFloat height)
bound by the type signature for
bmiTell :: (RealFloat weight, RealFloat height) =>
weight -> height -> String
at example.hs:(25,1)-(27,27)
`height' is a rigid type variable bound by
the type signature for
bmiTell :: (RealFloat weight, RealFloat height) =>
weight -> height -> String
at example.hs:25:1
`weight' is a rigid type variable bound by
the type signature for
bmiTell :: (RealFloat weight, RealFloat height) =>
weight -> height -> String
at example.hs:25:1
In the first argument of `(^)', namely `height'
In the second argument of `(/)', namely `height ^ 2'
In the first argument of `(<=)', namely `weight / height ^ 2'
Not understanding completely why that didn't work, I started Googling around, and I even found this little post that suggests named parameters, specifically, spoofing named parameters via newtype, but that seems to be a bit much.
Is there no acceptable way to craft informative function signatures? Is "The Haskell Way" simply to Haddock the crap out of everything?
A type signature is not a Java-style signature. A Java-style signature will tell you which parameter is the weight and which is the height only because it mingles the parameter names with the parameter types. Haskell can't do this as a general rule, because functions are defined using pattern matching and multiple equations, as in:
map :: (a -> b) -> [a] -> [b]
map f (x:xs) = f x : map f xs
map _ [] = []
Here the first parameter is named f in the first equation and _ (which pretty much means "unnamed") in the second. The second parameter doesn't have a name in either equation: in the first, parts of it have names (and the programmer will probably think of it as "the xs list"), while in the second it is matched against the literal pattern [].
And then there's point-free definitions like:
concat :: [[a]] -> [a]
concat = foldr (++) []
The type signature tells us it takes a parameter of type [[a]], but no name for this parameter appears anywhere in the system.
Outside an individual equation for a function, the names it uses to refer to its arguments are irrelevant anyway except as documentation. Since the idea of a "canonical name" for a function's parameter isn't well defined in Haskell, the place for the information "the first parameter of bmiTell represents weight while the second represents height" is in documentation, not in the type signature.
I agree absolutely that what a function does should be crystal clear from the "public" information available about it. In Java, that is the function's name, and the parameter types and names. If (as is common) the user will need more information than that, you add it in the documentation. In Haskell the public information about a function is the function's name and the parameter types. If the user will need more information than that, you add it in the documentation. Note that IDEs for Haskell such as Leksah will easily show you Haddock comments.
Note that the preferred thing to do in a language with a strong and expressive type system like Haskell's is often to try to make as many errors as possible detectable as type errors. Thus, a function like bmiTell immediately sets off warning signs to me, for the following reasons:
- It takes two parameters of the same type representing different things
- It will do the wrong thing if passed parameters in the wrong order
- The two parameters don't have a natural order (as the two [a] arguments to ++ do)
One thing that is often done to increase type safety is indeed to make newtypes, as in the link that you found. I don't really think of this as having much to do with named parameter passing, more that it is about making a datatype that explicitly represents height, rather than any other quantity you might want to measure with a number. So I wouldn't have the newtype values appearing only at the call; I would be using the newtype value wherever I got the height data from as well, and passing it around as height data rather than as a number, so that I get the type-safety (and documentation) benefit everywhere. I would only unwrap the value into a raw number when I need to pass it to something that operates on numbers and not on height (such as the arithmetic operations inside bmiTell).
Note that this has no runtime overhead; newtypes are represented identically to the data "inside" the newtype wrapper, so the wrap/unwrap operations are no-ops on the underlying representation and are simply removed during compilation. It adds only extra characters in the source code, but those characters are exactly the documentation you're seeking, with the added benefit of being enforced by the compiler; Java-style signatures tell you which parameter is weight and which is height, but the compiler still won't be able to tell if you accidentally passed them the wrong way around!
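To make the newtype approach concrete, here is a minimal sketch. The wrapper names Weight and Height, the choice of Double, the units, and the shortened result strings are all assumptions made for illustration, not something taken from the linked post.

```haskell
module Main where

-- Hypothetical wrapper types (assumed units: kilograms and metres).
newtype Weight = Weight Double
newtype Height = Height Double

-- The signature alone now says which argument is which.
bmiTell :: Weight -> Height -> String
bmiTell (Weight w) (Height h)
  | bmi <= 18.5 = "underweight"
  | bmi <= 25.0 = "normal"
  | bmi <= 30.0 = "overweight"
  | otherwise   = "obese"
  where
    -- Unwrap to raw numbers only where arithmetic is needed.
    bmi = w / h ^ 2

main :: IO ()
main = do
  putStrLn (bmiTell (Weight 70) (Height 1.75))
  -- Swapping the arguments is a compile-time error, not a wrong answer:
  -- putStrLn (bmiTell (Height 1.75) (Weight 70))
```

Uncommenting the last line should make GHC reject the program, because a Height is not a Weight even though both wrap a Double.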
There are other options, depending on how silly and/or pedantic you want to get with your types.
For example, you could do this...
type Meaning a b = a
bmiTell :: (RealFloat a) => a `Meaning` weight -> a `Meaning` height -> String
bmiTell weight height = -- etc.
...but that's incredibly silly, potentially confusing, and doesn't help in most cases. The same goes for this, which additionally requires using language extensions:
bmiTell :: (RealFloat weight, RealFloat height, weight ~ height)
    => weight -> height -> String
bmiTell weight height = -- etc.
Slightly more sensible would be this:
type Weight a = a
type Height a = a
bmiTell :: (RealFloat a) => Weight a -> Height a -> String
bmiTell weight height = -- etc.
...but that's still kinda goofy and tends to get lost when GHC expands type synonyms.
The real problem here is that you're attaching additional semantic content to different values of the same polymorphic type, which is going against the grain of the language itself and, as such, usually not idiomatic.
One option, of course, is to just deal with uninformative type variables. But that's not very satisfying if there's a significant distinction between two things of the same type that's not obvious from the order they're given in.
What I'd recommend you try, instead, is using newtype wrappers to specify semantics:
newtype Weight a = Weight { getWeight :: a }
newtype Height a = Height { getHeight :: a }
bmiTell :: (RealFloat a) => Weight a -> Height a -> String
bmiTell (Weight weight) (Height height)
Doing this is nowhere near as common as it deserves to be, I think. It's a bit of extra typing (ha, ha) but not only does it make your type signatures more informative even with type synonyms expanded, it lets the type checker catch it if you mistakenly use a weight as a height, or such. With the GeneralizedNewtypeDeriving extension you can even get automatic instances for type classes that can't normally be derived.
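As a rough sketch of what that extension buys you, the wrapper below borrows the numeric instances of the type it wraps. The choice of classes to derive and the sample values are assumptions for illustration.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Main where

-- Num and Fractional cannot be derived for a newtype in plain Haskell;
-- with the extension they are reused from the wrapped type.
newtype Weight a = Weight { getWeight :: a }
  deriving (Show, Eq, Ord, Num, Fractional)

main :: IO ()
main = do
  -- Weights can be added without manual unwrapping and rewrapping.
  let total = Weight 70.5 + Weight 1.5 :: Weight Double
  print (getWeight total)  -- prints 72.0
```

Whether deriving Num is a good idea is a design question: it also permits multiplying two weights, which may not be meaningful, so you might prefer to derive only the classes you actually need.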
Haddock comments and the function's equations (the names you bound the arguments to) are how I tell what's going on. You can Haddock individual parameters, like so,
bmiTell :: (RealFloat a) => a      -- ^ your weight
                         -> a      -- ^ your height
                         -> String -- ^ what I'd think about that
so it's not just a blob of text explaining all the stuff.
The reason your cute type variables didn't work is that your function is:
(RealFloat a) => a -> a -> String
But your attempted change:
(RealFloat weight, RealFloat height) => weight -> height -> String
is equivalent to this:
(RealFloat a, RealFloat b) => a -> b -> String
So, in this type signature you have said that the two arguments may have different, independently chosen types, but GHC has determined (from how you use them) that they must have the same type. So it complains that it cannot deduce that weight and height are the same type, even though they must be (that is, your proposed type signature is not restrictive enough and would allow invalid uses of the function).
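The constraint comes from the arithmetic itself: (/) has type Fractional a => a -> a -> a, so both of its operands must share one type. If you really wanted two independent type variables, each argument would have to be converted to a common type first. The following is only a sketch of that idea, not a recommendation; the helper name bmi and the choice of Double as the common type are assumptions.

```haskell
module Main where

-- Two independent type variables are fine here, because each argument
-- is converted to Double before (/) ever sees it.
bmi :: (Real weight, Real height) => weight -> height -> Double
bmi weight height = realToFrac weight / realToFrac height ^ 2

main :: IO ()
main =
  -- The arguments may now genuinely have different types.
  print (bmi (70 :: Float) (1.75 :: Double))
```

Note that the descriptive variable names still give the compiler nothing to check: bmi 1.75 70 type-checks just as happily.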