Difference between `data` and `newtype` in Haskell
What is the difference when I write this?
data Book = Book Int Int
versus
newtype Book = Book (Int, Int) -- "Book Int Int" is syntactically invalid
Solution 1:
Great question!
There are several key differences.
Representation
- A `newtype` guarantees that your data will have exactly the same representation at runtime as the type that you wrap.
- A `data` declaration, by contrast, introduces a brand new data structure at runtime.
So the key point here is that the constructor of a `newtype` is guaranteed to be erased at compile time.
Examples:
data Book = Book Int Int
newtype Book = Book (Int, Int)
Note how it has exactly the same representation as an `(Int, Int)`, since the `Book` constructor is erased.
data Book = Book (Int, Int)
This has an additional `Book` constructor that is not present in the `newtype`.
data Book = Book {-# UNPACK #-}!Int {-# UNPACK #-}!Int
No pointers! The two `Int` fields are unboxed, word-sized fields in the `Book` constructor.
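One way to see the shared representation from inside the language is `coerce` from `Data.Coerce`, which converts between a `newtype` and the type it wraps without doing any runtime work. The following is only a minimal sketch; the names `BookN`, `toPair`, and `toBooks` are made up for this illustration and are not part of the question.

```haskell
import Data.Coerce (coerce)

-- Hypothetical names (BookN, toPair, toBooks) chosen for this illustration.
newtype BookN = BookN (Int, Int)

-- Unwrapping a single value: the wrapper does not exist at runtime,
-- so this conversion is free.
toPair :: BookN -> (Int, Int)
toPair = coerce

-- Re-labelling a whole list of pairs as books without traversing it.
toBooks :: [(Int, Int)] -> [BookN]
toBooks = coerce

main :: IO ()
main = do
  print (toPair (BookN (1, 2)))                  -- prints (1,2)
  print (map toPair (toBooks [(3, 4), (5, 6)]))  -- prints [(3,4),(5,6)]
```

The same trick is not available for the `data` version, because its `Book` constructor is a real runtime object.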
Algebraic data types
Because of this need to erase the constructor, a `newtype` only works when wrapping a data type with a single constructor. There's no notion of "algebraic" newtypes. That is, you can't write a `newtype` equivalent of, say,
data Maybe a = Nothing
| Just a
since it has more than one constructor. Nor can you write
newtype Book = Book Int Int
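What you can do is give the `newtype` a single field whose type already carries the structure you need, as the question does with a tuple. A small sketch, assuming hypothetical names (`Rating`, `sndField`, `showRating`) that are not from the question:

```haskell
-- Two values: wrap one tuple field instead of two separate fields.
newtype Book = Book (Int, Int)

-- Several alternatives: wrap an existing sum type as the single field.
-- Rating is a hypothetical type invented for this example.
newtype Rating = Rating (Maybe Int)

-- Pattern matching looks through the newtype and then the tuple.
sndField :: Book -> Int
sndField (Book (_, y)) = y

showRating :: Rating -> String
showRating (Rating Nothing)  = "unrated"
showRating (Rating (Just n)) = show n ++ " stars"

main :: IO ()
main = do
  print (sndField (Book (1, 2)))          -- prints 2
  putStrLn (showRating (Rating Nothing))  -- prints unrated
```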
Strictness
The fact that the constructor is erased leads to some very subtle differences in strictness between `data` and `newtype`. In particular, `data` introduces a type that is "lifted", meaning, essentially, that it has an additional way to evaluate to a bottom value. Since there's no additional constructor at runtime with `newtype`, this property doesn't hold.
In `data Book = Book (Int, Int)`, the extra pointer from the `Book` constructor to the `(,)` constructor allows us to put a bottom value in.
As a result, `newtype` and `data` have slightly different strictness properties, as explained in the Haskell wiki article.
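The difference shows up when pattern matching on a bottom value. The sketch below uses made-up names (`DBook`, `NBook`, `matchD`, `matchN`, `diverges`) and is only meant to illustrate the point: matching on the `data` constructor has to evaluate the argument, while matching on the `newtype` constructor is a no-op.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical types for illustration: same field, different keyword.
data DBook = DBook (Int, Int)
newtype NBook = NBook (Int, Int)

-- Matching on a data constructor forces the argument to WHNF.
matchD :: DBook -> Int
matchD (DBook _) = 1

-- Matching on a newtype constructor forces nothing: there is no
-- constructor at runtime to inspect.
matchN :: NBook -> Int
matchN (NBook _) = 1

-- True if evaluating the value to weak head normal form throws.
diverges :: a -> IO Bool
diverges x = do
  r <- try (evaluate x)
  case r of
    Left e  -> const (pure True) (e :: SomeException)
    Right _ -> pure False

main :: IO ()
main = do
  diverges (matchD undefined) >>= print  -- prints True
  diverges (matchN undefined) >>= print  -- prints False
```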
Unboxing
It doesn't make sense to unbox the components of a `newtype`, since there's no constructor. With `data`, however, it is perfectly reasonable to write:
data T = T {-# UNPACK #-}!Int
yielding a runtime object with a `T` constructor and an `Int#` component. With `newtype`, you just get a bare `Int`.
References:
- "Newtype" on the Haskell wiki
- Norman Ramsey's answer about the strictness properties