Specialization with Constraints

Solution 1:

GHC also lets you SPECIALIZE a type-class instance declaration. I tried this with the (expanded) code of Foo.hs by adding the following pragma inside the instance:

instance (Num r, V.Vector v r, Factored m r) => Num (VT v m r) where
    {-# SPECIALIZE instance Factored m Int => Num (VT U.Vector m Int) #-}
    VT x + VT y = VT $ V.zipWith (+) x y

This change, however, did not achieve the desired speedup. What did improve performance was manually adding a separate, specialized instance for the type VT U.Vector m Int with the same function definitions, as follows:

instance (Factored m Int) => Num (VT U.Vector m Int) where 
    VT x + VT y = VT $ V.zipWith (+) x y

This requires enabling the OverlappingInstances and FlexibleInstances extensions in a LANGUAGE pragma.
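Since GHC 7.10 the module-wide OverlappingInstances extension is deprecated in favor of per-instance pragmas, so the same idea can be written with {-# OVERLAPPING #-} on the specialized instance. The following is only a minimal sketch of that shape, not the code from Foo.hs: to keep it runnable with base alone, it assumes a hypothetical Container class (with czipWith) in place of V.Vector, plain lists in place of U.Vector, and a dummy Factored class with a phantom index M. It shows how instance selection works, not the speedup itself.

```haskell
{-# LANGUAGE FlexibleContexts      #-}
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MultiParamTypeClasses #-}
-- Only (+) is defined below, mirroring the snippets above; silence the warning.
{-# OPTIONS_GHC -Wno-missing-methods #-}
module Main where

-- Hypothetical stand-in for the V.Vector constraint: a container with zipWith.
class Container v where
  czipWith :: (a -> a -> a) -> v a -> v a -> v a

instance Container [] where
  czipWith = zipWith

-- Hypothetical stand-in for the Factored class and one phantom index M.
class Factored m r
data M
instance Factored M Int
instance Factored M Integer

newtype VT v m r = VT (v r)

unVT :: VT v m r -> v r
unVT (VT x) = x

-- Generic instance, polymorphic in the container and the element type.
instance (Num r, Container v, Factored m r) => Num (VT v m r) where
  VT x + VT y = VT (czipWith (+) x y)

-- Hand-written monomorphic instance. The OVERLAPPING pragma tells GHC to
-- prefer it over the generic instance whenever both match.
instance {-# OVERLAPPING #-} Factored m Int => Num (VT [] m Int) where
  VT x + VT y = VT (zipWith (+) x y)

main :: IO ()
main = do
  -- Resolved with the specialized (overlapping) instance.
  print (unVT (VT [1, 2, 3] + VT [10, 20, 30] :: VT [] M Int))
  -- Resolved with the generic instance.
  print (unVT (VT [1, 2, 3] + VT [10, 20, 30] :: VT [] M Integer))
```

With the per-instance pragma, only FlexibleInstances (plus the extensions the class declarations themselves need) has to be enabled; the generic instance could alternatively be marked {-# OVERLAPPABLE #-}.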

Interestingly, in the example program, the speedup obtained with the overlapping instance remains even if you remove every SPECIALIZE and INLINABLE pragma.