Is the Scala 2.8 collections library a case of "the longest suicide note in history"? [closed]

I have just started to look at the Scala collections library re-implementation which is coming in the imminent 2.8 release. Those familiar with the library from 2.7 will notice that the library, from a usage perspective, has changed little. For example...

> List("Paris", "London").map(_.length)
res0: List[Int] List(5, 6)

...would work in either versions. The library is eminently useable: in fact it's fantastic. However, those previously unfamiliar with Scala and poking around to get a feel for the language now have to make sense of method signatures like:

def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That

For such simple functionality, this is a daunting signature and one which I find myself struggling to understand. Not that I think Scala was ever likely to be the next Java (or /C/C++/C#) - I don't believe its creators were aiming it at that market - but I think it is/was certainly feasible for Scala to become the next Ruby or Python (i.e. to gain a significant commercial user-base)

  • Is this going to put people off coming to Scala?
  • Is this going to give Scala a bad name in the commercial world as an academic plaything that only dedicated PhD students can understand? Are CTOs and heads of software going to get scared off?
  • Was the library re-design a sensible idea?
  • If you're using Scala commercially, are you worried about this? Are you planning to adopt 2.8 immediately or wait to see what happens?

Steve Yegge once attacked Scala (mistakenly in my opinion) for what he saw as its overcomplicated type-system. I worry that someone is going to have a field day spreading FUD with this API (similarly to how Josh Bloch scared the JCP out of adding closures to Java).

Note - I should be clear that, whilst I believe that Joshua Bloch was influential in the rejection of the BGGA closures proposal, I don't ascribe this to anything other than his honestly-held beliefs that the proposal represented a mistake.


Despite whatever my wife and coworkers keep telling me, I don't think I'm an idiot: I have a good degree in mathematics from the University of Oxford, and I've been programming commercially for almost 12 years and in Scala for about a year (also commercially).

Note the inflammatory subject title is a quotation made about the manifesto of a UK political party in the early 1980s. This question is subjective but it is a genuine question, I've made it CW and I'd like some opinions on the matter.


Solution 1:

I hope it's not a "suicide note", but I can see your point. You hit on what is at the same time both a strength and a problem of Scala: its extensibility. This lets us implement most major functionality in libraries. In some other languages, sequences with something like map or collect would be built in, and nobody has to see all the hoops the compiler has to go through to make them work smoothly. In Scala, it's all in a library, and therefore out in the open.

In fact the functionality of map that's supported by its complicated type is pretty advanced. Consider this:

scala> import collection.immutable.BitSet
import collection.immutable.BitSet

scala> val bits = BitSet(1, 2, 3)
bits: scala.collection.immutable.BitSet = BitSet(1, 2, 3)

scala> val shifted = bits map { _ + 1 }
shifted: scala.collection.immutable.BitSet = BitSet(2, 3, 4)

scala> val displayed = bits map { _.toString + "!" }
displayed: scala.collection.immutable.Set[java.lang.String] = Set(1!, 2!, 3!)

See how you always get the best possible type? If you map Ints to Ints you get again a BitSet, but if you map Ints to Strings, you get a general Set. Both the static type and the runtime representation of map's result depend on the result type of the function that's passed to it. And this works even if the set is empty, so the function is never applied! As far as I know there is no other collection framework with an equivalent functionality. Yet from a user perspective this is how things are supposed to work.

The problem we have is that all the clever technology that makes this happen leaks into the type signatures which become large and scary. But maybe a user should not be shown by default the full type signature of map? How about if she looked up map in BitSet she got:

map(f: Int => Int): BitSet     (click here for more general type)

The docs would not lie in that case, because from a user perspective indeed map has the type (Int => Int) => BitSet. But map also has a more general type which can be inspected by clicking on another link.

We have not yet implemented functionality like this in our tools. But I believe we need to do this, to avoid scaring people off and to give more useful info. With tools like that, hopefully smart frameworks and libraries will not become suicide notes.

Solution 2:

I do not have a PhD, nor any other kind of degree neither in CS nor math nor indeed any other field. I have no prior experience with Scala nor any other similar language. I have no experience with even remotely comparable type systems. In fact, the only language that I have more than just a superficial knowledge of which even has a type system is Pascal, not exactly known for its sophisticated type system. (Although it does have range types, which AFAIK pretty much no other language has, but that isn't really relevant here.) The other three languages I know are BASIC, Smalltalk and Ruby, none of which even have a type system.

And yet, I have no trouble at all understanding the signature of the map function you posted. It looks to me like pretty much the same signature that map has in every other language I have ever seen. The difference is that this version is more generic. It looks more like a C++ STL thing than, say, Haskell. In particular, it abstracts away from the concrete collection type by only requiring that the argument is IterableLike, and also abstracts away from the concrete return type by only requiring that an implicit conversion function exists which can build something out of that collection of result values. Yes, that is quite complex, but it really is only an expression of the general paradigm of generic programming: do not assume anything that you don't actually have to.

In this case, map does not actually need the collection to be a list, or being ordered or being sortable or anything like that. The only thing that map cares about is that it can get access to all elements of the collection, one after the other, but in no particular order. And it does not need to know what the resulting collection is, it only needs to know how to build it. So, that is what its type signature requires.

So, instead of

map :: (a → b) → [a] → [b]

which is the traditional type signature for map, it is generalized to not require a concrete List but rather just an IterableLike data structure

map :: (IterableLike i, IterableLike j) ⇒ (a → b) → i → j

which is then further generalized by only requiring that a function exists that can convert the result to whatever data structure the user wants:

map :: IterableLike i ⇒ (a → b) → i → ([b] → c) → c

I admit that the syntax is a bit clunkier, but the semantics are the same. Basically, it starts from

def map[B](f: (A) ⇒ B): List[B]

which is the traditional signature for map. (Note how due to the object-oriented nature of Scala, the input list parameter vanishes, because it is now the implicit receiver parameter that every method in a single-dispatch OO system has.) Then it generalized from a concrete List to a more general IterableLike

def map[B](f: (A) ⇒ B): IterableLike[B]

Now, it replaces the IterableLike result collection with a function that produces, well, really just about anything.

def map[B, That](f: A ⇒ B)(implicit bf: CanBuildFrom[Repr, B, That]): That

Which I really believe is not that hard to understand. There's really only a couple of intellectual tools you need:

  1. You need to know (roughly) what map is. If you gave only the type signature without the name of the method, I admit, it would be a lot harder to figure out what is going on. But since you already know what map is supposed to do, and you know what its type signature is supposed to be, you can quickly scan the signature and focus on the anomalies, like "why does this map take two functions as arguments, not one?"
  2. You need to be able to actually read the type signature. But even if you have never seen Scala before, this should be quite easy, since it really is just a mixture of type syntaxes you already know from other languages: VB.NET uses square brackets for parametric polymorphism, and using an arrow to denote the return type and a colon to separate name and type, is actually the norm.
  3. You need to know roughly what generic programming is about. (Which isn't that hard to figure out, since it's basically all spelled out in the name: it's literally just programming in a generic fashion).

None of these three should give any professional or even hobbyist programmer a serious headache. map has been a standard function in pretty much every language designed in the last 50 years, the fact that different languages have different syntax should be obvious to anyone who has designed a website with HTML and CSS and you can't subscribe to an even remotely programming related mailinglist without some annoying C++ fanboy from the church of St. Stepanov explaining the virtues of generic programming.

Yes, Scala is complex. Yes, Scala has one of the most sophisticated type systems known to man, rivaling and even surpassing languages like Haskell, Miranda, Clean or Cyclone. But if complexity were an argument against success of a programming language, C++ would have died long ago and we would all be writing Scheme. There are lots of reasons why Scala will very likely not be successful, but the fact that programmers can't be bothered to turn on their brains before sitting down in front of the keyboard is probably not going to be the main one.

Solution 3:

Same thing in C++:

template <template <class, class> class C,
          class T,
          class A,
          class T_return,
          class T_arg
              >
C<T_return, typename A::rebind<T_return>::other>
map(C<T, A> &c,T_return(*func)(T_arg) )
{
    C<T_return, typename A::rebind<T_return>::other> res;
    for ( C<T,A>::iterator it=c.begin() ; it != c.end(); it++ ){
        res.push_back(func(*it));
    }
    return res;
}

Solution 4:

Well, I can understand your pain, but, quite frankly, people like you and I -- or pretty much any regular Stack Overflow user -- are not the rule.

What I mean by that is that... most programmers won't care about that type signature, because they'll never see them! They don't read documentation.

As long as they saw some example of how the code works, and the code doesn't fail them in producing the result they expect, they won't ever look at the documentation. When that fails, they'll look at the documentation and expect to see usage examples at the top.

With these things in mind, I think that:

  1. Anyone (as in, most people) who ever comes across that type signature will mock Scala to no end if they are pre-disposed against it, and will consider it a symbol of Scala's power if they like Scala.

  2. If the documentation isn't enhanced to provide usage examples and explain clearly what a method is for and how to use it, it can detract from Scala adoption a bit.

  3. In the long run, it won't matter. That Scala can do stuff like that will make libraries written for Scala much more powerful and safer to use. These libraries and frameworks will attract programmers atracted to powerful tools.

  4. Programmers who like simplicity and directness will continue to use PHP, or similar languages.

Alas, Java programmers are much into power tools, so, in answering that, I have just revised my expectation of mainstream Scala adoption. I have no doubt at all that Scala will become a mainstream language. Not C-mainstream, but perhaps Perl-mainstream or PHP-mainstream.

Speaking of Java, did you ever replace the class loader? Have you ever looked into what that involves? Java can be scary, if you look at the places framework writers do. It's just that most people don't. The same thing applies to Scala, IMHO, but early adopters have a tendency to look under each rock they encounter, to see if there's something hiding there.