Scala: Why does mapValues produce a view, and are there any stable alternatives?

I was just surprised to learn that mapValues produces a view. The consequence is shown in the following example:

case class thing(id: Int)
val rand = new java.util.Random
val distribution = Map(thing(0) -> 0.5, thing(1) -> 0.5)
val perturbed = distribution mapValues { _ + 0.1 * rand.nextGaussian }
val sumProbs = perturbed.map{_._2}.sum
val newDistribution = perturbed mapValues { _ / sumProbs }

The idea is that I have a distribution, which I perturb with some randomness and then renormalize. The code fails in its original intention: since mapValues produces a view, _ + 0.1 * rand.nextGaussian is re-evaluated every time perturbed is used, so sumProbs and newDistribution are computed from different random draws.
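A minimal sketch that makes the re-evaluation visible, assuming Map#mapValues as it behaves in Scala 2.8 through 2.12 (in 2.13 and later the same call still compiles, with a deprecation warning). The object name ViewDemo and the counter calls are hypothetical, added only for illustration:

```scala
object ViewDemo {
  // Hypothetical counter: records how often the mapping function runs.
  var calls = 0

  val base = Map("a" -> 1, "b" -> 2)

  // Nothing is computed here; the function is only stored in a wrapper.
  val lazyValues = base.mapValues { v => calls += 1; v * 10 }

  // Each lookup runs the function again.
  val first = lazyValues("a")
  val second = lazyValues("a")
  // After these two lookups, calls is 2, not 1.
}
```

With a pure function the repeated work is merely wasteful; with a random function, as in the question, each access yields a different value.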

I am now doing something like distribution map { case (s, p) => (s, p + 0.1 * rand.nextGaussian) }, but that's just a little bit verbose. So the purpose of this question is:

  1. Remind people who are unaware of this fact.
  2. Find out why mapValues was made to output a view.
  3. Find out whether there is an alternative method that produces a concrete Map.
  4. Find out whether any other commonly used collection methods have this trap.

Thanks.


Solution 1:

There's a ticket about this, SI-4776 (by YT).

The commit that introduces it has this to say:

Following a suggestion of jrudolph, made filterKeys and mapValues transform abstract maps, and duplicated functionality for immutable maps. Moved transform and filterNot from immutable to general maps. Review by phaller.

I have not been able to find the original suggestion by jrudolph, but I assume it was done to make mapValues more efficient. Given the question, that may come as a surprise, but mapValues is more efficient if you are not likely to iterate over the values more than once.

As a work-around, one can do mapValues(...).view.force to produce a new Map.
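A sketch of the workaround applied to the question's scenario. It assumes Scala 2.13 or later, where mapValues on Map is deprecated and .view.force no longer returns a Map; there, the equivalent spelling is .view.mapValues(f).toMap. The object name ForceDemo and the String keys are hypothetical simplifications:

```scala
object ForceDemo {
  val rand = new java.util.Random
  val distribution = Map("x" -> 0.5, "y" -> 0.5)

  // Scala 2.12 and earlier (the workaround above):
  //   distribution.mapValues(_ + 0.1 * rand.nextGaussian()).view.force
  // Scala 2.13 and later: build the view explicitly, then copy it into a strict Map.
  val perturbed: Map[String, Double] =
    distribution.view.mapValues(_ + 0.1 * rand.nextGaussian()).toMap

  // The values are now fixed, so the sum matches what normalization divides by.
  val sumProbs = perturbed.values.sum
  val normalized = perturbed.map { case (k, p) => k -> p / sumProbs }
}
```

Because perturbed is a concrete Map, the random draws happen exactly once and the normalized values sum to 1 up to floating-point rounding.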

Solution 2:

The Scaladoc says:

a map view which maps every key of this map to f(this(key)). The resulting map wraps the original map without copying any elements.

So this should be expected, but it scares me a lot; I'll have to review a bunch of code tomorrow. I wasn't expecting behavior like that :-(

Another workaround:

You can call toSeq to get a copy and, if you need a map again, call toMap on the result, but this creates objects unnecessarily and performs worse than using map directly.

One can fairly easily write a mapValues that doesn't create a view. I'll do it tomorrow and post the code here if no one does it before me ;)
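A minimal sketch of what such a helper could look like, assuming Scala 2.10 or later for implicit classes. The names StrictMapOps, StrictMapValues, mapValuesStrict, and StrictMapOpsUsage are hypothetical and not part of the standard library:

```scala
object StrictMapOps {
  // Hypothetical extension method: applies f to every value once and
  // returns a newly built Map instead of a lazy wrapper.
  implicit class StrictMapValues[K, V](self: Map[K, V]) {
    def mapValuesStrict[W](f: V => W): Map[K, W] =
      self.map { case (k, v) => k -> f(v) }
  }
}

object StrictMapOpsUsage {
  import StrictMapOps._

  // Hypothetical counter: records how often the mapping function runs.
  var calls = 0
  val strict = Map("a" -> 1, "b" -> 2).mapValuesStrict { v => calls += 1; v + 1 }

  // Lookups read stored values; f is not invoked again.
  val a1 = strict("a")
  val a2 = strict("a")
}
```

This is just the verbose map { case (k, v) => ... } form from the question wrapped in a reusable method, so it costs one full pass over the map up front.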

EDIT:

I found an easy way to 'force' the view: use '.map(identity)' after mapValues (so there is no need to implement a specific function):

scala> import scala.util.Random
import scala.util.Random

scala> val xs = Map("a" -> 1, "b" -> 2)
xs: scala.collection.immutable.Map[java.lang.String,Int] = Map(a -> 1, b -> 2)

scala> val ys = xs.mapValues(_ + Random.nextInt).map(identity)
ys: scala.collection.immutable.Map[java.lang.String,Int] = Map(a -> 1315230132, b -> 1614948101)

scala> ys
res7: scala.collection.immutable.Map[java.lang.String,Int] = Map(a -> 1315230132, b -> 1614948101)

It's a shame the type returned isn't actually a view! Otherwise one would have been able to call 'force' ...