Scala: Why does mapValues produce a view, and are there any stable alternatives?
I was just surprised to learn that mapValues produces a view. The consequence is shown in the following example:
case class thing(id: Int)
val rand = new java.util.Random
val distribution = Map(thing(0) -> 0.5, thing(1) -> 0.5)
val perturbed = distribution mapValues { _ + 0.1 * rand.nextGaussian }
val sumProbs = perturbed.map{_._2}.sum
val newDistribution = perturbed mapValues { _ / sumProbs }
The idea is that I have a distribution, which is perturbed with some randomness, and then I renormalize it. The code actually fails in its original intention: since mapValues produces a view, _ + 0.1 * rand.nextGaussian is re-evaluated every time perturbed is used.
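The re-evaluation can be made visible with a call counter. This is a minimal sketch, not code from the original example: the names calls, source and lazyValues are made up for illustration. On Scala 2.13 and later, calling mapValues directly on a Map is deprecated in favour of .view.mapValues, but it still compiles and is still lazy.

```scala
// Count how many times the mapping function actually runs.
var calls = 0
val source = Map("a" -> 1, "b" -> 2)

// Nothing is computed here: the result only wraps `source`.
val lazyValues = source.mapValues { v => calls += 1; v + 1 }

// Each lookup applies the function again.
val first = lazyValues("a")
val second = lazyValues("a")

println(s"function ran $calls times for two lookups of the same key")
```

With a pure function this only costs time; with a side-effecting or random function, as in the distribution example, every access sees different values.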
I am now doing something like distribution map { case (s, p) => (s, p + 0.1 * rand.nextGaussian) }, but that's just a little bit verbose. So the purpose of this question is:
- Remind people who are unaware of this fact.
- Look for reasons why they made mapValues output views.
- Whether there is an alternative method that produces a concrete Map.
- Are there any other commonly used collection methods that have this trap.
Thanks.
Solution 1:
There's a ticket about this, SI-4776 (by YT).
The commit that introduces it has this to say:
Following a suggestion of jrudolph, made filterKeys and mapValues transform abstract maps, and duplicated functionality for immutable maps. Moved transform and filterNot from immutable to general maps. Review by phaller.
I have not been able to find the original suggestion by jrudolph, but I assume it was done to make mapValues more efficient. Given the question, that may come as a surprise, but mapValues is more efficient if you are not likely to iterate over the values more than once.
As a work-around, one can do mapValues(...).view.force to produce a new Map.
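Another option worth trying, assuming the map is an immutable Map, is the transform method mentioned in the commit message quoted above. It takes a function of both key and value and, as far as I can tell, builds the result eagerly. The sketch below uses made-up names (calls, source, strict, callsAfterBuild) and a counter to check that the function is not re-run on later lookups.

```scala
// Count how many times the mapping function actually runs.
var calls = 0
val source = Map("a" -> 1, "b" -> 2)

// transform receives (key, value); the key is ignored here.
val strict = source.transform { (_, v) => calls += 1; v + 1 }
val callsAfterBuild = calls

// Later lookups read stored values and do not call the function again.
val x = strict("a")
val y = strict("a")

println(s"calls after build: $callsAfterBuild, calls after lookups: $calls")
```

The trade-off is the one described above: the whole map is computed up front, even if only a few values are ever read.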
Solution 2:
The Scaladoc says:
a map view which maps every key of this map to f(this(key)). The resulting map wraps the original map without copying any elements.
So this should be expected, but it scares me a lot; I'll have to review a bunch of code tomorrow. I wasn't expecting behavior like that :-(
Just another workaround:
You can call toSeq to get a copy, and toMap if you need it back as a map, but this creates objects unnecessarily and has a performance cost compared to using map.
One can write a mapValues that doesn't create a view relatively easily. I'll do it tomorrow and post the code here if no one does it before me ;)
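For reference, such a helper could look roughly like the sketch below. The name mapValuesStrict is hypothetical (it is not part of the standard library), and the seed is fixed only to make the run repeatable. It simply delegates to map, applied here to the distribution example from the question.

```scala
// Hypothetical helper: a strict counterpart of mapValues built on map.
def mapValuesStrict[K, V, W](m: Map[K, V])(f: V => W): Map[K, W] =
  m.map { case (k, v) => k -> f(v) }

case class Thing(id: Int)

// Fixed seed (an assumption for repeatability; the question uses an unseeded Random).
val rand = new java.util.Random(42L)
val distribution = Map(Thing(0) -> 0.5, Thing(1) -> 0.5)

// The noise is drawn exactly once per entry and stored.
val perturbed = mapValuesStrict(distribution)(_ + 0.1 * rand.nextGaussian())
val sumProbs = perturbed.values.sum

// Renormalize against the same stored values that were summed.
val newDistribution = mapValuesStrict(perturbed)(_ / sumProbs)

println(newDistribution.values.sum)
```

Because perturbed is a concrete Map, the values that are summed are the same values that get divided, so the result sums to 1 up to floating-point error.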
EDIT:
I found an easy way to 'force' the view: use '.map(identity)' after mapValues (so there is no need to implement a specific function):
scala> import scala.util.Random
import scala.util.Random
scala> val xs = Map("a" -> 1, "b" -> 2)
xs: scala.collection.immutable.Map[java.lang.String,Int] = Map(a -> 1, b -> 2)
scala> val ys = xs.mapValues(_ + Random.nextInt).map(identity)
ys: scala.collection.immutable.Map[java.lang.String,Int] = Map(a -> 1315230132, b -> 1614948101)
scala> ys
res7: scala.collection.immutable.Map[java.lang.String,Int] = Map(a -> 1315230132, b -> 1614948101)
It's a shame the type returned isn't actually a view! Otherwise one would have been able to call 'force' ...