Interpreting a benchmark in C, Clojure, Python, Ruby, Scala and others [closed]
Disclaimer
I know that artificial benchmarks are evil. They can show results only for very specific narrow situation. I don't assume that one language is better than the other because of the some stupid bench. However I wonder why results is so different. Please see my questions at the bottom.
Math benchmark description
Benchmark is simple math calculations to find pairs of prime numbers which differs by 6 (so called sexy primes)
E.g. sexy primes below 100 would be: (5 11) (7 13) (11 17) (13 19) (17 23) (23 29) (31 37) (37 43) (41 47) (47 53) (53 59) (61 67) (67 73) (73 79) (83 89) (97 103)
Results table
In table: calculation time in seconds Running: all except Factor was running in VirtualBox (Debian unstable amd64 guest, Windows 7 x64 host) CPU: AMD A4-3305M
Sexy primes up to: 10k 20k 30k 100k
Bash 58.00 200.00 [*1] [*1]
C 0.20 0.65 1.42 15.00
Clojure1.4 4.12 8.32 16.00 137.93
Clojure1.4 (optimized) 0.95 1.82 2.30 16.00
Factor n/a n/a 15.00 180.00
Python2.7 1.49 5.20 11.00 119
Ruby1.8 5.10 18.32 40.48 377.00
Ruby1.9.3 1.36 5.73 10.48 106.00
Scala2.9.2 0.93 1.41 2.73 20.84
Scala2.9.2 (optimized) 0.32 0.79 1.46 12.01
[*1] - I'm afraid to imagine how much time will it take
Code listings
C:
int isprime(int x) {
int i;
for (i = 2; i < x; ++i)
if (x%i == 0) return 0;
return 1;
}
void findprimes(int m) {
int i;
for ( i = 11; i < m; ++i)
if (isprime(i) && isprime(i-6))
printf("%d %d\n", i-6, i);
}
main() {
findprimes(10*1000);
}
Ruby:
def is_prime?(n)
(2...n).all?{|m| n%m != 0 }
end
def sexy_primes(x)
(9..x).map do |i|
[i-6, i]
end.select do |j|
j.all?{|j| is_prime? j}
end
end
a = Time.now
p sexy_primes(10*1000)
b = Time.now
puts "#{(b-a)*1000} mils"
Scala:
def isPrime(n: Int) =
(2 until n) forall { n % _ != 0 }
def sexyPrimes(n: Int) =
(11 to n) map { i => List(i-6, i) } filter { _ forall(isPrime(_)) }
val a = System.currentTimeMillis()
println(sexyPrimes(100*1000))
val b = System.currentTimeMillis()
println((b-a).toString + " mils")
Scala opimized isPrime
(the same idea like in Clojure optimization):
import scala.annotation.tailrec
@tailrec // Not required, but will warn if optimization doesn't work
def isPrime(n: Int, i: Int = 2): Boolean =
if (i == n) true
else if (n % i != 0) isPrime(n, i + 1)
else false
Clojure:
(defn is-prime? [n]
(every? #(> (mod n %) 0)
(range 2 n)))
(defn sexy-primes [m]
(for [x (range 11 (inc m))
:let [z (list (- x 6) x)]
:when (every? #(is-prime? %) z)]
z))
(let [a (System/currentTimeMillis)]
(println (sexy-primes (* 10 1000)))
(let [b (System/currentTimeMillis)]
(println (- b a) "mils")))
Clojure optimized is-prime?
:
(defn ^:static is-prime? [^long n]
(loop [i (long 2)]
(if (= (rem n i) 0)
false
(if (>= (inc i) n) true (recur (inc i))))))
Python
import time as time_
def is_prime(n):
return all((n%j > 0) for j in xrange(2, n))
def primes_below(x):
return [[j-6, j] for j in xrange(9, x+1) if is_prime(j) and is_prime(j-6)]
a = int(round(time_.time() * 1000))
print(primes_below(10*1000))
b = int(round(time_.time() * 1000))
print(str((b-a)) + " mils")
Factor
MEMO:: prime? ( n -- ? )
n 1 - 2 [a,b] [ n swap mod 0 > ] all? ;
MEMO: sexyprimes ( n n -- r r )
[a,b] [ prime? ] filter [ 6 + ] map [ prime? ] filter dup [ 6 - ] map ;
5 10 1000 * sexyprimes . .
Bash(zsh):
#!/usr/bin/zsh
function prime {
for (( i = 2; i < $1; i++ )); do
if [[ $[$1%i] == 0 ]]; then
echo 1
exit
fi
done
echo 0
}
function sexy-primes {
for (( i = 9; i <= $1; i++ )); do
j=$[i-6]
if [[ $(prime $i) == 0 && $(prime $j) == 0 ]]; then
echo $j $i
fi
done
}
sexy-primes 10000
Questions
- Why Scala is so fast? Is it because of static typing? Or it is just using JVM very efficiently?
-
Why such a huge difference between Ruby and Python? I thought these two are not somewhat totally different. Maybe my code is wrong. Please enlighten me! Thanks.UPD Yes, that was error in my code. Python and Ruby 1.9 are pretty equal. - Really impressive jump in productivity between Ruby versions.
- Can I optimize Clojure code by adding type declarations? Will it help?
Rough answers:
- Scala's static typing is helping it quite a bit here - this means that it uses the JVM pretty efficiently without too much extra effort.
- I'm not exactly sure on the Ruby/Python difference, but I suspect that
(2...n).all?
in the functionis-prime?
is likely to be quite well optimised in Ruby (EDIT: sounds like this is indeed the case, see Julian's answer for more detail...) - Ruby 1.9.3 is just much better optimised
- Clojure code can certainly be accelerated a lot! While Clojure is dynamic by default, you can use type hints, primitive maths etc. to get close to Scala / pure Java speed in many cases when you need to.
Most important optimisation in the Clojure code would be to use typed primitive maths within is-prime?
, something like:
(set! *unchecked-math* true) ;; at top of file to avoid using BigIntegers
(defn ^:static is-prime? [^long n]
(loop [i (long 2)]
(if (zero? (mod n i))
false
(if (>= (inc i) n) true (recur (inc i))))))
With this improvement, I get Clojure completing 10k in 0.635 secs (i.e. the second fastest on your list, beating Scala)
P.S. note that you have printing code inside your benchmark in some cases - not a good idea as it will distort the results, especially if using a function like print
for the first time causes initialisation of IO subsystems or something like that!
Here's a fast Clojure version, using the same basic algorithms:
(set! *unchecked-math* true)
(defn is-prime? [^long n]
(loop [i 2]
(if (zero? (unchecked-remainder-int n i))
false
(if (>= (inc i) n)
true
(recur (inc i))))))
(defn sexy-primes [m]
(for [x (range 11 (inc m))
:when (and (is-prime? x) (is-prime? (- x 6)))]
[(- x 6) x]))
It runs about 20x faster than your original on my machine. And here's a version that leverages the new reducers library in 1.5 (requires Java 7 or JSR 166):
(require '[clojure.core.reducers :as r]) ;'
(defn sexy-primes [m]
(->> (vec (range 11 (inc m)))
(r/filter #(and (is-prime? %) (is-prime? (- % 6))))
(r/map #(list (- % 6) %))
(r/fold (fn ([] []) ([a b] (into a b))) conj)))
This runs about 40x faster than your original. On my machine, that's 100k in 1.5 seconds.
I'll answer just #2, since it's the only one I've got anything remotely intelligent to say, but for your Python code, you're creating an intermediate list in is_prime
, whereas you're using .map
in your all
in Ruby which is just iterating.
If you change your is_prime
to:
def is_prime(n):
return all((n%j > 0) for j in range(2, n))
they're on par.
I could optimize the Python further, but my Ruby isn't good enough to know when I've given more of an advantage (e.g., using xrange
makes Python win on my machine, but I don't remember if the Ruby range you used creates an entire range in memory or not).
EDIT: Without being too silly, making the Python code look like:
import time
def is_prime(n):
return all(n % j for j in xrange(2, n))
def primes_below(x):
return [(j-6, j) for j in xrange(9, x + 1) if is_prime(j) and is_prime(j-6)]
a = int(round(time.time() * 1000))
print(primes_below(10*1000))
b = int(round(time.time() * 1000))
print(str((b-a)) + " mils")
which doesn't change much more, puts it at 1.5s for me, and, with being extra silly, running it with PyPy puts it at .3s for 10K, and 21s for 100K.
You can make the Scala a lot faster by modifying your isPrime
method to
def isPrime(n: Int, i: Int = 2): Boolean =
if (i == n) true
else if (n % i != 0) isPrime(n, i + 1)
else false
Not quite as concise but the program runs in 40% of the time!
We cut out the superfluous Range
and anonymous Function
objects, the Scala compiler recognizes the tail-recursion and turns it into a while-loop, which the JVM can turn into more or less optimal machine code, so it shouldn't be too far off the C version.
See also: How to optimize for-comprehensions and loops in Scala?