Why is sum so much faster than inject(:+)?
So I was running some benchmarks in Ruby 2.4.0 and realized that
(1...1000000000000000000000000000000).sum
calculates immediately whereas
(1...1000000000000000000000000000000).inject(:+)
takes so long that I just aborted the operation. I was under the impression that Range#sum
was an alias for Range#inject(:+)
but it seems like that is not true. So how does sum
work, and why is it so much faster than inject(:+)
?
N.B. The documentation for Enumerable#sum
(which is implemented by Range
) does not say anything about lazy evaluation or anything along those lines.
Short answer
For an integer range :
-
Enumerable#sum
returns(range.max-range.min+1)*(range.max+range.min)/2
-
Enumerable#inject(:+)
iterates over every element.
Theory
The sum of integers between 1 and n
is called a triangular number, and is equal to n*(n+1)/2
.
The sum of integers between n
and m
is the triangular number of m
minus the triangular number of n-1
, which is equal to m*(m+1)/2-n*(n-1)/2
, and can be written (m-n+1)*(m+n)/2
.
Enumerable#sum in Ruby 2.4
This property in used in Enumerable#sum
for integer ranges :
if (RTEST(rb_range_values(obj, &beg, &end, &excl))) {
if (!memo.block_given && !memo.float_value &&
(FIXNUM_P(beg) || RB_TYPE_P(beg, T_BIGNUM)) &&
(FIXNUM_P(end) || RB_TYPE_P(end, T_BIGNUM))) {
return int_range_sum(beg, end, excl, memo.v);
}
}
int_range_sum
looks like this :
VALUE a;
a = rb_int_plus(rb_int_minus(end, beg), LONG2FIX(1));
a = rb_int_mul(a, rb_int_plus(end, beg));
a = rb_int_idiv(a, LONG2FIX(2));
return rb_int_plus(init, a);
which is equivalent to:
(range.max-range.min+1)*(range.max+range.min)/2
the aforementioned equality!
Complexity
Thanks a lot to @k_g and @Hynek-Pichi-Vychodil for this part!
sum
(1...1000000000000000000000000000000).sum
requires three additions, a multiplication, a substraction and a division.
It's a constant number of operations, but multiplication is O((log n)²), so Enumerable#sum
is O((log n)²) for an integer range.
inject
(1...1000000000000000000000000000000).inject(:+)
requires 999999999999999999999999999998 additions!
Addition is O(log n), so Enumerable#inject
is O(n log n).
With 1E30
as input, inject
with never return. The sun will explode long before!
Test
It's easy to check if Ruby Integers are being added :
module AdditionInspector
def +(b)
puts "Calculating #{self}+#{b}"
super
end
end
class Integer
prepend AdditionInspector
end
puts (1..5).sum
#=> 15
puts (1..5).inject(:+)
# Calculating 1+2
# Calculating 3+3
# Calculating 6+4
# Calculating 10+5
#=> 15
Indeed, from enum.c
comments :
Enumerable#sum
method may not respect method redefinition of"+"
methods such asInteger#+
.