How to select unique elements
Solution 1:
Helper method
This method uses the helper:
class Array
def difference(other)
h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
reject { |e| h[e] > 0 && h[e] -= 1 }
end
end
This method is similar to Array#-. The difference is illustrated in the following example:
a = [3,1,2,3,4,3,2,2,4]
b = [2,3,4,4,3,4]
a - b #=> [1]
c = a.difference b #=> [1, 3, 2, 2]
As you see, a
contains three 3's and b
contains two, so the first two 3's in a
are removed in constructing c
(a
is not mutated). When b
contains as least as many instances of an element as does a
, c
contains no instances of that element. To remove elements beginning at the end of a
:
a.reverse.difference(b).reverse #=> [3, 1, 2, 2]
Array#difference!
could be defined in the obvious way.
I have found many uses for this method: here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here and here.
I have proposed that this method be added to the Ruby core.
When used with Array#-
, this method makes it easy to extract the unique elements from an array a
:
a = [1,3,2,4,3,4]
u = a.uniq #=> [1, 2, 3, 4]
u - a.difference(u) #=> [1, 2]
This works because
a.difference(u) #=> [3,4]
contains all the non-unique elements of a
(each possibly more than once).
Problem at Hand
Code
class Array
def uniq_elements(&prc)
prc ||= ->(e) { e }
a = map { |e| prc[e] }
u = a.uniq
uniques = u - a.difference(u)
select { |e| uniques.include?(prc[e]) ? (uniques.delete(e); true) : false }
end
end
Examples
t = [1,2,2,3,4,4,5,6,7,7,8,9,9,9]
t.uniq_elements
#=> [1,3,5,6,8]
t = [1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2]
t.uniq_elements { |z| z.round }
# => [2.0, 5.1]
Solution 2:
Here's another way.
Code
require 'set'
class Array
def uniq_elements(&prc)
prc ||= ->(e) { e }
uniques, dups = {}, Set.new
each do |e|
k = prc[e]
((uniques.key?(k)) ? (dups << k; uniques.delete(k)) :
uniques[k] = e) unless dups.include?(k)
end
uniques.values
end
end
Examples
t = [1,2,2,3,4,4,5,6,7,7,8,9,9,9]
t.uniq_elements #=> [1,3,5,6,8]
t = [1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2]
t.uniq_elements { |z| z.round } # => [2.0, 5.1]
Explanation
- if
uniq_elements
is called with a block, it is received as the procprc
. - if
uniq_elements
is called without a block,prc
isnil
, so the first statement of the method setsprc
equal to the default proc (lambda). - an initially-empty hash,
uniques
, contains representations of the unique values. The values are the unique values of the arrayself
, the keys are what is returned when the procprc
is passed the array value and called:k = prc[e]
. - the set
dups
contains the elements of the array that have found to not be unique. It is a set (rather than an array) to speed lookups. Alternatively, if could be a hash with the non-unique values as keys, and arbitrary values. - the following steps are performed for each element
e
of the arrayself
:-
k = prc[e]
is computed. - if
dups
containsk
,e
is a dup, so nothing more needs to be done; else - if
uniques
has a keyk
,e
is a dup, sok
is added to the setdups
and the element with keyk
is removed fromuniques
; else - the element
k=>e
is added touniques
as a candidate for a unique element.
-
- the values of
unique
are returned.