Select one element from each set but the selected element should not be repeated

I have a few sets, say 5 of them

{1,2,3}

{2,4}

{1,2}

{4,5}

{2,3,5}

Here, I need to choose at least 3 elements from any three sets(One element per set). Given that if an element is already selected, then it cannot be selected again.

Also check if any solution exists or not.

Eg

set {1,2,3} -> choose 1

set {2,4} -> choose 2

set {1,2} -> cannot choose since both 1 and 2 are chosen.

set {2,5} -> can only choose 5

Is there a way to achieve this? Simple explanation would be appreciated.


If you only need 3 elements, then the algorithm is quite simple. Just repeat the following procedure:

  1. Select the set with the lowest heuristic. The heuristic is the length of the set, divided by the total occurrences of that set. If the set has zero elements, remove the set, and go to step 4. If there are two or more, you can choose any one of them.
  2. Pick an element from that set. This is the element you'll choose.
  3. Remove this element from every set.
  4. If you have picked 3 elements or there are no more sets remaining, exit. Otherwise go to step 1.

This algorithm gives at least 3 elements whenever it's possible, even in the presence of duplicates. Here's the proof.

  1. If the heuristic for a set is <= 1, picking an element from that set is basically free. It doesn't hurt the ability to use other sets at all.
  2. If we are in a situation with 2 or more sets with heuristic >1, and we have to pick at least two elements, this is easy. Just pick one from the first, and the second one will have an element left, because it's length is >1 because it's heuristic is >1.
  3. If we are in a situation with 3 or more sets with heuristic >1, we can pick from the first set. After this we are left with at least two sets, where at least one of them has more than one element. We can't be left with two size one sets, because that would imply that the 3 sets we started with contain a duplicate length 2 set, which has heuristic 1. Thus we can pick all 3 elements.

Here is python code for this algorithm. The generator returns as many elements as it can manage. If it's possible to return at least 3 elements, it will. However after that, it doesn't always return the optimal solution.

def choose(sets):
    # Copy the input, to avoid modification of the input
    s = [{*e} for e in sets]
    while True:
        # If there are no more sets remaining
        if not s:return
        # Remove based on length and number of duplicates
        m = min(s,key=lambda x:(len(x)/s.count(x)))
        s.remove(m)
        # Ignore empty sets
        if m:
            # Remove a random element
            e = m.pop()
            # Yield it
            yield e
            # Remove the chosen element e from other sets
            for i in range(len(s)):s[i].discard(e)

print([*choose([{1,2,3}, {2,4}, {1,2}, {4,5}, {2,3,5}])])
print([*choose([{1}, {2,3}, {2,4}, {1,2,4}])])
print([*choose([{1,2}, {2}, {2,3,4}])])
print([*choose([{1,2}, {2}, {2,1}])])
print([*choose([{1,2}, {1,3}, {1,3}])])
print([*choose([{1}, {1,2,3}, {1,2,3}])])
print([*choose([{1,3}, {2,3,4}, {2,3,4}, {2,3,4}, {2,3,4}])])
print([*choose([{1,5}, {2,3}, {1,3}, {1,2,3}])])

Try it online!


Something like this

given your sets
0: {1,2,3}
1: {2,4}
2: {1,2}
3: {4,5}
4: {2,3,5}
A array of sets
set A[1] = { 0, 2}       // all sets containing 1
    A[2] = { 0, 1, 2, 4} // all sets containing 2
    A[3] = { 0, 4 }      // all sets containing 3
    A[4] = { 1, 3 }      // all sets containing 4
    A[5] = { 3, 4 }      // all sets containing 5
set<int> result;
for(i = 0; i < 3; i++) {
  find k such that A[k] not empty
  if no k exist then "no solution"
  result.add(k)
  A[k] = empty
}
return result