Algorithm to separate items of the same type

Solution 1:

First, you don't have a well-defined optimization problem yet. If you want to maximize the minimum distance between two items of the same type, that's well defined. If you want to maximize the minimum distance between two A's and between two B's and ... and between two Z's, then that's not well defined. How would you compare these two solutions:

  1. A's are at least 4 apart, B's at least 4 apart, and C's at least 2 apart
  2. A's at least 3 apart, B's at least 3 apart, and C's at least 4 apart

You need a well-defined measure of "good" (or, more accurately, "better"). I'll assume for now that the measure is: maximize the minimum distance between any two of the same item.
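To make that measure concrete, here is a minimal sketch of how it could be computed. The function name `min_same_type_distance` is my own and not something from the question; it returns the smallest index gap between two equal items.

```python
def min_same_type_distance(items):
    """Smallest index gap between two equal items, or None if nothing repeats."""
    last_seen = {}   # item -> index of its most recent occurrence
    best = None
    for pos, item in enumerate(items):
        if item in last_seen:
            gap = pos - last_seen[item]
            if best is None or gap < best:
                best = gap
        last_seen[item] = pos
    return best
```

Under this measure, and assuming the stated bounds are tight, the two solutions listed above would score 2 and 3 respectively, so the second one would be considered better.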

Here's an algorithm that achieves a minimum distance of ceiling(N/n(A)), where N is the total number of items and n(A) is the number of items of type A, assuming that A is the most numerous type.

  • Order the item types A1, A2, ... , Ak where n(Ai) >= n(A{i+1}).
  • Initialize the list L to be empty.
  • For j from k to 1, distribute the items of type Aj as uniformly as possible in L.

Example: Given the distribution in the question, the algorithm produces:

F
E, F
D, E, D, F
D, C, E, D, C, F
B, D, C, E, B, D, C, F, B
A, B, D, A, C, E, A, B, D, A, C, F, A, B
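The steps above leave "as uniformly as possible" open to interpretation. The following is only a sketch of one possible reading, not necessarily the rule used for the hand-worked lists: the m items of the current type are placed at positions i*(total-1)//(m-1) of the enlarged list, and the existing items keep their relative order in the remaining slots. The function name `spread_by_type` and the tie-breaking by type value are my assumptions.

```python
from collections import Counter

def spread_by_type(items):
    counts = Counter(items)
    # Most numerous type first; ties are broken by the type's value
    # (an assumption that requires the types to be comparable).
    types = sorted(counts, key=lambda t: (-counts[t], t))
    result = []
    for t in reversed(types):            # least numerous type first
        m = counts[t]
        total = len(result) + m
        if m == 1:
            slots = {0}
        else:
            # m evenly spaced slots from the first to the last position
            slots = {i * (total - 1) // (m - 1) for i in range(m)}
        old = iter(result)
        # New items go into the chosen slots; old items fill the rest in order.
        result = [t if pos in slots else next(old) for pos in range(total)]
    return result
```

For the counts in the example (5 A's, 3 B's, 2 C's, 2 D's, one E, one F) this sketch returns A, B, C, A, D, E, A, B, F, A, D, C, B, A. That differs from the list above, but it also keeps the A's at least 3 apart. Inserting new items never brings existing equal items closer together, which is why the least numerous types can safely go in first.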

Solution 2:

This sounded like an interesting problem, so I just gave it a try. Here's my super-simplistic randomized approach, done in Python:

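import random

def optimize(items, quality_function, stop=1000):
    # Random-swap hill climbing: give up after `stop` consecutive
    # swaps that did not improve the quality.
    no_improvement = 0
    best = 0
    while no_improvement < stop:
        # Swap two random positions in a copy of the list.
        i = random.randint(0, len(items)-1)
        j = random.randint(0, len(items)-1)
        copy = items[::]
        copy[i], copy[j] = copy[j], copy[i]
        q = quality_function(copy)
        if q > best:
            # Keep the swap only if it is strictly better.
            items, best = copy, q
            no_improvement = 0
        else:
            no_improvement += 1
    return items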

As already discussed in the comments, the really tricky part is the quality function, which is passed as a parameter to the optimizer. After some experimenting I came up with one that almost always yields optimal results. Thanks to pmoleri for pointing out how to make this a whole lot more efficient.

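def quality_maxmindist(items):
    # Sum the reciprocals of the gaps between consecutive equal items;
    # small gaps are penalized most heavily.
    s = 0
    for item in set(items):
        indcs = [i for i in range(len(items)) if items[i] == item]
        if len(indcs) > 1:
            s += sum(1./(indcs[i+1] - indcs[i]) for i in range(len(indcs)-1))
    # No repeated items means there is nothing to separate.
    return 1./s if s else float('inf')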

And here is one random result:

>>> print(optimize(items, quality_maxmindist))
['A', 'B', 'C', 'A', 'D', 'E', 'A', 'B', 'F', 'C', 'A', 'D', 'B', 'A']

Note that by passing a different quality function, the same optimizer could be used for other list-rearrangement tasks, e.g. as a (rather silly) randomized sorter.
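As an illustration of the sorter idea, here is a sketch of a quality function that could be passed to `optimize`. The name `quality_sortedness` is hypothetical; it counts the pairs of positions that are already in non-decreasing order, so a fully sorted list gets the highest score.

```python
def quality_sortedness(items):
    """Count the pairs (i, j) with i < j whose items are in non-decreasing order."""
    n = len(items)
    return sum(1
               for i in range(n)
               for j in range(i + 1, n)
               if items[i] <= items[j])

# Intended use with the optimizer defined earlier:
#   optimize(data, quality_sortedness)
```

Swapping two adjacent out-of-order items raises this count by exactly one, so an improving swap exists whenever the list is unsorted. Since the optimizer picks swaps at random and stops after `stop` failed attempts, it is still not guaranteed to reach the sorted order.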

Solution 3:

Here is an algorithm that only maximizes the minimum distance between elements of the same type and does nothing beyond that. The following list is used as an example:

AAAAA BBBBB CCCC DDDD EEEE FFF GG

  • Sort the element sets by the number of elements of each type, in descending order. Actually, only the largest sets (A & B) and the sets that have one element fewer (C & D & E) need to be placed at the head of the list; the other sets may remain unsorted.
  • Let N be the total number of elements, S the size of the largest sets, and R the number of sets of that size. Reserve the last R positions in the array for one element from each of the largest sets, and divide the remaining array evenly between the S - 1 remaining elements of the largest sets. This gives the optimal distance: K = (N - R) / (S - 1), rounded down. Represent the target array as a 2D matrix with K columns and L = N / K full rows, using integer division (and possibly one partial row with N % K elements). For the example sets we have R = 2, S = 5, N = 27, K = 6, L = 4.
  • If the matrix has S - 1 full rows, fill the first R columns of the matrix with elements of the largest sets (A & B); otherwise sequentially fill all columns, starting from the last one.

For our example this gives:

AB....
AB....
AB....
AB....
AB.

If we try to fill the remaining columns with the other sets in the same order, there is a problem:

ABCDE.
ABCDE.
ABCDE.
ABCE..
ABD

The last 'E' is only 5 positions apart from the previous 'E'.

  • Sequentially fill all columns, starting from the last one.

For our example this gives:

ABFEDC
ABFEDC
ABFEDC
ABGEDC
ABG

Returning to the linear array, we have:

ABFEDCABFEDCABFEDCABGEDCABG
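The column-filling steps can be sketched in Python as follows. This is only my reading of the description: the function name `column_fill`, the tie-breaking by type value, and the handling of empty or duplicate-free input are assumptions, and I have only checked it against the example above.

```python
from collections import Counter

def column_fill(items):
    if not items:
        return []
    counts = Counter(items)
    # Largest sets first; ties broken by type value (an assumption).
    order = sorted(counts, key=lambda t: (-counts[t], t))
    n = len(items)
    s = counts[order[0]]                          # S: size of the largest sets
    if s < 2:
        return list(items)                        # nothing repeats
    r = sum(1 for t in order if counts[t] == s)   # R: number of largest sets
    k = (n - r) // (s - 1)                        # K: number of columns
    full_rows, extra = divmod(n, k)               # L full rows, N % K leftover
    # The first `extra` columns get one additional cell from the partial row.
    heights = [full_rows + (1 if c < extra else 0) for c in range(k)]
    if full_rows == s - 1:
        # First R columns for the largest sets, then the rest from the last column.
        col_order = list(range(r)) + list(range(k - 1, r - 1, -1))
    else:
        col_order = list(range(k - 1, -1, -1))
    flat = iter(t for t in order for _ in range(counts[t]))
    grid = {}
    for c in col_order:                           # fill column by column
        for row in range(heights[c]):
            grid[(row, c)] = next(flat)
    # Read the matrix back row by row to get the linear array.
    return [grid[(row, c)]
            for row in range(full_rows + 1)
            for c in range(k)
            if (row, c) in grid]
```

Called on the 27 example elements, `column_fill("AAAAABBBBBCCCCDDDDEEEEFFFGG")` returns the items of the linear array shown above.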

Here is an attempt to use simulated annealing for this problem (C sources): http://ideone.com/OGkkc.