Efficient algorithm for ordering different types of objects

Solution 1:

This is similar to a problem I encountered a couple of years ago: mixing liquids to avoid stratification. The idea is that if you're mixing liquids A, B, and C into a container, you don't want to just pour them into the container one after the other. Rather, you want to add some A, some B, some C, etc., in the relative proportions.

It's the same problem as evenly distributing items in a list.

Say you have 30 of type A, 20 of type B, and 10 of type C, for a total of 60 videos. Every other video, then, has to be an A. Every third video is a B, and every sixth video is a C.

So the A's are at 0,2,4,6,8,etc. B's are at 0,3,6,9,12,etc. And the C's are at 0,6,12,18,etc.

Obviously, you have collisions that you must resolve.

The way I did this is to build a min heap that contains the video type and its frequency, and its current position, which starts at frequency/2. So the heap contains: {A,2,1},{B,3,1},{C,6,3}.

To generate your list, remove the lowest item from the heap and add it to your list. Then, add its frequency to the current position, and put it back on the heap. So after the first time through, you've output the A, and your heap now contains: {B,3,1},{A,2,2},{C,6,3}.

Output the B, and then add it back, giving you {A,2,2},{C,6,3},{B,3,4}

You'll of course also want to keep a count of each item, which you decrement each time that item is output, and you don't add it back to the heap if the count goes to 0.

I wrote about this at some length in my blog about a year ago. See Evenly distributing items in a list.

As far as efficiency is concerned, the algorithm has complexity O(n log k), where n is the total number of videos, and k is the number of video types.

Solution 2:

Here's my algorithm that works for any number of types, not just 2:

  • Call a type (e.g. A, B, C, ...) T.
  • Call the number of items of type T N(T).

Algorithm in pseudocode:

var size = 0;
for_each (T)
  size += N(T);

var output = array(size); // Initialised to null, to mean gap (no item)
var gapsRemaining = size;

for_each (T)
{
  var itemsRemaining = N(T);
  var i = 0;
  var limit = itemsRemaining / gapsRemaining;
  while (itemsRemaining > 0)
  {
    if (itemsRemaining / (gapsRemaining - i) >= limit)
    { output[{i}th_gap] = next_item_of_type(T)
      gapsRemaining--;
      itemsRemaining--;
    }
    else
      i++;
  }
}

Where the {i}th_gap is zero-based, like the array indexes.

If you can work out {i}th_gap in constant time (which can be done by just using another counter), then the algorithm is linear time, i.e. O(size) O(size * numTypes).

For your example, it gives output a b a b a a b a.


Edit

Re-think: it doesn't need to be so complicated if you just maintain counts of each type.

Working JS code (http://js.do/code/96801)

var numItems = [5,3]; // for AAAAABBB
var numItems = [6,3,5]; // for AAAAAABBBCCCCC
var totalNumItems = 0;
for (i=0; i<numItems.length; i++)
    totalNumItems += numItems[i];
var limits = [];
for (i=0; i<numItems.length; i++)
    limits[i] = numItems[i] / totalNumItems;
var numGaps = totalNumItems;
var output = [];
for (i=0; i<totalNumItems; i++)
{   var bestValue = 0;
    var bestType;
    for (j=0; j<numItems.length; j++)
    {   var value = numItems[j] / numGaps - limits[j];
        if (value >= bestValue)
        {   bestValue = value;
            bestType = j;
    }   }
    output[i] = bestType;
    numItems[bestType]--;
    numGaps--;
}
for (i=0; i<totalNumItems; i++)
    document.writeln(output[i]);
document.writeln("<br>");

But as @Jim says, it's O(n * k), where n is totalNumItems and k is numItems.length. So his O(n log k) solution has better complexity.


Edit 2

Tweak to break ties better, so more frequent items are preferred. Previous code output for [10,1,1] was caaabaaaaaaa, now is abaaaaacaaaa.

http://js.do/code/96848

var numItems = [10,1,1];
var totalNumItems = 0;
for (i=0; i<numItems.length; i++)
    totalNumItems += numItems[i];
var limits = [];
for (i=0; i<numItems.length; i++)
    limits[i] = numItems[i] / totalNumItems;
var numGaps = totalNumItems;
var output = [];
for (i=0; i<totalNumItems; i++)
{   var bestValue = 0;
    var bestNumItems = 0;
    var bestType;
    for (j=0; j<numItems.length; j++)
    {   var value = numItems[j] / numGaps - limits[j];
        if (value >= bestValue && numItems[j] > bestNumItems)
        {   bestValue = value;
            bestNumItems = numItems[j];
            bestType = j;
    }   }
    output[i] = bestType;
    numItems[bestType]--;
    numGaps--;
}
for (i=0; i<totalNumItems; i++)
    document.writeln(output[i]);
document.writeln("<br>");

Solution 3:

The question doesn't seem so easy as the number of combinations is large. If I am right, for Na, Nb and Nc videos of three types, there are (Na+Nb+Nc-1)!/Na!Nb!Nc! possibilities. (The -1 in the numerator coming from the fact that sequences that are a cyclic permutation of each other are considered identical.)

Not having a clear understanding of the combinatorial structure, I would try as follows:

  • define a merit figure that rates how well a playlist is dispersed. That could be the sum of (cyclic) distances between videos from the same set.

For example

A1, B1, A2, B2, A3, B3, A4, A5

gives

2+2+2+2+2+4+1+1 = 16

and

A1, B1, A2, A3, B2, A4, B4, A5

gives

2+3+1+2+2+2+3+1 = 16

(this is probably an unsufficient metric, short distance should be more penalized.)

  • try random sequences by choosing among the available types until they are exhausted, and rate the sequences. After a number of random trials, keep the best score. [I would simulate drawing without replacement so that the different types are consumed in a balanced way.]

For small N, exhaustive trials are possible.

Update:

Suprizingly, the simple metric that I suggested gives a constant value !

Solution 4:

Lets say you have A1, A2, ..., An and B1, B2, ..., Bm.

If n>m, then at least 2 A items will be played one after another (if the playlist is cyclic [keeps repeating all]).

You should place the A items on a cycle first. Then place one B item in between every two consecutive A items. This will separate the next-to-next A items. And then place the remaining B items if there is any left.

If you want to make sure that the first and the last items won't be both A, then place an A item at the beginning and place a B item at the end if there are enough B items.

As a computational algorithm, assign numbers (in double type to allow them to be rational numbers) to each A item and order these numbers from smallest to the biggest. Then assign each B item the average of consecutive A items. Ex:

A1=3 A2=5 A3=10

ArrayA(0)=3

ArrayA(1)=5

ArrayA(2)=10

Then let's say you have 4 B items.

 On Error Resume Next

 For n=0 to 3

 ArrayB(n)=(ArrayA(n)+ArrayA(n+1))/2

 Loop

This loop will try to call ArrayA(3) and give an error, we will skip that with on error resume next. You can then assign random numbers to unassigned B items.

At the end, combine two arrays, sort them. You will get sorted numbers. By those numbers, call back the items in an optimally sorted arrangement.