How can I interleave or create unique permutations of two strings (without recursion)

The question is to print all possible interleavings of two given strings. So I wrote a working code in Python which runs like this:

def inter(arr1,arr2,p1,p2,arr):
    thisarr = copy(arr)
    if p1 == len(arr1) and p2 == len(arr2):
        printarr(thisarr)
    elif p1 == len(arr1):
        thisarr.extend(arr2[p2:])
        printarr(thisarr)
    elif p2 == len(arr2):
        thisarr.extend(arr1[p1:])
        printarr(thisarr)
    else:
        thisarr.append(arr1[p1])
        inter(arr1,arr2,p1+1,p2,thisarr)
        del thisarr[-1]
        thisarr.append(arr2[p2])
        inter(arr1,arr2,p1,p2+1,thisarr)
    return

It comes at each point in a string, then for one recursive call, it considers the current element as belonging to the first array, and in the next call, belonging to the other array. So if the input strings are ab and cd, it prints out abcd, acbd, cdab, cabd, etc. p1 and p2 are pointers to the arrays (Because Python strings are immutable, I am using arrays!). Can anybody tell me, what is this code's complexity, and if it can be improved or not? I have written a similar code to print all combinations of length k from a given array:

def kcomb(arr,i,thisarr,k):
     thisarr = copy(thisarr)
     j,n = len(thisarr),len(arr)
     if n-i<k-j or j >k:
        return
     if j==k:
        printarr(thisarr)
        return
     if i == n:
         return
     thisarr.append(arr[i])
     kcomb(arr,i+1,thisarr,k)
     del thisarr[-1]
     kcomb(arr,i+1,thisarr,k)
     return

This too, works on the same principle. So in general, how to find the complexity of such functions, and how do I optimize them? Is it possible to do these by DP? Sample input-output for the first problem:

>>> arr1 = ['a','b','c']
>>> arr2 = ['d','e','f']
>>> inter(arr1,arr2,0,0,[])
abcdef
abdcef
abdecf
abdefc
adbcef
adbecf
adbefc
adebcf
adebfc
adefbc
dabcef
dabecf
dabefc
daebcf
daebfc
daefbc
deabcf
deabfc
deafbc
defabc

Solution 1:

Your problem can be reduced to that of creating all unique permutations of a particular list. Say A and B are the lengths of the strings arr1 and arr2, respectively. Then construct a list like this:

[0] * A + [1] * B

There exists a one-to-one correspondence (a bijection) from the unique permutations of this list to all the possible interleavings of the two strings arr1 and arr2. The idea is to let each value of the permutation specify which string to take the next character from. Here is an example implementation showing how to construct an interleaving from a permutation:

>>> def make_interleave(arr1, arr2, permutation):
...     iters = [iter(arr1), iter(arr2)]
...     return "".join(iters[i].next() for i in permutation)
... 
>>> make_interleave("ab", "cde", [1, 0, 0, 1, 1])
'cabde'

I found this question in the python mailing list which asks how to solve this problem in an efficient manner. The answers suggest using an algorithm which is described in Knuth's The Art of Computer Programming, Volume 4, Fascicle 2: Generating All Permutations. I found an online pdf of the draft here. The algorithm is also described in this wikipedia article.

Here's my own annotated implementation of the next_permutation algorithm, as a python generator function.

def unique_permutations(seq):
    """
    Yield only unique permutations of seq in an efficient way.

    A python implementation of Knuth's "Algorithm L", also known from the 
    std::next_permutation function of C++, and as the permutation algorithm 
    of Narayana Pandita.
    """

    # Precalculate the indices we'll be iterating over for speed
    i_indices = list(range(len(seq) - 1, -1, -1))
    k_indices = i_indices[1:]

    # The algorithm specifies to start with a sorted version
    seq = sorted(seq)

    while True:
        yield seq

        # Working backwards from the last-but-one index,           k
        # we find the index of the first decrease in value.  0 0 1 0 1 1 1 0
        for k in k_indices:
            if seq[k] < seq[k + 1]:
                break
        else:
            # Introducing the slightly unknown python for-else syntax:
            # else is executed only if the break statement was never reached.
            # If this is the case, seq is weakly decreasing, and we're done.
            return

        # Get item from sequence only once, for speed
        k_val = seq[k]

        # Working backwards starting with the last item,           k     i
        # find the first one greater than the one at k       0 0 1 0 1 1 1 0
        for i in i_indices:
            if k_val < seq[i]:
                break

        # Swap them in the most efficient way
        (seq[k], seq[i]) = (seq[i], seq[k])                #       k     i
                                                           # 0 0 1 1 1 1 0 0

        # Reverse the part after but not                           k
        # including k, also efficiently.                     0 0 1 1 0 0 1 1
        seq[k + 1:] = seq[-1:k:-1]

Each yield of the algorithm has an amortized complexity of O(1), according to this question, but according to rici who commented below, this is only the case if all numbers are unique, which they are definately not in this case.

In any case, the number of yields provides a lower bound for the time complexity, and it is given by

(A + B)! / (A! * B!)

Then to find the real time complexity we need to sum the average complexity of each yield with the complexity of constructing the resulting string based on the permutation. If we multiply this sum with the above formula we get the total time complexity.

Solution 2:

okay, after some work, and usign advice from other answers. mainly lazyr. (and have now converted it into a class) the __all_perms is from: https://stackoverflow.com/a/104436/1561176

class Interleave():

    def __init__(self, A, B):
        self.A = A
        self.B = B
        self.results = list(self.__interleave())

    # from https://stackoverflow.com/a/104436/1561176
    def __all_perms(self, elements):
        if len(elements) <=1:
            yield elements
        else:
            for perm in self.__all_perms(elements[1:]):
                for i in range(len(elements)):
                    #nb elements[0:1] works in both string and list contexts
                    yield perm[:i] + elements[0:1] + perm[i:]

    def __sequences(self):
        return list( sorted( set(
            ["".join(x) for x in self.__all_perms(['a'] * len(self.A) + ['b'] * len(self.B))] ) ) )

    def __interleave(self):
        for sequence in self.__sequences():
            result = ""
            a = 0
            b = 0
            for item in sequence:
                if item == 'a':
                    result+=self.A[a]
                    a+=1
                else:
                    result+=self.B[b]
                    b+=1
            yield result

    def __str__(self):
        return str(self.results)

    def __repr__(self):
        return repr(self.results)

and here is the usage:

>>> A = ['a', 'b', 'c']
>>> B = ['d', 'e', 'f']
>>> Interleave(A, B)
['abcdef', 'abdcef', 'abdecf', 'abdefc', 'adbcef', 'adbecf', 'adbefc', 'adebcf', 'adebfc', 'adefbc', 'dabcef', 'dabecf', 'dabefc', 'daebcf', 'daebfc', 'daefbc', 'deabcf', 'deabfc', 'deafbc', 'defabc']

also, you can access the Class members like:

>>> inter = Interleave(A, B)
>>> inter.results
['abcdef', 'abdcef', 'abdecf', 'abdefc', 'adbcef', 'adbecf', 'adbefc', 'adebcf', 'adebfc', 'adefbc', 'dabcef', 'dabecf', 'dabefc', 'daebcf', 'daebfc', 'daefbc', 'deabcf', 'deabfc', 'deafbc', 'defabc']
>>> inter.A
['a', 'b', 'c']
>>> inter.B
['d', 'e', 'f']