Number of classes of k-digit strings when digit order and identity doesn't matter

Suppose we look at $k$-digit strings with digits between $1$ and $n$. How many distinct classes of strings are there when digit identity and order doesn't matter?

More formally, what is the number of equivalence classes over strings $d_1,\ldots,d_k$ with $d_i\in \{1,\ldots, n\}$ where strings $d,\hat{d}$ are considered equivalent iff there exist bijections $\sigma$ over $\{1,\ldots, k\} $ and $\delta$ over $\{1,\ldots, n\}$ such that the following holds for all $j \in \{1,\ldots,k\}$

$$d_{\sigma(j)}=\delta(\hat{d}_j)$$

Edit: this seems to correspond to the number of ways to partition integer $k$ into at most $n$ parts


Solution 1:

If I've read this right, you're looking to count k-tuples of integers in {1,..,n} that are inequivalent under the operations (a) permute the coordinates and (b) permute the integers.

So, for example, when k=7 and n=2, 1122212 would be equivalent to 2211121, 1112222 and 1111222 (and many others).

I believe it would be useful to construct a canonical form for each class: In this case, we can take the lexicographic first element in the equivalence class.

So these would be in canonical form:

1111222
1111122
1111112
1111111

whereas 1112222 would not be since it belongs to the same class as 1111222.

These canonical forms are equivalent to ordered partitions of k into n parts (some of which can be zero). For example 1111222 <-> 4+3 since there are four 1's and three 2's. [If instead n=3, then 1111222 <-> 4+3+0 since there are no 3's in the string.]

The number of these is the number $p_n(k)$ of partitions of k into at most n parts (afterwards, we can append the zeroes to achieve n parts).

We can compute $p_n(k)$ using the recurrence relation $p_n(k)=p_n(k-n)+p_{n-1}(k-1)$ (with some appropriate boundary conditions). [see: Enumerative combinatorics, Volume 1 By Richard P. Stanley p. 28]

Solution 2:

I would like to contribute an experimental confirmation of the above result that we have the number $p_n(k)$ of partitions of $k$ into at most $n$ parts.

This is done by a particularly simple form of Power Group Enumeration as presented by e.g. Harary and (in a different publication) Fripertinger. The present case does not involve generating functions and can in fact be done with Burnside's lemma. The scenario is that we have two groups permuting the slots and the elements being placed into these slots. We have $k$ slots and the symmetric group $S_k$ acting on them. Furthermore $n$ digits go into the slots and these digits have the symmetric group $S_n$ acting on them. We can compute the number of configurations by Burnside's lemma which says to average the number of assignments fixed by the elements of the power group, which has $n!\times k!$ elements. (The cycle indices have lower complexity.) But this number is easy to compute. Suppose we have a permutation $\alpha$ from $S_k$ and a permutation $\beta$ from $S_n.$ If we place the appropriate number of complete, directed and consecutive copies of a cycle from $\beta$ on a cycle from $\alpha$ then this assignment is fixed under the power group action for $(\alpha,\beta)$, and this is possible iff the length of the cycle from $\beta$ divides the length of the cycle from $\alpha$. The process yields as many assignments as the length of the cycle from $\beta.$

This yields the following extremely straightforward Maple program.

pet_cycleind_symm :=
proc(n)
        option remember;

        if n=0 then return 1; fi;

        expand(1/n*add(a[l]*pet_cycleind_symm(n-l), l=1..n));
end;

pet_flatten_term :=
proc(varp)
        local terml, d, cf, v;

        terml := [];

        cf := varp;
        for v in indets(varp) do
            d := degree(varp, v);
            terml := [op(terml), seq(v, k=1..d)];
            cf := cf/v^d;
        od;

        [cf, terml];
end;


strs :=
proc(k, n)
    option remember;
    local idx_coord, idx_digits, res, a, b,
    flat_a, flat_b, cyc_a, cyc_b, len_a, len_b, p, q;

    idx_coord := pet_cycleind_symm(k);
    idx_digits := pet_cycleind_symm(n);

    res := 0;

    for a in idx_coord do
        flat_a := pet_flatten_term(a);

        for b in idx_digits do
            flat_b := pet_flatten_term(b);

            p := 1;
            for cyc_a in flat_a[2] do
                len_a := op(1, cyc_a);
                q := 0;

                for cyc_b in flat_b[2] do
                    len_b := op(1, cyc_b);

                    if len_a mod len_b = 0 then
                        q := q + len_b;
                    fi;
                od;

                p := p*q;
            od;

            res := res + p*flat_a[1]*flat_b[1];
        od;
    od;

    res;
end;

This is some sample output from the program.

> seq(strs(k, 4), k=1..18);
    1, 2, 3, 5, 6, 9, 11, 15, 18, 23, 27, 34, 39, 47, 54, 64, 72, 84

> seq(strs(k, 5), k=1..18);
  1, 2, 3, 5, 7, 10, 13, 18, 23, 30, 37, 47, 57, 70, 84, 101, 119, 141

> seq(strs(k, 6), k=1..18);
  1, 2, 3, 5, 7, 11, 14, 20, 26, 35, 44, 58, 71, 90, 110, 136, 163, 199

Looking these up in the OEIS we can confirm that we indeed have the number $p_n(k)$ of partitions into at most $n$ parts. These are the sequences OEIS A001400, OEIS A001401 and OEIS A001402.

This MSE link points to some sophisticated power group enumeration examples.

Addendum Wed Oct 21 2015. Let me point out that the fact that we have partitions of $k$ into at most $n$ parts follows by inspection. The fact that the order does not matter creates multisets and the digit permutations render multisets with the same shape indistinguishable from one another, leaving partitions.

Addendum Sat Apr 21 2018. It is quite interesting to see PGE correctly reproduce the values from the partition function here. Nonetheless the algorithm admits of a considerable improvement, namely that there is no need to flatten the permutations because we can compute the contribution from the cycles of a pair $(\alpha, \beta)$ by multiplying the number of coverings of a cycle type from $\alpha$ by the number of instances of the cycle type from $\beta$, raising the result to the power of the number of instances of the cycle type from $\alpha.$ This yields the following code, where we have included a routine to verify the results using the partition function.

pet_cycleind_symm :=
proc(n)
option remember;

    if n=0 then return 1; fi;

    expand(1/n*add(a[l]*pet_cycleind_symm(n-l), l=1..n));
end;

strs :=
proc(k, n)
option remember;
local idx_coord, idx_digits, res, term_a, term_b,
    v_a, v_b, inst_a, inst_b, len_a, len_b, p, q;

    if k = 1 then
        idx_coord := [a[1]];
    else
        idx_coord := pet_cycleind_symm(k);
    fi;

    if n = 1 then
        idx_digits := [a[1]];
    else
        idx_digits := pet_cycleind_symm(n);
    fi;

    res := 0;

    for term_a in idx_coord do
        for term_b in idx_digits do
            p := 1;

            for v_a in indets(term_a) do
                len_a := op(1, v_a);
                inst_a := degree(term_a, v_a);

                q := 0;

                for v_b in indets(term_b) do
                    len_b := op(1, v_b);
                    inst_b := degree(term_b, v_b);

                    if len_a mod len_b = 0 then
                        q := q + len_b*inst_b;
                    fi;
                od;

                p := p*q^inst_a;
            od;

            res := res +
            lcoeff(term_a)*lcoeff(term_b)*p;
        od;
    od;

    res;
end;

part :=
proc(n, k)
option remember;
local res;

    if n=0 or k=0 then
        return `if`(n=0 and k=0, 1, 0);
    fi;

    if k=1 then return 1 fi;

    res := 0;
    if n >= 1 then
        res := res + part(n-1, k-1);
    fi;
    if n >= k then
        res := res + part(n-k, k);
    fi;

    res;
end;

strs_verif := (k, n) -> add(part(k, m), m=1..n);