What is the number of bijections between two multisets?

Let $P$ and $Q$ be two finite multisets of the same cardinality $n$.

Question: How many bijections are there from $P$ to $Q$?

I will define a bijection between $P$ and $Q$ as a multiset $\Phi \subseteq P \times Q$ satisfying the properties:

  • $\Phi=\{(p_1,q_1),(p_2,q_2),\ldots,(p_n,q_n)\}$,
  • $P=\{p_1,p_2,\ldots,p_n\}$, and
  • $Q=\{q_1,q_2,\ldots,q_n\}$.

For example, suppose $P=\{1,1,1,3\}$ and $Q=\{1,1,2,2\}$. Then there are two bijections between these multisets: $\{(1,1),(1,1),(1,2),(3,2)\}$ and $\{(1,1),(3,1),(1,2),(1,2)\}$.

There seems to be two natural ways we could attempt to do this:

  • Use $S_{|P|}$ to act on $P$.
  • Use $S_{|P|} \times S_{|Q|}$ to act on $P \times Q$.

In either case, the action induces an action on the set of bijections from $P$ to $Q$. However, it seems difficult to find the size of the stabiliser subgroup (in either case).

[Motivation: A formula for the number of bijections would help me with a program I'm writing, in which I'm summing over all bijections from $P$ to $Q$ (which happen to be number partitions). I hope to use symmetry breaking to reduce the redundant iterations.]


Addendum (motivated by hardmath's comment, and so I can add the [graph-theory] tag):

A rephrasing of this question is:

Question: What is the number of bipartite multigraphs, with vertex bipartition $V_1 \cup V_2$, whose vertices have prescribed degrees?

For the above question to be non-zero, we require $\sum_{v \in V_1} \mathrm{deg}(v)=\sum_{v \in V_2} \mathrm{deg}(v)$, where $\mathrm{deg}(v)$ is the degree of vertex $v$,counting multi-edges with their multiplicity.

In the above example, $V_1=\{1,3\}$, with degrees $3$ and $1$, respectively, and $V_2=\{1',2'\}$, with degrees $2$ and $2$, respectively. The two bipartite multigraphs have edge multisets $\{\{1,1'\},\{1,1'\},\{1,2'\},\{3,2'\}\}$ and $\{\{1,1'\},\{3,1'\},\{1,2'\},\{1,2'\}\}$.


Solution 1:

Here's a sketch of how one can uniquely traverse ("search") the bijections between two multisets of equal cardinality, say $P$ and $Q$ both of size $n$.

Consider the partitions of $n$ represented by both multisets. In the given example, $P$ represents $1+3 = 4$, and $Q$ represents $2+2 = 4$.

Now any equal summands can be swapped, so with $Q$ of the example, transposing the roles of $q_1 = 1$ and $q_2 = 2$ gives distinct bijections unless a bijection treats them identically. This initially sounds like something that would be hard to get a handle on, but as we shall see it can be managed as an inner step of the search algorithm.

Let's focus on counting the bipartite graphs with multiple edges on $U \cup V$ where $U$ the underlying set of $P$ and $V$ the underlying set of $Q$ are assumed disjoint, and where vertex degrees counting multiplicity of edges agree as to the multiplicity of items in $P$ and $Q$ resp.

That is, let $a_1 \leq a_2 ... \leq a_m$ be the summands of $n$ in the partition represented by $P$, and let $b_1 \leq b_2 ... \leq b_k$ be the respective summands for $Q$. We ask for the number $T((a_1,a_2,...,a_m),(b_1,b_2,...,b_k))$ of bipartite graphs with multiple edges that realize the prescribed vertex degrees on $U \cup V$.

To efficiently compute $T(A,B)$ and construct these distinct graphs, it will be helpful to refer to yet another representation, namely the $m \times k$ biadjacency matrix of the graph, whose rows correspond to the elements of $U$ and columns to the elements of $V$, ordered consistently as to their respective summands. Note that unlike the 0/1 entries of an adjacency matrix of a simple graph, here our entries are nonnegative entries such that the rows sums $a_i$ and column sums $b_j$ are as prescribed.

The idea is to decide the largest row sum's distribution $a_m$ into row $m$, consistent with the available columns sums. Having done so the column sums are adjusted (and row sum $a_m$ is dropped), and one then proceeds with recursive computation of $T(A',B')$. Note that $T(A,B) = T(B,A)$ is symmetric, and that at each stage it is preferable to work with whichever tuple has shortest length. In the recursion, we will drop one entry of A, but we might drop more than one entry of B due to decrementing to zero some column sums.

When we eventually reach a single row sum, that distribution is forced as the remaining column sums are but single entries that must by construction agree as to the row sum required. At this point the biadjacency matrix has been populated fully, and it remains only to see if some rows or columns corresponding to equal summands in the original partitions may be permuted in a way that gives distinct solutions.

Let's illustrate with the given example, which asks us to find ways to populate a $2 \times 2$ biadjacency matrix such that the row sums are $1,3$ and the columns sums are $2,2$. It turns out this can be done with nonnegative integers in only one way:

$$\begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix} $$

So here there is only one bipartite graph (with multiple edges) having the required vertex degrees, but since the two columns having equal sums are distinct, these can therefore be swapped to give two multiset bijections.

This recursive idea was applied to counting simple bipartite graphs by A. Guénoche (1990):

Counting and selecting at random bipartite graphs with fixed degrees

Addendum:

The approach sketched above simplifies. Instead of slavishly constructing exactly the nonisomorphic bipartite graphs (with multiple edges) having specified vertex degrees, it is simpler and more natural just to choose rows in the manner outlined without worrying about their identity (or the identity/nonidentity of columns having equal column sums). This avoids a need for post-facto processing with row and column permutations.

I kept having a nagging suspicion that this was the case even as I wrote it up, but the epiphany came only after posting. The simpler approach finds the nonisomorphic bipartite graphs with labelled vertices, and so identifies the multiset bijections that are wanted.

Moreover this clears the way to recognize that what we're counting are commonly called (at least in statistics) contingency tables, i.e. rectangular arrays of nonnegative integers whose row sums and column sums are specified in a compatible way, $\sum a_i = n = \sum b_j$. Counting them, given m row and k column sums, seems to be a difficult problem if high "degrees of freedom" are involved.

Barvinok (2008) gives bounds for the number of contingency tables and survey the literature:

Asymptotic Estimates for the Number of Contingency Tables, Integer Flows, and Volumes of Transportation Polytopes

but the ratio of upper to lower bounds is some positive power of $n^{m+k}$ in terms of our notation.