Average Number of Blocks in a Partition Containing a Large Block

Notation

  • Let $[n]$ denote the set of integers from 1 up to $n$, inclusive.
  • Let $A_n$ denote the average number of blocks over all partitions of $[n]$.
  • Let $B_n$ denote the number of partitions of $[n]$ (the $n$th Bell number).

Question

For any given $n \geq 1$, how small can $m$ be (as a function of $n$) while still ensuring $$ A_{n-m} + 1 < A_n $$ or, equivalently (using the identity $A_n = \frac{B_{n+1}}{B_n} - 1$), $$ \frac{B_{n+1-m}}{B_{n-m}} < \frac{B_{n+1}}{B_n} - 1? $$
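The equivalence rests on the identity $A_n = B_{n+1}/B_n - 1$, which is easy to confirm exactly for small $n$. Here is a quick sketch in Python (the helper names are my own) using Stirling numbers of the second kind and exact rational arithmetic:

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind: partitions of [n] into exactly k blocks."""
    if n == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def bell(n):
    """Bell number B_n: total number of partitions of [n]."""
    return sum(stirling2(n, k) for k in range(n + 1))

def avg_blocks(n):
    """A_n: average number of blocks over all partitions of [n]."""
    return Fraction(sum(k * stirling2(n, k) for k in range(n + 1)), bell(n))

# Check A_n = B_{n+1}/B_n - 1 exactly for small n
for n in range(1, 30):
    assert avg_blocks(n) == Fraction(bell(n + 1), bell(n)) - 1
```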

Remarks

The left-hand side of the inequality comes from fixing a particular block of size $m$ and averaging over all partitions of $[n]$ containing that fixed block. The idea is that if a partition contains a single large block, the expected number of blocks must fall. For my purposes, smaller values of $m$ are better, but I would be interested to see anything.

The following reasoning led me to believe $m = \lceil n / A_n \rceil$ would be big enough. The average size of a block in a partition of $[n]$ is at most $\lceil n / A_n \rceil$. Having a single larger-than-average block should reduce the average number of blocks appearing in such a partition. Exploring with Maple shows $\lceil n / A_n \rceil$ is not quite big enough to establish the inequality, but $\lceil n / A_n \rceil + 1$ works for $n \leq 500$.
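The Maple experiment is easy to reproduce with exact rational arithmetic. The sketch below (Python; the helper names are my own) computes Bell numbers via the Bell triangle and searches for the first $n$ where the inequality fails for $m = \lceil n/A_n \rceil + \textrm{extra}$:

```python
import math
from fractions import Fraction

def bells(nmax):
    """Bell numbers B_0, ..., B_nmax via the Bell triangle."""
    B, row = [1], [1]
    for _ in range(nmax):
        new = [row[-1]]
        for x in row:
            new.append(new[-1] + x)
        row = new
        B.append(row[0])
    return B

def first_failure(nmax, extra):
    """Smallest n < nmax (with n - m >= 1) where A_{n-m} + 1 < A_n fails
    for m = ceil(n / A_n) + extra, or None if it always holds."""
    B = bells(nmax + 1)
    A = lambda i: Fraction(B[i + 1], B[i]) - 1  # A_i = B_{i+1}/B_i - 1
    for n in range(2, nmax):
        m = math.ceil(n / A(n)) + extra
        if n - m >= 1 and not (A(n - m) + 1 < A(n)):
            return n
    return None

print(first_failure(200, 1))  # the "+1" version: holds throughout this range
print(first_failure(200, 0))  # where (if anywhere) the bare ceiling fails
```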

I keep mentioning averages because I'm hoping there is a quick proof of the kind I attempted in the preceding paragraph. If that vantage point is not helpful, there is the alternative statement of the inequality in terms of Bell numbers. This expression can possibly be analyzed asymptotically (though the Moser-Wyman expansion fails here), but I was rather hoping to avoid it. Still, any suggestions in this direction are welcome.


Solution 1:

We can think of a partition of $[n+1]$ as formed by taking a parent partition of $[n]$ and either adding $n+1$ to an existing block or placing it in a new block of its own (so each parent with $k$ blocks has $k$ children with $k$ blocks and $1$ child with $k+1$ blocks; this is the same coupling used in Mike Spivey's answer to this question). If we let $X_n$ denote the number of blocks in a uniformly random partition of $[n]$ (so $A_n = E(X_n)$), this gives

\begin{eqnarray*} A_{n+1} &=& \frac{\sum_k (k^2+(k+1)) P(X_n=k)}{\sum_k (k+1) P(X_n=k)} \\ &=& \frac{E(X_n^2)+ A_n+1}{A_n+1} \\ &\geq& \frac{A_n^2+A_n+1}{A_n+1} \textrm{ (since } E(X_n^2) \geq (E(X_n))^2 \textrm{)}\\ &=& A_n+\frac{1}{A_n+1} \end{eqnarray*}
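Both the first equality and the final bound can be checked exactly for small $n$, using the exact probabilities $P(X_n = k) = S(n,k)/B_n$. A Python sketch (helper names are my own):

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """S(n, k): partitions of [n] into exactly k blocks."""
    if n == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def moment(n, p):
    """E(X_n^p), where X_n is the block count of a uniform random partition of [n]."""
    bell_n = sum(stirling2(n, k) for k in range(n + 1))
    return Fraction(sum(k**p * stirling2(n, k) for k in range(n + 1)), bell_n)

for n in range(1, 25):
    A_n, A_next = moment(n, 1), moment(n + 1, 1)
    assert A_next == (moment(n, 2) + A_n + 1) / (A_n + 1)  # the exact recursion
    assert A_next >= A_n + 1 / (A_n + 1)                   # the Jensen-type bound
```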

So inductively we have $$A_{n}-A_{n-m} \geq \sum_{j=n-m}^{n-1} \frac{1}{A_j+1} \geq \frac{m}{A_{n-1}+1}$$ since $A_j$ is non-decreasing (as was shown in the linked question). It follows that we can take $m=\lceil A_{n-1}+1 \rceil$.
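This choice of $m$ is easy to sanity-check with exact arithmetic. A short Python sketch (`bells` builds the Bell triangle and is a name of my own):

```python
import math
from fractions import Fraction

def bells(nmax):
    """Bell numbers B_0, ..., B_nmax via the Bell triangle."""
    B, row = [1], [1]
    for _ in range(nmax):
        new = [row[-1]]
        for x in row:
            new.append(new[-1] + x)
        row = new
        B.append(row[0])
    return B

B = bells(100)
A = lambda n: Fraction(B[n + 1], B[n]) - 1  # A_n = B_{n+1}/B_n - 1

# m = ceil(A_{n-1} + 1) gives A_{n-m} + 1 < A_n whenever n - m >= 1
for n in range(2, 99):
    m = math.ceil(A(n - 1) + 1)
    if n - m >= 1:
        assert A(n - m) + 1 < A(n)
```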


As was noted in the comments, this bound is far from tight. If you're interested in asymptotic bounds, then one possibility may be to keep the term I dropped, writing $$A_{n+1}=A_n+\frac{1}{A_n+1}+\frac{\operatorname{Var}(X_n)}{A_n+1}.$$ Both $A_n$ and the variance seem to be well studied -- Theorem 8.60 of Toufik Mansour's recent book (referencing Theorem VIII.41 in Flajolet and Sedgewick's Analytic Combinatorics) gives their values as $\frac{n}{\log n}(1+o(1))$ and $\frac{n}{\log^2 n}(1+o(1))$ respectively, which would give you $m=\log n$ asymptotically.