Blocking directed paths on a DAG with a linear number of vertex defects.
Let $G=(V,E)$ be a directed acyclic graph. Denote by $\Gamma$ the set of all directed paths in $G$. Given a subset $W\subseteq V$, let $\Gamma_W\subseteq \Gamma$ be the set of all paths $\gamma\in\Gamma$ supported on $V\backslash W$ (i.e., all vertices of $\gamma$ belong to $V\backslash W$). Now define $l(W)$ by: $$l(W)=\max_{\gamma\in \Gamma_W} |\gamma|,$$ where $|\gamma|$ is the number of vertices in $\gamma$.
I want to prove (or disprove) the following claim:
${\bf Claim:}$ For every $\epsilon>0$ and every $k>0$, there are constants $L$ and $N$ such that for any directed acyclic graph $G=(V, E)$ satisfying $|V|>N$ with the sum of incoming and outgoing degrees bounded by $k$, there exists a subset $W\subseteq V$ such that $\frac{|W|}{|V|}<\epsilon$ and $l(W)<L$.
The claim is true for directed trees (see Edit 1 for a proof), but the same proof idea fails for more general DAGs. Moreover, the statement becomes false if we drop the bounded-degree requirement on $G$: in the complete DAG on vertices $1,\dots,n$ (with an edge $i\to j$ whenever $i<j$), the vertices surviving the removal of any $W$ still carry a directed path through all of them, so $l(W)\geq (1-\epsilon)n$ for any $W$ of size at most $\epsilon n$, and no set of size linear in $n$ can block the graph.
Any direction or idea would be welcome.
Edit 1:
For trees, a standard proof goes like this: for $0\leq i\leq L-1$, define $W_L^i$ to be the set of all vertices reachable from the root by a directed path of length congruent to $i \pmod L$. Since the graph is a tree, such a path is unique for every vertex, so for a given $L$ the sets $\{W_L^i\}_{0\leq i\leq L-1}$ partition $V$. Therefore, choosing $L=\lceil\frac{1}{\epsilon}\rceil$, there is some $i_0$ such that $|W_L^{i_0}|$ is at most $\frac{|V|}{L}\leq\epsilon |V|$. It remains to show that every $W_L^i$ is indeed $L$-blocking, but this is immediate: every step along a directed path down the tree increases the distance from the root by exactly $1$, so the longest path containing no vertices from $W_L^i$ has at most $L-1$ vertices (it connects two adjacent "floors" of $W_L^i$).
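For concreteness, here is a minimal sketch of this depth-class argument in Python (the input format, a `children` dict for a tree rooted at `root`, is an assumption for illustration):

```python
import math
from collections import deque

def tree_blocking_set(children, root, eps):
    """Return (W, L) with L = ceil(1/eps): W is the least-populous depth
    class mod L, so |W| <= n/L <= eps*n, and every directed path avoiding
    W has at most L-1 vertices (depths along a path are consecutive).

    `children[v]` lists the children of v (edges point away from the root).
    """
    L = math.ceil(1 / eps)
    depth_class = {}                      # vertex -> depth mod L
    queue = deque([(root, 0)])
    while queue:
        v, d = queue.popleft()
        depth_class[v] = d % L
        for c in children.get(v, []):
            queue.append((c, d + 1))
    # W_L^i = vertices whose depth is congruent to i (mod L)
    classes = [[v for v, r in depth_class.items() if r == i] for i in range(L)]
    return set(min(classes, key=len)), L
```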
Edit 2:
In general, the claim is true for any DAG in the special case $\epsilon = \frac{2k}{2k+1}$ and $L=1$. To see this, consider the following algorithm (a code sketch follows below):
1. Choose a vertex $v$ in the graph that still has neighbours. Keep $v$, and remove all of its neighbours (in both directions) from the graph.
2. If any non-isolated vertex is left, go back to step 1. Otherwise, stop.
The resulting graph is completely disconnected ($L=1$), and we removed at most an $\epsilon=\frac{2k}{2k+1}$ fraction of the vertices: each kept vertex accounts for at most $2k$ removed neighbours, so the removed vertices number at most $2k$ times the kept ones. The claim follows.
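A minimal sketch of this greedy procedure (the edge-list input with vertices $0,\dots,n-1$ is an assumption for illustration):

```python
def greedy_disconnect(n, edges):
    """Edit 2's procedure: keep a vertex that still has live neighbours and
    delete all of its in- and out-neighbours; repeat until no edges remain
    among the survivors. Each kept vertex removes at most 2k neighbours,
    so |removed| <= 2k/(2k+1) * n when total degrees are bounded by k."""
    nbrs = [set() for _ in range(n)]      # in- and out-neighbours together
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    alive = set(range(n))
    removed = set()
    progress = True
    while progress:
        progress = False
        for v in list(alive):
            if v not in alive:
                continue
            live = nbrs[v] & alive
            if live:
                alive -= live             # keep v, remove its neighbours
                removed |= live
                progress = True
    return removed                        # survivors induce an edgeless graph
```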
Edit 3:
As Misha Lavrov showed, the previous bound can be tightened, and the claim holds for $\epsilon=\frac{k}{k+1}$. I discovered that this bound is not tight when the DAG has total degree bounded by $3$. In this case, I will prove the claim for any $\epsilon>\frac{1}{2}$, whereas the previous bound only gives $\epsilon=\frac{2}{3}$. Denote the in-degree and out-degree of a vertex $v$ in $G$ by $in(v)$ and $out(v)$ respectively. By assumption, $in(v)+out(v)\leq 3$ for all $v \in V$. Define four sets $\{V_i\}_{i=0}^3$ by: $$V_i=\{v\in V \mid in(v)=i\}.$$
Clearly, $\{V_i\}_{i=0}^3$ forms a partition of $V$, so $|V_1|+|V_2|\leq n$ and at least one of the sets $V_1$, $V_2$ has cardinality at most $\frac{n}{2}$. W.l.o.g. assume it is $V_1$ (the other case is symmetric, with the roles of in- and out-degrees exchanged). Let $G'$ be the subgraph of $G$ induced by $V_2$. For every $v\in V_2$ we have $out(v)\leq 1$ (since $in(v)=2$ and the total degree is at most $3$), so $G'$ is a disjoint union of directed in-trees, each with a single sink. Using the proof for trees, for any $\epsilon>0$ we can find a subset $W\subseteq V_2$ such that $\frac{|W|}{|V_2|}<\epsilon$ and $W$ is $\frac{1}{\epsilon}$-blocking in $G'$. Finally, define $W'= V_1 \cup W$. On one hand, $|W'|$ is at most $(\frac{1}{2}+\epsilon)n$; on the other hand, $W'$ is $(\frac{1}{\epsilon}+2)$-blocking: a directed path in $G$ either stays in $V_2$, where it is blocked by $W$, or it leaves $V_2$, in which case it is blocked by $V_1\subseteq W'$, apart from possibly one initial step out of $V_0$ (whose vertices have in-degree $0$) and one final step into $V_3$ (whose vertices have out-degree $0$, since $in(v)=3$).
This proves the claim.
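For concreteness, a sketch of this construction in Python. The edge-list input with vertices $1,\dots,N$ is an assumption for illustration, and the w.l.o.g. branch is asserted rather than handled:

```python
import math

def edit3_blocking(N, edges, eps):
    """Edit 3 for total degree <= 3: block with W' = V_1 ∪ W, where W is a
    depth-class blocking set in the forest of in-trees induced on V_2.
    Assumes V_1 is the smaller of V_1, V_2 (the other case is symmetric)."""
    indeg = [0] * (N + 1)
    for u, v in edges:
        indeg[v] += 1
    V1 = {v for v in range(1, N + 1) if indeg[v] == 1}
    V2 = {v for v in range(1, N + 1) if indeg[v] == 2}
    assert len(V1) <= len(V2), "w.l.o.g. case; otherwise argue symmetrically"
    # induced graph on V2 has out-degree <= 1: record the unique out-edge
    nxt = {v: None for v in V2}
    for u, v in edges:
        if u in V2 and v in V2:
            nxt[u] = v
    L = math.ceil(1 / eps)
    depth = {}
    def d(v):                       # distance from v to the sink of its in-tree
        if v not in depth:
            depth[v] = 0 if nxt[v] is None else 1 + d(nxt[v])
        return depth[v]
    # a path inside V2 decreases depth by exactly 1 each step, so avoiding
    # one residue class mod L limits it to at most L-1 vertices
    classes = [[v for v in V2 if d(v) % L == i] for i in range(L)]
    return V1 | set(min(classes, key=len))
```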
PS. Crossposted at MO.
Solution 1:
I know it is bad style, but I have no time for proper typing now (maybe I'll do it later), so here is a set of handwritten notes with a counterexample. Questions are welcome, as usual :)
Solution 2:
The following is a writeup of Fedja's solution (with some details added).
$\Large{\text{The Problem}}$
For every $\epsilon > 0$ and $d \ge 1$, are there $L,N$ so that for any directed acyclic graph with at least $N$ vertices and max-degree at most $d$, one can remove at most an $\epsilon$ fraction of the vertices to destroy all paths of length $L$?
Here, "max-degree" is the maximum over all the out-degrees and the in-degrees of the vertices.
$\Large{\text{The Counterexample}}$
Let $k = 10^{100}$. We refute the claim for $\epsilon := 10^{-5}$ and $d := 2k/\epsilon = 2\cdot 10^{105}$.
Take $L \ge 1$. We shall find arbitrarily large directed acyclic graphs with max-degree at most $d$ such that removing any $\epsilon N$ vertices does not destroy all paths of length $L$.
Take $N$ a large power of $2$. Put a random graph on vertices $1,2,\dots,N$ by including the directed edge $u \to v$, for $u < v$, with probability $\frac{\alpha}{v-u}$, where $\alpha$ satisfies $\sum_{v=2}^N \frac{\alpha}{v-1} = k$. Note $\alpha \approx \frac{k}{\log N}$ (so indeed $\frac{\alpha}{v-u} \in [0,1]$). Call the formed random graph $\mathcal{G}_N$.
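A sketch of this random model in Python, for small toy parameters only (the construction is meaningful numerically for modest $N$ and $k$, not the astronomical constants above):

```python
import random

def sample_G(N, k, rng=random.Random(0)):
    """Sample G_N: include edge u -> v (u < v) independently with probability
    alpha/(v-u), where alpha is set by sum_{v=2}^N alpha/(v-1) = k.
    Requires k <= H_{N-1} so that alpha <= 1 and these are probabilities."""
    H = sum(1.0 / d for d in range(1, N))   # harmonic number H_{N-1}
    alpha = k / H                           # alpha ~ k / log N
    return [(u, v)
            for u in range(1, N)
            for v in range(u + 1, N + 1)
            if rng.random() < alpha / (v - u)]
```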
Note the expected number of edges in the graph is $\sum_{u=1}^N \sum_{v = u+1}^N \frac{\alpha}{v-u} \approx \alpha N\log(N) \approx kN$. So with probability at least $\frac{1}{3}$, say, the number of edges in our random graph is at most $2kN$. Assume that our random graph has at most $2kN$ edges. Remove all vertices whose in-degree is at least $\frac{k}{\epsilon}$, and all vertices whose out-degree is at least $\frac{k}{\epsilon}$. We removed at most $4\epsilon N$ vertices, since each of the two degree conditions can be met by at most $2kN/(k/\epsilon) = 2\epsilon N$ vertices.
The remaining (random) graph, denoted $\overline{\mathcal{G}}_N$, is such that every vertex has in-degree plus out-degree bounded by $\frac{2k}{\epsilon} = d$. We will prove that, with probability $1-o_{N \to \infty}(1)$ (conditioned on the probability-at-least-$1/3$ event that the initial graph has at most $2kN$ edges), we cannot remove $\epsilon N$ vertices to block all paths of length $L$.
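The degree-pruning step, as a sketch (same hypothetical edge-list format as above):

```python
def prune_popular(N, edges, k, eps):
    """Delete every vertex with in-degree or out-degree >= k/eps. If the
    graph has at most 2kN edges, at most 4*eps*N vertices are deleted
    (at most 2kN/(k/eps) = 2*eps*N for each of the two degree types)."""
    indeg = [0] * (N + 1)
    outdeg = [0] * (N + 1)
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
    popular = {v for v in range(1, N + 1)
               if indeg[v] >= k / eps or outdeg[v] >= k / eps}
    kept = [(u, v) for u, v in edges if u not in popular and v not in popular]
    return popular, kept
```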
$\Large{\text{Proof that the Counterexample Works}}$
For the proof that the counterexample works, we may keep the vertices with in-degree or out-degree at least $\frac{k}{\epsilon}$ (call them, for the moment, $\textit{popular}$ vertices), i.e., we may work with $\mathcal{G}_N$ instead of $\overline{\mathcal{G}}_N$. Indeed, if we can block all paths of length $L$ by removing $\epsilon N$ vertices in the graph with the popular vertices removed, then we can block all paths of length $L$ by removing $5\epsilon N$ vertices in the graph with the popular vertices kept (remove the at most $4\epsilon N$ popular vertices as well). Let $c = 10^{-3}$; note $c > 30\epsilon$.
$\textbf{Definition}$: Call a graph \textit{$L$-tame} if its vertices can be partitioned into $L$ sets $V_0,\dots,V_{L-1}$ such that (1) the number of edges $u \to v$ with $u \in V_i, v \in V_j$ for $i \le j$ is at most $ckN$ and (2) for every $1 \le i \le L-1$ and every $u \in V_i$, there exists an edge $u \to v$ with $v \in V_{i-1}$.
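To make the definition concrete, here is a sketch of a checker for a candidate partition (the list-of-sets input covering all vertices is an assumption for illustration):

```python
def witnesses_L_tame(parts, edges, c, k, N):
    """Check whether parts = [V_0, ..., V_{L-1}] witnesses L-tameness:
    (1) at most c*k*N 'bad' edges u -> v with u in V_i, v in V_j, i <= j;
    (2) every u in V_i with i >= 1 has an edge into V_{i-1}."""
    level = {v: i for i, part in enumerate(parts) for v in part}
    bad = sum(1 for u, v in edges if level[u] <= level[v])
    if bad > c * k * N:
        return False
    steps_down = {u for u, v in edges if level[v] == level[u] - 1}
    return all(u in steps_down
               for i, part in enumerate(parts) if i >= 1
               for u in part)
```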
We prove the following lemma in the next section.
$\textbf{Main Lemma}$: For every $L \in \mathbb{N}$, $\Pr[\mathcal{G}_N$ is $L$-tame$] \to 0$ as $N \to \infty$.
Assuming the Main Lemma, we now go on to prove that the counterexample works (with high probability). We begin with a lemma.
$\textbf{Lemma 1}$: In $\mathcal{G}_N$, for any fixed set of $\epsilon N$ vertices, the probability that they have more than $ckN$ outgoing edges is at most $e^{3k\epsilon N}e^{-ckN}$.
Proof: By Markov's inequality, letting $E_o$ denote the number of outgoing edges from our fixed $\epsilon N$ vertices, it suffices to show that $\mathbb{E}[e^{E_o}] \le e^{3k \epsilon N}$. This follows from the fact that vertex $1$ has the largest expected out-degree: \begin{align*} \mathbb{E}[e^{E_o}] &\le \left(\prod_{v=2}^N (1-\frac{\alpha}{v-1}+\frac{\alpha}{v-1}e)\right)^{\epsilon N} \\ &\le e^{\left(\sum_{v=2}^N \frac{2\alpha}{v-1}\right)\epsilon N} \\ &\le e^{3k\epsilon N},\end{align*} where the middle step uses $1+(e-1)p \le e^{2p}$ and the last uses $\sum_{v=2}^N \frac{\alpha}{v-1} = k$. $\square$
$\textbf{Lemma 2}$: The probability that $\mathcal{G}_N$ has $5\epsilon N$ vertices so that removing them destroys all paths of length $L$ is at most $o(1)$ as $N \to \infty$.
Proof: Suppose that we can remove $5\epsilon N$ vertices so that the longest remaining path has length at most $L-1$. By (the proof of) Lemma 1 and a union bound, the probability that some $5\epsilon N$ vertices have more than $ckN$ outgoing edges is at most ${N \choose 5\epsilon N} e^{15k\epsilon N}e^{-ckN} \le e^Ne^{15k\epsilon N}e^{-ckN} \le e^{-\frac{1}{2}ckN}$, which is $o(1)$; so we may assume that no set of $5\epsilon N$ vertices has more than $ckN$ outgoing edges. We then argue that our graph is $L$-tame, which finishes the proof of Lemma 2 in light of the Main Lemma.
Define $V_0$ to be the blocked vertices together with the vertices whose longest unblocked path has length $0$ (in particular, the vertices of out-degree $0$), and, for $1 \le i \le L-1$, define $V_i$ to be the vertices $v$ such that the longest unblocked path starting at $v$ has length $i$ (i.e., the number of edges in that longest path is $i$). Note that since the graph is acyclic, there can be no edges from $V_i$ to $V_j$ if $1 \le i \le j$: such an edge $u \to v$ would give $u$ an unblocked path of length at least $j+1 > i$. Therefore, the only edges from $V_i$ to $V_j$ with $i \le j$ are out-edges of the blocked points, and by assumption there are at most $ckN$ of these; hence property (1) of an $L$-tame graph is satisfied. Property (2) is satisfied by the definition of the $V_i$'s: the first edge of a longest unblocked path from $u \in V_i$ leads to a vertex of $V_{i-1}$. $\square$
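As an aside, a sketch of the partition built in this proof, using that the vertices $1,\dots,N$ of $\mathcal{G}_N$ are already topologically sorted (edges only go from smaller to larger labels):

```python
def tame_partition(N, edges, blocked, L):
    """V_i (i >= 1) = unblocked vertices whose longest unblocked path has
    exactly i edges; V_0 = everything else (blocked vertices and unblocked
    vertices with no unblocked out-step). Assumes removing `blocked` leaves
    no path with more than L-1 edges."""
    out = [[] for _ in range(N + 1)]
    for u, v in edges:
        out[u].append(v)
    longest = [None] * (N + 1)         # None marks blocked vertices
    for u in range(N, 0, -1):          # reverse topological order
        if u in blocked:
            continue
        succ = [longest[v] for v in out[u] if longest[v] is not None]
        longest[u] = 1 + max(succ) if succ else 0
    parts = [set() for _ in range(L)]
    for u in range(1, N + 1):
        parts[longest[u] or 0].add(u)
    return parts
```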
The proof that the counterexample works now follows immediately: conditioning on an event of probability at least $1/3$ inflates probabilities by a factor of at most $3$, so, conditioned on the edge-count event, the probability that some $5\epsilon N$ vertices can be removed to destroy all paths of length $L$ is still $o(1)$ as $N \to \infty$.
$\Large{\text{Proof of Main Lemma}}$
We use the crude union bound $$\Pr[\mathcal{G}_N \text{ is } \text{$L$-tame}] \le \sum_{[N] = V_0\sqcup\dots\sqcup V_{L-1}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (1),(2)}].$$ For fixed $V_0,\dots,V_{L-1}$, the conditions (1) and (2) in the definition of $L$-tame refer to disjoint sets of edges, so by independence we have $$\Pr[V_0,\dots,V_{L-1} \text{ satisfy (1),(2)}] = \Pr[V_0,\dots,V_{L-1} \text{ satisfy (1)}]\times \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}].$$
In the next section, we show $$\Pr[V_0,\dots,V_{L-1} \text{ satisfy (1)}] \le e^{-kN/160}$$ for each (fixed) partition $V_0,\dots,V_{L-1}$ of $[N]$. In the section after, we show that $$\sum_{[N] = V_0\sqcup\dots\sqcup V_{L-1}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}] \le (N+1)^L e^{2N} k^N.$$ Combining the results of the two sections gives that $$\Pr[\mathcal{G}_N \text{ is } \text{$L$-tame}] \le e^{-kN/160}(N+1)^L e^{2N}k^N,$$ which clearly goes to $0$ as $N \to \infty$.
$\Large{\text{Bounding $\Pr[V_0,\dots,V_{L-1} \text{ satisfy (1)}]$}}$
In this section, fix a partition $[N] = V_0\sqcup\dots\sqcup V_{L-1}$. We shall show $$\Pr[V_0,\dots,V_{L-1} \text{ satisfy (1)}] \le e^{-kN/160}.$$ To this end, define an edge $\vec{e}$ to be \textit{bad} if it connects $u \to v$ with $u \in V_i$ and $v \in V_j$ where $i \le j$. Let $p_{\vec{e}} = \frac{\alpha}{v-u}$ be the probability of inclusion of the edge $\vec{e}$.
$\textbf{Lemma 3}$: It holds that $$\sum_{\vec{e} \text{ bad}} p_{\vec{e}} \ge \left[\frac{1}{16}\log N + 3\sum_{i=0}^{L-1} P_i\log P_i\right]\alpha N,$$ where $P_i := \frac{|V_i|}{N}$.
Proof: We induct on $N$ (for $N$ a power of $2$). For $N=1$, the inequality reads $0 \ge 0$. Now suppose the inequality holds for $N$ (for any partition), and consider a partition of $[2N]$. Let $$P_i^- = \frac{|V_i\cap[1,N]|}{N},$$ $$P_i^+ = \frac{|V_i\cap[N+1,2N]|}{N},$$ so that, by the induction hypothesis applied to each half, $$\sum_{\substack{\vec{e} \text{ bad} \\ \vec{e} \text{ internal}}} p_{\vec{e}} \ge \left[\frac{2}{16}\log N+3\sum_{i=0}^{L-1}\left(P_i^-\log P_i^- + P_i^+\log P_i^+\right)\right]\alpha N,$$ where an edge $\vec{e}$ is \textit{internal} if it lies in $[1,N]$ or in $[N+1,2N]$. Here we critically used the fact that the edge probability $\frac{\alpha}{v-u}$ depends only on the difference $v-u$, so each half of $[2N]$ is an exact copy of the $N$-vertex model; moreover, both sides of the inequality are linear in $\alpha$, so the induction hypothesis applies with the same $\alpha$.

Now, using the trivial bound $\frac{\alpha}{j-i} \ge \frac{\alpha}{2N}$, we have \begin{align*} \sum_{\substack{\vec{e} \text{ bad} \\ \vec{e} \text{ external}}} p_{\vec{e}} &\ge \frac{\alpha}{2N}\sum_{i \le j} |V_i\cap [1,N]|\cdot |V_j\cap [N+1,2N]| \\ &= \frac{1}{2}\alpha N\sum_{i \le j} P_i^-P_j^+.\end{align*} Thus, $$\sum_{\vec{e} \text{ bad}} p_{\vec{e}} \ge \left[\frac{1}{16}\log N + \frac{3}{2}\sum_{i=0}^{L-1}\left(P_i^-\log P_i^- + P_i^+\log P_i^+\right)+\frac{1}{4}\sum_{i \le j} P_i^-P_j^+\right]\alpha(2N),$$ and we wish to obtain $$\sum_{\vec{e} \text{ bad}} p_{\vec{e}} \ge \left[\frac{1}{16}\log(2N)+3\sum_{i=0}^{L-1} P_i\log P_i\right]\alpha(2N).$$ Comparing the two brackets and multiplying through by $2$, it suffices to show $$3\sum_{i=0}^{L-1}\left(P_i^-\log P_i^-+P_i^+\log P_i^+-2 P_i\log P_i\right)+\frac{1}{2}\sum_{i \le j} P_i^-P_j^+ \ge \frac{1}{8}\log 2.$$

Note that, for $x \in [-1,1]$, $(1+x)\log(1+x)+(1-x)\log(1-x) \ge x^2$ (Taylor expand), so, for any $x,y \ge 0$, we have \begin{align*} x\log x+y\log y-2(\frac{x+y}{2})\log(\frac{x+y}{2}) &= \frac{x+y}{2}\left[\frac{2x}{x+y}\log(\frac{2x}{x+y})+\frac{2y}{x+y}\log(\frac{2y}{x+y})\right] \\ &\ge \frac{x+y}{2}\left(\frac{x-y}{x+y}\right)^2 \\ &= \frac{(x-y)^2}{2(x+y)}.\end{align*} With $x = P_i^-$ and $y = P_i^+$ (so $x+y = 2P_i$), we obtain $$P_i^-\log P_i^-+P_i^+\log P_i^+ - 2P_i\log P_i \ge \frac{(P_i^--P_i^+)^2}{4P_i},$$ so it suffices to show $$3\sum_i \frac{(P_i^--P_i^+)^2}{4P_i}+\frac{1}{2}\sum_{i \le j} P_i^-P_j^+ \ge \frac{1}{8}\log 2.$$

Let $$I_1 = \{i : P_i^+ \le \frac{1}{2}P_i^-\},$$ $$I_2 = \{i : P_i^+ > \frac{1}{2} P_i^-\}.$$ For $i \in I_1$, we have $$P_i^--P_i^+ \ge \frac{1}{2} P_i^-$$ and $$P_i = \frac{P_i^-+P_i^+}{2} \le \frac{3}{4}P_i^-,$$ so $$3\sum_i \frac{(P_i^--P_i^+)^2}{4P_i} \ge 3\sum_{i \in I_1} \frac{\frac{1}{4}(P_i^-)^2}{3P_i^-} = \frac{1}{4}\sum_{i \in I_1} P_i^-.$$ Also, $$\frac{1}{2}\sum_{i \le j} P_i^- P_j^+ \ge \frac{1}{2}\sum_{\substack{i \le j \\ i,j \in I_2}} P_i^-P_j^+ \ge \frac{1}{4}\sum_{\substack{i \le j \\ i,j \in I_2}} P_i^-P_j^- \ge \frac{1}{8}\left(\sum_{i \in I_2} P_i^-\right)^2.$$ Letting $x = \sum_{i \in I_1} P_i^-$ and noting $\sum_{i \in I_2} P_i^- = 1-x$, we obtain $$3\sum_i \frac{(P_i^--P_i^+)^2}{4P_i}+\frac{1}{2}\sum_{i \le j} P_i^-P_j^+ \ge \frac{1}{4}x+\frac{1}{8}(1-x)^2 = \frac{1}{8}(1+x^2) \ge \frac{1}{8} \ge \frac{\log 2}{8}.$$ We are done. $\square$
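As an aside, a quick numerical sanity check (not part of the proof) of the convexity inequality used above:

```python
import math
import random

def check_convexity(trials=100_000, seed=1):
    """Check x log x + y log y - (x+y) log((x+y)/2) >= (x-y)^2 / (2(x+y))
    on random positive inputs; returns the worst slack, which should be
    nonnegative up to floating-point noise."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        x, y = rng.uniform(1e-9, 1.0), rng.uniform(1e-9, 1.0)
        lhs = x * math.log(x) + y * math.log(y) - (x + y) * math.log((x + y) / 2)
        rhs = (x - y) ** 2 / (2 * (x + y))
        worst = min(worst, lhs - rhs)
    return worst
```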
$\textbf{Corollary}$: The probability that there are at most $ckN$ bad edges is at most $e^{-\frac{kN}{160}}$.
Proof: Let $\mathcal{P} = \sum_{\vec{e} \text{ bad}} p_{\vec{e}}$ denote the expected number of bad edges. By Lemma 3 and the entropy bound $\sum_{i=0}^{L-1} P_i\log P_i \ge -\log L$ (for a probability vector on $L$ atoms), we have $$\mathcal{P} \ge \left[\frac{1}{16}\log N-3\log L\right]\alpha N,$$ so for $N$ large enough (recalling $\alpha\sum_{v=2}^N\frac{1}{v-1}=k$ and $\sum_{v=2}^N\frac{1}{v-1}\le \log N+1$), we have $$\mathcal{P} \ge \frac{1}{18}(\log N)\alpha N \ge \frac{kN}{20}.$$ Now, let $|E_b|$ denote the (random) number of bad edges. For $\tau > 0$, using independence, we have $$\mathbb{E} e^{\tau(\mathcal{P}-|E_b|)} = \prod_{\vec{e} \text{ bad}} \left[(1-p_{\vec{e}})e^{p_{\vec{e}}\tau}+p_{\vec{e}}e^{(p_{\vec{e}}-1)\tau}\right].$$
We claim that $f(\tau) := (1-p)e^{p\tau}+pe^{(p-1)\tau} \le e^{\frac{p}{2}\tau^2}$ for all $\tau \in [0,1]$ and any $p \in [0,1]$. Indeed, since $f(0)=1$ and $f'(0)=0$, Taylor's theorem gives \begin{align*} f(\tau) &\le 1+\frac{1}{2}\left(\max_{\xi \in [0,\tau]} f''(\xi)\right)\tau^2 \\ &= 1+\frac{1}{2}\max_{\xi \in [0,\tau]} \left[(1-p)p^2e^{p\xi}+p(1-p)^2e^{(p-1)\xi}\right]\tau^2 \\ &\le 1+\frac{1}{2}\left(\max_{\xi \in [0,\tau]} p(1-p)e^{p\xi}\right)\tau^2 \\ &\le 1+\frac{p}{2}\tau^2 \\ &\le e^{\frac{p}{2}\tau^2},\end{align*} where the third line uses $e^{(p-1)\xi} \le e^{p\xi}$ and the fourth uses $(1-p)e^{p\xi} \le (1-p)e^{p} \le 1$.
Therefore, for $\tau \in [0,1]$, we have $$\mathbb{E} e^{\tau(\mathcal{P}-|E_b|)} \le e^{\frac{\mathcal{P}}{2}\tau^2},$$ so $$\Pr\left[|E_b| \le ckN\right] \le \Pr\left[|E_b| \le \frac{\mathcal{P}}{2}\right] \le e^{\frac{\mathcal{P}}{2}(\tau^2-\tau)},$$ and taking $\tau = \frac{1}{2}$ gives an upper bound of $e^{-\frac{\mathcal{P}}{8}} \le e^{-\frac{kN}{160}}$. $\square$
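A quick grid check (not part of the proof) of the moment-generating-function bound claimed above:

```python
import math

def check_mgf_bound(steps=400):
    """Grid check of (1-p) e^{p tau} + p e^{(p-1) tau} <= e^{p tau^2 / 2}
    over p, tau in [0, 1]; returns the first violation found, or None."""
    for i in range(steps + 1):
        p = i / steps
        for j in range(steps + 1):
            tau = j / steps
            f = (1 - p) * math.exp(p * tau) + p * math.exp((p - 1) * tau)
            if f > math.exp(p * tau * tau / 2) + 1e-12:
                return p, tau
    return None
```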
$\Large{\text{Bounding $\sum_{[N] = V_0\sqcup\dots\sqcup V_{L-1}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}]$}}$
We wish to show $$\sum_{[N] = V_0\sqcup\dots\sqcup V_{L-1}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}] \le (N+1)^L e^{2N} k^N.$$ Fix some $N_0,\dots, N_{L-1} \in \{0,\dots,N\}$ that add up to $N$. Since there are at most $(N+1)^L$ choices for $(N_0,\dots,N_{L-1})$, it suffices to show $$\sum_{\substack{[N] = V_0\sqcup\dots\sqcup V_{L-1} \\ |V_i| = N_i \hspace{1.5mm} \forall 0 \le i \le L-1}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}] \le e^{2N} k^N.$$ Note that we may rewrite $$\sum_{\substack{[N] = V_0\sqcup\dots\sqcup V_{L-1} \\ |V_i| = N_i \hspace{1.5mm} \forall 0 \le i \le L-1}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}]$$ $$ = \sum_{V_0 \subseteq [N], |V_0| = N_0} \sum_{\substack{V_1 \subseteq [N], |V_1| = N_1 \\ V_1 \cap V_0 = \emptyset}} \dots \sum_{\substack{V_{L-1} \subseteq [N], |V_{L-1}| = N_{L-1} \\ V_{L-1} \cap V_j = \emptyset \hspace{1.5mm} \forall 0 \le j \le L-2}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}].$$
We write $V_{j+1} \rightrightarrows V_j$ if each vertex in $V_{j+1}$ has an edge to $V_j$, so that the above becomes $$\sum_{V_0 \subseteq [N], |V_0| = N_0} \sum_{\substack{V_1 \subseteq [N], |V_1| = N_1 \\ V_1 \cap V_0 = \emptyset}} \dots \sum_{\substack{V_{L-1} \subseteq [N], |V_{L-1}| = N_{L-1} \\ V_{L-1} \cap V_j = \emptyset \hspace{1.5mm} \forall 0 \le j \le L-2}} \Pr[V_{j+1} \rightrightarrows V_j \hspace{1.5mm} \hspace{1.5mm} \forall 0 \le j \le L-2],$$ which by repeated application of independence is equal to $$\sum_{V_0 \subseteq [N], |V_0| = N_0} \sum_{\substack{V_1 \subseteq [N], |V_1| = N_1 \\ V_1 \cap V_0 = \emptyset}} \dots \sum_{\substack{V_{L-1} \subseteq [N], |V_{L-1}| = N_{L-1} \\ V_{L-1} \cap V_j = \emptyset \hspace{1.5mm} \forall 0 \le j \le L-2}} \prod_{j=0}^{L-2} \Pr[V_{j+1} \rightrightarrows V_j].$$
$\textbf{Lemma 4}$: Suppose $V_0,\dots,V_j$ are fixed, pairwise-disjoint subsets of $[N]$ with $|V_i| = N_i$ for $0 \le i \le j$. Then $$\sum_{\substack{V_{j+1} \subseteq [N], |V_{j+1}| = N_{j+1} \\ V_{j+1} \cap V_i = \emptyset \hspace{1.5mm} \forall 0 \le i \le j}} \Pr[V_{j+1} \rightrightarrows V_j] \le e^{N_{j+1}}k^{N_{j+1}}\left(\frac{N_j}{N_{j+1}}\right)^{N_{j+1}}.$$
Proof: By independence of the edges, $$\Pr[V_{j+1} \rightrightarrows V_j] = \prod_{u \in V_{j+1}}\Pr\left[u \text{ has an edge to } V_j\right] \le \prod_{u \in V_{j+1}}\left(\sum_{v \in V_j} \Pr\left[u \text{ has an edge to } v\right]\right).$$ For $u \in V_{j+1}$, let $\sigma(u) = \sum_{v \in V_j} \Pr[u \text{ has an edge to } v]$. Then, \begin{align*}\sum_{\substack{V_{j+1} \subseteq [N], |V_{j+1}| = N_{j+1} \\ V_{j+1} \cap V_i = \emptyset \hspace{1.5mm} \forall 0 \le i \le j}} \prod_{u \in V_{j+1}} \sigma(u) &\le \frac{1}{N_{j+1}!}\left(\sum_{u \in \{1,\dots,N\}} \sigma(u)\right)^{N_{j+1}} \\ &\le \frac{1}{N_{j+1}!}\left(kN_j\right)^{N_{j+1}},\end{align*} where the last inequality holds because $\sum_{u} \sigma(u) = \sum_{v \in V_j}\sum_{u} \Pr[u \text{ has an edge to } v] \le kN_j$ (every vertex has expected in-degree at most $k$). The Lemma then follows from Stirling's approximation, $N_{j+1}! \ge (N_{j+1}/e)^{N_{j+1}}$. $\square$
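A brute-force sanity check (not part of the proof) of the counting step above, which relies on each product of distinct factors appearing $m!$ times in the expansion of $(\sum_u \sigma(u))^m$:

```python
import math
from itertools import combinations

def check_subset_bound(sigma, m):
    """Check: sum over m-subsets S of prod_{u in S} sigma[u]
    is at most (sum sigma)^m / m!, for a list of nonnegative weights."""
    lhs = sum(math.prod(sigma[u] for u in S)
              for S in combinations(range(len(sigma)), m))
    rhs = sum(sigma) ** m / math.factorial(m)
    return lhs <= rhs + 1e-9

# e.g. check_subset_bound([0.3, 1.2, 0.7, 2.0, 0.1], 3) -> True
```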
Repeated application of Lemma 4 thus gives \begin{align*}\sum_{\substack{[N] = V_0\sqcup\dots\sqcup V_{L-1} \\ |V_i| = N_i \hspace{1.5mm} \forall 0 \le i \le L-1}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}] &\le e^{N_1+\dots+N_{L-1}} k^{N_1+\dots+N_{L-1}}\prod_{j=0}^{L-2} \left(\frac{N_j}{N_{j+1}}\right)^{N_{j+1}} \\ &\le e^N k^N \prod_{j=0}^{L-2} e^{N_j} \\ &\le e^{2N}k^N,\end{align*} where we used $N_j/N_{j+1} \le e^{N_j/N_{j+1}}$.