Ballot counting when ties occurs exactly $r$ times

Ballot Problem with Fixed Number of Ties:

Problem Statement: In an election, candidate A receives $m$ votes and candidate B receives $n$ notes. Let $m \ge n$. In how many ways can the ballots be counted so that ties occur exactly $r$ times ($r \le n$)?

Example: $m = 3, n = 2, r = 1$. There are 4 such scenarios: for order ABAAB, BAAAB the tie occurs at the 2nd vote; for order AABBA, BBAAA the tie occurs at the 4th vote.

Call this number $Z( m,n, r)$. My practical goal is to answer the following: Given the total number of votes $N$, how many ways can we order the ballots to have exactly $r$ ties? It is \begin{equation} Z(N,r) = \sum_{n = r}^{N/ 2} Z( N-n, n ,r ) + \sum_{n = r}^{N/ 2} Z( n, N-n ,r ) \end{equation} once we figure out $Z(m,n,r)$.

It would be nice if one can get $Z(N,r)$ or its generating function directly, but I think the "$r$-tie" ballot problem is interesting on its own.

Mapping to lattice path enumeration

Here is an equivalent form of the problem:

Consider lattice path from $(0,0)$ to point $(m,n)$ with only up $(0,1)$ or right $(1,0)$ movement in $m +n $ steps. Excluding the starting point $(0,0)$, how many lattice paths hit the (lattice point on the) diagonal line $y = x$ exactly $r$ times?

The following figure shows a lattice path that reaches (4,3) and its image under Andre's reflection. It hits the diagonal 3 times.

lattice path

$r = 0$ is the strictly monotonic path

The $r = 0$ problem can be solved by Andre's reflection principle. (method used here is similar to this ME post)

First of all $m > n$, otherwise the end point $(n,n)$ will be one touching point. Since no touching of the diagonal is allowed, the first step must go right. There are ${ m + n - 1 \choose m-1}$ such paths but we need to exclude the ones that touches the diagonal.

Suppose there is one such path (first step going right) that touches the diagonal, we can do reflection about the $ y= x$ line for the path between the origin and the first intersection (see Figure). The reflected path is one that starts with up movement. Since such paths will definite cross the diagonal line to reach $(m,n)$, it is easy to see that the map is one-to-one. There are ${m+n -1 \choose n - 1}$ such paths.

In sum, the result is \begin{equation} Z( m, n, 0) = { m + n - 1 \choose m - 1 } - {m+n -1 \choose n - 1} = { m + n - 1 \choose n } - {m+n -1 \choose n - 1} \quad m > n \end{equation} which is very similar to the ballot theorem.


Any idea of approaching the general $r$?

Thanks!


Update 1: $m = n, r = 1$ is the strictly monotonic Dyck path

The only intersection is at the end point $(n,n)$, so this is basically a Dyck path that hits the diagonal once. This ME post gives the general result for a Dyck path to hit the diagonal $k$ times. Setting $k$ = 1, the result is $\frac{1}{n}{2n - 2 \choose n-1 }$. Here the path can be either above or below the diagonal line, hence a factor of $2$, \begin{equation} Z( n, n, 1 ) = \frac{2}{n}{2n - 2 \choose n-1 } \end{equation}


Update 2: number of lattice paths that crosses the diagonal $k$ times

Call this number $C( m, n, r )$.

Kern, Malcolm; Walter, Stanley, Ballot theorem and lattice path crossings, Can. J. Stat. 6, 87-90 (1978). ZBL0387.60018. gives the result \begin{equation} C( m,n, r ) = \left\lbrace \begin{aligned} &\frac{2(r+1)}{n} {2n \choose n - r -1 } & m = n \\ &\frac{m - n + 2r + 1}{m + n + 1} { m + n +1 \choose n - r } & m > n \\ \end{aligned} \right. \end{equation}

This is a relevant but different problem. Notice the difference between "touch" and "cross". Only going from one side of $y=x$ to the other is considered as a "cross". In the ballot problem, this is a change of the leading candidate, not a tie. One can check that $C( n, n, 0 )$ is twice the Catalan number.


Solution 1:

I got it! It can be solved by using generating function.

solution(updated)

\begin{equation} Z( m,n, r ) = 2^r \frac{m - n + r}{m + n -r} {m + n - r \choose n - r} \quad m \ge n \end{equation}

check:

$m = 3, n = 2, r = 1$, \begin{equation} Z( 3, 2, 1 ) = 2^1 \frac{3 - 2 +1}{3+2-1} {3 + 2 -1 \choose 2 - 1} = 2 \frac{2}{4} { 4 \choose 1 } = 4 \rightarrow \text{correct} \end{equation}

$m = 5, n = 3, r = 2$, this has been enumerated by @MarkusScheuer (thanks!) \begin{equation} Z( 5, 3, 2 ) = 2^2 \frac{5 - 3 + 2}{5 + 3 - 2} { 5 + 3 - 2 \choose 3 - 2} = 4 \frac{4}{6} { 6 \choose 1 } = 16 \rightarrow \text{correct} \end{equation}

diagonal case

enter image description here

Consider $Z(n,n,2 )$. Now we have two hits on the diagonal. Let the first hit be $(k,k)$, there are $Z( k, k, 1 ) Z( n-k, n-k, 1)$ possibilities. By enumerating the first hit, we have \begin{equation} Z( n,n, 2 ) = \sum_{k=1}^{n-1} Z( k, k ,1 ) Z( n -k, n-k ,1 ) \end{equation} By using the generating function \begin{equation} f(x) = \sum_{k=0}^{\infty} Z( k, k, 1 ) x^k = \sum_{k=0}^{\infty} 2 \frac{1}{k} {2k -2 \choose k - 1} = 1 - \sqrt{ 1- 4x} \end{equation} we have \begin{equation} Z( n, n, 2 ) = \text{coefficient of }x^n \text{ in } [f(x)]^2 \end{equation}

Similarly, $Z( n,n, r )$ is the coefficient of $x^n$ in $[f(x)]^r$.

This result is a special case of the more general result in Spivey, Michael Z., Enumerating lattice paths touching or crossing the diagonal at a given number of lattice points, Electron. J. Comb. 19, No. 3, Research Paper P24, 6 p. (2012). ZBL1253.05020., \begin{equation} Z( n, n, r ) = 2^r \left[{ 2n - r - 1 \choose n - r } - { 2n - r - 1 \choose n - r -1 }\right] = \frac{r 2^r}{2n - r }{ 2n - r \choose n - r } \end{equation}

Let me check it against my result. The generating function for $Z(n ,n, r )$ is \begin{equation} \begin{aligned} f_r(x) &= \sum_{k =0}^{\infty} 2^r \left[{ 2k - r - 1 \choose k - r } - { 2k - r - 1 \choose k - r - 1 }\right] x^k \\ &= 2^r x^r \sum_{k=0}^{\infty} { 2k + r - 1 \choose k } - 2^r x^{r+1} \sum_{k=0}^{\infty} { 2k + r +1 \choose k }\\ &= 2^r x^r [ {}_2 F_1( \frac{r}{2}, \frac{r+1}{2}; r; 4x ) - x {}_2 F_1( \frac{r}{2}+1, \frac{r+3}{2}; r+2; 4x )]\\ &= \left[\frac{4x}{1 + \sqrt{1-4x}} \right]^r \text{ by Mathematica}\\ &= [ 1 - \sqrt{ 1- 4x} ]^r = [f(x)]^r \rightarrow \text{ correct! } \end{aligned} \end{equation} where I used the relation \begin{equation} \sum_{k=0}^{\infty} { 2k + a \choose k } x^k = {}_2 F_1( \frac{a+1}{2}, \frac{a+2}{2}; a+1; 4x ) \end{equation}

non-diagonal case

enter image description here

For $m >n$, we just need to enumerate the last hitting point \begin{equation} Z( m, n, r ) = \sum_{k= r}^{n} Z( k, k, r ) Z( m -k, n-k , 0 ) \end{equation}

The suitable choice of generating function for the last segment (without hitting the diagonal) is \begin{equation} g(x, a) = \sum_{k=0}^{\infty} Z( k, k-a, 0 ) x^k \end{equation} so that $Z( m,n, r) $ is the coefficient of $x^m$ in $[f(x)]^r g(x, m -n ) $.

The $g(x, a )$ function can be calculated similarly to the diagonal case, \begin{equation} g(x, a ) = \sum_{k = 0}^{\infty} { 2k - a -1 \choose k-a} x^k - \sum_{k = 0}^{\infty} { 2k - a -1 \choose k -a - 1 }x^k = \frac{(1 - \sqrt{ 1- 4x })^a}{2^a } \end{equation} hence $Z( m,n, r )$ is the coefficient of $x^m$ in \begin{equation} \frac{( 1 -\sqrt{ 1- 4x })^{r + m - n} }{2^{m-n}} \end{equation} which can be constructed from the diagonal case $$\begin{equation} \begin{aligned} Z( m,n, r ) &= \frac{1}{2^{m-n} }Z( m, m, r+ m-n ) \\ &= 2^r \left[{m + n - r -1 \choose n - r} - {m + n - r -1 \choose n - r - 1}\right]\\ &= 2^r \frac{m - n + r }{m + n -r} {m + n - r \choose n - r} \end{aligned} \end{equation}$$

This also works for the diagonal case.

elementary check

For fixed point $(m, n)$, the number of lattice path without restriction is \begin{equation} \begin{aligned} \sum_{r=0}^{n} Z( m,n,r ) &= \sum_{r=0}^n 2^r \left[{m + n - r -1 \choose n - r} - {m + n - r -1 \choose n - r - 1}\right] \\ &= {m + n -1 \choose n } - {m + n -1 \choose n - 1} + 2 \left[{m + n -2 \choose n-1 } - {m + n -2 \choose n - 2} \right] + \cdots \\ &= {m + n -1 \choose n } + {m + n -2 \choose n-1 } \\ & \quad - 3 {m + n -2 \choose n - 2} + 4 \left[{m + n -3 \choose n-2 } - {m + n -3 \choose n - 4} \right]\\ &= \sum_{r=0}^{n} {m + n -r -1 \choose n - r } = \sum_{k=0}^{n}{ m-1 +k \choose m-1 } = { m \choose n } \rightarrow \text{correct} \end{aligned} \end{equation}

fixed length problem

For simplicity, let the total length $n + m $ to be an odd integer $2k+1$. Since the $ m < n$ case is the reflection of $ m > n$, we can just count the later case and multiply by $2$. The number of all such lattice paths that hits the diagonal $r$ times is \begin{equation} \begin{aligned} Z( 2k+1, r ) &= 2\sum_{n = r}^{k} Z( 2k +1 -n, n, r ) = 2^{r+1} \sum_{n=r}^k\left[ { 2k - r \choose n - r } - { 2k -r \choose n - r -1 } \right] \\ &= 2^{r+1} { 2k -r \choose k - r } \end{aligned} \end{equation}

Solution 2:

Note: [2017-09-17] Essential simplification due to usage of change of variable formula.


We consider the problem in terms of lattice paths with up-steps $(1,1)$ and down-steps $(1,-1)$. We use as basic building blocks Dyck paths which are paths of length $2t (t\geq 0)$ from $(0,0)$ to $(2t,0)$ which do not go beyond the $x$-axis.

The number of Dyck paths of length $2t$ is the Catalan-number $C_t=\frac{1}{t+1}\binom{2t}{t}$ with generating function \begin{align*} c(z)&=\frac{1}{2z}\left(1-\sqrt{1-4z}\right)\\ &=\sum_{j=0}^\infty C_j z^j=\sum_{j=0}^\infty \frac{1}{j+1}\binom{2j}{j}z^j\\ &=1+z+2z^2+5z^3+14z^4+42z^5+132z^6+429z^7+\cdots \end{align*} For example the coefficient $[z^5]c(z)=42$ tells us there are $42$ paths from $(0,0)$ to $(10,0)$ which do not go beyond the $x$-axis.

We want to count the number $N(r,m,n)$ of paths starting from $(0,0)$ with $m$ up-steps, $n$ down-steps which touch (besides $(0,0)$) the $x$-axis $r\geq 1$ times with \begin{align*} 1\leq r\leq n\leq m \end{align*}

Special case: $r=1$ and $n=m$

Since $m=n$ the number of up-steps is equal the number of down-steps. We consider paths of length $2n$ starting at $(0,0)$ and ending at $(2n,0)$ which do not touch the $x$-axis in between.

We find a generating function based upon Dyck-paths:

  • We start with an up-step
  • then we take all Dyck-paths of length $2n-2$ and

  • finish with a down-step,

  • or we start with a down-step, take all Dyck-paths of length $2n-2$ and finish with an up-step.

This way it is guaranteed that we never touch or cross the $x$-axis between start and end point.

The corresponding generating function is $2zc(z)$, with $z$ representing the starting up-, resp. starting down-step. The factor $2$ respects two possibilities up, resp. down. The number $N(1,n,n)$ is therefore \begin{align*} \color{blue}{N(1,n,n)}&=[z^n](2zc(z))=2[z^{n-1}]c(z)=2C_{n-1}\\ &=\frac{2}{n}\binom{2n-2}{n-1}\\ &\color{blue}{=\frac{2}{2n-1}\binom{2n-1}{n}} \end{align*}

Special case: $1\leq r\leq n=m$

Concatenation of $r$ paths as described in the step before gives paths from $(0,0)$ to $(2n,0)$ which touch the $x$-axis $r$ times, the last time at $(2n,0)$. Concatenation means multiplication of the corresponding generating functions and we so obtain

\begin{align*} \color{blue}{N(r,n,n)}&=[z^n](2zc(z))^r=[z^n](1-\sqrt{1-4z})^r\tag{1}\\ \end{align*}

In order to derive the coefficients of the generating function in (1) we use the following

Change of variable formula: Let $f(z)$ be a Laurent series and $g(w)$ be a power series, $g(w)=g_1w+g_2w^2+\cdots$, where $g_1\ne 0$. Then \begin{align*} \color{blue}{[z^{-1}]f(z)=[w^{-1}]f(g(w))g^\prime(w)}\tag{2} \end{align*}

See e.g. p.12 of this presentation by Ira Gessel.

We use the transformation \begin{align*} z=\frac{w}{(1+w)^2}\qquad\qquad\frac{dz}{dw}=\frac{1-w}{(1+w)^3} \end{align*} and get \begin{align*} 1-\sqrt{1-4z}=\frac{2w}{1+w} \end{align*}

We obtain from (2) \begin{align*} \color{blue}{[z^p]\left(1-\sqrt{1-4z}\right)^q} &=[z^{-1}]z^{-p-1}\left(1-\sqrt{1-4z}\right)^q\\ &=[w^{-1}]\left(\frac{w}{(1+w)^2}\right)^{-p-1}\left(\frac{2w}{1+w}\right)^q\frac{1-w}{(1+w)^3}\\ &=2^q[w^{p-q}](1+w)^{2p-q-1}(1-w)\\ &=2^q\left[\binom{2p-q-1}{p-q}-\binom{2p-q-1}{p-q-1}\right]\\ &=2^q\left[\binom{2p-q}{p-q}-2\binom{2p-q-1}{p-q-1}\right]\\ &=2^q\left[\binom{2p-q}{p-q}-2\cdot\frac{p-q}{2p-q}\binom{2p-q}{p-q}\right]\\ &=\color{blue}{\frac{2^qq}{2p-q}\binom{2p-q}{p-q}}\tag{3} \end{align*}

In case $N(r,n,n)$ we set in (3) $p=n$ and $q=r$ and obtain

We obtain with (2) \begin{align*} \color{blue}{N(r,n,n)}&=[z^n]\left(1-\sqrt{1-4z}\right)^r\\ &=\color{blue}{\frac{2^rr}{2n-r}\binom{2n-r}{n-r}}\tag{4} \end{align*}

One additional aspect is to consider for the general case.

General case: $1\leq r\leq n\leq m$

The general case $1\leq r \leq n< m$ means we have $r$ blocks as above which uses at least $r$ up-steps, one for each block and at most $n$ up-steps within the $r$ blocks, i.e. before we touch the $x$-axis the last time.

In general we use $r\leq t\leq n$ up-steps within the $r$ blocks, leaving $m-t$ up-steps and $n-t$ down-steps. Since $m\geq n$ the number $m-t$ of up-steps is greater or equal than the number of $n-t$ down-steps.

The $m-t$ up-steps and $n-t$ down steps are used to build paths from $(0,0)$ to \begin{align*} (m-t)(1,1)+(n-t)(1,-1)=(m+n-2t,m-n) \end{align*} which start with an up-step and stay above the $x$-axis. According to Andre's reflection lemma is the number of paths from $(0,0)$ to $(m+n-2t,m-n)$ which do not cross the $x$-axis given as \begin{align*} \frac{m-n}{m+n-2t}\binom{m+n-2t}{m-t}\tag{5} \end{align*}

In (4) we do not have a factor $2^{m-n}$ since we consider only paths which are beyond the $x$-axis.

Note:

  • Observe the structure of the binomial coefficient in (5) is the same as in (2) with $p=m-t$ and $q=m-n$.

  • Combinatorially this means the number of paths of length $m+n$ which start in $(0,0)$ with an up-step and end in $(m,m-n)$ at height $m-n=r$ and never touch the $x$-axis is the same as the number of paths of length $2m$ from $(0,0)$ to $(2n,0)$ which start with an up-step go never beyond the $x$-axis and touch the $x$ axis $r$ times, the last time at $(2n,0)$.

  • Alebraically this implies \begin{align*} [z^{m-t}]\left(\frac{1-\sqrt{1-4z}}{2}\right)^{m-n} =\frac{m-n}{m+n-2t}\binom{m+n-2t}{m-t}\tag{6} \end{align*}

We finally conclude from (4) and (6):

The number $N(r,m,n)$ of valid paths is \begin{align*} \color{blue}{N(r,m,n)}&=\sum_{t=r}^n[z^t]\left(2zc(z)\right)^r [z^{m-t}](zc(z))^{m-n}\\ &=\frac{1}{2^{m-n}}\sum_{t=r}^n[z^t]\left(2zc(z)\right)^r [z^{m-t}](2zc(z))^{m-n}\\ &=\frac{1}{2^{m-n}}[z^m](2zc(z))^{m-n+r}\\ &=\frac{1}{2^{m-n}}[z^m](1-\sqrt{1-4z})^{m-n+r}\\ &\color{blue}{=2^r\frac{m-n+r}{m+n-r}\binom{m+n-r}{n-r}}\\ \end{align*}

Example: $N(1,3,2)$

Let's calculate OPs example $N(1,3,2)$ counting the number of paths which touch the $x$-axis $r=1$ time consisting of $m=3$ up-steps and $n=2$ down-steps. We obtain from (3) \begin{align*} \color{blue}{N(1,3,2)}&=2\cdot\frac{2}{4}\binom{4}{1} \color{blue}{= 4} \end{align*} Encoding up-steps with $U$ and down-steps with $D$ the valid paths are

\begin{align*} D\ D\ U\ \color{blue}{U}\ U\\ D\ \color{blue}{U}\ U\ U\ D\\ U\ \color{blue}{D}\ U\ U\ D\\ U\ U\ D\ \color{blue}{D}\ U\\ \end{align*}

Blue characters mark touching the $x$-axis.

Example: $N(2,5,3)$

One more example with paths consisting of $5$ up-steps and $3$ down-steps touching the $x$-axis twice. We obtain from (3) \begin{align*} \color{blue}{N(2,5,3)}&=2^2\cdot\frac{4}{6}\binom{6}{1} \color{blue}{= 16} \end{align*}

The $16$ valid paths

\begin{align*} \ D\ D\ U\ \color{blue}{U}\ D\ \color{blue}{U}\ U\ U\qquad&\qquad U\ \color{blue}{D}\ D\ D\ U\ \color{blue}{U}\ U\ U\\ D\ D\ U\ \color{blue}{U}\ U\ \color{blue}{D}\ U\ U\qquad&\qquad U\ \color{blue}{D}\ D\ \color{blue}{U}\ U\ U\ D\ U\\ D\ \color{blue}{U}\ D\ D\ U\ \color{blue}{U}\ U\ U\qquad&\qquad U\ \color{blue}{D}\ D\ \color{blue}{U}\ U\ U\ U\ D\\ D\ \color{blue}{U}\ D\ \color{blue}{U}\ U\ U\ D\ U\qquad&\qquad U\ \color{blue}{D}\ U\ \color{blue}{D}\ U\ U\ D\ U\\ D\ \color{blue}{U}\ D\ \color{blue}{U}\ U\ U\ U\ D\qquad&\qquad U\ \color{blue}{D}\ U\ \color{blue}{D}\ U\ U\ U\ D\\ D\ \color{blue}{U}\ U\ \color{blue}{D}\ U\ U\ D\ U\qquad&\qquad U\ \color{blue}{D}\ U\ U\ D\ \color{blue}{D}\ U\ U\\ D\ \color{blue}{U}\ U\ \color{blue}{D}\ U\ U\ U\ D\qquad&\qquad U\ U\ D\ \color{blue}{D}\ D\ \color{blue}{U}\ U\ U\\ D\ \color{blue}{U}\ U\ U\ D\ \color{blue}{D}\ U\ U\qquad&\qquad U\ U\ D\ \color{blue}{D}\ U\ \color{blue}{D}\ U\ U\\ \end{align*}

Blue characters mark touching the $x$-axis.

Solution 3:

I came up with a visual proof of \begin{equation} Z(m,n, r ) = 2^r \frac{m - n + r}{m + n - r} { m + n - r \choose n- r } \end{equation} that is mush easier than my generating function approach. I think it deserves a separate answer.

The $2^r$ factor

Divide a lattice path of $r$ intersections with the diagonal into $r$ segments where the $i$-th segment is between the $i$-th intersection point and the $i+1$-th.

Each segment is a lattice path that intersects the diagonal only at the end point and there are two possibilities: going above or blow the $ y = x$ line. Hence we can consider only the lattice path below $y = x$ (shown in left figure below). Call the corresponding number of lattice paths to be $Z_{-}( m,n, r )$, we have \begin{equation} Z( m, n, r ) = 2^r Z_{-} ( m, n, r ) \end{equation}

enter image description here

Reduction to $r = 0$

Take a lattice path below the line $ y= x$. At all the intersection points, the preceding step must be going up (labeled blue in the left figure).

If we remove those blue steps, we end up with a path on the right figure that starts from $(0,0)$ and ends at $(m, n - r)$. This path will not hit the diagonal $y = x$. On the other hand, we can also reconstruct the original path by drawing backwards and complementing a blue step when hitting diagonal. For the example in the right figure, we first move the last red step "going right" to the place between $(m-1,n)$ and $(m,n)$. If it hits the diagonal(it does), we complement a blue step; otherwise we pick up the previous red step.

This procedure establish a one-to-one correspondence between the path from $(0,0)$ to $(m,n)$ below $ y= x$ that hits the diagonal $r$ times, with the path from $(0,0)$ to $( m, n-r)$ that does not hit the diagonal at all.

Equivalently, \begin{equation} Z_{-} ( m, n, r ) = Z( m, n -r, 0) \end{equation}

Conclusion

We therefore conclude that \begin{equation} Z(m,n, r ) = 2^r Z(m,n-r, 0) = 2^r \left[ { m + n - r - 1\choose n- r }-{m + n -r -1 \choose n- r - 1 }\right] = 2^r \frac{m - n + r}{m + n - r} { m + n - r \choose n- r } \end{equation}