Calculating the Expectation and Variance of having pairs after selecting N people

n pairs of parents attended a parent’s meeting ( 2n people in total). In the meeting, a delegation of 2k people was randomly chosen $2n \ge 2k \ge 4$ out of the 2n people. Calculate the expectation and variance of the number of pairs of parents that were chosen for the delegation.

I'm trying to solve this question and I'm not confident at all about my attempt.. it feels like I'm shooting in the dark and doing things I don't really understand fully, plus I'm having a hard time with the algebra of calculating the variance so I thought I'll put it on hold until I get the expectation right

My attempt:

Let $X,X_{i}$ such that

$X = $Numer of pairs of parents that were chosen for the deligation

$X_{i} = \left\{ \begin{array}{lcl} 1 \ \ \ if \ person \ i's \ > partner \ is \ selected\\ 0 \ \ \ else\\ \end{array} \right.$

Note that $X\ =\frac{1}{2}\ \sum_{i=1}^{2k}X_{i}$, also note that $X_{i\ }\sim HG\left(2n,2,2k\right)$, as we are choosing 2k times and are looking for 2 specials (a pair) out of a population of 2n.

$E\left(X\right)=E\left(\frac{1}{2}\sum_{i=1}^{2k}X_{i}\right)=\frac{1}{2}\sum_{i=1}^{2k}E\left(X_{i}\right)=\frac{1}{2}\sum_{i=1}^{2k}2k\cdot\frac{2}{2n}=\frac{2}{4n}\cdot\left(2k\right)^{2}$

first of all, I'm not sure that $X=X_{i}$, the whole notion of indicators is very confusing to me.. I'm also not confident that $X_{i}$ has an HG distribution. I was wondering if I could get tips on how to approach these type of questions, as I feel quiet lost


The following is a derivation of the expected value of the number of pairs, using the indicator method.

Number the pairs from $1$ to $n$, and define $$X_i = \begin{cases} 1 \qquad \text{if pair i is included in the delegation} \\ 0 \qquad \text{otherwise} \end{cases}$$ for $1 \le i \le n$. Then $$P(X_i = 1) = \frac{(2k)(2k-1)}{(2n)(2n-1)}$$ so applying linearity of expectation, the expected number of pairs in the delegation is $$E \left( \sum_{n=1}^n X_i \right) = \sum_{n=1}^n E(X_i) = n \cdot \frac{(2k)(2k-1)}{(2n)(2n-1)} = \frac{k(2k-1)}{2n-1}$$ Just a hint on finding the variance by the indicator method: the first step is to find $E(\sum_{i<j} X_i X_j)$.


I think an approach based on induction would work here. The trick is that even though the size of the delegation is specified to be even in the question, the delegation could in principle have any size from $0$ to $2n$. I'll let $d$ stand for the size of the delegation, so $d=2k$ for the original question.

Let $X$ be the number of pairs of parents in the delegation. For arbitrary $n, d$, we'll define the mean of $X$ to be $M(n,d)$. I'll stick with computing the mean here, getting the variance should be similar, but harder.

We have some base cases from observing that a delegation of 1 contains no pairs, and a delegation of $2n-1$ must contain all pairs but one, i.e. $n-1$ pairs. In equations:

$$ \begin{matrix} M(n,0) = 0 \\ M(n,1) = 0 \\ M(n,2n-1) = n-1 \\ M(n,2n) = n \end{matrix} $$

Those are the base cases.

Suppose $1 < d < n-1$. Then consider one random pair of parents. There are 3 cases:

Case 1: Both parents are in the delegation. This has probability $p=\frac{d}{2n}\frac{d-1}{2n-1}$

Case 2: Both parents are not in the delegation. This has probability $p=\frac{2n-d}{2n}\frac{2n-d-1}{2n-1}$

Case 3: One parent is in the delegation, and one is not in the delegation. This has probability $p=\frac{d}{2n}\frac{2n-d}{2n-1} + \frac{2n-d}{2n}\frac{d}{2n-1} = 2\frac{d}{2n}\frac{2n-d}{2n-1}$

This gives a recurrent relation for $M(n,d)$. We just weight the 3 cases by their probabilities, and use the result previously computed for $n-1$ pairs of parents. In Case 1, we add 1 to the previous result, since the current pair of parents is in the delegation.

$$ M(n,d) = \frac{1}{2n(2n-1)} \begin{pmatrix} d(d-1)(1+M(n-1, d-2)) \\ + 2d(2n-d)M(n-1, d-1) \\ + (2n-d)(2n-d-1)M(n-1,d) \end{pmatrix} $$

This allows us to build up a table for $M$ starting from $n=0,d=0$. in the top left, and with $n$ increasing down the table and $d$ increasing across.

$$ \begin{matrix} 0 \\ 0 & 0 & 1 \\ 0 & 0 & & 1 & 2 \\ \end{matrix} $$

Here we must find the first nontrivial value of the table, $M(2,2)$.

$$ M(2,2) = \frac{1}{4\times 3}(2\times1(1+M(1,0)) + 2\times2\times 2 M(1,1) + 2\times1M(1,2)) = \frac{1}{12}(2+0+2) = \frac{1}{3} $$

Resulting table:

$$ \begin{matrix} 0 \\ 0 & 0 & 1 \\ 0 & 0 & \frac{1}{3} & 1 & 2 \\ \end{matrix} $$

Through the magic of python, we can generate a much larger table and look for patterns:

$$ \begin{matrix} 0 \\ 0 & 0 & 1 \\ 0 & 0 & \frac{1}{3} & 1 & 2 \\ 0 & 0 & \frac{1}{5} & \frac{3}{5} & \frac{6}{5} & 2 & 3 \\ 0 & 0 & \frac{1}{7} & \frac{3}{7} & \frac{6}{7} & \frac{10}{7} & \frac{15}{7} & 3 & 4 \\ 0 & 0 & \frac{1}{9} & \frac{1}{3} & \frac{2}{3} & \frac{10}{9} & \frac{5}{3} & \frac{7}{3} & \frac{28}{9} & 4 & 5 \\ \end{matrix} $$

The formula here is fortunately easy to guess: it's just triangular numbers divided by odd numbers:

$$ M(n,d) = \frac{d(d-1)}{2(2n-1)} $$

Going back and proving that this satisfies the recurrence relation is left as an exercise.