Probability every hyperplane contains at most $m/2$ vectors

Solution 1:

I will give several bounds on the desired probability $p(m,n)$ and sketch their proofs.

Note: I assume that vectors in $X$ are counted with “repetitions/multiplicities”.

Claim 1. $p(m,n) \leq p(m, n-1)$.

Proof: Note that if $X$ satisfies the condition for given $m$ and $n$, then the restriction of the vectors in $X$ to the first $n-1$ coordinates satisfies the condition for $m$ and $n' = n-1$: a hyperplane in $\mathbb{R}^{n-1}$ containing more than $m/2$ restricted vectors would lift, by appending a zero coordinate to its normal vector, to a hyperplane in $\mathbb{R}^n$ containing more than $m/2$ vectors of $X$. Since the restricted vectors are again independent and uniform in $\{\pm 1\}^{n-1}$, we get $p(m,n) \leq p(m, n-1)$. It is not hard to see that the inequality is strict when $p(m,n-1) > 0$. QED
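
To make the condition and the probability concrete, here is a small Monte Carlo sketch in Python (NumPy assumed; the helpers `satisfies_condition` and `estimate_p` are my own illustration, not part of the problem). For small $n$ it checks the hyperplane condition by grouping the vectors of $X$ into sign-classes $\{v,-v\}$ and testing whether some collection of classes with total multiplicity more than $m/2$ spans a proper subspace.

```python
import itertools
import numpy as np

def satisfies_condition(X):
    """Return True if every hyperplane through the origin contains at most
    m/2 of the rows of X (an m x n array of +-1 entries, with repetitions).
    For small n: the condition fails iff some collection of sign-classes
    {v, -v} with total multiplicity > m/2 spans a proper subspace."""
    m, n = X.shape
    counts = {}
    for x in X:
        key = tuple(x) if x[0] > 0 else tuple(-x)   # one representative per class
        counts[key] = counts.get(key, 0) + 1
    reps = list(counts)
    for r in range(1, len(reps) + 1):
        for T in itertools.combinations(reps, r):
            if sum(counts[v] for v in T) > m / 2:
                if np.linalg.matrix_rank(np.array(T)) <= n - 1:
                    return False
    return True

def estimate_p(m, n, trials=5000, seed=0):
    """Crude Monte Carlo estimate of p(m, n)."""
    rng = np.random.default_rng(seed)
    hits = sum(satisfies_condition(rng.choice([-1, 1], size=(m, n)))
               for _ in range(trials))
    return hits / trials

# Claim 1 numerically: the estimate for (m, n) should not exceed that for (m, n-1).
print(estimate_p(8, 3), estimate_p(8, 2))
```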

Claim 2. Suppose that $X$ satisfies the condition of the problem. Let $i$ and $j$ be distinct indices from $1$ to $n$. Then $$\sum_{u\in X} u_i u_j = 0,$$ where $u_i$ and $u_j$ are the $i$-th and $j$-th coordinates of $u$, respectively.

Proof: Assume to the contrary that $\sum_{u\in X} u_i u_j \neq 0$ for some $i$ and $j$; say $\sum_{u\in X} u_i u_j > 0$. Note that $u_i u_j \in \{\pm 1\}$ for all $u\in \{\pm 1\}^n$. Let $Y$ be the set of vectors $u \in X$ with $u_i u_j = 1$; that is, $Y$ is the set of vectors $u\in X$ such that either $u_i=u_j = 1$ or $u_i = u_j = -1$. Then $$0 < \sum_{u\in X} u_i u_j = |Y| - (|X| - |Y|) = 2|Y| -m,$$ and thus $|Y| > m/2$.

Now consider $v=e_i - e_j$, where $e_i$ and $e_j$ are standard basis vectors. Note that $\langle v, u\rangle = 0$ for every $u\in Y$. Thus the hyperplane $\{x : \langle v, x\rangle = 0\}$ contains more than $m/2$ vectors of $X$, a contradiction. The case when $\sum_{u\in X} u_i u_j < 0$ is similar: we consider $Y=\{u\in X: u_i u_j = -1\}$ and let $v=e_i + e_j$.

QED
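
As a sanity check of Claim 2, one can verify empirically that whenever a sampled $X$ satisfies the condition, all the mixed sums $\sum_{u\in X} u_i u_j$ vanish. A minimal sketch, reusing the (hypothetical) helper `satisfies_condition` from the sketch above:

```python
import numpy as np

# Reuses satisfies_condition from the earlier sketch (my helper, not in the problem).
rng = np.random.default_rng(1)
for _ in range(500):
    X = rng.choice([-1, 1], size=(8, 3))
    if satisfies_condition(X):
        G = X.T @ X                                # entry (i, j) is sum_u u_i u_j
        assert np.all(G[np.triu_indices(3, k=1)] == 0)
```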

Remark. Form the $n\times m$ matrix $U$ whose columns are the vectors of $X$: $$U = \left(x_1 \; x_2 \; \cdots \; x_m\right), \text{ where } X = \{x_1, \dots, x_m\}.$$ Claim 2 says that if $X$ satisfies the condition of the problem then the rows of $U$ are mutually orthogonal. Thus $p(m,n)$ is upper bounded by the probability that all rows of a random $n\times m$ matrix with $\pm 1$ entries are mutually orthogonal.

The inner product of any two distinct rows is a sum of $m$ independent $\pm 1$ random variables, so it has a (shifted) binomial distribution. Thus the probability that two given rows are orthogonal is $\binom{m}{m/2}/2^m \approx \sqrt{\frac{2}{\pi\,m}}$ if $m$ is even and $0$ otherwise. We know that if $X$ satisfies the condition of the problem then every two rows are orthogonal. If the events for different pairs were independent, we would get an upper bound of $$\left(\binom{m}{m/2}/2^m\right)^{n(n-1)/2} \approx 1/m^{n(n-1)/4},$$ where $n(n-1)/2$ is the number of pairs of rows. But these events are not independent, so below we use a slightly different argument.
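
For concreteness, the following few lines (Python, standard library only) compare $\binom{m}{m/2}/2^m$ with its asymptotic value $\sqrt{2/(\pi m)}$ and also print the cube of the single-pair probability, i.e., the heuristic "independent pairs" value for $n=3$ (which, as noted above, is not an actual bound since the events are dependent):

```python
from math import comb, pi, sqrt

for m in (8, 32, 128, 512):
    exact = comb(m, m // 2) / 2 ** m     # probability that two fixed rows are orthogonal
    approx = sqrt(2 / (pi * m))          # its asymptotic value
    print(m, round(exact, 4), round(approx, 4), round(exact ** 3, 6))
    # the last column is the heuristic "independent pairs" value for n = 3
```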

Claim 3. $$p(m,n) \leq C_n /m^{n(n-1)/4}$$ where $C_n$ depends only on $n$.

Proof: For $1 \leq i < j \leq n$, define $\xi_{ij}(u) = u_i u_j$. Then $\xi(u)$ is a random vector with $\binom{n}{2}$ coordinates. Consider the vector $S = \sum_{u\in X} \xi(u)$. As we proved in Claim 2, if $X$ satisfies the condition then $S_{ij} = 0$ for every $1\leq i < j \leq n$; that is, the vector $S$ equals $0$.

It is easy to see that the covariance matrix of $\xi$ is the identity matrix $I_{n(n-1)/2}$. Thus by the Central Limit Theorem, $S$ is asymptotically distributed as $\mathcal{N}(0, m \, I_{n(n-1)/2})$ (a multivariate normal distribution). Moreover, the Local Central Limit Theorem gives us a bound on the probability that $S=0$. Recall that the Local Central Limit Theorem shows that for every $S_0$, $$\Pr(S=S_0) \leq C_n' \left(\frac{1}{\sqrt{2\pi m}}\right)^{n(n-1)/2}\left(e^{-|S_0|^2/(2m)} + o(1)\right),$$ where $C_n'$ is a number that depends on $n$ (more precisely, on the distribution of $\xi$), $\left(\frac{1}{\sqrt{2\pi m}}\right)^{n(n-1)/2} e^{-|S_0|^2/(2m)}$ is the density of the limiting normal distribution at $S_0$, and $o(1)$ is an error term.

Plugging in $S_0 = 0$, we get that this probability is at most $C_n \cdot (1/\sqrt{m})^{n(n-1)/2} = C_n /m^{n(n-1)/4}$. Thus

$$p(m,n) \leq C_n /m^{n(n-1)/4}.$$

QED
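
One can probe the $m^{-n(n-1)/4}$ scaling of Claim 3 numerically by estimating $\Pr(S=0)$ directly. Below is a crude Monte Carlo sketch (NumPy assumed; the trial count is small, so the estimates are noisy): the last printed column, $\Pr(S=0)\cdot m^{n(n-1)/4}$, should stay roughly bounded as $m$ grows.

```python
import numpy as np

def prob_S_zero(m, n, trials=50000, seed=2):
    """Monte Carlo estimate of Pr(S = 0), where S_ij = sum_u u_i u_j."""
    rng = np.random.default_rng(seed)
    X = rng.choice([-1, 1], size=(trials, m, n))
    G = np.einsum('tmi,tmj->tij', X, X)            # slice t is X_t^T X_t
    iu = np.triu_indices(n, k=1)
    S = G[:, iu[0], iu[1]]                         # the n(n-1)/2 coordinates of S
    return np.mean(np.all(S == 0, axis=1))

n = 3
for m in (8, 16, 32, 64):
    p = prob_S_zero(m, n)
    print(m, p, p * m ** (n * (n - 1) / 4))        # last column should stay bounded
```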

Claim 4. We have:

  1. $p(m,2) = \binom{m}{m/2}/2^m$ if $m$ is even; $p(m,2) = 0$ if $m$ is odd.
  2. $p(m,3) = m!/(4^m ((m/4)!)^4)$ if $m$ is divisible by 4; $p(m,3) = 0$ otherwise.

Proof: Note that the condition on the set $X$ is equivalent to the following:

Condition $\star$: Every hyperplane of dimension $n-1$ (passing through the origin) contains at most $m/2$ vectors from $X$.

Let us say that a hyperplane $H$ is maximal if there is no hyperplane $H'$ such that $H\cap \{\pm 1\}^n \subset H'\cap \{\pm 1\}^n$ and the inclusion is strict. It is clear that it suffices to consider only maximal hyperplanes in Condition $\star$. For $n=2$ the only maximal hyperplanes are $$H_1 = \{(u_1,u_2):u_1=u_2\} = \{(1,1),(-1,-1)\} \text{ and } H_2=\{(u_1,u_2):u_1=-u_2\}=\{(1,-1),(-1,1)\};$$ for $n=3$, every maximal hyperplane is of the form $$H_{ij}^+ = \{(u_1,u_2,u_3):u_i=u_j\} \text{ or } H_{ij}^-=\{(u_1,u_2,u_3):u_i=-u_j\} \text{ for some } i<j.$$
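
A quick enumeration (Python with NumPy; my own illustration, not part of the original argument) confirms this picture for $n=3$: each plane spanned by two sign-classes of $\{\pm 1\}^3$ contains exactly four cube points, and there are exactly six such planes, namely the $H_{ij}^{\pm}$.

```python
import itertools
import numpy as np

cube = [np.array(v) for v in itertools.product([-1, 1], repeat=3)]
reps = [v for v in cube if v[0] == 1]              # one representative per sign-class

planes = set()
for a, b in itertools.combinations(reps, 2):
    normal = np.cross(a, b)                        # normal of the plane spanned by a and b
    on_plane = frozenset(tuple(v) for v in cube if v @ normal == 0)
    assert len(on_plane) == 4                      # each such plane holds 4 cube points
    planes.add(on_plane)

print(len(planes))                                 # 6, matching H_ij^+ and H_ij^- for i < j
```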

Let $f(v)$ be the number of vectors in $X$ (with multiplicities) equal to $v$ or $-v$. It is easy to check that $X$ satisfies Condition $\star$ if and only if $f(v) = m/2$ for every $v\in \{\pm 1 \}^2$ when $n=2$, and $f(v) = m/4$ for every $v\in \{\pm 1 \}^3$ when $n=3$. Indeed, for $n=2$ the two sign-classes have multiplicities summing to $m$, and each is at most $m/2$ only if both equal $m/2$; for $n=3$ each maximal hyperplane contains exactly two of the four sign-classes, and requiring every such pair to have total multiplicity at most $m/2$ (while all four multiplicities sum to $m$) forces each multiplicity to equal $m/4$. The statement of the claim now follows, since the sign-class of a uniformly random vector in $\{\pm 1\}^n$ is uniform over the $2^{n-1}$ classes, so the class counts are multinomially distributed.

QED
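
Finally, the exact formulas of Claim 4 can be cross-checked against the Monte Carlo estimator `estimate_p` from the first sketch (again, that helper is my own illustration, not part of the problem):

```python
from math import comb, factorial

def p2_exact(m):
    return comb(m, m // 2) / 2 ** m if m % 2 == 0 else 0.0

def p3_exact(m):
    return factorial(m) / (4 ** m * factorial(m // 4) ** 4) if m % 4 == 0 else 0.0

# estimate_p is the Monte Carlo helper from the first sketch.
print(p2_exact(8), estimate_p(8, 2))   # exact value ~0.2734
print(p3_exact(8), estimate_p(8, 3))   # exact value ~0.0385
```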