Subsets sharing at most $i$ elements

Assume I have a set $S$ of $N$ elements and I create subsets with $k$ elements from it. With no additional property the number of possible such subsets will be $N \choose k$.

Now I want my subsets to satisfy some additional rules $R_i$:

Given a set $X$ of subsets of $S$ with $k$ elements (i.e. $\forall A \in X, A \subset S , \#A=k$), $X$ satisfies $R_i$ if and only if for all $A$ and $B$ in $X$, $A$ and $B$ share at most $i$ elements, i.e. $\forall A , B \in X, \#(A \cap B) \le i$. Let's call $S_i$ the set of all such $X$ satisfying $R_i$.

I want to know how large a set satisfying $R_i$ can be, i.e. what is $M_i = \max_{X \in S_i} \#X $

For example if $N=1000$ and $k=10$, trivially we have $M_9 = {1000 \choose 10}$: the largest ensemble of subsets of $10$ elements that share at most $9$ with one another is exactly the set of all possible combinations of $10$ elements out of $1000$.

Similarly at the other extreme, still for $N=1000$ and $k=10$, we have $M_0 = 1000/10=100$.

Is there a formula or clever (not brute force) algorithm to compute other less trivial cases for $i=1,2,\dots8$?

BTW even if there is a nice formula, I'll ultimately want to implement a way to get a maximal $X$...


What you are looking for is known in the literature as a constant weight binary code (Wikipedia). Specifically, if $A(n,d,w)$ is the maximum number of binary vectors, each with $w$ ones and $n-w$ zeroes, such that the Hamming distance between any two codewords is $d$, then the largest possible size of your set of sets would be $A(N,2(k-i) ,k)$.

Finding the maximal set is an open problem, and the best known known constructions for $X$ and best bounds are listed on Andries E. Brouwer's website, along with known upper bounds on what is possible.

In general, if you want to find a set $X$ which is "good enough," and you have a comptuer you can program, then lexcodes are the way to go. Simply iterate through the list of all subsets of size $k$ in lexicographic order, and when you find a subset $S$ which shares no more than $i$ elements with everything in your current code, then add $S$ to the code.