Partition a set into g groups, k different ways, such that no pair of elements is ever in the same group together more than M times

What you are looking for is block designs, precisely $BIBD$s. This answer can be viewed as a comment. In the first part, it states the main question and its relation to group testing and design theory. In the second part, it tries to analyze the given solution to the puzzle and gives one generalization. And in the third part, it states the definition of balanced incomplete block designs and references a book containing a comprehensive analysis.


First of all, let's state the original puzzle:

Question: Given a large population of items (sheep) containing a small set of defectives (Wolves), Assuming you can perform a test on a group of items and get a positive result if and only if the group contains a defective. How many tests are required to identify all the defectives?

Mathematically, given integers $d<n$, and $X$ a set of size $n$, we call a family $\mathcal{F}$ of subsets of $X$ $d$-distinguishing, if the function $f : {n \choose X} \to 2^ \mathcal{F}$ defined by $A \mapsto \{X\in \mathcal{F}, X\cap A \neq \varnothing \}$ is injective. What is the minimum size of a $d$-distinguishing family $\mathcal{F}$ denoted by $t(n,d)=|\mathcal{F}|$ ?

In Combinatorial design theory, one of the most beautiful areas of mathematics, this is called the Group Testing Problem. It is a very well studied subject because it has a lot of applications. It's very difficult to navigate all the research done on it but for the interested, I can refer to the survey Combinatorial Group Testing and Its Applications. by D. Du and F. Hwang.

To summarise, the only value for which $t(n,d)$ is known completely is $d=1$ for which $t(n,1)=\lceil \log n\rceil$, the exact value even for $d=2$ is not known. Generally, the solutions to this problem are called testing designs, and the main considered ones are:

  1. Set-packing designs or block designs.
  2. Transversal designs.
  3. Designs whose $d$-disjunct or $d$-separable matrices are directly constructed.

Using these methods, It's known that : $$\frac{d^2}{2\log(d)}\log(n) \leq t(n,d) \leq d^2 \log e(1+o(1)) \log(n) $$

This basically suggests that for your puzzle, the solution would be close to $30$.


Now returning to the answer given for the puzzle. Basically, the design that is proposed is the following :

Solution: First of all, put all the items in a grid $11\times 11$ and then create a set of tests, each test corresponds to a line in that grid. Additionally, every line contains every item exactly $6$ times and every two lines intersect in only one point. then every $5$ items (points) subset is a member of a nonshared set of lines.

Mathematically, Given a field $\Bbb F$, and $X \subset \Bbb F^2$ a subset of pairs of size $n$. For any $A \subset \Bbb F$ define the following family $$\mathcal{L}_{A}=\Big\{L(a,b)=\{x+ay=b, (x,y)\in X\}, a\in A , b\in \Bbb F \Big\} $$ then $\mathcal{L}_{A}$ is $d$-distinguishing for $X$ of size $|A||\Bbb F|$ for any $d<|A|$.

Proof: This formalizes the set of lines in a $2$ dimensional space over $\Bbb F$. Because of the fact that all the matrices $\begin{pmatrix}1&a\\1&a'\end{pmatrix}, a\neq a'\in A$ are invertible (thus fixes $b$), each two lines intersect only in one point. and of course, the definition ensures that every item is a member of exactly |A| groups.

Remarks:

  • In the puzzle $|X| = 100$, $|\Bbb F|=11$, $|A|=6$ and $d=5$.
  • The design applies to up to $|X|=121$ without changing the number of tests
  • This implies that $t(n,d) \leq (d+1)p^k \sim (d+1)\sqrt n$ where $p^k$ is the smallest prime power $\geq \sqrt n$
  • Of course this is not the optimal solution.

One first generalization of this solution is to consider higher dimensions, but a very difficult constraint arizes, mainly that of No-three-in-line problem and more (this is because of the argument of matrix invertible above).

First generalization (dimension $=3$), Given a field $\Bbb F$, and $X \subset \Bbb F^3$. For any $A \subset \Bbb F^2$ with no three colinear points, define the following family $$\mathcal{L}_{A}=\Big\{L(a,b,c)=\{x+ay+bz=c, (x,y,z)\in X\}, (a,b)\in A , c\in \Bbb F \Big\} $$ then $\mathcal{L}_{A}$ is $d$-distinguishing for $X$ of size $|A||\Bbb F|$ for any $d$ such that $2d<|A|$.

Proof: This is the set of planes in a $3$ dimensional space over $\Bbb F$. Because of the fact that all the matrices $\begin{pmatrix}1&a&b\\1&a'&b'\\1&a''&b''\end{pmatrix}$ are invertible for distinct $(a,b),(a',b'),(a'',b'')\in A$, each three planes intersect only in one point. and of course, the definition ensures that every item is a member of exactly |A| groups. The proof is the same as before, in fact, it is just an application of the generalized pigeonhole principle. This is why we need the last inequality $2d<|A|$.

Remarks:

  • The non-$3$-colinearity condition is very strong, and the existence of such points requires $|A| \leq \sqrt {|F|} $, which in our cases is not applicable.
  • This implies for $n>> d^6$ that $t(n,d) \leq (2d+1)p^k \sim (2d+1)\sqrt[3] n$ where $p^k$ is the smallest prime power $\geq \sqrt[3] n$
  • Of course this remains not the optimal solution.

This two types of solutions are called grid designs (or affine planes, projective planes in combinatorial design theory) and they are also studied by for example Hwang in the book I referenced above. In their constructions, the property that no two points are in the same line twice is called the unique collinearity condition.


These methods - when not randomized - don't give us better bounds. The main methods used in the field are block designs and direct constructions of the matrices. Your generalization is equivalent to the following

Second generalization. Let $n$, $g$, and $M$ be positive integers such that $n > g \geq 2$. A $(n, g, M)$-balanced incomplete block design (which we abbreviate to $(n, g, M)$-BIBD) is a design $(X, \mathcal A)$ such that the following properties are satisfied:

  1. $|X| = n$,
  2. each block contains exactly $g$ points, and
  3. every pair of distinct points is contained in exactly $M$ blocks.

then the dual of every $(t, g, 1)$-BIBD is $(g-1)$-distinguishing of size $t$ for a set $Y$ of size $\frac{t^2-t}{k^2-k}$.

The third property above is called the balance property, which you do not require in your definition. If it's relaxed to "every pair of distinct points is contained in no more then $M$ blocks", the design will be called an incomplete block design. Also, we can relax the second property by allowing multiple block sizes in a set of integers $G$. We obtain a $\bf{(t,G,M)}$-pairwise block design. This is exactly what you are looking for.

All the results I know are focused on $BIBD$s (and $PDB$s, ...) as combinatorial structures. This is a very largely studied subject and in any book on Design Theory (for example Combinatorial Designs: Constructions and Analysis) you will find a lot of examples, constructions, and conditions of existence.

The book contains an example of $(37, 4, 1)$-BIBD (resp. $(19, 3, 1)$-BIBD) which means that we can identify $3$ (resp. $2$) defective items in a group of size $\frac{37^2-37}{4^2-4}=111$ (resp. $58$) using $37$ (resp. $19$) tests. These specific block designs generally provide a solution with an order of magnitude $t(d,n) \sim d\sqrt n$.


Edit : As you requested here is an example of how you can solve a group testing problem given a block design. An example of $(37,4,1)$-BIBD can be constructed as follows, Let $D_1=\{0,1,3,24\}$, $D_2=\{0,10,18,30\}$ and $D_3=\{0,4,26,32\}$ (a difference family), now define : $$ X=\Bbb{Z}_{37}=\{0,....,36\} \\ B_{i,j}=\{ d+j , d \in D_i\}, i\in {1,2,3}, j\in X $$

Addition is done modulo $37$, the number of blocks is $111$, size of $X$ is $37$ and each block contains exactly $4$ elements of $X$.

Now given $111$ items (Sheep), assign every sheep to a block in the example, and create the following tests $T_x=\{B\in \mathcal{B}, x \in B\}$ for each $x\in \Bbb{Z}_{37}$. I claim that these $37$ tests are sufficient to find $4-1=3$ defective items (wolves) in the original set.

In your puzzle, what you need is $(v,6,1)$-BIBD with $v\geq 46$ (or a $(v,\Bbb{N}_{\geq 6},1)$-PBD) if you have done some research on this you will find that :

  1. If there is a $(v,6,1)$-BIBD, then $v \equiv 1,3 \mod 15$.
  2. It is proven that there is no $(46,6,1)$-BIBD
  3. It is an open problem if there is $(v,6,1)$-BIBD for a finite list$v=51, 61, ...$ (as far as I know) Using BIBDs the best solution we can get less than $63$ is $61$ but that's an open problem.

What I have tried (without success) is to find instead $(v,\{6,7,8\},1)$-PBD ($v$ should be bigger than $42$).

Edit: Using packings, which is the structure that satisfies your two conditions (kind of), then if a solution exists for your puzzle that satisfies both conditions, it should contain at least $56$ tests. The argument is based on the fact that every packing design satisfies $D(v,k,2) \leq \left\lfloor \frac v k \lfloor \frac {v-1}{k-1}\rfloor \right\rfloor$.

Solution for your puzzle A possible answer for your puzzle with $60$ tests. First, consider the following $4$ base blocks :

BaseBlocks = [
    {(0,0,0,0),(0,1,0,2),(0,2,1,0),(1,0,1,1), (1,1,1,2), (1,2,∞,∞)},
    {(0,0,0,0),(0,1,0,1),(1,0,0,2),(1,1,0,2), (1,2,1,1), (0,2,∞,∞)},
    {(0,0,0,0),(0,1,0,0),(0,2,1,1),(1,0,0,0), (1,1,0,3), (1,2,∞,-∞)},
    {(0,0,0,0),(0,1,0,3),(1,0,1,3),(1,1,1,1), (1,2,0,3), (0,2,∞,-∞)}
]

Now consider the following blocks (addition is done modulo the subscripts, and infinity plus anything stays the same) :

Blocks = [ 
   { (a,b+i,c+j,d+k) , (a,b,c,d) in B}
   B in BaseBlocks
   i in Z_3
   j in Z_2
   k in Z_4
]

There are exactly $96$ blocks. There are $60$ points of the form $(a,b,c,d)$ in the union of those blocks. Now as for the first example, the blocks correspond to Sheep and the points to the tests. Up to here, with these $60$ tests, you can find $5$ wolves among $96$ sheep.

But actually you can add $8$ more sheep, By adding more blocks :

Blocks' = Blocks ∪ [ 
   {(0,0,∞, ∞),(0,1,∞, ∞),(0,2,∞, ∞),(1,0,∞,-∞),(1,1,∞,-∞),(1,2,∞,-∞)},
   {(0,0,∞,-∞),(0,1,∞,-∞),(0,2,∞,-∞),(1,0,∞, ∞),(1,1,∞, ∞),(1,2,∞, ∞)},

   // The pattern : all the first two elements are the same
   {(0,0,0,0),(0,0,0,1),(0,0,0,2),(0,0,0,3),(0,0,1,0),(0,0,1,1),(0,0,1,2),(0,0,1,3)},
   {(0,1,0,0),(0,1,0,1),(0,1,0,2),(0,1,0,3),(0,1,1,0),(0,1,1,1),(0,1,1,2),(0,1,1,3)},
   {(0,2,0,0),(0,2,0,1),(0,2,0,2),(0,2,0,3),(0,2,1,0),(0,2,1,1),(0,2,1,2),(0,2,1,3)},
   {(1,0,0,0),(1,0,0,1),(1,0,0,2),(1,0,0,3),(1,0,1,0),(1,0,1,1),(1,0,1,2),(1,0,1,3)},
   {(1,1,0,0),(1,1,0,1),(1,1,0,2),(1,1,0,3),(1,1,1,0),(1,1,1,1),(1,1,1,2),(1,1,1,3)},
   {(1,2,0,0),(1,2,0,1),(1,2,0,2),(1,2,0,3),(1,2,1,0),(1,2,1,1),(1,2,1,2),(1,2,1,3)}
]

Now with this, you can find $5$ wolves among $104$ sheep using $60$ tests !