Solution 1:

Proof that $N=49$ is impossible:

+-------+---   1) +-------+      2) +-------+      3) +-------+
| 1 2 3 | X       |(1)2 3 |*1       | 1 2 3 |(7)      | 1 2 3 |(4)
| 4 5 6 | Y       |(4)5 6 |*4       | 4 5 6 | 1       |(4)5 6 | 7
| 7 8 9 | Z       |(7)8 9 |*7       |(7)8 9 | 4       | 7 8 9 | 1
+-------+---      +-------+         +-------+         +-------+ 
| A B C | D       | A B C | D       | A*B*C |*D       | A*B*C |*D  

The block on the first figure is to be taken as our "starting" one. Since there are no other numbers on the board, having them ordered makes no difference, for now. Now, there are 3 different ways to fill tiles XYZ with 1 4 and 7, but they all fail. Since:

Case 1: Obviously, there's 3 collisions.

Case 2: BCD must contain 237 on any order. However, 7 can't be on B or C because it collides with our original grid. And it can't be on D either, because it's on the same row as the recently added one.

Case 3: Similar to before, except with 4.

That means there's no way to have those 2 blocks (23X,56Y,89Z and 56Y,89Z,BCD) successfully.

Therefore, $N=49$ is impossible.

Solution 2:

Using @Emisor's construction, we can reduce the upper bound to 39.

First a bit of set-up:

  1. I'll use Cartesian coordinates to refer to positions on the board, with (1,1) at the top left and (9,9) bottom right.

  2. I will identify solution blocks by the coordinates of their top left cell. I'll say the block is "anchored" at these coordinates. The obligatory Sudoku blocks are anchored at (1,1), (1,4), (1,7), (4,1), (4,7), etc.

If we look at @Emisor's construction carefully (actually: at the OP's distillation of it), we can see that it rules out not just the case $N = 49$ but any configuration where four cells in a 2x2 block are all solution anchors. Emisor's construction leads to this lemma:

Lemma 1: Let (x,y) be a solution anchor. Suppose that (x+1,y) and (x,y+1) are also solution anchors. Then (x+1,y+1) cannot be a solution anchor. (I won't copy the proof: Just observe that these are the only assumptions in requires.)

Lemma 2: In any 2x2 group of cells, at most three can be solution anchors. Proof: Just change orientation and apply the construction of Lemma 1.

So let's consider all of the 49 potential solution anchors. We can assign most of them to non-overlapping 2x2 groups, each of which must have at least one non-solution.

Here's a simple assignment into groups, built around the obligatory Sudoku solution blocks (which are capitalized). Each letter denotes the members of one group.

        1 2 3   4 5 6   7 8 9
      +-------+-------+-------+
    1 | A a . | B b c | C . . |
    2 | a a . | b b c | c . . |
    3 | . . . | . . . | . . . |
      +-------+-------+-------+
    4 | D d . | E e f | F . . |
    5 | d d . | e e f | f . . |
    6 | g g . | h h j | j . . |
      +-------+-------+-------+
    7 | G g . | H h j | J . . |
    8 | . . . | . . . | . . . |
    9 | . . . | . . . | . . . |
      +-------+-------+-------+

There are nine groups above, so there can at most be $49-9 = 40$ solution blocks. I.e., $N_{max}$ <= 40. (Lowered to 39 below, keep reading.)

Notes:

  1. The argument is valid for any 2x2 group, but we must choose non-overlapping groups in order to be able to count the non-solutions.

  2. Note that a number of potential anchor cells (column 3 and row 3) have not been assigned to a group. Since the solution anchors must lie in a 7x7 grid, I doubt there's a way to fit in a 10th 2x2 group. (There can only be 3 groups side by side, but I wouldn't call that a complete proof).

  3. There may well be constraints expressible in different ways. Given the number of positions that don't belong to a group, I'm pretty sure the true maximum is lower than 40. Stay tuned for updates (by me or others :-)

Update 1

Based on the above block arrangement, we can shave one more off $N_{max}$. While in general we don't know which cell in a group will be the non-solution, we know that the capital letters must anchor sudoku blocks since our grid is a valid Sudoku. In particular, position (4,4), marked by E above, must anchor a solution block. This means that at least one of the positions (3,3), (3,4), or (4,3) does not anchor a block. The are marked with @ below:

        1 2 3   4 5 6   7 8 9
      +-------+-------+-------+
    1 | A a . | B b c | C . . |
    2 | a a . | b b c | c . . |
    3 | . . @ | @ . . | . . . |
      +-------+-------+-------+
    4 | D d @ | E e f | F . . |
    5 | d d . | e e f | f . . |
    6 | g g . | h h j | j . . |
      +-------+-------+-------+
    7 | G g . | H h j | J . . |
    8 | . . . | . . . | . . . |
    9 | . . . | . . . | . . . |
      +-------+-------+-------+

This means that $N_{max}$ can be at most $49 - 9 - 1 = 39$.

Update 2

In a comment, @rfg shows how to arrange 39 anchor points so that they respect the 2x2 constraint, and include the 9 basic Sudoku blocks. Here's @rfg's sketch; the X's are the positions that are not anchor points. (imgur link)

Of course this does not mean that there is an actual Sudoku with these 39 solution blocks. But it does show that 39 is the limit of this approach: If $N_{max}$ is in fact less than 39, it cannot be shown by exploiting this constraint alone.

Solution 3:

UPDATE: After adding condition 3 below (only 3 out of every 4 locations in a 2x2 square can be a block, savile row + lingeling (a SAT solver) prove there is no solution with 34 or more blocks in about 10 minutes. This does not of course provide any kind of nice proof

I have tried modelling this problem in the Savile Row system, which can generate input for SAT solvers and the Minion constraint solver.

This can re-create the N=33 almost instantly, but so far hasn't found anything for N > 33, after a couple of hours.

I plan on working on improving my model. At the moment the only non-obvious facts I am using are:

  • When counting the 3x3 squares which satisfy the condition, remember the original sub-blocks always do

  • We can, without loss of generality, assign the first row to 1,2,...9

  • In any 2x2 block, at most 3 of the blocks with this square as their top left are blocks.

Here is the Savile Row model, for the interested reader (designed to find > 33 squares).

language ESSENCE' 1.0
letting   RANGE be domain int(1..9)
letting   LONGRANGE be domain int(0..9)
letting   VALUES be domain int(0..9)

find field: matrix indexed by [RANGE, RANGE] of RANGE

find squarecheck : matrix indexed by [LONGRANGE, LONGRANGE] of bool

find squares: int(34..100)

such that
  $ all rows have to be different
  forAll row : RANGE .
       allDiff(field[row,..]),

  $ all columns have to be different
  forAll col : RANGE .
     allDiff(field[..,col]),     

$ all 3x3 blocks have to be different
  forAll i,j : int(0..2) . (
    allDiff([field[row1+(i*3), col1+(j*3)] | 
             row1 : int(1..3), col1 : int(1..3) ])
    /\
    squarecheck[i*3,j*3] = true
  ),

  forAll i : int(0..8). forAll j : int(0..8).
    squarecheck[i,j]   + squarecheck[i+1,j]  +
    squarecheck[i,j+1] + squarecheck[i+1,j+1] <= 3,

  forAll i : int(0..9). forAll j : int(0..9) .
    squarecheck[i,j] =  allDiff([field[row1+i, col1+j] | 
             row1 : int(1..3), col1 : int(1..3) ]),

 sum(flatten(squarecheck)) = squares,


 forAll i : int(1..9). field[1,i] = i