This is an inadequate answer, but...

These problems are difficult because (a) the space of possible solutions is vast, and (b) there seems insufficient structure to reduce the space so that searching it becomes feasible.

Consider the 11-squares problem. Each square has a location and an orientation, and so can be pinned down by specifying three parameters. So 33 parameters determine a configuration of 11 squares, and so determine the surrounding square. But this means one is, in a sense, searching a 33-dimensional space for the optimal solution.

Of course, there are many constraints that make this 33 an overestimate. (For example, the first square can be constrained to, say, $(0,0){-}(1,1)$, so it is really "only" a 30-dimensional space...) But when you see a packing like that shown below (due to Károly Hajba), your confidence in constraining the parameters is shaken.


     SquarePacking
     (A packing of 51 squares. Image from this link.)