You've covered a lot of material in your question but I will focus on your main sticking point: determining a cutoff for use after sieving. There are two issues that make choosing the cutoff problematic:

  • Each of the numbers in the sieve interval is actually a different value so no one cutoff is ideal for all the whole interval.
  • In practice no one sieves for prime powers and so there are probably lots of numbers that can be factored over the factor base without actually reaching the square root of the number.

If you choose a cutoff larger than the square root of the largest number, none of the candidates will pass the test and sieving will be pointless. Also, since you are adding a crude estimate of the log of the factor, need also need some slack to account for, say, a number composed of factors that all round down. Finally, if prime $p$ is in your factor base and you don't sieve for prime powers $p^a$, you need even more lattitude to catch numbers that are not squarefree.

From a theoretical point of view, the sieving is just an optimization to avoid trial dividing all the numbers in the interval. So the choice of the cutoff is to a large extent an implementation decision:

  • If you choose the cutoff too high you will miss some numbers that can be factored over the factor base.
  • If you choose the cutoff too low you will waste a lot of time trial dividing numbers than cannot be factored over the factor base.

The ideal value of the cutoff therefore depends on exactly how fast sieving is and how fast trial dividing is. The solution is to try a variety of values and see what works best in your particular implementation.