So, As it turns out, my intuition was correct. After trying almost everything else (changing epsilon, splitting), I tested the derivative of my matrix against an automatic numerical differentiator, and the matrix using my method and the one from the program were the same. I think the intuition here is right. I assume the confusion came from a different mistake I made somewhere else.

If someone stumbles along this and is wondering what to try:

  • If the epsilon is too big the slope wont be approximated correctly
  • If the epsilon is too small you'll get round-off error from how floating point numbers are stored in a computer
  • In the book "Multiple View Geometry"(Hartley et al.) they mention epsilon for computing the Jacobian for this purpose should be $\epsilon = \text{max}(\lvert x_\text{input value to change}*10^{-4}\rvert, 10^{-6})$
  • If there are lots of nested expressions this way of finding the Jacobian could be too inaccurate to use. Remember: if $f(x) = g(d(x))$ then $J_f = J_g * J_d$ because of the chain rule.