What are the differences between some basic numerical root-finding methods?

Solution 1:

In the real-world situations I have encountered (and I have encountered several), evaluating the function is by far the most expensive thing you can do. Even a massive amount of side calculation to avoid a single function evaluation is well worth the while. So the faster a method converges, the better a choice it can be, provided you can meet its requirements.

  1. Newton's Method. Great for speed, but it does require that you know the derivative, and I have yet to encounter a real-world application where one was available. That is not to say such applications don't occur; I have just not been so lucky. Another problem with Newton's method is instability. If you hit a place where the function is close to flat, it may send your next iteration out beyond Pluto. And in fact, there is no guarantee of convergence: you can find it getting caught in a loop, cycling among the same few points. (A minimal sketch of the iteration appears after this list.)

  2. Secant Method. Well, if you can't find the tangent line because you don't know the derivative, estimate it with a secant line instead. There is a school of thought that this can be faster than Newton's method despite the slower convergence, because it requires only one new function evaluation per iteration, while Newton's method requires two (the function and its derivative). I am not sure I buy it, but as I said, I have no practical experience with Newton's method (though plenty of academic experience with it). This method has exactly the same instability problems as Newton's method. (See the sketch after this list.)

  3. Bisection Method. Guaranteed convergence, provided you can straddle the root at the start. Easily understood, easily programmed, easily performed, slow as blazes. Never sends your iteration off into the wild blue yonder. But still slow as blazes. This is your fallback method when all else fails. (A sketch follows the list.)

  4. Brent's Method. No, you did not mention this one, but in practice some variant of Brent's method is usually what you want to use. It combines the secant and bisection methods with a third technique, inverse quadratic interpolation, which is like the secant method but fits a quadratic through the last three points (with the roles of $x$ and $y$ swapped) instead of a line, giving a slight improvement in convergence speed. Occasionally that step fails, so the secant method is used as a backup. Brent's method also monitors convergence, and if it gets too slow or tries to go off into the wild, it drops in a bisection step instead to get things back on track. The problem I've found with Brent's method is that one side of your interval converges quickly to the root while the other side barely moves, because the secant/inverse-quadratic steps keep landing on the same side of the root; Brent's method eventually brings in the other side by slow bisection. A trick I've employed with some success is to deliberately double the step size of the secant/inverse-quadratic steps so as to overshoot the root, thereby bringing that side in as well. Pulling in the other side of the interval by a large amount usually improves convergence speed significantly, even on the side that was already converging. (A usage sketch follows the list.)
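
For concreteness, here is a minimal sketch of the Newton iteration $x_{n+1} = x_n - f(x_n)/f'(x_n)$ in Python. The function names and tolerances are illustrative, not from any particular library:

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton's method: x_{n+1} = x_n - f(x_n)/f'(x_n).

    Requires the derivative df; can diverge or cycle from a bad start."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        dfx = df(x)
        if dfx == 0:
            # A flat spot sends the next iterate "out beyond Pluto".
            raise ZeroDivisionError("derivative vanished at x = %g" % x)
        x -= fx / dfx
    raise RuntimeError("no convergence in %d iterations" % max_iter)

print(newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0))  # ~1.414213562
```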
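The secant variant, sketched under the same illustrative conventions, replaces $f'$ with the slope of the line through the last two iterates, so each step costs only one new function evaluation:

```python
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Secant method: like Newton, but f'(x_n) is estimated from the
    line through the last two iterates; one new f-evaluation per step."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if abs(f1) < tol:
            return x1
        if f1 == f0:
            # A flat secant line has the same instability as a flat tangent.
            raise ZeroDivisionError("secant slope vanished")
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, f0, x1, f1 = x1, f1, x2, f(x2)
    raise RuntimeError("no convergence in %d iterations" % max_iter)

print(secant(lambda x: x * x - 2, 1.0, 2.0))  # ~1.414213562
```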
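Bisection, again as a minimal sketch: it demands a straddling bracket up front and halves that bracket every step, so convergence is guaranteed but you gain only one bit of accuracy per function evaluation:

```python
def bisect(f, a, b, tol=1e-12):
    """Bisection: f(a) and f(b) must straddle the root (opposite signs).
    Converges unconditionally, but only one bit of accuracy per step."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        m = (a + b) / 2
        fm = f(m)
        if fm == 0:
            return m
        if fa * fm < 0:
            b, fb = m, fm
        else:
            a, fa = m, fm
    return (a + b) / 2

print(bisect(lambda x: x * x - 2, 1.0, 2.0))  # ~1.414213562
```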
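As for Brent's method, rather than re-derive it here: a production implementation ships with SciPy as scipy.optimize.brentq (assuming SciPy is available in your environment). It takes the same straddling bracket as bisection and handles the interpolation/bisection switching internally:

```python
from scipy.optimize import brentq

# Same bracketing contract as bisection; internally mixes inverse
# quadratic interpolation, secant steps, and bisection fallbacks.
root = brentq(lambda x: x * x - 2, 1.0, 2.0, xtol=1e-12)
print(root)  # ~1.4142135623730951
```

Note that the step-doubling trick described in item 4 is not part of the stock routine; you would have to roll your own variant to use it.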