Reference/API

The Roots package provides several different algorithms to solve f(x)=0.

Roots.RootsModule
Roots

A package for solving f(x) = 0 for univariate, scalar functions.

The basic methods are

  • find_zero for using one of several methods to identify a zero
  • ZeroProblem for solving for a zero using the CommonSolve interface
  • find_zeros for heuristically identifying all zeros in a specified interval

Extended help

Root finding functions for Julia

Stable Dev Build Status codecov

This package contains simple routines for finding roots, or zeros, of scalar functions of a single real variable using floating-point math. The find_zero function provides the primary interface. The basic call is find_zero(f, x0, [M], [p]; kws...) where, typically, f is a function, x0 a starting point or bracketing interval, M is used to adjust the default algorithms used, and p can be used to pass in parameters.

The various algorithms include:

  • Bisection-like algorithms. For functions where a bracketing interval is known (one where f(a) and f(b) have alternate signs), a bracketing method, like Bisection, can be specified. The default is Bisection, for most floating point number types, employed in a manner exploiting floating point storage conventions. For other number types (e.g. BigFloat), an algorithm of Alefeld, Potra, and Shi is used by default. These default methods are guaranteed to converge. Other bracketing methods are available.

  • Several derivative-free algorithms. These are specified through the methods Order0, Order1 (the secant method), Order2 (the Steffensen method), Order5, Order8, and Order16. The number indicates, roughly, the order of convergence. The Order0 method is the default, and the most robust, but may take more function calls to converge, as it employs a bracketing method when possible. The higher order methods promise faster convergence, though don't always yield results with fewer function calls than Order1 or Order2. The methods Roots.Order1B and Roots.Order2B are superlinear and quadratically converging methods independent of the multiplicity of the zero.

  • There are historic algorithms that require a derivative or two to be specified: Roots.Newton and Roots.Halley. Roots.Schroder provides a quadratic method, like Newton's method, which is independent of the multiplicity of the zero. This is generalized by Roots.ThukralXB (with X being 2,3,4, or 5).

  • There are several non-exported algorithms, such as, Roots.Brent(), Roots.LithBoonkkampIJzermanBracket, and Roots.LithBoonkkampIJzerman.

Each method's documentation has additional detail.

Some examples:

julia> using Roots

julia> f(x) = exp(x) - x^4;

julia> α₀, α₁, α₂ = -0.8155534188089607, 1.4296118247255556, 8.6131694564414;

julia> find_zero(f, (8,9), Bisection()) ≈ α₂ # a bisection method has the bracket specified
true

julia> find_zero(f, (-10, 0)) ≈ α₀ # Bisection is default if x in `find_zero(f, x)` is not scalar
true


julia> find_zero(f, (-10, 0), Roots.A42()) ≈ α₀ # fewer function evaluations than Bisection
true

For non-bracketing methods, the initial position is passed in as a scalar, or, possibly, for secant-like methods an iterable like (x_0, x_1):

julia> find_zero(f, 3) ≈ α₁  # find_zero(f, x0::Number) will use Order0()
true

julia> find_zero(f, 3, Order1()) ≈ α₁ # same answer, different method (secant)
true

julia> find_zero(f, (3, 2), Order1()) ≈ α₁ # start secant method with (3, f(3), (2, f(2))
true


julia> find_zero(sin, BigFloat(3.0), Order16()) ≈ π # 2 iterations to 6 using Order1()
true

The find_zero function can be used with callable objects:

julia> using Polynomials;

julia> x = variable();

julia> find_zero(x^5 - x - 1, 1.0) ≈ 1.1673039782614187
true

The function should respect the units of the Unitful package:

julia> using Unitful

julia> s, m  = u"s", u"m";

julia> g, v₀, y₀ = 9.8*m/s^2, 10m/s, 16m;


julia> y(t) = -g*t^2 + v₀*t + y₀
y (generic function with 1 method)

julia> find_zero(y, 1s)  ≈ 1.886053370668014s
true

Newton's method can be used without taking derivatives by hand. The following examples use the ForwardDiff package:

julia> using ForwardDiff

julia> D(f) = x -> ForwardDiff.derivative(f,float(x))
D (generic function with 1 method)

Now we have:

julia> f(x) = x^3 - 2x - 5
f (generic function with 1 method)

julia> x0 = 2
2

julia> find_zero((f, D(f)), x0, Roots.Newton()) ≈ 2.0945514815423265
true

Automatic derivatives allow for easy solutions to finding critical points of a function.

julia> using Statistics: mean, median

julia> as = rand(5);

julia> M(x) = sum((x-a)^2 for a in as)
M (generic function with 1 method)

julia> find_zero(D(M), .5) ≈ mean(as)
true

julia> med(x) = sum(abs(x-a) for a in as)
med (generic function with 1 method)

julia> find_zero(D(med), (0, 1)) ≈ median(as)
true

The CommonSolve interface

The DifferentialEquations interface of setting up a problem; initializing the problem; then solving the problem is also implemented using the types ZeroProblem and the methods init, solve!, and solve (from CommonSolve).

For example, we can solve a problem with many different methods, as follows:

julia> f(x) = exp(-x) - x^3
f (generic function with 1 method)

julia> x0 = 2.0
2.0

julia> fx = ZeroProblem(f, x0)
ZeroProblem{typeof(f), Float64}(f, 2.0)

julia> solve(fx) ≈ 0.7728829591492101
true

With no default, and a single initial point specified, the default Order1 method is used. The solve method allows other root-solving methods to be passed, along with other options. For example, to use the Order2 method using a convergence criteria (see below) that |xₙ - xₙ₋₁| ≤ δ, we could make this call:

julia> solve(fx, Order2(); atol=0.0, rtol=0.0) ≈ 0.7728829591492101
true

Unlike find_zero, which errors on non-convergence, solve returns NaN on non-convergence.

This next example has a zero at 0.0, but for most initial values will escape towards ±∞, sometimes causing a relative tolerance to return a misleading value. Here we can see the differences:

julia> f(x) = cbrt(x) * exp(-x^2)
f (generic function with 1 method)

julia> x0 = 0.1147
0.1147

julia> find_zero(f, x0, Roots.Order5()) ≈ 5.936596662527689 # stopped as |f(xₙ)| ≤ |xₙ|ϵ
true

julia> find_zero(f, x0, Roots.Order1(), atol=0.0, rtol=0.0) # error as no check on `|f(xn)|`
ERROR: Roots.ConvergenceFailed("Algorithm failed to converge")
[...]

julia> fx = ZeroProblem(f, x0);

julia> solve(fx, Roots.Order1(), atol=0.0, rtol=0.0) # NaN, not an error
NaN

julia> fx = ZeroProblem((f, D(f)), x0); # higher order methods can identify zero of this function

julia> solve(fx, Roots.LithBoonkkampIJzerman(2,1), atol=0.0, rtol=0.0)
0.0

Functions may be parameterized, as illustrated:

julia> f(x, p=2) = cos(x) - x/p
f (generic function with 2 methods)

julia> Z = ZeroProblem(f, pi/4)
ZeroProblem{typeof(f), Float64}(f, 0.7853981633974483)

julia> solve(Z, Order1()) ≈ 1.0298665293222586     # use p=2 default
true

julia> solve(Z, Order1(), p=3) ≈ 1.170120950002626 # use p=3
true

julia> solve(Z, Order1(), 4) ≈ 1.2523532340025887  # by position, uses p=4
true

Multiple zeros

The find_zeros function can be used to search for all zeros in a specified interval. The basic algorithm essentially splits the interval into many subintervals. For each, if there is a bracket, a bracketing algorithm is used to identify a zero, otherwise a derivative free method is used to search for zeros. This heuristic algorithm can miss zeros for various reasons, so the results should be confirmed by other means.

julia> f(x) = exp(x) - x^4
f (generic function with 2 methods)

julia> find_zeros(f, -10,10) ≈ [α₀, α₁, α₂] # from above
true

The interval can also be specified using a structure with extrema defined, where extrema returns two different values:

julia> using IntervalSets

julia> find_zeros(f, -10..10) ≈ [α₀, α₁, α₂]
true

(For tougher problems, the IntervalRootFinding package gives guaranteed results, rather than the heuristically identified values returned by find_zeros.)

Convergence

For most algorithms, convergence is decided when

  • The value |f(x_n)| <= tol with tol = max(atol, abs(x_n)*rtol), or

  • the values x_n ≈ x_{n-1} with tolerances xatol and xrtol and f(x_n) ≈ 0 with a relaxed tolerance based on atol and rtol.

The find_zero algorithm stops if

  • it encounters an NaN or an Inf, or

  • the number of iterations exceed maxevals

If the algorithm stops and the relaxed convergence criteria is met, the suspected zero is returned. Otherwise an error is thrown indicating no convergence. To adjust the tolerances, find_zero accepts keyword arguments atol, rtol, xatol, and xrtol, as seen in some examples above.

The Bisection and Roots.A42 methods are guaranteed to converge even if the tolerances are set to zero, so these are the defaults. Non-zero values for xatol and xrtol can be specified to reduce the number of function calls when lower precision is required.

julia> fx = ZeroProblem(sin, (3,4));

julia> solve(fx, Bisection(); xatol=1/16)
3.125

An alternate interface

This functionality is provided by the fzero function, familiar to MATLAB users. Roots also provides this alternative interface:

  • fzero(f, x0::Real; order=0) calls a derivative-free method. with the order specifying one of Order0, Order1, etc.

  • fzero(f, a::Real, b::Real) calls the find_zero algorithm with the Bisection method.

  • fzeros(f, a::Real, b::Real) will call find_zeros.

Usage examples

julia> f(x) = exp(x) - x^4
f (generic function with 2 methods)

julia> fzero(f, 8, 9) ≈ α₂   # bracketing
true

julia> fzero(f, -10, 0) ≈ α₀
true

julia> fzeros(f, -10, 10) ≈ [α₀, α₁, α₂]
true

julia> fzero(f, 3) ≈ α₁      # default is Order0()
true

julia> fzero(sin, big(3), order=16)  ≈ π # uses higher order method
true
source

The find_zero and find_zeros functions

There are two main functions: find_zero to identify a zero of $f$ given some initial starting value or bracketing interval and find_zeros to heuristically identify all zeros in a specified interval.

Roots.find_zeroFunction
find_zero(f, x0, M, [N::AbstractBracketingMethod], [p=nothing]; kwargs...)

Interface to one of several methods for finding zeros of a univariate function, e.g. solving $f(x)=0$.

Arguments

Positional arguments

  • f: the function (univariate or f(x,p) with p holding parameters)
  • x0: the initial condition (a value, initial values, or bracketing interval)
  • M: some AbstractUnivariateZeroMethod specifying the solver
  • N: some bracketing method, when specified creates a hybrid method
  • p: for specifying a parameter to f. Also can be a keyword, but a positional argument is helpful with broadcasting.

Keyword arguments

  • xatol, xrtol: absolute and relative tolerance to decide if xₙ₊₁ ≈ xₙ
  • atol, rtol: absolute and relative tolerance to decide if f(xₙ) ≈ 0
  • maxiters: specify the maximum number of iterations the algorithm can take.
  • verbose::Bool: specifies if details about algorithm should be shown
  • tracks: allows specification of Tracks objects

Extended help

Initial starting value

For most methods, x0 is a scalar value indicating the initial value in the iterative procedure. (Secant methods can have a tuple specify their initial values.) Values must be a subtype of Number and have methods for float, real, and oneunit defined.

For bracketing intervals, x0 is specified using a tuple, a vector, or any iterable with extrema defined. A bracketing interval, $[a,b]$, is one where $f(a)$ and $f(b)$ have different signs.

Return value

If the algorithm succeeds, the approximate root identified is returned. A ConvergenceFailed error is thrown if the algorithm fails. The alternate form solve(ZeroProblem(f,x0), M) returns NaN in case of failure.

Specifying a method

A method is specified to indicate which algorithm to employ:

For more detail, see the help page for each method (e.g., ?Order1). Non-exported methods must be qualified with the module name, as in ?Roots.Schroder.

If no method is specified, the default method depends on x0:

  • If x0 is a scalar, the default is the more robust Order0 method.

  • If x0 is a tuple, vector, or iterable with extrema defined indicating a bracketing interval, then the Bisection method is used for Float64, Float32 or Float16 types; otherwise the A42 method is used.

The default methods are chosen to be robust; they may not be as efficient as some others.

Specifying the function

The function(s) are passed as the first argument.

For the few methods that use one or more derivatives (Newton, Halley, Schroder, LithBoonkkampIJzerman(S,D), etc.) a tuple of functions is used. For the classical algorithms, a function returning (f(x), f(x)/f'(x), [f'(x)/f''(x)]) may be used.

Optional arguments (tolerances, limit evaluations, tracing)

  • xatol - absolute tolerance for x values.
  • xrtol - relative tolerance for x values.
  • atol - absolute tolerance for f(x) values.
  • rtol - relative tolerance for f(x) values.
  • maxiters - limit on maximum number of iterations.
  • strict - if false (the default), when the algorithm stops, possible zeros are checked with a relaxed tolerance.
  • verbose - if true a trace of the algorithm will be shown on successful completion. See the internal Roots.Tracks object to save this trace.

See the help string for Roots.assess_convergence for details on convergence. See the help page for Roots.default_tolerances(method) for details on the default tolerances.

In general, with floating point numbers, convergence must be understood as not an absolute statement. Even if mathematically α is an answer and xstar the floating point realization, it may be that f(xstar) - f(α) ≈ xstar ⋅ f'(α) ⋅ eps(α), so the role of tolerances must be appreciated, and at times specified.

For the Bisection methods, convergence is guaranteed over Float64 values, so the tolerances are set to be $0$ by default.

If a bracketing method is passed in after the method specification, then whenever a bracket is identified during the algorithm, the method will switch to the bracketing method to identify the zero. (Bracketing methods are mathematically guaranteed to converge, non-bracketing methods may or may not converge.) This is what Order0 does by default, with an initial secant method switching to the AlefeldPotraShi method should a bracket be encountered.

Note: The order of the method is hinted at in the naming scheme. A scheme is order r if, with eᵢ = xᵢ - α, eᵢ₊₁ = C⋅eᵢʳ. If the error eᵢ is small enough, then essentially the error will gain r times as many leading zeros each step. However, if the error is not small, this will not be the case. Without good initial guesses, a high order method may still converge slowly, if at all. The OrderN methods have some heuristics employed to ensure a wider range for convergence at the cost of not faithfully implementing the method, though those are available through unexported methods.

Examples:

Default methods.

julia> using Roots

julia> find_zero(sin, 3)  # use Order0()
3.141592653589793

julia> find_zero(sin, (3,4)) # use Bisection()
3.141592653589793

Specifying a method,

julia> find_zero(sin, (3,4), Order1())            # can specify two starting points for secant method
3.141592653589793

julia> find_zero(sin, 3.0, Order2())              # Use Steffensen method
3.1415926535897936

julia> find_zero(sin, big(3.0), Order16())        # rapid convergence
3.141592653589793238462643383279502884197169399375105820974944592307816406286198

julia> find_zero(sin, (3, 4), A42())              # fewer function calls than Bisection(), in this case
3.141592653589793

julia> find_zero(sin, (3, 4), FalsePosition(8))   # 1 of 12 possible algorithms for false position
3.141592653589793

julia> find_zero((sin,cos), 3.0, Roots.Newton())  # use Newton's method
3.141592653589793

julia> find_zero((sin, cos, x->-sin(x)), 3.0, Roots.Halley())  # use Halley's method
3.141592653589793

Changing tolerances.

julia> fn = x -> (2x*cos(x) + x^2 - 3)^10/(x^2 + 1);

julia> x0, xstar = 3.0,  2.9947567209477;

julia> fn(find_zero(fn, x0, Order2())) <= 1e-14  # f(xₙ) ≈ 0, but Δxₙ can be largish
true

julia> find_zero(fn, x0, Order2(), atol=0.0, rtol=0.0) # error: x_n ≉ x_{n-1}; just f(x_n) ≈ 0
ERROR: Roots.ConvergenceFailed("Algorithm failed to converge")
[...]

julia> fn = x -> (sin(x)*cos(x) - x^3 + 1)^9;

julia> x0, xstar = 1.0,  1.112243913023029;

julia> isapprox(find_zero(fn, x0, Order2()), xstar; atol=1e-4)
true

julia> find_zero(fn, x0, Order2(), maxiters=3)    # need more steps to converge
ERROR: Roots.ConvergenceFailed("Algorithm failed to converge")
[...]

Tracing

Passing verbose=true will show details on the steps of the algorithm. The tracks argument allows the passing of a Roots.Tracks object to record the values of x and f(x) used in the algorithm.

Note

See solve! and ZeroProblem for an alternate interface.

source
Roots.find_zerosFunction
find_zeros(f, a, [b]; [no_pts=12, k=8, naive=false, xatol, xrtol, atol, rtol])

Search for zeros of f in the interval [a,b] with an heuristic algorithm.

  • f: a function or callable object
  • a, b: If b is specified, the interval $[a,b]$ is used. If only a is specified, it is passed to extrema to define the interval to search over. It is assumed that neither endpoint is a zero.

Returns a vector of zeros in sorted order, possibly empty.

Extended help

Examples

julia> using Roots

julia> find_zeros(x -> exp(x) - x^4, -5, 20)        # a few well-spaced zeros
3-element Vector{Float64}:
 -0.8155534188089606
  1.4296118247255556
  8.613169456441398

julia> find_zeros(x -> sin(x^2) + cos(x)^2, 0, 2pi)  # many zeros
12-element Vector{Float64}:
 1.78518032659534
 2.391345462376604
 3.2852368649448853
 3.3625557095737544
 4.016412952618305
 4.325091924521049
 4.68952781386834
 5.00494459113514
 5.35145266881871
 5.552319796014526
 5.974560835055425
 6.039177477770888

julia> find_zeros(x -> cos(x) + cos(2x), (0, 4pi))    # mix of simple, non-simple zeros
6-element Vector{Float64}:
  1.0471975511965976
  3.141592653589943
  5.235987755982988
  7.330382858376184
  9.424777960769228
 11.519173063162574

julia> f(x) = (x-0.5) * (x-0.5001) * (x-1)          # nearby zeros
f (generic function with 1 method)

julia> find_zeros(f, 0, 2)
3-element Vector{Float64}:
 0.5
 0.5001
 1.0

julia> f(x) = (x-0.5) * (x-0.5001) * (x-4) * (x-4.001) * (x-4.2)
f (generic function with 1 method)

julia> find_zeros(f, 0, 10)
3-element Vector{Float64}:
 0.5
 0.5001
 4.2

julia> f(x) = (x-0.5)^2 * (x-0.5001)^3 * (x-4) * (x-4.001) * (x-4.2)^2  # hard to identify
f (generic function with 1 method)

julia> find_zeros(f, 0, 10, no_pts=21)                # too hard for default
5-element Vector{Float64}:
 0.49999999999999994
 0.5001
 4.0
 4.001
 4.200000000000001
Note

Some cases where the number of zeros may be underreported:

  • if the initial interval, (a,b), is too wide
  • if there are zeros that are very nearby
  • the function is flat, e.g., x->0.

The basic algorithm checks for zeros among the endpoints, and then divides the interval (a,b) into no_pts-1 subintervals and then proceeds to look for zeros through bisection or a derivative-free method. As checking for a bracketing interval is relatively cheap and bisection is guaranteed to converge, each interval has k pairs of intermediate points checked for a bracket.

If any zeros are found, the algorithm uses these to partition (a,b) into subintervals. Each subinterval is shrunk so that the endpoints are not zeros and the process is repeated on the subinterval. If the initial interval is too large, then the naive scanning for zeros may be fruitless and no zeros will be reported. If there are nearby zeros, the shrinking of the interval may jump over them, though as seen in the examples, nearby roots can be identified correctly, though for really nearby points, or very flat functions, it may help to increase no_pts.

The tolerances are used to shrink the intervals, but not to find zeros within a search. For searches, bisection is guaranteed to converge with no specified tolerance. For the derivative free search, a modification of the Order0 method is used, which at worst case compares |f(x)| <= 8*eps(x) to identify a zero. The algorithm might identify more than one value for a zero, due to floating point approximations. If a potential pair of zeros satisfy isapprox(a,b,atol=sqrt(xatol), rtol=sqrt(xrtol)) then they are consolidated.

The algorithm can make many function calls. When zeros are found in an interval, the naive search is carried out on each subinterval. To cut down on function calls, though with some increased chance of missing some zeros, the adaptive nature can be skipped with the argument naive=true or the number of points stepped down.

The algorithm is derived from one in a PR by @djsegal.

Note

The IntervalRootFinding package provides a rigorous alternative to this heuristic one. That package uses interval arithmetic, so can compute bounds on the size of the image of an interval under f. If this image includes 0, then it can look for the zero. Bisection, on the other hand, only will look for a zero if the two endpoints have different signs, a much more rigid condition for a potential zero.

`IntervalRootFinding` extension

As of version 1.9 an extension is provided so that when the IntervalRootFinding package is loaded, the find_zeros function will call IntervalRootFinding.roots to find the isolating brackets and find_zero to find the roots, when possible, if the interval is specified as an Interval object, as created by -1..1, say.

For example, this function (due to @truculentmath) is particularly tricky, as it is positive at every floating point number, but has two zeros (the vertical asymptote at 15//11 is only negative within adjacent floating point values):

julia> using IntervalArithmetic, IntervalRootFinding, Roots

julia> g(x) = x^2 + 1 +log(abs( 11*x-15 ))/99
g (generic function with 1 method)

julia> find_zeros(g, -3, 3)
Float64[]

julia> IntervalRootFinding.roots(g, -3..3, IntervalRootFinding.Bisection)
1-element Vector{Root{Interval{Float64}}}:
 Root([1.36363, 1.36364], :unknown)

A less extreme usage might be the following, where unique indicates Bisection could be useful and indeed find_zeros will identify these values:

julia> g(x) = exp(x) - x^5
g (generic function with 1 method)

julia> rts = IntervalRootFinding.roots(g, -20..20)
2-element Vector{Root{Interval{Float64}}}:
 Root([12.7132, 12.7133], :unique)
 Root([1.29585, 1.29586], :unique)

julia> find_zeros(g, -20, 20)
2-element Vector{Float64}:
  1.2958555090953687
 12.713206788867632
source

CommonSolve interface

The problem-algorithm-solve interface is a pattern popularized in Julia by the DifferentialEquations.jl suite of packages. This can be used as an alternative to find_zero. Unlike find_zero, solve will return NaN on non-convergence.

CommonSolve.solve!Function
solve!(P::ZeroProblemIterator)
solve(fx::ZeroProblem, [M], [N]; p=nothing, kwargs...)
init(fx::ZeroProblem, [M], [N];
     p=nothing,
     verbose=false, tracks=NullTracks(), kwargs...)

Solve for the zero of a scalar-valued univariate function specified through ZeroProblem or ZeroProblemIterator using the CommonSolve interface.

The defaults for M and N depend on the ZeroProblem: if x0 is a number, then M=Secant() and N=AlefeldPotraShi() is used (Order0); if x0 has 2 (or more values) then it is assumed to be a bracketing interval and M=AlefeldPotraShi() is used.

The methods involved with this interface are:

  • ZeroProblem: used to specify a problem with a function (or functions) and an initial guess
  • solve: to solve for a zero in a ZeroProblem

The latter calls the following, which can be useful independently:

  • init: to initialize an iterator with a method for solution, any adjustments to the default tolerances, and a specification to log the steps or not.
  • solve! to iterate to convergence.

Returns NaN, not an error like find_zero, when the problem can not be solved. Tested for zero allocations.

Examples:

julia> using Roots

julia> fx = ZeroProblem(sin, 3)
ZeroProblem{typeof(sin), Int64}(sin, 3)

julia> solve(fx)
3.141592653589793

Or, if the iterable is required

julia> problem = init(fx);

julia> solve!(problem)
3.141592653589793

keyword arguments can be used to adjust the default tolerances.

julia> solve(fx, Order5(); atol=1/100)
3.1425464815525403

The above is equivalent to:

julia> problem = init(fx, Order5(), atol=1/100);

julia> solve!(problem)
3.1425464815525403

The keyword argument p may be used if the function(s) to be solved depend on a parameter in their second positional argument (e.g., f(x, p)). For example

julia> f(x,p) = exp(-x) - p # to solve p = exp(-x)
f (generic function with 1 method)

julia> fx = ZeroProblem(f, 1)
ZeroProblem{typeof(f), Int64}(f, 1)

julia> solve(fx; p=1/2)  # log(2)
0.6931471805599453

This would be recommended, as there is no recompilation due to the function changing. For use with broadcasting, p may also be the last positional argument.

The argument verbose=true for init instructs that steps to be logged;

The iterator interface allows for the creation of hybrid solutions, such as is used when two methods are passed to solve. For example, this is essentially how the hybrid default is constructed:

julia> function order0(f, x)
           fx = ZeroProblem(f, x)
           p = init(fx, Roots.Secant())
           xᵢ,st = ϕ = iterate(p)
           while ϕ !== nothing
               xᵢ, st = ϕ
               state, ctr = st
               fᵢ₋₁, fᵢ = state.fxn0, state.fxn1
               if sign(fᵢ₋₁)*sign(fᵢ) < 0 # check for bracket
                   x0 = (state.xn0, state.xn1)
                   fx′ = ZeroProblem(f, x0)
                   p = init(fx′, Bisection())
                   xᵢ = solve!(p)
                   break
               end
               ϕ = iterate(p, st)
           end
           xᵢ
       end
order0 (generic function with 1 method)

julia> order0(sin, 3)
3.141592653589793
source
CommonSolve.solveFunction
solve(fx::ZeroProblem, args...; verbose=false, kwargs...)

Disptaches to solve!(init(fx, args...; kwargs...)). See solve! for details.

source
Roots.ZeroProblemType
ZeroProblem{F,X}

A container for a function and initial guess to be used with solve.

source

Classical methods based on derivatives

We begin by describing the classical methods even though they are not necessarily recommended because they require more work of the user, as they give insight into why there are a variety of methods available.

The classical methods of Newton and Halley utilize information about the function and its derivative(s) in an iterative manner to converge to a zero of $f(x)$ given an initial starting value.

Newton's method is easily described:

From an initial point, the next point in the iterative algorithm is found by identifying the intersection of the $x$ axis with the tangent line of $f$ at the initial point. This is repeated until convergence or the realization that convergence won't happen for the initial point. Mathematically,

$x_{n+1} = x_{n} - f(x_n)/f'(x_n).$

Some facts are helpful to understand the different methods available in Roots:

  • For Newton's method there is a formula for the error: Set $\epsilon_n = \alpha - x_n$, where $\alpha$ is the zero, then $\epsilon_{n+1} = -f''(\xi_n)/(2f'(\xi_n) \cdot \epsilon_n^2,$ here $\xi_n$ is some value between $\alpha$ and $x_n$.

  • The error term, when of the form $|\epsilon_{n+1}| \leq C\cdot|\epsilon_n|^2$, can be used to identify an interval around $\alpha$ for which convergence is guaranteed. Such convergence is termed quadratic (order 2). For floating point solutions, quadratic convergence and a well chosen initial point can lead to convergence in 4 or 5 iterations. In general, convergence is termed order $q$ when $|\epsilon_{n+1}| \approx C\cdot|\epsilon_n|^q$

  • The term $-f''(\xi_n)/(2f'(\xi_n)$ indicates possible issues when $f''$ is too big near $\alpha$ or $f'$ is too small near $\alpha$. In particular if $f'(\alpha) = 0$, there need not be quadratic convergence, and convergence can take many iterations. A zero for which $f(x) = (x-\alpha)^{1+\beta}\cdot g(x)$, with $g(\alpha) \neq 0$ is called simple when $\beta=0$ and non-simple when $\beta > 0$. Newton's method is quadratic near simple zeros and need not be quadratic near non-simple zeros.

As well, if $f''$ is too big near $\alpha$, or $f'$ too small near $\alpha$, or $x_n$ too far from $\alpha$ (that is, $|\epsilon_n|>1$) the error might actually increase and convergence is not guaranteed.

  • The explicit form of the error function can be used to guarantee convergence for functions with a certain shape (monotonic, convex functions where the sign of $f''$ and $f'$ don't change). Quadratic convergence may only occur once the algorithm is near the zero.

  • The number of function evaluations per step for Newton's method is 2.


Roots.NewtonType
Roots.Newton()

Implements Newton's method: xᵢ₊₁ = xᵢ - f(xᵢ)/f'(xᵢ). This is a quadratically convergent method requiring one derivative and two function calls per step.

Examples

julia> using Roots

julia> find_zero((sin,cos), 3.0, Roots.Newton()) ≈ π
true

If function evaluations are expensive one can pass in a function which returns (f, f/f') as follows

julia> find_zero(x -> (sin(x), sin(x)/cos(x)), 3.0, Roots.Newton()) ≈ π
true

This can be advantageous if the derivative is easily computed from the value of f, but otherwise would be expensive to compute.


The error, eᵢ = xᵢ - α, can be expressed as eᵢ₊₁ = f[xᵢ,xᵢ,α]/(2f[xᵢ,xᵢ])eᵢ² (Sidi, Unified treatment of regula falsi, Newton-Raphson, secant, and Steffensen methods for nonlinear equations).

source
Roots.HalleyType
Roots.Halley()

Implements Halley's method, xᵢ₊₁ = xᵢ - (f/f')(xᵢ) * (1 - (f/f')(xᵢ) * (f''/f')(xᵢ) * 1/2)^(-1) This method is cubically converging, it requires $3$ function calls per step. Halley's method finds xₙ₊₁ as the zero of a hyperbola at the point (xₙ, f(xₙ)) matching the first and second derivatives of f.

The function, its derivative and second derivative can be passed either as a tuple of $3$ functions or as a function returning values for $(f, f/f', f'/f'')$, which could be useful when function evaluations are expensive.

Examples

julia> using Roots

julia> find_zero((sin, cos, x->-sin(x)), 3.0, Roots.Halley()) ≈ π
true

julia> function f(x)
       s,c = sincos(x)
       (s, s/c, -c/s)
       end;

julia> find_zero(f, 3.0, Roots.Halley()) ≈ π
true

This can be advantageous if the derivatives are easily computed from the computation for f, but otherwise would be expensive to compute separately.


The error, eᵢ = xᵢ - α, satisfies eᵢ₊₁ ≈ -(2f'⋅f''' -3⋅(f'')²)/(12⋅(f'')²) ⋅ eᵢ³ (all evaluated at α).

source
Roots.QuadraticInverseType
Roots.QuadraticInverse()

Implements the quadratic inverse method also known as Chebyshev's method, xᵢ₊₁ = xᵢ - (f/f')(xᵢ) * (1 + (f/f')(xᵢ) * (f''/f')(xᵢ) * 1/2). This method is cubically converging, it requires $3$ function calls per step.

Example

julia> using Roots

julia> find_zero((sin, cos, x->-sin(x)), 3.0, Roots.QuadraticInverse()) ≈ π
true

If function evaluations are expensive one can pass in a function which returns (f, f/f',f'/f'') as follows

julia> find_zero(x -> (sin(x), sin(x)/cos(x), -cos(x)/sin(x)), 3.0, Roots.QuadraticInverse()) ≈ π
true

This can be advantageous if the derivatives are easily computed from the computation for f, but otherwise would be expensive to compute separately.

The error, eᵢ = xᵢ - α, satisfies eᵢ₊₁ ≈ (1/2⋅(f''/f')² - 1/6⋅f'''/f')) ⋅ eᵢ³ (all evaluated at α).

source
Roots.ChebyshevLikeType

Chebyshev-like methods and quadratic equations (J. A. Ezquerro, J. M. Gutiérrez, M. A. Hernández and M. A. Salanova)

source
Roots.SuperHalleyType

An acceleration of Newton's method: Super-Halley method (J.M. Gutierrez, M.A. Hernandez)

source

Newton and Halley's method are members of this family of methods:

Roots.LithBoonkkampIJzermanType
LithBoonkkampIJzerman{S,D} <: AbstractNewtonLikeMethod
LithBoonkkampIJzerman(S,D)

A family of different methods that includes the secant method and Newton's method.

Specifies a linear multistep solver with S steps and D derivatives following Lith, Boonkkamp, and IJzerman.

Extended help

Examples

julia> using Roots

julia> find_zero(sin, 3, Roots.LithBoonkkampIJzerman(2,0)) ≈ π # the secant method
true

julia> find_zero((sin,cos), 3, Roots.LithBoonkkampIJzerman(1,1)) ≈ π # Newton's method
true

julia> find_zero((sin,cos), 3, Roots.LithBoonkkampIJzerman(3,1)) ≈ π # Faster convergence rate
true

julia> find_zero((sin,cos, x->-sin(x)), 3, Roots.LithBoonkkampIJzerman(1,2)) ≈ π # Halley-like method
true

The method can be more robust to the initial condition. This example is from the paper (p13). Newton's method (the S=1, D=1 case) fails if |x₀| ≥ 1.089 but methods with more memory succeed.

julia> fx =  ZeroProblem((tanh,x->sech(x)^2), 1.239); # zero at 0.0

julia> solve(fx, Roots.LithBoonkkampIJzerman(1,1)) |> isnan# Newton, NaN
true

julia> solve(fx, Roots.LithBoonkkampIJzerman(2,1)) |> abs |> <(eps())
true

julia> solve(fx, Roots.LithBoonkkampIJzerman(3,1)) |> abs |> <(eps())
true

Multiple derivatives can be constructed automatically using automatic differentiation. For example,

julia> using ForwardDiff

julia> function δ(f, n::Int=1)
           n <= 0 && return f
           n == 1 && return x -> ForwardDiff.derivative(f,float(x))
           δ(δ(f,1),n-1)
       end;

julia> fs(f,n) = ntuple(i -> δ(f,i-1), Val(n+1));

julia> f(x) = cbrt(x) * exp(-x^2); # cf. Table 6 in paper, α = 0

julia> fx = ZeroProblem(fs(f,1), 0.1147);

julia> opts = (xatol=2eps(), xrtol=0.0, atol=0.0, rtol=0.0); # converge if |xₙ - xₙ₋₁| <= 2ϵ

julia> solve(fx, Roots.LithBoonkkampIJzerman(1, 1); opts...) |> isnan # NaN -- no convergence
true

julia> solve(fx, Roots.LithBoonkkampIJzerman(2, 1); opts...) |> abs |> <(eps()) # converges
true

julia> fx = ZeroProblem(fs(f,2), 0.06);                       # need better starting point

julia> solve(fx, Roots.LithBoonkkampIJzerman(2, 2); opts...) |> abs |> <(eps()) # converges
true

For the case D=1, a bracketing method based on this approach is implemented in LithBoonkkampIJzermanBracket

Reference

In Lith, Boonkkamp, and IJzerman an analysis is given of the convergence rates when using linear multistep methods to solve 0=f(x) using f⁻¹(0) = x when f is a sufficiently smooth linear function. The reformulation, attributed to Grau-Sanchez, finds a differential equation for f⁻¹: dx/dy = [f⁻¹]′(y) = 1/f′(x) = F as x(0) = x₀ + ∫⁰_y₀ F(x(y)) dy.

A linear multi-step method is used to solve this equation numerically. Let S be the number of memory steps (S= 1,2,...) and D be the number of derivatives employed, then, with F(x) = dx/dy x_{n+S} = ∑_{k=0}^{S-1} aₖ x_{n+k} +∑d=1^D ∑_{k=1}^{S-1} aᵈ_{n+k}F⁽ᵈ⁾(x_{n+k}). The aₖs and aᵈₖs are computed each step.

This table is from Tables 1 and 3 of the paper and gives the convergence rate for simple roots identified therein:

s: number of steps remembered
d: number of derivatives uses
s/d  0    1    2    3    4
1    .    2    3    4    5
2    1.62 2.73 3.79 4.82 5.85
3    1.84 2.91 3.95 4.97 5.98
4    1.92 2.97 3.99 4.99 5.996
5    1.97 .    .    .    .

That is, more memory leads to a higher convergence rate; more derivatives leads to a higher convergence rate. However, the interval about α, the zero, where the convergence rate is guaranteed may get smaller.

Note

For the larger values of S, the expressions to compute the next value get quite involved. The higher convergence rate is likely only to be of help for finding solutions to high precision.

source

Derivative free methods

The secant method replaces the derivative term in Newton's method with the slope of a secant line using two prior values:

$x_{n+1} = x_n - (\frac{f(x_n)-f(x_{n-1})}{x_n - x_{n-1}})^{-1}\cdot f(x_n).$

Though the secant method has convergence rate of order $\approx 1.618$ – i.e., is not quadratic – it only requires one new function call per step so can be very effective. Often function evaluations are the slowest part of the computation and, as well, no derivative is needed. Because it can be very efficient, the secant method is used in the default method of find_zero when called with a single initial starting point.

Steffensen's method is a quadratically converging. derivative-free method which uses a secant line based on $x_n$ and $x_n + f(x_n)$. Though of higher order, it requires additional function calls per step and depends on a good initial starting value. Other derivative free methods are available, trading off increased function calls for higher-order convergence. They may be of interest when arbitrary precision is needed. A measure of efficiency is $q^{1/r}$ where $q$ is the order of convergence and $r$ the number of function calls per step. With this measure, the secant method would be $\approx (1.618)^{1/1}$ and Steffensen's would be less ($2^{1/2}$).


Roots.SecantType
Secant()
Order1()
Orderφ()

The Order1() method is an alias for Secant. It specifies the secant method. This method keeps two values in its state, xₙ and xₙ₋₁. The updated point is the intersection point of $x$ axis with the secant line formed from the two points. The secant method uses $1$ function evaluation per step and has order φ≈ (1+sqrt(5))/2.

The error, eᵢ = xᵢ - α, satisfies eᵢ₊₂ = f[xᵢ₊₁,xᵢ,α] / f[xᵢ₊₁,xᵢ] * (xᵢ₊₁-α) * (xᵢ - α).

source
Roots.Order1Type
Secant()
Order1()
Orderφ()

The Order1() method is an alias for Secant. It specifies the secant method. This method keeps two values in its state, xₙ and xₙ₋₁. The updated point is the intersection point of $x$ axis with the secant line formed from the two points. The secant method uses $1$ function evaluation per step and has order φ≈ (1+sqrt(5))/2.

The error, eᵢ = xᵢ - α, satisfies eᵢ₊₂ = f[xᵢ₊₁,xᵢ,α] / f[xᵢ₊₁,xᵢ] * (xᵢ₊₁-α) * (xᵢ - α).

source
Roots.SteffensenType
Steffensen()

The quadratically converging Steffensen method is used for the derivative-free Order2() algorithm. Unlike the quadratically converging Newton's method, no derivative is necessary, though like Newton's method, two function calls per step are. Steffensen's algorithm is more sensitive than Newton's method to poor initial guesses when f(x) is large, due to how f'(x) is approximated. The Order2 method replaces a Steffensen step with a secant step when f(x) is large.

The error, eᵢ - α, satisfies eᵢ₊₁ = f[xᵢ, xᵢ+fᵢ, α] / f[xᵢ,xᵢ+fᵢ] ⋅ (1 - f[xᵢ,α] ⋅ eᵢ²

source
Roots.Order5Type
Order5()
KumarSinghAkanksha()

Implements an order 5 algorithm from A New Fifth Order Derivative Free Newton-Type Method for Solving Nonlinear Equations by Manoj Kumar, Akhilesh Kumar Singh, and Akanksha, Appl. Math. Inf. Sci. 9, No. 3, 1507-1513 (2015), DOI: 10.12785/amis/090346. Four function calls per step are needed. The Order5 method replaces a Steffensen step with a secant step when f(x) is large.

The error, eᵢ = xᵢ - α, satisfies eᵢ₊₁ = K₁ ⋅ K₅ ⋅ M ⋅ eᵢ⁵ + O(eᵢ⁶)

source
Roots.Order8Type
Order8()
Thukral8()

Implements an eighth-order algorithm from New Eighth-Order Derivative-Free Methods for Solving Nonlinear Equations by Rajinder Thukral, International Journal of Mathematics and Mathematical Sciences Volume 2012 (2012), Article ID 493456, 12 pages DOI: 10.1155/2012/493456. Four function calls per step are required. The Order8 method replaces a Steffensen step with a secant step when f(x) is large.

The error, eᵢ = xᵢ - α, is expressed as eᵢ₊₁ = K ⋅ eᵢ⁸ in (2.25) of the paper for an explicit K.

source
Roots.Order16Type
Order16()
Thukral16()

Implements the order 16 algorithm from New Sixteenth-Order Derivative-Free Methods for Solving Nonlinear Equations by R. Thukral, American Journal of Computational and Applied Mathematics p-ISSN: 2165-8935; e-ISSN: 2165-8943; 2012; 2(3): 112-118 DOI: 10.5923/j.ajcam.20120203.08.

Five function calls per step are required. Though rapidly converging, this method generally isn't faster (fewer function calls/steps) over other methods when using Float64 values, but may be useful for solving over BigFloat. The Order16 method replaces a Steffensen step with a secant step when f(x) is large.

The error, eᵢ = xᵢ - α, is expressed as eᵢ₊₁ = K⋅eᵢ¹⁶ for an explicit K in equation (50) of the paper.

source

Bracketing methods

The bisection method identifies a zero of a continuous function between $a$ and $b$ when $f(a)$ and $f(b)$ have different signs. (The interval $[a,b]$ is called a bracketing interval when $f(a)\cdot f(b) <0$.) The basic algorithm is particularly simple, an interval $[a_i,b_i]$ is split at $c = (a_i+b_i)/2$. Either $f(c)=0$, or one of $[a_i,c]$ or $[c,b_i]$ is a bracketing interval, which is called $[a_{i+1},b_{i+1}]$. From this description, we see that $[a_i,b_i]$ has length $2^{-i}$ times the length of $[a_0,b_0]$, so the intervals will eventually terminate by finding a zero, $c$, or converge to a zero. This convergence is slow (the efficiency is only $1$, but guaranteed. For 16-, 32-, and 64-bit floating point values, a reinterpretation of how the midpoint ($c$) is found leads to convergence in no more than $64$ iterations, unlike the midpoint found above, where some cases can take many more steps to converge.

In floating point, by guaranteed convergence we have either an exact zero or a bracketing interval consisting of two adjacent floating point values. When applied to non-continuous functions, this algorithm will identify an exact zero or a zero crossing of the function. (E.g., applied to $f(x)=1/x$ it will find $0$.)

The default selection of midpoint described above includes no information about the function $f$ beyond its sign. Algorithms exploiting the shape of the function can be significantly more efficient. For example, the bracketing method Roots.AlefeldPotraShi due to Alefeld, Potra, and Shi has efficiency $\approx 1.6686$. This method is also used in the default method for find_zero when a single initial starting point is given if a bracketing interval is identified.


Roots.BisectionType
Bisection()

If possible, will use the bisection method over Float64 values. The bisection method starts with a bracketing interval [a,b] and splits it into two intervals [a,c] and [c,b], If c is not a zero, then one of these two will be a bracketing interval and the process continues. The computation of c is done by _middle, which reinterprets floating point values as unsigned integers and splits there. It was contributed by Jason Merrill. This method avoids floating point issues and when the tolerances are set to zero (the default) guarantees a "best" solution (one where a zero is found or the bracketing interval is of the type [a, nextfloat(a)]).

When tolerances are given, this algorithm terminates when the interval length is less than or equal to the tolerance max(δₐ, 2abs(u)δᵣ) with u in {a,b} chosen by the smaller of |f(a)| and |f(b)|, or or the function value is less than max(tol, min(abs(a), abs(b)) * rtol). The latter is used only if the default tolerances (atol or rtol) are adjusted.

When solving $f(x,p) = 0$ for $x^*(p)$ using Bisection one can not take the derivative directly via automatatic differentiation, as the algorithm is not differentiable. See Sensitivity in the documentation for alternatives.

source
Roots.A42Type
Roots.A42()

Bracketing method which finds the root of a continuous function within a provided bracketing interval [a, b], without requiring derivatives. It is based on Algorithm 4.2 described in: G. E. Alefeld, F. A. Potra, and Y. Shi, "Algorithm 748: enclosing zeros of continuous functions," ACM Trans. Math. Softw. 21, 327–344 (1995), DOI: 10.1145/210089.210111. The asymptotic efficiency index, $q^{1/k}$, is $(2 + 7^{1/2})^{1/3} = 1.6686...$.

Originally by John Travers.

Note

The paper refenced above shows that for a continuously differentiable $f$ over $[a,b]$ with a simple root the algorithm terminates at a zero or asymptotically the steps are of the inverse cubic type (Lemma 5.1). This is proved under an assumption that $f$ is four-times continuously differentiable.

source
Roots.AlefeldPotraShiType
Roots.AlefeldPotraShi()

Follows Algorithm 4.1 in "ON ENCLOSING SIMPLE ROOTS OF NONLINEAR EQUATIONS", by Alefeld, Potra, Shi; DOI: 10.1090/S0025-5718-1993-1192965-2.

The order of convergence is 2 + √5; asymptotically there are 3 function evaluations per step. Asymptotic efficiency index is $(2+√5)^{1/3} ≈ 1.618...$. Less efficient, but can run faster than the related A42 method.

Originally by John Travers.

source
Roots.BrentType
Roots.Brent()

An implementation of Brent's (or Brent-Dekker) method. This method uses a choice of inverse quadratic interpolation or a secant step, falling back on bisection if necessary.

source
Roots.RiddersType
Roots.Ridders()

Implements Ridders' method. This bracketing method finds the midpoint, x₁; then interpolates an exponential; then uses false position with the interpolated value to find c. If c and x₁ form a bracket is used, otherwise the subinterval [a,c] or [c,b] is used.

Example:

julia> using Roots

julia> find_zero(x -> exp(x) - x^4, (5, 15), Roots.Ridders()) ≈ 8.61316945644
true

julia> find_zero(x -> x*exp(x) - 10, (-100, 100), Roots.Ridders()) ≈ 1.74552800274
true

julia> find_zero(x -> tan(x)^tan(x) - 1e3, (0, 1.5), Roots.Ridders()) ≈ 1.3547104419
true

Ridders showed the error satisfies eₙ₊₁ ≈ 1/2 eₙeₙ₋₁eₙ₋₂ ⋅ (g^2-2fh)/f for f=F', g=F''/2, h=F'''/6, suggesting converence at rate ≈ 1.839.... It uses two function evaluations per step, so its order of convergence is ≈ 1.225....

source
Roots.ITPType
Roots.ITP(;[κ₁=0.2, κ₂=2, n₀=1])

Use the ITP bracketing method. This method claims it "is the first root-finding algorithm that achieves the superlinear convergence of the secant method while retaining the optimal worst-case performance of the bisection method."

The values κ₁, κ₂, and n₀ are tuning parameters.

The suggested value of κ₁ is 0.2/(b-a), but the default here is 0.2. The value of κ₂ is 2, and the default value of n₀ is 1.

Note:

Suggested on discourse by @TheLateKronos, who supplied the original version of the code.

source
Roots.FalsePositionType
FalsePosition([galadino_factor])

Use the false position method to find a zero for the function f within the bracketing interval [a,b].

The false position method is a modified bisection method, where the midpoint between [aₖ, bₖ] is chosen to be the intersection point of the secant line with the $x$ axis, and not the average between the two values.

To speed up convergence for concave functions, this algorithm implements the $12$ reduction factors of Galdino (A family of regula falsi root-finding methods). These are specified by number, as in FalsePosition(2) or by one of three names FalsePosition(:pegasus), FalsePosition(:illinois), or FalsePosition(:anderson_bjork) (the default). The default choice has generally better performance than the others, though there are exceptions.

For some problems, the number of function calls can be greater than for the Bisection method, but generally this algorithm will make fewer function calls.

Examples

find_zero(x -> x^5 - x - 1, (-2, 2), FalsePosition())
source
Roots.LithBoonkkampIJzermanBracketType
LithBoonkkampIJzermanBracket()

A bracketing method which is a modification of Brent's method due to Lith, Boonkkamp, and IJzerman. The best possible convergence rate is 2.91.

A function, its derivative, and a bracketing interval need to be specified.

The state includes the 3 points – a bracket [a,b] (b=xₙ has f(b) closest to 0) and c=xₙ₋₁ – and the corresponding values for the function and its derivative at these three points.

The next proposed step is either a S=2 or S=3 selection for the LithBoonkkampIJzerman methods with derivative information included only if it would be of help. The proposed is modified if it is dithering. The proposed is compared against a bisection step; the one in the bracket and with the smaller function value is chosen as the next step.

source
Roots.BracketedHalleyType
BracketedHalley

For a bracket [a,b], uses the Roots.Halley method starting at the x value for which fa or fb is closest to 0. Uses the Roots.AbstractAlefeldPotraShi framework to enforce the bracketing, taking an additional double secant step each time.

source
Roots.BracketedChebyshevType
BracketedChebyshev

For a bracket [a,b], uses the Roots.QuadraticInverse method starting at the x value for which fa or fb is closest to 0. Uses the Roots.AbstractAlefeldPotraShi framework to enforce the bracketing, taking an additional double secant step each time.

source
Roots.BracketedSchroderType
BracketedSchroder

For a bracket [a,b], uses the Roots.Schroder method starting at the x value for which fa or fb is closest to 0. Uses the Roots.AbstractAlefeldPotraShi framework to enforce the bracketing, taking an additional double secant step each time.

source

Non-simple zeros

The order of convergence for most methods is for simple zeros, values $\alpha$ where $f(x) = (x-\alpha) \cdot g(x)$, with $g(\alpha)$ being non-zero. For methods which are of order $k$ for non-simple zeros, usually an additional function call is needed per step. For example, this is the case for Roots.Newton as compared to Roots.Schroder.

Derivative-free methods for non-simple zeros have the following implemented:

Roots.KingType
Roots.King()

A superlinear (order 1.6...) modification of the secant method for multiple roots. Presented in A SECANT METHOD FOR MULTIPLE ROOTS, by RICHARD F. KING, BIT 17 (1977), 321-328

The basic idea is similar to Schroder's method: apply the secant method to f/f'. However, this uses f' ~ fp = (fx - f(x-fx))/fx (a Steffensen step). In this implementation, Order1B, when fx is too big, a single secant step of f is used.

The asymptotic error, eᵢ = xᵢ - α, is given by eᵢ₊₂ = 1/2⋅G''/G'⋅ eᵢ⋅eᵢ₊₁ + (1/6⋅G'''/G' - (1/2⋅G''/G'))^2⋅eᵢ⋅eᵢ₊₁⋅(eᵢ+eᵢ₊₁).

source
Roots.EsserType
Roots.Esser()

Esser's method. This is a quadratically convergent method that, like Schroder's method, does not depend on the multiplicity of the zero. Schroder's method has update step x - r2/(r2-r1) * r1, where ri = fⁱ⁻¹/fⁱ. Esser approximates f' ~ f[x-h, x+h], f'' ~ f[x-h,x,x+h], where h = fx, as with Steffensen's method, Requiring 3 function calls per step. The implementation Order2B uses a secant step when |fx| is considered too large.

Esser, H. Computing (1975) 14: 367. DOI: 10.1007/BF02253547 Eine stets quadratisch konvergente Modification des Steffensen-Verfahrens

Examples

f(x) = cos(x) - x
g(x) = f(x)^2
x0 = pi/4
find_zero(f, x0, Order2(), verbose=true)        #  3 steps / 7 function calls
find_zero(f, x0, Roots.Order2B(), verbose=true) #  4 / 9
find_zero(g, x0, Order2(), verbose=true)        #  22 / 45
find_zero(g, x0, Roots.Order2B(), verbose=true) #  4 / 10
source

For non-simple zeros, Schroder showed an additional derivative can be used to yield quadratic convergence based on Newton's method:

Roots.SchroderType
Roots.Schroder()

Schröder's method, like Halley's method, utilizes f, f', and f''. Unlike Halley it is quadratically converging, but this is independent of the multiplicity of the zero (cf. Schröder, E. "Über unendlich viele Algorithmen zur Auflösung der Gleichungen." Math. Ann. 2, 317-365, 1870; mathworld).

Schröder's method applies Newton's method to f/f', a function with all simple zeros.

Example

m = 2
f(x) = (cos(x)-x)^m
fp(x) = (-x + cos(x))*(-2*sin(x) - 2)
fpp(x) = 2*((x - cos(x))*cos(x) + (sin(x) + 1)^2)
find_zero((f, fp, fpp), pi/4, Roots.Halley())     # 14 steps
find_zero((f, fp, fpp), 1.0, Roots.Schroder())    # 3 steps

(Whereas, when m=1, Halley is 2 steps to Schröder's 3.)

If function evaluations are expensive one can pass in a function which returns (f, f/f',f'/f'') as follows

find_zero(x -> (sin(x), sin(x)/cos(x), -cos(x)/sin(x)), 3.0, Roots.Schroder())

This can be advantageous if the derivatives are easily computed from the value of f, but otherwise would be expensive to compute.

The error, eᵢ = xᵢ - α, is the same as Newton with f replaced by f/f'.

source

A family of methods for non-simple zeros which require $k$ derivatives to be order $k$, with $k=2$ yielding Schroder's method, are implemented in:

Roots.AbstractThukralBMethodType
AbstractThukralBMethod

Abstract type for ThukralXB methods for X being 2,3,4, or 5.

These are a family of methods which are

  • efficient (order X) for non-simple roots (e.g. Thukral2B is the Schroder method)
  • take X+1 function calls per step
  • require X derivatives. These can be passed as a tuple of functions, (f, f', f'', …), or as

a function returning the ratios: x -> (f(x), f(x)/f'(x), f'(x)/f''(x), …).

Examples

using ForwardDiff
Base.adjoint(f::Function)  = x  -> ForwardDiff.derivative(f, float(x))
f(x) = (exp(x) + x - 2)^6
x0 = 1/4
find_zero((f, f', f''), x0, Roots.Halley())               # 14 iterations; ≈ 48 function evaluations
find_zero((f, f', f''), big(x0), Roots.Thukral2B())       #  3 iterations; ≈ 9 function evaluations
find_zero((f, f', f'', f'''), big(x0), Roots.Thukral3B()) #  2 iterations; ≈ 8 function evaluations

Reference

Introduction to a family of Thukral $k$-order method for finding multiple zeros of nonlinear equations, R. Thukral, JOURNAL OF ADVANCES IN MATHEMATICS 13(3):7230-7237, DOI: 10.24297/jam.v13i3.6146.

source

Hybrid methods

A useful strategy is to begin with a non-bracketing method and switch to a bracketing method should a bracket be encountered. This allows for the identification of zeros which are not surrounded by a bracket, and have guaranteed convergence should a bracket be encountered. It is used by default by find_zero(f,a).

Roots.Order0Type
Order0()

The Order0 method is engineered to be a more robust, though possibly slower, alternative to the other derivative-free root-finding methods. The implementation roughly follows the algorithm described in Personal Calculator Has Key to Solve Any Equation $f(x) = 0$, the SOLVE button from the HP-34C. The basic idea is to use a secant step. If along the way a bracket is found, switch to a bracketing algorithm, using AlefeldPotraShi. If the secant step fails to decrease the function value, a quadratic step is used up to $4$ times.

This is not really $0$-order: the secant method has order $1.6...$ Wikipedia and the the bracketing method has order $1.6180...$ Wikipedia so for reasonable starting points and functions, this algorithm should be superlinear, and relatively robust to non-reasonable starting points.

source

All zeros

The find_zeros function heuristically scans an interval for all zeros using a combination of bracketing and non-bracketing methods. The AllZeros method may be passed to solve to call this.

Roots.AllZerosType
AllZeros

Type to indicate to solve that find_zeros should be used to solve the given ZeroProblem.

Example

julia> Z = ZeroProblem(cos, (0, 2pi));

julia> solve(Z, AllZeros())
2-element Vector{Float64}:
 1.5707963267948966
 4.71238898038469
source

Rates of convergence

The order of a method is $q$, where $e_{i+1} \approx e_i^q$. Newton's method is famously quadratic for simple roots; the secant method of order $q \approx \varphi=1.618\dots$. However, $p=2$ calls are needed for Newton's method, and only $p=1$ for the secant method. The asymptotic efficiency is $q^{1/p}$, which penalizes function calls. There are other order $k$ methods taking $k$ function calls per step, e.g., Halley's; others take fewer, as seen below. Many use inverse quadratic steps, others inverse cubic–these have order $q$ solving $q^{s+1}-2q^s+1$ ($s=3$ for quadratic). For robust methods, generally $1$ additional function call is needed to achieve the convergence rate, Schroder being a good example.

TypeMethodOrderF evalsAsymptotic efficiency
HybridOrder0$\approx 1.618\dots$
Derivative FreeSecant$\varphi=1.618\dots$$1$$1.618\dots$
Derivative FreeSteffensen$2$$2$$1.414\dots$
Derivative FreeOrder5$5$$4$$1.495\dots$
Derivative FreeOrder8$8$$4$$1.681\dots$
Derivative FreeOrder16$16$$5$$1.718\dots$
ClassicalNewton$2$$2$$1.414\dots$
ClassicalHalley$3$$3$$1.442\dots$
ClassicalQuadraticInverse$3$$3$$1.442\dots$
ClassicalChebyshevLike$3$$3$$1.442\dots$
ClassicalSuperHalley$3$$3$$1.442\dots$
MultiStepLithBoonkkampIJzerman{S,D}$p^s=\sum p^k(d+\sigma_k)$$D+1$varies, $1.92\dots$ max
BracketingBisectionExact$1$$1$$1$
BracketingA42$(2 + 7^{1/2})$$3,4$$(2 + 7^{1/2})^{1/3} = 1.6686\dots$
BracketingAlefeldPotraShi$3,4$$1.618\dots$
BracketingBrent$\leq 1.89\dots$$1$$\leq 1.89\dots$
BracketingITP$\leq \varphi$$1$$\leq \varphi$
BracketingRidders$1.83\dots$$2$$1.225\dots$
BracketingFalsePosition$1.442\dots$$1$$1.442\dots$
BracketingLithBoonkkampIJzermanBracket$2.91$$3$$1.427\dots$
RobustKing$\varphi=1.618\dots$$2$$1.272\dots$
RobustEsser$2$$3$$1.259\dots$
RobustSchroder$2$$3$$1.259\dots$
RobustThukral3$3$$4$$1.316\dots$
RobustThukral4$4$$5$$1.319\dots$
RobustThukral5$5$$6$$1.307\dots$

Convergence

Identifying when an algorithm converges or diverges requires specifications of tolerances and convergence criteria.

In the case of exact bisection, convergence is mathematically guaranteed. For floating point numbers, either an exact zero is found, or the bracketing interval can be subdivided into $[a_n,b_n]$ with $a_n$ and $b_n$ being adjacent floating point values. That is $b_n-a_n$ is as small as possible in floating point numbers. This can be considered a stopping criteria in $\Delta x$. For early termination (less precision but fewer function calls) a tolerance can be given so that if $\Delta_n=b_n-a_n$ is small enough the algorithm stops successfully. In floating point, assessing if $b_n \approx a_n$ requires two tolerances: a relative tolerance, as the minimal differences in floating point values depend on the size of $b_n$ and $a_n$, and an absolute tolerance for values near $0$. The values xrtol and xatol are passed to the Base.isapprox function to determine closeness.

Relying on the closeness of two $x$ values will not be adequate for all problems, as there are examples where the difference $\Delta_n=|x_n-x_{n-1}|$ can be quite small, $0$ even, yet $f(x_n)$ is not near a $0$. As such, for non-bracketing methods, a check on the size of $f(x_n)$ is also used. As we find floating point approximations to $\alpha$, the zero, we must consider values small when $f(\alpha(1+\epsilon))$ is small. By Taylor's approximation, we can expect this to be around $\alpha\cdot \epsilon \cdot f'(\alpha)$. That is, small depends on the size of $\alpha$ and the derivative at $\alpha$. The former is handled by both relative and absolute tolerances (rtol and atol). The size of $f'(\alpha)$ is problem dependent, and can be accommodated by larger relative or absolute tolerances.

When an algorithm returns an NaN value, it terminates. This can happen near convergence or may indicate some issues. Early termination is checked for convergence in the size of $f(x_n)$ with a relaxed tolerance when strict=false is specified (the default).

Relative tolerances and assessing `f(x) ≈ 0`

The use of relative tolerances to check if $f(x) \approx 0$ can lead to spurious answers where $x$ is very large (and hence the relative tolerance is large). The return of very large solutions should be checked against expectations of the answer.

Deciding if an algorithm won't terminate is done through counting the number or iterations performed; the default adjusted through maxiters. As most algorithms are superlinear, convergence happens rapidly near the answer, but all the algorithms can take a while to get near an answer, even when progress is made. As such, the maximum must be large enough to consider linear cases, yet small enough to avoid too many steps when an algorithm is non-convergent.

Convergence criteria are method dependent and are determined by the Roots.assess_convergence methods.

Roots.assess_convergenceFunction
Roots.assess_convergence(method, state, options)

Assess if algorithm has converged.

Return a convergence flag and a Boolean indicating if algorithm has terminated (converged or not converged)

If algorithm hasn't converged this returns (:not_converged, false).

If algorithm has stopped or converged, return flag and true. Flags are:

  • :x_converged if xn1 ≈ xn, typically with non-zero tolerances specified.

  • :f_converged if |f(xn1)| < max(atol, |xn1|*rtol)

  • :nan or :inf if fxn1 is NaN or an infinity.

  • :not_converged if algorithm should continue

Does not check number of steps taken nor number of function evaluations.

In decide_convergence, stopped values (and :x_converged when strict=false) are checked for convergence with a relaxed tolerance.

source

Default tolerances are specified through the Roots.default_tolerances methods.

Roots.default_tolerancesFunction
default_tolerances(M::AbstractUnivariateZeroMethod, [T], [S])

The default tolerances for most methods are xatol=eps(T), xrtol=eps(T), atol=4eps(S), and rtol=4eps(S), with the proper units (absolute tolerances have the units of x and f(x); relative tolerances are unitless). For Complex{T} values, T is used.

The number of iterations is limited by maxiters=40.

source
default_tolerances(M::AbstractBisectionMethod, [T], [S])

For Bisection when the x values are of type Float64, Float32, or Float16, the default tolerances are zero and there is no limit on the number of iterations. In this case, the algorithm is guaranteed to converge to an exact zero, or a point where the function changes sign at one of the answer's adjacent floating point values.

For other types, default non-zero tolerances for xatol and xrtol are given.

source

Simplified versions

The abstractions and many checks for convergence employed by find_zero have a performance cost. When that is a critical concern, there are several "simple" methods provided which can offer improved performance.

Roots.secant_methodFunction
secant_method(f, xs; [atol=0.0, rtol=8eps(), maxevals=1000])

Perform secant method to solve f(x) = 0.

The secant method is an iterative method with update step given by b - fb/m where m is the slope of the secant line between (a,fa) and (b,fb).

The initial values can be specified as a pair of 2, as in (x₀, x₁) or [x₀, x₁], or as a single value, x₁ in which case a value of x₀ is chosen.

The algorithm returns m when abs(fm) <= max(atol, abs(m) * rtol). If this doesn't occur before maxevals steps or the algorithm encounters an issue, a value of NaN is returned. If too many steps are taken, the current value is checked to see if there is a sign change for neighboring floating point values.

The Order1 method for find_zero also implements the secant method. This one should be slightly faster, as there are fewer setup costs.

Examples:

Roots.secant_method(sin, (3,4))
Roots.secant_method(x -> x^5 -x - 1, 1.1)
Specialization

This function will specialize on the function f, so that the initial call can take more time than a call to the Order1() method, though subsequent calls will be much faster. Using FunctionWrappers.jl can ensure that the initial call is also equally as fast as subsequent ones.

source
Roots.bisectionFunction
bisection(f, a, b; [xatol, xrtol])

Performs bisection method to find a zero of a continuous function.

It is assumed that (a,b) is a bracket, that is, the function has different signs at a and b. The interval (a,b) is converted to floating point and shrunk when a or b is infinite. The function f may be infinite for the typical case. If f is not continuous, the algorithm may find jumping points over the x axis, not just zeros.

If non-trivial tolerances are specified, the process will terminate when the bracket (a,b) satisfies isapprox(a, b, atol=xatol, rtol=xrtol). For zero tolerances, the default, for Float64, Float32, or Float16 values, the process will terminate at a value x with f(x)=0 or f(x)*f(prevfloat(x)) < 0 or f(x) * f(nextfloat(x)) < 0. For other number types, the Roots.A42 method is used.

source
Roots.mullerFunction
muller(f, xᵢ; xatol=nothing, xrtol=nothing, maxevals=100)
muller(f, xᵢ₋₂, xᵢ₋₁, xᵢ; xatol=nothing, xrtol=nothing, maxevals=100)

Muller’s method generalizes the secant method, but uses quadratic interpolation among three points instead of linear interpolation between two. Solving for the zeros of the quadratic allows the method to find complex pairs of roots. Given three previous guesses for the root xᵢ₋₂, xᵢ₋₁, xᵢ, and the values of the polynomial f at those points, the next approximation xᵢ₊₁ is produced.

Excerpt and the algorithm taken from

W.H. Press, S.A. Teukolsky, W.T. Vetterling and B.P. Flannery Numerical Recipes in C, Cambridge University Press (2002), p. 371

Convergence here is decided by xᵢ₊₁ ≈ xᵢ using the tolerances specified, which both default to eps(one(typeof(abs(xᵢ))))^4/5 in the appropriate units. Each iteration performs three evaluations of f. The first method picks two remaining points at random in relative proximity of xᵢ.

Note that the method may return complex result even for real initial values as this depends on the function.

Examples:

muller(x->x^3-1, 0.5, 0.5im, -0.5) # → -0.500 + 0.866…im
muller(x->x^2+2, 0.0, 0.5, 1.0) # → ≈ 0.00 - 1.41…im
muller(x->(x-5)*x*(x+5), rand(3)...) # → ≈ 0.00
muller(x->x^3-1, 1.5, 1.0, 2.0) # → 2.0, Not converged
source
Roots.newtonFunction
Roots.newton(f, fp, x0; kwargs...)

Implementation of Newton's method: xᵢ₊₁ = xᵢ - f(xᵢ)/f'(xᵢ).

Arguments:

  • f::Function – function to find zero of

  • fp::Function – the derivative of f.

  • x0::Number – initial guess. For Newton's method this may be complex.

With the ForwardDiff package derivatives may be computed automatically. For example, defining D(f) = x -> ForwardDiff.derivative(f, float(x)) allows D(f) to be used for the first derivative.

Keyword arguments are passed to find_zero using the Roots.Newton() method.

See also Roots.newton((f,fp), x0) and Roots.newton(fΔf, x0) for simpler implementations.

source
Roots.dfreeFunction
dfree(f, xs)

A more robust secant method implementation

Solve for f(x) = 0 using an algorithm from Personal Calculator Has Key to Solve Any Equation f(x) = 0, the SOLVE button from the HP-34C.

This is also implemented as the Order0 method for find_zero.

The initial values can be specified as a pair of two values, as in (a,b) or [a,b], or as a single value, in which case a value of b is computed, possibly from fb. The basic idea is to follow the secant method to convergence unless:

  • a bracket is found, in which case AlefeldPotraShi is used;

  • the secant method is not converging, in which case a few steps of a quadratic method are used to see if that improves matters.

Convergence occurs when f(m) == 0, there is a sign change between m and an adjacent floating point value, or f(m) <= 2^3*eps(m).

A value of NaN is returned if the algorithm takes too many steps before identifying a zero.

Examples

Roots.dfree(x -> x^5 - x - 1, 1.0)
source

MATLAB interface

The initial naming scheme used fzero instead of find_zero, following the name of the MATLAB function fzero. This interface is not recommended, but, for now, still maintained.

Roots.fzeroFunction
fzero(f, x0; order=0; kwargs...)
fzero(f, x0, M; kwargs...)
fzero(f, x0, M, N; kwargs...)
fzero(f, x0; kwargs...)
fzero(f, a::Number, b::Number; kwargs...)
fzero(f, a::Number, b::Number; order=?, kwargs...)
fzero(f, fp, a::Number; kwargs...)

Find zero of a function using one of several iterative algorithms.

  • f: a scalar function or callable object

  • x0: an initial guess, a scalar value or tuple of two values

  • order: An integer, symbol, or string indicating the algorithm to use for find_zero. The Order0 default may be specified directly by order=0, order=:0, or order="0"; Order1() by order=1, order=:1, order="1", or order=:secant; Order1B() by order="1B", etc.

  • M: a specific method, as would be passed to find_zero, bypassing the use of the order keyword

  • N: a specific bracketing method. When given, if a bracket is identified, method N will be used to finish instead of method M.

  • a, b: When two values are passed along, if no order value is specified, Bisection will be used over the bracketing interval (a,b). If an order value is specified, the value of x0 will be set to (a,b) and the specified method will be used.

  • fp: when fp is specified (assumed to compute the derivative of f), Newton's method will be used

  • kwargs...: See find_zero for the specification of tolerances and other keyword arguments

Examples:

fzero(sin, 3)                  # use Order0() method, the default
fzero(sin, 3, order=:secant)   # use secant method (also just `order=1`)
fzero(sin, 3, Roots.Order1B()) # use secant method variant for multiple roots.
fzero(sin, 3, 4)               # use bisection method over (3,4)
fzero(sin, 3, 4, xatol=1e-6)   # use bisection method until |x_n - x_{n-1}| <= 1e-6
fzero(sin, 3, 3.1, order=1)    # use secant method with x_0=3.0, x_1 = 3.1
fzero(sin, (3, 3.1), order=2)  # use Steffensen's method with x_0=3.0, x_1 = 3.1
fzero(sin, cos, 3)             # use Newton's method
Note

Unlike find_zero, fzero does not specialize on the type of the function argument. This has the advantage of making the first use of the function f faster, but subsequent uses slower.

source
Roots.fzerosFunction
fzeros(f, a, b; kwargs...)
fzeros(f, ab; kwargs...)

Searches for all zeros of f within an interval (a,b). Assumes neither a or b is a zero.

Compatibility interface for find_zeros.

source

Tracking iterations

It is possible to add the keyword argument verbose=true when calling the find_zero function to get detailed information about the solution and data from each iteration. To save this data a Tracksobject may be passed in to tracks.


Roots.TracksType
Roots.Tracks{T,S}

A Tracks instance is used to record the progress of an algorithm. T is the type of function inputs, and S is the type of function outputs. They both default to Float64. Note that because this type is not exported, you have to write Roots.Tracks() to construct a Tracks object.

By default, no tracking is done while finding a root. To change this, construct a Tracks object, and pass it to the keyword argument tracks. This will modify the Tracks object, storing the input and function values at each iteration, along with additional information about the root-finding process.

Tracks objects are shown in an easy-to-read format. Internally either a tuple of (x,f(x)) pairs or (aₙ, bₙ) pairs are stored, the latter for bracketing methods. (These implementation details may change without notice.) The methods empty!, to reset the Tracks object; get, to get the tracks; last, to get the value converted to, may be of interest.

If you only want to print the information, but you don't need it later, this can conveniently be done by passing verbose=true to the root-finding function. This will not effect the return value, which will still be the root of the function.

Examples

julia> using Roots

julia> f(x) = x^2-2
f (generic function with 1 method)

julia> tracker = Roots.Tracks()
Algorithm has not been run

julia> find_zero(f, (0, 2), Roots.Secant(), tracks=tracker) ≈ √2
true

julia> tracker
Results of univariate zero finding:

* Converged to: 1.4142135623730947
* Algorithm: Secant()
* iterations: 7
* function evaluations ≈ 9
* stopped as |f(x_n)| ≤ max(δ, |x|⋅ϵ) using δ = atol, ϵ = rtol

Trace:
x₁ = 0,	 fx₁ = -2
x₂ = 2,	 fx₂ = 2
x₃ = 1,	 fx₃ = -1
x₄ = 1.3333333333333333,	 fx₄ = -0.22222222222222232
x₅ = 1.4285714285714286,	 fx₅ = 0.04081632653061229
x₆ = 1.4137931034482758,	 fx₆ = -0.0011890606420930094
x₇ = 1.4142114384748701,	 fx₇ = -6.0072868388605372e-06
x₈ = 1.4142135626888697,	 fx₈ = 8.9314555751229818e-10
x₉ = 1.4142135623730947,	 fx₉ = -8.8817841970012523e-16

julia> empty!(tracker)  # resets

julia> find_zero(sin, (3, 4), Roots.A42(), tracks=tracker) ≈ π
true

julia> get(tracker)
4-element Vector{NamedTuple{names, Tuple{Float64, Float64}} where names}:
 (a = 3.0, b = 4.0)
 (a = 3.0, b = 3.157162792479947)
 (a = 3.141592614491745, b = 3.1415926926910007)
 (a = 3.141592653589793, b = 3.141592653589794)

julia> last(tracker)
3.141592653589793
Note

As designed, the Tracks object may not record actions taken while the state object is initialized. An example is the default bisection algorithm where an initial midpoint is found to ensure the bracket does not straddle $0$.

source