Reference/API
The Roots package provides several different algorithms to solve f(x) = 0.
Roots.Roots
Roots.A42
Roots.AbstractThukralBMethod
Roots.AlefeldPotraShi
Roots.AllZeros
Roots.Bisection
Roots.BracketedChebyshev
Roots.BracketedHalley
Roots.BracketedSchroder
Roots.Brent
Roots.Chandrapatla
Roots.ChebyshevLike
Roots.Esser
Roots.FalsePosition
Roots.Halley
Roots.ITP
Roots.King
Roots.LithBoonkkampIJzerman
Roots.LithBoonkkampIJzermanBracket
Roots.Newton
Roots.Order0
Roots.Order1
Roots.Order16
Roots.Order1B
Roots.Order2
Roots.Order2B
Roots.Order5
Roots.Order8
Roots.QuadraticInverse
Roots.Ridders
Roots.Schroder
Roots.Secant
Roots.Steffensen
Roots.SuperHalley
Roots.Tracks
Roots.ZeroProblem
CommonSolve.solve
CommonSolve.solve!
Roots.assess_convergence
Roots.bisection
Roots.default_tolerances
Roots.dfree
Roots.find_zero
Roots.find_zeros
Roots.fzero
Roots.fzeros
Roots.muller
Roots.newton
Roots.secant_method
Roots.Roots — Module
A package for solving f(x) = 0 for univariate, scalar functions.
The basic methods are
- find_zero for using one of several methods to identify a zero
- ZeroProblem for solving for a zero using the CommonSolve interface
- find_zeros for heuristically identifying all zeros in a specified interval
Extended help
Root finding functions for Julia
This package contains simple routines for finding roots, or zeros, of scalar functions of a single real variable using floating-point math. The find_zero function provides the primary interface. The basic call is find_zero(f, x0, [M], [p]; kws...) where, typically, f is a function, x0 a starting point or bracketing interval, M is used to adjust the default algorithms used, and p can be used to pass in parameters.
The various algorithms include:
- Bisection-like algorithms. For functions where a bracketing interval is known (one where f(a) and f(b) have alternate signs), a bracketing method, like Bisection, can be specified. The default is Bisection, for most floating point number types, employed in a manner exploiting floating point storage conventions. For other number types (e.g. BigFloat), an algorithm of Alefeld, Potra, and Shi is used by default. These default methods are guaranteed to converge. Other bracketing methods are available.
- Several derivative-free algorithms. These are specified through the methods Order0, Order1 (the secant method), Order2 (the Steffensen method), Order5, Order8, and Order16. The number indicates, roughly, the order of convergence. The Order0 method is the default, and the most robust, but may take more function calls to converge, as it employs a bracketing method when possible. The higher-order methods promise faster convergence, though they don't always yield results with fewer function calls than Order1 or Order2. The methods Roots.Order1B and Roots.Order2B are superlinearly and quadratically converging methods independent of the multiplicity of the zero.
- There are historic algorithms that require a derivative or two to be specified: Roots.Newton and Roots.Halley. Roots.Schroder provides a quadratic method, like Newton's method, which is independent of the multiplicity of the zero. This is generalized by Roots.ThukralXB (with X being 2, 3, 4, or 5).
- There are several non-exported algorithms, such as Roots.Brent(), Roots.LithBoonkkampIJzermanBracket, and Roots.LithBoonkkampIJzerman.
Each method's documentation has additional detail.
Some examples:
julia> using Roots
julia> f(x) = exp(x) - x^4;
julia> α₀, α₁, α₂ = -0.8155534188089607, 1.4296118247255556, 8.6131694564414;
julia> find_zero(f, (8,9), Bisection()) ≈ α₂ # a bisection method has the bracket specified
true
julia> find_zero(f, (-10, 0)) ≈ α₀ # Bisection is default if x in `find_zero(f, x)` is not scalar
true
julia> find_zero(f, (-10, 0), Roots.A42()) ≈ α₀ # fewer function evaluations than Bisection
true
For non-bracketing methods, the initial position is passed in as a scalar, or, possibly, for secant-like methods, an iterable like (x_0, x_1):
julia> find_zero(f, 3) ≈ α₁ # find_zero(f, x0::Number) will use Order0()
true
julia> find_zero(f, 3, Order1()) ≈ α₁ # same answer, different method (secant)
true
julia> find_zero(f, (3, 2), Order1()) ≈ α₁ # start secant method with (3, f(3)), (2, f(2))
true
julia> find_zero(sin, BigFloat(3.0), Order16()) ≈ π # 2 iterations, vs. 6 using Order1()
true
The find_zero function can be used with callable objects:
julia> using Polynomials;
julia> x = variable();
julia> find_zero(x^5 - x - 1, 1.0) ≈ 1.1673039782614187
true
The function should respect the units of the Unitful package:
julia> using Unitful
julia> s, m = u"s", u"m";
julia> g, v₀, y₀ = 9.8*m/s^2, 10m/s, 16m;
julia> y(t) = -g*t^2 + v₀*t + y₀
y (generic function with 1 method)
julia> find_zero(y, 1s) ≈ 1.886053370668014s
true
Newton's method can be used without taking derivatives by hand. The following examples use the ForwardDiff package:
julia> using ForwardDiff
julia> D(f) = x -> ForwardDiff.derivative(f,float(x))
D (generic function with 1 method)
Now we have:
julia> f(x) = x^3 - 2x - 5
f (generic function with 1 method)
julia> x0 = 2
2
julia> find_zero((f, D(f)), x0, Roots.Newton()) ≈ 2.0945514815423265
true
Automatic derivatives allow for easy solutions to finding critical points of a function.
julia> using Statistics: mean, median
julia> as = rand(5);
julia> M(x) = sum((x-a)^2 for a in as)
M (generic function with 1 method)
julia> find_zero(D(M), .5) ≈ mean(as)
true
julia> med(x) = sum(abs(x-a) for a in as)
med (generic function with 1 method)
julia> find_zero(D(med), (0, 1)) ≈ median(as)
true
The CommonSolve interface
The DifferentialEquations interface of setting up a problem, initializing the problem, then solving the problem is also implemented using the type ZeroProblem and the methods init, solve!, and solve (from CommonSolve).
For example, we can solve a problem with many different methods, as follows:
julia> f(x) = exp(-x) - x^3
f (generic function with 1 method)
julia> x0 = 2.0
2.0
julia> fx = ZeroProblem(f, x0)
ZeroProblem{typeof(f), Float64}(f, 2.0)
julia> solve(fx) ≈ 0.7728829591492101
true
With no method specified, and a single initial point given, the default Order1 method is used. The solve method allows other root-solving methods to be passed, along with other options. For example, to use the Order2 method with a convergence criterion (see below) that |xₙ - xₙ₋₁| ≤ δ, we could make this call:
julia> solve(fx, Order2(); atol=0.0, rtol=0.0) ≈ 0.7728829591492101
true
Unlike find_zero, which errors on non-convergence, solve returns NaN on non-convergence.
This next example has a zero at 0.0, but for most initial values the iterates will escape towards ±∞, sometimes causing a relative tolerance to return a misleading value. Here we can see the differences:
julia> f(x) = cbrt(x) * exp(-x^2)
f (generic function with 1 method)
julia> x0 = 0.1147
0.1147
julia> find_zero(f, x0, Roots.Order5()) ≈ 5.936596662527689 # stopped as |f(xₙ)| ≤ |xₙ|ϵ
true
julia> find_zero(f, x0, Roots.Order1(), atol=0.0, rtol=0.0) # error as no check on `|f(xn)|`
ERROR: Roots.ConvergenceFailed("Algorithm failed to converge")
[...]
julia> fx = ZeroProblem(f, x0);
julia> solve(fx, Roots.Order1(), atol=0.0, rtol=0.0) # NaN, not an error
NaN
julia> fx = ZeroProblem((f, D(f)), x0); # higher order methods can identify zero of this function
julia> solve(fx, Roots.LithBoonkkampIJzerman(2,1), atol=0.0, rtol=0.0)
0.0
Functions may be parameterized, as illustrated:
julia> f(x, p=2) = cos(x) - x/p
f (generic function with 2 methods)
julia> Z = ZeroProblem(f, pi/4)
ZeroProblem{typeof(f), Float64}(f, 0.7853981633974483)
julia> solve(Z, Order1()) ≈ 1.0298665293222586 # use p=2 default
true
julia> solve(Z, Order1(), p=3) ≈ 1.170120950002626 # use p=3
true
julia> solve(Z, Order1(), 4) ≈ 1.2523532340025887 # by position, uses p=4
true
Multiple zeros
The find_zeros function can be used to search for all zeros in a specified interval. The basic algorithm essentially splits the interval into many subintervals. For each, if there is a bracket, a bracketing algorithm is used to identify a zero; otherwise a derivative-free method is used to search for zeros. This heuristic algorithm can miss zeros for various reasons, so the results should be confirmed by other means.
julia> f(x) = exp(x) - x^4
f (generic function with 2 methods)
julia> find_zeros(f, -10,10) ≈ [α₀, α₁, α₂] # from above
true
The interval can also be specified using a structure with extrema defined, where extrema returns two different values:
julia> using IntervalSets
julia> find_zeros(f, -10..10) ≈ [α₀, α₁, α₂]
true
(For tougher problems, the IntervalRootFinding package gives guaranteed results, rather than the heuristically identified values returned by find_zeros.)
Convergence
For most algorithms, convergence is decided when
- the value |f(x_n)| <= tol with tol = max(atol, abs(x_n)*rtol), or
- the values x_n ≈ x_{n-1} with tolerances xatol and xrtol and f(x_n) ≈ 0 with a relaxed tolerance based on atol and rtol.
The find_zero algorithm stops if
- it encounters an NaN or an Inf, or
- the number of iterations exceeds maxevals.
If the algorithm stops and the relaxed convergence criterion is met, the suspected zero is returned. Otherwise an error is thrown indicating no convergence. To adjust the tolerances, find_zero accepts keyword arguments atol, rtol, xatol, and xrtol, as seen in some examples above.
The Bisection and Roots.A42 methods are guaranteed to converge even if the tolerances are set to zero, so these are the defaults. Non-zero values for xatol and xrtol can be specified to reduce the number of function calls when lower precision is required.
julia> fx = ZeroProblem(sin, (3,4));
julia> solve(fx, Bisection(); xatol=1/16)
3.125
An alternate interface
This functionality is provided by the fzero function, familiar to MATLAB users. Roots also provides this alternative interface:
- fzero(f, x0::Real; order=0) calls a derivative-free method, with the order specifying one of Order0, Order1, etc.
- fzero(f, a::Real, b::Real) calls the find_zero algorithm with the Bisection method.
- fzeros(f, a::Real, b::Real) will call find_zeros.
Usage examples
julia> f(x) = exp(x) - x^4
f (generic function with 2 methods)
julia> fzero(f, 8, 9) ≈ α₂ # bracketing
true
julia> fzero(f, -10, 0) ≈ α₀
true
julia> fzeros(f, -10, 10) ≈ [α₀, α₁, α₂]
true
julia> fzero(f, 3) ≈ α₁ # default is Order0()
true
julia> fzero(sin, big(3), order=16) ≈ π # uses higher order method
true
The find_zero and find_zeros functions
There are two main functions: find_zero to identify a zero of $f$ given some initial starting value or bracketing interval, and find_zeros to heuristically identify all zeros in a specified interval.
Roots.find_zero — Function
find_zero(f, x0, M, [N::AbstractBracketingMethod], [p=nothing]; kwargs...)
Interface to one of several methods for finding zeros of a univariate function, e.g. solving $f(x)=0$.
Arguments
Positional arguments
- f: the function (univariate or f(x,p) with p holding parameters)
- x0: the initial condition (a value, initial values, or bracketing interval)
- M: some AbstractUnivariateZeroMethod specifying the solver
- N: some bracketing method; when specified, creates a hybrid method
- p: for specifying a parameter to f. Also can be a keyword, but a positional argument is helpful with broadcasting.
Keyword arguments
- xatol, xrtol: absolute and relative tolerance to decide if xₙ₊₁ ≈ xₙ
- atol, rtol: absolute and relative tolerance to decide if f(xₙ) ≈ 0
- maxiters: specify the maximum number of iterations the algorithm can take
- verbose::Bool: specifies if details about the algorithm should be shown
- tracks: allows specification of a Tracks object
Extended help
Initial starting value
For most methods, x0 is a scalar value indicating the initial value in the iterative procedure. (Secant methods can have a tuple specify their initial values.) Values must be a subtype of Number and have methods for float, real, and oneunit defined.
For bracketing intervals, x0 is specified using a tuple, a vector, or any iterable with extrema defined. A bracketing interval, $[a,b]$, is one where $f(a)$ and $f(b)$ have different signs.
Return value
If the algorithm succeeds, the approximate root identified is returned. A ConvergenceFailed error is thrown if the algorithm fails. The alternate form solve(ZeroProblem(f,x0), M) returns NaN in case of failure.
Specifying a method
A method is specified to indicate which algorithm to employ:
- There are methods where a bracket is specified: Bisection, A42, AlefeldPotraShi, Roots.Brent, among others. Bisection is the default for basic floating point types, but A42 generally requires far fewer iterations.
- There are several derivative-free methods: cf. Order0, Order1 (also Roots.Secant), Order2 (also Steffensen), Order5, Order8, and Order16, where the number indicates the order of the convergence.
- There are some classical methods where derivatives need specification: Roots.Newton, Roots.Halley, Roots.Schroder, among others.
- Methods intended for problems with multiplicities include Roots.Order1B, Roots.Order2B, and Roots.ThukralXB for different Xs.
- The family Roots.LithBoonkkampIJzerman{S,D}, for different S and D, uses a linear multistep method root finder. The (2,0) method is the secant method, (1,1) is Newton's method.
For more detail, see the help page for each method (e.g., ?Order1). Non-exported methods must be qualified with the module name, as in ?Roots.Schroder.
If no method is specified, the default method depends on x0:
- If x0 is a scalar, the default is the more robust Order0 method.
- If x0 is a tuple, vector, or iterable with extrema defined indicating a bracketing interval, then the Bisection method is used for Float64, Float32, or Float16 types; otherwise the A42 method is used.
The default methods are chosen to be robust; they may not be as efficient as some others.
Specifying the function
The function(s) are passed as the first argument.
For the few methods that use one or more derivatives (Newton, Halley, Schroder, LithBoonkkampIJzerman(S,D), etc.) a tuple of functions is used. For the classical algorithms, a function returning (f(x), f(x)/f'(x), [f'(x)/f''(x)]) may be used.
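For instance, mirroring the Newton examples later on this page, either form may be passed; here sin(x)/cos(x) plays the role of f/f':
julia> using Roots
julia> find_zero((sin, cos), 3.0, Roots.Newton()) ≈ π                      # a tuple (f, f')
true
julia> find_zero(x -> (sin(x), sin(x)/cos(x)), 3.0, Roots.Newton()) ≈ π    # a function returning (f, f/f')
true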
Optional arguments (tolerances, limit evaluations, tracing)
- xatol - absolute tolerance for x values.
- xrtol - relative tolerance for x values.
- atol - absolute tolerance for f(x) values.
- rtol - relative tolerance for f(x) values.
- maxiters - limit on maximum number of iterations.
- strict - if false (the default), when the algorithm stops, possible zeros are checked with a relaxed tolerance.
- verbose - if true a trace of the algorithm will be shown on successful completion. See the internal Roots.Tracks object to save this trace.
See the help string for Roots.assess_convergence for details on convergence. See the help page for Roots.default_tolerances(method) for details on the default tolerances.
In general, with floating point numbers, convergence must be understood as not an absolute statement. Even if mathematically α is an answer and xstar the floating point realization, it may be that f(xstar) - f(α) ≈ xstar ⋅ f'(α) ⋅ eps(α), so the role of tolerances must be appreciated, and at times specified.
For the Bisection methods, convergence is guaranteed over Float64 values, so the tolerances are set to be $0$ by default.
If a bracketing method is passed in after the method specification, then whenever a bracket is identified during the algorithm, the method will switch to the bracketing method to identify the zero. (Bracketing methods are mathematically guaranteed to converge; non-bracketing methods may or may not converge.) This is what Order0 does by default, with an initial secant method switching to the AlefeldPotraShi method should a bracket be encountered.
Note: The order of the method is hinted at in the naming scheme. A scheme is order r if, with eᵢ = xᵢ - α, eᵢ₊₁ = C⋅eᵢʳ. If the error eᵢ is small enough, then essentially the error will gain r times as many leading zeros each step. However, if the error is not small, this will not be the case. Without good initial guesses, a high order method may still converge slowly, if at all. The OrderN methods have some heuristics employed to ensure a wider range for convergence, at the cost of not faithfully implementing the method, though the faithful versions are available through unexported methods.
Examples:
Default methods.
julia> using Roots
julia> find_zero(sin, 3) # use Order0()
3.141592653589793
julia> find_zero(sin, (3,4)) # use Bisection()
3.141592653589793
Specifying a method,
julia> find_zero(sin, (3,4), Order1()) # can specify two starting points for secant method
3.141592653589793
julia> find_zero(sin, 3.0, Order2()) # Use Steffensen method
3.1415926535897936
julia> find_zero(sin, big(3.0), Order16()) # rapid convergence
3.141592653589793238462643383279502884197169399375105820974944592307816406286198
julia> find_zero(sin, (3, 4), A42()) # fewer function calls than Bisection(), in this case
3.141592653589793
julia> find_zero(sin, (3, 4), FalsePosition(8)) # 1 of 12 possible algorithms for false position
3.141592653589793
julia> find_zero((sin,cos), 3.0, Roots.Newton()) # use Newton's method
3.141592653589793
julia> find_zero((sin, cos, x->-sin(x)), 3.0, Roots.Halley()) # use Halley's method
3.141592653589793
Changing tolerances.
julia> fn = x -> (2x*cos(x) + x^2 - 3)^10/(x^2 + 1);
julia> x0, xstar = 3.0, 2.9947567209477;
julia> fn(find_zero(fn, x0, Order2())) <= 1e-14 # f(xₙ) ≈ 0, but Δxₙ can be largish
true
julia> find_zero(fn, x0, Order2(), atol=0.0, rtol=0.0) # error: x_n ≉ x_{n-1}; just f(x_n) ≈ 0
ERROR: Roots.ConvergenceFailed("Algorithm failed to converge")
[...]
julia> fn = x -> (sin(x)*cos(x) - x^3 + 1)^9;
julia> x0, xstar = 1.0, 1.112243913023029;
julia> isapprox(find_zero(fn, x0, Order2()), xstar; atol=1e-4)
true
julia> find_zero(fn, x0, Order2(), maxiters=3) # need more steps to converge
ERROR: Roots.ConvergenceFailed("Algorithm failed to converge")
[...]
Tracing
Passing verbose=true will show details on the steps of the algorithm. The tracks argument allows the passing of a Roots.Tracks object to record the values of x and f(x) used in the algorithm.
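A minimal sketch of recording a trace, assuming the zero-argument constructor Roots.Tracks() (the variable name tracker is illustrative only):
using Roots

tracker = Roots.Tracks()                     # an (assumed) empty Tracks object to log iterates
find_zero(sin, 3, Order1(); tracks=tracker)  # ≈ π
tracker                                      # display the recorded xₙ and f(xₙ) values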
See solve! and ZeroProblem for an alternate interface.
Roots.find_zeros — Function
find_zeros(f, a, [b]; [no_pts=12, k=8, naive=false, xatol, xrtol, atol, rtol])
Search for zeros of f in the interval [a,b] with a heuristic algorithm.
- f: a function or callable object
- a, b: if b is specified, the interval $[a,b]$ is used. If only a is specified, it is passed to extrema to define the interval to search over. It is assumed that neither endpoint is a zero.
Returns a vector of zeros in sorted order, possibly empty.
Extended help
Examples
julia> using Roots
julia> find_zeros(x -> exp(x) - x^4, -5, 20) # a few well-spaced zeros
3-element Vector{Float64}:
-0.8155534188089606
1.4296118247255556
8.613169456441398
julia> find_zeros(x -> sin(x^2) + cos(x)^2, 0, 2pi) # many zeros
12-element Vector{Float64}:
1.78518032659534
2.391345462376604
3.2852368649448853
3.3625557095737544
4.016412952618305
4.325091924521049
4.68952781386834
5.00494459113514
5.35145266881871
5.552319796014526
5.974560835055425
6.039177477770888
julia> find_zeros(x -> cos(x) + cos(2x), (0, 4pi)) # mix of simple, non-simple zeros
6-element Vector{Float64}:
1.0471975511965976
3.141592653589943
5.235987755982988
7.330382858376184
9.424777960769228
11.519173063162574
julia> f(x) = (x-0.5) * (x-0.5001) * (x-1) # nearby zeros
f (generic function with 1 method)
julia> find_zeros(f, 0, 2)
3-element Vector{Float64}:
0.5
0.5001
1.0
julia> f(x) = (x-0.5) * (x-0.5001) * (x-4) * (x-4.001) * (x-4.2)
f (generic function with 1 method)
julia> find_zeros(f, 0, 10)
3-element Vector{Float64}:
0.5
0.5001
4.2
julia> f(x) = (x-0.5)^2 * (x-0.5001)^3 * (x-4) * (x-4.001) * (x-4.2)^2 # hard to identify
f (generic function with 1 method)
julia> find_zeros(f, 0, 10, no_pts=21) # too hard for default
5-element Vector{Float64}:
0.49999999999999994
0.5001
4.0
4.001
4.200000000000001
Some cases where the number of zeros may be underreported:
- if the initial interval, (a,b), is too wide
- if there are zeros that are very nearby
- if the function is flat, e.g., x->0.
The basic algorithm checks for zeros among the endpoints, divides the interval (a,b) into no_pts-1 subintervals, and then proceeds to look for zeros through bisection or a derivative-free method. As checking for a bracketing interval is relatively cheap and bisection is guaranteed to converge, each interval has k pairs of intermediate points checked for a bracket.
If any zeros are found, the algorithm uses these to partition (a,b) into subintervals. Each subinterval is shrunk so that the endpoints are not zeros and the process is repeated on the subinterval. If the initial interval is too large, then the naive scanning for zeros may be fruitless and no zeros will be reported. If there are nearby zeros, the shrinking of the interval may jump over them, though as seen in the examples, nearby roots can be identified correctly; for really nearby points, or very flat functions, it may help to increase no_pts.
The tolerances are used to shrink the intervals, but not to find zeros within a search. For searches, bisection is guaranteed to converge with no specified tolerance. For the derivative-free search, a modification of the Order0 method is used, which at worst case compares |f(x)| <= 8*eps(x) to identify a zero. The algorithm might identify more than one value for a zero, due to floating point approximations. If a potential pair of zeros satisfy isapprox(a, b, atol=sqrt(xatol), rtol=sqrt(xrtol)) then they are consolidated.
The algorithm can make many function calls. When zeros are found in an interval, the naive search is carried out on each subinterval. To cut down on function calls, though with some increased chance of missing some zeros, the adaptive nature can be skipped with the argument naive=true or the number of points stepped down.
The algorithm is derived from one in a PR by @djsegal.
The IntervalRootFinding package provides a rigorous alternative to this heuristic one. That package uses interval arithmetic, so it can compute bounds on the size of the image of an interval under f. If this image includes 0, then it can look for the zero. Bisection, on the other hand, will only look for a zero if the two endpoints have different signs, a much more rigid condition for a potential zero.
As of version 1.9, an extension is provided so that when the IntervalRootFinding package is loaded, the find_zeros function will call IntervalRootFinding.roots to find the isolating brackets and find_zero to find the roots, when possible, if the interval is specified as an Interval object, as created by -1..1, say.
For example, this function (due to @truculentmath) is particularly tricky, as it is positive at every floating point number, but has two zeros (the vertical asymptote at 15//11 is only negative within adjacent floating point values):
julia> using IntervalArithmetic, IntervalRootFinding, Roots
julia> g(x) = x^2 + 1 +log(abs( 11*x-15 ))/99
g (generic function with 1 method)
julia> find_zeros(g, -3, 3)
Float64[]
julia> IntervalRootFinding.roots(g, -3..3, IntervalRootFinding.Bisection)
1-element Vector{Root{Interval{Float64}}}:
Root([1.36363, 1.36364], :unknown)
A less extreme usage might be the following, where :unique indicates Bisection could be useful and indeed find_zeros will identify these values:
julia> g(x) = exp(x) - x^5
g (generic function with 1 method)
julia> rts = IntervalRootFinding.roots(g, -20..20)
2-element Vector{Root{Interval{Float64}}}:
Root([12.7132, 12.7133], :unique)
Root([1.29585, 1.29586], :unique)
julia> find_zeros(g, -20, 20)
2-element Vector{Float64}:
1.2958555090953687
12.713206788867632
CommonSolve interface
The problem-algorithm-solve interface is a pattern popularized in Julia by the DifferentialEquations.jl suite of packages. This can be used as an alternative to find_zero. Unlike find_zero, solve will return NaN on non-convergence.
CommonSolve.solve! — Function
solve!(P::ZeroProblemIterator)
solve(fx::ZeroProblem, [M], [N]; p=nothing, kwargs...)
init(fx::ZeroProblem, [M], [N]; p=nothing, verbose=false, tracks=NullTracks(), kwargs...)
Solve for the zero of a scalar-valued univariate function specified through ZeroProblem or ZeroProblemIterator using the CommonSolve interface.
The defaults for M and N depend on the ZeroProblem: if x0 is a number, then M=Secant() and N=AlefeldPotraShi() are used (this is Order0); if x0 has 2 (or more) values then it is assumed to be a bracketing interval and M=AlefeldPotraShi() is used.
The methods involved with this interface are:
- ZeroProblem: used to specify a problem with a function (or functions) and an initial guess
- solve: to solve for a zero in a ZeroProblem
The latter calls the following, which can be useful independently:
- init: to initialize an iterator with a method for solution, any adjustments to the default tolerances, and a specification to log the steps or not.
- solve!: to iterate to convergence.
Returns NaN, not an error like find_zero, when the problem cannot be solved. Tested for zero allocations.
Examples:
julia> using Roots
julia> fx = ZeroProblem(sin, 3)
ZeroProblem{typeof(sin), Int64}(sin, 3)
julia> solve(fx)
3.141592653589793
Or, if the iterable is required
julia> problem = init(fx);
julia> solve!(problem)
3.141592653589793
Keyword arguments can be used to adjust the default tolerances.
julia> solve(fx, Order5(); atol=1/100)
3.1425464815525403
The above is equivalent to:
julia> problem = init(fx, Order5(), atol=1/100);
julia> solve!(problem)
3.1425464815525403
The keyword argument p may be used if the function(s) to be solved depend on a parameter in their second positional argument (e.g., f(x, p)). For example,
julia> f(x,p) = exp(-x) - p # to solve p = exp(-x)
f (generic function with 1 method)
julia> fx = ZeroProblem(f, 1)
ZeroProblem{typeof(f), Int64}(f, 1)
julia> solve(fx; p=1/2) # log(2)
0.6931471805599453
This would be recommended, as there is no recompilation due to the function changing. For use with broadcasting, p may also be the last positional argument.
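A minimal sketch of that broadcasting usage, wrapping the problem and method in Ref so they act as scalars under broadcasting (the exact solutions of exp(-x) = p are log(1/p)):
using Roots

f(x, p) = exp(-x) - p
Z = ZeroProblem(f, 1)
solve.(Ref(Z), Ref(Order1()), (1/4, 1/2, 3/4))   # ≈ (log(4), log(2), log(4/3))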
The argument verbose=true for init instructs that the steps be logged.
The iterator interface allows for the creation of hybrid solutions, such as is used when two methods are passed to solve. For example, this is essentially how the hybrid default is constructed:
julia> function order0(f, x)
fx = ZeroProblem(f, x)
p = init(fx, Roots.Secant())
xᵢ,st = ϕ = iterate(p)
while ϕ !== nothing
xᵢ, st = ϕ
state, ctr = st
fᵢ₋₁, fᵢ = state.fxn0, state.fxn1
if sign(fᵢ₋₁)*sign(fᵢ) < 0 # check for bracket
x0 = (state.xn0, state.xn1)
fx′ = ZeroProblem(f, x0)
p = init(fx′, Bisection())
xᵢ = solve!(p)
break
end
ϕ = iterate(p, st)
end
xᵢ
end
order0 (generic function with 1 method)
julia> order0(sin, 3)
3.141592653589793
CommonSolve.solve — Function
solve(fx::ZeroProblem, args...; verbose=false, kwargs...)
Dispatches to solve!(init(fx, args...; kwargs...)). See solve! for details.
Roots.ZeroProblem — Type
ZeroProblem{F,X}
A container for a function and initial guess to be used with solve.
Classical methods based on derivatives
We begin by describing the classical methods, even though they are not necessarily recommended (they require more work of the user), as they give insight into why there are a variety of methods available.
The classical methods of Newton and Halley utilize information about the function and its derivative(s) in an iterative manner to converge to a zero of $f(x)$ given an initial starting value.
Newton's method is easily described:
From an initial point, the next point in the iterative algorithm is found by identifying the intersection of the $x$ axis with the tangent line of $f$ at the initial point. This is repeated until convergence or the realization that convergence won't happen for the initial point. Mathematically,
$x_{n+1} = x_{n} - f(x_n)/f'(x_n).$
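As a concrete sketch of this update rule (the helper name newton_sketch is illustrative only; Roots.Newton() adds convergence checks and tolerances on top of this bare iteration):
function newton_sketch(f, f′, x; steps=5)
    for _ in 1:steps
        x -= f(x) / f′(x)      # xₙ₊₁ = xₙ - f(xₙ)/f′(xₙ)
    end
    x
end

newton_sketch(x -> x^3 - 2x - 5, x -> 3x^2 - 2, 2.0)  # ≈ 2.0945514815423265, as in the example above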
Some facts are helpful to understand the different methods available in Roots:
- For Newton's method there is a formula for the error: set $\epsilon_n = \alpha - x_n$, where $\alpha$ is the zero; then $\epsilon_{n+1} = -f''(\xi_n)/(2f'(\xi_n)) \cdot \epsilon_n^2$, where $\xi_n$ is some value between $\alpha$ and $x_n$.
- The error term, when of the form $|\epsilon_{n+1}| \leq C\cdot|\epsilon_n|^2$, can be used to identify an interval around $\alpha$ for which convergence is guaranteed. Such convergence is termed quadratic (order 2). For floating point solutions, quadratic convergence and a well chosen initial point can lead to convergence in 4 or 5 iterations. In general, convergence is termed order $q$ when $|\epsilon_{n+1}| \approx C\cdot|\epsilon_n|^q$.
- The term $-f''(\xi_n)/(2f'(\xi_n))$ indicates possible issues when $f''$ is too big near $\alpha$ or $f'$ is too small near $\alpha$. In particular, if $f'(\alpha) = 0$ there need not be quadratic convergence, and convergence can take many iterations. A zero for which $f(x) = (x-\alpha)^{1+\beta}\cdot g(x)$, with $g(\alpha) \neq 0$, is called simple when $\beta=0$ and non-simple when $\beta > 0$. Newton's method is quadratic near simple zeros and need not be quadratic near non-simple zeros.
- As well, if $f''$ is too big near $\alpha$, or $f'$ too small near $\alpha$, or $x_n$ too far from $\alpha$ (that is, $|\epsilon_n|>1$), the error might actually increase and convergence is not guaranteed.
- The explicit form of the error function can be used to guarantee convergence for functions with a certain shape (monotonic, convex functions where the sign of $f''$ and $f'$ don't change). Quadratic convergence may only occur once the algorithm is near the zero.
- The number of function evaluations per step for Newton's method is 2.
Roots.Newton — Type
Roots.Newton()
Implements Newton's method: xᵢ₊₁ = xᵢ - f(xᵢ)/f'(xᵢ). This is a quadratically convergent method requiring one derivative and two function calls per step.
Examples
julia> using Roots
julia> find_zero((sin,cos), 3.0, Roots.Newton()) ≈ π
true
If function evaluations are expensive one can pass in a function which returns (f, f/f') as follows
julia> find_zero(x -> (sin(x), sin(x)/cos(x)), 3.0, Roots.Newton()) ≈ π
true
This can be advantageous if the derivative is easily computed from the value of f, but otherwise would be expensive to compute.
The error, eᵢ = xᵢ - α, can be expressed as eᵢ₊₁ = f[xᵢ,xᵢ,α]/(2f[xᵢ,xᵢ])⋅eᵢ² (Sidi, Unified treatment of regula falsi, Newton-Raphson, secant, and Steffensen methods for nonlinear equations).
Roots.Halley — Type
Roots.Halley()
Implements Halley's method: xᵢ₊₁ = xᵢ - (f/f')(xᵢ) * (1 - (f/f')(xᵢ) * (f''/f')(xᵢ) * 1/2)^(-1). This method is cubically converging; it requires $3$ function calls per step. Halley's method finds xₙ₊₁ as the zero of a hyperbola at the point (xₙ, f(xₙ)) matching the first and second derivatives of f.
The function, its derivative and second derivative can be passed either as a tuple of $3$ functions or as a function returning values for $(f, f/f', f'/f'')$, which could be useful when function evaluations are expensive.
Examples
julia> using Roots
julia> find_zero((sin, cos, x->-sin(x)), 3.0, Roots.Halley()) ≈ π
true
julia> function f(x)
s,c = sincos(x)
(s, s/c, -c/s)
end;
julia> find_zero(f, 3.0, Roots.Halley()) ≈ π
true
This can be advantageous if the derivatives are easily computed from the computation for f, but otherwise would be expensive to compute separately.
The error, eᵢ = xᵢ - α, satisfies eᵢ₊₁ ≈ -(2f'⋅f''' - 3⋅(f'')²)/(12⋅(f')²) ⋅ eᵢ³ (all evaluated at α).
Roots.QuadraticInverse — Type
Roots.QuadraticInverse()
Implements the quadratic inverse method, also known as Chebyshev's method: xᵢ₊₁ = xᵢ - (f/f')(xᵢ) * (1 + (f/f')(xᵢ) * (f''/f')(xᵢ) * 1/2). This method is cubically converging; it requires $3$ function calls per step.
Example
julia> using Roots
julia> find_zero((sin, cos, x->-sin(x)), 3.0, Roots.QuadraticInverse()) ≈ π
true
If function evaluations are expensive one can pass in a function which returns (f, f/f', f'/f'') as follows
julia> find_zero(x -> (sin(x), sin(x)/cos(x), -cos(x)/sin(x)), 3.0, Roots.QuadraticInverse()) ≈ π
true
This can be advantageous if the derivatives are easily computed from the computation for f, but otherwise would be expensive to compute separately.
The error, eᵢ = xᵢ - α, satisfies eᵢ₊₁ ≈ (1/2⋅(f''/f')² - 1/6⋅f'''/f') ⋅ eᵢ³ (all evaluated at α).
Roots.ChebyshevLike — Type
Chebyshev-like methods and quadratic equations (J. A. Ezquerro, J. M. Gutiérrez, M. A. Hernández and M. A. Salanova)
Roots.SuperHalley — Type
An acceleration of Newton's method: Super-Halley method (J.M. Gutierrez, M.A. Hernandez)
Newton's and Halley's methods are members of this family of methods:
Roots.LithBoonkkampIJzerman — Type
LithBoonkkampIJzerman{S,D} <: AbstractNewtonLikeMethod
LithBoonkkampIJzerman(S,D)
A family of different methods that includes the secant method and Newton's method.
Specifies a linear multistep solver with S steps and D derivatives following Lith, Boonkkamp, and IJzerman.
Extended help
Examples
julia> using Roots
julia> find_zero(sin, 3, Roots.LithBoonkkampIJzerman(2,0)) ≈ π # the secant method
true
julia> find_zero((sin,cos), 3, Roots.LithBoonkkampIJzerman(1,1)) ≈ π # Newton's method
true
julia> find_zero((sin,cos), 3, Roots.LithBoonkkampIJzerman(3,1)) ≈ π # Faster convergence rate
true
julia> find_zero((sin,cos, x->-sin(x)), 3, Roots.LithBoonkkampIJzerman(1,2)) ≈ π # Halley-like method
true
The method can be more robust to the initial condition. This example is from the paper (p. 13). Newton's method (the S=1, D=1 case) fails if |x₀| ≥ 1.089, but methods with more memory succeed.
julia> fx = ZeroProblem((tanh,x->sech(x)^2), 1.239); # zero at 0.0
julia> solve(fx, Roots.LithBoonkkampIJzerman(1,1)) |> isnan  # Newton, NaN
true
julia> solve(fx, Roots.LithBoonkkampIJzerman(2,1)) |> abs |> <(eps())
true
julia> solve(fx, Roots.LithBoonkkampIJzerman(3,1)) |> abs |> <(eps())
true
Multiple derivatives can be constructed automatically using automatic differentiation. For example,
julia> using ForwardDiff
julia> function δ(f, n::Int=1)
n <= 0 && return f
n == 1 && return x -> ForwardDiff.derivative(f,float(x))
δ(δ(f,1),n-1)
end;
julia> fs(f,n) = ntuple(i -> δ(f,i-1), Val(n+1));
julia> f(x) = cbrt(x) * exp(-x^2); # cf. Table 6 in paper, α = 0
julia> fx = ZeroProblem(fs(f,1), 0.1147);
julia> opts = (xatol=2eps(), xrtol=0.0, atol=0.0, rtol=0.0); # converge if |xₙ - xₙ₋₁| <= 2ϵ
julia> solve(fx, Roots.LithBoonkkampIJzerman(1, 1); opts...) |> isnan # NaN -- no convergence
true
julia> solve(fx, Roots.LithBoonkkampIJzerman(2, 1); opts...) |> abs |> <(eps()) # converges
true
julia> fx = ZeroProblem(fs(f,2), 0.06); # need better starting point
julia> solve(fx, Roots.LithBoonkkampIJzerman(2, 2); opts...) |> abs |> <(eps()) # converges
true
For the case D=1, a bracketing method based on this approach is implemented in LithBoonkkampIJzermanBracket.
Reference
In Lith, Boonkkamp, and IJzerman an analysis is given of the convergence rates when using linear multistep methods to solve 0=f(x) using f⁻¹(0) = x when f is a sufficiently smooth linear function. The reformulation, attributed to Grau-Sanchez, finds a differential equation for f⁻¹: dx/dy = [f⁻¹]′(y) = 1/f′(x) = F, so that x(0) = x₀ + ∫⁰_y₀ F(x(y)) dy.
A linear multistep method is used to solve this equation numerically. Let S be the number of memory steps (S = 1, 2, ...) and D be the number of derivatives employed. Then, with F(x) = dx/dy, x_{n+S} = ∑_{k=0}^{S-1} aₖ x_{n+k} + ∑_{d=1}^{D} ∑_{k=1}^{S-1} aᵈ_{n+k} F⁽ᵈ⁾(x_{n+k}). The aₖs and aᵈₖs are computed each step.
This table is from Tables 1 and 3 of the paper and gives the convergence rate for simple roots identified therein:
s: number of steps remembered
d: number of derivatives used
s/d 0 1 2 3 4
1 . 2 3 4 5
2 1.62 2.73 3.79 4.82 5.85
3 1.84 2.91 3.95 4.97 5.98
4 1.92 2.97 3.99 4.99 5.996
5 1.97 . . . .
That is, more memory leads to a higher convergence rate; more derivatives lead to a higher convergence rate. However, the interval about α, the zero, where the convergence rate is guaranteed may get smaller.
For the larger values of S, the expressions to compute the next value get quite involved. The higher convergence rate is likely only to be of help for finding solutions to high precision.
Derivative free methods
The secant method replaces the derivative term in Newton's method with the slope of a secant line using two prior values:
$x_{n+1} = x_n - (\frac{f(x_n)-f(x_{n-1})}{x_n - x_{n-1}})^{-1}\cdot f(x_n).$
Though the secant method has convergence rate of order $\approx 1.618$ – i.e., is not quadratic – it only requires one new function call per step, so it can be very effective. Often function evaluations are the slowest part of the computation and, as well, no derivative is needed. Because it can be very efficient, the secant method is used in the default method of find_zero when called with a single initial starting point.
Steffensen's method is a quadratically converging, derivative-free method which uses a secant line based on $x_n$ and $x_n + f(x_n)$. Though of higher order, it requires additional function calls per step and depends on a good initial starting value. Other derivative-free methods are available, trading off increased function calls for higher-order convergence. They may be of interest when arbitrary precision is needed. A measure of efficiency is $q^{1/r}$ where $q$ is the order of convergence and $r$ the number of function calls per step. With this measure, the secant method would be $\approx (1.618)^{1/1}$ and Steffensen's would be less ($2^{1/2}$).
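A bare sketch of the secant update for comparison with the Newton sketch above (secant_sketch is an illustrative name only; the Order1()/Roots.Secant() methods below add the convergence machinery):
function secant_sketch(f, x0, x1; steps=10)
    for _ in 1:steps
        fx0, fx1 = f(x0), f(x1)
        fx1 == fx0 && break                              # stop once the function values coincide
        x0, x1 = x1, x1 - fx1 * (x1 - x0) / (fx1 - fx0)  # secant-line update
    end
    x1
end

secant_sketch(sin, 3.0, 4.0)   # ≈ π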
Roots.Secant — Type
Secant()
Order1()
Orderφ()
The Order1() method is an alias for Secant. It specifies the secant method. This method keeps two values in its state, xₙ and xₙ₋₁. The updated point is the intersection point of the $x$ axis with the secant line formed from the two points. The secant method uses $1$ function evaluation per step and has order φ ≈ (1+sqrt(5))/2.
The error, eᵢ = xᵢ - α, satisfies eᵢ₊₂ = f[xᵢ₊₁,xᵢ,α] / f[xᵢ₊₁,xᵢ] * (xᵢ₊₁-α) * (xᵢ - α).
Roots.Order1 — Type
Secant()
Order1()
Orderφ()
The Order1() method is an alias for Secant. It specifies the secant method. This method keeps two values in its state, xₙ and xₙ₋₁. The updated point is the intersection point of the $x$ axis with the secant line formed from the two points. The secant method uses $1$ function evaluation per step and has order φ ≈ (1+sqrt(5))/2.
The error, eᵢ = xᵢ - α, satisfies eᵢ₊₂ = f[xᵢ₊₁,xᵢ,α] / f[xᵢ₊₁,xᵢ] * (xᵢ₊₁-α) * (xᵢ - α).
Roots.Steffensen — Type
Steffensen()
The quadratically converging Steffensen method is used for the derivative-free Order2() algorithm. Unlike the quadratically converging Newton's method, no derivative is necessary, though like Newton's method, two function calls per step are. Steffensen's algorithm is more sensitive than Newton's method to poor initial guesses when f(x) is large, due to how f'(x) is approximated. The Order2 method replaces a Steffensen step with a secant step when f(x) is large.
The error, eᵢ = xᵢ - α, satisfies eᵢ₊₁ = f[xᵢ, xᵢ+fᵢ, α] / f[xᵢ,xᵢ+fᵢ] ⋅ (1 - f[xᵢ,α]) ⋅ eᵢ².
Roots.Order2 — Type
Order2()
Steffensen with a guard on the secant step.
Roots.Order5 — Type
Order5()
KumarSinghAkanksha()
Implements an order 5 algorithm from A New Fifth Order Derivative Free Newton-Type Method for Solving Nonlinear Equations by Manoj Kumar, Akhilesh Kumar Singh, and Akanksha, Appl. Math. Inf. Sci. 9, No. 3, 1507-1513 (2015), DOI: 10.12785/amis/090346. Four function calls per step are needed. The Order5 method replaces a Steffensen step with a secant step when f(x) is large.
The error, eᵢ = xᵢ - α, satisfies eᵢ₊₁ = K₁ ⋅ K₅ ⋅ M ⋅ eᵢ⁵ + O(eᵢ⁶).
Roots.Order8 — Type
Order8()
Thukral8()
Implements an eighth-order algorithm from New Eighth-Order Derivative-Free Methods for Solving Nonlinear Equations by Rajinder Thukral, International Journal of Mathematics and Mathematical Sciences, Volume 2012 (2012), Article ID 493456, 12 pages, DOI: 10.1155/2012/493456. Four function calls per step are required. The Order8 method replaces a Steffensen step with a secant step when f(x) is large.
The error, eᵢ = xᵢ - α, is expressed as eᵢ₊₁ = K ⋅ eᵢ⁸ in (2.25) of the paper for an explicit K.
Roots.Order16 — Type
Order16()
Thukral16()
Implements the order 16 algorithm from New Sixteenth-Order Derivative-Free Methods for Solving Nonlinear Equations by R. Thukral, American Journal of Computational and Applied Mathematics, p-ISSN: 2165-8935, e-ISSN: 2165-8943, 2012, 2(3): 112-118, DOI: 10.5923/j.ajcam.20120203.08.
Five function calls per step are required. Though rapidly converging, this method generally isn't faster (fewer function calls/steps) than other methods when using Float64 values, but may be useful for solving over BigFloat. The Order16 method replaces a Steffensen step with a secant step when f(x) is large.
The error, eᵢ = xᵢ - α, is expressed as eᵢ₊₁ = K⋅eᵢ¹⁶ for an explicit K in equation (50) of the paper.
Bracketing methods
The bisection method identifies a zero of a continuous function between $a$ and $b$ when $f(a)$ and $f(b)$ have different signs. (The interval $[a,b]$ is called a bracketing interval when $f(a)\cdot f(b) <0$.) The basic algorithm is particularly simple: an interval $[a_i,b_i]$ is split at $c = (a_i+b_i)/2$. Either $f(c)=0$, or one of $[a_i,c]$ or $[c,b_i]$ is a bracketing interval, which is called $[a_{i+1},b_{i+1}]$. From this description, we see that $[a_i,b_i]$ has length $2^{-i}$ times the length of $[a_0,b_0]$, so the intervals will eventually terminate by finding a zero, $c$, or converge to a zero. This convergence is slow (the efficiency is only $1$), but guaranteed. For 16-, 32-, and 64-bit floating point values, a reinterpretation of how the midpoint ($c$) is found leads to convergence in no more than $64$ iterations, unlike the midpoint found above, where some cases can take many more steps to converge.
In floating point, by guaranteed convergence we have either an exact zero or a bracketing interval consisting of two adjacent floating point values. When applied to non-continuous functions, this algorithm will identify an exact zero or a zero crossing of the function. (E.g., applied to $f(x)=1/x$ it will find $0$.)
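For instance, applying the default bracketing method to the zero crossing of 1/x mentioned above (a small sketch; per that note, the reported value is the crossing point, not a point where f vanishes):
using Roots

find_zero(x -> 1/x, (-1, 1))   # returns zero, the crossing point of 1/x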
The default selection of midpoint described above includes no information about the function $f$ beyond its sign. Algorithms exploiting the shape of the function can be significantly more efficient. For example, the bracketing method Roots.AlefeldPotraShi due to Alefeld, Potra, and Shi has efficiency $\approx 1.6686$. This method is also used in the default method for find_zero when a single initial starting point is given, if a bracketing interval is identified.
Roots.Bisection — Type
Bisection()
If possible, will use the bisection method over Float64 values. The bisection method starts with a bracketing interval [a,b] and splits it into two intervals [a,c] and [c,b]. If c is not a zero, then one of these two will be a bracketing interval and the process continues. The computation of c is done by _middle, which reinterprets floating point values as unsigned integers and splits there. It was contributed by Jason Merrill. This method avoids floating point issues and when the tolerances are set to zero (the default) guarantees a "best" solution (one where a zero is found or the bracketing interval is of the type [a, nextfloat(a)]).
When tolerances are given, this algorithm terminates when the interval length is less than or equal to the tolerance max(δₐ, 2abs(u)δᵣ) with u in {a,b} chosen by the smaller of |f(a)| and |f(b)|, or the function value is less than max(tol, min(abs(a), abs(b)) * rtol). The latter is used only if the default tolerances (atol or rtol) are adjusted.
When solving $f(x,p) = 0$ for $x^*(p)$ using Bisection one cannot take the derivative directly via automatic differentiation, as the algorithm is not differentiable. See Sensitivity in the documentation for alternatives.
Roots.A42 — Type
Roots.A42()
Bracketing method which finds the root of a continuous function within a provided bracketing interval [a, b], without requiring derivatives. It is based on Algorithm 4.2 described in: G. E. Alefeld, F. A. Potra, and Y. Shi, "Algorithm 748: enclosing zeros of continuous functions," ACM Trans. Math. Softw. 21, 327–344 (1995), DOI: 10.1145/210089.210111. The asymptotic efficiency index, $q^{1/k}$, is $(2 + 7^{1/2})^{1/3} = 1.6686...$.
Originally by John Travers.
The paper referenced above shows that for a continuously differentiable $f$ over $[a,b]$ with a simple root, the algorithm terminates at a zero or asymptotically the steps are of the inverse cubic type (Lemma 5.1). This is proved under an assumption that $f$ is four-times continuously differentiable.
Roots.AlefeldPotraShi — Type
Roots.AlefeldPotraShi()
Follows Algorithm 4.1 in "ON ENCLOSING SIMPLE ROOTS OF NONLINEAR EQUATIONS", by Alefeld, Potra, Shi; DOI: 10.1090/S0025-5718-1993-1192965-2.
The order of convergence is 2 + √5; asymptotically there are 3 function evaluations per step. The asymptotic efficiency index is $(2+√5)^{1/3} ≈ 1.618...$. Less efficient, but can run faster than the related A42 method.
Originally by John Travers.
Roots.Brent — Type
Roots.Brent()
An implementation of Brent's (or Brent-Dekker) method. This method uses a choice of inverse quadratic interpolation or a secant step, falling back on bisection if necessary.
Roots.Chandrapatla — Type
Roots.Chandrapatla()
Use Chandrapatla's algorithm (cf. Scherer) to solve $f(x) = 0$.
Chandrapatla's algorithm chooses between an inverse quadratic step or a bisection step based on a computed inequality.
Roots.Ridders — Type
Roots.Ridders()
Implements Ridders' method. This bracketing method finds the midpoint, x₁; then interpolates an exponential; then uses false position with the interpolated value to find c. If c and x₁ form a bracket, it is used; otherwise the subinterval [a,c] or [c,b] is used.
Example:
julia> using Roots
julia> find_zero(x -> exp(x) - x^4, (5, 15), Roots.Ridders()) ≈ 8.61316945644
true
julia> find_zero(x -> x*exp(x) - 10, (-100, 100), Roots.Ridders()) ≈ 1.74552800274
true
julia> find_zero(x -> tan(x)^tan(x) - 1e3, (0, 1.5), Roots.Ridders()) ≈ 1.3547104419
true
Ridders showed the error satisfies eₙ₊₁ ≈ 1/2 eₙeₙ₋₁eₙ₋₂ ⋅ (g^2-2fh)/f for f=F', g=F''/2, h=F'''/6, suggesting convergence at rate ≈ 1.839.... It uses two function evaluations per step, so its order of convergence is ≈ 1.225....
Roots.ITP — Type
Roots.ITP(;[κ₁=0.2, κ₂=2, n₀=1])
Use the ITP bracketing method. This method claims it "is the first root-finding algorithm that achieves the superlinear convergence of the secant method while retaining the optimal worst-case performance of the bisection method."
The values κ₁, κ₂, and n₀ are tuning parameters.
The suggested value of κ₁ is 0.2/(b-a), but the default here is 0.2. The value of κ₂ is 2, and the default value of n₀ is 1.
Note:
Suggested on discourse by @TheLateKronos, who supplied the original version of the code.
Roots.FalsePosition — Type
FalsePosition([galadino_factor])
Use the false position method to find a zero for the function f within the bracketing interval [a,b].
The false position method is a modified bisection method, where the midpoint between [aₖ, bₖ] is chosen to be the intersection point of the secant line with the $x$ axis, and not the average between the two values.
To speed up convergence for concave functions, this algorithm implements the $12$ reduction factors of Galdino (A family of regula falsi root-finding methods). These are specified by number, as in FalsePosition(2), or by one of three names: FalsePosition(:pegasus), FalsePosition(:illinois), or FalsePosition(:anderson_bjork) (the default). The default choice has generally better performance than the others, though there are exceptions.
For some problems, the number of function calls can be greater than for the Bisection method, but generally this algorithm will make fewer function calls.
Examples
find_zero(x -> x^5 - x - 1, (-2, 2), FalsePosition())
Roots.LithBoonkkampIJzermanBracket — Type
LithBoonkkampIJzermanBracket()
A bracketing method which is a modification of Brent's method due to Lith, Boonkkamp, and IJzerman. The best possible convergence rate is 2.91.
A function, its derivative, and a bracketing interval need to be specified.
The state includes the 3 points – a bracket [a,b] (b=xₙ has f(b) closest to 0) and c=xₙ₋₁ – and the corresponding values for the function and its derivative at these three points.
The next proposed step is either a S=2 or S=3 selection for the LithBoonkkampIJzerman methods, with derivative information included only if it would be of help. The proposed step is modified if it is dithering. The proposed step is compared against a bisection step; the one in the bracket and with the smaller function value is chosen as the next step.
Roots.BracketedHalley — Type
BracketedHalley
For a bracket [a,b], uses the Roots.Halley method starting at the x value for which fa or fb is closest to 0. Uses the Roots.AbstractAlefeldPotraShi framework to enforce the bracketing, taking an additional double secant step each time.
Roots.BracketedChebyshev — Type
BracketedChebyshev
For a bracket [a,b], uses the Roots.QuadraticInverse method starting at the x value for which fa or fb is closest to 0. Uses the Roots.AbstractAlefeldPotraShi framework to enforce the bracketing, taking an additional double secant step each time.
Roots.BracketedSchroder — Type
BracketedSchroder
For a bracket [a,b], uses the Roots.Schroder method starting at the x value for which fa or fb is closest to 0. Uses the Roots.AbstractAlefeldPotraShi framework to enforce the bracketing, taking an additional double secant step each time.
Non-simple zeros
The order of convergence for most methods is for simple zeros, values $\alpha$ where $f(x) = (x-\alpha) \cdot g(x)$, with $g(\alpha)$ being non-zero. For methods which are of order $k$ for non-simple zeros, usually an additional function call is needed per step. For example, this is the case for Roots.Schroder as compared to Roots.Newton.
Derivative-free methods for non-simple zeros have the following implemented:
Roots.King — Type
Roots.King()
A superlinear (order 1.6...) modification of the secant method for multiple roots. Presented in A SECANT METHOD FOR MULTIPLE ROOTS, by RICHARD F. KING, BIT 17 (1977), 321-328.
The basic idea is similar to Schroder's method: apply the secant method to f/f'. However, this uses f' ~ fp = (fx - f(x-fx))/fx (a Steffensen step). In this implementation, Order1B, when fx is too big, a single secant step of f is used.
The asymptotic error, eᵢ = xᵢ - α, is given by eᵢ₊₂ = 1/2⋅G''/G'⋅eᵢ⋅eᵢ₊₁ + (1/6⋅G'''/G' - (1/2⋅G''/G'))^2⋅eᵢ⋅eᵢ₊₁⋅(eᵢ+eᵢ₊₁).
Roots.Order1B — Type
Roots.Order1B()
King method with guarded secant step.
Roots.Esser — Type
Roots.Esser()
Esser's method. This is a quadratically convergent method that, like Schroder's method, does not depend on the multiplicity of the zero. Schroder's method has update step x - r2/(r2-r1) * r1, where ri = fⁱ⁻¹/fⁱ. Esser approximates f' ~ f[x-h, x+h], f'' ~ f[x-h,x,x+h], where h = fx, as with Steffensen's method, requiring 3 function calls per step. The implementation Order2B uses a secant step when |fx| is considered too large.
Esser, H. Computing (1975) 14: 367. DOI: 10.1007/BF02253547 Eine stets quadratisch konvergente Modification des Steffensen-Verfahrens
Examples
f(x) = cos(x) - x
g(x) = f(x)^2
x0 = pi/4
find_zero(f, x0, Order2(), verbose=true) # 3 steps / 7 function calls
find_zero(f, x0, Roots.Order2B(), verbose=true) # 4 / 9
find_zero(g, x0, Order2(), verbose=true) # 22 / 45
find_zero(g, x0, Roots.Order2B(), verbose=true) # 4 / 10
Roots.Order2B — Type
Roots.Order2B()
Esser method with guarded secant step.
For non-simple zeros, Schroder showed an additional derivative can be used to yield quadratic convergence based on Newton's method:
Roots.Schroder — Type
Roots.Schroder()
Schröder's method, like Halley's method, utilizes f, f', and f''. Unlike Halley it is quadratically converging, but this is independent of the multiplicity of the zero (cf. Schröder, E. "Über unendlich viele Algorithmen zur Auflösung der Gleichungen." Math. Ann. 2, 317-365, 1870; mathworld).
Schröder's method applies Newton's method to f/f', a function with all simple zeros.
Example
m = 2
f(x) = (cos(x)-x)^m
fp(x) = (-x + cos(x))*(-2*sin(x) - 2)
fpp(x) = 2*((x - cos(x))*cos(x) + (sin(x) + 1)^2)
find_zero((f, fp, fpp), pi/4, Roots.Halley()) # 14 steps
find_zero((f, fp, fpp), 1.0, Roots.Schroder()) # 3 steps
(Whereas, when m=1, Halley is 2 steps to Schröder's 3.)
If function evaluations are expensive one can pass in a function which returns (f, f/f', f'/f'') as follows
find_zero(x -> (sin(x), sin(x)/cos(x), -cos(x)/sin(x)), 3.0, Roots.Schroder())
This can be advantageous if the derivatives are easily computed from the value of f, but otherwise would be expensive to compute.
The error, eᵢ = xᵢ - α, is the same as Newton with f replaced by f/f'.
A family of methods for non-simple zeros which require $k$ derivatives to be order $k$, with $k=2$ yielding Schroder's method, is implemented in:
Roots.AbstractThukralBMethod — Type
AbstractThukralBMethod
Abstract type for ThukralXB methods for X being 2, 3, 4, or 5.
These are a family of methods which are
- efficient (order X) for non-simple roots (e.g. Thukral2B is the Schroder method)
- take X+1 function calls per step
- require X derivatives. These can be passed as a tuple of functions, (f, f', f'', …), or as a function returning the ratios: x -> (f(x), f(x)/f'(x), f'(x)/f''(x), …).
Examples
using ForwardDiff
Base.adjoint(f::Function) = x -> ForwardDiff.derivative(f, float(x))
f(x) = (exp(x) + x - 2)^6
x0 = 1/4
find_zero((f, f', f''), x0, Roots.Halley()) # 14 iterations; ≈ 48 function evaluations
find_zero((f, f', f''), big(x0), Roots.Thukral2B()) # 3 iterations; ≈ 9 function evaluations
find_zero((f, f', f'', f'''), big(x0), Roots.Thukral3B()) # 2 iterations; ≈ 8 function evaluations
Reference
Introduction to a family of Thukral $k$-order method for finding multiple zeros of nonlinear equations, R. Thukral, JOURNAL OF ADVANCES IN MATHEMATICS 13(3):7230-7237, DOI: 10.24297/jam.v13i3.6146.
Hybrid methods
A useful strategy is to begin with a non-bracketing method and switch to a bracketing method should a bracket be encountered. This allows for the identification of zeros which are not surrounded by a bracket, while giving guaranteed convergence should a bracket be encountered. It is used by default by find_zero(f, a).
Roots.Order0 — Type
Order0()
The Order0 method is engineered to be a more robust, though possibly slower, alternative to the other derivative-free root-finding methods. The implementation roughly follows the algorithm described in Personal Calculator Has Key to Solve Any Equation $f(x) = 0$, the SOLVE button from the HP-34C. The basic idea is to use a secant step. If along the way a bracket is found, switch to a bracketing algorithm, using AlefeldPotraShi. If the secant step fails to decrease the function value, a quadratic step is used up to $4$ times.
This is not really $0$-order: the secant method has order $1.6...$ and the bracketing method has order $1.6180...$, so for reasonable starting points and functions, this algorithm should be superlinear, and relatively robust to non-reasonable starting points.
All zeros
The find_zeros function heuristically scans an interval for all zeros using a combination of bracketing and non-bracketing methods. The AllZeros method may be passed to solve to call this.
Roots.AllZeros — Type
AllZeros
Type to indicate to solve that find_zeros should be used to solve the given ZeroProblem.
Example
julia> Z = ZeroProblem(cos, (0, 2pi));
julia> solve(Z, AllZeros())
2-element Vector{Float64}:
1.5707963267948966
4.71238898038469
Rates of convergence
The order of a method is $q$, where $e_{i+1} \approx e_i^q$. Newton's method is famously quadratic for simple roots; the secant method is of order $q \approx \varphi=1.618\dots$. However, $p=2$ function calls per step are needed for Newton's method, and only $p=1$ for the secant method. The asymptotic efficiency is $q^{1/p}$, which penalizes function calls. There are other order-$k$ methods taking $k$ function calls per step, e.g., Halley's; others take fewer, as seen below. Many use inverse quadratic steps, others inverse cubic – these have order $q$ solving $q^{s+1}-2q^s+1=0$ ($s=3$ for quadratic). For robust methods, generally $1$ additional function call is needed to achieve the convergence rate, Schroder being a good example.
being a good example.
Type | Method | Order | F evals | Asymptotic efficiency |
---|---|---|---|---|
Hybrid | Order0 | $\approx 1.618\dots$ | ||
Derivative Free | Secant | $\varphi=1.618\dots$ | $1$ | $1.618\dots$ |
Derivative Free | Steffensen | $2$ | $2$ | $1.414\dots$ |
Derivative Free | Order5 | $5$ | $4$ | $1.495\dots$ |
Derivative Free | Order8 | $8$ | $4$ | $1.681\dots$ |
Derivative Free | Order16 | $16$ | $5$ | $1.718\dots$ |
Classical | Newton | $2$ | $2$ | $1.414\dots$ |
Classical | Halley | $3$ | $3$ | $1.442\dots$ |
Classical | QuadraticInverse | $3$ | $3$ | $1.442\dots$ |
Classical | ChebyshevLike | $3$ | $3$ | $1.442\dots$ |
Classical | SuperHalley | $3$ | $3$ | $1.442\dots$ |
MultiStep | LithBoonkkampIJzerman{S,D} | $p^s=\sum p^k(d+\sigma_k)$ | $D+1$ | varies, $1.92\dots$ max |
Bracketing | BisectionExact | $1$ | $1$ | $1$ |
Bracketing | A42 | $(2 + 7^{1/2})$ | $3,4$ | $(2 + 7^{1/2})^{1/3} = 1.6686\dots$ |
Bracketing | AlefeldPotraShi | | $3,4$ | $1.618\dots$ |
Bracketing | Brent | $\leq 1.89\dots$ | $1$ | $\leq 1.89\dots$ |
Bracketing | ITP | $\leq \varphi$ | $1$ | $\leq \varphi$ |
Bracketing | Ridders | $1.83\dots$ | $2$ | $1.225\dots$ |
Bracketing | FalsePosition | $1.442\dots$ | $1$ | $1.442\dots$ |
Bracketing | LithBoonkkampIJzermanBracket | $2.91$ | $3$ | $1.427\dots$ |
Robust | King | $\varphi=1.618\dots$ | $2$ | $1.272\dots$ |
Robust | Esser | $2$ | $3$ | $1.259\dots$ |
Robust | Schroder | $2$ | $3$ | $1.259\dots$ |
Robust | Thukral3 | $3$ | $4$ | $1.316\dots$ |
Robust | Thukral4 | $4$ | $5$ | $1.319\dots$ |
Robust | Thukral5 | $5$ | $6$ | $1.307\dots$ |
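As a rough illustration (not part of the package), the efficiency index in the table can be computed directly from an order $q$ and the number of function calls $p$ per step:
efficiency(q, p) = q^(1/p)        # asymptotic efficiency index
efficiency(2, 2)                  # Newton: ≈ 1.414…
efficiency((1 + sqrt(5))/2, 1)    # secant: ≈ 1.618…
efficiency(3, 3)                  # Halley: ≈ 1.442…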
Convergence
Identifying when an algorithm converges or diverges requires specifications of tolerances and convergence criteria.
In the case of exact bisection, convergence is mathematically guaranteed. For floating point numbers, either an exact zero is found, or the bracketing interval can be subdivided into $[a_n,b_n]$ with $a_n$ and $b_n$ being adjacent floating point values. That is, $b_n-a_n$ is as small as possible in floating point numbers. This can be considered a stopping criterion in $\Delta x$. For early termination (less precision but fewer function calls) a tolerance can be given so that if $\Delta_n=b_n-a_n$ is small enough the algorithm stops successfully. In floating point, assessing if $b_n \approx a_n$ requires two tolerances: a relative tolerance, as the minimal differences in floating point values depend on the size of $b_n$ and $a_n$, and an absolute tolerance for values near $0$. The values xrtol
and xatol
are passed to the Base.isapprox
function to determine closeness.
Relying on the closeness of two $x$ values will not be adequate for all problems, as there are examples where the difference $\Delta_n=|x_n-x_{n-1}|$ can be quite small, $0$ even, yet $f(x_n)$ is not near a $0$. As such, for non-bracketing methods, a check on the size of $f(x_n)$ is also used. As we find floating point approximations to $\alpha$, the zero, we must consider values small when $f(\alpha(1+\epsilon))$ is small. By Taylor's approximation, we can expect this to be around $\alpha\cdot \epsilon \cdot f'(\alpha)$. That is, small depends on the size of $\alpha$ and the derivative at $\alpha$. The former is handled by both relative and absolute tolerances (rtol
and atol
). The size of $f'(\alpha)$ is problem dependent, and can be accommodated by larger relative or absolute tolerances.
When an algorithm returns a NaN
value, it terminates. This can happen near convergence or may indicate some issues. Early termination is checked for convergence in the size of $f(x_n)$ with a relaxed tolerance when strict=false
is specified (the default).
The use of relative tolerances to check if $f(x) \approx 0$ can lead to spurious answers where $x$ is very large (and hence the relative tolerance is large). The return of very large solutions should be checked against expectations of the answer.
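A hypothetical illustration: a function with no zero, but which decays to $0$, may trigger this; whether (and where) the algorithm reports convergence depends on the method and tolerances.
f(x) = 1/x                  # no real zero, but f(x) → 0 as x → ∞
find_zero(f, 1, Order1())   # may report "convergence" at a very large x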
Deciding if an algorithm won't terminate is done by counting the number of iterations performed; the default limit is adjusted through maxiters
. As most algorithms are superlinear, convergence happens rapidly near the answer, but all the algorithms can take a while to get near an answer, even when progress is being made. As such, the maximum must be large enough to accommodate nearly linear convergence, yet small enough to avoid excessive steps when an algorithm is non-convergent.
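The limit is passed as a keyword argument to find_zero (a sketch using the maxiters keyword mentioned above):
find_zero(x -> x^3 - 2x - 5, 2, Order1(); maxiters=100)  # raise the iteration limit from its default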
Convergence criteria are method dependent and are determined by the Roots.assess_convergence
methods.
Roots.assess_convergence
— FunctionRoots.assess_convergence(method, state, options)
Assess if algorithm has converged.
Return a convergence flag and a Boolean indicating whether the algorithm has terminated (whether converged or not).
If the algorithm hasn't converged, this returns (:not_converged, false).
If the algorithm has stopped or converged, return the flag and true
. Flags are:
:x_converged if xn1 ≈ xn, typically with non-zero tolerances specified.
:f_converged if |f(xn1)| < max(atol, |xn1|*rtol)
:nan or :inf if fxn1 is NaN or an infinity.
:not_converged if the algorithm should continue
Does not check number of steps taken nor number of function evaluations.
In decide_convergence
, stopped values (and :x_converged
when strict=false
) are checked for convergence with a relaxed tolerance.
Default tolerances are specified through the Roots.default_tolerances
methods.
Roots.default_tolerances
— Functiondefault_tolerances(M::AbstractUnivariateZeroMethod, [T], [S])
The default tolerances for most methods are xatol=eps(T)
, xrtol=eps(T)
, atol=4eps(S)
, and rtol=4eps(S)
, with the proper units (absolute tolerances have the units of x
and f(x)
; relative tolerances are unitless). For Complex{T}
values, T
is used.
The number of iterations is limited by maxiters=40
.
default_tolerances(M::AbstractBisectionMethod, [T], [S])
For Bisection
when the x
values are of type Float64
, Float32
, or Float16
, the default tolerances are zero and there is no limit on the number of iterations. In this case, the algorithm is guaranteed to converge to an exact zero, or a point where the function changes sign at one of the answer's adjacent floating point values.
For other types, default non-zero tolerances for xatol
and xrtol
are given.
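For example (a sketch), tolerances are passed as keyword arguments to find_zero:
find_zero(sin, (3, 4); xatol=1e-6)           # stop bisection once the bracket width is ≤ 1e-6
find_zero(x -> exp(x) - x^4, 8; atol=1e-10)  # adjust the absolute tolerance on f(x)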
Simplified versions
The abstractions and many checks for convergence employed by find_zero
have a performance cost. When that is a critical concern, there are several "simple" methods provided which can offer improved performance.
Roots.secant_method
— Functionsecant_method(f, xs; [atol=0.0, rtol=8eps(), maxevals=1000])
Perform the secant method to solve f(x) = 0.
The secant method is an iterative method with update step given by b - fb/m
where m
is the slope of the secant line between (a,fa)
and (b,fb)
.
The initial values can be specified as a pair of two values, as in (x₀, x₁)
or [x₀, x₁]
, or as a single value, x₁
in which case a value of x₀
is chosen.
The algorithm returns m when abs(fm) <= max(atol, abs(m) * rtol)
. If this doesn't occur before maxevals
steps or the algorithm encounters an issue, a value of NaN
is returned. If too many steps are taken, the current value is checked to see if there is a sign change for neighboring floating point values.
The Order1
method for find_zero
also implements the secant method. This one should be slightly faster, as there are fewer setup costs.
Examples:
Roots.secant_method(sin, (3,4))
Roots.secant_method(x -> x^5 -x - 1, 1.1)
This function will specialize on the function f
, so that the initial call can take more time than a call to the Order1()
method, though subsequent calls will be much faster. Using FunctionWrappers.jl
can ensure that the initial call is as fast as subsequent ones.
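A sketch of that approach (the wrapper type and its signature come from FunctionWrappers.jl, not from Roots):
using Roots
using FunctionWrappers: FunctionWrapper

f(x) = x^5 - x - 1
fw = FunctionWrapper{Float64,Tuple{Float64}}(f)  # fix the call signature Float64 -> Float64
Roots.secant_method(fw, 1.1)                     # no specialization on f itself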
Roots.bisection
— Functionbisection(f, a, b; [xatol, xrtol])
Performs the bisection method to find a zero of a continuous function.
It is assumed that (a,b)
is a bracket, that is, the function has different signs at a
and b
. The interval (a,b)
is converted to floating point and shrunk when a
or b
is infinite. The function f
may be infinite for the typical case. If f
is not continuous, the algorithm may find points where the function jumps across the x axis, not just zeros.
If non-trivial tolerances are specified, the process will terminate when the bracket (a,b)
satisfies isapprox(a, b, atol=xatol, rtol=xrtol)
. For zero tolerances, the default, for Float64
, Float32
, or Float16
values, the process will terminate at a value x
with f(x)=0
or f(x)*f(prevfloat(x)) < 0
or f(x) * f(nextfloat(x)) < 0
. For other number types, the Roots.A42
method is used.
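For example (a small sketch):
Roots.bisection(sin, 3, 4)              # ≈ π; bisects down to adjacent floating point values
Roots.bisection(sin, 3, 4; xatol=1e-4)  # terminate early with a coarser answer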
Roots.muller
— Functionmuller(f, xᵢ; xatol=nothing, xrtol=nothing, maxevals=100)
muller(f, xᵢ₋₂, xᵢ₋₁, xᵢ; xatol=nothing, xrtol=nothing, maxevals=100)
Muller’s method generalizes the secant method, but uses quadratic interpolation among three points instead of linear interpolation between two. Solving for the zeros of the quadratic allows the method to find complex pairs of roots. Given three previous guesses for the root xᵢ₋₂, xᵢ₋₁, xᵢ, and the values of the polynomial f at those points, the next approximation xᵢ₊₁ is produced.
Excerpt and the algorithm taken from
W.H. Press, S.A. Teukolsky, W.T. Vetterling and B.P. Flannery Numerical Recipes in C, Cambridge University Press (2002), p. 371
Convergence here is decided by xᵢ₊₁ ≈ xᵢ
using the tolerances specified, which both default to eps(one(typeof(abs(xᵢ))))^(4/5)
in the appropriate units. Each iteration performs three evaluations of f
. The first method picks two remaining points at random in relative proximity of xᵢ
.
Note that the method may return a complex result even for real initial values, as this depends on the function.
Examples:
muller(x->x^3-1, 0.5, 0.5im, -0.5) # → -0.500 + 0.866…im
muller(x->x^2+2, 0.0, 0.5, 1.0) # → ≈ 0.00 - 1.41…im
muller(x->(x-5)*x*(x+5), rand(3)...) # → ≈ 0.00
muller(x->x^3-1, 1.5, 1.0, 2.0) # → 2.0, Not converged
Roots.newton
— FunctionRoots.newton(f, fp, x0; kwargs...)
Implementation of Newton's method: xᵢ₊₁ = xᵢ - f(xᵢ)/f'(xᵢ)
.
Arguments:
f::Function – function to find zero of
fp::Function – the derivative of f.
x0::Number – initial guess. For Newton's method this may be complex.
With the ForwardDiff
package derivatives may be computed automatically. For example, defining D(f) = x -> ForwardDiff.derivative(f, float(x))
allows D(f)
to be used for the first derivative.
Keyword arguments are passed to find_zero
using the Roots.Newton()
method.
See also Roots.newton((f,fp), x0)
and Roots.newton(fΔf, x0)
for simpler implementations.
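A short sketch with a hand-written derivative (the ForwardDiff-based D defined above could be used instead):
f(x)  = x^3 - 2x - 5
fp(x) = 3x^2 - 2
Roots.newton(f, fp, 2)   # ≈ 2.0945514815423265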
Roots.dfree
— Functiondfree(f, xs)
A more robust secant method implementation
Solve for f(x) = 0
using an algorithm from Personal Calculator Has Key to Solve Any Equation f(x) = 0, the SOLVE button from the HP-34C.
This is also implemented as the Order0
method for find_zero
.
The initial values can be specified as a pair of two values, as in (a,b)
or [a,b]
, or as a single value, in which case a value of b
is computed, possibly from fb
. The basic idea is to follow the secant method to convergence unless:
a bracket is found, in which case AlefeldPotraShi is used;
the secant method is not converging, in which case a few steps of a quadratic method are used to see if that improves matters.
Convergence occurs when f(m) == 0
, there is a sign change between m
and an adjacent floating point value, or f(m) <= 2^3*eps(m)
.
A value of NaN
is returned if the algorithm takes too many steps before identifying a zero.
Examples
Roots.dfree(x -> x^5 - x - 1, 1.0)
MATLAB interface
The initial naming scheme used fzero
instead of find_zero
, following the name of the MATLAB function fzero. This interface is not recommended but is, for now, still maintained.
Roots.fzero
— Functionfzero(f, x0; order=0, kwargs...)
fzero(f, x0, M; kwargs...)
fzero(f, x0, M, N; kwargs...)
fzero(f, x0; kwargs...)
fzero(f, a::Number, b::Number; kwargs...)
fzero(f, a::Number, b::Number; order=?, kwargs...)
fzero(f, fp, a::Number; kwargs...)
Find a zero of a function using one of several iterative algorithms.
f: a scalar function or callable object
x0: an initial guess, a scalar value or tuple of two values
order: An integer, symbol, or string indicating the algorithm to use for find_zero. The Order0 default may be specified directly by order=0, order=:0, or order="0"; Order1() by order=1, order=:1, order="1", or order=:secant; Order1B() by order="1B", etc.
M: a specific method, as would be passed to find_zero, bypassing the use of the order keyword
N: a specific bracketing method. When given, if a bracket is identified, method N will be used to finish instead of method M.
a, b: When two values are passed along, if no order value is specified, Bisection will be used over the bracketing interval (a,b). If an order value is specified, the value of x0 will be set to (a,b) and the specified method will be used.
fp: when fp is specified (assumed to compute the derivative of f), Newton's method will be used
kwargs...: See find_zero for the specification of tolerances and other keyword arguments
Examples:
fzero(sin, 3) # use Order0() method, the default
fzero(sin, 3, order=:secant) # use secant method (also just `order=1`)
fzero(sin, 3, Roots.Order1B()) # use secant method variant for multiple roots.
fzero(sin, 3, 4) # use bisection method over (3,4)
fzero(sin, 3, 4, xatol=1e-6) # use bisection method until |x_n - x_{n-1}| <= 1e-6
fzero(sin, 3, 3.1, order=1) # use secant method with x_0=3.0, x_1 = 3.1
fzero(sin, (3, 3.1), order=2) # use Steffensen's method with x_0=3.0, x_1 = 3.1
fzero(sin, cos, 3) # use Newton's method
Unlike find_zero
, fzero
does not specialize on the type of the function argument. This has the advantage of making the first use of the function f
faster, but subsequent uses slower.
Roots.fzeros
— Functionfzeros(f, a, b; kwargs...)
fzeros(f, ab; kwargs...)
Searches for all zeros of f
within an interval (a,b)
. Assumes neither a
nor b
is a zero.
Compatibility interface for find_zeros
.
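For example (a sketch):
fzeros(sin, 1, 10)   # ≈ [π, 2π, 3π]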
Tracking iterations
It is possible to add the keyword argument verbose=true
when calling the find_zero
function to get detailed information about the solution and data from each iteration. To save this data, a Tracks
object may be passed to the keyword argument tracks
.
Roots.Tracks
— TypeRoots.Tracks{T,S}
A Tracks
instance is used to record the progress of an algorithm. T
is the type of function inputs, and S
is the type of function outputs. They both default to Float64
. Note that because this type is not exported, you have to write Roots.Tracks()
to construct a Tracks
object.
By default, no tracking is done while finding a root. To change this, construct a Tracks
object, and pass it to the keyword argument tracks
. This will modify the Tracks
object, storing the input and function values at each iteration, along with additional information about the root-finding process.
Tracks
objects are shown in an easy-to-read format. Internally either a tuple of (x,f(x))
pairs or (aₙ, bₙ)
pairs are stored, the latter for bracketing methods. (These implementation details may change without notice.) The methods empty!
, to reset the Tracks
object; get
, to get the tracks; last
, to get the value converged to, may be of interest.
If you only want to print the information, but you don't need it later, this can conveniently be done by passing verbose=true
to the root-finding function. This will not affect the return value, which will still be the root of the function.
Examples
julia> using Roots
julia> f(x) = x^2-2
f (generic function with 1 method)
julia> tracker = Roots.Tracks()
Algorithm has not been run
julia> find_zero(f, (0, 2), Roots.Secant(), tracks=tracker) ≈ √2
true
julia> tracker
Results of univariate zero finding:
* Converged to: 1.4142135623730947
* Algorithm: Secant()
* iterations: 7
* function evaluations ≈ 9
* stopped as |f(x_n)| ≤ max(δ, |x|⋅ϵ) using δ = atol, ϵ = rtol
Trace:
x₁ = 0, fx₁ = -2
x₂ = 2, fx₂ = 2
x₃ = 1, fx₃ = -1
x₄ = 1.3333333333333333, fx₄ = -0.22222222222222232
x₅ = 1.4285714285714286, fx₅ = 0.04081632653061229
x₆ = 1.4137931034482758, fx₆ = -0.0011890606420930094
x₇ = 1.4142114384748701, fx₇ = -6.0072868388605372e-06
x₈ = 1.4142135626888697, fx₈ = 8.9314555751229818e-10
x₉ = 1.4142135623730947, fx₉ = -8.8817841970012523e-16
julia> empty!(tracker) # resets
julia> find_zero(sin, (3, 4), Roots.A42(), tracks=tracker) ≈ π
true
julia> get(tracker)
4-element Vector{NamedTuple{names, Tuple{Float64, Float64}} where names}:
(a = 3.0, b = 4.0)
(a = 3.0, b = 3.157162792479947)
(a = 3.141592614491745, b = 3.1415926926910007)
(a = 3.141592653589793, b = 3.141592653589794)
julia> last(tracker)
3.141592653589793
As designed, the Tracks
object may not record actions taken while the state object is initialized. An example is the default bisection algorithm where an initial midpoint is found to ensure the bracket does not straddle $0$.