Solving for zeroes with julia
Solving for zero is a skill learned early on. In the case of lines, it can be done directly using algebra. In the case of factorable polynomials, we are taught to factor and then set each term to 0 to find the possible solutions. In general, the problem is to find one (or all) solutions to the equation
\[ f(x) = 0. \]
A related problem is to find one (or all) solutions to an equation of this type:
\[ f(x) = g(x) \]
Conceptually this is identical to the above, as we just set \(h(x) = f(x) - g(x)\) and solve for when \(h(x)\) is \(0\).
In general there may be no easy-to-find solution. Here we discuss a few different means to do so with julia, leaving others for a later time.
Before proceeding, we load our packages for plotting:
julia> using Gadfly, Compose
Zeroes of a polynomial
Polynomials are special functions, in that their relatively simple form allows for many explicit things to be known. E.g., for quadratic polynomials, one of these facts is the quadratic formula, which explicitly gives the roots of a quadratic polynomial. (A “root” of a polynomial is just a polynomial-specific name for a zero.)
For example, if we have the polynomial \(2x^2 + 3x - 2\) we can solve for the roots with:
julia> a = 2; b = 3; c = -2
-2
julia> discr = b^2 - 4*a*c
25
julia> ( (-b + sqrt(discr))/(2a), (-b - sqrt(discr))/(2a) )
(0.5,-2.0)
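As a quick sanity check, we can plug each computed root back into the quadratic and confirm the value is \(0\):

```julia
## the quadratic 2x^2 + 3x - 2 and the roots found above
a = 2; b = 3; c = -2
f(x) = a*x^2 + b*x + c

r1 = (-b + sqrt(b^2 - 4*a*c)) / (2a)    ## 0.5
r2 = (-b - sqrt(b^2 - 4*a*c)) / (2a)    ## -2.0

(f(r1), f(r2))                          ## (0.0, 0.0)
```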
If you wanted to write a function to do this, it would be straightforward:
julia> function quadratic(a, b, c)
discr = b^2 - 4*a*c
discr >= 0 ? ( (-b + sqrt(discr))/(2a), (-b - sqrt(discr))/(2a) ) : error("Only complex roots")
end
And to find the roots of \(x^2 + x - 1\) we have:
julia> quadratic(1, 1, -1)
(0.6180339887498949,-1.618033988749895)
There are further such formulas for third and fourth degree polynomials. However, Galois, at the tender age of 20, demonstrated that in general there can be no such formula for the roots of a fifth or higher degree polynomial. Still, there are facts known about such polynomials. For example, the Fundamental theorem of algebra states that every polynomial of degree \(n\) will have \(n\) roots, where we count complex roots and multiplicities.
How to find them is left to numeric methods though. In julia there is an add-on package that will do this work for us. Using add-on code in julia is relatively painless. In this case, the code comes with julia in the extras directory. Including it is as simple as:
julia> using Polynomial ## may need Pkg.add("Polynomial")
The command to find the roots of the polynomial \(f(x)\) is simply roots. That's easy. However, specifying the polynomial is done differently than just defining \(f(x)\) as a function.
A general polynomial is specified through its coefficients.
\[ a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0. \]
Just like with our quadratic function, to represent a polynomial most efficiently, only the coefficients are needed. We place them into a vector with the leading coefficient first:
[an, a_n1, a_n2, ..., a1, a0]
Or to be specific, if \(f(x) = -16x^2 + 32 x\) then this is
\[ -16 x^2 + 32 x + 0 \]
So \(a_2 = -16\), \(a_1 = 32\) and, importantly, \(a_0 = 0\), so we would use [-16, 32, 0].
The key point is that we must put a \(0\) for any missing terms and a \(1\) when the term is a simple monomial like \(x^2\).
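To see the convention in action, here is a small illustrative check (the helper evalpoly_naive is made up for exposition, not part of any package) that evaluates a coefficient vector term by term, leading coefficient first:

```julia
## coefficients of x^3 - x + 3, leading coefficient first:
## note the 0 for the missing x^2 term and the 1 for the x^3 term
coeffs = [1, 0, -1, 3]

## evaluate a_1*x^(n-1) + a_2*x^(n-2) + ... + a_n directly from the vector
evalpoly_naive(cs, x) = sum(cs[i] * x^(length(cs) - i) for i in eachindex(cs))

evalpoly_naive(coeffs, 2.0)    ## 2^3 - 2 + 3 = 9.0
```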
Creating a polynomial object from this specification is done with the Poly constructor:
julia> p = Poly([-16, 32, 0])
Poly(-16x^2 + 32x^1)
And finding the roots is done with the roots function:
julia> roots(p)
[2.0, 0.0]
That's it.
Use the roots command to find the largest real root of the polynomial \(x^2 + x + 1\).
Use the roots command to find the largest real root of the polynomial \(x^3 - x + 3\).
Before moving on, we note that the polynomial representation in terms of p does not mean we can't use this as a function, for plotting say. In fact, we could define f through:
julia> f(x) = polyval(p, x)
The polyval function does the hard work. It multiplies the coefficients of p against the proper powers of x and adds the results.
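Under the hood, an evaluation like this is typically organized as Horner's method, which avoids computing powers explicitly. A minimal sketch of the idea (this is not the actual polyval source) is:

```julia
## Horner's method: (((a_1)*x + a_2)*x + ...)*x + a_n,
## with coefficients given leading term first, as above
function horner(cs, x)
    result = cs[1]
    for c in cs[2:end]
        result = result * x + c
    end
    result
end

horner([-16, 32, 0], 1.0)    ## 16.0, agreeing with polyval(p, 1.0)
```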
For example, we can graph f with:
julia> plot(f, -1, 3)
However, there is no simple way to go from a factored form of a polynomial, e.g. \((x-1)(x-2)(x-4)\), to an expression like p. (Well, you could with some derivatives through \(a_n = f^{(n)}(0)/n!\), but we resist here.)
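For instance, for \(f(x) = -16x^2 + 32x\) we have \(f'(x) = -32x + 32\) and \(f''(x) = -32\), so the derivatives recover the coefficients used above:
\[ a_0 = f(0) = 0, \quad a_1 = \frac{f'(0)}{1!} = 32, \quad a_2 = \frac{f''(0)}{2!} = \frac{-32}{2} = -16. \]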
Graphical solutions
More generally, the equation \(f(x) = 0\) may not have any special form leading to a known solution. In this case, there are various techniques to find zeroes. Here we mention graphing, such as is done with a graphing calculator. In the next section, we discuss the bisection algorithm for root finding.
Example: Finding a zero
The flight of an arrow can be modeled using various functions, depending on assumptions. Suppose an arrow is launched in the air from a height of 0 feet above the ground at an angle of \(\theta = \pi/4\). With a suitable choice for the initial velocity, a model without wind resistance for the height of the arrow at a distance \(x\) units away may be:
\[ h(x) = \tan(\theta) x - 16\left(\frac{x}{200 \cos\theta}\right)^2. \]
With a wind resistance given by \(k\), again with some units, a similar equation can be constructed. It takes a different form:
\[ g(x) = \tan(\theta) x + \frac{32/k}{200 \cos\theta} x - \frac{32}{k} \log\left(\frac{200 \cos\theta}{200\cos\theta - x}\right). \]
For each model, we wish to find the value of \(x\) after launching where the height is modeled to be 0. That is how far will the arrow travel before touching the ground?
For the model without wind resistance, we can graph the function easily enough. Let's guess the distance is no more than 500 feet:
julia> begin
theta = pi/4
h(x) = tan(theta) * x - 16(x/(200cos(theta)))^2
plot(h, 0, 500)
end
Well, we haven't even seen the peak yet. Better to do a little spade work first. This is a quadratic equation, so we can use our function quadratic. We just need to identify a, b and c.
The value of a comes by factoring out the \(x^2\):
julia> a = - 16(1/(200cos(theta)))^2
-0.0008
The values for b and c should be clear, so we get:
julia> begin
a = - 16(1/(200cos(theta)))^2
b = tan(theta)
c = 0
roots(Poly([a,b,c])) ## or quadratic(a, b, c)
end
[1250.0, 0.0]
We see that 1250 is the value. So we plot over this domain to visualize the flight:
julia> plot(h, 0, 1250)
As for the model with g, we first define the function. This same function appeared in a previous example. We copy that work here:
julia> function g(x)
theta = pi/4
k = 1/2
a = 200*cos(theta)
b = 32/k
tan(theta)*x + (b/a)*x - b*log(a/(a-x))
end
(And we make the same note: it is more general to parameterize this by passing in values for k and theta.)
A quick plot over the same interval, \([0, 1250]\), yields:
julia> plot(g, 0, 1250)
ERROR: DomainError()
Oh, “Domain Error.” Of course: when \(x > a\) the logarithm fails. We try again on a reduced interval avoiding the obvious asymptote at a by subtracting \(1\):
julia> a = 200*cos(theta)
141.4213562373095
julia> plot(g, 0, a - 1)
Now we can see the zero is between 130 and 140. We re-plot:
julia> plot(g, 130, 140)
Finally, we plot between 134 and 135 and draw the line for \(y=0\) to see the zero clearly:
julia> plot([g, x -> 0], 134, 135)
The answer is approximately \(134.8\).
Finally, we plot both graphs at once to see that it was a very windy day indeed.
julia> begin
a = 134.8
plot([h, u -> u < a ? g(u) : 0] , 0, 1250) # awkward way to plot g and h
end
Repeat the above with a value of \(k=1.1\). How far does the arrow travel to one decimal point?
Bisection algorithm
The last example had us graphically “zoom” in on a zero, and led us to an estimate to 1 or 2 decimal points. Here we discuss a method to get much greater accuracy.
The bisection algorithm is a simple, iterative procedure for finding a numeric approximate root of a continuous function over the closed interval \([a,b]\) provided \(f(a)\) and \(f(b)\) do not share the same sign. We say \(a\) and \(b\) bracket a root. With these assumptions, the fact there is at least one root (possibly more) is due to the intermediate value theorem.
The basic idea is that the midpoint of \([a,b]\), \(M = (a + b)/2\), is tested for its function value. If \(f(M) = 0\), great, we are done. If it has the opposite sign of \(f(a)\), then a root must be in the interval \([a,M]\), so the problem is reduced to a smaller interval. Otherwise, it has the opposite sign of \(f(b)\) and the problem is reduced to \([M,b]\). At this point, the algorithm is repeated. As each step halves the interval length, it must eventually converge to an answer within some specified tolerance. This is a familiar example of a divide-and-conquer algorithm.
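Since each step halves the bracket, the number of steps needed to shrink an initial interval of length \(b-a\) below a given tolerance is roughly \(\log_2((b-a)/tol)\). A quick back-of-the-envelope computation (illustrative only):

```julia
## how many halvings shrink the bracket [0, 2] below a tolerance of 1e-8?
a, b, tol = 0.0, 2.0, 1e-8
steps = ceil(Int, log2((b - a)/tol))    ## 28
(b - a) / 2^steps < tol                 ## true
```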
To code this requires a few tricks.
First, we need some means to test if a function value is \(0\), or really close enough to \(0\). As values are floating point, they may not be exactly \(0\). As such, we use a tolerance that checks if a number is approximately \(0\). Here is a function to do so and a specialization using a standard tolerance. We qualify both x and y to be real numbers, just in case you want to define some other behavior for this function:
julia> close_enough(x::Real, y::Real, tol::Real) = abs(x-y) < tol
Now we define a specialization for the standard tolerance:
julia> close_enough(x::Real, y::Real) = close_enough(x, y, sqrt(eps()))
Let's check that it is working:
julia> close_enough(1, 1.1, 0.01) ## difference is 0.1 > 0.01
false
julia> close_enough(1, 1.01, 0.1) ## difference is 0.01 < 0.1
true
With this, we can begin to sketch out our flow with a simple problem: find a zero of \(f(x) = x^2 - 2\) between \(0\) and \(2\). Clearly, the answer is \(\sqrt{2}\).
julia> a = 0; b=2
2
julia> f(x) = x^2 - 2
julia> mid = (a + b)/2
1.0
julia> close_enough(f(mid), 0, 0.01)
false
Well, are we “close enough?” Clearly not (\(f(1) = -1\), which is nowhere near \(0\)). So we close in. We move a or b depending on the sign: if f(mid) is of different sign than f(a) then we set b = mid, otherwise we set a = mid. This is done quickly as follows using the ternary operator:
julia> f(a) * f(mid) < 0 ? (b = mid) : (a = mid)
1.0
The parentheses are needed here to postpone the assignment until after the ternary operator does its work.
Now we update the midpoint and check again:
julia> mid = (a + b)/2
1.5
julia> close_enough(f(mid), 0, 0.01)
false
Still, not good enough, so we repeat the three key steps:
julia> f(a) * f(mid) < 0 ? (b = mid) : (a = mid)
1.5
julia> mid = (a + b)/2
1.25
julia> close_enough(f(mid), 0, 0.01)
false
Again, still not good enough. We find that we will need to repeat the above three lines several times to get to an answer at the very undemanding tolerance of \(0.01\).
How many times do you need to update the bracket before the midpoint is within the tolerance \(0.01\)?
The fact that the method is slow to converge would be a problem for us, were we not able to let the computer repeat the steps until it is done. As such, we need some way to repeat a process an arbitrary number of times. A for loop repeats a specific number of times, which is not quite right. For this task, we want a while loop, which will continue looping until some condition is met (our condition of close_enough).
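As a warm-up for the while loop, here is a small self-contained example (the name halve_until is made up for illustration) that keeps halving a length until it drops below a tolerance, counting the iterations:

```julia
## halve `len` until it is smaller than `tol`, returning the final
## length and the number of times the loop body ran
function halve_until(len, tol)
    ctr = 0
    while len >= tol
        len = len / 2
        ctr = ctr + 1
    end
    (len, ctr)
end

halve_until(2.0, 0.01)    ## (0.0078125, 8)
```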
Our code is as follows. The basic algorithm is fairly straightforward to implement. Try and read over the code to see what is being done.
There are a few new things:
1) The function is longer than a one-liner. For such functions, the longer style
function function_name(arguments)
... body ...
end
is used. Recall, the last value computed is returned unless an explicit return call is made, as is done below.
2) The arguments have special annotations (::Function, ::Real). This notation is special to julia and is the key to the elegance of the language. Basically, this defines the function only for arguments that match these specifications. It is julia's job to check the arguments passed to a function call and to invoke the proper definition, as many methods can share the same name. In computer science, such functions are known as generic functions, and the ability to dispatch to the correct definition based on the argument types is called multiple dispatch, a form of polymorphism.
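A tiny illustration of this dispatch mechanism (describe is a made-up name, not a julia built-in): the same function name gets different methods for different argument types, and julia picks the right one at each call:

```julia
## two methods sharing one generic function name;
## julia dispatches on the type of the argument
describe(x::Real) = "a real number"
describe(x::String) = "a string"

describe(1.5)     ## "a real number"
describe("hi")    ## "a string"
```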
Without more fuss, here is the function definition.
julia> function bisection_method(f::Function, a::Real, b::Real)
## redefine to be self-contained
close_enough(x, y) = abs( x - y ) < sqrt(eps())
close_enough(x) = close_enough(x, 0)
## check endpoints
if f(a) * f(b) > 0
error("Function values at endpoints must have different signs")
end
## endpoints aren't already zeroes?
if close_enough(f(a))
return(a)
elseif close_enough(f(b))
return(b)
end
## begin
update_mid(a, b) = (a + b)/2
mid = update_mid(a, b)
ctr = 1
while ( !close_enough(f(mid)) )
## update a or b and then mid
f(a) * f(mid) < 0 ? (b = mid) : (a = mid)
mid = update_mid(a, b)
ctr = ctr + 1
end
println("Converged to $mid in $ctr steps")
return(mid)
end
The above can be copy-and-pasted into a julia session or loaded (along with the other long functions used in the projects) via this:
julia> using JSON
julia> function require_gist(id::String)
require_content(content) = begin _f = tempname(); io = open(_f, "w"); write(io, content); close(io); require(_f); run(`rm $_f`) end
gist = (open(download("https://api.github.com/gists/$id")) | readlines | join | JSON.parse)
[require_content(gist["files"][nm]["content"]) for nm in keys(gist["files"]) ]
end
julia> require_gist("4530920")
{nothing, nothing, nothing, nothing, nothing, nothing, nothing}
If you have the TeachersHelpers package, this can be done through:
julia> using TeachersHelpers
julia> require_gist("4530920")
Okay, to see this in use, let's look at the function \(f(x) = -16x^2 + 32x\). We know that \(0\) and \(2\) are roots. Let's see if our algorithm finds them:
julia> f(x) = -16x^2 + 32x
julia> bisection_method(f, -1, 1) ## should find 0
0.0
julia> bisection_method(f, 1, 3) ## should find 2
2.0
Okay, it seems to work. Let's try it on a less trivial problem. We know \(\sin(x)\) and \(\cos(x)\) cross in the interval \([0, \pi/2]\). If we are too tired to remember where, we can simply ask:
julia> f(x) = cos(x) - sin(x)
julia> out = bisection_method(f, 0, pi/2)
0.7853981633974483
This, of course, looks better as \(\pi/4\):
julia> out - pi/4
0.0
Example: Graphical and numerical answers
One needs to know where to look in order to use the bisection method. The basic one-two punch is:
1) graph the function to quickly identify values \([a,b]\) which bracket a zero, then
2) use the bisection method to find the zero to many decimal points.
Here we illustrate with the problem of finding all intersection points of \(e^x = x^4\) over the interval \([0,10]\). That is, for the function \(f(x) = e^x - x^4\) find the zeroes.
A quick plot shows that the function has such a wide range that looking over the entire domain at once will be problematic:
julia> f(x) = exp(x) - x^4
julia> plot(f, 0, 10)
Instead, we look between \([0,3]\) and \([8,9]\). A quick confirmation shows these are good choices to use. For example, between \(8\) and \(9\) we have:
julia> plot([f, x->0], 8, 9) ## plot f and the line y=0
So we find the value of the zero in the bracketed region \([8,9]\):
julia> bisection_method(f, 8, 9)
8.61316945643921
The root within \([0, 3]\) is found with:
julia> bisection_method(f, 0, 3)
1.4296118253841996
There is another root in the interval \([-10, 0]\) for the function \(f(x) = e^x - x^4\). Find its value:
Let b=37 and \(f(x) = x^2 - b \cdot x \cdot \log(x)\). This function has two zeroes on the positive \(x\) axis. You are asked to find the largest (graph and bracket…):
Use the bisection method to find the root of
\[ g(x) = \tan(\theta) x + \frac{32/k}{200 \cos\theta} x - \frac{32}{k} \log\left(\frac{200 \cos\theta}{200\cos\theta - x}\right). \]
for \(\theta = \pi/4\) and \(k=1/2\). If \(a = 200 \cos\theta\), use the values \(a/2\) and \(a-1\) to bracket the root.
One can speed the bisection method up by choosing the midpoint more carefully. The following function implements “Ridder's method” for picking the midpoint.
julia> function update_mid(a, b)
## here f refers to the function being bisected, taken from the enclosing scope
c = (a + b)/2
discr = f(c)^2 - f(a)*f(b)
c + (c - a) * sign(f(a) - f(b)) * f(c) / sqrt(discr)
end
Replace the update_mid function in the bisection method and give it a new name, say ridder.
It takes 25 steps for the bisection method to find the zero of the function \(f(x) = \sin(x) - \cos^2(x)\):
julia> bisection_method( x -> sin(x) - cos(x)^2, 0, pi/2)
0.6662394364838946
How many steps does it take with Ridder's method?