Faster Polynomial Evaluation
This might not be one of your daily problems. Evaluating polynomials seems to be a simple task until the coefficients become big numbers, e.g. 1024 bits or more, and could impact your whole software component.
Classic evaluation#
A simple way to evaluate a polynomial is:
def classicCompute(variable: BigInt): BigInt = {
coefficients
.zipWithIndex
.foldLeft(BigInt(0)) {
case (accumulator, (coefficient, i)) =>
accumulator + coefficient * variable.pow(i)
}
}
Optimized evaluation#
In order to optimize a polynomial evaluation, one could use Horner’s method, which implies a significant improvement when using big numbers.
Translated into code, the evaluation looks like this:
def compute(variable: BigInt): BigInt = {
@tailrec
def loop(accumulator: BigInt, coefficients: List[BigInt]): BigInt = {
coefficients match {
case Nil => accumulator
case coefficient :: remaining =>
loop(coefficient + variable * accumulator, remaining)
}
}
loop(0, coefficients.reverse)
}
In the real world#
Given the two implementation, I’ve setup a benchmark test using a sbt plugin for the JMH (Java Microbenchmark Harness).
The results for 4096bit size numbers and 20 coefficients on my machine (Intel i7-7500U CPU @ 2.70GHz
) are:
Benchmark Mode Cnt Score Error Units
evaluatePolynomialClassic thrpt 3 0.065 ± 0.026 ops/ms
evaluatePolynomialHorner thrpt 3 0.135 ± 0.072 ops/ms
evaluatePolynomialClassic avgt 3 23.294 ± 48.218 ms/op
evaluatePolynomialHorner avgt 3 8.756 ± 30.397 ms/op
evaluatePolynomialClassic sample 1048 28.631 ± 1.211 ms/op
evaluatePolynomialClassic·p0.00 sample 12.550 ms/op
evaluatePolynomialClassic·p0.50 sample 25.330 ms/op
evaluatePolynomialClassic·p0.90 sample 45.181 ms/op
evaluatePolynomialClassic·p0.95 sample 53.549 ms/op
evaluatePolynomialClassic·p0.99 sample 67.898 ms/op
evaluatePolynomialClassic·p0.999 sample 102.723 ms/op
evaluatePolynomialClassic·p0.9999 sample 103.154 ms/op
evaluatePolynomialClassic·p1.00 sample 103.154 ms/op
evaluatePolynomialHorner sample 7838 3.823 ± 0.018 ms/op
evaluatePolynomialHorner·p0.00 sample 3.453 ms/op
evaluatePolynomialHorner·p0.50 sample 3.678 ms/op
evaluatePolynomialHorner·p0.90 sample 4.293 ms/op
evaluatePolynomialHorner·p0.95 sample 4.645 ms/op
evaluatePolynomialHorner·p0.99 sample 6.070 ms/op
evaluatePolynomialHorner·p0.999 sample 7.196 ms/op
evaluatePolynomialHorner·p0.9999 sample 12.157 ms/op
evaluatePolynomialHorner·p1.00 sample 12.157 ms/op
evaluatePolynomialClassic ss 3 24.851 ± 33.086 ms/op
evaluatePolynomialHorner ss 3 17.525 ± 11.917 ms/op
As we can see in the outcome above, the classic evaluation is easily outperformed by the Horner’s method, sometimes being 8x faster than the classic one.
Later on we’ll see if there are any actual usages out there in the real world.