Boost C++ Libraries Home Libraries People FAQ More

PrevUpHomeNext

Comparison of Nth-root Finding Algorithms

A second example compares four generalized nth-root finding algorithms for various n-th roots (5, 7 and 13) of a single value 28.0, for four floating-point types, float, double, long double and a Boost.Multiprecision type cpp_bin_float_50. In each case the target accuracy was set using our "recommended" accuracy limits (or at least limits that make a good starting point - which is likely to give close to full accuracy without resorting to unnecessary iterations).

Function

Precision Requested

TOMS748

numeric_limits<T>::digits - 2

Newton

floor(numeric_limits<T>::digits * 0.6)

Halley

floor(numeric_limits<T>::digits * 0.4)

Schröder

floor(numeric_limits<T>::digits * 0.4)

Tests used Microsoft Visual Studio 2013 (Update 1) and GCC 4.9.1 using source code root_n_finding_algorithms.cpp.

The timing uncertainty (especially using MSVC) is at least 5% of normalized time 'Norm'.

To pick out the 'best' and 'worst' algorithms are highlighted in blue and red. More than one result can be 'best' when normalized times are indistinguishable within the uncertainty.

Program ..\example\root_n_finding_algorithms.cpp, Microsoft Visual C++ version 14.1, Dinkumware standard library version 650, Win32 Compiled in optimise mode., _X86_SSE2

Fraction of full accuracy 1

Table 10.3. 5th root(28) for float, double, long double and cpp_bin_float_50 types, using _X86_SSE2

float

double

long d

cpp50

   

Algo

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

TOMS748

7

457

2.00

0

11

860

3.54

1

11

806

3.02

1

12

226875

8.11

0

Newton

3

228

1.00

0

4

243

1.00

-1

4

298

1.12

-1

6

27968

1.00

0

Halley

2

250

1.10

0

3

268

1.10

0

3

267

1.00

0

4

52812

1.89

0

Schröder

2

256

1.12

0

3

271

1.12

-1

3

270

1.01

-1

4

61406

2.20

0


Table 10.4. 7th root(28) for float, double, long double and cpp_bin_float_50 types, using _X86_SSE2

float

double

long d

cpp50

   

Algo

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

TOMS748

12

825

3.06

1

15

1145

4.06

2

15

1159

4.17

2

14

295781

8.12

0

Newton

5

270

1.00

0

6

282

1.00

0

6

278

1.00

0

8

36406

1.00

0

Halley

4

303

1.12

0

5

329

1.17

0

5

335

1.21

0

6

78281

2.15

0

Schröder

5

340

1.26

0

6

432

1.53

0

6

367

1.32

0

7

85156

2.34

0


Table 10.5. 11th root(28) for float, double, long double and cpp_bin_float_50 types, using _X86_SSE2

float

double

long d

cpp50

   

Algo

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

TOMS748

12

714

3.16

-2

14

909

4.19

2

14

793

3.69

2

17

211718

9.28

2

Newton

6

226

1.00

0

7

217

1.00

0

7

215

1.00

0

9

22812

1.00

0

Halley

4

262

1.16

-1

5

260

1.20

0

5

260

1.21

0

6

40781

1.79

0

Schröder

6

332

1.47

0

7

314

1.45

0

7

310

1.44

0

8

67187

2.95

0


Program root_n_finding_algorithms.cpp, Microsoft Visual C++ version 12.0, Dinkumware standard library version 610, Win32 Compiled in optimise mode., _X64_AVX

Fraction of full accuracy 1

Table 10.6. 5th root(28) for float, double, long double and cpp_bin_float_50 types, using _X64_AVX

float

double

long d

cpp50

   

Algo

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

TOMS748

7

239

1.50

0

11

451

2.53

1

11

439

2.49

1

12

90312

7.51

0

Newton

3

159

1.00

0

4

178

1.00

-1

4

176

1.00

-1

6

12031

1.00

0

Halley

2

168

1.06

0

3

203

1.14

0

3

198

1.13

0

4

20937

1.74

0

Schröder

2

173

1.09

0

3

206

1.16

-1

3

203

1.15

-1

4

26250

2.18

0


Table 10.7. 7th root(28) for float, double, long double and cpp_bin_float_50 types, using _X64_AVX

float

double

long d

cpp50

   

Algo

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

TOMS748

12

385

2.19

1

15

635

3.13

2

15

621

3.17

2

14

114843

6.81

0

Newton

5

176

1.00

0

6

203

1.00

0

6

196

1.00

0

8

16875

1.00

0

Halley

4

209

1.19

0

5

254

1.25

0

5

246

1.26

0

6

32343

1.92

0

Schröder

5

223

1.27

0

6

273

1.34

0

6

275

1.40

0

7

45156

2.68

0


Table 10.8. 11th root(28) for float, double, long double and cpp_bin_float_50 types, using _X64_AVX

float

double

long d

cpp50

   

Algo

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

TOMS748

12

467

2.42

-2

14

648

3.06

2

14

640

2.99

2

17

170000

8.85

2

Newton

6

193

1.00

0

7

212

1.00

0

7

214

1.00

0

9

19218

1.00

0

Halley

4

209

1.08

-1

5

256

1.21

0

5

250

1.17

0

6

32656

1.70

0

Schröder

6

248

1.28

0

7

306

1.44

0

7

298

1.39

0

8

53437

2.78

0


Program ..\example\root_n_finding_algorithms.cpp, GNU C++ version 7.1.0, GNU libstdc++ version 20170502, Win32 Compiled in optimise mode., _X64_SSE2

Fraction of full accuracy 1

Table 10.9. 5th root(28) for float, double, long double and cpp_bin_float_50 types, using _X64_SSE2

float

double

long d

cpp50

   

Algo

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

TOMS748

7

206

2.24

0

11

460

4.04

1

9

554

4.40

0

12

57656

8.39

0

Newton

3

92

1.00

0

4

114

1.00

-1

5

126

1.00

0

6

6875

1.00

0

Halley

2

106

1.15

0

3

134

1.18

0

3

178

1.41

0

4

12500

1.82

0

Schröder

2

126

1.37

0

3

143

1.25

-1

3

198

1.57

0

4

15312

2.23

0


Table 10.10. 7th root(28) for float, double, long double and cpp_bin_float_50 types, using _X64_SSE2

float

double

long d

cpp50

   

Algo

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

TOMS748

12

345

2.09

1

15

615

3.14

2

13

875

3.98

0

14

70937

7.32

0

Newton

5

165

1.00

0

6

196

1.00

0

7

220

1.00

0

8

9687

1.00

0

Halley

4

193

1.17

0

5

239

1.22

0

5

298

1.35

0

6

19062

1.97

0

Schröder

5

217

1.32

0

6

270

1.38

0

6

367

1.67

0

7

27343

2.82

0


Table 10.11. 11th root(28) for float, double, long double and cpp_bin_float_50 types, using _X64_SSE2

float

double

long d

cpp50

   

Algo

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

Its

Times

Norm

Dis

TOMS748

12

412

2.15

-2

14

646

2.96

2

14

1054

4.22

1

17

107187

9.53

2

Newton

6

192

1.00

0

7

218

1.00

0

7

250

1.00

0

9

11250

1.00

0

Halley

4

200

1.04

-1

5

243

1.11

0

5

345

1.38

0

6

19687

1.75

0

Schröder

6

254

1.32

0

7

321

1.47

0

7

471

1.88

0

8

33281

2.96

0


Some tentative conclusions can be drawn from this limited exercise.

Clearly, your mileage will vary, but in summary, Newton-Raphson iteration seems the first choice of algorithm, and effort to find a good 'guess' the first speed-up target, especially for Boost.Multiprecision. And of course, compiler optimisation is crucial for speed.


PrevUpHomeNext