1. Fix a bug in psqrt and make it return 0 for +inf arguments.
2. Simplify handling of special cases by taking advantage of the fact that the
   builtin vrsqrt approximation handles negative, zero and +inf arguments correctly.
   This speeds up the SSE and AVX implementations by ~20%.
3. Make the Newton-Raphson formula used for rsqrt more numerically robust:

Before: y = y * (1.5 - x/2 * y^2)
After: y = y * (1.5 - y * (x/2) * y)

Forming y^2 can overflow for very large or very small (denormalized) values of x, while x*y ~= 1. For AVX512, this makes it possible to compute accurate results for denormal inputs down to ~1e-42 in single precision.

4. Add a faster double precision implementation for Knights Landing using the vrsqrt28 instruction and a single Newton-Raphson iteration.

Benchmark results: https://bitbucket.org/snippets/rmlarsen/5LBq9o
4 files changed
tree: a20a4945bf0083ffe1a4d4a617a7a2c4740ba00a
  1. bench/
  2. blas/
  3. cmake/
  4. debug/
  5. demos/
  6. doc/
  7. Eigen/
  8. failtest/
  9. lapack/
  10. scripts/
  11. test/
  12. unsupported/
  13. .hgeol
  14. .hgignore
  15. CMakeLists.txt
  16. COPYING.BSD
  17. COPYING.GPL
  18. COPYING.LGPL
  19. COPYING.MINPACK
  20. COPYING.MPL2
  21. COPYING.README
  22. CTestConfig.cmake
  23. CTestCustom.cmake.in
  24. eigen3.pc.in
  25. INSTALL
  26. README.md
  27. signature_of_eigen3_matrix_library
README.md

Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.

For more information go to http://eigen.tuxfamily.org/.

For pull request please only use the official repository at https://bitbucket.org/eigen/eigen.

For bug reports and feature requests go to http://eigen.tuxfamily.org/bz.