AGENTS.md

Guidance for AI coding agents (Claude Code, Cursor, Copilot Workspace, Aider, OpenAI Codex, etc.) working in this repository. Human contributors should start from README.md and the project sites it links to (https://eigen.tuxfamily.org, https://libeigen.gitlab.io, and the upstream repo at https://gitlab.com/libeigen/eigen); this file is the agent-facing condensation.

If your tool also reads its own per-tool config file, treat that as a thin pointer and rely on this file for substance.

Agent guidelines (read first)

These are the cross-cutting rules that catch agents most often:

  1. Provenance and attribution — no plagiarism, no license laundering. Eigen code must be original or derived from publicly published MPL-2.0-compatible material. Do not copy — verbatim, paraphrased, or “translated” to another syntax — from incompatibly-licensed sources (proprietary, NDA-encumbered, prior-employer internal); paraphrasing is still a derivative work. Cite published references inline when they inform an implementation — LAPACK / LAWN, ACM TOMS / SIAM papers, Higham, Golub & van Loan, textbook algorithms, Boost components, vendor application notes — by name (author, year, identifier; a comment or Doxygen \note block suffices). Same discipline for AI-suggested code: cite the source, or rewrite from a known reference and cite that, or drop it. Ideas aren't copyrightable, specific expressions are — learn from the source, write your own, credit it.
  2. Header-only contract. Eigen ships as headers only. Never include anything under Eigen/src/... or unsupported/Eigen/src/... directly — InternalHeaderCheck.h makes that a hard compile error. User code reaches implementation only through the umbrella module headers (Eigen/Core, Eigen/Dense, Eigen/SVD, …). When moving or renaming files inside any src/ subtree, delete the old file outright; only the public umbrella headers ever get back-compat forwarding shims.
  3. Preserve EIGEN_DEVICE_FUNC. It is pervasive on coefficient-level methods so the same code compiles for host and CUDA / HIP / SYCL device. Dropping it silently breaks GPU builds and is rarely caught locally.
  4. Don't apply general C++ “modernize” advice. Do not propose modernize-* or cppcoreguidelines-* clang-tidy fixes, replace EIGEN_STRONG_INLINE with inline, “fix” Eigen's macro indentation, or reorder includes. Eigen has its own conventions encoded in .clang-format (SortIncludes: false, custom StatementMacros / AttributeMacros); CI will diff against them.
  5. Format before you commit. Style is clang-format-17 exactly (Google base, 120 cols). Other versions diff against CI. Run scripts/format.sh (whole tree) or clang-format-17 -i <file> (one file) before pushing. Format failures are the single most common reason an MR is red.
  6. Tests are not gtest (today). They use Eigen's own framework (test/main.h): VERIFY_* assertions, CALL_SUBTEST_N / EIGEN_TEST_PART_N for splitting, EIGEN_DECLARE_TEST(<name>) { ... } as the entry point. See “Adding a test” below. In-flight migration to Google Test: MR 2159 (draft) replaces EIGEN_DECLARE_TEST / CALL_SUBTEST_N with gtest TEST / TYPED_TEST, brings in gtest 1.15.2 via FetchContent, and bridges VERIFY / VERIFY_IS_APPROX to gtest expectations. When that lands, this rule flips — write new tests as gtest fixtures and use the bridged VERIFY_* macros.
  7. Tests are not in the default all target. Bare ninja builds nothing relevant. Use the named targets (buildtests, check, BuildOfficial, BuildUnsupported, buildsmoketests, buildtests_gpu, check_gpu).
  8. auto traps. auto x = A + B; captures a lazy expression holding references that can dangle. Use .eval() or an explicit type. Likewise be careful with .noalias() — it must only be used when the destination doesn't appear on the right side.
  9. Stage commits explicitly. Don't git add -A / git add . inside an Eigen working copy. Repo roots commonly accumulate untracked dotfiles and tool config (.vscode/, .idea/, .claude/, etc.) that must not enter commits. Add files by path or with targeted globs (e.g. git add 'Eigen/src/Cholesky/').
  10. Pause before pushing or filing an MR. External-system writes (push, MR creation, MR comments) have non-trivial blast radius. After a local commit, summarize what changed and wait for the human to say “push” or “file the MR” before doing so. The same applies to scope decisions like bundling/splitting commits.
  11. Tests and benchmarks ship with the code. New functionality lands with its tests; performance-sensitive changes land with a benchmark. Don't defer either to a follow-up MR.
  12. Benchmarking discipline. Benchmarks on a loaded system are useless. Before running any benchmark: check uptime shows a low load average, finish/cancel background builds, and never run two benchmark binaries in the same shell invocation (parallel or chained with &&) — run each in its own shell, take medians within a binary, and optionally alternate (A, B, A, B) across separate invocations to detect drift.
  13. Tensor module is foundational to TensorFlow. unsupported/Eigen/Tensor and the work-stealing thread pool in Eigen/ThreadPool together form TensorFlow's core compute backend. “Unsupported” here means “looser API-stability guarantees” — it does not mean low-traffic or low-stakes. Breaking changes (signatures, header layout, semantics, or performance regressions on contraction / reduction / morphing kernels) ripple into every TensorFlow build and from there into every project that pulls TensorFlow as a dependency. Treat Tensor and ThreadPool as load-bearing: prefer additive changes, keep header paths stable, run downstream Tensor tests (unsupported/test/tensor_*), and call out any behavior change prominently in the MR description.

Repository overview

Eigen is a header-only C++ template library for linear algebra: dense and sparse matrices, vectors, decompositions, geometry, iterative solvers. The library itself does not need to be built — consumers just #include <Eigen/Dense> (or other module headers). CMake is required only for tests, BLAS/LAPACK shims, demos, and docs.

Upstream is GitLab: https://gitlab.com/libeigen/eigen. Bug reports, feature requests, and merge requests go there — the GitHub mirror is read-only. Minimum standard is C++14 (target_compile_features(eigen INTERFACE cxx_std_14) in CMakeLists.txt); SYCL builds force C++17. The project aspires to bump the baseline to C++17 in a future release. The pace on baseline bumps is deliberately slow so that new Eigen improvements remain available to users on embedded platforms, whose toolchains are often several years behind mainline. License: MPL2 for the bulk, with a few files under other compatible licenses (COPYING.* at the root).

Quality bar

Eigen aspires to state-of-the-art results on two axes — performance and numerical accuracy / IEEE-754 conformance — and treats them as separate goals that are sometimes in tension.

Performance. Eigen Core emphasizes single-core throughput via two levers: SIMD through per-architecture packet backends (see “SIMD / packet math layer”), and memory-hierarchy use through blocked algorithms (cache-aware blocking and panel/kernel decompositions in Eigen/src/Core/products/, tile-based traversal in the Tensor module). Multi-core in Core is opt-in (OpenMP or EIGEN_GEMM_THREADPOOL) and covers a subset of operations. The Tensor module is the opposite: it is designed for multi-core via ThreadPoolDevice, which dispatches across worker threads of the work-stealing thread pool. See “Multi-threading” below.

Numerical accuracy. The bar is LAPACK-level for linear algebra (decompositions, solvers — backward stability, pivoting, conditioning) and C++ standard-library level for standard math functions (exp, log, sin, pow, …) on scalars. For special values — IEEE-754 entities (NaN, ±0, ±∞, subnormals) and function-specific edge cases (singularities, branch boundaries; e.g. log(0), pow(0, 0)) — the bar is exact conformance to IEEE 754 / ISO C / C++ specifications, with the cppreference page for each function as the authoritative spec (e.g. std::pow). On regular inputs a few ULPs of error in vectorized math in exchange for SIMD throughput is the long-standing trade-off; larger deviations need explicit justification. Special-value handling is not subject to the few-ULPs trade-off — it must match spec exactly.

When adding or modifying a numerical kernel:

  • Test numerical corner cases, not just typical inputs: ±0, ±∞, NaN, subnormals, values at and around the function's domain boundaries (e.g. log(0), log(-x), pow(0, 0)), values near overflow / underflow, denormalized results, signed-zero preservation, and ULP behavior near hard cases (e.g. argument-reduction breakdown for sin/cos at large arguments).
  • Matrix coverage matters as much as scalar coverage. Decomposition, solver, and matrix-function tests should exercise ill-conditioned inputs across a range of condition numbers (well-conditioned through near-singular through singular) and matrices with structure relevant to the algorithm: Hilbert, Vandermonde, Pascal, Wilkinson's W, Frank, Lehmer, KMS / Toeplitz, banded, rank-deficient, defective and near-defective (Jordan blocks), positive-definite-but-barely, etc. Standard references: Higham's “Accuracy and Stability of Numerical Algorithms” and “Functions of Matrices” (and MATLAB gallery(), largely drawn from those books) and Golub & van Loan, “Matrix Computations”. Where the algorithm has a LAPACK counterpart, match its TESTING/ category coverage as the bar.
  • Verify behavior across all enabled packet backends — test/packetmath.cpp and unsupported/test/special_packetmath.cpp are the canonical entry points; a missing or divergent backend specialization usually shows up there first.
  • Quantify accuracy regressions in ULPs against the scalar reference, not just relative error. Sollya / MPFR are the standard tools for ground-truth and polynomial generation; do the verification in C++ with MPFR rather than Python.
  • Performance-sensitive changes ship with a benchmark (under benchmarks/ or unsupported/benchmarks/). See “Benchmarking discipline” in the agent guidelines above.

For decompositions and solvers: the bar is matching LAPACK on conditioning, pivoting strategy, and backward stability. Don't trade numerical robustness for speed in those code paths without explicit sign-off.
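The corner-case and ULP bullets above can be made concrete with a small standalone sketch. It assumes IEEE-754 binary32 floats and default (non fast-math) compiler settings; ordered_bits, ulp_distance, and special_values_ok are hypothetical helper names for illustration and are not part of test/main.h.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper (not part of Eigen's test framework): map a float onto
// a signed integer line on which adjacent representable values differ by 1.
inline std::int64_t ordered_bits(float x) {
  std::uint32_t u;
  std::memcpy(&u, &x, sizeof u);
  // Negative floats sort in reverse bit order, so mirror them below zero.
  // This also places -0.0f and +0.0f at the same position.
  return (u & 0x80000000u) ? -static_cast<std::int64_t>(u & 0x7fffffffu)
                           : static_cast<std::int64_t>(u);
}

// Distance in units in the last place between two non-NaN floats.
inline std::int64_t ulp_distance(float a, float b) {
  assert(!std::isnan(a) && !std::isnan(b));
  const std::int64_t d = ordered_bits(a) - ordered_bits(b);
  return d < 0 ? -d : d;
}

// Special values are checked for exact results; no ULP budget applies here.
inline bool special_values_ok() {
  const double inf = std::numeric_limits<double>::infinity();
  return std::pow(0.0, 0.0) == 1.0          // pow(x, 0) is 1 for any x
         && std::log(0.0) == -inf           // pole at zero
         && std::isnan(std::log(-1.0))      // outside the domain
         && std::signbit(std::sqrt(-0.0));  // signed zero is preserved
}
```

A vectorized kernel would be compared coefficient by coefficient against its scalar reference with ulp_distance, while the special-value checks stay exact comparisons.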

Build / test

In-flight test-framework migration: MR 2159 migrates the test framework from Eigen's custom EIGEN_DECLARE_TEST / CALL_SUBTEST_N macros to Google Test. The “Adding a test” and “Test split” subsections below describe the current framework on master. Once 2159 lands, tests become gtest TEST / TYPED_TEST fixtures, per-N executable splitting goes away, and VERIFY_* survives as a bridge to gtest expectations.

Tests are intentionally not in the default all target — ninja (or make) on its own builds nothing relevant. Drive everything through the named targets:

mkdir -p build && cd build
cmake -G Ninja ..                        # plain config

ninja buildtests                         # build all unit tests
ninja buildtests_gpu                     # GPU-only tests
ninja BuildOfficial                      # only test/      (subproject "Official")
ninja BuildUnsupported                   # only unsupported/test/ (subproject "Unsupported")
ninja buildsmoketests                    # the MR smoke set
ninja check                              # = buildtests + ctest
ninja check_gpu                          # = buildtests_gpu + ctest -L gpu

# generated wrappers (after `cmake` configure; run from inside the build dir):
./buildtests.sh <regex>                  # build tests whose name matches
./check.sh <regex>                       # build + run tests whose name matches

Common CMake knobs

ISA / vectorization (turn on per-ISA test compile flags — see CMakeLists.txt and cmake/EigenTesting.cmake for the authoritative list):

  • x86: EIGEN_TEST_SSE2, EIGEN_TEST_SSE3, EIGEN_TEST_SSSE3, EIGEN_TEST_SSE4_1, EIGEN_TEST_SSE4_2, EIGEN_TEST_AVX, EIGEN_TEST_AVX2, EIGEN_TEST_AVX512, EIGEN_TEST_AVX512DQ, EIGEN_TEST_AVX512FP16, EIGEN_TEST_FMA, EIGEN_TEST_F16C, EIGEN_TEST_X87, EIGEN_TEST_32BIT
  • ARM: EIGEN_TEST_NEON, EIGEN_TEST_NEON64
  • PowerPC: EIGEN_TEST_VSX, EIGEN_TEST_ALTIVEC
  • IBM Z (s390x): EIGEN_TEST_Z13, EIGEN_TEST_Z14
  • LoongArch: EIGEN_TEST_LSX
  • MIPS: EIGEN_TEST_MSA
  • GPU / SYCL: EIGEN_TEST_CUDA, EIGEN_TEST_CUDA_CLANG, EIGEN_TEST_CUDA_NVC (NVHPC), EIGEN_TEST_HIP, EIGEN_TEST_SYCL
  • Negative / behavioral: EIGEN_TEST_NO_EXPLICIT_VECTORIZATION, EIGEN_TEST_NO_EXPLICIT_ALIGNMENT, EIGEN_TEST_NO_EXCEPTIONS

Test-wide knobs:

  • EIGEN_TEST_MAX_SIZE=320 (default) — clamp the random matrix sizes used by tests.
  • EIGEN_SPLIT_LARGE_TESTS=ON (default) — splits any test using CALL_SUBTEST_N / EIGEN_TEST_PART_N into per-N executables (foo_1, foo_2, …); see “Test split” below.
  • EIGEN_DEFAULT_TO_ROW_MAJOR=ON — re-runs the suite with row-major default storage.
  • EIGEN_LEAVE_TEST_IN_ALL_TARGET=ON — adds tests back to all (used by some CI harnesses driving ctest's automatic build path).
  • EIGEN_TEST_CUSTOM_CXX_FLAGS=…, EIGEN_TEST_CUSTOM_LINKER_FLAGS=… — extra flags applied only to test targets (handy for working around codegen bugs).
  • EIGEN_TEST_OPENMP=ON — link tests against OpenMP.
  • EIGEN_TEST_EXTERNAL_BLAS=ON — exercise the EIGEN_USE_BLAS path against an external BLAS implementation; without it, the in-tree eigen_blas from blas/ is used. (An external-LAPACK equivalent is not yet wired up — test/CMakeLists.txt carries a TODO do the same for EXTERNAL_LAPACK, so EIGEN_TEST_EXTERNAL_LAPACK currently has no effect.)

Auxiliary trees:

  • EIGEN_BUILD_BLAS=ON / EIGEN_BUILD_LAPACK=ON (default ON only for top-level builds) — build the Eigen-backed BLAS/LAPACK shim libraries under blas/ and lapack/.
  • EIGEN_BUILD_DOC=ON — Doxygen documentation (ninja doc).
  • EIGEN_BUILD_DEMOS=ON — the small demos under demos/.

Running tests

ctest --parallel --output-on-failure          # everything that's been built
ctest -L Official                              # only tests under test/
ctest -L Unsupported                           # only tests under unsupported/test/
ctest -L gpu                                   # GPU-tagged tests
ctest -L smoketest                             # the MR smoke set
ctest -R '^cholesky'                           # regex over test names
ctest -R '^bdcsvd_3$' --output-on-failure -V   # one specific split

Subproject labels come from set_property(GLOBAL PROPERTY EIGEN_CURRENT_SUBPROJECT "Official"|"Unsupported") in test/CMakeLists.txt and unsupported/test/CMakeLists.txt. Build-group targets (BuildOfficial, BuildUnsupported) are kept in sync with the ctest labels.

A test binary can be invoked directly to control seed / repeat:

./test/foo_3 r5 s1234            # repeat 5 times, fixed seed 1234
EIGEN_REPEAT=10 EIGEN_SEED=1 ./test/foo_3

Test split (important)

A test source file containing CALL_SUBTEST_N(...) or EIGEN_TEST_PART_N macros is compiled into N separate executables named <testname>_1, <testname>_2, …, each built with -DEIGEN_TEST_PART_<N>=1 (logic in ei_add_test in cmake/EigenTesting.cmake). The umbrella <testname> target builds all parts. ctest -R ^<testname>$ does not match individual parts, because each part is registered under its suffixed name; use the unanchored ctest -R <testname> to select them. Set EIGEN_SPLIT_LARGE_TESTS=OFF to fold them into a single binary if you need to debug across parts.
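The idea behind the split (one source file, several binaries, each compiled with a different part define) can be sketched in isolation. The SKETCH_* macros below are hypothetical stand-ins written for this illustration; they are not the definitions in test/main.h, and the #define at the top plays the role of the -D flag that the build would normally supply.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the per-part define that the build passes on the command line.
#define SKETCH_TEST_PART_2 1

// Hypothetical macros: a subtest call expands to real code only in the
// binary compiled for its part number, and to nothing everywhere else.
#ifdef SKETCH_TEST_PART_1
#define SKETCH_CALL_SUBTEST_1(call) call
#else
#define SKETCH_CALL_SUBTEST_1(call)
#endif

#ifdef SKETCH_TEST_PART_2
#define SKETCH_CALL_SUBTEST_2(call) call
#else
#define SKETCH_CALL_SUBTEST_2(call)
#endif

inline void record(std::vector<int>& log, int part) {
  assert(part > 0);
  log.push_back(part);
}

// The same body is compiled for every part; which calls survive depends
// only on the define that was active for this translation unit.
inline std::vector<int> run_sketch_test() {
  std::vector<int> executed;
  SKETCH_CALL_SUBTEST_1(record(executed, 1));
  SKETCH_CALL_SUBTEST_2(record(executed, 2));
  return executed;
}
```

With only part 2 defined, run_sketch_test() executes the second subtest alone, which is why a failure in one part has to be reproduced with that part's binary.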

Adding a test

  1. Create test/<name>.cpp (or unsupported/test/<name>.cpp). Include main.h. Use the VERIFY* macros (listed below). For multi-part tests, structure the body as CALL_SUBTEST_1(...) ... CALL_SUBTEST_N(...).
  2. End the file with the entry point: EIGEN_DECLARE_TEST(<name>) { ... } — that macro expands to the per-binary main() (random seed / repeat handling, signal handlers, etc.). It is not gtest.
  3. Register with ei_add_test(<name>) in the matching CMakeLists.txt near similar tests. The second/third args are extra compile flags / libraries (e.g. ei_add_test(packetmath "-DEIGEN_FAST_MATH=1")).
  4. Re-run CMake configure; the new target is then in buildtests and ctest.

Test assertion macros (defined in test/main.h):

  • VERIFY(cond) — assert condition
  • VERIFY_IS_APPROX(a, b), VERIFY_IS_NOT_APPROX(a, b) — approximate floating-point equality
  • VERIFY_IS_EQUAL(a, b), VERIFY_IS_NOT_EQUAL(a, b) — exact equality
  • VERIFY_IS_MUCH_SMALLER_THAN(a, b)
  • VERIFY_RAISES_ASSERT(expr) — assert that eigen_assert fires

failtest/ holds compile-failure tests: each has an _ok and _ko target — _ok must compile, _ko must fail to compile (driven via -DEIGEN_SHOULD_FAIL_TO_BUILD).

Benchmarks

Benchmarks under benchmarks/ are not part of the main test build — benchmarks/CMakeLists.txt is a standalone CMake project (project(EigenBenchmarks CXX)) that depends on Google Benchmark (find_package(benchmark REQUIRED)) and finds Eigen as a sibling header-only include. Configure and build it separately (e.g. cmake -G Ninja -S benchmarks -B build-bench && ninja -C build-bench), not through the buildtests/check targets. CI builds them in the dedicated benchmark stage via ci/scripts/build.benchmark.sh. (See “Benchmarking discipline” in the agent guidelines for how to run them meaningfully.)

Specify arg grids declaratively with Args / Range / ArgsProduct, never a hand-written Apply() callback. Google Benchmark's Apply(fn) passes its registration object as benchmark::internal::Benchmark*, an internal type (note the internal namespace) that is not part of the public API and is easy to misname. The frequent mistake is writing void MyArgs(benchmark::Benchmark* b) or ::benchmark::Benchmark* (no such type; the class is only reachable as benchmark::internal::Benchmark), which fails to compile and broke the whole benchmark CI stage. Reach for the chainable methods on the registration itself instead: ->Args({a, b}) for one point, ->Range(lo, hi) / ->DenseRange(lo, hi, step) for one swept dimension, and ->ArgsProduct({{...}, {...}}) for the Cartesian product of several dimensions (the declarative form of nested for loops calling b->Args(...)). These keep the grid on the registration, need no internal types, and read more clearly. Only fall back to Apply() for genuinely computed grids that these methods can't express, and then spell the parameter benchmark::internal::Benchmark*.

Formatting and lint

Style is clang-format-17 (Google base, 120 cols, see .clang-format). The version is hard-coded — newer or older clang-format will diff against CI.

scripts/format.sh                                 # reformat the whole tree
clang-format-17 -i <file>                         # reformat one file
clang-format-17 --dry-run --Werror <file>         # check (CI's `checkformat:clangformat`)
git clang-format --diff --commit <base-sha>       # diff what CI will diff
codespell --config setup.cfg                      # spell-check (also a CI job)

.clang-format registers Eigen-specific macros (EIGEN_STATIC_ASSERT, EIGEN_INITIALIZE_COEFFS_IF_THAT_OPTION_IS_ENABLED, EIGEN_INTERNAL_DENSE_STORAGE_CTOR_PLUGIN, etc.) as StatementMacros, and EIGEN_STRONG_INLINE / EIGEN_ALWAYS_INLINE / EIGEN_DEVICE_FUNC / EIGEN_DONT_INLINE / EIGEN_DEPRECATED / EIGEN_UNUSED as AttributeMacros. Don't “fix” their indentation or strip them. SortIncludes: false — include order is meaningful in this codebase.

clang-tidy runs in the checkformat:clangtidy MR job. Locally: cmake -G Ninja -S . -B .tidy-build -DCMAKE_EXPORT_COMPILE_COMMANDS=ON && clang-tidy -p .tidy-build <files>. Eigen has its own conventions — do not apply modernize-* or cppcoreguidelines-* checks.

SPDX / REUSE. Every new source file (.h, .cpp, .cu, .inc, .cmake, CMakeLists.txt, etc.) must carry inline copyright + license headers; the checkformat:reuse CI job (reuse lint) blocks otherwise. The standard Eigen header is:

// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) <year> <Your Name> <your.email@example.com>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
// SPDX-License-Identifier: MPL-2.0

An alternative SPDX-style header — // SPDX-FileCopyrightText: The Eigen Authors plus // SPDX-License-Identifier: MPL-2.0, with no individual Copyright (C) line — is also accepted and is in active use (e.g. files attributed to “The Eigen Authors” and some top-level umbrella headers). The Copyright (C) <year> <name> form above remains the dominant convention; either passes reuse lint.

Top-level docs (*.md), generated files (*.in), and binary assets that can't carry inline headers are covered by path annotations in REUSE.toml; add the path if you're creating one.

Module layout

In-flight rename: MR 2522 renames the top-level unsupported/ directory to contrib/. Paths below track master (still unsupported/). When 2522 lands, substitute contrib/ for unsupported/ mechanically — it doesn't affect the canonical/forwarding-shim distinction on the unsupported/Eigen/ bullet.

  • Eigen/ — supported public headers. Each filename without an extension (Eigen/Core, Eigen/Dense, Eigen/SVD, …) is the umbrella include for one module; implementation lives in Eigen/src/<Module>/. Never include anything under Eigen/src/... directly — that is a hard error (each implementation header includes InternalHeaderCheck.h to enforce it).
  • Eigen/src/Core/arch/{SSE,AVX,AVX512,NEON,SVE,AltiVec,GPU,HIP,SYCL,LSX,RVV10,HVX,MSA,ZVector,clang,Default}/ — per-architecture packet-math (vectorization) backends. GenericPacketMath.h defines the internal::p* API each backend specializes. (PowerPC VSX support lives inside AltiVec/, no separate VSX/ directory.)
  • Eigen/src/Core/products/ — gemm/gemv kernels (GeneralBlockPanelKernel.h, triangular / self-adjoint variants, BLAS bridges in *_BLAS.h).
  • Eigen/src/Core/util/ — meta-programming, macros, memory, ForwardDeclarations.h. Macros.h and ConfigureVectorization.h set the compile-time feature flags.
  • Eigen/ThreadPool (with Eigen/src/ThreadPool/) — work-stealing thread pool (NonBlockingThreadPool, RunQueue, EventCount, ForkJoin, CoreThreadPoolDevice). Originally developed for TensorFlow; now part of Core. It is the backend behind EIGEN_GEMM_THREADPOOL and the ThreadPoolDevice used by the Tensor module.
  • unsupported/Eigen/: modules with looser API-stability guarantees, but not low-traffic. The headliner is Tensor (umbrella unsupported/Eigen/Tensor, sources under unsupported/Eigen/src/Tensor/), which is TensorFlow's core compute backend; treat as load-bearing (see agent guideline 13). Other modules: TensorSymmetry, AutoDiff, Polynomials, MatrixFunctions, NNLS, FFT, GPU (cuBLAS / cuSOLVER dispatch), Splines, NumericalDiff, etc. Tests go under unsupported/test/. Anything under unsupported/Eigen/CXX11/ is backward-compatibility forwarding shims only; don't include those paths in new code or add new headers there.
  • test/ — main test sources and test/main.h (the test framework: VERIFY_*, CALL_SUBTEST_N, EIGEN_TEST_PART_N, EIGEN_DECLARE_TEST).
  • failtest/ — compile-failure tests.
  • blas/, lapack/ — Eigen's BLAS/LAPACK shim libraries (eigen_blas, eigen_lapack), built only when EIGEN_BUILD_BLAS / EIGEN_BUILD_LAPACK are on. These get linked by sparse-solver tests (CHOLMOD, UMFPACK, KLU, SuperLU, …) when those packages are present.
  • cmake/EigenTesting.cmake, EigenConfigureTesting.cmake define the ei_add_test / ei_add_failtest machinery. Also the Find*.cmake modules for optional backends.
  • ci/ — GitLab CI (*.gitlab-ci.yml per stage) and shell drivers under ci/scripts/.

Module                    Header                          Contents
Core                      Eigen/Core                      Matrix, Array, basic linear algebra, Map, Block, Ref
LU                        Eigen/LU                        FullPivLU, PartialPivLU, inverse, determinant
Cholesky                  Eigen/Cholesky                  LLT, LDLT
QR                        Eigen/QR                        HouseholderQR, ColPivHouseholderQR, FullPivHouseholderQR
SVD                       Eigen/SVD                       JacobiSVD, BDCSVD
Eigenvalues               Eigen/Eigenvalues               SelfAdjointEigenSolver, EigenSolver, ComplexEigenSolver
Geometry                  Eigen/Geometry                  Quaternion, AngleAxis, Transform, Hyperplane, cross()
Sparse                    Eigen/Sparse                    SparseMatrix, sparse solvers (SparseLU, SparseQR, SimplicialCholesky)
IterativeLinearSolvers    Eigen/IterativeLinearSolvers    ConjugateGradient, BiCGSTAB, LeastSquaresConjugateGradient (additional iterative solvers GMRES, DGMRES, IDRS, and BiCGSTABL live in unsupported/Eigen/IterativeSolvers)

External backend umbrella headers (Eigen/<Pkg>Support): AccelerateSupport, CholmodSupport, KLUSupport, MetisSupport, PaStiXSupport, PardisoSupport, SPQRSupport, SuperLUSupport, UmfPackSupport. Intel MKL and AMD AOCL are not umbrella headers — activate them by defining EIGEN_USE_MKL_ALL / EIGEN_USE_AOCL_ALL (and friends like EIGEN_USE_BLAS, EIGEN_USE_LAPACKE, EIGEN_USE_AOCL_VML, EIGEN_USE_AOCL_BLAS) before including Core or Dense; glue in Eigen/src/Core/util/MKL_support.h and AOCL_Support.h. Convenience headers: Eigen/Dense = Core + all dense solvers; Eigen/Eigen = everything supported.

Architecture

Expression templates and lazy evaluation

Eigen's central design pattern is expression templates with lazy evaluation. Arithmetic operations do not compute immediately — they return lightweight expression objects that store references to operands and encode the operation. Evaluation happens only on assignment:

// v + w returns CwiseBinaryOp<scalar_sum_op, VectorXf, VectorXf>
// No computation until assigned — then fused into a single vectorized loop
VectorXf u = v + w;

This eliminates temporaries and lets the compiler fuse operations into single-pass, SIMD-vectorized loops. Key expression types:

  • CwiseUnaryOp, CwiseBinaryOp, CwiseTernaryOp — element-wise operations
  • Product — matrix product (special eager rules, see below)
  • CwiseNullaryOp — procedural matrices (Zero, Identity, Random, custom functors)
  • Block, Transpose, Diagonal, Reshaped, IndexedView, Map, Ref — view-style expressions
  • Inverse, Solve — solver-result expressions

When Eigen evaluates eagerly into temporaries:

  1. Matrix products on assignment: mat = mat * mat auto-creates a temporary to prevent aliasing. Use .noalias() to suppress when safe: m1.noalias() += m2 * m3;
  2. Nested products — in mat1 = mat2 * mat3 + mat4 * mat5, each product evaluates to a temporary before combining.
  3. Cost-based — sub-expressions are cached when recomputation would be more expensive than storage (e.g. mat1 = mat2 * (mat3 + mat4) evaluates the sum once).

Use .eval() to force evaluation; .noalias() to override automatic temporary creation.
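The lazy-evaluation mechanism described in this section can be illustrated without Eigen itself. The following is a minimal sketch under stated assumptions: Vec and SumExpr are hypothetical miniature types invented for this example, and Eigen's real CwiseBinaryOp and evaluator machinery is far more general.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expression node for lhs + rhs: it stores references to its operands and
// computes nothing until a coefficient is requested.
template <typename Lhs, typename Rhs>
struct SumExpr {
  const Lhs& lhs;
  const Rhs& rhs;
  double coeff(std::size_t i) const { return lhs.coeff(i) + rhs.coeff(i); }
  std::size_t size() const { return lhs.size(); }
};

struct Vec {
  std::vector<double> data;
  explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}
  // Constructing from an expression is where evaluation happens: a single
  // pass over the coefficients with no intermediate vector.
  template <typename Lhs, typename Rhs>
  Vec(const SumExpr<Lhs, Rhs>& e) : data(e.size()) {
    for (std::size_t i = 0; i < data.size(); ++i) data[i] = e.coeff(i);
  }
  double coeff(std::size_t i) const { return data[i]; }
  std::size_t size() const { return data.size(); }
};

// operator+ only builds a node; it is restricted to the sketch's own types.
inline SumExpr<Vec, Vec> operator+(const Vec& a, const Vec& b) {
  assert(a.size() == b.size());
  return {a, b};
}
template <typename L, typename R>
SumExpr<SumExpr<L, R>, Vec> operator+(const SumExpr<L, R>& a, const Vec& b) {
  assert(a.size() == b.size());
  return {a, b};
}
```

Writing Vec u = a + b + c; evaluates the whole chain in one loop. Writing auto e = a + b + c; instead would keep a reference to the inner a + b node after that temporary has been destroyed, which is the hazard the auto guideline warns about.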

Evaluator system

The expression-template engine is implemented through evaluator<> traits (in Eigen/src/Core/CoreEvaluators.h and Eigen/src/Core/ProductEvaluators.h). Assignment goes through Eigen/src/Core/AssignEvaluator.h and Assign.h. New expression types must specialize evaluator<> (and often assign_op / nested_eval). Operations are lazy by default — work happens at assignment time inside AssignEvaluator, which picks between scalar / vectorized / linear / inner / outer traversal strategies based on Flags.

Class hierarchy (CRTP)

Eigen avoids virtual functions. Polymorphism is compile-time via CRTP — each class inherits from a base templated on itself (e.g. Matrix inherits MatrixBase<Matrix>).

EigenBase                          - root for anything evaluable to a matrix
├── DenseCoeffsBase                - coefficient accessors
│   └── DenseBase                  - shared dense ops (block, reshape, visitors)
│       ├── MatrixBase             - linear algebra ops (all dense matrix/vector expressions)
│       │   └── PlainObjectBase    - manages storage and resizing (base is MatrixBase for Matrix)
│       │       └── Matrix         - concrete dense matrix (linear algebra semantics)
│       └── ArrayBase              - coefficient-wise ops (all array expressions)
│           └── PlainObjectBase    - same class template; base is ArrayBase for Array
│               └── Array          - concrete dense array (coefficient-wise semantics)
└── SparseMatrixBase               - sparse matrix expressions
    └── SparseCompressedBase       - compressed sparse storage (CSC/CSR)
        └── SparseMatrix, SparseVector

Every expression type (Block, Transpose, Map, CwiseBinaryOp, Product, …) inherits from MatrixBase<Derived> or ArrayBase<Derived> without owning storage. PlainObjectBase, through DenseStorage, manages the actual memory for Matrix and Array.

Matrix vs Array: Matrix types live in MatrixBase-world (linear algebra semantics: * is matrix multiply). Array types live in ArrayBase-world (coefficient-wise semantics: * is element-wise multiply). Convert with .array() and .matrix().
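The CRTP arrangement described in this section can be sketched with a miniature hierarchy. All names below (DenseSketchBase, VecSketch, SegmentSketch) are hypothetical and only loosely mirror the shape of MatrixBase<Derived>, an owning type, and a view; this is not Eigen's code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

template <typename Derived>
struct DenseSketchBase {
  const Derived& derived() const { return static_cast<const Derived&>(*this); }
  // Written once in the base and resolved at compile time against each
  // Derived: no virtual dispatch, and the calls below can be inlined.
  double sum() const {
    double s = 0.0;
    for (std::size_t i = 0; i < derived().size(); ++i) s += derived().coeff(i);
    return s;
  }
};

// A type that owns its storage.
struct VecSketch : DenseSketchBase<VecSketch> {
  std::vector<double> data;
  explicit VecSketch(std::vector<double> d) : data(std::move(d)) {}
  std::size_t size() const { return data.size(); }
  double coeff(std::size_t i) const { return data[i]; }
};

// A view that owns nothing but still inherits sum() from the same base.
struct SegmentSketch : DenseSketchBase<SegmentSketch> {
  const VecSketch& v;
  std::size_t start, len;
  SegmentSketch(const VecSketch& x, std::size_t s, std::size_t n) : v(x), start(s), len(n) {
    assert(s + n <= x.size());
  }
  std::size_t size() const { return len; }
  double coeff(std::size_t i) const { return v.coeff(start + i); }
};
```

Both types get sum() from the shared base, yet each call site is bound statically to the concrete coeff() of the type it was invoked on.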

Writing functions that accept Eigen types

Because each expression has a unique type, functions should accept base-class references to avoid forcing evaluation:

// Good: accepts any dense matrix expression, no temporaries
template <typename Derived>
void foo(const Eigen::MatrixBase<Derived>& x);

// Also good: non-templated, uses Ref to avoid copies when layouts match
void bar(const Eigen::Ref<const Eigen::MatrixXf>& x);

// Hierarchy of genericity:
// EigenBase > DenseBase > MatrixBase/ArrayBase > concrete types

For writable parameters, take const MatrixBase<Derived>& and const_cast internally — that is the standard Eigen pattern. The reason is that callers commonly want to pass expression arguments like m.row(i) or m.block(...) as out-params; those are temporaries whose const-ness is a language artifact, not a semantic restriction. The const-ref-plus-const_cast idiom lets you accept them without forcing the user to materialize a named lvalue.
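The const-reference-plus-const_cast idiom for out-parameters can be shown in a self-contained form. This is a sketch under stated assumptions: ExprSketchBase, MatSketch, RowSketch, and set_constant are hypothetical stand-ins for MatrixBase<Derived>, a matrix, the temporary returned by m.row(i), and a user function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename Derived>
struct ExprSketchBase {
  const Derived& derived() const { return static_cast<const Derived&>(*this); }
};

struct MatSketch;

// Lightweight writable view of one row; row() returns it as a temporary.
struct RowSketch : ExprSketchBase<RowSketch> {
  MatSketch& m;
  std::size_t r;
  RowSketch(MatSketch& mat, std::size_t row) : m(mat), r(row) {}
  std::size_t size() const;
  double& coeffRef(std::size_t j);
};

struct MatSketch : ExprSketchBase<MatSketch> {
  std::size_t rows, cols;
  std::vector<double> data;  // row-major storage
  MatSketch(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
  double& at(std::size_t i, std::size_t j) {
    assert(i < rows && j < cols);
    return data[i * cols + j];
  }
  RowSketch row(std::size_t i) { return RowSketch(*this, i); }
};

inline std::size_t RowSketch::size() const { return m.cols; }
inline double& RowSketch::coeffRef(std::size_t j) { return m.at(r, j); }

// The out-parameter is taken by const reference so a temporary view can
// bind to it; the const is then cast away to write through the view. The
// temporary is not a const object, so the cast is well-defined.
template <typename Derived>
void set_constant(const ExprSketchBase<Derived>& out_, double value) {
  Derived& out = const_cast<Derived&>(out_.derived());
  for (std::size_t j = 0; j < out.size(); ++j) out.coeffRef(j) = value;
}
```

A call such as set_constant(m.row(1), 5.0) compiles because the temporary binds to the const reference; with a non-const reference parameter the same call would be rejected.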

SIMD / packet math layer (src/Core/arch/)

Vectorization is abstracted through a “packet” layer. Each scalar type maps to a platform-specific SIMD vector type via internal::packet_traits<Scalar>::type. Architecture backends provide specializations of packet operations (padd, pmul, pload, pstore, pblend, …):

  • GenericPacketMath.h — generic scalar fallback API
  • arch/Default/ — shared helpers (GenericPacketMathFunctions.h, BFloat16.h, Half.h)
  • x86: arch/SSE/, arch/AVX/, arch/AVX512/
  • ARM: arch/NEON/, arch/SVE/
  • RISC-V: arch/RVV10/ (scalable vector, multiple LMUL)
  • PowerPC: arch/AltiVec/ (includes MMA support)
  • IBM Z: arch/ZVector/
  • Other: arch/MSA/ (MIPS), arch/LSX/ (Loongson), arch/HVX/ (Qualcomm Hexagon)
  • GPU: arch/GPU/ (CUDA), arch/HIP/, arch/SYCL/
  • arch/clang/ — generic clang vector-extension backend

Packets are selected at compile time; the assignment loop splits into an aligned vectorized path plus a scalar remainder. New packet-math intrinsics get added in every backend that supports the type. arch/Default/ holds generic SIMD implementations shared across backends; scalar fallbacks live in Eigen/src/Core/GenericPacketMath.h and Eigen/src/Core/MathFunctions.h. test/packetmath.cpp (and unsupported/test/special_packetmath.cpp) exercises them across all enabled backends — failures there often indicate a missing or divergent specialization.
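The split into a vectorized path plus a scalar remainder can be sketched with an emulated packet type. Everything here is hypothetical: Packet4Sketch stands in for a hardware register type, and the *_sketch functions only echo the names of the packet primitives; their signatures differ from Eigen's internal API, and alignment handling is left out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 4-lane packet emulated with a plain array.
struct Packet4Sketch {
  float lane[4];
};
constexpr std::size_t kPacketSize = 4;

inline Packet4Sketch pload_sketch(const float* from) {
  Packet4Sketch p;
  for (std::size_t k = 0; k < kPacketSize; ++k) p.lane[k] = from[k];
  return p;
}
inline Packet4Sketch padd_sketch(const Packet4Sketch& a, const Packet4Sketch& b) {
  Packet4Sketch r;
  for (std::size_t k = 0; k < kPacketSize; ++k) r.lane[k] = a.lane[k] + b.lane[k];
  return r;
}
inline void pstore_sketch(float* to, const Packet4Sketch& p) {
  for (std::size_t k = 0; k < kPacketSize; ++k) to[k] = p.lane[k];
}

// out[i] = a[i] + b[i]: whole packets first, then a scalar tail for the
// elements that do not fill a packet.
inline void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
  assert(n == 0 || (a && b && out));
  const std::size_t vec_end = n - n % kPacketSize;
  std::size_t i = 0;
  for (; i < vec_end; i += kPacketSize)
    pstore_sketch(out + i, padd_sketch(pload_sketch(a + i), pload_sketch(b + i)));
  for (; i < n; ++i) out[i] = a[i] + b[i];
}
```

With n = 6, the first four elements go through the packet path and the last two through the scalar tail; a backend that lacks a primitive would fall back to the generic scalar version in the same way.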

Guard intrinsics by ISA feature macro. Inside a backend directory, an intrinsic is only available when its ISA is enabled — arch/AVX/ is compiled when EIGEN_VECTORIZE_AVX is set, but AVX2 / FMA / AVX512* intrinsics within those files must each be guarded by their own EIGEN_VECTORIZE_* macro (#ifdef EIGEN_VECTORIZE_AVX2, EIGEN_VECTORIZE_FMA, EIGEN_VECTORIZE_AVX512DQ, etc.), with a fallback for the un-guarded path. Full list in Eigen/src/Core/util/ConfigureVectorization.h. Same discipline applies elsewhere (EIGEN_VECTORIZE_NEON_FP16, EIGEN_VECTORIZE_VSX, …). Missing guards typically compile fine locally and break CI on narrower ISA targets.
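The guard-plus-fallback shape described above might look like the following sketch. EIGEN_VECTORIZE_FMA is the macro named in the text; madd4_sketch is a hypothetical helper written for this illustration and is not Eigen's pmadd. Compiled on its own, without Eigen's configuration headers, the macro is undefined and only the fallback branch is built.

```cpp
#include <cassert>

#ifdef EIGEN_VECTORIZE_FMA
#include <immintrin.h>
#endif

// out[k] = a[k] * b[k] + c[k] for four floats.
inline void madd4_sketch(const float* a, const float* b, const float* c, float* out) {
  assert(a && b && c && out);
#ifdef EIGEN_VECTORIZE_FMA
  // Fused multiply-add: only reachable when the FMA ISA is enabled.
  const __m128 r = _mm_fmadd_ps(_mm_loadu_ps(a), _mm_loadu_ps(b), _mm_loadu_ps(c));
  _mm_storeu_ps(out, r);
#else
  // Fallback for targets without FMA: separate multiply and add.
  for (int k = 0; k < 4; ++k) out[k] = a[k] * b[k] + c[k];
#endif
}
```

Leaving out the #else branch is the failure mode the paragraph describes: the code builds on a machine where the macro is set and breaks on a narrower ISA target.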

CUDA / HIP / SYCL

Eigen has two independent GPU stories, and conflating them causes confusion:

  1. In-kernel use of Eigen types. When Eigen headers are included from .cu / HIP / SYCL files, most functions are automatically annotated __device__ __host__ via EIGEN_DEVICE_FUNC (unified under EIGEN_GPUCC for both CUDA and HIP). Only fixed-size types work in kernels. Host SIMD is disabled in .cu files — move expensive host-side Eigen code to .cpp. Define EIGEN_NO_CUDA or EIGEN_NO_HIP to suppress device annotations for the respective backend. On 64-bit systems, set EIGEN_DEFAULT_DENSE_INDEX_TYPE to int for device compatibility.
  2. Host-side dispatch to NVIDIA libraries (unsupported/Eigen/GPU). Plain .cpp files orchestrating cuBLAS / cuSOLVER / cuFFT / cuSPARSE / cuDSS calls on device-resident gpu::DeviceMatrix<Scalar>. Public API in Eigen::gpu, internals in Eigen::gpu::internal; solvers (gpu::LLT, gpu::LU, gpu::QR, gpu::SVD, gpu::SelfAdjointEigenSolver) wrap cuSOLVER. Not an expression-template system — every supported expression maps to a single library call, and DeviceMatrix does not inherit from MatrixBase. Tests compile as .cpp (not .cu) so NVCC doesn't instantiate Eigen CPU packet ops for CUDA vector types. See unsupported/Eigen/src/GPU/README.md.

Tensor GPU kernels live under unsupported/Eigen/src/Tensor/ (TensorReductionGpu.h, TensorContractionGpu.h, TensorDeviceGpu.h).

Multi-threading

Eigen parallelizes general dense matrix-matrix products, PartialPivLU, row-major sparse-dense products, and some iterative solvers (CG, BiCGSTAB, LeastSquaresCG) via OpenMP or the EIGEN_GEMM_THREADPOOL backend (mutually exclusive with OpenMP). Enable OpenMP with -fopenmp (GCC) or equivalent. Control threads with Eigen::setNbThreads(n) or OMP_NUM_THREADS. Limit to physical cores — hyperthreading hurts Eigen's cache-bound kernels. Eigen::initParallel() is deprecated and no longer needed.

The EIGEN_GEMM_THREADPOOL backend is Eigen's own work-stealing thread pool: umbrella Eigen/ThreadPool, implementation under Eigen/src/ThreadPool/. The headline class is NonBlockingThreadPool (a work-stealing pool over RunQueue per-thread deques, with EventCount for parking); CoreThreadPoolDevice wires it into Eigen's parallel-for loops, and ThreadPoolInterface is the abstract base. It was originally developed for TensorFlow and is used by the Tensor module via ThreadPoolDevice, so changes are subject to the same caution as Tensor itself.

Key preprocessor macros

Performance / behavior (define before including Eigen):

  • EIGEN_DONT_VECTORIZE — disable explicit SIMD vectorization
  • EIGEN_DONT_PARALLELIZE — disable multi-threading
  • EIGEN_FAST_MATH — exists, default 1 (current usage across the codebase is uneven; a future cleanup will either standardize or remove it)
  • EIGEN_NO_MALLOC / EIGEN_RUNTIME_NO_MALLOC — assert on heap allocation
  • EIGEN_UNROLLING_LIMIT — loop unrolling threshold (default: 110)
  • EIGEN_STACK_ALLOCATION_LIMIT — max stack buffer size (default: 128 KB)
  • EIGEN_DEFAULT_TO_ROW_MAJOR — change default storage from column-major to row-major
  • EIGEN_MAX_ALIGN_BYTES — alignment for dynamic/static data (auto-detected: 64 for AVX-512, 32 for AVX, 16 default)
  • EIGEN_USE_BLAS / EIGEN_USE_LAPACKE — delegate to external BLAS/LAPACK
  • EIGEN_USE_MKL_ALL / EIGEN_USE_AOCL_ALL — delegate broadly to MKL / AOCL

Debugging:

  • EIGEN_NO_DEBUG — disable runtime assertions (auto-set when NDEBUG is defined)
  • EIGEN_INITIALIZE_MATRICES_BY_NAN — initialize all matrices to NaN
  • EIGEN_INITIALIZE_MATRICES_BY_ZERO — initialize all matrices to zero
  • EIGEN_INTERNAL_DEBUGGING — enable assertions in internal routines

Extending Eigen:

  • EIGEN_MATRIXBASE_PLUGIN, EIGEN_MATRIX_PLUGIN, EIGEN_ARRAYBASE_PLUGIN, … — path to a header file #included inside the class body, adding custom methods to all expressions of that base.

Compiler annotations (used in Eigen source code):

  • EIGEN_STRONG_INLINE — force inline (__forceinline on MSVC)
  • EIGEN_ALWAYS_INLINE — stronger than STRONG_INLINE
  • EIGEN_DEVICE_FUNC — marks functions callable from CUDA/HIP/SYCL device code
  • EIGEN_DONT_INLINE — prevent inlining
  • EIGEN_DEPRECATED — deprecation marker

Conventions worth knowing

(Header-only contract and EIGEN_DEVICE_FUNC: see agent guidelines 2 and 3.)

  • Aliasing: aliasing safety is not a uniform invariant inside Eigen. The standard product-evaluation path inserts an auto-temporary so mat = mat * mat is safe; many other paths rely on the user being correct (.noalias() is a promise from the caller, not a check). Optimized fast paths that bypass the general assignment machinery have historically introduced aliasing bugs — when writing or modifying one, think explicitly about whether the LHS can alias the RHS, and prefer falling back to the general path when in doubt. User-facing rules are in “Common pitfalls” below.
  • Storage order: most expressions are templated on int Options carrying RowMajor / ColMajor. When writing new evaluators, propagate Flags & RowMajorBit correctly — it is the source of many subtle bugs.
  • Eigen::internal namespace: all internal implementation lives there. Public-facing types and functions stay in Eigen:: (or specific module namespaces).
  • Default storage order: column-major.
  • Index type: Eigen::Index (alias for std::ptrdiff_t).
  • Naming: classes PascalCase; methods camelCase; macros / constants EIGEN_UPPER_CASE.
  • Forward declarations: Eigen/src/Core/util/ForwardDeclarations.h is the canonical entry point for “where is type X declared?”. internal::traits<T> carries compile-time information (scalar type, dimensions, flags) without forward-declaration issues.
  • Assertions: for runtime preconditions in library code, use eigen_assert(cond) (defined in Eigen/src/Core/util/Macros.h) rather than raw assert(). eigen_assert honors EIGEN_NO_DEBUG / NDEBUG, and the test harness redefines it so VERIFY_RAISES_ASSERT(expr) can verify failures. For internal-only invariants, eigen_internal_assert(cond) is gated on EIGEN_INTERNAL_DEBUGGING. For compile-time conditions, prefer EIGEN_STATIC_ASSERT(cond, MSG_TOKEN) over plain static_assert because it integrates with Eigen's diagnostic-token machinery.

Common pitfalls

  • Aliasing: mat = mat * mat is safe (auto-temporary), but mat.noalias() = mat * mat is wrong. Only use .noalias() when the destination doesn't appear on the right side.
  • auto with expressions: auto x = A + B; captures a lazy expression holding references — the references may dangle. Use auto x = (A + B).eval(); or an explicit type.
  • Missing headers: some methods require additional includes (e.g. cross() needs Eigen/Geometry).
  • Ternary operator: cond ? exprA : exprB can fail with expression types because the two branches have different types. Use if/else.
  • template keyword: in dependent contexts, write x.template triangularView<Upper>() — without template, < is parsed as less-than.
  • Pass-by-value alignment: pre-C++17, passing fixed-size vectorizable Eigen objects by value can crash due to alignment. Pass by const reference. With C++17 and modern compilers (GCC ≥ 7, Clang ≥ 5, MSVC ≥ 19.12), over-aligned allocation handles this automatically.
  • Random() not thread-safe: DenseBase::Random() and setRandom() use std::rand internally and are not re-entrant. Use C++11 <random> generators via NullaryExpr for multi-threaded code.

CI (GitLab)

Pipeline stages: checkformat → build → test → benchmark → deploy. Configuration in .gitlab-ci.yml and ci/*.gitlab-ci.yml; shell drivers under ci/scripts/.

build jobs produce a .build/ artifact (test binaries) consumed by the matching test job; the test job only runs ctest and does not rebuild. A test job that runs ctest without restricting via -L or -R will report "Could not find executable" for everything outside the build job's target. Test-job and build-job names must stay paired (see needs: in ci/test.linux.gitlab-ci.yml).

Test jobs filter via EIGEN_CI_CTEST_LABEL (consumed in ci/scripts/test.linux.script.sh as ctest -L $LABEL). The :official and :unsupported job-name suffixes are convention only; actual filtering is through that variable. The top-level .gitlab-ci.yml variables: block declares these with empty/global defaults (EIGEN_CI_BUILDDIR=.build, EIGEN_CI_BUILD_TARGET="", EIGEN_CI_CTEST_LABEL=""); the meaningful values are set per-job in ci/*.gitlab-ci.yml (e.g. EIGEN_CI_BUILD_TARGET set to one of buildtests, BuildOfficial, or buildtests_gpu in ci/build.linux.gitlab-ci.yml, and EIGEN_CI_CTEST_LABEL=Official in the matching test jobs).

MR pipelines build / run only a smoke subset; scheduled (nightly) pipelines exercise the full matrix. Format failures (scripts/format.sh diff) are the single most common reason an MR is red — run it before pushing.

Commit message convention

Category: Short description (e.g. GPU: Fix special-function test coverage, TriangularView: alias-aware fallback for structured-diagonal product fast path).