Improve GEBP kernel: unified sub-blocking, arch-independent loops, relaxed L1 budget on x86

libeigen/eigen!2293

Co-authored-by: Rasmus Munk Larsen <rmlarsen@gmail.com>
1 file changed