tesseract: The ultimate N-dimensional tensor library in C++, embedded systems optimized.
Tesseract is a versatile C++ library for handling N-dimensional tensors. The library is templated, allowing efficient static and dynamic tensor operations for embedded systems, scientific computing, deep learning, and other applications that manipulate high-dimensional data. The TensorND class is optimized for mathematical operations, supports tensor arithmetic and slicing, and provides an intuitive interface for various tensor transformations.
- N-dimensional Support: Supports tensors with any number of dimensions specified at compile-time.
- Arithmetic Operations: Element-wise addition, subtraction, multiplication, and division with scalars or other tensors.
- Index-based Access: Allows flexible access with variadic index operators and array-based indexing.
- Transpose and Shape Manipulation: Easily transpose tensors and retrieve shape information.
- Utility Functions: Generate identity tensors, set diagonal elements, initialize tensors to zero or random values.
- Einsum-style Tensor Contraction: Efficient contraction for performing complex tensor multiplications.
- Memory Efficiency: Supports copy and move constructors for efficient memory management.
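As a rough illustration of how compile-time dimensions and variadic index access can fit together, here is a minimal standalone sketch. The class name `StaticTensor`, its `at` and `offset` members, and the row-major layout are assumptions made for this example; they are not taken from TensorND's implementation.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for illustration only; this is NOT the library's TensorND.
template <typename T, std::size_t... Dims>
class StaticTensor {
public:
    static constexpr std::size_t rank = sizeof...(Dims);
    static constexpr std::size_t size = (Dims * ... * 1);  // C++17 fold expression
    static constexpr std::array<std::size_t, rank> shape{Dims...};

    // Row-major flattening (assumed layout): the last index varies fastest.
    static constexpr std::size_t offset(const std::array<std::size_t, rank>& idx) {
        std::size_t flat = 0;
        for (std::size_t d = 0; d < rank; ++d) {
            flat = flat * shape[d] + idx[d];
        }
        return flat;
    }

    // Variadic access: the number of indices is checked at compile time.
    template <typename... Idx>
    T& operator()(Idx... idx) {
        static_assert(sizeof...(Idx) == rank, "one index per dimension");
        return data_[offset(std::array<std::size_t, rank>{static_cast<std::size_t>(idx)...})];
    }

    // Array-based access to the same storage.
    T& at(const std::array<std::size_t, rank>& idx) { return data_[offset(idx)]; }

private:
    std::array<T, size> data_{};  // fixed-size storage, no heap allocation
};
```

Because the element count is known at compile time, storage of this kind needs no dynamic allocation, which is one reason such a design can suit embedded targets.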
To use TensorND, simply include the TensorND.h header in your project and ensure that your compiler supports C++17 or later.
#include "TensorND.h"
To create a tensor, specify its data type and dimensions. For example, here’s a 2D tensor of size 3x3:
TensorND<float, 3, 3> tensor;
You can initialize a tensor with a specific value, access elements with indices, and perform arithmetic operations.
// Initialize all elements to a specific value
TensorND<int, 2, 2> tensor(5); // 2x2 tensor with all elements set to 5
// Accessing and modifying elements
tensor(0, 1) = 10; // Set element at (0,1) to 10
// Adding a scalar to each element
auto tensor2 = tensor + 3; // Each element incremented by 3
// Element-wise tensor addition
TensorND<int, 2, 2> result = tensor + tensor2;
Here are a few examples of how to use TensorND to perform common tensor operations:
Addition, Subtraction, Multiplication, and Division:
TensorND<double, 3, 3> mat1, mat2;
mat1.setIdentity();
mat2.setIdentity();
auto addition = mat1 + mat2; // Tensor addition
auto subtraction = mat1 - mat2; // Tensor subtraction
auto multiplication = mat1 * mat2; // Tensor multiplication
auto scalarDivision = mat1 / 2.0; // Divide by a scalar
For a 2D tensor (matrix), you can transpose it easily:
TensorND<float, 2, 3> matrix;
// Initialize matrix values
matrix.transpose(true); // Only for 2D tensors, true for in-place transpose
// or
auto transposed = matrix.transpose(); // Transpose and return a new tensor (not in-place)
std::cout << "Shape: " << matrix.getShape(); // Get tensor shape
For a higher-dimensional tensor, you can permute the axes:
TensorND<float, 2, 3, 4> tensor;
// Initialize tensor values
tensor.transpose({1, 2, 0}, true); // Permute axes, true for in-place transpose
// or
auto permuted_tensor = tensor.transpose({1, 2, 0}); // Permute axes and return a new tensor (not in-place)
std::cout << "Shape: " << tensor.getShape(); // Get tensor shape
You can quickly initialize a tensor as an identity matrix if it’s 2D and square.
auto identity = TensorND<float, 3, 3>::I();
TensorND allows for element-wise operations between tensors of the same size:
TensorND<int, 2, 2> tensorA, tensorB;
// Fill tensorA and tensorB with values
TensorND<int, 2, 2> product = tensorA * tensorB;
Setting Values:
tensor.setToZero(); // Set all elements to 0
tensor.setIdentity(); // Set as identity
Random Initialization:
tensor.setRandom(10, -10); // Random values between -10 and 10
Diagonal Elements:
tensor.setDiagonal(1.0); // Set diagonal elements to 1.0
Print 2D, 3D, or 4D tensors:
tensor2D.print(); // Prints 2D tensor
tensor3D.print(); // Prints 3D tensor
tensor4D.print(); // Prints 4D tensor
Compare Tensors:
TensorND<double, 3, 3> mat1, mat2;
mat1.setIdentity();
mat2.setIdentity();
if (mat1 == mat2) {
std::cout << "Tensors are equal!" << std::endl;
}
// or
if (mat1 != mat2) {
std::cout << "Tensors are not equal!" << std::endl;
}
Assign One Tensor to Another:
mat2 = mat1; // Assign mat1 to mat2
Perform tensor contraction using the einsum function:
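As a rough, standalone illustration of what an einsum-style contraction computes, the sketch below sums over a shared index for the pattern "ij,jk->ik". The helper name `contract_ij_jk` and the use of nested `std::array` are assumptions made for this example; they do not reflect the signature of the library's einsum function.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper for illustration only; NOT the library's einsum.
// Computes C(i, k) = sum over j of A(i, j) * B(j, k), i.e. "ij,jk->ik".
template <typename T, std::size_t I, std::size_t J, std::size_t K>
std::array<std::array<T, K>, I> contract_ij_jk(
    const std::array<std::array<T, J>, I>& a,
    const std::array<std::array<T, K>, J>& b) {
    std::array<std::array<T, K>, I> c{};  // value-initialized, so all zeros
    for (std::size_t i = 0; i < I; ++i) {
        for (std::size_t j = 0; j < J; ++j) {      // j is the contracted index
            for (std::size_t k = 0; k < K; ++k) {
                c[i][k] += a[i][j] * b[j][k];
            }
        }
    }
    return c;
}
```

The contracted dimension `J` appears in both argument types, so mismatched shapes fail to compile instead of failing at run time.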
It is recommended to run the tests to ensure that the library is working correctly. To run the tests, simply run:
make -j 20 run_test
The following benchmarks compare the performance of FusedMatrix operations against the Eigen library for both double and float data types. The tests are executed in single-threaded mode to provide a fair comparison of the core computational efficiency of each library. AVX2 optimizations are enabled in both libraries to take advantage of SIMD.
Benchmarks - double
-------------------------------------------------------------------------------
benchmark name                        samples       iterations    est run time
                                      mean          low mean      high mean
                                      std dev       low std dev   high std dev
-------------------------------------------------------------------------------
FusedMatrix long operations           100           1             37.716 ms
                                      377.387 us    376.776 us    379.341 us
                                      5.03923 us    1.71825 us    11.2186 us
Eigen long operations                 100           1             39.5535 ms
                                      383.382 us    382.775 us    385.385 us
                                      5.04039 us    1.54613 us    11.2372 us
FusedMatrix matmul                    100           1             50.801 ms
                                      548.602 us    530.716 us    567.471 us
                                      93.448 us     84.2679 us    106.142 us
Eigen matmul                          100           1             3.4708 ms
                                      33.0388 us    32.5843 us    34.137 us
                                      3.37313 us    1.7275 us     6.41646 us
FusedMatrix inverse                   100           2             2.1964 ms
                                      9.22072 us    9.16082 us    9.3357 us
                                      406.015 ns    245.778 ns    636.794 ns
Eigen inverse                         100           24            1.92 ms
                                      804.018 ns    800.209 ns    811.26 ns
                                      25.7764 ns    15.4741 ns    39.7277 ns
FusedMatrix Cholesky Decomposition    100           5             1.8955 ms
                                      3.1683 us     3.15744 us    3.19468 us
                                      83.607 ns     39.4999 ns    144.428 ns
Eigen Cholesky Decomposition          100           6             2.0226 ms
                                      3.36218 us    3.35291 us    3.38386 us
                                      68.4079 ns    34.4429 ns    118.772 ns
-------------------------------------------------------------------------------
Benchmarks - float
-------------------------------------------------------------------------------
benchmark name                        samples       iterations    est run time
                                      mean          low mean      high mean
                                      std dev       low std dev   high std dev
-------------------------------------------------------------------------------
FusedMatrix long operations           100           1             18.9086 ms
                                      188.852 us    188.035 us    191.978 us
                                      7.39626 us    1.33233 us    17.3781 us
Eigen long operations                 100           1             20.5788 ms
                                      191.755 us    190.711 us    195.182 us
                                      8.54103 us    1.91308 us    18.7689 us
FusedMatrix matmul                    100           1             57.0981 ms
                                      572.09 us     554.786 us    593.386 us
                                      98.0738 us    82.1576 us    114.87 us
Eigen matmul                          100           2             3.5762 ms
                                      16.602 us     16.4049 us    16.9708 us
                                      1.33158 us    785.436 ns    1.98529 us
FusedMatrix inverse                   100           2             2.1618 ms
                                      9.05416 us    8.98452 us    9.17101 us
                                      452.552 ns    290.792 ns    646.619 ns
Eigen inverse                         100           25            1.8625 ms
                                      749.428 ns    745.386 ns    757.463 ns
                                      27.9641 ns    15.0259 ns    45.3381 ns
FusedMatrix Cholesky Decomposition    100           6             2.13 ms
                                      2.92102 us    2.91226 us    2.93821 us
                                      60.2316 ns    34.3328 ns    95.5562 ns
Eigen Cholesky Decomposition          100           11            1.9712 ms
                                      1.79068 us    1.78596 us    1.80027 us
                                      32.5313 ns    18.8424 ns    50.5642 ns