std::hardware_destructive_interference_size, std::hardware_constructive_interference_size

From cppreference.com

Defined in header `<new>`
inline constexpr std::size_t hardware_destructive_interference_size = /*implementation-defined*/;	(1)	(since C++17)
inline constexpr std::size_t hardware_constructive_interference_size = /*implementation-defined*/;	(2)	(since C++17)

1) Minimum offset between two objects to avoid false sharing. Guaranteed to be at least alignof(std::max_align_t).

struct keep_apart {
  alignas(std::hardware_destructive_interference_size) std::atomic<int> cat;
  alignas(std::hardware_destructive_interference_size) std::atomic<int> dog;
};
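The same idea extends from two members to an array of per-slot counters. The following is a minimal sketch, not part of the reference text: `destructive_size`, `PaddedCounter`, `CounterArray`, `bump`, and `neighbour_distance` are hypothetical names, and the 64-byte fallback is an assumption used only when the library does not define the constant.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Use the library constant when available; otherwise assume 64 bytes.
#ifdef __cpp_lib_hardware_interference_size
inline constexpr std::size_t destructive_size =
    std::hardware_destructive_interference_size;
#else
inline constexpr std::size_t destructive_size = 64; // assumed fallback
#endif

// Hypothetical wrapper: alignas on the type pads each counter so that
// neighbouring array elements start at least destructive_size bytes apart.
struct alignas(destructive_size) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

static_assert(alignof(PaddedCounter) == destructive_size);
static_assert(sizeof(PaddedCounter) % destructive_size == 0);

using CounterArray = std::array<PaddedCounter, 4>;

// Increments one slot; each thread would normally own exactly one slot.
inline void bump(CounterArray& counters, std::size_t slot) {
    assert(slot < counters.size());
    counters[slot].value.fetch_add(1, std::memory_order_relaxed);
}

// Byte distance between the first two elements of the array.
inline std::size_t neighbour_distance(const CounterArray& counters) {
    const auto first = reinterpret_cast<std::uintptr_t>(&counters[0]);
    const auto second = reinterpret_cast<std::uintptr_t>(&counters[1]);
    return static_cast<std::size_t>(second - first);
}
```

Applying `alignas` to the type rather than to each member means the padding follows the element wherever it is stored, at the cost of one full interference-sized block per counter.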

2) Maximum size of contiguous memory to promote true sharing. Guaranteed to be at least alignof(std::max_align_t).

struct together {
  std::atomic<int> dog;
  int puppy;
};
struct kennel {
  // Other data members...
  alignas(sizeof(together)) together pack;
  // Other data members...
};
static_assert(sizeof(together) <= std::hardware_constructive_interference_size);
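Aligning `pack` to its own size keeps it from straddling a line boundary: an object of size S placed at a multiple of S stays inside one block of size L whenever S divides L. The sketch below checks this at run time. It assumes `sizeof(together)` and the line size are powers of two; `constructive_size`, `shares_one_line`, `check_kennel`, and the extra `kennel` members are hypothetical names added for illustration, and the 64-byte fallback is an assumption.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Use the library constant when available; otherwise assume 64 bytes.
#ifdef __cpp_lib_hardware_interference_size
inline constexpr std::size_t constructive_size =
    std::hardware_constructive_interference_size;
#else
inline constexpr std::size_t constructive_size = 64; // assumed fallback
#endif

struct together {
    std::atomic<int> dog;
    int puppy;
};
static_assert(sizeof(together) <= constructive_size);

struct kennel {
    char name[3];                            // hypothetical other member
    alignas(sizeof(together)) together pack; // aligned to its own size
    long tag;                                // hypothetical other member
};

// True when the first and last byte of t fall in the same
// constructive_size-sized block of the address space.
inline bool shares_one_line(const together& t) {
    const auto begin = reinterpret_cast<std::uintptr_t>(&t);
    const auto end = begin + sizeof(together) - 1;
    return begin / constructive_size == end / constructive_size;
}

// Asserts the layout property for one kennel object.
inline void check_kennel(const kennel& k) {
    assert(reinterpret_cast<std::uintptr_t>(&k.pack) % sizeof(together) == 0);
    assert(shares_one_line(k.pack));
}
```

Calling `check_kennel` on any `kennel` object, whether on the stack or the heap, exercises both assertions.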

Notes

These constants provide a portable way to access the L1 data cache line size.
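Because both constants are `constexpr`, they can be used in constant expressions such as `alignas`, `static_assert`, and compile-time arithmetic. The following sketch computes how many line-sized blocks a buffer of a given size needs. `line_size` and `lines_spanned` are hypothetical names, and the 64-byte fallback is an assumption for libraries that do not define the constant.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Use the library constant when available; otherwise assume 64 bytes.
#ifdef __cpp_lib_hardware_interference_size
inline constexpr std::size_t line_size =
    std::hardware_destructive_interference_size;
#else
inline constexpr std::size_t line_size = 64; // assumed fallback
#endif

// Number of line_size-sized blocks needed to hold `bytes` bytes,
// assuming the buffer starts on a block boundary (rounds up).
constexpr std::size_t lines_spanned(std::size_t bytes) {
    return (bytes + line_size - 1) / line_size;
}

// Evaluated entirely at compile time.
static_assert(lines_spanned(0) == 0);
static_assert(lines_spanned(1) == 1);
static_assert(lines_spanned(line_size) == 1);
static_assert(lines_spanned(line_size + 1) == 2);
```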

Example

The program uses two threads that (atomically) write to the data members of the given global objects. The first object fits in one cache line, which results in "hardware interference". The second object keeps its data members on separate cache lines, so possible "cache synchronization" after thread writes is avoided.


#include <atomic>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <mutex>
#include <new>
#include <thread>
 
#ifdef __cpp_lib_hardware_interference_size
    using std::hardware_constructive_interference_size;
    using std::hardware_destructive_interference_size;
#else
    // 64 bytes on x86-64 │ L1_CACHE_BYTES │ L1_CACHE_SHIFT │ __cacheline_aligned │ ...
    constexpr std::size_t hardware_constructive_interference_size
        = 2 * sizeof(std::max_align_t);
    constexpr std::size_t hardware_destructive_interference_size
        = 2 * sizeof(std::max_align_t);
#endif
 
std::mutex cout_mutex;
 
constexpr int max_write_iterations{10'000'000}; // benchmark time tuning
 
struct alignas(hardware_constructive_interference_size)
OneCacheLiner { // occupies one cache line
    std::atomic_uint64_t x{};
    std::atomic_uint64_t y{};
} oneCacheLiner;
 
struct TwoCacheLiner { // occupies two cache lines
    alignas(hardware_destructive_interference_size) std::atomic_uint64_t x{};
    alignas(hardware_destructive_interference_size) std::atomic_uint64_t y{};
} twoCacheLiner;
 
inline auto now() noexcept { return std::chrono::high_resolution_clock::now(); }
 
template<bool xy>
void oneCacheLinerThread() {
    const auto start { now() };
 
    for (uint64_t count{}; count != max_write_iterations; ++count)
        if constexpr (xy)
             oneCacheLiner.x.fetch_add(1, std::memory_order_relaxed);
        else oneCacheLiner.y.fetch_add(1, std::memory_order_relaxed);
 
    const std::chrono::duration<double, std::milli> elapsed { now() - start };
    std::lock_guard lk{cout_mutex};
    std::cout << "oneCacheLinerThread() spent " << elapsed.count() << " ms\n";
    if constexpr (xy)
         oneCacheLiner.x = elapsed.count();
    else oneCacheLiner.y = elapsed.count();
}
 
template<bool xy>
void twoCacheLinerThread() {
    const auto start { now() };
 
    for (uint64_t count{}; count != max_write_iterations; ++count)
        if constexpr (xy)
             twoCacheLiner.x.fetch_add(1, std::memory_order_relaxed);
        else twoCacheLiner.y.fetch_add(1, std::memory_order_relaxed);
 
    const std::chrono::duration<double, std::milli> elapsed { now() - start };
    std::lock_guard lk{cout_mutex};
    std::cout << "twoCacheLinerThread() spent " << elapsed.count() << " ms\n";
    if constexpr (xy)
         twoCacheLiner.x = elapsed.count();
    else twoCacheLiner.y = elapsed.count();
}
 
int main() {
    std::cout << "__cpp_lib_hardware_interference_size "
#   ifdef __cpp_lib_hardware_interference_size
        " = " << __cpp_lib_hardware_interference_size << "\n";
#   else
        "is not defined\n";
#   endif
 
    std::cout
        << "hardware_destructive_interference_size == "
        << hardware_destructive_interference_size << '\n'
        << "hardware_constructive_interference_size == "
        << hardware_constructive_interference_size << '\n'
        << "sizeof( std::max_align_t ) == " << sizeof(std::max_align_t) << "\n\n";
 
    std::cout
        << std::fixed << std::setprecision(2)
        << "sizeof( OneCacheLiner ) == " << sizeof( OneCacheLiner ) << '\n'
        << "sizeof( TwoCacheLiner ) == " << sizeof( TwoCacheLiner ) << "\n\n";
 
    constexpr int max_runs{4};
 
    int oneCacheLiner_average{0};
    for (auto i{0}; i != max_runs; ++i) {
        std::thread th1{oneCacheLinerThread<0>};
        std::thread th2{oneCacheLinerThread<1>};
        th1.join(); th2.join();
        oneCacheLiner_average += oneCacheLiner.x + oneCacheLiner.y;
    }
    std::cout << "Average time: " << (oneCacheLiner_average / max_runs / 2) << " ms\n\n";
 
    int twoCacheLiner_average{0};
    for (auto i{0}; i != max_runs; ++i) {
        std::thread th1{twoCacheLinerThread<0>};
        std::thread th2{twoCacheLinerThread<1>};
        th1.join(); th2.join();
        twoCacheLiner_average += twoCacheLiner.x + twoCacheLiner.y;
    }
    std::cout << "Average time: " << (twoCacheLiner_average / max_runs / 2) << " ms\n\n";
}

Possible output:

__cpp_lib_hardware_interference_size is not defined
hardware_destructive_interference_size == 64
hardware_constructive_interference_size == 64
sizeof( std::max_align_t ) == 32
 
sizeof( OneCacheLiner ) == 64
sizeof( TwoCacheLiner ) == 128
 
oneCacheLinerThread() spent 275.23 ms
oneCacheLinerThread() spent 330.37 ms
oneCacheLinerThread() spent 320.65 ms
oneCacheLinerThread() spent 389.14 ms
oneCacheLinerThread() spent 388.48 ms
oneCacheLinerThread() spent 448.34 ms
oneCacheLinerThread() spent 420.10 ms
oneCacheLinerThread() spent 459.01 ms
Average time: 378 ms
 
twoCacheLinerThread() spent 123.79 ms
twoCacheLinerThread() spent 130.48 ms
twoCacheLinerThread() spent 119.03 ms
twoCacheLinerThread() spent 132.32 ms
twoCacheLinerThread() spent 116.26 ms
twoCacheLinerThread() spent 122.64 ms
twoCacheLinerThread() spent 116.42 ms
twoCacheLinerThread() spent 128.11 ms
Average time: 123 ms

See also

hardware_concurrency [static]	Returns the number of concurrent threads supported by the implementation. (public static member function of `std::thread`)

Retrieved from "https://es.cppreference.com/mwiki/index.php?title=cpp/thread/hardware_destructive_interference_size&oldid=38031"
