Memory Model & Atomics

Thread-safe programming with atomic operations and memory ordering

⚛️ What is the C++ Memory Model?

The C++ memory model defines how threads interact through memory, providing atomic operations and memory ordering guarantees for safe concurrent programming without data races.


#include <atomic>

std::atomic<int> counter{0};
counter.fetch_add(1);  // Thread-safe increment
                                    

Key Concepts

⚛️

Atomic Operations

Indivisible memory operations

std::atomic<int> x{0};
x.store(42);
🔄

Memory Ordering

Control operation sequencing

x.load(std::memory_order_acquire)
🚫

Data Race Prevention

Avoid undefined behavior

std::atomic<bool> flag{false};
🔒

Synchronization

Coordinate thread execution

std::atomic_thread_fence(
  std::memory_order_seq_cst);

🔹 Basic Atomic Operations

Atomic operations enable thread-safe updates to shared variables without locks. Types like std::atomic<int> guarantee that operations like load, store, increment, and compare-exchange are indivisible. This prevents data races and ensures consistent views across threads. Atomics are the building blocks for lock-free data structures and synchronization primitives. They are essential for implementing counters, flags, and shared state in concurrent programs where performance is critical and lock contention must be minimized.

#include <atomic>
#include <thread>
#include <iostream>
#include <vector>

std::atomic<int> counter{0};
std::atomic<bool> ready{false};

void worker(int id) {
    // Wait for signal
    while (!ready.load()) {
        std::this_thread::yield();
    }
    
    // Atomic increment
    for (int i = 0; i < 1000; ++i) {
        counter.fetch_add(1);
    }
    
    std::cout << "Worker " << id << " finished\n";
}

int main() {
    std::vector<std::thread> threads;
    
    // Start worker threads
    for (int i = 0; i < 4; ++i) {
        threads.emplace_back(worker, i);
    }
    
    // Signal all threads to start
    ready.store(true);
    
    // Wait for completion
    for (auto& t : threads) {
        t.join();
    }
    
    std::cout << "Final counter: " << counter.load() << std::endl;
    return 0;
}

Output:

Worker 0 finished
Worker 1 finished
Worker 2 finished
Worker 3 finished
Final counter: 4000

🔹 Memory Ordering Examples

Memory ordering specifies the visibility guarantees of atomic operations across threads. Choices like memory_order_relaxed (no ordering constraints), acquire/release (for synchronization), and seq_cst (sequentially consistent, the default) offer a trade-off between performance and strictness. Understanding these is vital for writing correct lock-free code. Incorrect memory ordering can lead to subtle bugs where operations appear out of order, violating algorithm invariants even in the absence of data races.

#include <atomic>
#include <thread>
#include <iostream>

std::atomic<int> data{0};
std::atomic<bool> flag{false};

void producer() {
    // Store data first
    data.store(42, std::memory_order_relaxed);
    
    // Then set flag with release semantics
    flag.store(true, std::memory_order_release);
}

void consumer() {
    // Wait for flag with acquire semantics
    while (!flag.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    
    // Data is guaranteed to be visible
    std::cout << "Data: " << data.load(std::memory_order_relaxed) 
              << std::endl;
}

int main() {
    std::thread prod(producer);
    std::thread cons(consumer);
    
    prod.join();
    cons.join();
    
    return 0;
}

Output:

Data: 42

🔹 Compare and Swap

Compare-and-swap (CAS) is a fundamental atomic operation for lock-free programming. It atomically checks if a variable's value equals an expected old value and, if so, updates it to a new value. This is the core of many non-blocking algorithms, enabling safe modifications without locks. CAS loops (retry on failure) implement operations like push/pop on lock-free stacks. Mastery of CAS is required for designing high-performance concurrent data structures where scalability under heavy contention is necessary.

#include <atomic>
#include <thread>
#include <iostream>
#include <vector>

class LockFreeStack {
private:
    struct Node {
        int data;
        Node* next;
        Node(int val) : data(val), next(nullptr) {}
    };
    
    std::atomic<Node*> head{nullptr};
    
public:
    void push(int value) {
        Node* newNode = new Node(value);
        Node* currentHead = head.load();
        
        do {
            newNode->next = currentHead;
        } while (!head.compare_exchange_weak(currentHead, newNode));
    }
    
    bool pop(int& result) {
        Node* currentHead = head.load();
        
        do {
            if (currentHead == nullptr) {
                return false;  // Stack is empty
            }
        } while (!head.compare_exchange_weak(currentHead, currentHead->next));
        
        result = currentHead->data;
        delete currentHead;
        return true;
    }
};

int main() {
    LockFreeStack stack;
    
    // Push some values
    stack.push(1);
    stack.push(2);
    stack.push(3);
    
    // Pop values
    int value;
    while (stack.pop(value)) {
        std::cout << "Popped: " << value << std::endl;
    }
    
    return 0;
}

Output:

Popped: 3
Popped: 2
Popped: 1

🧠 Test Your Knowledge

What does std::memory_order_acquire guarantee?