Lazy initialisation in C++ and Multi-threading

In the previous post about lazy initialisation, we showed examples and differences between using raw pointers, unique_ptr and std::optional to store the object and create it later. However, we implemented the samples from the perspective of single-threaded scenarios.

In this post, we’ll try to fill the gap and show you how to make your lazy objects available in a multithreading environment.

Multithreading and Lazy Initialisation

If your application creates several threads that might access such a “lazy” resource, you might run into trouble. How do you know that the resource is initialised only once? What if two threads try to invoke the init code at the same time?

To set the scene, the example below operates on a vector of Employees. Each employee contains a record that will be fetched from a database. We want a lazy call to the database, so each object initially knows only its ID and connects to the DB only when needed.

class Employee {
public:
    explicit Employee(size_t id, const CompanyDatabase& db) : _id(id), _db(&db) { }

    std::string Name() const { MakeSureWereReady(); return _rec->_name; }
    std::string Surname() const { MakeSureWereReady(); return _rec->_surname; }
    std::string City() const { MakeSureWereReady(); return _rec->_city; }
    TSalary Salary() const { MakeSureWereReady(); return _rec->_salary; }

    friend std::ostream& operator<<(std::ostream& os, const Employee& em) {...}
private:
    void MakeSureWereReady() const {
        if (!_rec)
            _rec = _db->FetchRecord(_id);
    }

    size_t _id{ CompanyDatabase::InvalidID };
    mutable std::optional<CompanyDatabase::EmployeeRecord> _rec;
    const CompanyDatabase* _db;
};

The class stores an observing pointer to a database, and in each getter, we make sure we have the data loaded before accessing it.

For reference, here’s the CompanyDatabase::EmployeeRecord structure that holds the data:

using TSalary = long; // could be replaced with a decimal or fixed-point type...

struct EmployeeRecord { 
    std::string _name; 
    std::string _surname; 
    std::string _city; 
    TSalary _salary{ 0 };
};

CompanyDatabase is just a simple class that contains some preallocated data in a vector.
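The full class lives in the linked sample, so the following is only a minimal sketch of what such a database stand-in could look like. The names in the records, the individual salaries, and the use of assert for the bounds check are assumptions made for illustration; only InvalidID, MaxEntries(), FetchRecord() and the "Fetching record:" log line come from the snippets and outputs in this post.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <limits>
#include <string>
#include <vector>

using TSalary = long;

// Hypothetical stand-in for CompanyDatabase; the real class may differ.
class CompanyDatabase {
public:
    struct EmployeeRecord {
        std::string _name;
        std::string _surname;
        std::string _city;
        TSalary _salary{ 0 };
    };

    static constexpr std::size_t InvalidID = std::numeric_limits<std::size_t>::max();

    // Made-up sample data: four preallocated records.
    CompanyDatabase()
        : _records{ { "John", "Doe", "Cracow", 120 },
                    { "Kate", "Doe", "Cracow", 80 },
                    { "Linda", "Doe", "Warsaw", 200 },
                    { "Marc", "Doe", "Warsaw", 40 } } { }

    std::size_t MaxEntries() const { return _records.size(); }

    // Simulates a costly query and logs every access,
    // so that duplicated fetches show up in the output.
    EmployeeRecord FetchRecord(std::size_t id) const {
        assert(id < _records.size());
        std::cout << "Fetching record: " << id << '\n';
        return _records[id];
    }

private:
    std::vector<EmployeeRecord> _records;
};
```

The object only reads its vector after construction, so several threads can call FetchRecord() on the same database; the interesting part is what the callers do with the result.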

See the full code here: @Wandbox

To illustrate that we might have issues with multithreading, let’s look at the following use case:

void SalaryTask(const std::vector<Employee>& workers) {
    auto SalaryOp = [](TSalary curr, const Employee& em) {
        return curr + em.Salary();
    };
    // TSalary{ 0 } keeps the accumulator in TSalary; a plain 0 would accumulate in int
    const auto sumSalary = std::accumulate(std::cbegin(workers), std::cend(workers), TSalary{ 0 }, SalaryOp);
    std::cout << "Sum salary: " << sumSalary << '\n';
}

void CityTask(const std::vector<Employee>& workers) {
    std::map<std::string, int> mapByCity;
    for (auto& em : workers)
        mapByCity[em.City()]++;

    for (const auto& [city, num] : mapByCity)
        std::cout << city << ": " << num << '\n';
}

void OptionalTest() {
    CompanyDatabase db;
    std::vector<Employee> workers;
    for (size_t i = 0; i < db.MaxEntries(); ++i)
        workers.emplace_back(Employee{ i, db });

    std::thread t1(SalaryTask, std::cref(workers));
    std::thread t2(CityTask, std::cref(workers));
    t1.join();
    t2.join();
}

The code creates a vector of workers and then it passes the vector into two tasks: one that calculates the salary, and the other for some location stats.

If we’re lucky and there are no “collisions”, we might get the following output:

Fetching record: 0
Fetching record: 1
Fetching record: 2
Fetching record: 3
Sum salary: 440
Cracow: 2
Warsaw: 2

What we have here is a nice serial execution.

First, the salary thread kicks in and calls the Salary() getter, which fetches the record from the database. Each database access prints some output, so we can see which element is referenced. Later, the city thread starts, and by then there’s no need to get the data from the database.

It’s super simple, with only four elements… but still, on Wandbox I could get the following output:

Fetching record: Fetching record: 0
0
Fetching record: 1
Fetching record: 2
Fetching record: 3
Sum salary: 440
Cracow: 2
Warsaw: 2

The above output means that two threads tried to access the first element simultaneously!

note: we also don’t sync std::cout, so the output might even show more artefacts.
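If the interleaved output gets in the way, one option is to funnel all logging through a small helper that holds a mutex while it writes a complete line. The sketch below is not part of the original sample; PrintLine and DemoOutput are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: writes one whole line while holding a mutex
// shared by all callers, so lines from different threads don't mix.
inline void PrintLine(std::ostream& os, const std::string& line) {
    static std::mutex outMutex;
    std::scoped_lock lock(outMutex);
    os << line << '\n';
}

// Small demo: several threads log into the same stream through PrintLine.
inline std::string DemoOutput(int numThreads) {
    assert(numThreads > 0);
    std::ostringstream out;
    std::vector<std::thread> pool;
    for (int i = 0; i < numThreads; ++i)
        pool.emplace_back([&out, i] {
            PrintLine(out, "Fetching record: " + std::to_string(i));
        });
    for (auto& t : pool)
        t.join();
    return out.str();
}
```

This only tidies up the log. It doesn’t protect the lazy initialisation itself, and the order of the lines still depends on how the threads are scheduled.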

Or even

Fetching record: 0
Fetching record: 0
Fetching record: 1
Fetching record: 1
Fetching record: 2
Fetching record: 3
Sum salary: 440
Cracow: 2
Warsaw: 2

Now, we duplicated access for two elements…

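The final results are correct here, and the duplicated fetch does no visible harm in our particular example. Formally, though, two threads writing to the same std::optional without synchronisation is a data race, which is undefined behaviour, so much worse things could happen in a real application.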

At this point we also have to make a disclaimer: for our test application, we assume that once the records are read from the DB, the code only reads the data and doesn’t modify it (doesn’t change the values for employees in the input vector). In other words, we focus only on the lazy init part.

OK, how can we make our code safer?

Adding Mutexes

As with most multithreading scenarios, we should be aware of data races. To have safe code, we need to wrap the initialisation in some form of critical section.

Let’s try a first solution with a mutex:

class EmployeeMut {
public:
    explicit EmployeeMut(size_t id, const CompanyDatabase& db) : _id(id), _db(&db) { }

    std::string Name() const { MakeSureWereReady(); return _rec->_name; }
    std::string Surname() const { MakeSureWereReady(); return _rec->_surname; }
    std::string City() const { MakeSureWereReady(); return _rec->_city; }
    TSalary Salary() const { MakeSureWereReady(); return _rec->_salary; }

    friend std::ostream& operator<<(std::ostream& os, const EmployeeMut& em) { ... }

private:
    void MakeSureWereReady() const {
        std::scoped_lock lock(mut); // !! !!
        if (!_rec)
            _rec = _db->FetchRecord(_id);
    }
private:
    size_t _id{ CompanyDatabase::InvalidID };
    const CompanyDatabase* _db;

    mutable std::mutex mut;
    mutable std::optional<CompanyDatabase::EmployeeRecord> _rec;    
};

What I did here is a simple addition of std::mutex to the class… and that’s all… but of course, when I tried to compile it, I got an error. Do you know what’s wrong here?

std::mutex is neither copyable nor movable, so if you want to use it as a class member, you need to write custom copy and move constructors, assignment operators and possibly other special member functions.

As a basic solution, I used the following implementation:

~EmployeeMut() { }

// Copying the record (it holds std::strings) may allocate and throw,
// so the copy operations are not marked noexcept.
EmployeeMut(const EmployeeMut& other) 
    : _id(other._id), _db(other._db), _rec(other._rec) { }
EmployeeMut& operator=(const EmployeeMut& other) 
    { _id = other._id; _db = other._db; _rec = other._rec; return *this; }
EmployeeMut(EmployeeMut&& other) noexcept 
    : _id(other._id), _db(other._db), _rec(std::move(other._rec)) { }
EmployeeMut& operator=(EmployeeMut&& other) noexcept 
    { _id = other._id; _db = other._db; _rec = std::move(other._rec); return *this; }

In the above code, I’m skipping the mutex (each new object simply gets its own fresh one), and I assume that such copy/move actions are only invoked in a well-defined serial scenario.

To improve the implementation, you might want to check the solution suggested on Stack Overflow: “How should I deal with mutexes in movable types in C++?”. It handles read and write scenarios.
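To give a flavour of what a safer copy could look like, here is a simplified sketch in which the copy operations lock the source object before reading its state. It is not the Stack Overflow code: it uses a plain std::mutex with exclusive locking only, and LazyName is a hypothetical, cut-down stand-in for EmployeeMut with a placeholder instead of the database call.

```cpp
#include <cassert>
#include <mutex>
#include <optional>
#include <string>

// Hypothetical, cut-down stand-in for EmployeeMut.
class LazyName {
public:
    LazyName() = default;

    // Copy construction: lock the source so we never read its state
    // while another thread is initialising it. The new object gets
    // its own fresh mutex.
    LazyName(const LazyName& other) {
        std::scoped_lock lock(other._mut);
        _value = other._value;
    }

    // Copy assignment: std::scoped_lock acquires both mutexes with
    // deadlock avoidance; self-assignment must be filtered out first.
    LazyName& operator=(const LazyName& other) {
        if (this != &other) {
            std::scoped_lock lock(_mut, other._mut);
            _value = other._value;
        }
        return *this;
    }

    std::string Get() const {
        std::scoped_lock lock(_mut);
        if (!_value)
            _value = "loaded";   // placeholder for the costly fetch
        assert(_value.has_value());
        return *_value;
    }

    bool IsLoaded() const {
        std::scoped_lock lock(_mut);
        return _value.has_value();
    }

private:
    mutable std::mutex _mut;
    mutable std::optional<std::string> _value;
};
```

Move operations could follow the same pattern. The price is that every copy now takes a lock, which is why the simpler version above is fine as long as copies really happen only in serial code.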

Running the code

If we test EmployeeMut, each record should always be fetched exactly once (the output of the two tasks themselves may still appear in either order):

Fetching record: 0
Fetching record: 1
Fetching record: 2
Fetching record: 3
Cracow: 2
Warsaw: 2
Sum salary: 440

Full code at @Wandbox
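Reading the log is one way to check the behaviour; counting the fetches is another. Below is a minimal sketch of such a check, assuming a hypothetical LazyValue class that mirrors EmployeeMut but bumps an atomic counter instead of calling the database. CountFetches is likewise a made-up helper.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <optional>
#include <thread>
#include <vector>

// Hypothetical miniature of EmployeeMut: the "database call" is just
// an increment of a shared counter.
class LazyValue {
public:
    explicit LazyValue(std::atomic<int>& fetchCounter) : _fetchCounter(&fetchCounter) { }

    int Get() const {
        std::scoped_lock lock(_mut);
        if (!_value) {
            ++*_fetchCounter;   // stands in for _db->FetchRecord(_id)
            _value = 42;
        }
        return *_value;
    }

private:
    std::atomic<int>* _fetchCounter;
    mutable std::mutex _mut;
    mutable std::optional<int> _value;
};

// Accesses one lazy object from several threads and reports how many
// times the initialisation code actually ran.
inline int CountFetches(int numThreads) {
    assert(numThreads > 0);
    std::atomic<int> fetches{ 0 };
    LazyValue value{ fetches };
    std::vector<std::thread> pool;
    for (int i = 0; i < numThreads; ++i)
        pool.emplace_back([&value] { value.Get(); });
    for (auto& t : pool)
        t.join();
    return fetches.load();
}
```

With the mutex in place, CountFetches() should report a single fetch no matter how many threads are used.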

Using std::call_once()

Since C++11, we can also use a possibly simpler approach: std::call_once():

class EmployeeOnce {
public:
    explicit EmployeeOnce(size_t id, const CompanyDatabase& db) : _id(id), _db(&db) { }
    ~EmployeeOnce() { }

    // Copying the record may allocate and throw, so the copy operations are not marked noexcept.
    EmployeeOnce(const EmployeeOnce& other) : _id(other._id), _db(other._db), _rec(other._rec) { }
    EmployeeOnce& operator=(const EmployeeOnce& other) { _id = other._id; _db = other._db; _rec = other._rec; return *this; }
    EmployeeOnce(EmployeeOnce&& other) noexcept : _id(other._id), _db(other._db), _rec(std::move(other._rec)) { }
    EmployeeOnce& operator=(EmployeeOnce&& other) noexcept { _id = other._id; _db = other._db; _rec = std::move(other._rec); return *this; }

    std::string Name() const { MakeSureWereReady(); return _rec->_name; }
    std::string Surname() const { MakeSureWereReady(); return _rec->_surname; }
    std::string City() const { MakeSureWereReady(); return _rec->_city; }
    TSalary Salary() const { MakeSureWereReady(); return _rec->_salary; }

    friend std::ostream& operator<<(std::ostream& os, const EmployeeOnce& em) { ... }

private:
    void MakeSureWereReady() const {
        if (!_rec) {
            std::call_once(_flag, [&]() {   // !!!
                if (!_rec)
                    _rec = _db->FetchRecord(_id);
            });
        }
    }

private:
    size_t _id{ CompanyDatabase::InvalidID };
    const CompanyDatabase* _db;

    mutable std::once_flag _flag;
    mutable std::optional<CompanyDatabase::EmployeeRecord> _rec;    
};

To use call_once in our code, we need to store a flag that will indicate if the callable object was invoked or not. As you can see, this is _flag in EmployeeOnce. Later, we only changed MakeSureWereReady() which now calls std::call_once().
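For comparison, here is a minimal sketch of the same idea in isolation, where every caller goes straight through std::call_once() without any check in front of it. LazyOnce and CountInits are hypothetical names, and the counter merely stands in for the database call.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <optional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical miniature of EmployeeOnce.
class LazyOnce {
public:
    explicit LazyOnce(std::atomic<int>& initCounter) : _initCounter(&initCounter) { }

    std::string Get() const {
        // call_once runs the callable a single time; concurrent callers
        // wait until it has finished, and its effects are visible to
        // every caller once call_once returns.
        std::call_once(_flag, [this] {
            ++*_initCounter;      // stands in for the database call
            _value = "record";
        });
        assert(_value.has_value());
        return *_value;
    }

private:
    std::atomic<int>* _initCounter;
    mutable std::once_flag _flag;
    mutable std::optional<std::string> _value;
};

// Accesses one object from several threads and reports how many times
// the initialisation ran.
inline int CountInits(int numThreads) {
    assert(numThreads > 0);
    std::atomic<int> inits{ 0 };
    LazyOnce value{ inits };
    std::vector<std::thread> pool;
    for (int i = 0; i < numThreads; ++i)
        pool.emplace_back([&value] { value.Get(); });
    for (auto& t : pool)
        t.join();
    return inits.load();
}
```

If the callable throws, the flag stays unset and a later call gets another attempt, which can be handy when the fetch may fail.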

What I noticed is that once_flag is much smaller than std::mutex. In my test on GCC 9.2 it was just 8 bytes, vs 30 bytes for a mutex (the exact sizes depend on the platform and the standard library implementation).

The trouble is that none of the special member functions can copy or reassign the once flag, as it’s neither copyable nor movable. If you copy an object that is already initialised, the copy gets a fresh, unset flag. Potentially that might cause call_once() to fire again. Still, the copy also carries the copied _rec, and we guard the fetch with the if (!_rec) condition, so the record shouldn’t be fetched a second time… however, I’m not 100% sure here.

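Unfortunately, we can still have data races here: the outer if (!_rec) check reads _rec without any synchronisation, while another thread might be writing to it inside call_once().
To improve the code, we could drop the outer check and rely on call_once() alone, or insert the right memory barriers (an atomic flag with acquire/release ordering) to make the double-checked locking correct…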
You can also read the following guides:
Core Guidelines: CP.111: Use a conventional pattern if you really need double-checked locking
Double-Checked Locking is Fixed In C++11
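To make the double-checked locking idea more concrete, here is a minimal sketch along the lines of the conventional pattern: an atomic flag is read with acquire ordering on the fast path and stored with release ordering after the data is written under a mutex. LazyDcl and CountDclInits are hypothetical names used only for this illustration, and the counter stands in for the database call.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <optional>
#include <thread>
#include <vector>

// Hypothetical lazy holder using double-checked locking.
class LazyDcl {
public:
    explicit LazyDcl(std::atomic<int>& initCounter) : _initCounter(&initCounter) { }

    int Get() const {
        // Fast path: an acquire load pairs with the release store below,
        // so a thread that sees true also sees the written _value.
        if (!_ready.load(std::memory_order_acquire)) {
            std::scoped_lock lock(_mut);
            // Second check under the lock; the mutex already orders it.
            if (!_ready.load(std::memory_order_relaxed)) {
                ++*_initCounter;   // stands in for the database call
                _value = 42;
                _ready.store(true, std::memory_order_release);
            }
        }
        assert(_value.has_value());
        return *_value;
    }

private:
    std::atomic<int>* _initCounter;
    mutable std::mutex _mut;
    mutable std::atomic<bool> _ready{ false };
    mutable std::optional<int> _value;
};

// Accesses one object from several threads and reports how many times
// the initialisation ran.
inline int CountDclInits(int numThreads) {
    assert(numThreads > 0);
    std::atomic<int> inits{ 0 };
    LazyDcl value{ inits };
    std::vector<std::thread> pool;
    for (int i = 0; i < numThreads; ++i)
        pool.emplace_back([&value] { value.Get(); });
    for (auto& t : pool)
        t.join();
    return inits.load();
}
```

The key difference from checking the optional directly is that the fast path only ever touches the atomic flag; _value is read only after the flag has been observed as set, or while holding the lock.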

Summary

In this blog post, you’ve seen a scenario where unprotected lazy init code was fired twice even though only two threads performed some actions. As a simple solution, we improved the pattern by protecting the initialisation step with a mutex and then with std::call_once. Still, the code is relatively simple and might fail when the data is modified and not only read. So at the moment, I need to leave the topic and wait for your input and feedback.

What patterns do you use for such lazy initialisation in a multithreaded environment?

You can also read the previous article that introduces lazy initialisation in C++.