I. Multiple Threads

Multithreading in C++ allows concurrent execution of two or more parts of a program for maximum utilization of the CPU. We can create a thread using the std::thread class and wait for a thread using the std::thread::join() function. To create multiple threads, we can use a loop to create multiple thread objects and launch them

Example of multiple threads in C++

Example multiple_threads.cpp bellow:

multiple_threads.cpp

// Example of starting multiple threads
#include <iostream>
#include <thread>
#include <chrono>

void hello(int num)
{
	// Add a delay
	std::this_thread::sleep_for(std::chrono::seconds(num));
	std::cout << "Hello from thread " << num << '\n';
}

int main() {
	// Start 3 threads
	std::cout << "Starting 3 threads:\n";
	std::thread thr1(hello, 1);
	std::thread thr2(hello, 2);
	std::thread thr3(hello, 3);
	
	// Wait for the threads to finish
	thr1.join();
	thr2.join();
	thr3.join();
}

Output

Hello from thread 1
Hello from thread 2
Hello from thread 3

Data sharing threads

Sharing data between threads is a common task in multithreading applications. When multiple threads access the same data, there is a risk of data races, which can lead to undefined behavior. To avoid data races, we need to synchronize access to shared data using mutexes, locks, or other synchronization primitives. Here are some ways to share data between threads in C++:

To share data between threads in C++, the thread function must be able to access the definition of the variable that will hold the shared data. If we are using a global function for the task function, we have to use global or static variables. If we are using member functions of a class, we have to use a static class member, or global or static variables. If the task functions are lambda expressions, then they can capture a local variable by reference. However, if they capture by value, then each lambda expression, each thread, will have its own copy of the shared data, and it will not be able to see any changes that other threads make to their copies. To avoid data races and data corruption, we need to synchronize access to shared data using mutexes, locks, or other synchronization primitives. Threads can interleave their execution and interfere with each other’s actions, which is one of the main reasons why writing safe multi-threaded programs is much harder than writing safe single-threaded programs

Data races

A data race in C++ occurs when at least two threads access a shared variable simultaneously, and at least one thread tries to modify the variable. A race condition, on the other hand, is not necessarily bad and can be the reason for a data race.

Data Race
Data Race

However, a data race is an undefined behavior, and all reasoning about the program makes no sense anymore. When multiple threads access the same memory location concurrently, and at least one of the accesses is for writing, and the threads are not using any exclusive locks to control their accesses to that memory, the order of accesses is non-deterministic, and the computation may give different results from run to run depending on that order. The compiler and optimizer assume that there are no data races in the code and optimize it under that assumption, which may result in crashes or other completely surprising behavior. Debugging data races can be a real challenge and sometimes requires tools such as ThreadSanitizer or Concurrency Visualizer. To avoid data races, we need to synchronize access to shared data using mutexes, locks, or other synchronization primitives. Data races are a common problem in multithreaded programming,

Example data_races.cpp bellow:

data_races.cpp

// Unsynchronized threads which make conflicting accesses.
// But where is the shared memory location?
#include <thread>
#include <iostream>

void print(std::string str)
{
	// A very artificial way to display a string!
	for (int i = 0; i < 5; ++i) {
		std::cout << str[0] << str[1] << str[2] << std::endl;
	}
}

int main()
{
	std::thread thr1(print, "abc");
	std::thread thr2(print, "def");
	std::thread thr3(print, "xyz");

	// Wait for the tasks to complete
	thr1.join();
	thr2.join();
	thr3.join();
}

Output

abc
abc
abc
adbefc --> Data race (add more data)
def
def
ef     --> Data race (missing data)
xyz
xyz

Data races assignment

Write a task function that increments a global int variable 100,000 times in a for loop. Write a program that runs this function in concurrent threads. When all the threads have completed execution, print out the final value of the counter. Increase the number of threads until you see anomalous results.

Solution

data_races.cpp

#include <iostream>
#include <thread>

using namespace std;

void print(int x){
	for(int i =0; i <x ; i++){
		cout<< i << endl;
	}
}
int main(){
	//Thread call
	thread thr1(print, 10);
	thread thr2(print, 20);
	thread thr3(print, 30);
	
	// Wait for the tasks to complete
	thr1.join();
	thr2.join();
	thr3.join();

}

Another solution

data_races.cpp

#include <iostream>
#include <thread>
#include <vector>

using namespace std;

int counter {0};

void task() {
	for (int i = 0; i < 100'000; ++i) ++counter;
}

int main() {
	std::vector<std::thread> tasks;
	for (int i = 0; i < 10; ++i) tasks.push_back(std::thread{task});
	for (auto& t: tasks) t.join();
	cout << counter << endl;
}

Data Race Consequences

A data race in C++ occurs when at least two threads access a shared variable simultaneously, and at least one thread tries to modify the variable. Data races lead to undefined behavior, which means that all reasoning about the program makes no sense anymore. The consequences of a data race can be severe, and they can cause the breaking of invariants, blocking issues of threads, or lifetime issues of variables

Torn reads and writes can occur when two threads access the same variable simultaneously, and at least one thread tries to modify the variable.

Solution for prevent the data races:
  • Avoid sharing data between different threads
  • If unvoidable, synchronize the threads impose an ordering on the threads how to access the data
  • This has the substantial cost increase execution time and increase program complexity.

References

  1. https://www.modernescpp.com/index.php/race-condition-versus-data-race/
  2. https://en.cppreference.com/w/cpp/language/memory_model
  3. https://www.modernescpp.com/index.php/c-core-guidelines-sharing-data-between-threads/
  4. James Raynard, Learn Multithreading with Modern C++ Udemy.
  5. https://blog.andreiavram.ro/cpp-channel-thread-safe-container-share-data-threads/