A common operation in software is copying a block of memory. In C and C++, we often call the function memcpy for this purpose.
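As a reminder of what such a call looks like, here is a minimal sketch. The wrapper name `copy_block` is hypothetical and exists only for illustration; `std::memcpy` itself is the standard function, and it assumes that the two ranges do not overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from any library): copies `count` bytes
// from `src` to `dst`. The ranges must not overlap, and the sketch
// assumes that no other thread touches either range during the call.
void copy_block(void* dst, const void* src, std::size_t count) {
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, count);
}
```

Calling `copy_block(dst, src, sizeof(src))` with two character arrays of the same size leaves `dst` holding the same bytes as `src`.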
But what happens if, while you are copying the data, another thread is modifying either the source or the destination? The result is fundamentally unpredictable and almost surely a programming error.
Why would you ever design a copy function around this case, given that it is an error? Suppose you are implementing a JavaScript engine in C++, like Google V8. In JavaScript, we have SharedArrayBuffer instances that can be modified and copied from different threads. As the engineer working on the JavaScript engine, you cannot always prevent users from writing buggy code. They should use locks or another synchronization approach, but what if they did not?
In any case, you get a data race: two or more threads access the same memory location simultaneously, where at least one of the accesses is a write operation, without a synchronization mechanism to ensure that these operations occur in a specific order. You must have the following three ingredients to get a data race:
- Concurrent access: Multiple threads must be accessing the same variable or memory location at the same time.
- Write operation: At least one of these accesses must be a write operation.
- Lack of synchronization: No mechanism such as locks, semaphores, or atomic operations orders the accesses.
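To make the three ingredients concrete, here is a minimal sketch in which one thread writes to a buffer while another copies it. The names `GuardedBuffer`, `fill`, `snapshot`, and `concurrent_copy_is_consistent` are hypothetical. Because both threads take the same mutex, the third ingredient is removed and there is no data race; deleting the two `std::lock_guard` lines would bring it back.

```cpp
#include <array>
#include <cassert>
#include <cstring>
#include <mutex>
#include <thread>

// Hypothetical type for illustration: a small buffer and the mutex
// that protects it.
struct GuardedBuffer {
    std::mutex lock;
    std::array<unsigned char, 64> bytes{};
};

// Writer: overwrites every byte with `value` while holding the lock.
void fill(GuardedBuffer& buf, unsigned char value) {
    std::lock_guard<std::mutex> guard(buf.lock);
    buf.bytes.fill(value);
}

// Reader: copies the buffer while holding the same lock, so the copy
// never observes a half-finished write.
std::array<unsigned char, 64> snapshot(GuardedBuffer& buf) {
    std::lock_guard<std::mutex> guard(buf.lock);
    std::array<unsigned char, 64> out;
    std::memcpy(out.data(), buf.bytes.data(), out.size());
    return out;
}

// Runs one writer and one reader concurrently and reports whether the
// copy is uniform, that is, whether every byte has the same value.
bool concurrent_copy_is_consistent() {
    GuardedBuffer buf;
    std::array<unsigned char, 64> copy{};
    std::thread writer([&buf] {
        for (int i = 1; i <= 100; ++i) {
            fill(buf, static_cast<unsigned char>(i));
        }
    });
    std::thread reader([&buf, &copy] { copy = snapshot(buf); });
    writer.join();
    reader.join();
    for (unsigned char b : copy) {
        if (b != copy[0]) {
            return false;
        }
    }
    return true;
}
```

With the locks in place, `assert(concurrent_copy_is_consistent());` holds: the copy contains a single value, although which value it is depends on thread scheduling.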
What happens during a data race? The C++ standard states that a data race results in undefined behavior. In effect, the C++ language does not tell you what happens. A crash might occur. Of course, the JavaScript engineer would rather not see a crash.
Importantly, ‘undefined behavior’ also does not tell you that there is necessarily an er