
Why is Signals volatile? #431

Closed
lucasart opened this issue Sep 19, 2015 · 9 comments

@lucasart

So long as they are only read/written under lock protection, we are safe from unwanted compiler optimization.

Also, the bool members of SignalsType would need to be std::atomic_flag or std::atomic<bool> if we wanted to modify them outside lock protection.

Am I missing something?
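For concreteness, the two alternatives mentioned above could be sketched like this (a sketch only; names such as signal_stop and try_begin_search are hypothetical, not Stockfish's API):

```cpp
#include <atomic>

// std::atomic<bool>: can be loaded and stored atomically from any thread.
std::atomic<bool> stop{false};

void signal_stop()    { stop.store(true); }   // safe without a lock
bool stop_requested() { return stop.load(); } // safe without a lock

// std::atomic_flag: the leaner primitive, but it only supports
// test_and_set() and clear(), so it suits "claim/release" patterns.
std::atomic_flag searching = ATOMIC_FLAG_INIT;

bool try_begin_search() { return !searching.test_and_set(); }
void end_search()       { searching.clear(); }
```

std::atomic_flag is guaranteed lock-free by the standard; std::atomic<bool> almost always is too, but that is implementation-defined.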

@mcostalba

I don't see that signals are lock protected; for instance, 'stop' is not.

Regarding atomic: why do you think we should use atomic instead of a bool in this case? (Please give a specific answer if possible, no general stuff.)

@lucasart
Author

As far as I know, and correct me if I'm wrong:

  • the C standard does not even guarantee that a 4-byte variable will be written atomically. On x86 and x86-64 the actual 32-bit mov instruction does the write atomically, but a standard-conforming compiler is perfectly allowed to compile the code as four 1-byte writes in shuffled order instead. That seems stupid and crazy, but it's not forbidden. Who knows? There may be some weird platform or weird case where it makes sense for a compiler to do that. Or maybe today there are no such cases, but tomorrow there will be.
  • the compiler or hardware can do any kind of reordering. If you set signals.stop = true and then signals.stoponponderhit = true, the current thread sees these values change in one order, but another thread may see them change in a different order, due to cache coherency effects.
  • there is no guarantee that once a thread sets a variable (even if you make it volatile) to some value, the new value is visible to everyone after any given amount of time. I think the C++ standard has loose wording like "after a reasonable amount of time" or something like that.
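The reordering point can be shown in the other direction: with std::atomic and the default seq_cst ordering, the observed order is guaranteed. A minimal sketch (names hypothetical):

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> stop{false};
std::atomic<bool> stopOnPonderHit{false};

// Returns the value of `stop` observed once `stopOnPonderHit` became
// visible. With seq_cst atomics this is guaranteed to be true; with
// plain (or volatile) bools the standard gives no such guarantee.
bool observe_order() {
    std::thread writer([] {
        stop.store(true);              // seq_cst: ordered before...
        stopOnPonderHit.store(true);   // ...this store, for all threads
    });
    while (!stopOnPonderHit.load()) {} // spin until second write is visible
    bool stopSeen = stop.load();       // must already be true
    writer.join();
    return stopSeen;
}
```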

In practice, perhaps the SF code is such that all the scenarios this can produce cause no problem and have been thought through correctly. But if that's the case, it makes the code much harder to understand (and easier to break by seemingly trivial modifications that fail to grasp the whole logic). It takes a lot more brain power to envisage all the possibilities and be sure that each of them is handled correctly.

With lock/unlock or atomics, the standard guarantees your expectations, and you get peace of mind and a future-proof codebase.

Is there a measurable performance penalty in imposing the following constraints?

  • signals is an atomic
  • bit 0 means stop, bit 1 means stopOnPonderHit, etc.
  • signals can only be written under lock protection, and this must be done only once per critical section (i.e. calculate the new value and set signals = newValue exactly once).
  • signals can be read outside lock protection. This is the hot path, where you don't want the lock to slow you down. Because modifications are atomic and done in one shot, you can never read a half-baked signals value where the writing thread has modified signals.stop but not yet signals.stopOnPonderHit (or has, from its own point of view, but not yet from the reading thread's point of view).
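A minimal sketch of these constraints, with hypothetical names (raise_signals, stop_requested) rather than Stockfish's actual code:

```cpp
#include <atomic>
#include <mutex>

// All signals packed as bits of one atomic word.
enum Signal : unsigned { STOP = 1u << 0, STOP_ON_PONDER_HIT = 1u << 1 };

std::atomic<unsigned> signals{0};
std::mutex signalsMutex;

// Writers serialize on the mutex and publish the new value with a
// single store, so readers never see a half-baked combination.
void raise_signals(unsigned mask) {
    std::lock_guard<std::mutex> lk(signalsMutex);
    unsigned v = signals.load(std::memory_order_relaxed);
    signals.store(v | mask, std::memory_order_release); // one-shot publish
}

// Lock-free read on the hot path.
bool stop_requested() {
    return signals.load(std::memory_order_acquire) & STOP;
}
```

A lock-free fetch_or on the writer side would also work; the mutex here just follows the "write under lock, read lock-free" constraint stated above.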

Edit: The "after a reasonable amount of time" is only for atomics. For volatile, it seems that no guarantee is given.

@mcostalba

mcostalba commented Sep 29, 2015 via email

@mstembera
Contributor

Marco, I'm not familiar with this code, but your explanation makes complete sense to me. Thank you! All I want to add is that if atomic were used for self-documenting purposes and I were reading the code on my own, without your explanation, it would actually be harder for me to follow, because I would assume atomic was in fact necessary.

@lucasart
Author

@mcostalba: By default, atomics use memory_order_seq_cst. You could use memory_order_relaxed when reading the signal (on the hot path). But of course, signals.load(memory_order_relaxed) is more verbose than just signals. At least C++ gives you a couple of guarantees that volatile does not:

  • Reads and writes are atomic (in practice it would be the case anyway, but now it's guaranteed by the standard), so you can't have a torn write observed half-way through by another thread. I don't know if there are platforms where bool is not atomic, but I do know there are platforms where 1-byte reads/writes are non-atomic...
  • Writes become visible to other threads after a reasonable amount of time.
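A sketch of the relaxed hot-path read described above (stopFlag, should_stop and request_stop are hypothetical names):

```cpp
#include <atomic>

std::atomic<bool> stopFlag{false};

// Hot-path read: memory_order_relaxed is enough when only the flag's
// own value matters, not its ordering relative to other memory.
bool should_stop() {
    return stopFlag.load(std::memory_order_relaxed);
}

// Writer uses the default (seq_cst) store; this path is cold, so its
// cost doesn't matter.
void request_stop() {
    stopFlag.store(true);
}
```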

@lucasart
Author

Anyway, I'm closing this, because it's not a bug, it's a feature.

@lucasart
Author

@mcostalba: Atomics with memory_order_acquire/release are completely free on x86: no memory barriers. It's only if you use the default memory_order_seq_cst that you get memory barriers. If you don't believe me, write a trivial program and look at the generated assembly. It's only on ARM that you will see some memory barriers.

I submitted tests with and without memory barriers on fishtest.

My point is two-fold:

  • the theoretical cost of a memory barrier is not even measurable in practice (and that's on ARM, where we don't currently test).
  • correctness can be achieved on x86 without barriers, and without locks either.
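Following the suggestion above, here is a hypothetical trivial program to inspect: compiling it with g++ -O2 -S on x86-64 generally shows plain mov instructions for the acquire load and the release store, while the seq_cst store emits an xchg (a full barrier).

```cpp
#include <atomic>

std::atomic<int> x{0};

// On x86-64: compiles to a plain mov (acquire is free).
int load_acquire() {
    return x.load(std::memory_order_acquire);
}

// On x86-64: compiles to a plain mov (release is free).
void store_release(int v) {
    x.store(v, std::memory_order_release);
}

// On x86-64: compiles to xchg (or mov + mfence), a full barrier.
void store_seq_cst(int v) {
    x.store(v, std::memory_order_seq_cst);
}
```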

@lucasart lucasart reopened this Oct 12, 2015
@vondele
Member

vondele commented Oct 16, 2015

Actually, there is now a good tool to detect data races in code: please try -fsanitize=thread with a recent gcc (using 5.2 here) on Linux. It triggers often in Stockfish, with quite clean output.
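A minimal, hypothetical example of the kind of race the tool reports: compiled with g++ -fsanitize=thread -g, ThreadSanitizer flags the unsynchronized increments below.

```cpp
#include <thread>

int counter = 0;                        // plain int: unsynchronized shared state

// Two threads increment the counter without any lock or atomic: a
// textbook data race. ThreadSanitizer reports it; without the
// sanitizer the program still runs, but the result is 1 or 2
// depending on the interleaving.
int run_race() {
    auto bump = [] { ++counter; };      // racy read-modify-write
    std::thread a(bump), b(bump);
    a.join();
    b.join();
    return counter;
}
```

Making counter a std::atomic<int> (or guarding it with a mutex) silences the report.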

For a good discussion of data races by the author of the gcc implementation, see:
https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
and by the Firefox team:
https://blog.mozilla.org/nnethercote/2015/02/24/fix-your-damned-data-races/

So, I think performance is one aspect, but code correctness is another, and as mentioned by lucasart, on x86 atomics are partially free depending on the memory_order.

For example, the first race found by the tool (on stockfish benchmark with 1 thread):

WARNING: ThreadSanitizer: data race (pid=46204)
Write of size 1 at 0x7d1c0000dff9 by thread T2:
#0 MainThread::think() /data/vjoost/gnu/tst/Stockfish-atomic_signals/src/search.cpp:302 (stockfish+0x00000041b8b9)
#1 MainThread::idle_loop() /data/vjoost/gnu/tst/Stockfish-atomic_signals/src/thread.cpp:157 (stockfish+0x00000043738b)
#2 void std::_Mem_fn_base<void (MainThread::*)(), true>::operator()<, void>(MainThread*) const /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/functional:600 (stockfish+0x00000043d876)
#3 void std::_Bind_simple<std::_Mem_fn<void (MainThread::*)()> (MainThread*)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/functional:1531 (stockfish+0x00000043d876)
#4 std::_Bind_simple<std::_Mem_fn<void (MainThread::*)()> (MainThread*)>::operator()() /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/functional:1520 (stockfish+0x00000043d876)
#5 _M_run /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/thread:115 (stockfish+0x00000043d876)
#6 execute_native_thread_routine (libstdc++.so.6+0x0000000f6fcd)

Previous read of size 1 at 0x7d1c0000dff9 by thread T1 (mutexes: write M79):
#0 TimerThread::idle_loop() /data/vjoost/gnu/tst/Stockfish-atomic_signals/src/thread.cpp:108 (stockfish+0x000000437627)
#1 void std::_Mem_fn_base<void (TimerThread::*)(), true>::operator()<, void>(TimerThread*) const /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/functional:600 (stockfish+0x00000043d7ee)
#2 void std::_Bind_simple<std::_Mem_fn<void (TimerThread::*)()> (TimerThread*)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/functional:1531 (stockfish+0x00000043d7ee)
#3 std::_Bind_simple<std::_Mem_fn<void (TimerThread::*)()> (TimerThread*)>::operator()() /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/functional:1520 (stockfish+0x00000043d7ee)
#4 _M_run /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/thread:115 (stockfish+0x00000043d7ee)
#5 execute_native_thread_routine (libstdc++.so.6+0x0000000f6fcd)

This one can be fixed by using
std::atomic<bool> run = ATOMIC_VAR_INIT(false);
in struct TimerThread.

Leading to the next:

WARNING: ThreadSanitizer: data race (pid=46785)
Write of size 8 at 0x00000065b268 by thread T2:
#0 init /data/vjoost/gnu/tst/Stockfish-atomic_signals/src/timeman.cpp:105 (stockfish+0x000000419789)
#1 MainThread::think() /data/vjoost/gnu/tst/Stockfish-atomic_signals/src/search.cpp:229 (stockfish+0x000000419789)
#2 MainThread::idle_loop() /data/vjoost/gnu/tst/Stockfish-atomic_signals/src/thread.cpp:157 (stockfish+0x0000004373a3)
#3 void std::_Mem_fn_base<void (MainThread::*)(), true>::operator()<, void>(MainThread*) const /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/functional:600 (stockfish+0x00000043d87e)
#4 void std::_Bind_simple<std::_Mem_fn<void (MainThread::*)()> (MainThread*)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/functional:1531 (stockfish+0x00000043d87e)
#5 std::_Bind_simple<std::_Mem_fn<void (MainThread::*)()> (MainThread*)>::operator()() /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/functional:1520 (stockfish+0x00000043d87e)
#6 _M_run /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/thread:115 (stockfish+0x00000043d87e)
#7 execute_native_thread_routine (libstdc++.so.6+0x0000000f6fcd)

Previous read of size 8 at 0x00000065b268 by thread T1:
#0 TimeManagement::elapsed() const /data/vjoost/gnu/tst/Stockfish-atomic_signals/src/timeman.h:36 (stockfish+0x0000004377af)
#1 check_time /data/vjoost/gnu/tst/Stockfish-atomic_signals/src/search.cpp:1551 (stockfish+0x0000004377af)
#2 TimerThread::idle_loop() /data/vjoost/gnu/tst/Stockfish-atomic_signals/src/thread.cpp:113 (stockfish+0x0000004377af)
#3 void std::_Mem_fn_base<void (TimerThread::*)(), true>::operator()<, void>(TimerThread*) const /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/functional:600 (stockfish+0x00000043d7f6)
#4 void std::_Bind_simple<std::_Mem_fn<void (TimerThread::*)()> (TimerThread*)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/functional:1531 (stockfish+0x00000043d7f6)
#5 std::_Bind_simple<std::_Mem_fn<void (TimerThread::*)()> (TimerThread*)>::operator()() /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/functional:1520 (stockfish+0x00000043d7f6)
#6 _M_run /data/vjoost/gnu/gcc-libstd/gcc-5.2.0/install/include/c++/5.2.0/thread:115 (stockfish+0x00000043d7f6)
#7 execute_native_thread_routine (libstdc++.so.6+0x0000000f6fcd)

Location is global 'Time' of size 32 at 0x00000065b260 (stockfish+0x00000065b268)

@zamar

zamar commented Oct 25, 2015

Signals are now atomic.

Closing this issue.

@zamar zamar closed this as completed Oct 25, 2015
niklasf pushed a commit to niklasf/Stockfish that referenced this issue Nov 16, 2017