runtime: starvation with frequent timer resets #38119
Comments
This reverts commit c88a690. Unfortunately, Go 1.14's new scheduler and timer changes have a few bugs that need to be ironed out before it's ready for production use. See:
* libp2p/go-libp2p#858
* golang/go#38119
Sorry, I should have called this out more prominently in the issue. I believe this is a new issue. I tested master as of af7eafd (this morning) and, while it no longer deadlocks and hangs completely, it still starves the main thread. Please take a look at the "what did you see instead". I tested on go 1.13.8, go 1.14.1, and master.
Sorry, I missed that.
At first glance, what seems to have changed here is that in 1.13 resetting a timer required taking a lock on a timer bucket. That effectively serialized the goroutines, so they ran in order, and they also alternated with the timer goroutine that took the same lock.
On tip, resetting a timer does not require any lock. So you have 4000 goroutines running flat out with no locks. 100 microseconds isn't enough time for them all to finish, so they are always running and contending with each other. That seems to be enough to slow down the timer checks.
If I add a call to [...] So I think the key difference in 1.14 is that a loop calling [...]
I can't decide whether there is a real bug here or not, or whether this is something that should be fixed by changing the program to not busy loop. It would be interesting to see whether the real program has the same busy looping behavior.
Ah, got it. The actual code that spawned this bug report writes to a socket occasionally so it shouldn't exhibit this issue. I was mostly concerned that my test program for reproducing the bug worked on 1.13.8 but not on the latest master. Given your explanation, I can't think of a useful program that would have this problem.
(sorry, didn't mean to close this for you)
I'm OK with closing this, unless and until we see some real code that runs into this problem.
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes. Tested with the latest master and the latest release (go 1.14.1).
What operating system and processor architecture are you using (go env)?
What did you do?
Frequently reset a 100us timer in 4K goroutines with:
I'm investigating this bug as part of libp2p/go-libp2p#858. I believe this is triggered by https://github.com/libp2p/go-yamux/pull/4/files#diff-f2fccfb96064097b73fbe8eeae3703eeR457 (I'm testing with the patch reverted at the moment and everything seems fine).
What did you expect to see?
Go prints "running: 4000" (ish) every few seconds at most.
What did you see instead?
Profiles
I've included a sample of the thread states. The first is in go 1.13.8, and the second is from the latest master.
Go 1.13.8
Included as a reference for what this looked like when it worked well.
CPU Profile:
go-1.13.8-cpu.gz
Threads from delve:
Thread 21907
Thread 21912
Thread 21913
Thread 21914
Thread 21915
Thread 21916
Thread 21917
Thread 21918
Thread 21920
Go master
CPU Profile:
go-master-cpu.gz
Thread 23755
Thread 23760
Thread 23761
Thread 23762
Thread 23763
Thread 23809