time: make Timer/Ticker channels not receivable with old values after Stop or Reset returns #37196
Comments
For an example program illustrating why a user might reasonably expect (and attempt to make use of) this behavior, see https://play.golang.org/p/mDkMG67ehAI. |
Thanks for filing this. I also found this behavior puzzling and apparently I got it wrong too as I was expecting no values to be sent after If, as I understand now, a value might be sent after the call to |
If you know that no other goroutine is receiving on the (The |
In the current implementation, this can probably be done by changing |
Added to proposal process to resolve whether to do this, after mention in #38945. |
This made its first appearance in the proposal minutes last week.
This would succeed today, but if we clear the channel during On the other hand, there is technically a race here, and it's not guaranteed that this snippet never blocks, especially on a loaded system. The correct snippet, even today, is:
That example would be unaffected, because Stop would pull the buffered element back out of the channel and then return We could make the change at the start of a release cycle and be ready to roll it back if real (as opposed to hypothetical) problems arose. Thoughts? |
While I like the behavior of pulling the value back out of the channel and having For example, this program today is not reported as a race and the receive is guaranteed to never block, but would deadlock if

    package main

    import (
        "fmt"
        "time"
    )

    func main() {
        t := time.NewTimer(1 * time.Second)
        for len(t.C) == 0 {
            time.Sleep(time.Millisecond)
        }
        t.Stop()
        <-t.C
        fmt.Println("ok")
    }

That is why I had restricted my original proposal to guarantee only that no value is sent after |
Sorry, I'm not sure I understand what it means to guarantee that no value is sent on the channel after |
@ianlancetaylor, the program given in https://play.golang.org/p/Wm1x8DmYoQo should run indefinitely without reporting any delayed times. Instead, it reports a nonzero rate of delayed timer sends. On my Xeon workstation:
In all cases, the failure mode is the same: at some point after Instead, I propose that the send to the channel, if it occurs at all, ought to happen before the return from |
Thanks for the example. If the code is written per the documentation of And now I see how this happens. It's because in I don't see an easy fix in the current implementation. We can't leave the timer in |
Would it make sense to change the timer to the (Or would that potentially induce starvation for a |
The same send-after- (Code in https://play.golang.org/p/Y_Hz4xkYr07, but it doesn't reproduce there due to the Playground's |
It's actually a bit more complicated than I thought. If we leave the timer status as |
Let's try to separate the discussion of semantics from implementation. The docs say:
The proposed change in semantics is to make it that no receive can ever happen after t.Stop returns. That would mean that under the assumption - "assuming the program has not received from t.C already" - t.Stop would never return false anymore. So the above code would no longer be required (and if left alone would just never execute the if body). This would avoid the problem of people not knowing to write that fixup code or not understanding the subtlety involved. Certainly we've all confused ourselves enough just thinking through this over the years. That's the proposed semantic change. There are also questions about how to implement that - it's not just the easy "pull the item back out of the buffer" that I suggested before - but certainly it is possible. Let's figure out if we agree about the semantic change. (Another side-effect would probably be that len(t.C) would always return 0, meaning the channel would appear to be unbuffered. But we've never promised buffering in the docs. The buffering is only there because I was trying to avoid blocking the timer thread. It's a hack.) |
I agree that if we could change the semantics to ensure that no receive is possible after However, I don't see any way to do that without breaking the example in https://play.golang.org/p/r77N1PfXuu5. That program explicitly relies on the buffering behaviors of Disallowing a receive after So the only way I see to make that change in semantics would be to declare that this example program is already incorrect, because it is relying on undocumented behavior regardless of the stability of that behavior. I could be wrong, but given how long this behavior has been in place I suspect that changing it would break real programs. |
Do you know of any programs that call |
I do not know of any specifically, but given the 9-year interval and Hyrum's law I would be surprised if they did not exist. If we are willing to assume that such programs do not exist (or declare that they are invalid if they do exist, because the buffering behavior was never documented), then preventing the receive seems ok. |
Change https://go.dev/cl/568341 mentions this issue: |
Change https://go.dev/cl/568375 mentions this issue: |
An upcoming CL will give this call more to do. For now, separate out the compiler change that stops inlining the computation.

For #37196.

Change-Id: I965426d446964b9b4958e4613246002a7660e7eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/568375
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: Matthew Dempsky <[email protected]>
Auto-Submit: Russ Cox <[email protected]>
Looks like this may happen for Go 1.23, finally. In the discussion above #37196 (comment) I said the channel would "appear to be unbuffered", but I don't think I realized at the time (or I forgot) that that's literally the entire semantic change: timer channels will now be synchronous aka unbuffered. All the simplification follows from that. (The implementation is more subtle, including a hidden buffer as an optimization, but the semantic effect is simply that timer channels will now behave as if unbuffered.) For a graceful rollout, this change and the change for #8898 can be disabled by GODEBUG=asynctimerchan=1. |
Just found this thread while trying to track down a test flake - the test was trying to drain the channel with a |
With tip, the finalizer of a timer will not get executed. Is it intended?

    package main

    import (
        "runtime"
        "time"
    )

    func main() {
        timer := time.NewTimer(time.Second)
        runtime.SetFinalizer(timer, func(c *time.Timer) {
            println("done")
        })
        timer.Stop()
        runtime.GC()
        time.Sleep(time.Second)
    }
|
BTW, since Go 1.23, does the Stop method of a Timer still need to be called to make the Timer collectable as early as possible? |
|
@go101 The finalizer change sounds like a bug to me. Please open a new issue for that. Thanks. |
@ianlancetaylor I'm not sure whether or not it is a bug, because the comment in the PR mentions that a Timer object might not be allocated on the heap. Anyway, I created the issue. |
Thanks for creating the issue. Seems to me it is a bug either in the implementation or in the documentation. |
With this change, I'm not sure if it will be possible for fake clocks to be used for deterministic tests and behave similarly to the stdlib timer methods. Given some production code, say:

    func delayedPush(clock clock.Clock) {
        c := clock.Timer(time.Second)
        <-c
        push()
    }

And the following test:

    func TestDelayedPush(t *testing.T) {
        fake := clock.NewFake()
        go delayedPush(fake)
        fake.Advance(time.Second)
        if !pushed {
            t.Errorf("expected push")
        }
    }

This test is flaky: it passes if the background goroutine has created the timer before the fake advances time, but fails otherwise since. This is a really common issue so tests often use This problem can be solved with fakes that support blocking until the timer is created, e.g.,

    func TestDelayedPush(t *testing.T) {
        fake := clock.NewFake()
        go delayedPush(fake)
        fake.BlockUntil(1) // wait for background Timer call
        fake.Advance(time.Second)
        if !pushed {
            t.Errorf("expected push")
        }
    }

Prior to this change, since the channel was buffered, the I'm currently working on a fake clock library, primarily to help with writing deterministic, non-flaky tests, and with this change, I'm not sure that'll be possible within the current API surface. |
The code that uses the timers shouldn't care whether the channel is buffered or not. I don't see why a fake clock library would have to use an unbuffered channel. Please see #67434 for some recent ideas on how to test timers and the like. |
The fake can use a buffered channel, but its behavior will not be able to match the stdlib:
I was hoping to build a fake that matches the stdlib behavior closely for high fidelity testing. In this case, it's not a huge issue since the new stdlib behavior is less error-prone, and continues to work with code written against the previous model. After reading the time PR more closely, it's likely that the fake can emulate the stdlib more closely by using the expensive approach that the change emulates -- use a separate goroutine to write to the channel, cancelling it when |
To me it looks like a test for this code will be racy in any case - this just can't be solved by doing something with timers, because all you can get as a result is sync with |
…tation example no longer leads to deadlocks, as Timer/Ticker channels not receivable with old values after Stop or Reset returns. golang/go#27169 golang/go#37196 golang/go#14383
A proposal discussion in mid-2020 on #37196 decided to change time.Timer and time.Ticker so that their Stop and Reset methods guarantee that no old value (corresponding to the previous configuration of the Timer or Ticker) will be received after the method returns.

The trivial way to do this is to make the Timer/Ticker channels unbuffered, create a goroutine per Timer/Ticker feeding the channel, and then coordinate with that goroutine during Stop/Reset. Since Stop/Reset coordinate with the goroutine and the channel is unbuffered, there is no possibility of a stale value being sent after Stop/Reset returns.

Of course, we do not want an extra goroutine per Timer/Ticker, but that's still a good semantic model: behave like the channels are unbuffered and fed by a coordinating goroutine.

The actual implementation is more effort but behaves like the model. Specifically, the timer channel has a 1-element buffer like it always has, but len(t.C) and cap(t.C) are special-cased to return 0 anyway, so user code cannot see what's in the buffer except with a receive. Stop/Reset lock out any stale sends and then clear any pending send from the buffer.

Some programs will change behavior. For example:

	package main

	import "time"

	func main() {
		t := time.NewTimer(2 * time.Second)
		time.Sleep(3 * time.Second)
		if t.Reset(2*time.Second) != false {
			panic("expected timer to have fired")
		}
		<-t.C
		<-t.C
	}

This program (from #11513) sleeps 3s after setting a 2s timer, resets the timer, and expects Reset to return false: the Reset is too late and the send has already occurred. It then expects to receive two values: the one from before the Reset, and the one from after the Reset.

With an unbuffered timer channel, it should be clear that no value can be sent during the time.Sleep, so the time.Reset returns true, indicating that the Reset stopped the timer from going off. Then there is only one value to receive from t.C: the one from after the Reset.

In 2015, I used the above example as an argument against this change.

Note that a correct version of the program would be:

	func main() {
		t := time.NewTimer(2 * time.Second)
		time.Sleep(3 * time.Second)
		if !t.Reset(2*time.Second) {
			<-t.C
		}
		<-t.C
	}

This works with either semantics, by heeding t.Reset's result. The change should not affect correct programs.

However, one way that the change would be visible is when programs use len(t.C) (instead of a non-blocking receive) to poll whether the timer has triggered already. We might legitimately worry about breaking such programs.

In 2020, discussing #37196, Bryan Mills and I surveyed programs using len on timer channels. These are exceedingly rare to start with; nearly all the uses are buggy; and all the buggy programs would be fixed by the new semantics. The details are at [1].

To further reduce the impact of this change, this CL adds a temporary GODEBUG setting, which we didn't know about yet in 2015 and 2020. Specifically, asynctimerchan=1 disables the change and is the default for main programs in modules that use a Go version before 1.23. We hope to be able to retire this setting after the minimum 2-year window.

Setting asynctimerchan=1 also disables the garbage collection change from CL 568341, although users shouldn't need to know that since it is not a semantically visible change (unless we have bugs!).

As an undocumented bonus that we do not officially support, asynctimerchan=2 disables the channel buffer change but keeps the garbage collection change. This may help while we are shaking out bugs in either of them.

Fixes #37196.

[1] golang/go#37196 (comment)

Change-Id: I8925d3fb2b86b2ae87fd2acd055011cbf7bd5916
Reviewed-on: https://go-review.googlesource.com/c/go/+/568341
Reviewed-by: Austin Clements <[email protected]>
Auto-Submit: Russ Cox <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
In #14383 (comment), @rsc said:
As far as I can tell, no different bug was filed: the documentation for (*Timer).Stop still says:
go/src/time/sleep.go, lines 57 to 66 in 7d2473d
and that behavior is still resulting in subtle bugs (#27169). (Note that @rsc himself assumed that a select should work in this way in #14038 (comment).)
I think we should tighten up the invariants of (*Timer).Stop to eliminate this subtlety.
CC @empijei @matttproud @the80srobot @ianlancetaylor