Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fatal error: runtimer: bad p when resetting the value of a ticker pointer #45008

Closed
kyrozetera opened this issue Mar 15, 2021 · 7 comments
Closed
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

Comments

@kyrozetera
Copy link

What version of Go are you using (go version)?

$ go version
go version go1.15.10 linux/amd64

Does this issue reproduce with the latest release?

Yes, tried this on 1.14, 1.15, and 1.16

What operating system and processor architecture are you using (go env)?

Running on Ubuntu 20.04 kernel version 5.4.0-66-generic
Docker container running Debian 10
This was done using the base docker image golang:1.15

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build893808163=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Dockerfile

FROM golang:1.15 as final

RUN apt-get update
RUN apt-get install git gcc 

WORKDIR /go/src/app
COPY . .

RUN go get -d -v ./...
RUN go build test_timer.go

CMD ["./test_timer"]

test_timer.go

package main

import (
    "sync"
    "time"
)

func main() {
    testTicker()
}

func testTicker() {
    waitGroup := sync.WaitGroup{}

    for i := 0; i <= 2; i++ {
        waitGroup.Add(1)
        go func () {
            defer waitGroup.Done()
            timeout := 10 * time.Millisecond
            ticker := time.NewTicker(timeout)

            for {
                select {
                case <-ticker.C:
                    ticker.Stop()
                    *ticker = *time.NewTicker(timeout)
                }   
            }   
        }() 
    }   

    waitGroup.Wait()
}

This is the simplest code I could build to reproduce the error. This code runs fine on my local machine (Ubuntu 20.04), but throws a fatal error: runtimer: bad p in the golang:1.15 docker container. It only happens if the code is built on the docker container. Building it locally then copying to the docker container doesn't trigger the issue, so it could be related to the version of shared libs on the distro. Adding apt upgrade -y to the Dockerfile didn't help.
It happens regardless of what kernel it's built on. Same thing happens on an amazon linux 2 ~4.14 machine.

I was doing *ticker = *time.NewTicker(timeout) instead of ticker = time.NewTicker(timeout) because the reset was happening inside a function call. Workaround was to reorganize so that it returns the reset ticker instead. Go 1.15 also introduces the Ticker.Reset function so that can be used moving forward (the existing codebase is on 1.14), but wanted to report this issue in case it helps someone else who runs into a similar issue, or it's evidence of something else going on.

What did you expect to see?

The program run and reset the ticker for the next iteration.

What did you see instead?

fatal error: runtimer: bad p

runtime stack:
runtime.throw(0x488b61, 0xf)
	/usr/local/go/src/runtime/panic.go:1116 +0x72
runtime.runtimer(0xc00001c000, 0x162b8b19476c8, 0x7f7bdaebe108)
	/usr/local/go/src/runtime/time.go:751 +0x28c
runtime.checkTimers(0xc00001c000, 0x162b8b19476c8, 0x162b8b05efb89, 0x0, 0x0)
	/usr/local/go/src/runtime/proc.go:2782 +0x12c
runtime.findrunnable(0xc00001c000, 0x0)
	/usr/local/go/src/runtime/proc.go:2206 +0x8c5
runtime.schedule()
	/usr/local/go/src/runtime/proc.go:2683 +0x2d7
runtime.park_m(0xc000001380)
	/usr/local/go/src/runtime/proc.go:2851 +0x9d
runtime.mcall(0x0)
	/usr/local/go/src/runtime/asm_amd64.s:318 +0x5b

goroutine 1 [semacquire]:
sync.runtime_Semacquire(0xc000018048)
	/usr/local/go/src/runtime/sema.go:56 +0x45
sync.(*WaitGroup).Wait(0xc000018040)
	/usr/local/go/src/sync/waitgroup.go:130 +0x65
main.testTicker()
	/go/src/app/test_timer.go:32 +0x95
main.main()
	/go/src/app/test_timer.go:9 +0x25

goroutine 5 [chan receive]:
main.testTicker.func1(0xc000018040)
	/go/src/app/test_timer.go:24 +0x85
created by main.testTicker
	/go/src/app/test_timer.go:17 +0x78

goroutine 6 [chan receive]:
main.testTicker.func1(0xc000018040)
	/go/src/app/test_timer.go:24 +0x85
created by main.testTicker
	/go/src/app/test_timer.go:17 +0x78

goroutine 7 [chan receive]:
main.testTicker.func1(0xc000018040)
	/go/src/app/test_timer.go:24 +0x85
created by main.testTicker
	/go/src/app/test_timer.go:17 +0x78
@golang golang deleted a comment Mar 15, 2021
@davecheney
Copy link
Contributor

davecheney commented Mar 15, 2021

Smaller reproduction

package main

import (
        "time"
)

func main() {
        const N = 3 // larger numbers trigger panic more reliably
        for i := 0; i <= N; i++ {
                go func() {
                        timeout := 10 * time.Millisecond
                        ticker := time.NewTicker(timeout)

                        for {
                                select {
                                case <-ticker.C:
                                        ticker.Stop()
                                        *ticker = *time.NewTicker(timeout)
                                }
                        }
                }()
        }

        select {}
}

@davecheney
Copy link
Contributor

I'm almost certain that *ticker = *time.NewTicker(timeout) is the culprit, but the race detector doesn't seem to complain -- or the program crashes before it can open its mouth.

@davecheney davecheney added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 15, 2021
@kyrozetera
Copy link
Author

Oh, yeah, I've had many iterations of this code. *ticker = *time.NewTicker(timeout) is the only constant. Thanks for removing the waitgroup here, I was going to do that and forgot.

Also, not sure what the playground is running on, but I just tried it and it blows up there too: https://play.golang.org/p/SrcOyRpdncx

@davecheney
Copy link
Contributor

*ticker = *time.NewTicker(timeout)

why do you want to overwrite the destination of the ticker pointer with a copy of the results of NewTicker? Why doesn't ticker = time.NewTicker(timeout) work for you?

@kyrozetera
Copy link
Author

Because I was originally passing the ticker into another function that had logic to determine if the ticker needed to be reset at a certain point. It was working locally so I didn't think anything of it, then it started crashing on staging and it took a while to track down that that was the issue.
I've reorganized the code and the function can now return the updated ticker instead of overwriting it.
But I wanted to report the issue as it was difficult to track down, being environment specific. If it's not something that needs fixing (especially with the ticker.Reset option in newer versions), it could be worth documenting at least.

@ericlagergren
Copy link
Contributor

If it's not something that needs fixing (especially with the ticker.Reset option in newer versions), it could be worth documenting at least.

It already is: if you have a data race in your program, all bets are off. It's not unique to time.Ticker.

https://blog.golang.org/race-detector

@ianlancetaylor
Copy link
Member

Sorry, you can't use time.Ticker in this way. The time.Ticker type has internal information that this code is clobbering. It's true that this got worse in the 1.14 release, but it was never supported. I'm going to close this issue because I don't think there is anything we can change to fix it. Please comment if you disagree.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants