Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[exporter/clickhouse] Integration test failing due to time out #32275

Closed
crobert-1 opened this issue Apr 9, 2024 · 13 comments
Closed

[exporter/clickhouse] Integration test failing due to time out #32275

crobert-1 opened this issue Apr 9, 2024 · 13 comments

Comments

@crobert-1
Copy link
Member

crobert-1 commented Apr 9, 2024

Component(s)

exporter/clickhouse

Describe the issue you're reporting

CI/CD link here.

Failure output:

Running target 'mod-integration-test' in module 'exporter/clickhouseexporter' as part of group 'exporter-0'
make --no-print-directory -C exporter/clickhouseexporter mod-integration-test
running go integration test ./... in /home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/exporter/clickhouseexporter
/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/.tools/gotestsum --rerun-fails=1 --packages="./..." -- -race -timeout 360s -parallel 4 -tags=integration,"" -run=Integration
go: downloading github.com/jmoiron/sqlx v1.3.5
go: downloading github.com/testcontainers/testcontainers-go v0.29.1
go: downloading dario.cat/mergo v1.0.0
go: downloading github.com/cpuguy83/dockercfg v0.3.1
go: downloading github.com/containerd/containerd v1.7.12
go: downloading github.com/moby/patternmatcher v0.6.0
go: downloading github.com/moby/term v0.5.0
go: downloading golang.org/x/exp v0.0.0-20230711023510-fffb14384f22
go: downloading go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.45.0
go: downloading github.com/containerd/log v0.1.0
go: downloading github.com/moby/sys/sequential v0.5.0
go: downloading github.com/morikuni/aec v1.0.0
go: downloading github.com/felixge/httpsnoop v1.0.3
go: downloading github.com/moby/sys/user v0.1.0
∅  internal/metadata
∅  internal (1.018s)
✖  . (6m0.053s)

=== Failed
=== FAIL: .  (0.00s)
panic: test timed out after 6m0s
ERROR rerun aborted because previous run had a suspected panic and some test may not have run
running tests:
	TestIntegration (6m0s)
	TestIntegration/test_clickhouse_23-alpine (5m28s)

goroutine 172 [running]:
testing.(*M).startAlarm.func1()
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:2259 +0x259
make[2]: *** [../../Makefile.Common:[142](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/8619482433/job/23624324767?pr=32271#step:5:143): mod-integration-test] Error 3
make[1]: *** [Makefile:165: exporter/clickhouseexporter] Error 2
make: *** [Makefile:122: gointegration-test] Error 2
created by time.goFunc
	/opt/hostedtoolcache/go/1.21.9/x64/src/time/sleep.go:176 +0x45

goroutine 1 [chan receive, 6 minutes]:
testing.(*T).Run(0xc0000a2820, {0x182feec, 0xf}, 0x18f1488)
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:1649 +0x871
testing.runTests.func1(0x0?)
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:2054 +0x85
testing.tRunner(0xc0000a2820, 0xc0005a7998)
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:1595 +0x262
testing.runTests(0xc00030d860?, {0x238a6e0, 0x14, 0x14}, {0xc0005a7a68?, 0x448785?, 0x2399080?})
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:2052 +0x8ae
testing.(*M).Run(0xc00030d860)
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:1925 +0xcd8
go.uber.org/goleak.VerifyTestMain({0x1ab2620, 0xc00030d860}, {0x0, 0x0, 0x0})
	/home/runner/go/pkg/mod/go.uber.org/[email protected]/testmain.go:53 +0x65
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhouseexporter.TestMain(...)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/exporter/clickhouseexporter/package_test.go:13
main.main()
	_testmain.go:89 +0x317

goroutine 20 [chan receive, 5 minutes]:
testing.(*T).Run(0xc0000a29c0, {0x183a982, 0x19}, 0xc0001f8a50)
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:1649 +0x871
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhouseexporter.TestIntegration(0x0?)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/exporter/clickhouseexporter/integration_test.go:44 +0x17a
testing.tRunner(0xc0000a29c0, 0x18f1488)
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:1595 +0x262
created by testing.(*T).Run in goroutine 1
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:1648 +0x846

goroutine 180 [IO wait]:
internal/poll.runtime_pollWait(0x7f36d170ec90, 0x72)
	/opt/hostedtoolcache/go/1.21.9/x64/src/runtime/netpoll.go:343 +0x85
internal/poll.(*pollDesc).wait(0xc000412520, 0xc000410000?, 0x0)
	/opt/hostedtoolcache/go/1.21.9/x64/src/internal/poll/fd_poll_runtime.go:84 +0xb1
internal/poll.(*pollDesc).waitRead(...)
	/opt/hostedtoolcache/go/1.21.9/x64/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc000412500, {0xc000410000, 0x1000, 0x1000})
	/opt/hostedtoolcache/go/1.21.9/x64/src/internal/poll/fd_unix.go:164 +0x405
net.(*netFD).Read(0xc000412500, {0xc000410000, 0x1000, 0x1000})
	/opt/hostedtoolcache/go/1.21.9/x64/src/net/fd_posix.go:55 +0x4b
net.(*conn).Read(0xc0003a61f0, {0xc000410000, 0x1000, 0x1000})
	/opt/hostedtoolcache/go/1.21.9/x64/src/net/net.go:179 +0xad
net/http.(*persistConn).Read(0xc0003bb0e0, {0xc000410000, 0x1000, 0x1000})
	/opt/hostedtoolcache/go/1.21.9/x64/src/net/http/transport.go:1954 +0x105
bufio.(*Reader).fill(0xc00039f020)
	/opt/hostedtoolcache/go/1.21.9/x64/src/bufio/bufio.go:113 +0x29a
bufio.(*Reader).ReadSlice(0xc00039f020, 0xd1?)
	/opt/hostedtoolcache/go/1.21.9/x64/src/bufio/bufio.go:379 +0x85
net/http/internal.readChunkLine(0xc000410000?)
	/opt/hostedtoolcache/go/1.21.9/x64/src/net/http/internal/chunked.go:156 +0x34
net/http/internal.(*chunkedReader).beginChunk(0xc00050c030)
	/opt/hostedtoolcache/go/1.21.9/x64/src/net/http/internal/chunked.go:49 +0x5a
net/http/internal.(*chunkedReader).Read(0xc00050c030, {0xc0002586a4, 0x195c, 0x195c})
	/opt/hostedtoolcache/go/1.21.9/x64/src/net/http/internal/chunked.go:125 +0x2b4
net/http.(*body).readLocked(0xc000512000, {0xc0002586a4, 0x195c, 0x195c})
	/opt/hostedtoolcache/go/1.21.9/x64/src/net/http/transfer.go:839 +0xa9
net/http.(*body).Read(0xc000512000, {0xc0002586a4, 0x195c, 0x195c})
	/opt/hostedtoolcache/go/1.21.9/x64/src/net/http/transfer.go:831 +0x185
net/http.(*bodyEOFSignal).Read(0xc000512040, {0xc0002586a4, 0x195c, 0x195c})
	/opt/hostedtoolcache/go/1.21.9/x64/src/net/http/transport.go:2810 +0x[143](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/8619482433/job/23624324767?pr=32271#step:5:144)
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*wrappedBody).Read(0xc000426040, {0xc0002586a4, 0x195c, 0x195c})
	/home/runner/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/net/http/[email protected]/transport.go:173 +0x6b
io.ReadAll({0x7f36d137d2a8, 0xc0002c2040})
	/opt/hostedtoolcache/go/1.21.9/x64/src/io/io.go:704 +0x87
github.com/testcontainers/testcontainers-go.(*DockerProvider).attemptToPullImage(0xc000114f70, {0x1abe768, 0x23cb500}, {0x184b098, 0x26}, {0x0, {0xc00048a0d0, 0xd0}, 0x0, {0x0, ...}})
	/home/runner/go/pkg/mod/github.com/testcontainers/[email protected]/docker.go:1272 +0x392
github.com/testcontainers/testcontainers-go.(*DockerProvider).CreateContainer(_, {_, _}, {{{0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}, {0x0, ...}, ...}, ...})
	/home/runner/go/pkg/mod/github.com/testcontainers/[email protected]/docker.go:1044 +0x1f68
github.com/testcontainers/testcontainers-go.GenericContainer({_, _}, {{{{0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}, ...}, ...}, ...})
	/home/runner/go/pkg/mod/github.com/testcontainers/[email protected]/generic.go:73 +0x350
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhouseexporter.getContainer(_, {{{0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}, {0x0, ...}, ...}, ...})
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/exporter/clickhouseexporter/integration_test.go:78 +0xd4
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhouseexporter.TestIntegration.func1(0xc00040a000)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/exporter/clickhouseexporter/integration_test.go:52 +0x308
testing.tRunner(0xc00040a000, 0xc0001f8a50)
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:1595 +0x262
created by testing.(*T).Run in goroutine 20
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:1648 +0x846

goroutine 194 [chan receive, 5 minutes]:
github.com/testcontainers/testcontainers-go.(*Reaper).Connect.func1({0x1ac69d0?, 0xc00006e6e8})
	/home/runner/go/pkg/mod/github.com/testcontainers/[email protected]/reaper.go:358 +0xa0e
created by github.com/testcontainers/testcontainers-go.(*Reaper).Connect in goroutine 180
	/home/runner/go/pkg/mod/github.com/testcontainers/[email protected]/reaper.go:323 +0x2ec

goroutine 183 [select, 5 minutes]:
net/http.(*persistConn).writeLoop(0xc0003bb0e0)
	/opt/hostedtoolcache/go/1.21.9/x64/src/net/http/transport.go:2421 +0x1bc
created by net/http.(*Transport).dialConn in goroutine 181
	/opt/hostedtoolcache/go/1.21.9/x64/src/net/http/transport.go:1777 +0x266b

goroutine 182 [select, 5 minutes]:
net/http.(*persistConn).readLoop(0xc0003bb0e0)
	/opt/hostedtoolcache/go/1.21.9/x64/src/net/http/transport.go:2238 +0x[153](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/8619482433/job/23624324767?pr=32271#step:5:154)d
created by net/http.(*Transport).dialConn in goroutine 181
	/opt/hostedtoolcache/go/1.21.9/x64/src/net/http/transport.go:1776 +0x25da
FAIL	github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhouseexporter	360.053s

=== FAIL: . TestIntegration/test_clickhouse_23-alpine (unknown)

=== FAIL: . TestIntegration (unknown)
@crobert-1 crobert-1 added the needs triage New item requiring triage label Apr 9, 2024
Copy link
Contributor

github-actions bot commented Apr 9, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@crobert-1 crobert-1 added the bug Something isn't working label Apr 9, 2024
@crobert-1
Copy link
Member Author

This may just be a test issue, but I'm labelling as a bug until it's been investigated 👍

@SpencerTorres
Copy link
Member

I might have seen this even when running locally. The tests do pass, this is likely a one-off error with a container failing to start in time.

Is this the first/only run for the CI? @crobert-1

Any thoughts on how we could make this more robust? @Frapschen

@crobert-1
Copy link
Member Author

That's the first run on that PR. I think I might have seen it before, but I don't definitively remember. I'm aware of general stability issues around testcontainers-go, and from the stack trace it looks like the panic might have been hit because the docker image itself wasn't able to be downloaded in time, potentially.

github.com/testcontainers/testcontainers-go.(*DockerProvider).attemptToPullImage(0xc000114f70, {0x1abe768, 0x23cb500}, {0x184b098, 0x26}, {0x0, {0xc00048a0d0, 0xd0}, 0x0, {0x0, ...}})

Just a couple initial thoughts.

@SpencerTorres
Copy link
Member

looks like the panic might have been hit because the docker image itself wasn't able to be downloaded in time

This was my interpretation as well. It's what usually happens when docker integration tests fail. If the CI has a shared image cache we could try using an exact image hash instead of always pulling the latest for that tag. Quickest workaround for now is to re-run and see if docker is quicker.

Maybe we can shorten the timeout for testcontainers-go so that it can retry instead of hanging? I don't like the solution being "just run it again", I would like consistency in the CI

@crobert-1
Copy link
Member Author

Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Jun 25, 2024
@SpencerTorres
Copy link
Member

Should be investigated with #32530

As suggested in a comment over there these could probably be fixed in a single PR

Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Aug 26, 2024
@SpencerTorres
Copy link
Member

Similar to #32530, still needs investigation

@github-actions github-actions bot removed the Stale label Aug 27, 2024
@atoulme atoulme added flaky test a test is flaky and removed needs triage New item requiring triage labels Sep 3, 2024
Copy link
Contributor

github-actions bot commented Nov 4, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Nov 4, 2024
Copy link
Contributor

github-actions bot commented Jan 3, 2025

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jan 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants