Issue with Opamp after operator upgrade on 0.44.2 version #985

Open
flenoir opened this issue Dec 21, 2023 · 6 comments

@flenoir

flenoir commented Dec 21, 2023

Hi,

I'm trying to upgrade the operator to version 0.44.2 of the Helm chart.

I get some errors regarding the OpAMP bridge but couldn't find how to solve them.

The pod logs report:

`{"level":"error","ts":"2023-12-21T13:20:30Z","logger":"controller-runtime.source.EventHandler","msg":"failed to get informer from cache","error":"failed to get API group resources: unable to retrieve the complete list of server APIs: autoscaling/v2: the server could not find the requested resource","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/source/kind.go:68\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2\n\t/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/loop.go:73\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext\n\t/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/loop.go:74\nk8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel\n\t/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/poll.go:33\nsigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/source/kind.go:56"}

`
{"level":"error","ts":"2023-12-21T13:21:30Z","msg":"Could not wait for Cache to sync","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","error":"failed to wait for opentelemetrycollector caches to sync: timed out waiting for cache to be synced for Kind *v2.HorizontalPodAutoscaler","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.1\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:203\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:208\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234\nsigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/runnable_group.go:223"}
{"level":"error","ts":"2023-12-21T13:21:30Z","logger":"setup","msg":"problem running manager","error":"failed to wait for opentelemetrycollector caches to sync: timed out waiting for cache to be synced for Kind *v2.HorizontalPodAutoscaler","stacktrace":"main.main\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/main.go:311\nruntime.main\n\t/opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:267"}

I uninstalled and re-installed everything and patched the CRDs, but I can't make it work. I found this issue
but am still struggling with the CR YAML file. I found this example => #938 (comment)

apiVersion: opentelemetry.io/v1alpha1
kind: OpAMPBridge
metadata:
  name: test
spec:
  image: "ghcr.io/open-telemetry/opentelemetry-operator/operator-opamp-bridge:v0.88.0"
  endpoint: ws://opamp-server:4320/v1/opamp
  capabilities:
    AcceptsOpAMPConnectionSettings: true
    AcceptsOtherConnectionSettings: true
    AcceptsRemoteConfig: true
    AcceptsRestartCommand: true
    ReportsEffectiveConfig: true
    ReportsHealth: true
    ReportsOwnLogs: true
    ReportsOwnMetrics: true
    ReportsOwnTraces: true
    ReportsRemoteConfig: true
    ReportsStatus: true
  componentsAllowed:
    receivers:
    - otlp
    processors:
    - memory_limiter
    exporters:
    - logging

but I'm still unsure about the endpoint.

Should it be a WebSocket (ws://)? Should the port stay at 4320, given that my service only seems to expose port 80?

An example of an OpAMP CR file would be helpful.
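
For reference, here is a minimal sketch of what I think the endpoint would need to look like if the Service only exposes port 80 (the Service name opamp-server and the port are assumptions on my side, not values from the docs):

spec:
  # ws:// keeps the WebSocket transport; the host and port must match whatever
  # the Service in front of the OpAMP server actually exposes (assumed here: 80).
  endpoint: ws://opamp-server:80/v1/opamp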

@JaredTan95
Member

Please follow open-telemetry/opentelemetry-operator#2314.

@flenoir
Author

flenoir commented Dec 21, 2023

Yes, I uninstalled the operator through Helm. I also updated the 3 CRDs manually but still have the issue. Should the CRDs be applied before or after the Helm update? (I've put a rough sketch of what I mean after the manifest below.)

Then, should a custom resource like the one below be applied?

apiVersion: opentelemetry.io/v1alpha1
kind: OpAMPBridge
metadata:
  name: otelbridge
spec:
  image: "ghcr.io/open-telemetry/opentelemetry-operator/operator-opamp-bridge:v0.90.0"
  endpoint: ws://opamp-server:4320/v1/opamp
  capabilities:
    AcceptsOpAMPConnectionSettings: true
    AcceptsOtherConnectionSettings: true
    AcceptsRemoteConfig: true
    AcceptsRestartCommand: true
    ReportsEffectiveConfig: true
    ReportsHealth: true
    ReportsOwnLogs: true
    ReportsOwnMetrics: true
    ReportsOwnTraces: true
    ReportsRemoteConfig: true
    ReportsStatus: true
  componentsAllowed:
    receivers:
      - otlp
      - jaeger
      - kafka/traces_fab
      - kafka/traces_prod
      - zipkin
      - kafka/metrics_fab
      - kafka/metrics_prod
      - prometheus/receiver
    processors:
      - memory_limiter
      - span/statuscode
    exporters:
      - debug
      - otlp/tempo
      - otlphttp
      - prometheusremotewrite
      - otlp/vm
      - otlphttp
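
For the ordering question above, this is roughly what I mean (a sketch; the open-telemetry repo alias, release name, and namespace are assumptions):

# helm upgrade does not refresh CRDs that already exist, so re-apply the CRDs
# shipped with the target chart version first, then upgrade the release itself.
helm repo update
helm show crds open-telemetry/opentelemetry-operator --version 0.44.2 | kubectl apply --server-side -f -
helm upgrade opentelemetry-operator open-telemetry/opentelemetry-operator --version 0.44.2 -n opentelemetry-operator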

@TylerHelmuth
Member

Can you try completely removing the CRDs?
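
Something like this (a sketch; these are the CRD names the operator normally installs, and deleting a CRD also deletes every custom resource of that kind, so back them up first):

# WARNING: deleting a CRD deletes all custom resources of that kind.
kubectl delete crd opentelemetrycollectors.opentelemetry.io
kubectl delete crd opampbridges.opentelemetry.io
kubectl delete crd instrumentations.opentelemetry.io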

@JaredTan95
Member

Then, should a custom resource like the one below be applied?

Not necessarily, and I don't think that's the root cause of the error.

@Duanjax

Duanjax commented Apr 1, 2024

I'm having the same issue: "failed to wait for opampbridge caches to sync: timed out waiting for cache to be synced for Kind *v1alpha". The manager container restarts every couple of minutes.

I tried the solution in open-telemetry/opentelemetry-operator#2314, but it doesn't seem to work.

Any updates here?

@alibahramian

It seems a missing autoscaling/v2 API in the Kubernetes cluster is the issue; in my case I only had autoscaling/v1. You can check it with kubectl api-versions | grep autoscaling.
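
For example (a sketch; the exact output depends on the cluster version, and autoscaling/v2 only exists on Kubernetes 1.23 and newer):

kubectl api-versions | grep autoscaling
# Expected on a cluster this operator version supports (hypothetical output):
#   autoscaling/v1
#   autoscaling/v2
# If only autoscaling/v1 or a v2beta* version is listed, the operator's HPA
# watch cannot sync and the cache errors above are expected.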
