Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent duplicate telemetry when using both library and auto instrumentation #903

Open
trask opened this issue Aug 6, 2020 · 11 comments · Fixed by #2661
Open

Prevent duplicate telemetry when using both library and auto instrumentation #903

trask opened this issue Aug 6, 2020 · 11 comments · Fixed by #2661

Comments

@trask
Copy link
Member

trask commented Aug 6, 2020

I think this is an important, e.g. you may start out using library instrumentation, and then decide to throw on the javaagent later (or your ops team decides to throw on the javaagent), and I think part of the goal of maintaining both library and auto instrumentation in the same repo is so we can make them play nicely together.

(this includes preventing duplicate @WithSpan telemetry when using both spring-aop instrumentation module and javaagent)

@anuraaga
Copy link
Contributor

anuraaga commented Aug 6, 2020

For reference, I guess this is along the lines of the gRPC logic to prevent the interceptor from being added if it already exists

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/master/instrumentation/grpc-1.5/src/main/java/io/opentelemetry/auto/instrumentation/grpc/server/GrpcServerBuilderInstrumentation.java#L71

which interestingly seems extremely unlikely until we extract the library instrumentation :D

@anuraaga
Copy link
Contributor

anuraaga commented Aug 6, 2020

Also I guess we can have a pattern of including a duplicate test in the auto project which configures the same way the library test does, and it should still pass (currently Armeria wouldn't I think, heh)

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/master/instrumentation/armeria-1.0/auto/src/test/groovy/io/opentelemetry/auto/instrumentation/armeria/v1_0/ArmeriaServerTest.groovy#L28

@trask
Copy link
Member Author

trask commented Aug 6, 2020

If possible, I think ideally the auto-instrumentation would take precedence over the library instrumentation (where they are both official packages from this repo). The reason for this is that it's generally easier to upgrade to the latest javaagent (e.g. done by ops to fix a bug or get a new feature) than to upgrade to the latest library instrumentation.

[EDIT: just adding thoughts here]

@trask
Copy link
Member Author

trask commented Aug 8, 2020

[adding more thoughts here]

see #920 (comment):

another option for suppressing library instrumentation that might be easier to generalize, is to make the coordination more explicit, e.g. have a method in instrumentation-api like isAutoEnabled(instrumentationName), and the library instrumentation can explicitly check that to see if there is auto instrumentation with the same name enabled

@anuraaga
Copy link
Contributor

anuraaga commented Aug 8, 2020

That's a nice idea. I'm a bit on the fence about introducing logic in library instrumentation to support auto - it's harmless but does seem like a break in separations of concerns.

But it gives me the idea that I didn't have at all, instead of instrumenting the library to suppress, I could instrument the library instrumentation itself. So stub out newdecorator to return a no op, using byte buddy. It still has the problems of shading the constant and the tediousness of replacing a method in byte buddy so not sure if it's worth it just for the concern of separation of concerns.

@trask
Copy link
Member Author

trask commented Aug 8, 2020

I'm a bit on the fence about introducing logic in library instrumentation to support auto

ya i understand. the way i view is that it's also a big selling point of this project, that we are developing auto and library instrumentation in concert, so that they (1) produce the same telemetry and (2) play nice together. so, not so strange that we would produce "auto-compatible" library instrumentation

@mateuszrzeszutek
Copy link
Member

If choosing library instrumentation over javaagent instrumentation is an option, we could add a classloader matcher to javaagent instrumentations that would prevent them from being applied if the library instrumentation is already present on the classpath (preferably in a separate method, e.g. libraryInstrumentationMatcher()).
The downside of this is that library instrumentation will always win. On another hand, calling sth like isAutoEnabled(instrumentationName) may not be really feasible in library instrumentation that's applied using SPI (like log4j or AWS SDK), but preventing the javaagent instrumentation in this case is trivial.

@trask
Copy link
Member Author

trask commented Mar 4, 2021

We discussed in SIG meeting today and agreed to have the javaagent instrumentation back off, and have the library instrumentation take precedence, because

  • In the future when some libraries (hopefully) start emitting OTel directly, we will want to do the same for that instrumentation (have javaagent back off)
  • It's easy to implement
  • The user may have configured the library instrumentation (e.g. set certain options in builder), and it's nice to respect those settings
  • If user really wants javaagent to take precedence, they have a reasonable workaround to remove the library instrumentation

@mateuszrzeszutek
Copy link
Member

Turning off instrumentation just based on the library class presence is not a good solution for most cases; reopening.

We'll probably have to stick with shading library instrumentation classes and have javaagent advices detect them manually; I don't have any idea for a common solution to this problem.

@lmolkova
Copy link
Contributor

lmolkova commented Jun 16, 2021

I'm working on Azure SDK instrumentation and have a problem with double instrumentation on the HTTP layer. Here's the scenario:

  • we want to support users who auto-instrument and those who don't - it's not in our control (so we instrument HTTP for them)
  • we instrument public logical API calls like uploading a blob (so users know what happens on the public surface of the library and can map spans to their code). Public API calls can translate into a number of HTTP calls (retries, auth, complex operations).
  • we want to provide additional details on top of HTTP client span that auto-instrumentation would not include: they are specific to Azure and are public Azure service properties users need to get support. Additionally, we occasionally want to strip some sensitive parameters. They are per-http request and have no place on parent logical span

So what we have now is:

  • [manual/agent] SERVER span
    • [with/without agent] uploadBlob span (INTERNAL)
      • [with/without agent] HTTP spans (CLIENT) with extra stuff specific to Azure
    • [agent] HTTP spans (parent depends on how we create HTTP span in the client lib, so it's flexible)

Suppressing instrumentation in the way #2661 was done, doesn't help - we want HTTP to be suppressed. Suppressing all HTTP (netty) instrumentation either - users still use netty for their calls.

Solutions:

  1. Final instrumentation. One idea we've previously entertained with OpenCensus community is terminal span - I have somewhat outdated proposal - as a way to coordinate instrumentations

  2. Backoff if the context is injected. Another idea that may be a short-term mitigation step: we'll have more issues like that with any protocol. If the context is already injected, there is nothing good this layer of instrumentation can do - everything would be wrong, just back-off

Since this is not a language-specific issue, I'll create an issue in spec repo.

@lmolkova
Copy link
Contributor

One more thing that probably won't work is #1822.
Suppressing nested CLIENT spans is some hidden contract that would be very hard to explain to users and a huge support burden - users would manually instrument and would not see their spans.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants