-
Notifications
You must be signed in to change notification settings - Fork 895
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add context propagation requirements to HTTP conventions #1783
Conversation
Open questions:
|
@@ -111,6 +112,11 @@ from the `net.peer.name` | |||
used to look up the `net.peer.ip` that is actually connected to. | |||
In that case it is strongly recommended to set the `net.peer.name` attribute in addition to `http.host`. | |||
|
|||
### Context propagation | |||
|
|||
- context created for HTTP client span MUST be injected on outgoing request using configured [propagator](../../context/api-propagators.md) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this addition can be for not just http, but every instrumentation dealing with out-of-proc communication?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, if this direction is supported by the community, I can update other relevant specs, I can also update propagators doc to mention it.
### Context propagation | ||
|
||
- context created for HTTP client span (if valid) MUST be injected on outgoing request header using configured [propagator](../../context/api-propagators.md) | ||
- Exception: if outgoing HTTP request already has valid context (for configured propagator), it cannot be changed and new span MUST NOT be recorded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I oppose this change. #1738 made in clear and explicit that nested CLIENT spans are allowed. There are totally valid cases for having them: e.g. DB call which used HTTP as its transport. In this case I want both a span with DB semantic convention and spans with HTTP semantic conventions.
I totally agree that only the outer-most or first CLIENT span must have its context propagated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point and agree nested client spans should be allowed.
This change does not prohibit nested client spans. This is an edge case of 2 HTTP instrumentations fighting for the same request. Arguably there is no real use-case behind 2 identical spans created on different API levels.
At the same time, DB spans should not inject context into http requests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is worth clarifying in this spec that only HTTP instrumentation should inject context into http request? I.e. if OTel instrumentation injects headers into HTTP requests it MUST also create an HTTP span?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is worth clarifying in this spec that only HTTP instrumentation should inject context into http request? I.e. if OTel instrumentation injects headers into HTTP requests it MUST also create an HTTP span?
Why? Let's take another example: messaging system with HTTP transport. PRODUCER span is created by it and it definitely has to be injected and propagated. Hm, but it probably will be injected into the message and not http request... But then do we want for http instrumentation to inject its context into http headers? Probably not?
Arguably there is no real use-case behind 2 identical spans created on different API levels
Agree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But then do we want for http instrumentation to inject its context into http headers? Probably not?
We probably don't need it but it probably won't hurt if we have it in both HTTP and the message.
As a special case, for AWS messaging systems (SQS, SNS), setting the right HTTP header will actually end up being translated to a message property. https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-message-metadata.html#sqs-message-system-attributes
With messaging, you have to allow separate context on message and transport. Reasons:
Service meshes that handle messaging are the ultimate example of all the above.
Assuming there is instrumentation for all almost HTTP clients in nearly every language, having consistent high-quality spans created for HTTP requests will bring a better user experience - vendors always know how to visualize it, build metrics and alerts based on it. |
@iNikem @bogdandrutu please take a look. |
|
||
`Context` created for HTTP client span (if valid) MUST be injected on outgoing request headers using configured [propagator](../../context/api-propagators.md). Instrumentation that injects `Context` into HTTP request headers MUST also emit a span that complies with other client requirements in this specification. | ||
|
||
**Exception**: if outgoing HTTP request already has valid `Context` (for configured propagator), it cannot be changed and new span MUST NOT be recorded. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can this be determined using the current OpenTelemetry API? Would all http client instrumentation be required to query the propagator, grab the fields()
and check to see if they exist already? How would that work if the http request object was being re-used and the fields were left over from a previous request? (see the documentation on the fields()
function of the propagator to see what I'm referring to).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can this be determined using the current OpenTelemetry API?
I was thinking about having this logic within the propagator and have 2 options:
- Sub-optimal, no propagator change requires: instrumentation calls
extract
and checks if it got validcontext
. It's cheap when there is one layer of instrumentation, but if there are multiple, it involves extracontext
parsing and allocation. - Optimal: add
CanExtract
(naming is always hard) method that only checks for context presence and may do minimum validity checks.
This way decision on context presence and validity stays within the propagator and multiple HTTP (and other protocol) instrumentations don't have to do it.
I suggest starting with 1 as perf hit affects relatively rare scenarios (multiple instrumentation layers) and is not very big in the general case of single transparent
. Optimization (option 2) can come later.
I'm also open to starting with option 2 right away add a new method on propagators if there is any consensus around it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would that work if the http request object was being re-used and the fields were left over from a previous request?
Great point! I believe the current spec requires to clean up the context for the reused carriers, Is there an assumption that it's not always possible?
I think this is also worth mentioning here that context should be cleaned up after HTTP try if the HTTP client allows reusing the same request instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spec says "successive calls should clear these fields first.", so your proposed call to extract
would still be operating on the previous contents of the headers, meaning that the resulting Context would always be valid, so I don't think this is a feasible approach, with either options 1 or 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, you're right. what's the feeling about changing it to clean up after rather than before? this way everyone cleans up for what they've done and never breaks anything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh. I have no idea...I don't think I've ever seen a web client that re-uses request objects like this, so I don't have the context for what it's even referring to. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems it was there from the very beginning (#147) and there was no discussion/explanation around this choice. It also seems that at least golang http client allows to reuse http request instances between tries. I'll raise it as a separate issue and until then this PR is blocked.
Ok, I agree with you about messaging use-case. Different contexts propagated via message and via http connection make sense. Although I am not sure our current ecosystem (both instrumentations and backends) handles this correctly. But I am not convinced about tying together "should I create a new CLIENT span" and "should I propagate this span". This proposal forbids creating new CLIENT span if it will not be propagated. Take the following use-case.
Both libraries have auto-instrumentations in Java. I argue that the valid outcome should be the following:
This proposal forbids creating those nested spans by Netty. I think it is wrong. |
My understanding of this PR is that there is some desire to handle competing "http" client spans; to collapse them into one sensible span rather than multiple duplicate spans. We've been exploring this in opentelemetry-ruby, trying to de-duplicate http client spans in our auto-instrumentation. We have an approach that works well for us, by modifying the context around a block of code. Then, our HTTP auto-instrumentation libraries merge those context values into the attributes of the outgoing client span. The result is one outgoing span, annotated with relevant attributes from a higher-level instrumentation. In pseudocode, it looks like: # Higher-level HTTP-like instrumentation
OpenTelemetry::Common::HTTP::ClientContext.with_attributes({ foo: "bar" }) do
# do work, usually calling `super`
end # Lower-level HTTP client instrumentation
attributes = OpenTelemetry::Common::HTTP::ClientContext.attributes
tracer.in_span("GET", kind: :client, attributes: attributes.merte({ other: "attributes" }) do |span|
# do more work
end You can see this in practice in the "higher-level" koala instrumentation (a facebook HTTP client, I believe), and the "lower-level" Net::HTTP instrumentation (standard ruby HTTP client, used by many things). The context modification bits can be found here. I would submit this as an alternate approach to prohibiting nested spans; this allows instrumentation that knows it is just an extremely thin wrapper around an HTTP call to decorate spans with relevant attributes without confusing duplication. This approach has downsides and does not solve all of the things you're setting out to do, but I think it is worth considering. |
@iNikem thanks for the useful input!
I think there is an assumption that retries and redirects should be represented as a nested span but have to share the same context on the wire. Why so? Redirects may or may not be traceable and users configure HTTP clients to allow/prevent auto-redirects. So we should assume any HTTP client instrumentation will create new spans and propagate new context for redirects (if app logic handles them). For the sake of consistency, we should make auto-instrumentation do the same. Retries are part of application logic in the majority of HTTP clients. It's not feasible to instrument most of the HTTP clients in the way to have a higher-level span to group retries without the user's help. So I believe we have to agree each retry has to have its own span AND context to bring consistent and understandable experience. |
@ahayworth this PR attempts to address corner case: what happens if http request is already instrumented. There is a broader discussion here #1767 with context-based instrumentation suppression. Thanks for sharing your approach! |
@lmolkova Yes, I understand - and I think the context propagation requirements in this PR are useful. I mentioned the context-based span augmentation that the ruby SDK does because I feel as though it can address this part of the PR:
I just wanted to raise that an alternate way to address it would be through context-based augmentation: while it creates an implicit assumption that lower-level instrumentation will be present, at least in the ruby world we presume that low-level HTTP implementation will be something that many folks want. Taking that approach, there is no already-instrumented request in progress; requests are only instrumented at the lowest level, and it solves that part of the problem you've raised. Put differently: we could instead advise instrumentation not to do this in the first place, and would perhaps avoid some of the difficulties and problems that others have raised. (edit: I will engage in the PR you mentioned! 😄 ) |
I'm closing this PR based on @trask feedback on #1811 (comment) - reusable requests are much more common than I expected and requiring context clean up will bring perf hit on the happy popular case of no-retries. I'm trying to come up with |
Blocked by #1811
Changes
Motivation
From #1767 (comment)
Related issues #
#530
#1767