Context mixup when using wreck library to make downstream request #1479
Comments
I doubt this is an issue with the context manager, but with the plugin itself most likely. My best guess is that the span created in the middleware is being marked as the active span during that middleware execution, but not being removed as the active span when the middleware completes. |
The express middleware gets its context from the previous middleware's |
@blumamir thanks for sharing your insight. I did turn on express instrumentation but still saw the same issue. |
@vmarchaud currently our logging middleware creates a new context object for every request using the cls-hooked run function. Now that I think of it, since we don't store any
I did some further debugging with different scenarios and the issue seems to be reproducible using plain
app.use((req, res, next) => {
  const r = http.request('http://www.github.com/')
  r.on('response', () => {
    next();
  })
  r.end();
})
Interestingly enough, the issue doesn't surface when using a promise-based library such as fetch in the middleware
app.use((req, res, next) => {
  fetch('http://www.github.com/')
    .then(response => {
      next()
    })
});
I added some breakpoints and it seems like the promise executed in the |
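The callback case can be modelled without OpenTelemetry. What follows is a toy sketch under stated assumptions, not the plugin's implementation: Node's built-in AsyncLocalStorage stands in for the context manager, and `startSpan`, `instrumentedRequest`, `middlewareA` and `middlewareB` are hypothetical names. It only shows how invoking a response callback while the client span is active makes the next middleware's call a child of the previous call.

```javascript
const { AsyncLocalStorage } = require('async_hooks');

// Holds the "active span" for the current execution, like a context manager would.
const activeSpan = new AsyncLocalStorage();
const recorded = [];

// Toy tracer: a new span's parent is whatever span is active right now.
function startSpan(name) {
  const parent = activeSpan.getStore();
  const span = { name, parent: parent ? parent.name : null };
  recorded.push(span);
  return span;
}

// Hypothetical instrumented client. The callback is invoked synchronously to
// keep the sketch short; what matters is that it runs with the client span active.
function instrumentedRequest(name, onResponse) {
  const span = startSpan(name);
  activeSpan.run(span, onResponse);
}

// Two middlewares, each making one downstream call before calling next().
const middlewareA = (next) => instrumentedRequest('call service A', next);
const middlewareB = (next) => instrumentedRequest('call service B', next);

// The incoming request span is active while the pipeline runs.
activeSpan.run(startSpan('GET /'), () => {
  middlewareA(() => middlewareB(() => {}));
});

// 'call service B' is recorded with parent 'call service A', not 'GET /'.
console.log(recorded);
```

Because `next()` is called from inside the response callback, everything downstream inherits the client span as its parent, which matches the trace shape reported in this issue.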
@maxmil7 I agree that the trace tree can look confusing and might not be what you expect to see. Your second http call (or middleware invocation) depends on the result of the first invocation, so it makes sense to set it as the child, at least from a purely technical perspective. |
Yeah, that's the behavior of promises per the ECMAScript spec: downstream contexts are created before the promises are executed. @blumamir is right that technically the second http call to |
I think this idea deserves exploration. In express, middlewares can be thought of in parent-child relations, but more often I think of them as a processing pipeline. If you have |
I agree about the desired behavior in this case. |
I'm not against making a "fix" to express, but I think this is a broader issue: technically step C is a child of B (and the whole context management API assumes this). There will be other cases in the future where some users find the span relationships weird even though they are technically correct, and we will need to draw a line at which we stop trying to "fix" them. @blumamir I would be interested in other examples where the relationships between spans were weird, either here or by private message on Gitter? |
Sure, I'll post here some examples:
setTimeout
const mainSpan = tracer.startSpan('main');
tracer.withSpan(mainSpan, () => {
  setTimeout(() => tracer.startSpan('timer span').end(), 10);
})
mainSpan.end();
const pool = () => {
  http.get('http://dummy.restapiexample.com/api/v1/employee/1', () => {
    // process the response and call pool again only after process completes
    setTimeout(pool, 1000);
  })
}
pool();
It will create one infinite trace of HTTP calls one after the other.
promise vs callback
sqs.receiveMessage(params, (err: AWS.AWSError, data: AWS.SQS.Types.ReceiveMessageResult) => {
  // here we expect the context of the receive operation
})
But we can also use promise function and
sqs.receiveMessage(params).promise().then(data => {
  // here the context is NOT set to the receive operation
})
In the aws-sdk plugin, I actually patched the returned promise to solve this issue, but it only works for the first
bind context on event emitter
const server = http.createServer((req, res) => {
  const span1 = tracer.startSpan('span1');
  tracer.withSpan(span1, () => {
    req.on('data', () => {});
    req.on('end', () => {
      tracer.startSpan('span2').end();
      res.writeHead(200);
      res.end('Hello, World!');
    });
  });
  span1.end();
});
server.listen(8080);
You might expect
The examples above are short so they fit in this GitHub issue, but in real-life cases the problem is usually buried under a stack of libraries, plugins and code, which makes it very hard to understand and solve. I usually just get some weirdly structured traces and need to start digging in. |
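The event emitter case can be illustrated with Node built-ins alone. This is a sketch under assumptions rather than the OpenTelemetry API: AsyncLocalStorage stands in for the context manager, the `{ span: 'span1' }` store is a made-up placeholder for a span, and it needs a Node release that ships the static `AsyncResource.bind`. It shows that a listener registered inside a context does not keep that context when the event is emitted from elsewhere, unless the listener is bound explicitly.

```javascript
const { AsyncLocalStorage, AsyncResource } = require('async_hooks');
const { EventEmitter } = require('events');

const als = new AsyncLocalStorage();
const emitter = new EventEmitter();
const seen = {};

// Register two listeners while a (placeholder) span is active.
als.run({ span: 'span1' }, () => {
  // Plain listener: runs in whatever context emit() is called from.
  emitter.on('end', () => {
    seen.unbound = als.getStore();
  });
  // Bound listener: AsyncResource.bind captures the context active right here.
  emitter.on('end', AsyncResource.bind(() => {
    seen.bound = als.getStore();
  }));
});

// Emitted outside of any context, as an I/O-driven 'end' event might be.
emitter.emit('end');

console.log(seen.unbound); // undefined
console.log(seen.bound);   // { span: 'span1' }
```

Whether the bound or the unbound behavior is the "right" one for a given emitter is exactly the judgment call being debated in this thread.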
Hi All, looking at this issue I wonder whether it is something that OpenTelemetry, as the infrastructure layer, should resolve. I think the question of whether those spans should have a child <> parent relation or be siblings is a question of perspective rather than a real technical question. Technically, one middleware invokes the next, meaning they have a child <> parent relation. But each one of us will “visualize” it in a different way. It could also vary between use cases and plugins. For example, take express with three different middlewares, where A & B are regular ones and C is an error middleware. Middleware C reports, over HTTP, the error it captured from middleware A. How would you expect to see the HTTP call made in middleware C? Or
I think you can argue for both... Because it is a perspective issue, I think OpenTelemetry should define a default but allow the end user to modify it. That way, as an OpenTelemetry user, I can decide for each use case how I want my code to be instrumented.
I can work on a more technical suggestion on how to implement it but wanted to raise the discussion first. |
You shouldn't have issues if you are using native Promise though. Context propagation should work across multiple
That would require access to the context stack (which is currently internal). I think it's worth checking how other languages handle those cases and, if it's a common pain, seeing how the spec could be changed to allow this. |
Make sure you are calling |
I don't think so. Just think about what the code would look like with
For me the root is the incoming request and it has two children. The timing tells whether they run in parallel or sequentially. |
I also would think of it this way. |
This code is syntactic sugar for the following code:
function onRequest() {
  ClientRequest().then(serverResponse => {
    DbRequest(serverResponse).then(dbResponse => {
      sendResult(dbResponse);
    });
  });
}
Each operation relies on the previous operation's result (even if the result data is not used, it still relies on it to succeed and not throw). I don't think there is one right answer though. Maybe the solution is to allow users to configure whether they want the callback / promise to be run in the context of the plugin span or not. Something like this:
const provider = new NodeTracerProvider({
  plugins: {
    http: {
      enabled: true,
      path: '@opentelemetry/plugin-http',
      setOutgoingCallbackContext: false
    }
  }
});
That will also solve the original issue when configured in this way. |
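To make the proposal concrete, here is a toy model of what such a switch could do. It is an assumption-heavy sketch: `setOutgoingCallbackContext` is only the option name suggested in this comment, not an existing plugin setting, and `createClient`, `startSpan` and `parentsOfChainedCalls` are hypothetical helpers built on Node's AsyncLocalStorage rather than on the OpenTelemetry context manager.

```javascript
const { AsyncLocalStorage } = require('async_hooks');

const activeSpan = new AsyncLocalStorage();

// Toy tracer: the parent is whatever span is active when the span starts.
function startSpan(name) {
  const parent = activeSpan.getStore();
  return { name, parent: parent ? parent.name : null };
}

// Hypothetical client factory modelling the proposed option.
function createClient({ setOutgoingCallbackContext }) {
  return function request(name, onResponse) {
    const callerSpan = activeSpan.getStore();
    const clientSpan = startSpan(name);
    // true: callback runs under the client span; false: under the caller's span.
    const callbackSpan = setOutgoingCallbackContext ? clientSpan : callerSpan;
    // Invoked synchronously to keep the sketch short.
    activeSpan.run(callbackSpan, () => onResponse(clientSpan));
  };
}

// Chains two calls and reports the parent of each resulting span.
function parentsOfChainedCalls(options) {
  const request = createClient(options);
  const parents = [];
  activeSpan.run(startSpan('GET /'), () => {
    request('call A', (a) => {
      parents.push(a.parent);
      request('call B', (b) => parents.push(b.parent));
    });
  });
  return parents;
}

console.log(parentsOfChainedCalls({ setOutgoingCallbackContext: true }));  // [ 'GET /', 'call A' ]
console.log(parentsOfChainedCalls({ setOutgoingCallbackContext: false })); // [ 'GET /', 'GET /' ]
```

With the flag off, chained calls become siblings under the incoming request, which is the shape the original report expected.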
Yes, it relies on the result of the previous but it is not a part of the operation. Consider an outgoing http span and the corresponding incoming one: they have a parent-child relationship as the one is a part of the other.
|
I am using native promises, but patch the promise so the
private _bindPromise(target: Promise<any>, span: Span) {
  const thisPlugin = this;
  const origThen = target.then;
  target.then = function (onFulfilled, onRejected) {
    const newOnFulfilled = thisPlugin._tracer.bind(onFulfilled, span);
    const newOnRejected = thisPlugin._tracer.bind(onRejected, span);
    return origThen.call(this, newOnFulfilled, newOnRejected);
  };
  return target;
}
Would love to discuss better alternatives to do the binding to the promise which supports chaining |
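One possible direction for chaining, offered as a sketch rather than a recommendation: patch the promise returned by each `then` as well, so the binding follows the chain. The assumptions are that AsyncLocalStorage replaces `tracer.bind`, that the `{ span: 'receive' }` store is a placeholder for a span, and that `bindPromise`, `demo` and `record` are hypothetical names. To my understanding, `await` on a native promise does not go through an own `then` property, so this approach would only cover explicit `.then`, `.catch` and `.finally` calls.

```javascript
const { AsyncLocalStorage } = require('async_hooks');

const als = new AsyncLocalStorage();

// Patch `then` on this promise and, recursively, on every promise it returns.
function bindPromise(promise, store) {
  const origThen = promise.then;
  // Run each handler with `store` active; pass non-functions through untouched.
  const wrap = (fn) =>
    typeof fn === 'function' ? (value) => als.run(store, () => fn(value)) : fn;
  promise.then = function (onFulfilled, onRejected) {
    const next = origThen.call(this, wrap(onFulfilled), wrap(onRejected));
    return bindPromise(next, store);
  };
  return promise;
}

// Records which placeholder span each handler in the chain observes.
function demo() {
  const seen = [];
  const record = (label) => (value) => {
    const store = als.getStore();
    seen.push(`${label}:${store ? store.span : 'none'}`);
    return value;
  };
  return bindPromise(Promise.resolve('data'), { span: 'receive' })
    .then(record('first'))
    .then(record('second'))
    .then(() => seen);
}

demo().then((seen) => console.log(seen)); // [ 'first:receive', 'second:receive' ]
```

The cost is one own-property patch per link of the chain, and every handler downstream stays under the same span whether or not that is wanted.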
I tend to agree in this case, although I believe it should be configurable. Maybe the default should be not to set context on outgoing callback, and user can choose to keep the default behavior in plugin configuration? |
+1 I think this would be a great option for plugins which create and enter with a new span context. At least for http plugin, we might have to account for the following scenarios:
and promise-based request libraries
Also, like the idea of setting |
Please note that even if we add this option it will not work together with
It is possible to monkey patch
Besides that, such configurations make it harder for tools receiving the spans to interpret and analyse the structure and detect issues automatically. If receivers just display spans and have no logic besides that, it's not a problem. If your app really requires specific span linking, I recommend disabling the automatism and passing context manually. |
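As a rough illustration of what passing context manually means, the toy below takes the parent as an explicit argument instead of reading it from an ambient context. All names (`startSpan`, `callService`, `handleRequest`) are hypothetical and this is not the OpenTelemetry API; it only shows that the caller, not the call order, decides the parent.

```javascript
// Toy span factory: the parent is always passed in explicitly.
function startSpan(name, parent) {
  return { name, parent: parent ? parent.name : null };
}

// Hypothetical downstream call; receives the span it should hang under.
function callService(name, parent, done) {
  const span = startSpan(name, parent);
  done(span);
}

function handleRequest() {
  const root = startSpan('GET /', null);
  const spans = [root];
  callService('call service A', root, (a) => {
    spans.push(a);
    // Chained after A, but still parented to the request span by choice.
    callService('call service B', root, (b) => spans.push(b));
  });
  return spans;
}

console.log(handleRequest().map((s) => s.parent)); // [ null, 'GET /', 'GET /' ]
```

The trade-off is that every call site has to carry the parent along, which is the bookkeeping that automatic propagation normally hides.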
You mean those tools won't be able to work consistently on traces and will need to assume both cases for the two possible configuration options? |
I think we need to remove the outgoing http span context from the event emitter callbacks as well. What do you think? |
I think so, or at least it's much harder for them. If child and sibling spans are mixed based on a configuration not known/supported within the tool, they cannot do any analysis based on this. |
Hmm, so that would mean instrumenting the http calls manually instead of using the plugins, correct?
Agreed, it might be possible. However, I think the issue will surface outside of express middlewares as well. Wherever we start chaining the http calls
If everyone agrees, I think this might be a good approach |
I think so, yes. It may be necessary to set the parent span of the HttpClient span as the context in callbacks/emitters of the client req/res. |
I feel dumb for not seeing that the express plugin wasn't enabled in the code you gave and that the problem was simply the context set by the http plugin. Anyway, I've opened a PR to change the behavior. EDIT: I'm removing the bug label since the current behavior is not buggy in itself. |
What version of OpenTelemetry are you using?
What version of Node are you using?
v12.17.0
What did you do?
We have a legacy service invocation library which uses wreck@v12 to invoke downstream services.
This library is used by express middlewares to make service calls to downstream services when an application receives a request. The application is based on ExpressJS.
I tried to integrate the application with OpenTelemetry so we can deprecate our legacy service instrumentation in favor of OpenTelemetry auto instrumentation.
What did you expect to see?
I was hoping that all the outgoing http requests would be auto instrumented and the parent-child relationship between the spans created for the service requests would be correct
What did you see instead?
I noticed that for some service calls the parent-child relationship wasn't correct. Specifically, for the following flow:
The span for call to service B was showing up as child of span created for service A request
Ideally, both service A and service B calls should be children of the original request span
Additional context
I have created a very small express app to demonstrate the issue here:
https://github.com/maxmil7/async-demo
The steps to reproduce are in the readme of the app (just requires cloning the app and hitting the index page)
FWIW, I tried to use AsyncLocalStorage context manager instead of AsyncHooks and the same issue is visible there as well
Currently, we are using cls-hooked for context-management and this issue is not reproducible with cls-hooked.