Ideas on how to InstrumentationLibrary -> Scope #2307
Comments
This goes a step further than was discussed in the last spec SIG meeting. A smaller step would be to only add a "subscope_name" as an additional field next to "instrumentation_library_name/version" and call the message with these 3 fields a Scope. But perhaps an "attributes" field plus semantic conventions for scopes is a better approach, if we can come up with other use cases where recording additional attributes is necessary.
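To make the two options in this comment concrete, here is a minimal Python sketch of both shapes. The class names and the attribute key "subscope.name" are hypothetical illustrations, not part of the spec or the proto.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ScopeWithSubscope:
    """Smaller step: library name/version plus one extra fixed field."""
    instrumentation_library_name: str
    instrumentation_library_version: str = ""
    subscope_name: str = ""


@dataclass
class ScopeWithAttributes:
    """Larger step: a free-form attribute map governed by semantic conventions."""
    name: str
    version: str = ""
    attributes: Dict[str, str] = field(default_factory=dict)


# The same information expressed in both shapes.
# "subscope.name" is a made-up key; real keys would come from semantic conventions.
fixed = ScopeWithSubscope("io.grpc", "1.0", "client")
flexible = ScopeWithAttributes("io.grpc", "1.0", {"subscope.name": "client"})
```

The attribute map can carry anything the fixed field can, but it only becomes useful to consumers once the keys are agreed on, which is why the comment pairs it with semantic conventions.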
I think this one people disagree with because this dilutes the semantics of the existing API. I tend to agree with this position. I think we need to keep existing We can add something like |
I like this proposal. It solves the problem posed in open-telemetry/oteps#78.
@jmacd I wish I had understood your proposal better back then. Anyway, now is a good time to fix that.
I generally like this approach, as a scope would give us a place to store instrumentation library as well as additional metadata. There was a lengthy discussion at the spec SIG where concerns were raised around backwards compatibility, and ways to work around incompatibilities were suggested. Is there any chance we can document what the concerns were and how they could be resolved?
I can document at least my concern: I am concerned that expanding what the name is allowed to be will lose the information of which library actually generated a log line. Loggers can easily be named

I (and I believe @Aneurysm9, although I would let him speak for himself) am not completely convinced by this. At least in JavaScript, I have not seen any consistent use of logger name, and I certainly would not expect a fully qualified class name to be commonly used as a logger name (JavaScript doesn't even have a common idiom for globally referencing a class anyway). Some popular JS logging libraries (winston, npmlog) don't even have the concept of a logger name, and those that do (bunyan, pino) offer little to no guidance as to what a logger name should be.
I do share Dan's concern that assumptions about user behavior are being made without adequate evidentiary support. I also think that, even if those assumptions are correct, this solution is overly broad and does more than it needs to do. It seems that the current concern is that It is proposed that:
With some further discussion of what this means for the existing APIs and SDKs. Instead, I think there is a single, simple change required:
Doing this, coupled with defining semantic conventions for those attributes, would allow for further narrowing the scope of the |
Last week I thought we had agreed on a variation of this where scope was going to be added as an additional field without removing the existing instr lib name/version fields. That was fine with me and this version proposed by @Aneurysm9 is equally acceptable to me.
Adding attributes to |
Right. I think we're imagining that instrumentation library name would be deprecated in favor of As for the standard logger name concept we're referring to, it seems to be a kind of component namespace. Maybe it's
I think this deserves emphasis, fwiw. Are we being led by a Java-specific concern? When I look at the origin of |
Adding a library name such as At the same time, in languages where using named loggers is common, we do expect to have many many log messages associated with a single logger name, and the name tends to be a very important aspect of triaging logs since it often is closely mapped to a component of concern, so it does seem useful to group the log messages by it within the protocol, for compression reasons if nothing else. Adding Thinking a bit more on servlet, I think the equivalent to logger name there would be the name of the class that |
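The grouping argument in this comment (many log messages sharing one logger name) can be sketched as follows. This is only an illustration under assumptions: the record shape (a pair of logger name and message body) and the function name are hypothetical, and real OTLP batching is done by the SDK exporters, not by code like this.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_logger_name(records: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (logger_name, body) pairs so each name is stored once per batch."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for logger_name, body in records:
        grouped[logger_name].append(body)
    return dict(grouped)


# Hypothetical records: the logger name repeats on every line before grouping.
records = [
    ("com.example.db", "connection opened"),
    ("com.example.http", "GET /health"),
    ("com.example.db", "connection closed"),
]
batches = group_by_logger_name(records)
```

After grouping, each name appears once as a key instead of once per record, which is the compression benefit the comment refers to.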
For @jmacd
For @dyladan
There are a couple of problems that we have to solve:
For the "wire format", where we can rename messages and fields, we can have something like the following (based on the current documentation in the proto, we can probably simply not do this at all; see what is documented at https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/common/v1/common.proto#L77):

// Scope is a message representing the instrumentation scope
message Scope {
  // An empty instrumentation library name means the name is unknown.
  string instrumentation_name = 1; // deprecated, will be available as a semantic convention.
  string instrumentation_version = 2; // deprecated, will be available as a semantic convention.
  string name = 3; // The new relaxed "name". TBD if a special field or also a semantic convention. To me it seems important for it to be a first-class citizen.
  // Set of labels that describe the instrumentation scope.
  repeated opentelemetry.proto.common.v1.AttributeKeyValue attributes = 4;
}

// A collection of Metrics from a Scope within a Resource.
message ScopeMetrics {
  // The Instrumentation scope information for the metrics in this message.
  // If this field is not set then no Scope is known.
  Scope scope = 1;
  // A list of metrics that originate from an Instrumentation scope.
  repeated Metric metrics = 2;
}

For the "get(string)" API, here is what is documented:
I can easily make the case that ignoring the name altogether is valid.

Here it clearly allows me to use anything I want ("module/class/component"). I am not sure we need to debate anything based on this comment. I can clearly make the case that this does not need to be "unique" (uniqueness is not documented at all in the current text).

Besides all these glitches in the documentation, which allow us to change almost anything we want in the semantics of that field, I think the proposal is actually to consolidate on the important requirement (ensuring uniqueness, which is not documented) and relax the "artificial" requirement (representing the instrumentation library). I think the important changes here are:
Set of attributes is totally fine with me as long as the information isn't lost
Seems reasonable, modulo Bogdan's comment about the schema URL
I don't think this is a Java specific concern. The idea of a logger name is definitely not universal, but certainly I see the usefulness of it and if we're designing a logging API I think it is a worthy inclusion. Some assumptions made about the solution may be specific to Java.
Seems fine.
This seems like a creative interpretation to me. Sure, maybe to the letter of the rules it can be done, but removing the uniqueness and meaning from it removes a lot of its usefulness. I believe @Aneurysm9 would agree with me here, but again I'd let him speak for himself.
Uniqueness is not documented because it wasn't required within a single library. Within a library you can have 3 tracers all with the same name. It just needs to not be the same as the name from any other library. Obviously global uniqueness also satisfies this requirement.

Maybe a compromise is to say the name should be globally unique, and one suggested way to do that is to use the fully qualified class name (or similar fully qualified identifier)?

Side note: GitHub issues need threads 😆
I would like to challenge the assumption that we can't have getLogger and getTracer with different signatures. If logger name is a well-established concept that's all well and good, but tracer and meter name are not likely to be familiar concepts to new users and already will require some education. I don't think we can just assume users will not be able to understand the differences here.
+1
Not sure where you see "inside a single library" and where that scope comes from. I am saying that, as it is right now, I can use "io.grpc" as the identifier for my "instrumentation library"; it does not say it has to be unique. @dyladan here is an offer that you cannot refuse :)) Here is a crazier idea (also inspired by the fact that right now the "name" can be |
Even in your example |
/cc @z1c0 @arminru can we verify if this is ok for us?
I was just saying that there is no uniqueness. Within the same library the name would of course always be the same and of course is not unique, but it is unique to that library. 2 different library authors would never clash names if the library name is used. This is the type of "uniqueness" that is important.
I think relaxing the term "library" is fine as long as we say we need something fully qualified. I don't want the case where the name is just the name of the class (not fully qualified) and clashes with a different library that happens to have a class with the same name. You're right that the "library" term isn't important to us, as long as we can uniquely identify the component which is generating the telemetry.
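The clash described here is easy to reproduce. Below is a minimal Python sketch; the module names "liba.http" and "libb.http" and the helper `qualified_name` are hypothetical, and nothing here is an OpenTelemetry API.

```python
def qualified_name(cls: type) -> str:
    """Build a fully qualified identifier from a class's module and qualified name."""
    return f"{cls.__module__}.{cls.__qualname__}"


# Two unrelated (made-up) libraries that both happen to define a class named "Client".
ClientA = type("Client", (), {"__module__": "liba.http"})
ClientB = type("Client", (), {"__module__": "libb.http"})

# The bare class names clash, so they cannot tell the two sources apart.
bare_names_clash = ClientA.__name__ == ClientB.__name__

# The fully qualified names stay distinct.
name_a = qualified_name(ClientA)
name_b = qualified_name(ClientB)
```

Using the qualified form as the tracer, meter, or logger name keeps the "unique to the component" property even when the term "library" is relaxed.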
100%. I want to preserve the ability to isolate an "instrumentation" (be it a class, package, library, etc.) using that name; the same requirement was the basis for logs as well :)
This is what I am trying to say is not documented right now.
It's not documented; it's just a natural consequence of using the instrumentation library name.
I think we should keep this out of this issue and defer to #586.
Also not sure if this should be discussed separately from the ability to add arbitrary attributes, which I think is the main topic of this issue(?). But let's try discussing it here:

It is true that the spec right now wants you to have the actual library/artifact name. But I think this was never really followed consistently across languages (e.g. Python uses the module/file name

I think the actual source / format of the name is not that important, as long as we agree on what it should be useful for, which is, IMHO, identifying the technical source of some telemetry for troubleshooting telemetry-related code, usually instrumentation code (not for assigning it to some logical monitored system component; other span and resource attributes such as service.name, http.route, code.function, etc. should be used for that). Ideally, it is not too fine-grained, or if it is fine-grained, I think it would make sense to recommend having a common prefix. This would allow discarding / skipping analysis of telemetry data generated by known-faulty or too-noisy sources.
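The common-prefix recommendation can be sketched like this. It is a minimal illustration, assuming each record is a dict with a hypothetical "scope_name" key and that names are dot-separated; neither assumption comes from the spec.

```python
from typing import Dict, Iterable, List


def has_prefix(scope_name: str, prefix: str) -> bool:
    """True if scope_name equals prefix or is nested under it ("a.b" is under "a")."""
    return scope_name == prefix or scope_name.startswith(prefix + ".")


def drop_noisy_sources(
    records: Iterable[Dict[str, str]], noisy_prefixes: List[str]
) -> List[Dict[str, str]]:
    """Keep only records whose scope name is not under any known-noisy prefix."""
    return [
        record
        for record in records
        if not any(has_prefix(record["scope_name"], p) for p in noisy_prefixes)
    ]


# Hypothetical telemetry records, identified only by their scope name.
records = [
    {"scope_name": "io.grpc.client"},
    {"scope_name": "io.grpcx"},
    {"scope_name": "com.example.db"},
]
kept = drop_noisy_sources(records, ["io.grpc"])
```

Matching on a dot boundary rather than a raw string prefix avoids dropping unrelated sources such as "io.grpcx" when "io.grpc" is marked as noisy.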
If that is the case, well, that is just a value the user can pass to a method that is part of the public API. In that case, I think the spec does not "apply" since it is left as a choice for the user.
I mean, the spec can't force users to do anything, they can also write a new GUID into the span name for each span if they want, but that would break everybody who expects that the span name is what the spec says it should be. But in that case, I think |
If we want the span name to be defined by the spec, shouldn't we add to the spec a requirement that is to be implemented by a non-user defined field somewhere?
We don't want the span name to be defined by the spec, we want the meaning of the span name to be defined by the spec. The same goes for the instrumentation library name. Having a defined meaning should lead to consistency in usage and the ability to reliably make decisions based on the information it conveys rather than treating it simply as bits of data with no informational value.
Please review #2276
This is an attempt to document the ideal scenario for extending the protocol to support the notion of "instrumentation scope" and to allow the tracer/meter/logger to share the same concepts.
There were multiple proposals, disagreements, and ideas in many places:
OTLP ideal end goal, based on lots of discussions/comments:
How to get there?
There are a few places where changes are necessary: