# System for managing own telemetry attributes within pipeline components #12217
I like the general idea. A couple of questions:
We can leave it out. There may be some uses out there already, but they would be able to access it on the embedded zap logger.
We have 5 different attributes defined in the RFC, which differ based on component kind. Additionally, I would hope that in practice we see the need for removing attributes in only a few specific cases. So my opinion is that such methods would be either inapplicable (e.g. a receiver doesn't have a …)
I think we need a new module which eventually contains:
Constraints:
The best I can come up with is …
Draft solution: #12259
Sounds reasonable to me.
Implements the logger described in #12217. Alternative to #12057. Resolves #11814.

`component/componentattribute`:
- Initializes a new module
- Defines constants for component telemetry attribute keys
- Defines a `zapcore.Core` which can remove attributes from the root logger

`service`:
- Rebases component instantiation on attribute sets
- Adds internal constructors for attribute sets for each component type
- Constructs loggers from `componentattribute`

`otlpreceiver`:
- Uses `componentattribute` to remove the `otelcol.signal` attribute from its logger

`memorylimiter`:
- Uses `componentattribute` to remove the `otelcol.signal`, `otelcol.pipeline.id`, and `otelcol.component.id` attributes from its logger
We have the solution for the logger now. However, we will likely need to apply a similar approach to the meter and tracer in the future. If we keep adding … Also, I think it could be one mutation call to …
The meter/tracer providers don't contain attributes themselves, so to propagate the component attributes we would probably want to add separate … I do however think the …
Since … Am I understanding correctly that you'd like us to have a single …
I don't think it's necessary versus just listing constants, e.g., in the memory limiter:
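As a rough illustration of the "just list the constants" idea, here is a minimal, self-contained sketch. The constant names are hypothetical (the merged code defines such keys in `component/componentattribute`); only the key strings themselves come from this thread.

```go
package main

import "fmt"

// Hypothetical attribute-key constants a component could reference directly,
// instead of going through a dedicated removal helper. The key strings match
// the ones discussed in this thread; the identifiers are illustrative.
const (
	SignalKey      = "otelcol.signal"
	PipelineIDKey  = "otelcol.pipeline.id"
	ComponentIDKey = "otelcol.component.id"
)

func main() {
	// E.g. the memory limiter listing the keys it strips from its logger.
	stripped := []string{SignalKey, PipelineIDKey, ComponentIDKey}
	fmt.Println(stripped)
}
```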
Minor notes:
Personally I like (1) best, as I believe component authors should be thinking about the attributes on their telemetry rather than about an abstraction which they may not recognize as impacting their telemetry.
If the intent is to use metric-level attributes for this, I suspect it will be a bit complex to do this for … Since the discussion includes the possibility of modifying the recently added method …
I tried doing the MeterProvider wrapper: you need at least 21 wrapper structs (MeterProvider + Meter + each instrument + the observables for async instruments) to make it work. We could do it if we want to; it just makes me feel like this is not the right solution, given how complicated it is.
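To make the cascade concrete, here is a toy, stdlib-only model with hypothetical interfaces (not the real OTel API): injecting attributes at the provider forces a wrapper at every level of the chain, one struct per returned interface, which is why the real API ends up needing so many.

```go
package main

import "fmt"

// Hypothetical stand-ins for the provider chain; NOT the real OTel interfaces.
type Counter interface{ Add(v int64, attrs ...string) }
type Meter interface{ Counter(name string) Counter }
type MeterProvider interface{ Meter(name string) Meter }

// A base implementation that records the attributes it receives.
type recCounter struct{ got []string }

func (c *recCounter) Add(v int64, attrs ...string) { c.got = append(c.got, attrs...) }

type baseMeter struct{}

func (baseMeter) Counter(name string) Counter { return &recCounter{} }

type baseProvider struct{}

func (baseProvider) Meter(name string) Meter { return baseMeter{} }

// To inject extra attributes, every level needs its own wrapper struct:
// provider -> meter -> each instrument type. The real API has many
// instrument and observable types, hence the ~21 structs mentioned above.
type wrapProvider struct {
	inner MeterProvider
	extra []string
}

func (w wrapProvider) Meter(name string) Meter { return wrapMeter{w.inner.Meter(name), w.extra} }

type wrapMeter struct {
	inner Meter
	extra []string
}

func (w wrapMeter) Counter(name string) Counter { return wrapCounter{w.inner.Counter(name), w.extra} }

type wrapCounter struct {
	inner Counter
	extra []string
}

func (w wrapCounter) Add(v int64, attrs ...string) {
	// Prepend the injected attributes without mutating the shared slice.
	w.inner.Add(v, append(append([]string{}, w.extra...), attrs...)...)
}

func main() {
	var p MeterProvider = wrapProvider{baseProvider{}, []string{"otelcol.component.id=foo"}}
	c := p.Meter("m").Counter("requests")
	c.Add(1, "outcome=success")
	fmt.Println("three wrapper structs for a single instrument type")
}
```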
Concretely: I wonder if metric attributes are not the right level for this, and these should be instrumentation attributes or resource attributes. The RFC is not very clear about what type of attributes these should be.
Assuming we can find a clean way to move towards this approach and avoid overcomplication, I would prefer option (3).
It'll help us control the migration of some attribute keys (e.g., …).
We need to think about what the resource is in our case (which entity). We've been considering the collector as the resource. If we want to move some attributes to the resource level, we need to reconsider this. I think we can move all the attributes to the resource level if we consider the resource to be the component instance. However, this can complicate re-aggregation. Currently, the collector doesn't provide any processor to do aggregation by resource attributes as far as I know, only by the datapoint attributes. So I'm not sure that's the right move, but we can still consider it if that's required to proceed with the RFC.
Since the library we are wrapping is stable, this seems reasonable to me. However, I think it's worth clarifying first whether we want to model component instances as resources or entities. IMO instrumentation is not technically correct, because we intend to have external code (e.g. in the service graph) produce signals describing the component instances, but not originating from within them.
My impression is that resource attributes describe where the telemetry is coming "from" as well, so I don't think it would be correct either. I'm not sure there is a convention for instrumentation libraries (here, the pipeline) to indicate which user (a component) of the instrumented library (the Consumer API?) caused the observed event. My opinion is that perhaps the least bad solution would be to set the instrumentation library name to …
@djaglowski could we add the clarification of what kind of attributes these are to the RFC? I don't have a strong opinion, but I think we can close the discussion that way. Also, @djaglowski @dmitryax I filed #12334 so that we can keep experimenting on this while being able to mark `component` as 1.0.
I agree we should update the RFC but I'd like to have some idea of what to propose. I'm unclear myself on what type of attributes are reasonable.
Resource attributes don't necessarily have to be applied from within the same code that the signals describe. For example, most pull-based receivers attribute signals to a resource, despite the receiver not being part of that resource. If we were to say that the RFC describes resource attributes, how is this applied for loggers?
I would guess OTLP logs export would use the resource while zap Logger export would flatten things and set those as attributes? |
As suggested in #12217 (comment) Co-authored-by: Pablo Baeyens <[email protected]>
I'm reconsidering whether wrapping Providers (and Meters, and every instrument...) to add span/metric attributes is the most maintainable way to go about this. I think there are a few more options we haven't yet fully explored:
What are your thoughts on these two proposals?
Setting aside the details of this design discussion for a moment, I think it's important to acknowledge that, while the Collector is part of the OpenTelemetry ecosystem, the task of instrumenting it should be similar to what we would ask of any OpenTelemetry instrumentation user. With that in mind, here is the vision statement for OpenTelemetry, which should be quite relevant to instrumentation users. Particularly relevant to our discussion:
The main question in my mind is why our experience using OpenTelemetry instrumentation isn't living up to the vision. Why is it so difficult (especially for us)? Why are we considering solutions that produce non-correlated telemetry? Why are we considering solutions which would require us to use our positions as community members to insert special cases into the instrumentation codebase? IMO the root cause is the asymmetry between the traditional logger, which doesn't understand our data model, and the …

Have others arrived at the same problem when attempting to implement telemetry which can be correlated across signals? Does the project have an answer to this yet? Maybe something @tigrannajaryan, @jmacd, or someone else on the TC could answer?
If we're searching for root causes for the complexity of this issue:
Unfortunately, solving point 1 would require breaking changes to a fundamental part of our API, and would not solve point 2. And solving point 2 would require large changes to the OTel API and data model. In other words, I think it's too late, and we have to resign ourselves to workarounds.
Summary of discussion about this on the Collector stability meeting:
I opened this issue on the Go repo to ask if the Go SIG has ideas for a better implementation.
This is being changed, see open-telemetry/opentelemetry-specification#4352 and open-telemetry/opentelemetry-specification#4438
@djaglowski I haven't read the entire thread; anything else you wanted me to help answer other than the question about log appenders? @pellared may be able to help with the Go Logs SDK. I think …
open-telemetry/opentelemetry-go#6404 (comment) I replied to the new issue posted by @jade-guiton-dd, since it narrows the discussion. I see at least three ideas being discussed there, and all of them are backed by a stalled OpenTelemetry discussion. To summarize:
I'd like to see more energy in the OpenTelemetry Specification SIG discussing these topics. It's great that we're finally seeing these problems within our own codebase.
(Apologies if my questions are answered already - there's a lot of context to take in here.)
Is this something that could be changed? For example, could we replace the …
That will still mean a lot of changes to all the components, but it seems achievable. (I realise there's a bunch of issues to solve here, just looking for an easy win!)
@jmacd To react to your point 1: …

@axw This would be a huge breaking change, and unfortunately, the … I've been trying to prototype how to recreate a …
I see, that's unfortunate - this seems like a pretty fundamental thing to get right. As an alternative, the …
FWIW, I don't think semantics should be compromised due to what vendors do right now. It's unclear to me that instrumentation scope is a good fit for all of the universal telemetry attributes though. I've always understood instrumentation scope as being related to static source code, and not some instantiation of that code. Am I wrong on that? If I'm not wrong, then I think at least the following attributes don't really belong at the instrumentation scope level:
After bringing up the topic in a Spec SIG meeting, it sounds like the current intent for instrumentation scope attributes would very much encompass our use case, so scope attributes should be fine semantics-wise.

With that issue out of the way, I think the only big remaining concern with switching to scope attributes is their poor support by backend vendors. I think it would be reasonable to push for backends to support them, considering they have been in the spec for a while, their handling should be quite similar to resource attributes, and support would benefit other use cases. We may want to provide a compatibility solution for users who want to access them before support is ready (the transform processor should be fine if they have a Collector on the path, but maybe there are more general solutions?), but I think this could be handled separately.

To recap the benefits of the scope attribute approach vs. the current metric/log/span approach:
And I don't think there are other approaches we could make use of right now in the current state of things. For that reason, I'm planning to bring the topic up again in next week's Collector stability and Collector SIG meetings, and unless there are strong objections, I will keep working on a new PR using scope attributes and file it so people can compare with the old PR code-wise. |
Still work in progress, but I filed a PR with the scope attribute implementation: #12617
I filed an issue on collector-contrib to track support for instrumentation scope attributes across the exporters there: open-telemetry/opentelemetry-collector-contrib#38744
#### Description
Fork of #12384 to showcase how component attributes can be injected into scope attributes instead of log/metric/span attributes. See that PR for more context. To see the diff from the previous PR, filter changes starting from the "Prototype using scope attributes" commit.

#### Link to tracking issue
Resolves #12217. Also incidentally resolves #12213 and resolves #12117.

#### Testing
I updated the existing tests to check for scope attributes, and did some manual testing with a debug exporter to check that the scope attributes are added/removed properly.

Co-authored-by: Pablo Baeyens <[email protected]>
Some types of components are fully or partially shared across multiple instances (e.g. the OTLP receiver or the memory limiter).
I have previously pursued some optimistic models in which the collector would define certain ways in which sharing is "supported", but the reality is that sharing is an unbounded problem. e.g. Someone could write a processor that shares a data structure between logs and traces pipelines only if it's instantiated on a Wednesday and the component name is "foo". I am convinced that there are too many novel cases to model ahead of time, so here is a proposal to normalize attributes as much as possible, while still allowing component authors to both add or subtract attributes from their telemetry when appropriate.
I will use Loggers as the example here, but I believe the same general ideas can be applied to TracerProviders and MeterProviders.
General Idea:
Create a "Logger" struct specifically for use within components. This logger satisfies the same interface and behaves essentially the same as what we have today. However, it has the following important differences:
- A `Without(key ...string)` method is provided, which allows component authors to remove attributes as appropriate for portions of the code that are shared. This is the opposite of the typical pattern used by loggers, but necessary if we want our default set of attributes to be correct.

For example:

- Call `Without("otelcol.signal")` during construction to create a logger appropriate for the shared instance.
- … `Without("otelcol.signal", "otelcol.component.id", "otelcol.pipeline.id")` …
- … `.Without("otelcol.signal", "otelcol.component.id")` to adjust the scope.

This implementation looks something like this: …
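A minimal, stdlib-only sketch of the `With`/`Without` attribute bookkeeping, assuming a hypothetical `Logger` type (the real implementation would wrap `*zap.Logger`/`zapcore.Core` rather than a plain map):

```go
package main

import "fmt"

// Logger is a hypothetical component logger that tracks its telemetry
// attributes explicitly; the zap integration is omitted here.
type Logger struct {
	attrs map[string]string // component telemetry attributes, e.g. otelcol.signal
}

// With returns a copy with an extra attribute (the usual logger pattern).
func (l Logger) With(key, value string) Logger {
	next := make(map[string]string, len(l.attrs)+1)
	for k, v := range l.attrs {
		next[k] = v
	}
	next[key] = value
	return Logger{attrs: next}
}

// Without returns a copy with the given attribute keys removed, so a
// component can drop attributes that don't apply to a shared instance.
func (l Logger) Without(keys ...string) Logger {
	next := make(map[string]string, len(l.attrs))
	for k, v := range l.attrs {
		next[k] = v
	}
	for _, k := range keys {
		delete(next, k)
	}
	return Logger{attrs: next}
}

func main() {
	base := Logger{}.
		With("otelcol.component.id", "memory_limiter").
		With("otelcol.pipeline.id", "traces").
		With("otelcol.signal", "traces")
	shared := base.Without("otelcol.signal", "otelcol.pipeline.id")
	fmt.Println(len(shared.attrs)) // prints 1
}
```

The key design point is that `Without` copies rather than mutates, so the shared instance and per-pipeline instances can hold differently-scoped loggers derived from the same base.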
TracerProvider and MeterProvider can expose a similar `With`/`Without` interface, but the implementation will be different.