
System for managing own telemetry attributes within pipeline components #12217


Closed
djaglowski opened this issue Jan 30, 2025 · 41 comments · Fixed by #12617

@djaglowski
Member

djaglowski commented Jan 30, 2025

Some types of components are fully or partially shared across multiple instances (e.g. the OTLP receiver and the memory limiter processor).

I have previously pursued some optimistic models in which the collector would define certain ways in which sharing is "supported", but the reality is that sharing is an unbounded problem. For example, someone could write a processor that shares a data structure between logs and traces pipelines only if it's instantiated on a Wednesday and the component name is "foo". I am convinced that there are too many novel cases to model ahead of time, so here is a proposal to normalize attributes as much as possible, while still allowing component authors to add or subtract attributes from their telemetry when appropriate.

I will use Loggers as the example here, but I believe the same general ideas can be applied to TracerProviders and MeterProviders.


General Idea:

Create a "Logger" struct specifically for use within components. This logger satisfies the same interface and behaves essentially the same as what we have today. However, it has the following important differences:

  1. The attributes are preset for the component as if it uses no sharing of parts with other instances of the component. (As defined in this RFC)
  2. A Without(keys ...string) method is provided, which allows component authors to remove attributes as appropriate for portions of the code that are shared. This is the opposite of the typical pattern used by loggers, but necessary if we want our default set of attributes to be correct.

For example:

  • OTLP receiver: Since the entire component struct is shared, the factory will use Without("otelcol.signal") during construction to create a logger appropriate for the shared instance.
  • Memory limiter processor: For any code outside of the shared part, it can use the logger which it was given. For the shared part, it can create a new logger by calling Without("otelcol.signal", "otelcol.component.id", "otelcol.pipeline.id").
  • Custom component mentioned above: Typically uses the given logger, but if creating a logs or traces instance on a Wednesday and the component name is "foo", then it calls Without("otelcol.signal", "otelcol.component.id") to adjust the scope.

This implementation looks something like this:

import (
	"fmt"

	"go.opentelemetry.io/otel/attribute"
	"go.uber.org/zap"
)

type ComponentLogger struct {
	*zap.Logger
	root *zap.Logger
	diff []attribute.KeyValue
}

func NewComponentLogger(root *zap.Logger, defaultAttrs map[string]string) *ComponentLogger {
	defaultLogger := root
	diff := make([]attribute.KeyValue, 0, len(defaultAttrs))
	for k, v := range defaultAttrs {
		defaultLogger = defaultLogger.With(zap.String(k, v))
		diff = append(diff, attribute.String(k, v))
	}

	return &ComponentLogger{
		Logger: defaultLogger,
		root:   root,
		diff:   diff,
	}
}

func (cl *ComponentLogger) Without(keys ...string) *zap.Logger {
	excludeKeys := make(map[string]struct{}, len(keys))
	for _, key := range keys {
		excludeKeys[key] = struct{}{}
	}

	// Rebuild from the root logger, re-adding only the attributes that were not excluded.
	logger := cl.root
	for _, attr := range cl.diff {
		if _, excluded := excludeKeys[string(attr.Key)]; !excluded {
			logger = logger.With(zap.String(string(attr.Key), attr.Value.Emit()))
		}
	}
	return logger
}

func (cl *ComponentLogger) With(fields ...zap.Field) *ComponentLogger {
	// Create a new diff by appending the new fields as attributes.
	// (Simplified conversion; a complete version would switch on the zap field type.)
	newDiff := make([]attribute.KeyValue, len(cl.diff), len(cl.diff)+len(fields))
	copy(newDiff, cl.diff)
	for _, f := range fields {
		newDiff = append(newDiff, attribute.String(f.Key, fmt.Sprint(f.Interface)))
	}

	return &ComponentLogger{
		Logger: cl.Logger.With(fields...),
		root:   cl.root,
		diff:   newDiff,
	}
}

TracerProvider and MeterProvider can expose a similar With/Without interface, but the implementation will be different.
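
For illustration only, here is a minimal sketch of what the tracing side could look like: a wrapper that injects a preset attribute set into every span started through it. The type names are hypothetical and the Without bookkeeping is omitted.

import (
	"context"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/trace"
)

// tracerProviderWithAttrs hands out Tracers that stamp a preset attribute set
// onto every span they start. A Without method could rebuild it from the root
// provider with some keys removed, mirroring ComponentLogger above.
type tracerProviderWithAttrs struct {
	trace.TracerProvider
	attrs []attribute.KeyValue
}

func (p *tracerProviderWithAttrs) Tracer(name string, opts ...trace.TracerOption) trace.Tracer {
	return &tracerWithAttrs{Tracer: p.TracerProvider.Tracer(name, opts...), attrs: p.attrs}
}

type tracerWithAttrs struct {
	trace.Tracer
	attrs []attribute.KeyValue
}

func (t *tracerWithAttrs) Start(ctx context.Context, name string, opts ...trace.SpanStartOption) (context.Context, trace.Span) {
	// Prepend the preset component attributes to the caller's span options.
	opts = append([]trace.SpanStartOption{trace.WithAttributes(t.attrs...)}, opts...)
	return t.Tracer.Start(ctx, name, opts...)
}

The metrics side is more involved, since every instrument constructor would need similar wrapping (see the discussion further down).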

@dmitryax
Member

I like the general idea. A couple of questions:

  • Is there a need to have a With method at this point?
  • Requiring string arguments for Without might be error-prone. Yes, we can provide public constants. But maybe we can have more specific methods, e.g., WithSharedSignal(), instead?

@djaglowski
Member Author

Is there a need to have a With method at this point?

We can leave it out. There may be some uses out there already, but they would be able to access it on the embedded zap logger.

Requiring string arguments for Without might be error-prone. Yes, we can provide public constants. But maybe we can have more specific methods, e.g., WithSharedSignal(), instead?

We have 5 different attributes defined in the RFC, which differ based on component kind. Additionally, I would hope that in practice we see the need for removing attributes in only a few specific cases. So my opinion is that such methods would be either inapplicable (e.g. a receiver doesn't have an otelcol.pipeline.id to remove) or unadvisable in most cases. That said, I think limiting the values to those we enumerate makes things less error-prone, so that would be my preference.

@djaglowski
Member Author

I think we need a new module which eventually contains:

  • Attribute keys agreed upon in the RFC
  • The logger described above
  • Similar meter & tracer providers

Constraints:

  • Component implementations will use the package, so cannot be in service.
  • Should be able to construct appropriate attribute sets from e.g. pipeline.Signal and component.ID, so cannot be in component. Seems out of place in pipeline too though.

The best I can come up with is component/componentattribute. Does this seem reasonable @open-telemetry/collector-maintainers, or do you have other suggestions?
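
For a sense of what the new module's surface could include, here is a minimal sketch of the attribute-key constants. The exported identifiers (and the component-kind key, which is not spelled out in this thread) are assumptions; the other keys appear elsewhere in this discussion.

package componentattribute

// Attribute keys for component telemetry, as agreed upon in the RFC.
const (
	ComponentKindKey = "otelcol.component.kind" // assumption: the remaining key from the RFC
	ComponentIDKey   = "otelcol.component.id"
	PipelineIDKey    = "otelcol.pipeline.id"
	SignalKey        = "otelcol.signal"
	SignalOutputKey  = "otelcol.signal.output"
)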

@djaglowski
Member Author

Draft solution: #12259

@mx-psi
Member

mx-psi commented Feb 4, 2025

The best I can come up with is component/componentattribute. Does this seem reasonable @open-telemetry/collector-maintainers, or do you have other suggestions?

Sounds reasonable to me

github-merge-queue bot pushed a commit that referenced this issue Feb 6, 2025
Implements the logger described in
#12217

Alternative to #12057

Resolves #11814

`component/componentattribute`:
- Initializes new module
- Defines constants for component telemetry attribute keys
- Defines a `zapcore.Core` which can remove attributes from the root
logger

`service`:
- Rebases component instantiation on attribute sets
- Internal constructors for attribute sets for each component type
- Constructs loggers from `componentattribute`

`otlpreceiver`:
- Uses `componentattribute` to remove `otelcol.signal` attribute from
logger

`memorylimiter`:
- Uses `componentattribute` to remove `otelcol.signal`,
`otelcol.pipeline.id` and `otelcol.component.id` attributes from logger
@dmitryax
Member

dmitryax commented Feb 7, 2025

We have 5 different attributes defined in the RFC, which differ based on component kind. Additionally, I would hope that in practice we see the need for removing attributes in only a few specific cases. So my opinion is that such methods would be either inapplicable (e.g. a receiver doesn't have an otelcol.pipeline.id to remove) or unadvisable in most cases. That said, I think limiting the values to those we enumerate makes things less error-prone, so that would be my preference.

We have the solution for the logger now. However, we will likely need to apply a similar approach to the meter and tracer in the future. If we keep adding Without to them, it'll be hard for API users to catch up and keep things consistent. Especially when we need to replace the data_type metrics attribute with otelcol.signal (probably via a feature gate). Having something like WithSharedSignal would help to consolidate it across the signals and centralize such migrations if needed. I don't believe we need to drop any other attributes at this point. Once we have such a need, a new method could be added, even if it's not useful in some scenarios.

Also, I think it could be one mutation call to TelemetrySettings instead of changing logger, meter and tracer separately. Something like TelemetrySettings.SharedSignal would return another instance of TelemetrySettings with logger, meter, and tracer not having that attribute.

@mx-psi mx-psi moved this from Todo to In Progress in Collector: v1 Feb 7, 2025
@mx-psi
Member

mx-psi commented Feb 7, 2025

Also, I think it could be one mutation call to TelemetrySettings instead of changing logger, meter and tracer separately. Something like TelemetrySettings.SharedSignal would return another instance of TelemetrySettings with logger, meter, and tracer not having that attribute.

TelemetrySettings does not have a meter and a tracer; it has a meter provider and a tracer provider. These do not have any attributes set.

@jade-guiton-dd
Contributor

jade-guiton-dd commented Feb 7, 2025

The meter/tracer providers don't contain attributes themselves, so to propagate the component attributes we would probably want to add separate trace.SpanStartEventOption and metric.MeasurementOption fields to TelemetrySettings (or just a []attribute.KeyValue to be wrapped in options later). In that context, I think having a mutation method on TelemetrySettings to subtract attribute keys for all three signals at once would be convenient.

I do however think the Without approach is more flexible than WithSharedSignal or something similarly specific; @djaglowski's PR already included a use case for Without("otelcol.component.id") in the form of the memorylimiterprocessor.

@djaglowski
Member Author

djaglowski commented Feb 7, 2025

Since MeterProvider and TracerProvider are interfaces, my plan was to wrap them into new implementations of their respective interfaces, where the new implementations have default attributes that can be removed by component authors using Without.

Am I understanding correctly, that you'd like us to have a single Without method which updates the TelemetrySettings as a whole, so that the Logger, MeterProvider and TracerProvider all have the same attributes modified? This seems reasonable to me.

Having something like WithSharedSignal would help to consolidate it across the signals and centralize such migrations if needed.

I don't think it's necessary compared to just listing constants, e.g., in the memory limiter:

  1. telSet = telSet.Without(semconv.OTelColComponentID, semconv.OTelColPipelineID, semconv.OTelColSignal)
  2. telSet = telSet.WithSharedComponentID().WithSharedPipelineID().WithSharedSignal()
  3. telSet = telSet.WithShared(WithSharedComponentID(), WithSharedPipelineID(), WithSharedSignal())

Minor notes:

  • (2) and maybe (3) require recreation of the TelemetrySettings multiple times.
  • If (3), is there a better name for the method?

Personally I like (1) best as I believe component authors should be thinking about the attributes on their telemetry, rather than thinking about an abstraction which they may not recognize as impacting their telemetry.

@mx-psi mx-psi added the release:blocker The issue must be resolved before cutting the next release label Feb 7, 2025
@mx-psi
Member

mx-psi commented Feb 7, 2025

Since MeterProvider and TracerProvider are interfaces, my plan was to wrap them into new implementations of their respective interfaces, where the new implementations have default attributes that can be removed by component authors using Without.

If the intent is to use metric-level attributes for this, I suspect it will be a bit complex to do this for MeterProvider, given the number of instruments that you would have to wrap and because of the unfortunate quirks of the spec. I am not saying that we shouldn't do it because of that, just that it is probably harder than it looks.

Since the discussion includes the possibility of modifying the recently added method LoggerWithout, I marked this as release:blocker (we can always just remove this method before the release if we do not reach consensus).

@mx-psi
Member

mx-psi commented Feb 7, 2025

I tried doing the MeterProvider wrapper: you need at least 21 wrapper structs (MeterProvider + Meter + each instrument + the observables for async instruments) to make it work. We could do it if we want to, it just makes me feel like this is not the right solution, given how complicated this is.
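
To give a sense of where that count comes from, here is a sketch of the wrapping needed for a single instrument type (illustrative names, not the actual prototype). Every other instrument constructor on Meter needs the same treatment, plus the observable variants for async instruments.

import (
	"context"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

// meterWithAttrs forwards instrument creation, wrapping each returned
// instrument so that its measurements carry the component attributes.
type meterWithAttrs struct {
	metric.Meter
	opt metric.MeasurementOption
}

func newMeterWithAttrs(m metric.Meter, attrs ...attribute.KeyValue) metric.Meter {
	return meterWithAttrs{Meter: m, opt: metric.WithAttributes(attrs...)}
}

// Int64Counter is just one of the many instrument constructors that must be wrapped.
func (m meterWithAttrs) Int64Counter(name string, opts ...metric.Int64CounterOption) (metric.Int64Counter, error) {
	c, err := m.Meter.Int64Counter(name, opts...)
	if err != nil {
		return nil, err
	}
	return int64CounterWithAttrs{Int64Counter: c, opt: m.opt}, nil
}

type int64CounterWithAttrs struct {
	metric.Int64Counter
	opt metric.MeasurementOption
}

func (c int64CounterWithAttrs) Add(ctx context.Context, incr int64, opts ...metric.AddOption) {
	c.Int64Counter.Add(ctx, incr, append([]metric.AddOption{c.opt}, opts...)...)
}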

@mx-psi
Member

mx-psi commented Feb 7, 2025

Concretely: I wonder if metric attributes are not the right level for this, and these should be instrumentation attributes or resource attributes. The RFC is not very clear about what type of attributes these should be.

@dmitryax
Member

dmitryax commented Feb 7, 2025

Assuming we can find a clean way to move towards this approach and avoid overcomplication, I would prefer option (3):

telSet = telSet.WithShared(WithSharedComponentID(), WithSharedPipelineID(), WithSharedSignal())

It'll help us control the migration of some attribute keys (e.g., data_type -> otelcol.signal) and make the interface less error-prone.

Concretely: I wonder if metric attributes are not the right level for this, and these should be instrumentation attributes or resource attributes. The RFC is not very clear about what type of attributes these should be.

We need to think about what the resource is in our case (which entity). We've been considering the collector as the resource. If we want to move some attributes to the resource level, we need to reconsider this. I think we can move all the attributes to the resource level if we consider the resource to be the component instance. However, this can complicate re-aggregation. Currently, the collector doesn't provide any processor to do aggregation by resource attributes as far as I know, only by the datapoint attributes. So I'm not sure if that's the right move, but we still can consider it if that's required to proceed with the RFC.

@djaglowski
Member Author

I tried doing the MeterProvider wrapper: you need at least 21 wrapper structs (MeterProvider + Meter + each instrument + the observables for async instruments) to make it work. We could do it if we want to, it just makes me feel like this is not the right solution, given how complicated this is.

Since the library we are wrapping is stable, this seems reasonable to me. However, I think it's worth clarifying first whether we want to model component instances as resources or entities. IMO instrumentation is not technically correct because we intend to have external code (e.g. in the service graph) produce signals describing the component instances, but not originating from within them.

@jade-guiton-dd
Contributor

jade-guiton-dd commented Feb 10, 2025

My impression is that resource attributes describe where the telemetry is coming "from" as well, so I don't think it would be correct either. I'm not sure there is a convention for instrumentation libraries (here, the pipeline) to indicate which user (a component) of the instrumented library (the Consumer API?) caused the observed event. My opinion is that perhaps the least bad solution would be to set the instrumentation library name to go.opentelemetry.io/collector/service or something similar, but use instrumentation attributes to describe the instrumented component?

@mx-psi
Member

mx-psi commented Feb 10, 2025

@djaglowski could we add the clarification of what kind of attributes these are to the RFC? I don't have a strong opinion, but I think we can close the discussion that way.

Also, @djaglowski @dmitryax I filed #12334 so that we can keep on experimenting on this while being able to mark component as 1.0

@djaglowski
Member Author

could we add the clarification of what kind of attributes these are to the RFC? I don't have a strong opinion, but I think we can close the discussion that way.

I agree we should update the RFC but I'd like to have some idea of what to propose. I'm unclear myself on what type of attributes are reasonable.

My impression is that resource attributes describe where the telemetry is coming "from" as well, so I don't think it would be correct either.

Resource attributes don't necessarily have to be applied from within the same code that the signals describe. For example, most pull-based receivers attribute signals to a resource, despite the receiver not being part of that resource.


If we were to say that the RFC describes resource attributes, how is this applied for loggers?

@mx-psi
Member

mx-psi commented Feb 11, 2025

how is this applied for loggers?

I would guess OTLP logs export would use the resource while zap Logger export would flatten things and set those as attributes?

github-merge-queue bot pushed a commit that referenced this issue Feb 18, 2025
@jade-guiton-dd jade-guiton-dd moved this from In Progress to Waiting for reviews in Collector: v1 Feb 19, 2025
@jade-guiton-dd
Contributor

jade-guiton-dd commented Feb 27, 2025

I'm reconsidering whether wrapping Providers (and Meters, and every instrument...) to add span/metric attributes is the most maintainable way to go about this.

I think there are a few more options we haven't yet fully explored:

  1. Adding component attributes as instrumentation scope attributes.

    For metrics and spans, we would keep the wrappers only for the Providers, and add the attributes when creating the Meter/Tracer. This would eliminate most of the complexity of the wrappers, which is mostly due to needing to wrap individual metric instruments.

    (Alternative A: Get rid of wrappers entirely and modify mdatagen to add the attributes manually in NewTelemetryBuilder. The caveat is that this would not work for instrumentation libraries outside of the Collector like otelgrpc.)

    As for logs, for consistency, we could modify the otelzap bridge to allow converting some Zap fields into scope attributes. Although, this may have some overhead if we have to recreate Loggers constantly.

    (Alternative B: Accept that our logs will never be OTel-native and keep using record-level attributes for logs only.)

    I believe the main point of contention is semantics: Should a scope only describe the code base producing the telemetry, or can it also describe a specific instantiation of said code base?
    (I think a "sub-resource" would make more sense here, but unless we add some API to the SDK to specialize Providers without recreating the entire exporting pipeline, I don't think it's doable.)

  2. Use SDK-level processors to add the attributes to the telemetry.

    As noted in their documentation, SDK processors are called in the order they are registered, and can modify the telemetry along the way without passing it to an exporter. This is, for example, used in the baggagecopy processor. This means that we could add a custom processor before the ones defined in the config, which adds the relevant attributes to the telemetry without wrapping the Providers or the Logger.

    There are a number of issues that complicate this approach, but I believe they are solvable:

    1. The go-contrib config package takes care of instantiating all the providers/processors/exporters, so we may need a new API to make it cooperate in adding our custom processor.

    2. Processors are set at the Provider level, and are passed the calling Context and a mutable span / log record. In the current state of things, this is not enough to determine the component attributes.

      My proposal would be to start threading our component attributes through the Context instead of the TelemetrySettings, by adding them as a Context key (see the sketch after this list).

      This key would be set by the service at all entry points into the Component, such as Factory.CreateTraces or Component.Start/Shutdown, and it would be updated as Contexts progress through the pipeline, by resetting the key inside a service-provided wrapper around the Consumer API (which we will need later to implement pipeline telemetry anyway). The WithoutAttributes function could then simply return a modified Context.

    3. Metric processors simply do not exist at the moment. This is especially disappointing as metrics are the most problematic signal to wrap in the current implementation.

      There is however an ongoing proposal to add them to the spec. (This somewhat related feature request in the SDK may also be of interest.)

      We could try to help move those proposals forward, but if we think they will take too long to come to fruition, we could still go with the Context-based approach, but implement it for metrics using wrappers (similar to the current implementation). And hopefully, we would eventually replace that with a metric processor-based approach.

What are your thoughts on these two proposals?
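
As a rough sketch of the Context-threading piece of option 2 (the function names are hypothetical, not existing Collector API), the attributes could be carried like this; a custom SDK span/log processor would then read them back out of the Context when a record is emitted:

import (
	"context"

	"go.opentelemetry.io/otel/attribute"
)

type componentAttrsKey struct{}

// ContextWithAttributes returns a Context carrying the component's attribute set.
// The service would call this at every entry point into a component
// (factory Create* methods, Start/Shutdown, the Consumer wrappers, etc.).
func ContextWithAttributes(ctx context.Context, attrs ...attribute.KeyValue) context.Context {
	return context.WithValue(ctx, componentAttrsKey{}, attribute.NewSet(attrs...))
}

// AttributesFromContext is what a custom SDK processor would call to stamp
// the attributes onto the span or log record it is handed.
func AttributesFromContext(ctx context.Context) attribute.Set {
	if set, ok := ctx.Value(componentAttrsKey{}).(attribute.Set); ok {
		return set
	}
	return *attribute.EmptySet()
}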

@djaglowski
Member Author

Setting aside the details of this design discussion for a moment, I think it's important to acknowledge that while the Collector is part of the OpenTelemetry ecosystem, the task of instrumenting it should be similar to what we would ask of any OpenTelemetry instrumentation user.

With that in mind, the vision statement for OpenTelemetry should be quite relevant to instrumentation users, and particularly relevant to our discussion.


The main question in my mind is why our experience using OpenTelemetry instrumentation isn't living up to the vision. Why is it so difficult (especially for us)? Why are we considering solutions that produce non-correlated telemetry? Why are we considering solutions which would require us to use our positions as community members to insert special cases into the instrumentation codebase?

IMO the root cause is the asymmetry between the traditional logger, which doesn't understand our data model, and the Tracer/MeterProvider. If we had a LoggerProvider which mirrors the design of the others and works naturally with the data model, then I think we'd have far less trouble deciding where these attributes should be placed. However, OpenTelemetry's Logs API is intended for use by log appender authors, so it's unclear that we should use it at all.

Have others arrived at the same problem when attempting to implement telemetry which can be correlated across signals? Does the project have an answer to this yet? Maybe something @tigrannajaryan, @jmacd, or someone else on the TC could answer?

@jade-guiton-dd
Contributor

jade-guiton-dd commented Feb 28, 2025

If we're searching for root causes for the complexity of this issue:

  • I agree that using Zap instead of our own LoggerProvider is part of it. Partly because it doesn't understand our data model (scopes / resources etc.), and partly because unlike the other fields in TelemetrySettings, it is already specialized to a specific scope instead of being the equivalent of a Provider.

  • But I think the main root cause is the fact that despite the already complex data model, there is no obvious place for these attributes. None of the existing "attribute-holding entities" match our use case, and thus their API cannot be used as is.

    Our attributes:

    • describe the source of the telemetry at a granular, sub-process level, unlike resources which are set when you set up the telemetry pipeline for the entire process;
    • describe a concrete instance of a component, unlike scopes which typically only describe the module and version of the telemetry-emitting code;
    • are set ahead of, and outside of the telemetry-emitting code, unlike metric/span/log attributes.

    As I mentioned above, I feel like ideally, a new concept like "sub-resources" could be introduced to properly take this use case into account, with an API that fits our needs better and does not require workarounds.

    (Even more ideally, I think we shouldn't have formalized specific attribute-holding entities in the first place. Instead of having to have exactly one of each and structuring the API around that, the spec could have been more generic and allowed to specialize your Logger/Tracer/Meter any number of times by adding attributes at any point in the pipeline. But that's only wishful thinking at this point.)

  • The lack of this concept could perhaps be mitigated with processors, but unfortunately, here too:

    • There is an asymmetry between signals: the lack of processors for metrics
    • The intended use case (and thus the API) does not quite fit: they are meant to do processing based on the Context and the emitted data, not based on the object which emitted it

Unfortunately, solving point 1 would require breaking changes to a fundamental part of our API, and would not solve point 2. And solving point 2 would require large changes to the OTel API and Data model. In other words, I think it's too late, and we have to resign ourselves to workarounds.

@mx-psi mx-psi moved this from Waiting for reviews to In Progress in Collector: v1 Mar 3, 2025
@jade-guiton-dd
Contributor

Summary of discussion about this on the Collector stability meeting:

  1. Using instrumentation scope attributes would be problematic because of the lack of support from backend vendors
  2. The solution currently used in my PR is fine and, while ugly, is unlikely to break things often or silently.
  3. However, we should discuss this issue (and possibly [telemetry] extend collector telemetry with additional attributes #12316) with the Go SIG to see if SDK-level features could let us rewrite this more cleanly later. I'll file an issue on the Go repo about it.

@jade-guiton-dd
Contributor

I opened this issue on the Go repo to ask if the Go SIG has ideas for a better implementation.

@jade-guiton-dd jade-guiton-dd moved this from In Progress to Waiting for reviews in Collector: v1 Mar 4, 2025
@tigrannajaryan
Member

However, OpenTelemetry's Logs API is intended for use by log appender authors, so it's unclear that we should use it at all.

This is being changed, see open-telemetry/opentelemetry-specification#4352 and open-telemetry/opentelemetry-specification#4438

Maybe something @tigrannajaryan, @jmacd, or someone else on the TC could answer?

@djaglowski I haven't read the entire thread, anything else you wanted me to help answer other than the question about log appenders?

@pellared may be able to help with Go Logs SDK. I think LoggerProvider was supposed to mirror the design of TracerProvider, but my knowledge may be outdated.

@jmacd
Contributor

jmacd commented Mar 4, 2025

open-telemetry/opentelemetry-go#6404 (comment)

I replied to the new issue posted by @jade-guiton-dd, since it narrows the discussion. I see at least three ideas being discussed there, and all of them are backed by a stalled OpenTelemetry discussion. To summarize:

  1. We can't use InstrumentationScope.Attributes, but why not?
  2. We have talked about bound metric instruments, but why not bound tracer/logger instruments?
  3. We have talked about a MetricsProcessor for extracting context-scoped attributes, but wouldn't it be nice if there was a first-class instrumentation API for this?

I'd like to see more energy in the OpenTelemetry Specification SIG discussing these topics. It's great that we're finally seeing these problems within our own codebase.

@mx-psi mx-psi moved this from Waiting for reviews to Blocked in Collector: v1 Mar 6, 2025
@axw
Contributor

axw commented Mar 7, 2025

(Apologies if my questions are answered already - there's a lot of context to take in here.)

I agree that using Zap instead of our own LoggerProvider is part of it. Partly because it doesn't understand our data model (scopes / resources etc.), and partly because unlike the other fields in TelemetrySettings, it is already specialized to a specific scope instead of being the equivalent of a Provider.

Is this something that could be changed? For example, could we replace the TelemetrySettings.Logger field with a method that instantiates a new logger, given an instrumentation scope and options, similar to if you were to call LoggerProvider.Logger? i.e.

diff --git a/component/telemetry.go b/component/telemetry.go
index 994064d9d..b2936ad5a 100644
--- a/component/telemetry.go
+++ b/component/telemetry.go
@@ -4,6 +4,10 @@
 package component // import "go.opentelemetry.io/collector/component"
 
 import (
+       "slices"
+
+       "go.opentelemetry.io/contrib/bridges/otelzap"
+       "go.opentelemetry.io/otel/log"
        "go.opentelemetry.io/otel/metric"
        "go.opentelemetry.io/otel/trace"
        "go.uber.org/zap"
@@ -13,9 +17,8 @@ import (
 
 // TelemetrySettings provides components with APIs to report telemetry.
 type TelemetrySettings struct {
-       // Logger that the factory can use during creation and can pass to the created
-       // component to be used later as well.
-       Logger *zap.Logger
+       // LoggerProvider that the factory can pass to other instrumented third-party libraries.
+       LoggerProvider log.LoggerProvider
 
        // TracerProvider that the factory can pass to other instrumented third-party libraries.
        TracerProvider trace.TracerProvider
@@ -26,3 +29,20 @@ type TelemetrySettings struct {
        // Resource contains the resource attributes for the collector's telemetry.
        Resource pcommon.Resource
 }
+
+func (s *TelemetrySettings) Logger(name string, options ...log.LoggerOption) *zap.Logger {
+       lp := boundOptionsLoggerProvider{p: s.LoggerProvider, options: options}
+       core := otelzap.NewCore(name, otelzap.WithLoggerProvider(&lp))
+       // (plus all of the stuff that telemetry.newLogger does here...)
+       return zap.New(core)
+}
+
+type boundOptionsLoggerProvider struct {
+       p       log.LoggerProvider
+       options []log.LoggerOption
+}
+
+func (b *boundOptionsLoggerProvider) Logger(name string, options ...log.LoggerOption) log.Logger {
+       options = slices.Insert(options, 0, b.options...)
+       return b.p.Logger(name, options...)
+}

That will still mean a lot of changes to all the components, but it seems achievable.

(I realise there's a bunch of issues to solve here, just looking for an easy win!)

@jade-guiton-dd
Contributor

jade-guiton-dd commented Mar 7, 2025

@jmacd To react to your point 1:
Besides the blurry semantics, I believe the main reason we don't want to use scope attributes here is not spec-related; it's just that they're not well supported anywhere (usually ignored by backend vendors, and even otelzap doesn't support them). This is of course fixable. The question is whether we want to do that work for something it seems only we would be using, and how much we are willing to strongarm backend vendors into supporting them. We could provide some kind of migration path like "convert into span/metric/log attributes with a transform processor", though I'm afraid that might stick and become the de facto solution.

@axw This would be a huge breaking change, and unfortunately, the component module which defines TelemetrySettings was marked as stable 2 weeks ago (#9376). A big change like that would probably have to wait until v2.0 (assuming we're allowing ourselves to do that eventually?)

I've been trying to prototype how to recreate a zap.Logger with new scope attributes without breaking the API or adding special cases in otelzap, and my current idea is pretty similar to yours, except (similar to the current code) we use a custom Zap core which references a shared struct containing the log.LoggerProvider (plus a customizable "wrapper" function containing "all of the stuff that telemetry.newLogger does"). If people think we should proceed with scope attributes despite its issues, I can finish up that prototype and make a draft PR to compare.

@axw
Contributor

axw commented Mar 8, 2025

@axw This would be a huge breaking change, and unfortunately, the component module which defines TelemetrySettings was marked as stable 2 weeks ago (#9376). A big change like that would probably have to wait until v2.0 (assuming we're allowing ourselves to do that eventually?)

I see, that's unfortunate - this seems like a pretty fundamental thing to get right.

As an alternative, the Logger field could remain as-is, and a new LoggerProvider field and new method (e.g. NewLogger or ScopeLogger or something) could be added.

Besides the blurry semantics, I believe the main reason we don't want to use scope attributes here is not spec-related, it's just that they're not well supported anywhere (usually ignored by backend vendors, but even otelzap doesn't support them).
...
If people think we should proceed with scope attributes despite its issues, I can finish up that prototype and make a draft PR to compare.

FWIW, I don't think semantics should be compromised due to what vendors do right now.

It's unclear to me that instrumentation scope is a good fit for all of the universal telemetry attributes though. I've always understood instrumentation scope as being related to static source code, and not some instantiation of that code. Am I wrong on that? If I'm not wrong, then I think at least the following attributes don't really belong at the instrumentation scope level:

  • otelcol.component.id -- this is obviously per-instance, and not statically defined at the code level
  • otelcol.signal and otelcol.signal.output -- all components can theoretically deal with all signal types, which one depends on the instantiation

otelcol.component.id is somewhat similar to service.instance.id, and I tend to agree with @jade-guiton-dd's idea of "sub-resources". This one is a property of the thing producing the telemetry; it is not a property of the telemetry, like otelcol.signal.

@jade-guiton-dd
Contributor

jade-guiton-dd commented Mar 11, 2025

After bringing up the topic in a Spec SIG meeting, it sounds like the current intent for instrumentation scope attributes would very much encompass our use case, so scope attributes should be fine semantics-wise.

With that issue out of the way, I think the only big remaining concern with switching to scope attributes is their poor support by backend vendors. I think it would be reasonable to push for backends to support them, considering they have been in the spec for a while, their handling should be quite similar to resource attributes, and support would benefit other use cases.

We may want to provide a compatibility solution for users who want to access them before support is ready (the transform processor should be fine if they have a Collector on the path, but maybe there are more general solutions?), but I think this could be handled separately.


To recap the benefits of the scope attribute approach vs. the current metric/log/span approach:

  • Although the Logger wrapper becomes a bit more complicated, the code is overall simpler as we no longer need to wrap Logger/Tracer/Meter and every single metric instrument, with the API coupling this adds
  • Reduced data volume, as the attributes are only sent once per scope instead of on every metric point / log record / span

And I don't think there are other approaches we could make use of right now in the current state of things. For that reason, I'm planning to bring the topic up again in next week's Collector stability and Collector SIG meetings, and unless there are strong objections, I will keep working on a new PR using scope attributes and file it so people can compare with the old PR code-wise.
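
For comparison, here is a rough sketch of the "wrap only the Providers and attach the attributes at Meter/Tracer creation" shape described above; it is an illustration under that assumption, not necessarily how #12617 implements it.

import (
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
	"go.opentelemetry.io/otel/trace"
)

// Only the providers are wrapped: the component attributes are attached to the
// instrumentation scope when the Meter/Tracer is created, so individual
// instruments and spans no longer need wrapping.
type scopedMeterProvider struct {
	metric.MeterProvider
	attrs []attribute.KeyValue
}

func (p scopedMeterProvider) Meter(name string, opts ...metric.MeterOption) metric.Meter {
	opts = append(opts, metric.WithInstrumentationAttributes(p.attrs...))
	return p.MeterProvider.Meter(name, opts...)
}

type scopedTracerProvider struct {
	trace.TracerProvider
	attrs []attribute.KeyValue
}

func (p scopedTracerProvider) Tracer(name string, opts ...trace.TracerOption) trace.Tracer {
	opts = append(opts, trace.WithInstrumentationAttributes(p.attrs...))
	return p.TracerProvider.Tracer(name, opts...)
}

Logs would still go through the otelzap bridge, which is where the custom Zap core discussed earlier comes in.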

@jade-guiton-dd
Contributor

Still work in progress, but I filed a PR with the scope attribute implementation: #12617

@jade-guiton-dd
Contributor

I filed an issue on collector-contrib to track support for instrumentation scope attributes across the exporters there: open-telemetry/opentelemetry-collector-contrib#38744

github-merge-queue bot pushed a commit that referenced this issue Mar 28, 2025
#### Description

Fork of #12384 to showcase how component attributes can be injected into
scope attributes instead of log/metric/span attributes. See that PR for
more context.

To see the diff from the previous PR, filter changes starting from the
"Prototype using scope attributes" commit.

#### Link to tracking issue
Resolves #12217 
Also incidentally resolves #12213 and resolves #12117

#### Testing
I updated the existing tests to check for scope attributes, and did some
manual testing with a debug exporter to check that the scope attributes
are added/removed properly.

---------

Co-authored-by: Pablo Baeyens <[email protected]>
@github-project-automation github-project-automation bot moved this from Waiting for reviews to Done in Collector: v1 Mar 28, 2025