[exporter/kafka] Replace "topic" setting by "traces_topic", "logs_topic" and "metrics_topic" #35432
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
This is still relevant, and I would actually consider this a bug rather than a feature request, because setting the provided configuration value leads to runtime errors.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
Anyone to respond except a bot? |
@aklemp this seems reasonable to me on the surface. Before pressing ahead, I'd like to discuss the alternatives a bit.
Agreed.
On a more general note, it may be useful for the exporter to dynamically choose the topic based on some information it has about the client, such as a tenant ID. This could be done at the request/batch level, so you wouldn't need to evaluate it for every single message. For example you might set a topic name template to something like
There's another alternative that I have been thinking about on and off: what if we sent everything to the same topic, and included the signal type and encoding as message headers? Then we could support producing to and receiving from a single topic: the receiver would use the signal type and encoding headers to figure out how to decode the data, and use the signal type header to route to the correct pipeline.
Have you considered this option? Would it suit your needs, or do you prefer separating signals into different topics?
@axw Thank you for your response.
I'm not sure about choosing topics dynamically based on client information. We currently don't have a use case for that, and to me it would probably be a decision per telemetry type, e.g. logs for this client go here and logs for that client go there. The telemetry type itself is already differentiated by the collector, and I could apply different pipelines and configuration for that, so the exporter doesn't have to worry.
Sending all telemetry types via one topic is possible; it is basically an internal contract between exporter and receiver, and the OpenTelemetry user isn't aware of that detail. But after thinking about it, I found several arguments against it from an architecture, security and IT operations perspective.
From an exporter/receiver development perspective, there are also a few arguments against joining all telemetry data on a single topic.
So overall, I would still prefer a solution with separate but easy-to-configure topics (without duplicating the Kafka cluster configuration), together with #32735 (just closed because of inactivity).
@aklemp thank you! That all makes sense to me, except:
I ask because if these are per-signal concerns, then I would be worried about a "slippery slope" of
OK. For what it's worth, this is a concern for my team, that's why I brought it up. It's a way of enabling multi-tenancy: https://kafka.apache.org/documentation/#multitenancy-topic-naming
One major benefit of having a combined topic for all the signals is that it may lead to fewer partitions. This can matter at large scale (e.g. when combined with per-tenant topics), as each cluster can only handle so many partitions, and managed Kafka services (e.g. AWS MSK) typically have a per-partition cost.
I am going to investigate the template approach a bit more. Essentially what I have in mind is to enable setting a config like this:
```yaml
receivers:
  kafka:
    topic: "custom_${signal}_topic" # defaults to "otlp_${signal}"
exporters:
  kafka:
    topic: "custom_${signal}_topic" # defaults to "otlp_${signal}"
```
(Not necessarily with that syntax.) I think we need to make that work to enable namespacing, but if for whatever reason it can't be done then I think we could consider the
I agree that it sounds strange to do encryption or virus scanning per signal type. Apart from the fact that it would simply give the user the freedom to do so, one might argue that metrics and traces have a fairly well-defined format and transport technical information. Logs, however, could contain anything, including confidential data (not a great idea to log such things, but sometimes it is what it is...). Processes like those mentioned increase latency and resource consumption (e.g. CPU, memory), so one might want to apply them only to the risky signals. A template like
For the template, I can think of a few options:
Pros:
Cons:
Pros:
Cons:
Basically the same pros/cons as OTTL, but a little less verbose and a little more expressive (e.g. it has conditionals built into the language). Expr-lang is used in the collector already (namely in receivercreator), but is less prolific than OTTL.
Simple case:
I'm currently looking into whether we could combine either OTTL expressions or expr-lang with confmap's variable resolution. That way we could do something like
Of course it now occurs to me that if we use confmap variable syntax, we'll need to escape it in configurations, which may become a footgun. I can't think of a better alternative at the moment...
I like the approach of listing alternatives with pros and cons. Unfortunately, I don't have deep experience with either Go or OTel Collector best practices, so I cannot give a useful opinion on the options. The only thing I can say, from a user perspective, is that simpler and less verbose is probably better.
Thanks @aklemp. I'll bring this to the attention of other code owners and go from there.
I've opened a proposal here: #38888
Deprecate `topic` and `encoding`, and introduce signal-specific equivalents:
- `logs::topic`, `metrics::topic`, and `traces::topic`
- `logs::encoding`, `metrics::encoding`, and `traces::encoding`
This enables users to explicitly define a configuration equivalent to the default configuration, or some variation thereof. It also enables specifying different encodings for each signal type, which may be important due to the fact that some encodings only support a subset of signals.
Closes open-telemetry#35432
#### Description
Deprecate `topic` and `encoding`, and introduce signal-specific equivalents:
- `logs::topic`, `metrics::topic`, and `traces::topic`
- `logs::encoding`, `metrics::encoding`, and `traces::encoding`
This enables users to explicitly define a configuration equivalent to the default configuration, or some variation thereof. It also enables specifying different encodings for each signal type, which may be important due to the fact that some encodings only support a subset of signals.
#### Link to tracking issue
Fixes open-telemetry#35432
#### Testing
Unit tests added.
#### Documentation
Updated README.
Co-authored-by: Antoine Toulme
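In collector configuration notation, `logs::topic` denotes the key `topic` nested under `logs`. A minimal sketch of what an exporter configuration using the signal-specific settings might look like is shown below. The broker address, the topic names and the particular encodings are assumptions chosen for illustration; check the exporter README of the release you run for the authoritative list of options and encodings.

```yaml
exporters:
  kafka:
    brokers: ["localhost:9092"]  # assumed broker address
    traces:
      topic: custom_traces       # hypothetical topic name
      encoding: otlp_proto
    metrics:
      topic: custom_metrics      # hypothetical topic name
      encoding: otlp_proto
    logs:
      topic: custom_logs         # hypothetical topic name
      encoding: otlp_json        # each signal may use its own encoding
```

With a layout like this, a single `kafka` exporter can be referenced from the traces, metrics and logs pipelines without the three signals ending up on one topic.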
@axw I was wondering when and how version v0.124.0, which contains the changes, will be published on Docker Hub. The latest version there is 0.123.0 from 22 days ago...
Component(s)
exporter/kafka
Is your feature request related to a problem? Please describe.
Inspired by #32735 because it is a related problem:
When the setting "topic" is not specified, the same kafka exporter config can be used in all three pipelines if the topic names match the default values:
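For illustration, a shared setup of this kind might look like the sketch below. The OTLP receiver, the broker address and the pipeline layout are assumptions made for the example and are not taken from the original report; the topic names in the comment are the exporter's default topic names.

```yaml
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  kafka:
    brokers: ["localhost:9092"]  # assumed broker address
    # No "topic" is set, so each signal goes to its default topic:
    # otlp_spans, otlp_metrics and otlp_logs.

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [kafka]
    metrics:
      receivers: [otlp]
      exporters: [kafka]
    logs:
      receivers: [otlp]
      exporters: [kafka]
```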
If the topic is set to any value, this structure still works from the exporter's perspective.
What happens in this case is that the three exporters will send to the same topic. This is a race condition that will succeed in 1/3 of scenarios at the receiving end (see #32735).
To avoid this problem, the user must create three different exporters for each pipeline to set custom topic names. This is error prone and inconsistent with the default behavior that allows having one exporter for all three pipelines with the default topic names.
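The workaround could be sketched roughly as follows. The broker address, the topic names and the OTLP receiver are hypothetical; the point is that everything except `topic` has to be repeated in each exporter.

```yaml
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  kafka/traces:
    brokers: ["localhost:9092"]  # assumed; duplicated in every exporter
    topic: custom_traces         # hypothetical topic name
  kafka/metrics:
    brokers: ["localhost:9092"]
    topic: custom_metrics        # hypothetical topic name
  kafka/logs:
    brokers: ["localhost:9092"]
    topic: custom_logs           # hypothetical topic name

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [kafka/traces]
    metrics:
      receivers: [otlp]
      exporters: [kafka/metrics]
    logs:
      receivers: [otlp]
      exporters: [kafka/logs]
```

Any cluster-level setting (authentication, TLS, protocol version and so on) would have to be kept in sync across all three exporter blocks by hand.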
Describe the solution you'd like
Having three different topic names by default but being able to override them with only a single one is a strange feature.
Just create three topic properties of the kafka exporter:
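Using the setting names from this issue's title, the idea could be sketched as below. These keys are the proposal, not options the exporter supported when this was written, and the broker address and topic values are hypothetical.

```yaml
exporters:
  kafka:
    brokers: ["localhost:9092"]    # assumed broker address
    traces_topic: custom_traces    # proposed setting, hypothetical value
    logs_topic: custom_logs        # proposed setting, hypothetical value
    metrics_topic: custom_metrics  # proposed setting, hypothetical value
```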
Alternative definition (solution should match for exporter and receiver):
Describe alternatives you've considered
The documentation gives some hints about determining the actual topic:
Current workaround: define three exporters with all properties redundant except the `topic` property and use them individually in three pipelines.
Additional context
No response