[exporter/kafka] Replace "topic" setting by "traces_topic", "logs_topic" and "metrics_topic" #35432

Closed
aklemp opened this issue Sep 26, 2024 · 16 comments · Fixed by #39204
Labels
enhancement · exporter/kafka

Comments

aklemp commented Sep 26, 2024

Component(s)

exporter/kafka

Is your feature request related to a problem? Please describe.

Inspired by #32735 because it is a related problem:

When the "topic" setting is not specified, the same Kafka exporter config can be used in all three pipelines because each signal falls back to its default topic name:

exporters:
  kafka:

service:
  pipelines:
    metrics:
      exporters: [kafka] # publishes to topic otlp_metrics
    logs:
      exporters: [kafka] # publishes to topic otlp_logs
    traces:
      exporters: [kafka] # publishes to topic otlp_spans

If the topic is set to any custom value, this structure still works from the exporter's perspective:

exporters:
  kafka:
    topic: custom_traces_topic

service:
  pipelines:
    metrics:
      exporters: [kafka] # publishes to topic custom_traces_topic
    logs:
      exporters: [kafka] # publishes to topic custom_traces_topic
    traces:
      exporters: [kafka] # publishes to topic custom_traces_topic

What happens in this case is that all three signals are sent to the same topic. This is a race condition that will only succeed in one out of three scenarios at the receiving end (see #32735).

To avoid this problem, the user must create a separate exporter per pipeline just to set custom topic names. This is error prone and inconsistent with the default behavior, which allows a single exporter to serve all three pipelines with the default topic names.

Describe the solution you'd like

Having three different topic names by default but only being able to override them with a single value is a strange feature.
Instead, add three topic properties to the kafka exporter:

exporters:
  kafka:
    traces_topic: custom_traces_topic # default otlp_spans
    metrics_topic: custom_metrics_topic # default otlp_metrics
    logs_topic: custom_logs_topic # default otlp_logs

service:
  pipelines:
    metrics:
      exporters: [kafka] # publishes to topic custom_metrics_topic
    logs:
      exporters: [kafka] # publishes to topic custom_logs_topic
    traces:
      exporters: [kafka] # publishes to topic custom_traces_topic

Alternative definition (the chosen solution should be consistent between exporter and receiver):

exporters:
  kafka:
    topic:
      traces: custom_traces_topic # default otlp_spans
      metrics: custom_metrics_topic # default otlp_metrics
      logs: custom_logs_topic # default otlp_logs

Describe alternatives you've considered

The documentation gives some hints about how the actual topic is determined:

  1. The client application sending telemetry data to OpenTelemetry should not be concerned with setting topic names in attributes that are used internally to transport OpenTelemetry information using Kafka.
  2. The context could be configured with topic names.
    • no example found of how to configure this
    • requires logic to derive the topic to use from the telemetry data
    • evaluated for every message
    • more complex setup than simply defining three static properties
  3. The feature enhancement proposed in this ticket.

Current workaround: define three exporters that are redundant in all properties except the topic, and use them individually in the three pipelines.
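
For illustration, a minimal sketch of that workaround (the broker address is a placeholder; any other shared settings would have to be duplicated the same way):

exporters:
  kafka/traces:
    brokers: ["kafka:9092"]
    topic: custom_traces_topic
  kafka/metrics:
    brokers: ["kafka:9092"]
    topic: custom_metrics_topic
  kafka/logs:
    brokers: ["kafka:9092"]
    topic: custom_logs_topic

service:
  pipelines:
    traces:
      exporters: [kafka/traces]
    metrics:
      exporters: [kafka/metrics]
    logs:
      exporters: [kafka/logs]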

Additional context

No response

aklemp added the enhancement and needs triage labels on Sep 26, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

atoulme removed the needs triage label on Oct 2, 2024

github-actions bot commented Dec 2, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label on Dec 2, 2024

aklemp commented Dec 2, 2024

This is still relevant, and I would actually consider it a bug rather than a feature request, because setting the provided configuration value leads to runtime errors.

github-actions bot removed the Stale label on Dec 3, 2024

github-actions bot commented Feb 3, 2025

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label on Feb 3, 2025

aklemp commented Feb 3, 2025

Anyone to respond except a bot?

github-actions bot removed the Stale label on Feb 4, 2025

axw commented Mar 18, 2025

@aklemp this seems reasonable to me on the surface. Before pressing ahead, I'd like to discuss the alternatives a bit.

The client application sending telemetry data to OpenTelemetry should not be concerned with setting topic names in attributes that are used internally to transport OpenTelemetry information using Kafka.

Agreed.

On a more general note, it may be useful for the exporter to dynamically choose the topic based on some information it has about the client, such as a tenant ID. This could be done at the request/batch level, so you wouldn't need to evaluate it for every single message.

For example you might set a topic name template to something like ${tenant}-otlp_logs. In theory the solution to that could also work for signals, e.g. set topic to ${tenant}-otlp_${signal}, but that is a little bit inflexible.

There's another alternative that I have been thinking about on and off: what if we sent everything to the same topic, and included the signal type and encoding as message headers? Then we could support producing to and receiving from a single topic: the receiver would use the signal type and encoding headers to figure out how to decode the data, and use the signal type header to route to the correct pipeline.

Have you considered this option? Would it suit your needs, or do you prefer separating signals into different topics?
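
Very roughly, a record on such a shared topic might look like this (the header names here are purely hypothetical, just to illustrate the idea):

# hypothetical layout of a record on a single shared topic
headers:
  otlp_signal: logs          # logs | metrics | traces; used to route to the right pipeline
  otlp_encoding: otlp_proto  # tells the receiver how to decode the payload
value: <serialized telemetry payload>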


aklemp commented Mar 19, 2025

@axw Thank you for your response.

I'm not sure about choosing topics dynamically based on client information. We currently don't have a use case for that, and to me it would probably be a decision per telemetry type, like logs for this client go here and logs for that client go there. The telemetry type itself is already differentiated by the collector, and I could apply different pipelines and configuration for that, so the exporter doesn't have to worry about it.

Sending all telemetry types via one topic is possible and is basically an internal contract between exporter and receiver; the OpenTelemetry user isn't aware of that detail. But after thinking about it, I found several arguments against it from an architecture, security, and IT operations perspective.

  • Schemas could be associated per telemetry type (and I assume this is already the case, as the receiver detects mismatching telemetry data on the same topic).
  • Depending on requirements, different policies could be applied per topic, such as validation, virus scanning, or encryption.
  • Messages of the different telemetry types typically vary a lot in number and size. Separate topics can be tuned accordingly (e.g. partitioning, retention times, replication factors).
  • Separate topics are easier to monitor, because one can easily spot a problem affecting a single telemetry type, and alert rules can be defined based on the expected message volume.
  • Other people such as Kafka admins can easily see what kind of data is transported (e.g. only metrics and traces go via Kafka while logs are delivered to a different system directly).

From an exporter/receiver development perspective, there are also a few arguments against joining all telemetry data on a single topic.

  • It already works with separate topics as long as we don't change the topic configuration of exporter/receiver.
  • Only the handling of configuration has to be changed, not the actual implementation.

So overall, I would still prefer a solution with separate but easily configurable topics (without duplicating the Kafka cluster configuration), together with #32735 (which was just closed due to inactivity).


axw commented Mar 20, 2025

@aklemp thank you! That all makes sense to me, except:

  • Virus scanning: is that relevant for logs, metrics, traces, or profiles? It sounds like a concern for other types of data.
  • Encryption: why would you encrypt one and not the other?

I ask because if these are per-signal concerns, then I would be worried about a "slippery slope" of logs_encryption, metrics_encryption, etc. I think of these as cross-cutting concerns that should apply to all the data; and in the (I expect unusual) event that they do not, then you could always create a separate exporter.

I'm not sure about choosing topics dynamically based on client information. We currently don't have a use case for that, and to me it would probably be a decision per telemetry type, like logs for this client go here and logs for that client go there. The telemetry type itself is already differentiated by the collector, and I could apply different pipelines and configuration for that, so the exporter doesn't have to worry about it.

OK. For what it's worth, this is a concern for my team, that's why I brought it up. It's a way of enabling multi-tenancy: https://kafka.apache.org/documentation/#multitenancy-topic-naming

One major benefit of having a combined topic for all the signals is that it may lead to fewer partitions. This can matter at large scale (e.g. when combined with per-tenant topics), as each cluster can only handle so many partitions, and managed Kafka services (e.g. AWS MSK) typically have a per-partition cost.

I am going to investigate the template approach a bit more. Essentially what I have in mind is to enable setting a config like this:

receivers:
  kafka:
    topic: "custom_${signal}_topic" # defaults to "otlp_${signal}"

exporters:
  kafka:
   topic: "custom_${signal}_topic" # defaults to "otlp_${signal}"

(Not necessarily with that syntax.)

I think we need to make that work to enable namespacing, but if for whatever reason it can't be done then I think we could consider the <signal>_topic config option.


aklemp commented Mar 20, 2025

I agree that it sounds strange to do encryption or virus scanning per signal type. Apart from the fact that it would simply give the user the freedom to do so, one might argue that metrics and traces have a fairly well-defined format and transport technical information. Logs, however, could contain anything, including confidential data (not a great idea to log such things, but sometimes it is what it is...). Processes like those mentioned increase latency and resource consumption (e.g. CPU, memory), so one might want to apply them only to the risky signals.

A template like custom_${signal}_topic would be fine, as the result would be the same: the signal types end up in separate topics.


axw commented Mar 20, 2025

For the template, I can think of a few options:

  1. Use OTTL for evaluating the topic name

Pros:

  • Doesn't introduce a new template/expression language

Cons:

  • We may need to introduce a new config field like topic_ottl or something similar; otherwise we would need to have a way to reliably differentiate a static topic name from a dynamic template.
  • A little more verbose for simple cases, e.g. Format("%s_otlp_%s", request["tenant-id"], signal)
  • A lot more verbose for more complex cases, e.g. where different signals follow different formats such as {"logs": "logs_topic", "traces": Format("%s.otlp_traces", request["tenant-id"]), "metrics": Format("%s.otlp_metrics", request["tenant-id"])}[signal] (and there would be no short-circuiting of expressions, i.e. we would have to evaluate the entire map in that example before we know which one to choose)
  2. Use Go's text/template

Pros:

  • This could trivially be supported in the existing topic config field: we could parse the config value as a template because a static topic name cannot have "{{" in it (that would be invalid for a Kafka topic name)

Cons:

  • Introduces a template language not commonly used in OTel Collector configurations
  • A bit more verbose than above for simple cases, e.g. {{printf "%s_otlp_%s" .Signal (index .Request "tenant-id") }}
  • Possible but verbose to implement the more complex cases using conditionals, e.g. {{if eq .Signal "logs"}}logs_topic{{else}}{{ index .Request "tenant-id" }}.otlp_{{ .Signal }}{{end}}
  3. Use https://github.com/expr-lang/expr

Basically the same pros/cons as OTTL, but a little less verbose and a little more expressive (e.g. it has conditionals built into the language). Expr-lang is used in the collector already (namely in receivercreator), but is less widely used than OTTL.

Simple case: request["tenant-id"] + "_otlp_" + signal
Complex case: signal == "logs" ? "logs_topic" : request["tenant-id"] + "_otlp_" + signal


I'm currently looking into whether we could combine either OTTL expressions or expr-lang with confmap's variable resolution. That way we could do something like ${request["tenant-id"]}_otlp_${signal} where the expression inside the ${...} is interpreted as either OTTL or expr-lang. I think this will give us the best of both worlds.


axw commented Mar 20, 2025

Of course it now occurs to me that if we use confmap variable syntax, we'll need to escape it in configurations, which may become a footgun. I can't think of a better alternative at the moment...


aklemp commented Mar 21, 2025

I like the approach of listing alternatives with pros and cons. Unfortunately, I don't have deep experience with either Go or OTel Collector best practices, so I cannot give a useful opinion on the options. The only thing I can say from a user perspective is that simple and less verbose is probably better.


axw commented Mar 21, 2025

Thanks @aklemp. I'll bring this to the attention of other code owners and go from there.


axw commented Mar 24, 2025

I've opened a proposal here: #38888

axw added a commit to axw/opentelemetry-collector-contrib that referenced this issue Apr 7, 2025
Deprecate `topic` and `encoding`, and introduce
signal-specific equivalents:
- `logs::topic`, `metrics::topic`, and `traces::topic`
- `logs::encoding`, `metrics::encoding`, and `traces::encoding`

This enables users to explicitly define a configuration
equivalent to the default configuration, or some variation
thereof. It also enables specifying different encodings for
each signal type, which may be important due to the fact that
some encodings only support a subset of signals.

Closes
open-telemetry#35432
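
Read as nested YAML (the `::` notation above denotes nesting), the replacement configuration would look roughly like this; the topic names are placeholders, and the encoding line is shown only for illustration:

exporters:
  kafka:
    logs:
      topic: custom_logs_topic
      encoding: otlp_proto
    metrics:
      topic: custom_metrics_topic
    traces:
      topic: custom_traces_topic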
LucianoGiannotti pushed a commit to LucianoGiannotti/opentelemetry-collector-contrib that referenced this issue Apr 9, 2025

#### Description

Deprecate `topic` and `encoding`, and introduce signal-specific
equivalents:

- `logs::topic`, `metrics::topic`, and `traces::topic`
- `logs::encoding`, `metrics::encoding`, and `traces::encoding`

This enables users to explicitly define a configuration equivalent to
the default configuration, or some variation thereof. It also enables
specifying different encodings for each signal type, which may be
important due to the fact that some encodings only support a subset of
signals.

#### Link to tracking issue

Fixes
open-telemetry#35432

#### Testing

Unit tests added.

#### Documentation

Updated README.

---------

Co-authored-by: Antoine Toulme <[email protected]>

aklemp commented Apr 23, 2025

@axw I was wondering when and how version v0.124.0 containing the changes will be published on Docker Hub. The latest version there is 0.123.0 from 22 days ago...


axw commented Apr 23, 2025

@aklemp please see open-telemetry/opentelemetry-collector-releases#926

Fiery-Fenix pushed a commit to Fiery-Fenix/opentelemetry-collector-contrib that referenced this issue Apr 24, 2025