[collector] [receiver/k8s_observer] filelog/regex_parser configuration do not work as expected #39163
Comments
Thanks for reporting this, @kuisathaverat. I was able to reproduce the issue, but I'm still not sure where the problem comes from. FWIW, the configuration produced by the receiver creator looks correct:
2025-04-07T12:38:33.553Z info [email protected]/observerhandler.go:201 starting receiver {"name": "filelog/4142839f-4ca2-462b-9e62-e0d72eed67f2_apache", "endpoint": "10.244.0.6", "endpoint_id": "k8s_observer/4142839f-4ca2-462b-9e62-e0d72eed67f2/apache", "config": {"include":["/var/log/pods/default_apache-84d6d9fcbc-kmb7b_4142839f-4ca2-462b-9e62-e0d72eed67f2/apache/*.log"],"include_file_name":false,"include_file_path":true,"operators":[{"id":"container-parser","type":"container"},{"field":"attributes.tag","id":"some","type":"add","value":"hints"},{"id":"apache-logs","regex":"^(?P<source_ip>\\d+\\.\\d+.\\d+\\.\\d+)\\s+-\\s+-\\s+\\[(?P<timestamp_log>\\d+/\\w+/\\d+:\\d+:\\d+:\\d+\\s+\\+\\d+)\\]\\s\"(?P<http_method>\\w+)\\s+(?P<http_path>.*)\\s+(?P<http_version>.*)\"\\s+(?P<http_code>\\d+)\\s+(?P<http_size>\\d+)$","type":"regex_parser"}],"start_at":"end"}}
2025-04-07T12:38:33.553Z info adapter/receiver.go:41 Starting stanza receiver {"name": "filelog/4142839f-4ca2-462b-9e62-e0d72eed67f2_apache/receiver_creator/logs{endpoint=\"10.244.0.6\"}/k8s_observer/4142839f-4ca2-462b-9e62-e0d72eed67f2/apache"}
2025-04-07T12:38:33.758Z info fileconsumer/file.go:265 Started watching file {"name": "filelog/4142839f-4ca2-462b-9e62-e0d72eed67f2_apache/receiver_creator/logs{endpoint=\"10.244.0.6\"}/k8s_observer/4142839f-4ca2-462b-9e62-e0d72eed67f2/apache", "component": "fileconsumer", "path": "/var/log/pods/default_apache-84d6d9fcbc-kmb7b_4142839f-4ca2-462b-9e62-e0d72eed67f2/apache/0.log"}
The config part:
{
  "name": "filelog/4142839f-4ca2-462b-9e62-e0d72eed67f2_apache",
  "endpoint": "10.244.0.6",
  "endpoint_id": "k8s_observer/4142839f-4ca2-462b-9e62-e0d72eed67f2/apache",
  "config": {
    "include": [
      "/var/log/pods/default_apache-84d6d9fcbc-kmb7b_4142839f-4ca2-462b-9e62-e0d72eed67f2/apache/*.log"
    ],
    "include_file_name": false,
    "include_file_path": true,
    "operators": [
      {
        "id": "container-parser",
        "type": "container"
      },
      {
        "field": "attributes.tag",
        "id": "some",
        "type": "add",
        "value": "hints"
      },
      {
        "id": "apache-logs",
        "regex": "^(?P<source_ip>\\d+\\.\\d+.\\d+\\.\\d+)\\s+-\\s+-\\s+\\[(?P<timestamp_log>\\d+/\\w+/\\d+:\\d+:\\d+:\\d+\\s+\\+\\d+)\\]\\s\"(?P<http_method>\\w+)\\s+(?P<http_path>.*)\\s+(?P<http_version>.*)\"\\s+(?P<http_code>\\d+)\\s+(?P<http_size>\\d+)$",
        "type": "regex_parser"
      }
    ],
    "start_at": "end"
  }
}
I will try to find more time to investigate this soon.
Looking more carefully at the regex produced by the receiver creator, I can confirm my original assumption that the issue lies somewhere in the YAML unmarshaling. The resulting regex indeed carries extra escapes compared to the original one:
< ^(?P<source_ip>\d+\.\d+.\d+\.\d+)\s+-\s+-\s+\[(?P<timestamp_log>\d+/\w+/\d+:\d+:\d+:\d+\s+\+\d+)\]\s"(?P<http_method>\w+)\s+(?P<http_path>.*)\s+(?P<http_version>.*)"\s+(?P<http_code>\d+)\s+(?P<http_size>\d+)$
---
> ^(?P<source_ip>\\d+\\.\\d+.\\d+\\.\\d+)\\s+-\\s+-\\s+\\[(?P<timestamp_log>\\d+/\\w+/\\d+:\\d+:\\d+:\\d+\\s+\\+\\d+)\\]\\s\"(?P<http_method>\\w+)\\s+(?P<http_path>.*)\\s+(?P<http_version>.*)\"\\s+(?P<http_code>\\d+)\\s+(?P<http_size>\\d+)$
That might be something coming from the unmarshaling that takes place at
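To illustrate why altered escapes would break the parser, here is a minimal sketch in Python (not the collector's Go code). It assumes the pattern really reaches the regex engine with every backslash doubled, which is the hypothesis in the comment above. The sample access-log line is hypothetical and not taken from the issue. Python's `re` module accepts the same `(?P<name>...)` syntax used here, so it serves only as a stand-in for the `regex_parser` operator.

```python
import re

# Regex exactly as written in the original filelog configuration.
original = r'^(?P<source_ip>\d+\.\d+.\d+\.\d+)\s+-\s+-\s+\[(?P<timestamp_log>\d+/\w+/\d+:\d+:\d+:\d+\s+\+\d+)\]\s"(?P<http_method>\w+)\s+(?P<http_path>.*)\s+(?P<http_version>.*)"\s+(?P<http_code>\d+)\s+(?P<http_size>\d+)$'

# Same pattern with every backslash doubled (assumed failure mode):
# "\d" becomes "\\d", i.e. a literal backslash followed by the letter d.
doubled = original.replace("\\", "\\\\")

# Hypothetical Apache access-log line (an assumption, not from the issue).
sample = '10.244.0.1 - - [07/Apr/2025:12:38:33 +0000] "GET / HTTP/1.1" 200 45'

match_original = re.search(original, sample)
match_doubled = re.search(doubled, sample)

# The original pattern extracts the named fields; the doubled one finds nothing.
print(match_original.groupdict() if match_original else None)
print(match_doubled)
```

With the doubled pattern the very first token after `^` demands a literal backslash, so the line is rejected before any field is captured.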
Pinging code owners for receiver/receivercreator: @dmitryax. See Adding Labels via Comments if you do not have permissions to add labels yourself. For example, comment '/label priority:p2 -needs-triaged' to set the priority and remove the needs-triaged label.
It seems to be a generic issue with the receiver creator (not only the annotation-based discovery), in how escapes are handled in order to support escaped backticks like the one in the test case: opentelemetry-collector-contrib/receiver/receivercreator/config_expansion_test.go Line 48 in 2a7d122
I could reproduce it with the following static receiver creator configuration:
receiver_creator/logsstatic:
  watch_observers: [ k8s_observer ]
  receivers:
    filelog/apache:
      rule: type == "pod.container" && container_name == "apache"
      config:
        include:
          - /var/log/pods/`pod.namespace`_`pod.name`_`pod.uid`/`container_name`/*.log
        include_file_name: false
        include_file_path: true
        operators:
          - id: container-parser
            type: container
          - type: add
            field: attributes.log.template
            value: apache
          - id: apache-logs
            type: regex_parser
            regex: ^(?P<source_ip>\d+\.\d+.\d+\.\d+)\s+-\s+-\s+\[(?P<timestamp_log>\d+/\w+/\d+:\d+:\d+:\d+\s+\+\d+)\]\s"(?P<http_method>\w+)\s+(?P<http_path>.*)\s+(?P<http_version>.*)"\s+(?P<http_code>\d+)\s+(?P<http_size>\d+)$
The docs mention
Hence we need to explicitly check for the backtick at opentelemetry-collector-contrib/receiver/receivercreator/config_expansion.go Lines 41 to 46 in e818a4c
I'll send a PR to fix this.
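The idea of checking explicitly for the backtick can be sketched as follows. This is a simplified Python illustration, not the actual code in `config_expansion.go`. The function name `expand` and the plain dictionary lookup are assumptions made for the example, and the real receiver creator evaluates expressions rather than looking up keys. A backslash is treated as an escape only when the next character is a backtick; every other backslash, such as those in a regex, passes through untouched.

```python
def expand(template: str, env: dict) -> str:
    """Expand `name` placeholders from env (hypothetical helper).

    A backslash is consumed only when it escapes a backtick; all other
    backslashes are copied verbatim so regex escapes like \\d survive.
    """
    out = []
    i = 0
    n = len(template)
    while i < n:
        ch = template[i]
        if ch == "\\" and i + 1 < n and template[i + 1] == "`":
            # Escaped backtick: emit a literal backtick, drop the backslash.
            out.append("`")
            i += 2
        elif ch == "`":
            # Placeholder: everything up to the closing backtick is the key.
            end = template.find("`", i + 1)
            if end == -1:
                raise ValueError("unterminated backtick expression")
            out.append(str(env[template[i + 1:end]]))
            i = end + 1
        else:
            # Any other character, including a lone backslash, is kept as-is.
            out.append(ch)
            i += 1
    return "".join(out)


path = expand(
    "/var/log/pods/`pod.namespace`_`pod.name`/*.log",
    {"pod.namespace": "default", "pod.name": "apache"},
)
regex = expand(r"^\d+\.\d+\s\`x", {})
print(path)
print(regex)
```

Under this rule a regex value keeps its `\d` and `\s` escapes after expansion, while an escaped backtick still yields a literal backtick.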
Component(s)
No response
What happened?
Description
We are testing the k8s_observer as a replacement for the filelog receiver. We have found that a configuration that works with the filelog receiver behaves differently with the k8s_observer: the regexp used does not match.
Steps to Reproduce
I have prepared two configurations that show both cases. These configurations deploy an Apache pod and the OpenTelemetry Collector, configured to print the log document in its logs with the debug exporter. You can see that parsing the same log succeeds with the filelog receiver and does not match with the k8s_observer receiver.
Filelog receiver configuration in the OpenTelemetry Collector
k8s_observer Configuration in the pod annotations
Expected Result
The same parse results in both cases; all the attributes are parsed from the log and added to the document.
Actual Result
The regexp does not match in the k8s_observer use case.
Collector version
0.123.0
Environment information
Environment
OS: k8s container docker.io/otel/opentelemetry-collector-contrib:latest
OpenTelemetry Collector configuration
Log output
Additional context
No response