Add Logs support to mdatagen #12571

Open
cyrusjk opened this issue Mar 6, 2025 · 6 comments
@cyrusjk

cyrusjk commented Mar 6, 2025

Component(s)

mdatagen

Describe the issue you're reporting

mdatagen supports metrics and telemetry, but not logs, even though the collector supports log collection. It would be nice to have generated wrappers that support all of the OpenTelemetry types.

@bogdandrutu
Member

How would you use it? And what for?

@cyrusjk
Author

cyrusjk commented Mar 7, 2025

In the context of RDBMS receivers, I would like to be able to update the receiver to support the exporting of SQL query and query plan text data to the Logs pipeline along with some attributes such as timestamp, db/schema, user, etc. While Logs scraping can be added to a Receiver, the support is inconsistent and having mdatagen support would help set a standard for how this kind of integration can be done.

Contributor

Pinging code owners for cmd/mdatagen: @dmitryax. See Adding Labels via Comments if you do not have permissions to add labels yourself. For example, comment '/label priority:p2 -needs-triaged' to set the priority and remove the needs-triaged label.

sincejune self-assigned this Mar 10, 2025
@sincejune
Contributor

While working on the SQL Server receiver, I encountered similar use cases. I believe there are several benefits, particularly for scraper-based receivers, from supporting logs in mdatagen:

Automatic Generation of Scope Name and Version

Currently, we manually attach the scope name and version when constructing plog. Although this approach works, many scope versions may become outdated over time.

Template-Based Documentation

Documentation can become inconsistent across different scraper-based receivers. Using templates can help maintain consistency and uniformity.

Avoiding Field Type Conflicts

There might be multiple scrapers calling databases and sending different log records for various use cases (e.g., sample query and top query). Field types may unexpectedly differ among these log records, potentially causing issues for subsequent consumers. Having a defined schema will help avoid such conflicts.
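To make the field-type concern concrete, here is a minimal sketch of the kind of check a shared schema would enable. The `LogSchema` and `AttrType` types and the `FindConflicts` helper are hypothetical names invented for illustration; they are not part of mdatagen.

```go
package main

import (
	"fmt"
	"sort"
)

// AttrType is a hypothetical enumeration of attribute value types.
type AttrType string

const (
	TypeString AttrType = "string"
	TypeInt    AttrType = "int"
)

// LogSchema is a hypothetical description of the fields that one kind of
// log record carries.
type LogSchema struct {
	Name   string
	Fields map[string]AttrType
}

// FindConflicts returns the sorted names of fields that are declared with
// more than one type across the given schemas.
func FindConflicts(schemas []LogSchema) []string {
	seen := map[string]AttrType{}
	conflicting := map[string]bool{}
	for _, s := range schemas {
		for field, typ := range s.Fields {
			if prev, ok := seen[field]; ok && prev != typ {
				conflicting[field] = true
				continue
			}
			seen[field] = typ
		}
	}
	out := make([]string, 0, len(conflicting))
	for f := range conflicting {
		out = append(out, f)
	}
	sort.Strings(out)
	return out
}

func main() {
	schemas := []LogSchema{
		{Name: "sample_query", Fields: map[string]AttrType{"db.name": TypeString, "duration": TypeInt}},
		{Name: "top_query", Fields: map[string]AttrType{"db.name": TypeString, "duration": TypeString}},
	}
	// "duration" is an int in one record kind and a string in the other.
	fmt.Println(FindConflicts(schemas))
}
```

With a schema declared once in metadata, a check like this could run at generation time instead of surfacing later in a downstream consumer.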

Implementation Plan

I plan to add the following specification to the schema for supporting logs in the initial version:

logs:
  sample_log:
    enabled: true
    description: <description>
    extended_documentation: <extended_documentation>
    attributes: [attribute1, attribute2] # reuse the attributes section

We can consider adding configurations as well in the next iteration:

logs:
  sample_log:
    enabled: true
    description: <description>
    extended_documentation: <extended_documentation>
    attributes: [attribute1, attribute2] # reuse the attributes section
    configs:
      lookback_window:
        type: int
        unit: s
      ...
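As a rough sketch of what "reuse the attributes section" could imply, the Go shapes below mirror the proposed `logs:` keys and check that every attribute a log entry lists is defined in the shared attributes section. The `Metadata` and `LogEntry` types and the `Validate` method are assumptions made for illustration, not mdatagen's actual types, and YAML decoding is left out to keep the example stdlib-only.

```go
package main

import (
	"fmt"
	"sort"
)

// LogEntry mirrors one entry under the proposed `logs:` key (hypothetical shape).
type LogEntry struct {
	Enabled               bool
	Description           string
	ExtendedDocumentation string
	Attributes            []string
}

// Metadata holds only the two sections relevant to this sketch.
type Metadata struct {
	Attributes map[string]string // attribute name -> description
	Logs       map[string]LogEntry
}

// Validate reports every attribute that a log entry references but the
// shared attributes section does not define. Results are sorted so the
// output is deterministic.
func (m Metadata) Validate() []string {
	var problems []string
	for name, entry := range m.Logs {
		for _, a := range entry.Attributes {
			if _, ok := m.Attributes[a]; !ok {
				problems = append(problems, fmt.Sprintf("logs.%s: undefined attribute %q", name, a))
			}
		}
	}
	sort.Strings(problems)
	return problems
}

func main() {
	md := Metadata{
		Attributes: map[string]string{"attribute1": "first shared attribute"},
		Logs: map[string]LogEntry{
			"sample_log": {Enabled: true, Attributes: []string{"attribute1", "attribute2"}},
		},
	}
	for _, p := range md.Validate() {
		fmt.Println(p)
	}
}
```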

@bogdandrutu I'm willing to implement this if the team agrees with this proposal. cc @open-telemetry/collector-approvers

@dmitryax
Member

I'm not against this approach. However, we would be introducing a framework for receivers that emit structured log records, while most receivers wouldn't be able to follow it. So maybe we can start with generic functionality such as Automatic Generation of Scope Name and Version and Template-Based Documentation, and apply it to all log receivers. Then we can proceed with other features and clearly mark receivers as emitting structured logs when specific features are applied to them. I'm suggesting this approach because it would bring consistent usage of mdatagen across log receivers, instead of having it applied to only a couple of receivers while the others are not even mentioned in metadata.yaml.

@cyrusjk
Author

cyrusjk commented Mar 15, 2025 via email

github-merge-queue bot pushed a commit that referenced this issue Apr 2, 2025
### Description
This PR introduces the foundational changes necessary for supporting
`logs` data in the `mdatagen` tool.

#### mdatagen files for logs
This update includes the generation of `generated_logs.go` and
`generated_logs_test.go`. These files are specifically for receiver and
scraper components that support `logs` data, enabling initial log
handling capabilities.

#### Introducing LogsBuilder
This PR introduced a `LogsBuilder` similar to the existing
`MetricsBuilder`. It provides a structured way to manage log data with
the following functions:
1. `Emit(...ResourceLogsOption)`
    Similar to the `Emit` function in `MetricsBuilder`.
2. `EmitForResource(...ResourceLogsOption)`
    Similar to the `EmitForResource` function in `MetricsBuilder`.
3. `AppendLogRecord(plog.LogRecord)`
    Appends a log record to an internal buffer. The buffered log records
    are used to construct a `ScopeLogs` when `Emit()` or
    `EmitForResource()` is called. Scope name and version are generated
    automatically.
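
The buffering pattern described above can be sketched with simplified stand-in types. `Record`, `ScopeLogs`, and `Builder` below are hypothetical and are not the generated `LogsBuilder` or the pdata types; clearing the buffer on `Emit` is an assumption modeled on how a builder of this kind is typically reused between scrapes.

```go
package main

import "fmt"

// Record is a simplified stand-in for plog.LogRecord.
type Record struct{ Body string }

// ScopeLogs is a simplified stand-in for a scope plus its log records.
type ScopeLogs struct {
	ScopeName    string
	ScopeVersion string
	Records      []Record
}

// Builder buffers records and stamps them with a fixed scope on Emit.
type Builder struct {
	scopeName    string
	scopeVersion string
	buffer       []Record
}

// NewBuilder fixes the scope name and version once, so callers never
// attach them by hand.
func NewBuilder(scopeName, scopeVersion string) *Builder {
	return &Builder{scopeName: scopeName, scopeVersion: scopeVersion}
}

// AppendLogRecord adds a record to the internal buffer.
func (b *Builder) AppendLogRecord(r Record) {
	b.buffer = append(b.buffer, r)
}

// Emit wraps the buffered records in a ScopeLogs and clears the buffer
// (assumed behavior) so the builder can be reused.
func (b *Builder) Emit() ScopeLogs {
	out := ScopeLogs{ScopeName: b.scopeName, ScopeVersion: b.scopeVersion, Records: b.buffer}
	b.buffer = nil
	return out
}

func main() {
	b := NewBuilder("example/receiver", "latest")
	b.AppendLogRecord(Record{Body: "the first log record"})
	b.AppendLogRecord(Record{Body: "the second log record"})
	first := b.Emit()
	second := b.Emit()
	fmt.Println(first.ScopeName, len(first.Records), len(second.Records))
}
```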

#### Next steps
* Add more test cases to LogsBuilder (e.g. reading test configs)
* Add LogsBuilderConfig to read the logs property in `metadata.yaml` to send
structured logs
* Update receivers in contrib to use LogsBuilder.

#### Example usage:
```
// Assumes this runs inside the generated package, with "time" and the
// collector's pdata/pcommon and pdata/plog packages imported.
lb := NewLogsBuilder(settings)

res := pcommon.NewResource()
res.Attributes().PutStr("region", "us-west-1")

// build the first log record
lr := plog.NewLogRecord()
lr.SetTimestamp(pcommon.NewTimestampFromTime(time.Now()))
lr.Attributes().PutStr("type", "log")
lr.Body().SetStr("the first log record")

// build the second log record
lr2 := plog.NewLogRecord()
lr2.SetTimestamp(pcommon.NewTimestampFromTime(time.Now()))
lr2.Attributes().PutStr("type", "event")
lr2.Body().SetStr("the second log record")

// buffer both records
lb.AppendLogRecord(lr)
lb.AppendLogRecord(lr2)

// emit the buffered records under the resource
logs := lb.Emit(WithLogsResource(res))
```
#### Example output:
```
resourceLogs:
  - resource:
      attributes:
        - key: region
          value:
            stringValue: us-west-1
    scopeLogs:
      - logRecords:
          - attributes:
              - key: type
                value:
                  stringValue: log
            body:
              stringValue: the first log record
            spanId: ""
            timeUnixNano: "1742291226022739000"
            traceId: ""
          - attributes:
              - key: type
                value:
                  stringValue: event
            body:
              stringValue: the second log record
            spanId: ""
            timeUnixNano: "1742291226022739000"
            traceId: ""
        scope:
          name: github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awscloudwatchreceiver
          version: latest
```

### Link to tracking issue
Part of #12571

### Testing
Added

### Documentation
Added

github-merge-queue bot pushed a commit that referenced this issue Apr 30, 2025
#### Description
This PR adds the type definitions needed to support structured events in
mdatagen.

Note: the `NewLogsBuilder` function signature is generated dynamically
according to the logs config:
```
func NewLogsBuilder({{ if .Events }}lbc LogsBuilderConfig, {{ end }}settings {{ .Status.Class }}.Settings) *LogsBuilder {}
```
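
To see what that conditional produces, the template line above can be rendered with Go's standard `text/template` package. The `metadata` and `status` structs here are minimal hypothetical stand-ins that expose only the two fields the template reads; they are not mdatagen's real types.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// signatureTmpl is the template line quoted in the PR description.
const signatureTmpl = `func NewLogsBuilder({{ if .Events }}lbc LogsBuilderConfig, {{ end }}settings {{ .Status.Class }}.Settings) *LogsBuilder {}`

// status and metadata are hypothetical stand-ins for the template data.
type status struct{ Class string }

type metadata struct {
	Events map[string]struct{}
	Status status
}

// renderSignature executes the template against the given metadata.
func renderSignature(md metadata) (string, error) {
	t, err := template.New("sig").Parse(signatureTmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, md); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// An empty or nil map is false in a template `if`, so the config
	// parameter only appears when at least one event is defined.
	for _, md := range []metadata{
		{Status: status{Class: "receiver"}},
		{Events: map[string]struct{}{"sample_log": {}}, Status: status{Class: "receiver"}},
	} {
		sig, err := renderSignature(md)
		if err != nil {
			panic(err)
		}
		fmt.Println(sig)
	}
}
```

The consequence for callers is that the constructor's arity depends on whether the component's metadata defines any events.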
#### Link to tracking issue
Part of #12571

#### Testing
Added

#### Documentation
Added

github-merge-queue bot pushed a commit that referenced this issue May 8, 2025
#### Description
Add functions for processing structured events

#### Link to tracking issue
Part of #12571

#### Testing
Added

#### Documentation
Added

bogdandrutu pushed a commit to bogdandrutu/opentelemetry-collector that referenced this issue May 8, 2025
…etry#12961)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
Add functions for processing structured events

<!-- Issue number if applicable -->
Part of open-telemetry#12571

<!--Describe what testing was performed and which tests were added.-->
Added

<!--Describe the documentation added.-->
Added

<!--Please delete paragraphs that you did not use before submitting.-->