Skip to content

[tailsamplingprocessor] Fix the decision timer metric to measure > 50ms #37722

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

Logiraptor
Copy link
Contributor

@Logiraptor Logiraptor commented Feb 5, 2025

Description

Previously the decision timer metric was in microseconds. Because the max histogram bucket was 50000, it capped the maximum measurable latency at 50ms. This commit changes the metric to be in milliseconds, which allows for measuring latencies of up to 50 seconds.

Since the default decision tick is 1s, it's necessary to observe when the latency is approaching 1s.

I'm not sure how big of a breaking change it is to update the unit, but given the previous value was not super useful, I thought it was better to change it.

Here's an example of the current state:
Screenshot 2025-02-05 at 3 23 21 PM

Notice how the average latency is running much higher than the p99, indicating that we are missing important data on how slow the decision tick actually is.

…er latencies beyond 50ms.

Previously the decision timer metric was in microseconds. Because the max histogram bucket was 50000,
it capped the maximum measurable latency at 50ms. This commit changes the metric to be in milliseconds, which
allows for measuring latencies of up to 50 seconds.

Since the default decision tick is 1s, it's necessary to observe when the latency is approaching 1s.

I'm not sure how big of a breaking change it is to update the unit, but given the previous value was not super useful,
I thought it was better to change it.
@Logiraptor Logiraptor requested review from jpkrohling and a team as code owners February 5, 2025 20:22
@github-actions github-actions bot added the processor/tailsampling Tail sampling processor label Feb 5, 2025
@github-actions github-actions bot requested a review from portertech February 5, 2025 20:22
@crobert-1
Copy link
Member

I've added waiting-for-code-owners to signal that my review is pending their approval of this logical change. I don't know enough about this component to be able to confidently review at this time.

# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: bug_fix
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the fence, do we consider this breaking?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After thinking more, it does feel like a breaking change. Updated.

Copy link
Contributor

@portertech portertech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only nit is related to the change type, bug_fix vs breaking.

@jpkrohling
Copy link
Member

Once the merge conflict is resolved, this is ready to go.

@Logiraptor
Copy link
Contributor Author

@jpkrohling Thanks, I think it's good to go!

@jpkrohling jpkrohling merged commit 154d313 into open-telemetry:main Feb 14, 2025
162 checks passed
@github-actions github-actions bot added this to the next release milestone Feb 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
processor/tailsampling Tail sampling processor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants