[tailsamplingprocessor] Fix the decision timer metric to measure > 50ms #37722
Conversation
I've added a changelog entry:
# Use this changelog template to create an entry for release notes.
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: bug_fix
On the fence, do we consider this breaking?
After thinking more, it does feel like a breaking change. Updated.
My only nit is related to the change type, bug_fix vs breaking.
Once the merge conflict is resolved, this is ready to go.
@jpkrohling Thanks, I think it's good to go!
Description
Previously the decision timer metric was in microseconds. Because the max histogram bucket was 50000, it capped the maximum measurable latency at 50ms. This commit changes the metric to be in milliseconds, which allows for measuring latencies of up to 50 seconds.
Since the default decision tick interval is 1s, it's important to be able to observe when the decision latency approaches 1s.
I'm not sure how big of a breaking change it is to update the unit, but given the previous value was not super useful, I thought it was better to change it.
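For context, here's a minimal standalone Go sketch of why the unit matters against a 50000 upper histogram bound; the `observe` helper and the constant name are hypothetical stand-ins for the processor's actual metric definition:

```go
package main

import (
	"fmt"
	"time"
)

// maxBucket mirrors the 50000 upper histogram bound mentioned above
// (hypothetical constant; the real bounds live in the processor's metrics code).
const maxBucket = 50000

// observe reports how a capped histogram sees a value: anything past the
// last finite bound is only known to be "greater than maxBucket".
func observe(unit string, value int64) {
	if value > maxBucket {
		fmt.Printf("%s: > %d (overflow bucket, exact value lost)\n", unit, maxBucket)
		return
	}
	fmt.Printf("%s: %d\n", unit, value)
}

func main() {
	// A decision tick that takes 800ms, i.e. close to the default 1s interval.
	elapsed := 800 * time.Millisecond

	// Old unit: microseconds. 800ms = 800000µs, far beyond the 50000 bound.
	observe("us", elapsed.Microseconds())

	// New unit: milliseconds. 800ms = 800, well within the 50000 bound,
	// so latencies up to 50s stay measurable.
	observe("ms", elapsed.Milliseconds())
}
```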
Here's an example of the current state:

Notice how the average latency runs much higher than the p99: the average is derived from the histogram's exact sum and count, while the p99 is interpolated from buckets and cannot exceed the highest finite bound, so we are missing important data on how slow the decision tick actually is.
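To make that avg-vs-p99 signal concrete, here's a small Go sketch with made-up sample latencies and the 50000µs bound from the description, showing why a bucket-derived p99 can sit below the true average once observations overflow the last bucket:

```go
package main

import "fmt"

func main() {
	// Hypothetical decision-tick latencies in microseconds, all around 800ms.
	samples := []float64{750000, 800000, 820000, 900000}

	// Illustrative highest finite bucket bound: 50000µs (50ms).
	maxBound := 50000.0

	// The average comes from the histogram's exact sum and count, so it is
	// unaffected by the bucket layout.
	var sum float64
	for _, s := range samples {
		sum += s
	}
	avg := sum / float64(len(samples))

	// A quantile interpolated from buckets can never exceed the highest finite
	// bound: with every sample in the overflow bucket, p99 is reported as ~50ms.
	p99 := maxBound

	fmt.Printf("avg ≈ %.0fµs (~%.0fms), reported p99 ≤ %.0fµs (50ms)\n", avg, avg/1000, p99)
}
```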