[exporter][batcher] MergedContext implemented with SpanLink #12318

sfc-gh-sili · 2025-02-07T06:47:23Z

Description

This PR implements a component called mergedContext that is linked to every incoming context via a SpanLink.

For example: let's say req1, req2, req3 are batched in the same request, then

batch = { requests: [req1, req2, req3], mergedContext: ctx1}
ctx1.span is linked to ctx2.span and ctx3.span via a SpanLink

Link to tracking issue

#12212
#8122

exporter/internal/queue/merged_context.go

bogdandrutu · 2025-02-07T08:14:23Z

exporter/internal/queue/merged_context.go

+	"go.uber.org/multierr"
+)
+
+type mergedContext struct {


The scope is:

The deadline is the max of all deadlines.

Keep a list of all SpanContexts.

We just need these two things, not the whole contexts.

codecov · 2025-02-12T04:26:16Z

Codecov Report

Attention: Patch coverage is 88.46154% with 3 lines in your changes missing coverage. Please review.

Project coverage is 91.54%. Comparing base (2dc95de) to head (1765a6d).
Report is 14 commits behind head on main.

Files with missing lines	Patch %	Lines
...r/exporterhelper/internal/batcher/batch_context.go	76.92%	2 Missing and 1 partial ⚠️

❌ Your patch check has failed because the patch coverage (88.46%) is below the target coverage (95.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@           Coverage Diff           @@
##             main   #12318   +/-   ##
=======================================
  Coverage   91.54%   91.54%           
=======================================
  Files         467      468    +1     
  Lines       25623    25677   +54     
=======================================
+ Hits        23456    23507   +51     
- Misses       1768     1770    +2     
- Partials      399      400    +1


jade-guiton-dd

I don't think the logic in this PR will work as intended.

The pipeline in the exporterhelper is queue → batcher → obsReportSender, so exporterhelper only creates its own span AFTER queuing and batching. This means the spans you are creating links between are the spans of the parent component (for example the span emitted by a batchprocessor), which will be hard to interpret for users.
Let's say that four dequeue operations D1, D2, D3, and D4 are performed, with associated parent spans S1, S2, S3, and S4. Let's say that the data from D1, D2, and part of D3 are put into a batch B1, and the rest of D3 + D4 are put into a second batch B2. Your code will attempt to add links from S1 to S2 and S3, then from S3 to S4. This will also be difficult to interpret.
Most importantly, by the time S2 and S3 are added to B1, span S1 will likely already have ended, and will likely already have been exported, without any span links. I tried your code by enabling the batcher in the OTLP exporter, and I suspect this is the reason why I don't see any span links being exported in my tests, even though AddLink is called with appropriate parameters.

I think the proper solution to adding span links across the queue and batcher would involve:

Creating a span for the enqueue operation in the queueSender, to avoid relying on or adding links to spans from parent components
Adding our span links to the span created by the obsReportSender at span creation time, to ensure they are all exported. This would involve not adding the links in the batcher, but only recording them and passing them to the obsReportSender, presumably through a Context key.

exporter/exporterhelper/internal/batcher/default_batcher.go

sfc-gh-sili · 2025-02-12T23:30:15Z

@jade-guiton-dd Thanks for the feedback! I updated the implementation to link to batch spans from obsReportSender. I agree it's cleaner that way.

Regarding spans created from the queue: you proposed creating a span for the enqueue operation, but enqueue and dequeue use different contexts when the queue is a persistent storage queue. Did you mean we want to create a new span for every dequeue operation?

jade-guiton-dd · 2025-02-13T10:13:05Z

No, I did mean "enqueue". The ultimate goal of the span links is to link the "export" trace with the "input" trace that pushed the data into the exporter. Since only the enqueue operation is a synchronous part of the input trace, I think we will need to create a span for it. To make the link work across the persistent queue, we will need to persist the SpanContext across it, but that should hopefully be doable.

However, for now, it's probably simpler to add a span for the dequeue operation and link to that across the batching operations. We can leave linking the enqueue and dequeue operations across the queue as future work.

exporter/exporterhelper/internal/batcher/batch_context.go

sfc-gh-sili · 2025-02-14T08:13:00Z

@jade-guiton-dd Thanks! Just updated this PR with a new iteration that propagates span context with a thinner layer. Now trace links are stored in a dummy background context for obs_report_sender to read from.

Regarding timeout: it's not hard to implement. One way is to keep track of deadline time through the batching process and create context with deadline when the batch flushes. I removed it from this PR because I am not sure how useful the timeout would be. It feels awkward to have an additional timeout when we have timeout in batching and timeout_sender already.

Regarding start of span: This PR now starts a new span whenever an item is read from the queue, but I think the proper way is to:

move it to enqueue, like you mentioned,

and delegate the operation to obs_report_sender (or something similar).

obs_report_sender -> queue_sender -> [ queue ] -> batch_sender -> obs_report_sender -> retry_sender -> timeout_sender

Hopefully this PR is enough for the purpose of linking span context in the batcher. Let me know what you think.

jade-guiton-dd

Thank you for all your work! Just one more fix needed, and a nitpick. (Although I'm not an approver, so a second review will be necessary anyway)

exporter/exporterhelper/internal/queue_sender.go

jade-guiton-dd · 2025-02-14T12:06:49Z

exporter/exporterhelper/internal/batcher/batch_context.go

+	if links, ok := ctx.Value(batchSpanLinksKey).([]trace.Link); ok {
+		return links
+	}
+	return []trace.Link{trace.LinkFromContext(ctx)}


When the batcher is disabled (or when it is enabled but only one request ends up in the batch), contextWithMergedLinks is never called, and we pass the parent context through directly to the obsreportsender. I think this is fine, but because the latter calls LinksFromContext, the parent span ends up as both the parent AND as a link.

It's not a big deal, but I think it would be better to create links only when we cut the trace, which is to say, when calling contextWithMergedLinks.

jmacd · 2025-02-21

Yeah, I wondered about creating a struct like:

type linkContext struct {
	Link trace.Link
	More context.Context
}

at each contextWithMergedLinks:

return context.WithValue(
	context.Background(),
	batchSpanLinksKey,
	linkContext{
		Link: trace.LinkFromContext(ctx2),
		More: ctx1,
	},
)

I believe this will reduce the overall allocation count. The LinksFromContext() method would repeatedly get the batchSpanLinksKey value and append a link. To improve on this, the caller of LinksFromContext will (or can) know how many requests were put into the batch (I think?), so it can allocate a slice of the correct capacity and then ideally call LinksFromContext(into []trace.Link) error to populate the result, expecting to find len(into) elements.

jade-guiton-dd · 2025-02-24

My opinion is that this would be a premature optimization. I think that creating a linked list of new Contexts, which is then resolved into a slice of SpanLinks, would require just as many (if not more) allocations than just appending the SpanLinks to a single slice, even if you allocate the slice with the right length.

My suggestion was very different:

in LinksFromContext, retrieve the slice associated with the batchSpanLinksKey, or return an empty slice if the key is absent (no merges were performed, so no links are necessary).

in contextWithMergedLinks, do something like this:

links := append(LinksFromContext(ctx1), LinksFromContext(ctx2)...)
for _, ctx := range []context.Context{ctx1, ctx2} {
	spanCtx := trace.SpanContextFromContext(ctx)
	if spanCtx.IsValid() { // ctx has a parent span
		links = append(links, trace.Link{SpanContext: spanCtx}) // turn it into a link
	}
}
return context.WithValue(context.Background(), batchSpanLinksKey, links)

exporter/exporterhelper/internal/queue_sender.go

linux-foundation-easycla · 2025-03-06T01:30:38Z

❌ - login: @sfc-gh-sili / name: Sindy Li. The commit (9ff4620) is not authorized under a signed CLA. Please click here to be authorized. For further assistance with EasyCLA, please submit a support request ticket.

jade-guiton-dd · 2025-03-19T12:31:47Z

@sfc-gh-sili Do you think you'll have the time to finish this PR? I believe the remaining steps would be signing the CLA, fixing the failing tests, and addressing the remaining comment threads.

jade-guiton-dd · 2025-03-31T13:17:02Z

I continued the work in this PR: #12768

@mx-psi

…requests (#12768)

#### Description

Continuation of #12318:
- Small change to avoid adding both a parent span relationship *and* a span link in cases where no merge is made
- Fix failing tests by removing them: Those tests relate to cancelling a batch of export requests if the context for one of them is cancelled. I'm not sure how useful this logic is, or if it makes sense to cancel unrelated requests that happened to be batched with the one that was cancelled. If there are objections to this, I can try to reimplement this logic.

#### Link to tracking issue

Updates #12212 (remaining: persist parent span across persistent queue)

(edit by @mx-psi):
- Fixes #11140
- Fixes #11141

---------

Co-authored-by: Sindy Li <[email protected]>

github-actions · 2025-04-15T03:26:18Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

bogdandrutu reviewed Feb 7, 2025

sfc-gh-sili force-pushed the sili-fix-new-batching-context branch 4 times, most recently from b7080a2 to 3f6082e Compare February 12, 2025 00:10

sfc-gh-sili changed the title from "An attempt to implement merged context" to "[exporter][batcher] MergedContext implemented with SpanLink" Feb 12, 2025

sfc-gh-sili marked this pull request as ready for review February 12, 2025 00:57

sfc-gh-sili requested review from dmitryax and a team as code owners February 12, 2025 00:57

sfc-gh-sili marked this pull request as draft February 12, 2025 00:58

sfc-gh-sili force-pushed the sili-fix-new-batching-context branch 2 times, most recently from c2082a8 to 6a425b5 Compare February 12, 2025 01:41

sfc-gh-sili force-pushed the sili-fix-new-batching-context branch from 87f5b85 to ce96200 Compare February 12, 2025 04:41

sfc-gh-sili marked this pull request as ready for review February 12, 2025 04:41

sfc-gh-sili requested a review from bogdandrutu February 12, 2025 04:42

jade-guiton-dd requested changes Feb 12, 2025

exporter/exporterhelper/internal/batcher/default_batcher.go (outdated, resolved)

jade-guiton-dd requested changes Feb 13, 2025

exporter/exporterhelper/internal/batcher/batch_context.go (outdated, resolved)

exporter/exporterhelper/internal/batcher/batch_context.go (outdated, resolved)

sfc-gh-sili force-pushed the sili-fix-new-batching-context branch 2 times, most recently from 7f04365 to 5998f4e Compare February 14, 2025 06:59

sfc-gh-sili requested a review from jade-guiton-dd February 14, 2025 07:02

jade-guiton-dd requested changes Feb 14, 2025

sfc-gh-sili force-pushed the sili-fix-new-batching-context branch from df83318 to cfdfd23 Compare February 15, 2025 07:43

jade-guiton-dd reviewed Feb 17, 2025

exporter/exporterhelper/internal/queue_sender.go (outdated, resolved)

sfc-gh-sili force-pushed the sili-fix-new-batching-context branch from 1765a6d to 829252f Compare March 6, 2025 01:30

sfc-gh-sili force-pushed the sili-fix-new-batching-context branch from 829252f to bf64b9e Compare March 6, 2025 01:33

Implemented merged context with link

9ff4620

sfc-gh-sili force-pushed the sili-fix-new-batching-context branch from bf64b9e to 9ff4620 Compare March 6, 2025 01:34

jade-guiton-dd mentioned this pull request Mar 31, 2025

[exporterhelper] Add span links across batcher when merging multiple requests #12768

Merged

github-actions bot added the Stale label Apr 15, 2025

jade-guiton-dd closed this Apr 15, 2025
