-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[exporter][batcher] MergedContext implemented with SpanLink #12318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[exporter][batcher] MergedContext implemented with SpanLink #12318
Conversation
"go.uber.org/multierr" | ||
) | ||
|
||
type mergedContext struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The scope is:
- Deadline is max of all.
- Keep a list of all SpanContext.
We just need these 2 things, not the whole contexts.
b7080a2
to
3f6082e
Compare
c2082a8
to
6a425b5
Compare
Codecov ReportAttention: Patch coverage is
❌ Your patch check has failed because the patch coverage (88.46%) is below the target coverage (95.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #12318 +/- ##
=======================================
Coverage 91.54% 91.54%
=======================================
Files 467 468 +1
Lines 25623 25677 +54
=======================================
+ Hits 23456 23507 +51
- Misses 1768 1770 +2
- Partials 399 400 +1 ☔ View full report in Codecov by Sentry. |
87f5b85
to
ce96200
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the logic in this PR will work as intended.
- The pipeline in the exporterhelper is queue → batcher → obsReportSender, so exporterhelper only creates its own span AFTER queuing and batching. This means the spans you are creating links between are the spans of the parent component (for example the span emitted by a batchprocessor), which will be hard to interpret for users.
- Let's say that four dequeue operations D1, D2, D3, and D4 are performed, with associated parent spans S1, S2, S3, and S4. Let's say that the data from D1, D2, and part of D3 are put into a batch B1, and the rest of D3 + D4 are put into a second batch B2. Your code will attempt to add links from S1 to S2 and S3, then from S3 to S4. This will be also be difficult to interpret.
- Most importantly, by the time S2 and S3 are added to B1, span S1 will likely already have ended, and will likely already have been exported, without any span links. I tried your code by enabling the batcher in the OTLP exporter, and I suspect this is the reason why I don't see any span links being exported in my tests, even though
AddLink
is called with appropriate parameters.
I think the proper solution to adding span links across the queue and batcher would involve:
- Creating a span for the enqueue operation in the queueSender, to avoid relying on or adding links to spans from parent components
- Adding our span links to the span created by the obsReportSender at span creation time, to ensure they are all exported. This would involve not adding the links in the batcher, but only recording them and passing them to the obsReportSender, presumably through a Context key.
@jade-guiton-dd Thanks for the feedback! I updated the implementation to link to batch spans from obsReportSender. I agree it's cleaner that way. Regarding spans created from the queue, you proposed to create a span for the enqueue operation, but the enqueue and dequeue uses different contexts if the queue is persistent storage queue. Did you mean we want to create a new span for every dequeue operation? |
No, I did mean "enqueue". The ultimate goal of the span links is to link the "export" trace with the "input" trace that pushed the data into the exporter. Since only the enqueue operation is a synchronous part of the input trace, I think we will need to create a span for it. To make the link work across the persistent queue, we will need to persist the However for now, it's probably simpler to add a span for the dequeue operation and link to that across the batching operations. We can leave linking the enqueue and dequeue operations across the queue as future work. |
7f04365
to
5998f4e
Compare
@jade-guiton-dd Thanks! Just updated this PR with a new iteration that propagates span context with a thinner layer. Now trace links are stored in a dummy background context for Regarding timeout: it's not hard to implement. One way is to keep track of deadline time through the batching process and create context with deadline when the batch flushes. I removed it from this PR because I am not sure how useful the timeout would be. It feels awkward to have an additional timeout when we have timeout in batching and Regarding start of span: This PR now starts a new span whenever a item is read from the queue, but I think the proper way is to
Hopefully this PR is enough for the purpose of linking span context in the batcher. Let me know what you think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all your work! Just one more fix needed, and a nitpick. (Although I'm not an approver, so a second review will be necessary anyway)
if links, ok := ctx.Value(batchSpanLinksKey).([]trace.Link); ok { | ||
return links | ||
} | ||
return []trace.Link{trace.LinkFromContext(ctx)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the batcher is disabled (or when it is enabled but only one request ends up in the batch), contextWithMergedLinks
is never called, and we pass the parent context through directly to the obsreportsender. I think this is fine, but because the latter calls LinksFromContext
, the parent span ends up as both the parent AND as a link.
It's not a big deal, but I think it would be better to create links only when we cut the trace, which is to say, when calling contextWithMergedLinks
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I wondered about created a struct like:
type linkContext struct {
Link trace.SpanLink
More context.Context
}
at each contextWithMergedLinks
return context.WithValue(
context.Background(),
batchSpanLinksKey,
linkContext{
Link: trace.SpanFromContext(ctx2),
More: ctx1,
)
I believe this will reduce overall allocation count. The LinksFromContext() method would repeatedly get the batchSpanLinksKey value and append a link. To improve on this, the caller of LinksFromContext will (or can) know how many requests were put into the batch (I think?), so it can allocate a slice of the correct capacity, then ideally the call LinksFromContext(into []trace.SpanLink) error
to populate the result, expecting to find len(into)
many elements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My opinion is that this would be a premature optimization. I think that creating a linked list of new Contexts, which is then resolved into a slice of SpanLinks, would require just as many (if not more) allocations than just append
ing the SpanLinks to a single slice, even if you allocate the slice with the right length.
My suggestion was very different:
- in
LinksFromContext
, retrieve the slice associated with thebatchSpanLinksKey
, or return an empty slice if the key is absent (no merges were performed, so no links are necessary). - in
contextWithMergedLinks
, do something like this:
links := append(LinksFromContext(ctx1), LinksFromContext(ctx2)...)
for _, ctx := range []Context{ctx1, ctx2} {
spanCtx := trace.SpanContextFromContext(ctx)
if spanCtx.IsValid() { // ctx has a parent span
links = append(links, Link{ SpanContext: spanCtx }) // turn it into a link
}
}
return context.WithValue(context.Background(), batchSpanLinksKey, links)
df83318
to
cfdfd23
Compare
1765a6d
to
829252f
Compare
|
829252f
to
bf64b9e
Compare
bf64b9e
to
9ff4620
Compare
@sfc-gh-sili Do you think you'll have the time to finish this PR? I believe the remaining steps would be signing the CLA, fixing the failing tests, and addressing the remaining comment threads. |
I continued the work in this PR: #12768 |
…requests (#12768) #### Description Continuation of #12318: - Small change to avoid adding both a parent span relationship *and* a span link in cases where no merge is made - Fix failing tests by removing them: Those tests relate to cancelling a batch of export requests if the context for one of them is cancelled. I'm not sure how useful this logic is, or if it makes sense to cancel unrelated requests that happened to be batched with the one that was cancelled. If there are objections to this, I can try to reimplement this logic. #### Link to tracking issue Updates #12212 (remaining: persist parent span across persistent queue) (edit by @mx-psi): - Fixes #11140 - Fixes #11141 --------- Co-authored-by: Sindy Li <[email protected]>
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Description
This PR implements a component called mergedContext that is linked to all incoming context via a spanLink.
For example: let's say req1, req2, req3 are batched in the same request, then
batch = { requests: [req1, req2, req3], mergedContext: ctx1}
ctx1.span
is linked toctx2.span
andctx3.span
via aSpanLink
Link to tracking issue
#12212
#8122