Skip to content

Unhandled errors in ANR worker causes app to pause indefinitely #11147

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
3 tasks done
delaneyb opened this issue Mar 16, 2024 · 6 comments · Fixed by #11161
Closed
3 tasks done

Unhandled errors in ANR worker causes app to pause indefinitely #11147

delaneyb opened this issue Mar 16, 2024 · 6 comments · Fixed by #11161
Assignees
Labels
Package: browser Issues related to the Sentry Browser SDK

Comments

@delaneyb
Copy link

delaneyb commented Mar 16, 2024

Is there an existing issue for this?

How do you use Sentry?

Sentry Saas (sentry.io)

Which SDK are you using?

@sentry/node
@sentry/integrations

SDK Version

7.107.0 (48b8169)

Framework Version

No response

Link to Sentry event

No response

SDK Setup

const Sentry = require('@sentry/node') as typeof import('@sentry/node')
const Integrations = require('@sentry/integrations') as typeof import('@sentry/integrations')
// SENTRY_RELEASE is a special object that must be defined on `global` and will be picked up by
// the sentry SDK to report the release associated with the event. We use compile-time variable
// substitution in the bundler to replace the RELEASE_CODE constant with its real value. This
// can also be "injected" by the esbuild-sentry plugin but this adds function calls to every
// single bundled module.
global.SENTRY_RELEASE = { id: RELEASE_CODE }
Sentry.init({
	dsn: "https://[email protected]/...",
	attachStacktrace: true,
	includeLocalVariables: true,
	sendDefaultPii: true,
	maxBreadcrumbs: 40,
	normalizeDepth: 6,
	integrations: [
		new Integrations.CaptureConsole({
			levels: ['warn', 'error']
		}),
		new Sentry.Integrations.Anr({
			captureStackTrace: true,
			anrThreshold: 50,
			pollInterval: 50,
		})
	]
})

Sentry.setUser({
	id: hostname
})

Sentry.setContext('device', {
	hostname: hostname
})

Steps to Reproduce

  1. Make sure there is a const ctx defined in the main node runtime
  2. Enable ANR worker
  3. Cause ANR events (If ctx not defined in your own code, then causing more than 1 anr event I believe also triggers the issue because then ctx has already been defined)
  4. Notice that the program just stays hanging:
    image

This is due to multiple issues with the worker.ts script:

session.post(
'Runtime.evaluate',
{
// Grab the trace context from the current scope
expression:
'const ctx = __SENTRY__.acs?.getCurrentScope().getPropagationContext() || {}; ctx.traceId + "-" + ctx.spanId + "-" + ctx.parentSpanId',
// Don't re-trigger the debugger if this causes an error
silent: true,
},
(_, param) => {
const traceId = param && param.result ? (param.result.value as string) : '--';
const [trace_id, span_id, parent_span_id] = traceId.split('-') as (string | undefined)[];
session.post('Debugger.resume');
session.post('Debugger.disable');
const context = trace_id?.length && span_id?.length ? { trace_id, span_id, parent_span_id } : undefined;
sendAnrEvent(stackFrames, context).then(null, () => {
log('Sending ANR event failed.');
});
},
);

Errors in evaluation of the const ctx = __SENTRY__... expression designed to capture the data for the trace ID are not properly handled.

  • For example if ctx is already defined, the expression throws Identifier 'ctx' has already been declared. Here is the value of param coming into the callback when that occurs:
const thing = {
	result: {
		type: 'object',
		subtype: 'error',
		className: 'SyntaxError',
		description: "SyntaxError: Identifier 'ctx' has already been declared\n" +
			'    at eval (eval at Codegen (/.../index.js:33122:29), <anonymous>:1:1)\n' +
			'    at Function (<anonymous>)\n' +
			'    at Codegen (/.../index.js:33122:29)\n' +
			'    at Type.setup (/.../index.js:34876:51)\n' +
			'    at Type.fromObject (/.../index.js:34912:19)\n' +
			'    at Type.Write$fromObject [as fromObject] (eval at Codegen (/.../index.js:33122:29), <anonymous>:35:34)\n' +
			'    at Type.fromObject (/.../index.js:34912:27)\n' +
			'    at Type.WriteRequest$fromObject [as fromObject] (eval at Codegen (/.../index.js:33122:29), <anonymous>:20:24)\n' +
			'    at Object.serialize [as requestSerialize] (/.../index.js:39147:29)\n' +
			'    at BaseStreamingInterceptingCall.sendMessageWithContext (/.../index.js:30755:46)',02: 21: 46.156 AM[0ms]
		objectId: '2536922713100491570.1.278'
	},
	exceptionDetails: {
		exceptionId: 2,
		text: 'Uncaught',
		lineNumber: 0,
		columnNumber: 0,
		scriptId: '594',
		exception: {02: 21: 46.156 AM[0ms]
			type: 'object',
			subtype: 'error',
			className: 'SyntaxError',02: 21: 46.157 AM[1ms]
			description: "SyntaxError: Identifier 'ctx' has already been declared\n" +
				'    at eval (eval at Codegen (/.../index.js:33122:29), <anonymous>:1:1)\n' +02: 21: 46.157 AM [0ms]
			'    at Function (<anonymous>)\n' +
				'    at Codegen (/.../index.js:33122:29)\n' +
				'    at Type.setup (/.../index.js:34876:51)\n' +
				'    at Type.fromObject (/.../index.js:34912:19)\n' +
				'    at Type.Write$fromObject [as fromObject] (eval at Codegen (/.../index.js:33122:29), <02:21:46.157 AM [   0ms] anonymous>:35:34)\n' +
				'    at Type.fromObject (/.../index.js:34912:27)\n' +
				'    at Type.WriteRequest$fromObject [as fromObject] (eval at Codegen (/.../index.js:33122:29), <anonymous>:20:24)\n' +02: 21: 46.158 AM [1ms]
			'    at Object.serialize [as requestSerialize] (/.../index.js:39147:29)\n' +
				'    at BaseStreamingInterceptingCall.sendMessageWithContext (/.../index.js:30755:46)',
			objectId: '2536922713100491570.1.279'
		}
	}
}

When an error like this occurs in evaluating the expression, this leads to TypeError: Cannot read properties of undefined (reading 'split') since traceId is not actually a string but rather undefined

This error gets silently dropped, and the program remains paused and stuck, since the following Debugger.resume never gets called

Expected Result

The ANR event should have been created and reported, and my process should have been resumed and continued running as normal.

The Debugger.resume should have been sent back to the runtime FIRST to make sure that it always resumes, and:

  • any error that occured in evaluating the const ctx = __SENTRY__.hub.getScope... expression and
  • any error that occurs in the callback of session.post('Runtime.evaluate'

Should be at least surfaced with a console.warn or something.

Actual Result

image

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 2 Mar 16, 2024
@github-actions github-actions bot added the Package: browser Issues related to the Sentry Browser SDK label Mar 16, 2024
@delaneyb delaneyb changed the title ANR Integration worker silently crashes, causes entire app freeze Unhandled errors in ANR worker causes app to pause indefinitely Mar 16, 2024
@delaneyb
Copy link
Author

With captureStackTrace disabled, I am also observing that in sendAnrEvent the worker thread is exited and is never restarted, so we only ever get one Anr event from a single instance of the app?

@delaneyb
Copy link
Author

I also only get user details (id etc.) on the AppNotResponding event if I modify the event inside the worker to add user

This applies with or without captureStackTrace enabled

Example of a regular event where user ID, app memory, start time etc. all captured normally: 3eb53f70c4974c6bb21f41e46c5c1f47
Example of an ANR event where nothing else was captured: 1ba31386009441fa8f5f16b2b975e4af

Both of these were triggered with the same setup - just a setTimeout which either threw an Error or sat in a blocking loop for 2s to trigger the ANR.

@delaneyb
Copy link
Author

Another one: sendAnrEvent doesn't seem to pass through tunnel to createEventEnvelope, which now explains we are not getting ANR events from devices out on site, even after patching this other stuff - without tunnel being honoured our ANR event requests would be getting dropped by firewall.

@lforst
Copy link
Contributor

lforst commented Mar 18, 2024

@delaneyb thanks for the detailed report! We will take a look and hopefully address al your feedback.

@timfish Would you mind taking a look when you find the time? I think the ctx collision should be pretty straightforward by just renaming to something a little bit less prone to colliding. The other feedback is a bit trickier to implement I think and might even be additional features on top of our current ANR implementation.

@timfish
Copy link
Collaborator

timfish commented Mar 18, 2024

I've split this into multiple issues:

@AbhiPrasad
Copy link
Member

cadesalaberry pushed a commit to cadesalaberry/sentry-javascript that referenced this issue Apr 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Package: browser Issues related to the Sentry Browser SDK
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants