
river hides panic stack traces in database #418

Closed
elee1766 opened this issue Jul 3, 2024 · 6 comments · Fixed by #423

Comments

@elee1766
Contributor

elee1766 commented Jul 3, 2024

often in development, someone will write a worker that panics.

currently, the stack trace is recorded to the database, but it is not printed in the log:
https://github.com/riverqueue/river/blob/master/job_executor.go#L169-L183

this is very annoying in development, because now the developer needs to query the database to get the stack trace, and it's rather hard to actually get nicely formatted traces. it makes debugging and development much slower overall.

i saw there was #122. some sort of middleware like `func(next func(context.Context) error) func(context.Context) error` could maybe help with both of these problems

@bgentry
Contributor

bgentry commented Jul 3, 2024

Hi @elee1766, have you added a panic handler? In addition to reporting to a bug tracking service, you could customize it to print the trace in dev. https://riverqueue.com/docs/error-handling

@elee1766
Contributor Author

elee1766 commented Jul 3, 2024

the stack trace is polluted by the time it reaches the panic handler, and I couldn't find a way to get the original stack trace from the result object since only the error is passed. If I could get the original stack trace then it would be perfect.

otoh, we could implement and hook a custom panic handler into the job processing function, but it just seems silly to change the semantics of a worker's job when what I want to change is how the job executor executes jobs, regardless of the underlying work to be done.

@bgentry
Contributor

bgentry commented Jul 3, 2024

Looking at this in the executor @brandur, I'm seeing that we do actually capture the trace correctly with debug.Stack() within the recover, and that's what we record to the database. However, we don't actually expose it to the panic handler, only the raw panic value, which seems like an important oversight.

I think it's essential that this be available in some way. Since we already have the API for panic handling, our options are (1) break the API by adding the trace as another arg to ErrorHandler.HandlePanic(), or (2) stuff it in the context to avoid breaking the API. I lean pretty strongly toward (1) in this case because the trace is pretty important for panics IMO. Thoughts?

@brandur
Contributor

brandur commented Jul 4, 2024

Yeah, no way we do it via context shenanigans. Also +1 on the API breakage — I think we can get away with it given this is a feature most users aren't going to be dipping into yet (hopefully), and it's reasonably easy for them to fix.

I'll take this.

brandur added a commit that referenced this issue Jul 4, 2024
This one is in response to #418: although we persist a stack trace to a
job row's errors property, we don't reveal it in a panic handler, which
is quite inconvenient for purposes of logging or other telemetry (e.g.
sending to Sentry).

Here, `HandlePanic`'s signature changes to (`trace` is added):

    HandlePanic(ctx context.Context, job *rivertype.JobRow, panicVal any, trace string) *ErrorHandlerResult

A couple notes on choices:

* The naming of `trace` is reused from `AttemptError`.

* The value type is `string`. This is a little non-obvious because Go
  exposes the trace as a `[]byte` (I've never quite understood why), but
  I went with `string` because it's more convenient to use and, more
  importantly, it's the same type as on `AttemptError`.

This is a breaking change, but being able to get a stack trace during a
panic seems important enough to be worth it, and with any luck few
enough people are using this feature that it won't break many of them.
The fix is quite easy regardless and will easily be caught by the
compiler.

Fixes #418.
@bgentry
Contributor

bgentry commented Jul 4, 2024

@elee1766 the HandlePanic method now passes the original stack trace as an arg as of #423 / v0.9.0. Can you let us know if this makes it trivial for you to print your trace in dev?

@elee1766
Contributor Author

elee1766 commented Jul 4, 2024

@bgentry yep, it's good now. ty!
