Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Applying certain processors to stacktraces is inefficient #485

Closed
nateberkopec opened this issue Mar 21, 2016 · 12 comments
Closed

Applying certain processors to stacktraces is inefficient #485

nateberkopec opened this issue Mar 21, 2016 · 12 comments

Comments

@nateberkopec
Copy link
Contributor

See #480

@zanker
Copy link
Contributor

zanker commented Mar 21, 2016

I think that ideally, what I'm doing in #480 isn't what we would do. I'd rather see some way of selectively indicating data is "blessed" and doesn't need to run through the processors. I didn't do it in that case, just because it looks like it requires a decent bit of reworking to how it processes events.

I don't think there's really a good way of optimizing the processor though. You could merge them all into one, but that kind of ruins the point of plugging them in.

@zanker
Copy link
Contributor

zanker commented Apr 13, 2016

Any thoughts on this? We're running on a fork so it's not the end of the world, but it would be nice to get a conclusion as I think the use of OkJson is a little dangerous given the performance/garbage.

@dcramer
Copy link
Member

dcramer commented Jun 10, 2016

FYI we've started doing this elsewhere -- its been good and bad.

For example, we shouldn't need to sanitize 'user', as its provided by the developer, but people shove objects in and make life hard.

With that said, we can definitely whitelist the keys that need sanitized. Looking at sentry-php for a most recent example:

https://github.com/getsentry/sentry-php/blob/master/lib/Raven/Client.php#L651

  • request
  • user
  • tags
  • extra

And then if a stacktrace has local vars (which it doesnt), we'd want to sanitize those.

@zanker
Copy link
Contributor

zanker commented Jun 10, 2016

If you want I can split prior PR and move the sanitization into a solo PR and we can go from there, that one only has modules and exception stacktrace, both of which are safe. It's mostly a question of whether you're ok with the approach I had previously.

@dcramer
Copy link
Member

dcramer commented Jun 10, 2016

@zanker i think we can just literally mimic the PHP client and say "only apply the serializer on these attributes". It's a bit awful to have to whitelist things, but we do it everywhere. PHP is probably our middle ground, but on the Python server we explicitly reach into various subattributes of things like the HTTP request.

@dcramer
Copy link
Member

dcramer commented Jun 10, 2016

Also #509 is a concern here, though quite frankly custom headers need to be scrubbed

@tmeinlschmidt
Copy link

would be great to add at least sanitizing passwords out of error messages..

ActionDispatch::ParamsParser::ParseError: 757: unexpected token at '{ username: 'someone', password: 'aaa' } '

it's easy to send wrong json to the API, had to resolve this by a rack middleware

@nateberkopec
Copy link
Contributor Author

@tmeinlschmidt Can you open that as a separate issue? I'm not sure if there's anything we can do about that particular case, but I'd like to take a look.

@nateberkopec
Copy link
Contributor Author

nateberkopec commented Jul 10, 2016

See #485.

To not sanitize stacktraces, you could, for example:

config.sanitize_whitelist = ["sentry.interfaces.Exception"]

@nateberkopec
Copy link
Contributor Author

I'm shelving this for now. In general, if performance is a concern, you should be using async with a background job processor. The background job will take care of the data processing and HTTP, which are 90% of the wall time in this library.

@nateberkopec
Copy link
Contributor Author

Okay, thought/workaround coming in 2.2.0 - we no longer attempt to sanitize any data structures which are frozen. If you freeze a Hash, for example, we basically stop there and don't attempt to sanitize any of it's children. We could strategically freeze objects to stop them from being touched by the SanitizeData processor.

@nateberkopec
Copy link
Contributor Author

Closing. You can now bless data by freezing it. Freezing stacktraces resulted in no improvement to our benchmark, so there's no improvement to be had there. Performance is still a focus for 3.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants