Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add better documentation for sanitizing data #1140

Closed
FriedSock opened this issue Dec 11, 2020 · 22 comments
Closed

Add better documentation for sanitizing data #1140

FriedSock opened this issue Dec 11, 2020 · 22 comments

Comments

@FriedSock
Copy link

Describe the bug

sanitize_fields has been removed, but there is no description of how to use the data scrubbing features https://docs.sentry.io/platforms/ruby/data-management/sensitive-data/#scrubbing-data and https://docs.sentry.io/platforms/ruby/configuration/filtering/

Sentry.init do |config|
  config.before_send = lambda do |event, hint|
    event[:key] = value
    event
  end
end

It would be good if this was a concrete example showing how to replicate the sanitize_fields functionality, it is unclear what value is here and how to access request parameters from the event.

@seand7565
Copy link

👍 on this. I'm attempting to migrate from sentry-raven and this is the most confusing part. I'm not sure how to use Rails.application.config.filter_parameters to filter out sensitive data with sentry_ruby. It used to be we could just do this:
config.sanitize_fields = Rails.application.config.filter_parameters.map(&:to_s)
but I'm not sure how that translates to sentry-ruby now. 🤔

@thib44
Copy link

thib44 commented Dec 23, 2020

I had the same issue, for me the best solution is to do something like :

config.before_send = lambda do |event, _hint|
  Rails.application.config.filter_parameters.each do |filter_parameter|
    event[filter_parameter] = nil
  end
  event
end

@st0012
Copy link
Collaborator

st0012 commented Dec 23, 2020

sorry for the late reply, here's an example:

  config.before_send = lambda do |event, _hint|
    # note: if you have config.async configured, the event here will be a Hash instead of an Event object
    request_data = event.request.data
    mask = "*****".freeze

    Rails.application.config.filter_parameters.each do |filter_parameter|
      if sensitive_data = request_data[filter_parameter]
        request_data[filter_parameter] = mask
      end
    end

    event.request.data = request_data
    event
  end

if you want to replicate how sanitize_fields work, you can mimic the behavior of the processor in the old SDK. you can also borrow ideas from other processors

we also recommend you to use server-side data scrubbing whenever possible.

@mcclymont
Copy link

Why was this feature dropped? This makes it very difficult to upgrade.

The functionality in sanitizedata.rb is not exactly trivial and something to be copy-pasted into before_send. I'd rather rely on the gem and their own tests rather than having to re-implement this feature along with tests

@HazAT
Copy link
Member

HazAT commented Jan 2, 2021

@mcclymont This decision was made already some time ago when we moved SDKs to be unified.
We understand that there might be some edge cases where you really want to sanitize in the SDK but for the majority of our users, it's enough to have server-side data scrubbing (see https://docs.sentry.io/product/data-management-settings/).
Maintaining all this code in all different SDKs wasn't feasible and also it caused some bugs.
We will try to provide better docs to make the migration easier but this feature, will not come back to SDKs the way it was.

@mcclymont
Copy link

Client-side sanitisation is much preferred over server-side scrubbing for obvious security reasons. If we don't send sensitive data to Sentry, we don't need to trust Sentry to scrub it properly.

The documentation even says this

Using before-send in the SDKs to scrub any data before it is sent is the recommended scrubbing approach, so sensitive data never leaves the local environment.

So I don't believe sanitization in the SDK is an "edge case"

@mcclymont
Copy link

sorry for the late reply, here's an example:

@st0012 Unfortunately this code sample does not work nearly as well as the old functionality in sentry-raven

  1. The sanitize_fields previously worked with regexes matching key names
  2. sanitize_fields would clean all nested fields, even in nested JSON, query params and other places by traversing Hashes recursively.

@st0012
Copy link
Collaborator

st0012 commented Jan 7, 2021

@mcclymont if you're using Rails, I think something like this should match the old behavior the best.

  filter = ActiveSupport::ParameterFilter.new(Rails.application.config.filter_parameters)
  config.before_send = lambda do |event, hint|
    event.request.data = filter.filter(event.request.data)
    event
  end

(I'm still improving this example though)

@mcclymont
Copy link

Thanks @st0012, that's a good idea.

Unfortunately, it still doesn't measure up to the functionality of the sentry-raven processor:

  1. It doesn't filter out sensitive query parameters
  2. It doesn't filter out parameters in request.body if they are encoded with JSON since the value of event.request.data is a string, not a hash

@st0012
Copy link
Collaborator

st0012 commented Jan 8, 2021

@mcclymont as we've stated before, we don't intend to provide either a feature nor a code snippet that matches the exact same behavior of sentry-raven. the one-fits-all approach sentry-raven took just didn't work and will not work (you can see some of the issues it caused here). so we won't try to do that with our examples either.

we'll improve our examples overtime to reduce the migration cost, but you should be the one to adjust them to fit your own use cases 🙂

@bbugh
Copy link

bbugh commented Jan 25, 2021

@st0012 that's not particularly helpful, and I don't think what people are asking for. It's not really a "migration guide" if you don't actually provide a way to migrate. It's not like this is casual data we're sending, ensuring that PII is covered is a legal and privacy requirement. I don't have time to study all of the stuff Sentry was doing under the hood to sanitize in order to replicate the behavior and then possibly still get it wrong and get into legal trouble. This is really disappointing position to take.

@st0012
Copy link
Collaborator

st0012 commented Jan 25, 2021

@bbugh the new SDK also provides an option to not sending any PII at all

config.send_default_pii = false # this is actually the default

just use it and that'll save you from the trouble 🙂

if you want part of the PII but don't want the rest of the parts, you'll need to scrub them yourself. because we don't know what you want and what you don't want. guessing a general pattern of sensitive data means it'll always accidentally scrub something else and it's a never-ending work, which eventually will become slow and buggy.

@st0012 st0012 added this to To do in 4.x via automation Jan 26, 2021
@mrexox
Copy link
Contributor

mrexox commented Feb 6, 2021

Hello, everyone!

I've made a try to add missing sanitization feature (in memory of sentry-raven) through gem sentry-sanitizer: https://github.com/mrexox/sentry-sanitizer

You can try it and check, if it fits your requirements. For my project it did fit. I hope, this would help.

Note: works only with sentry-ruby ~> 4.2.0

@st0012
Copy link
Collaborator

st0012 commented Feb 6, 2021

Note: works only with sentry-raven ~> 4.2.0

@mrexox I guess you mean sentry-ruby here? 😉

also thanks for building the new SDK's first community plugin 🎉

@mrexox
Copy link
Contributor

mrexox commented Feb 6, 2021

@st0012 yep, thank you, fixed it!

You're welcome! Hope, it is useful :)

@francisco-rojas
Copy link

@mcclymont if you're using Rails, I think something like this should match the old behavior the best.

  filter = ActiveSupport::ParameterFilter.new(Rails.application.config.filter_parameters)
  config.before_send = lambda do |event, hint|
    event.request.data = filter.filter(event.request.data)
    event
  end

(I'm still improving this example though)

I ended up doing this with async config:

  filter = ActiveSupport::ParameterFilter.new(Rails.application.config.filter_parameters)
  config.before_send = lambda do |event, _hint|
    event['request']['headers'] = filter.filter(event.dig('request', 'headers')) if event.dig('request', 'headers')
    event['request']['data'] = filter.filter(event.dig('request', 'data')) if event.dig('request', 'data')

    event
  end

I never saw a "data" key under event['request'] but I kept it just in case.

I also added config.filter_parameters += ['Authorization'] to application.rb in rails to filter out the Authorization token. Even though it is not in the list of Rails.application.config.filter_parameters it seems that sentry just filters it out in their UI, but I didnt want it to travel in the payload so I filter it before sending it to sentry.

@Vita1ik
Copy link

Vita1ik commented Feb 22, 2021

Me as well! : I never saw a "data" key under event['request'].

It have to work with Rails < 6.0

config.before_send = lambda do |event, _hint|
  mask = '*****'.freeze

  if event.dig('request', 'data')
      Rails.application.config.filter_parameters.each do |filter_parameter|
         event['request']['data'][filter_parameter.to_s] &&= mask
      end
  end
      
  event
end

@anderscarling
Copy link

A part of this issue I think is not really discussed enough (but mentioned by @mcclymont earlier in the tread) is that query params is not filtered by config.send_default_pii and not (as far as I've seen) mentioned in the documentation.

Obviously they are somewhat likely to contain pii or other sensitive info, especially stuff like password reset tokens (which if unique to a single user could be pii under gdpr) often go there. Obviously the same could be said for the rest of the request url as well, but I think query params is the most likely offended here.

This does imo make the naming of send_default_pii a bit unfortunate as it sort of promises a bit more than what it can possibly deliver, especially without doing more complicated filtering.

@st0012
Copy link
Collaborator

st0012 commented Feb 27, 2021

@anderscarling thanks for bringing up the reset token example. based on such cases, I think it's a bug that send_default_pii doesn't remove query params. I will exclude them in a recent release.

Related PR: #1302

@st0012
Copy link
Collaborator

st0012 commented Mar 1, 2021

Update: I think this should work better (and is simpler) than focus on filtering request data

  filter = ActiveSupport::ParameterFilter.new(Rails.application.config.filter_parameters)

  config.before_send = lambda do |event, hint|
    filter.filter(event.to_hash)
  end

@franzliedke
Copy link

franzliedke commented Mar 8, 2021

@st0012 before_send should return the filtered event, correct?

EDIT: I am stupid, this does. 🤦

@st0012
Copy link
Collaborator

st0012 commented May 28, 2021

I've updated the documentation several times for better sanitization guidance. And I think using Rails' parameter filter generally works well, so I'm closing this for now. But feel free to drop new comments or new issues if you think we can improve it further, thanks 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
4.x
  
Done
Development

No branches or pull requests