Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ActiveSupportLogger logs way too many Ruby objects (for SQL breadcrumbs) #1266

Closed
leonid-shevtsov opened this issue Feb 4, 2021 · 2 comments
Assignees
Projects

Comments

@leonid-shevtsov
Copy link

leonid-shevtsov commented Feb 4, 2021

Describe the bug

Symptom: stack level too deep error when sending event

I've debugged this down to breadcrumbs containing Ruby objects that definitely should not be sent with the trace.

Here is one such breadcrumb; I've obtained it by patching event.to_hash and sanitizing it:

  def self.scrub_ruby_objects(obj)
    case obj
    when Array then obj.map { |x| scrub_ruby_objects(x) }
    when Hash then obj.transform_values {|x| scrub_ruby_objects(x)}
    when String, Numeric, Date, Symbol, nil, true, false then obj
    else "SCRUBBED_#{obj.to_s}"
    end
  end
      {
        "category": "sql.active_record",
        "data": {
          "sql": "SELECT \"billing_discounts\".* FROM \"billing_discounts\" WHERE \"billing_discounts\".\"billing_subscription_id\" = $1 LIMIT $2",
          "binds": [
            "SCRUBBED_#<ActiveRecord::Relation::QueryAttribute:0x00007ff0e33f1f98>",
            "SCRUBBED_#<ActiveModel::Attribute::WithCastValue:0x00007ff0e2bfcba8>"
          ],
          "type_casted_binds": "SCRUBBED_#<Proc:0x00007ff0e33f1de0 /.../ruby/gems/2.7.0/gems/activerecord-6.1.1/lib/active_record/connection_adapters/abstract/query_cache.rb:130 (lambda)>",
          "name": "BillingDiscount Load",
          "connection": "SCRUBBED_#<ActiveRecord::ConnectionAdapters::PostgreSQLAdapter:0x00007ff0e32b57d8>",
          "cached": true,
          "start_timestamp": 1612376469.889782
        },
        "level": null,
        "message": null,
        "timestamp": 1612376469,
        "type": null
      },

To Reproduce

Just enable transaction tracing and active_support_logger. (Transaction tracing needed to generate events)

Expected behavior

I really insist you should not use a blacklist approach with the IGNORED_DATA_TYPES constant here. With a blacklist, every new change to rails logging - or instrumentation from any other library - can break breadcrumbs again. It makes more sense to log the bare minimum of information that's known to be simple data types (name, timestamp), and then add to that for certain event types with data that, again, you know to be simple (like a SQL query).

BTW performance traces don't have the same problem - and that's because Sentry::Rails::Tracing::ActiveRecordSubscriber uses just the approach I described.

Environment

sentry-rails (4.1.6)
sentry-ruby (4.1.5)
sentry-ruby-core (4.1.5)

Config

config.breadcrumbs_logger = %i[sentry_logger active_support_logger]
@st0012
Copy link
Collaborator

st0012 commented Feb 4, 2021

thanks for reporting the issue. I agree that there are many breadcrumb data that are unnecessary to capture, but I don't think we can just use the tracing subscriber's approach here. although tracing subscribers & breadcrumb loggers both subscribe to instrumentation events, they have very different purposes:

  • tracing subscribers expect certain types of performance-related events, which are just several of them and don't vary too much between clients. so the data can be filtered accurately.
  • breadcrumb loggers, on the other hand, are designed to capture as many events as possible. and because different events' key information can be named very differently, the whitelisting approach means we'll need to have a very long list (I don't think just having name & timestamp is a great default).
    • even if we only capture events from the Rails itself, the list would still be long (see all the events here).
    • and if we think about instrumentation made by the users themselves or their libraries, it's basically impossible for us to make a correct list for them. this means they'll lose the information they need.

I'm not saying it's an approach I'll never take. I just think it could sacrifice some customers the data they need so I won't choose it without trying other options first.

Possible Causes of The Error

but we need to talk about the potential cause of the stack level too deep error first. after doing some research, the cause could be (but not limited to):

My Proposals

so my ideas for addressing the above issues are:

Separate individual breadcrumb's serialization

we serialize breadcrumbs individually and ditch the problematic data if we see any error

Pros

  • we can retain as much data as possible
  • not easy to implement & maintain

Cons

  • it doesn't address the root cause of the issue, which is to have healthy data in the first place

Clean the data ourselves

we scrub the data as you did for tracking down this issue (convert all data into pure Ruby objects) + remove circular references

Pros

  • it solves the issue from the source and we don't need to rescue anything
  • could be simple to implement

Cons

  • based on our experience with data sanitization in the old SDK, deeply iterating through customer data could be error-prone because there could always be unexpected behavior. for example, in some old xml builder libs calling to_s to a builder object will generate a to_s entry to the xml document and mutates the data.
  • depends on the amount of the data, it could allocate a lot of memory during the process just by generating all the new strings.

No Conclusion, Yet

I actually have implemented the first approach in #1250, which has been released in sentry-ruby v4.1.6. I hope you can upgrade the SDK and see if it works for you first. and whether it worked or not, we can continue this discussion base on the result there and maybe we'll have other ideas 🙂

Update

today we released version 4.2.0 that added the new before_breadcrumb config option. you can use it like this:

config.before_breadcrumb = lambda do |breacrumb, hint|
  breadcrumb.data = scrubbed_data
  breadcrumb
end

@st0012 st0012 added this to To do in 4.x via automation Feb 4, 2021
@st0012 st0012 modified the milestone: sentry-rails-4.2.1 Feb 4, 2021
@st0012 st0012 closed this as completed Mar 3, 2021
4.x automation moved this from To do to Done Mar 3, 2021
@lucaong
Copy link

lucaong commented Apr 8, 2021

@leonid-shevtsov does your app use the oj gem for JSON serialization? If so, the stack level too deep might be caused by this: #1389

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
4.x
  
Done
Development

No branches or pull requests

3 participants