Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(similarity): Add formatted stacktrace null check #80796

Merged
merged 13 commits into from
Nov 22, 2024

Conversation

jangjodi
Copy link
Member

Add an empty stacktrace check after stacktrace formatting is done in ingest
If the stacktrace is empty, do not send data to seer

@jangjodi jangjodi requested a review from a team as a code owner November 14, 2024 23:12
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Nov 14, 2024
Copy link

codecov bot commented Nov 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #80796      +/-   ##
==========================================
+ Coverage   77.34%   80.33%   +2.98%     
==========================================
  Files        7223     7223              
  Lines      320123   320132       +9     
  Branches    20779    20779              
==========================================
+ Hits       247612   257189    +9577     
+ Misses      72118    62550    -9568     
  Partials      393      393              

@jangjodi jangjodi changed the title fix(similarity): Add formatted stacktracce null check fix(similarity): Add formatted stacktrace null check Nov 19, 2024
src/sentry/grouping/ingest/seer.py Outdated Show resolved Hide resolved
Co-authored-by: Armen Zambrano G. <[email protected]>
Copy link
Member

@lobsterkatie lobsterkatie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good! Just some nits and suggestions.

src/sentry/grouping/ingest/seer.py Outdated Show resolved Hide resolved
src/sentry/grouping/ingest/seer.py Outdated Show resolved Hide resolved
exception_type = get_path(event.data, "exception", "values", -1, "type")

request_data: SimilarIssuesEmbeddingsRequest = {
"event_id": event.event_id,
"hash": event_hash,
"project_id": event.project.id,
"stacktrace": stacktrace_string,
"stacktrace": event.data.get("stacktrace_string", ""),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to use .get here? If we make it to this point, we know we'll have stored the string on the event.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need it, but considering that it's in ingest I thought we'd play it safe just in case someone calls this function again later and forgets this caveat

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That fair. Actually, here's an idea: Assuming my tallying-of-frames business works, I'm realizing I'm actually still going to need the variants to be passed into this function. So what if we were to do it like this?

Suggested change
"stacktrace": event.data.get("stacktrace_string", ""),
"stacktrace": event.data.get(
"stacktrace_string", get_stacktrace_string(get_grouping_info_from_variants(variants))
),

Then you also wouldn't need to add a call to should_call_seer_for_grouping in test_sends_expected_data_to_seer.

src/sentry/grouping/ingest/seer.py Show resolved Hide resolved
tests/sentry/grouping/ingest/test_seer.py Outdated Show resolved Hide resolved
tests/sentry/grouping/ingest/test_seer.py Outdated Show resolved Hide resolved
tests/sentry/grouping/ingest/test_seer.py Outdated Show resolved Hide resolved
tests/sentry/grouping/ingest/test_seer.py Outdated Show resolved Hide resolved
tests/sentry/grouping/ingest/test_seer.py Outdated Show resolved Hide resolved
tests/sentry/grouping/ingest/test_seer.py Outdated Show resolved Hide resolved
exception_type = get_path(event.data, "exception", "values", -1, "type")

request_data: SimilarIssuesEmbeddingsRequest = {
"event_id": event.event_id,
"hash": event_hash,
"project_id": event.project.id,
"stacktrace": stacktrace_string,
"stacktrace": event.data.get("stacktrace_string", ""),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That fair. Actually, here's an idea: Assuming my tallying-of-frames business works, I'm realizing I'm actually still going to need the variants to be passed into this function. So what if we were to do it like this?

Suggested change
"stacktrace": event.data.get("stacktrace_string", ""),
"stacktrace": event.data.get(
"stacktrace_string", get_stacktrace_string(get_grouping_info_from_variants(variants))
),

Then you also wouldn't need to add a call to should_call_seer_for_grouping in test_sends_expected_data_to_seer.

data={
"title": "title",
"platform": "python",
"stacktrace": {"in-app": False, "contributes": False, "frames": [{}]},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think in-app and contributes are things which show up in actual event data (only in variants). Also, an empty frames list also wouldn't get past event_content_is_seer_eligible... But here's maybe a simpler solution: Can we just mock get_stacktrace_string to return an empty string?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The empty frame list actually passes event_content_is_seer_eligible, hence how the empty stacktrace metric is called after

tests/sentry/grouping/ingest/test_seer.py Outdated Show resolved Hide resolved
@jangjodi
Copy link
Member Author

@lobsterkatie regarding your first suggestion, we don't want to have to call get_stacktrace_string again since we need to call it in should_call_seer_for_grouping to figure out if the stacktrace is empty

@lobsterkatie
Copy link
Member

@lobsterkatie regarding your first suggestion, we don't want to have to call get_stacktrace_string again since we need to call it in should_call_seer_for_grouping to figure out if the stacktrace is empty

Right - that's why we're still getting it out of the event, where it should be, given that fact. It's just that if somehow it's not there, we're recalculating it as backup.

@jangjodi
Copy link
Member Author

@lobsterkatie regarding your first suggestion, we don't want to have to call get_stacktrace_string again since we need to call it in should_call_seer_for_grouping to figure out if the stacktrace is empty

Right - that's why we're still getting it out of the event, where it should be, given that fact. It's just that if somehow it's not there, we're recalculating it as backup.

ah sorry, I missed that. thank you!

@jangjodi jangjodi merged commit 87d5759 into master Nov 22, 2024
50 checks passed
@jangjodi jangjodi deleted the jodi/similarity-ingest-empty-stacktrace branch November 22, 2024 17:32
harshithadurai pushed a commit that referenced this pull request Nov 25, 2024
Add an empty stacktrace check after stacktrace formatting is done in
ingest
If the stacktrace is empty, do not send data to seer

---------

Co-authored-by: Armen Zambrano G. <[email protected]>
evanh pushed a commit that referenced this pull request Nov 25, 2024
Add an empty stacktrace check after stacktrace formatting is done in
ingest
If the stacktrace is empty, do not send data to seer

---------

Co-authored-by: Armen Zambrano G. <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Scope: Backend Automatically applied to PRs that change backend components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants