SPIKE: diagnose and find fix for NR agent error on 404 response - bump newrelic agent to latest #1032

Merged

@JFU-GIT JFU-GIT commented Jul 26, 2022

JIRA Ticket:
BB2-1417

User Story or Bug Summary:

Context:

In the deployment environments running the Django 3.2.14 upgrade (TEST and SBX), New Relic reports errors (see captured stack). These errors are new since the Django upgrade was deployed.

To reproduce:

Any URL pointing to the BB2 site with a nonexistent path triggers this error. Example URLs include, but are not limited to:

https://test.bluebutton.cms.gov/kkkkkkkkkkkkkkkkkk
https://sandbox.bluebutton.cms.gov/lllllllllllllllllllllll
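A minimal sketch of the reproduction step, assuming only the Python standard library. The hosts come from the example URLs above; the default path `no-such-path-kkkk` and the helper names `build_missing_url`, `fetch_status`, and `main` are hypothetical and chosen for illustration. The script only confirms that the site answers with an HTTP error status; the resulting New Relic error still has to be checked in the New Relic UI.

```python
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import urlopen

# Hosts taken from the example URLs in this PR description.
HOSTS = [
    "https://test.bluebutton.cms.gov",
    "https://sandbox.bluebutton.cms.gov",
]


def build_missing_url(host: str, path: str = "no-such-path-kkkk") -> str:
    """Return a URL on `host` whose path is assumed not to exist (hypothetical path)."""
    return urljoin(host.rstrip("/") + "/", path.lstrip("/"))


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for `url`, including error statuses."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is on the exception.
        return err.code


def main() -> None:
    """Request one nonexistent path per host and print the status code."""
    for host in HOSTS:
        url = build_missing_url(host)
        print(url, fetch_status(url))
```

Calling `main()` performs the network requests; a 404 for each host indicates that the nonexistent-path code path was exercised.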

As the error stack shows, the request-handling path that triggered the Python TypeError is not in BB2 code; it originates in third-party code, with both the New Relic and django-axes packages involved.

This needs to be triaged and a fix found so that the New Relic error log is clean.

Suspected problem area: an inconsistency between the New Relic agent and django-axes.

AC:

The root cause is identified and a fix is found.

What Does This PR Do?

The fix must come from the newrelic Python package itself, since we have no good way to fix a third-party inconsistency with Django 3.2 on our side.

The issue was found with a fairly old version of the newrelic agent (version 4.4.1.*, released in 2018).

Bumping the Python newrelic agent to the latest version (7.16.0.178, released July 2022) is likely to fix the issue.
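As a quick sanity check after the bump, something like the following could confirm which agent version is actually installed in an environment. This is a sketch, not part of the PR: the helper names are hypothetical, and it assumes newrelic versions are purely numeric dotted strings, as the two versions quoted above are.

```python
from importlib import metadata

MINIMUM = "7.16.0.178"  # the version this PR bumps to


def parse_version(text: str) -> tuple:
    """Turn a dotted numeric version such as '7.16.0.178' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_at_least(installed: str, minimum: str = MINIMUM) -> bool:
    """Return True if `installed` is the same as or newer than `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


def installed_newrelic_version():
    """Return the installed newrelic version string, or None if it is not installed."""
    try:
        return metadata.version("newrelic")
    except metadata.PackageNotFoundError:
        return None
```

For example, `is_at_least(installed_newrelic_version() or "0")` returns `False` when the package is missing or older than the bumped version.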

What Should Reviewers Watch For?

If you're reviewing this PR, please check these things, in particular:

  • TODO

What Security Implications Does This PR Have?

Submitters should complete the following questionnaire:

  • If the answer to any of the questions below is Yes, then here's a link to the associated Security Impact Assessment (SIA), security checklist, or other similar document in Confluence: N/A.
    • Does this PR add any new software dependencies? No.
    • Does this PR modify or invalidate any of our security controls? No.
    • Does this PR store or transmit data that was not stored or transmitted before? No.
  • If the answer to any of the questions below is Yes, then please add StewGoin as a reviewer, and note that this PR should not be merged unless/until he also approves it.
    • Do you think this PR requires additional review of its security implications for other reasons? No.

What Needs to Be Merged and Deployed Before this PR?

This PR cannot be either merged or deployed until the following pre-requisite changes have been fully deployed:

  • CMSgov/some_repo#42

Any Migrations?

  • Yes, there are migrations
    • The migrations should be run PRIOR to the code being deployed
    • The migrations should be run AFTER the code is deployed
    • There is a more complicated migration plan (downtime, etc)
  • No migrations

Submitter Checklist

I have gone through and verified that...:

  • This PR is reasonably limited in scope, to help ensure that:
    1. It doesn't unnecessarily tie a bunch of disparate features, fixes, refactorings, etc. together.
    2. There isn't too much of a burden on reviewers.
    3. Any problems it causes have a small "blast radius".
    4. It'll be easier to rollback if that becomes necessary.
  • I have named this PR and its branch such that they'll automatically be linked to the (most) relevant Jira issue, per: https://confluence.atlassian.com/adminjiracloud/integrating-with-development-tools-776636216.html.
  • This PR includes any required documentation changes, including README updates and changelog / release notes entries.
  • All new and modified code is appropriately commented, such that the what and why of its design would be reasonably clear to engineers, preferably ones unfamiliar with the project.
  • All tech debt and/or shortcomings introduced by this PR are detailed in TODO and/or FIXME comments, which include a JIRA ticket ID for any items that require urgent attention.
  • Reviews are requested from both:
    • At least two other engineers on this project, at least one of whom is a senior engineer or owns the relevant component(s) here.
    • Any relevant engineers on other projects (e.g. BFD, SLS, etc.).
  • Any deviations from the other policies in the DASG Engineering Standards are specifically called out in this PR, above.
    • Please review the standards every few months to ensure you're familiar with them.

@JFU-GIT JFU-GIT changed the title SPIKE: bump newrelic agent to latest - diagnose NR agent error on 404 response SPIKE: diagnose and find fix for NR agent error on 404 response - bump newrelic agent to latest Jul 26, 2022

@dtisza1 dtisza1 left a comment

@JFU-GIT I don't think we need the New Relic plugin for the selenium tests Docker image.

I'm thinking this file should remain unchanged, unless there are issues.

The other changes are looking good to me.

JFU-GIT commented Jul 29, 2022

Good question!

Even though newrelic is not used by the selenium tests, the direct install of newrelic==7.16.0.178 works around an issue.

When pip installs requirements.dev.txt from the vendor folder, as in this last line of the Dockerfile:
RUN pip3 install -r requirements/requirements.dev.txt --no-index --find-links ./vendor/

the install fails because the pinned newrelic version cannot be satisfied for the selenium-chrome Docker image and Python combination. The same direct install is needed for pyyaml==6.0 and pillow==9.0.1 (not introduced in this PR).
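Because `--no-index --find-links ./vendor/` restricts pip to files in the vendor folder, one way to narrow down this kind of failure is to list the pins that have no file named after that exact version. The sketch below is an illustration under assumptions, not the check used in this PR: the helper names are hypothetical, it only matches file names, and it does not verify that a wheel is compatible with the image's Python version or platform, which may be the actual cause here.

```python
import re
from pathlib import Path

# Matches simple pins of the form "name==version" at the start of a line.
PIN_PATTERN = re.compile(r"^([A-Za-z0-9._-]+)==([A-Za-z0-9._!+]+)")


def normalize(text: str) -> str:
    """Lowercase and collapse runs of '-', '_' and '.' into a single '-'."""
    return re.sub(r"[-_.]+", "-", text).lower()


def read_pins(requirements_text: str) -> list:
    """Return (name, version) pairs for lines of the form 'name==version'."""
    pins = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        match = PIN_PATTERN.match(line)
        if match:
            pins.append((match.group(1), match.group(2)))
    return pins


def missing_from_vendor(pins, vendor_filenames) -> list:
    """List pins with no vendor file named after that exact name and version."""
    normalized_files = [normalize(name) for name in vendor_filenames]
    missing = []
    for name, version in pins:
        prefix = normalize(f"{name}-{version}") + "-"
        if not any(f.startswith(prefix) for f in normalized_files):
            missing.append((name, version))
    return missing


def check(requirements_path: str, vendor_dir: str) -> list:
    """Convenience wrapper that reads the pins and the vendor listing from disk."""
    pins = read_pins(Path(requirements_path).read_text())
    filenames = [p.name for p in Path(vendor_dir).iterdir()]
    return missing_from_vendor(pins, filenames)
```

Under these assumptions, `check("requirements/requirements.dev.txt", "vendor")` would return the pins that pip cannot find by name in the vendor folder.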

I believe other packages that the selenium tests do not use still get installed through the vendor folder, but that seems an acceptable 'waste' of disk space and runtime overhead; these are tests run locally or in CI, after all.

There are ways to be exact about the dependencies installed at runtime for tests and CI, for example using a trimmed requirements.selenium.txt, but that could make the dev environment cumbersome and add maintenance burden.

Just my 2c.

@dtisza1 dtisza1 left a comment

@JFU-GIT This looks good to me!

Keeping the NR and other packages in the selenium tests Docker image makes sense.

@ajshred ajshred left a comment

Thanks James!

@JFU-GIT JFU-GIT merged commit dd60ad6 into master Aug 1, 2022
@JFU-GIT JFU-GIT deleted the jfuqian/BB2-1417-Investigate-NR-agent-error-on-404-response branch August 1, 2022 15:29