Try URL before creating / uploading Matterfile #179
Comments
Question: If the URL does exist but it turns out the call to |
This is perfect.
Yes, if the returned URL is |
Thanks for the info @JacksonMaxfield! I just want to double check on the hosting bit... the class declaration for the SupportingFile mentions it is not stored in firestore:
Is that still accurate / does it pertain to this at all? |
Great question! And I guess I didn't realize how deep in the weeds this issue was in the CDP infrastructure / pipeline. What you are looking at is our "ingestion models", the format of how our pipeline accepts data. What we are storing is our database models. Regardless, those two things are connected, where you see a [...] But yes, these files aren't stored in our own file store; we just store a link (the https or http URL) to the file in our database. |
Apologies for the delay on getting this out, been working on it a bit this weekend and attempting to mock the
try_url doesn't actually perform HTTP requests) has been causing some problems.
Will update if I have any further progress (hopefully in the form of a PR!). |
I think you should be able to use:
with mock.patch("cdp_backend.database.validators.resource_exists") as mocked_resource_exists:
    mocked_resource_exists.return_value = True
But I may be wrong... |
I tried it with the parameterized patch, I'll try it with that way instead 👍 |
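For the mocking discussion above, one way to keep a test off the network is to pass a fake resolver in explicitly instead of patching a module attribute. The sketch below is only an illustration: `resource_exists` and `try_url` here are simplified, hypothetical stand-ins written for this example, not the real cdp_backend implementations, and the example URLs are made up.

```python
from typing import Callable
from unittest import mock


def resource_exists(uri: str) -> bool:
    """Hypothetical stand-in. The real function would perform a network check."""
    raise RuntimeError("network access is not expected in this test")


def try_url(url: str, resolve_func: Callable[[str], bool] = resource_exists) -> str:
    """Hypothetical stand-in: return the URL if it resolves, else raise LookupError."""
    if resolve_func(url):
        return url
    raise LookupError(f"the resource {url} could not be found")


def test_try_url_without_network() -> None:
    # A Mock that always reports the resource as present.
    fake_exists = mock.Mock(return_value=True)
    url = "https://example.com/doc.pdf"
    assert try_url(url, resolve_func=fake_exists) == url
    fake_exists.assert_called_once_with(url)

    # A Mock that always reports the resource as missing.
    fake_missing = mock.Mock(return_value=False)
    try:
        try_url("https://example.com/missing.pdf", resolve_func=fake_missing)
    except LookupError:
        pass
    else:
        raise AssertionError("expected LookupError for a missing resource")
```

One general Python detail may be relevant here: default argument values are evaluated once, when the function is defined. If a resolver is bound as a default argument, patching the module-level name afterwards does not change that default, so injecting the fake explicitly (or patching the caller's reference) can be more predictable.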
Feature Description
This feature (enhancement) would add a bit more resiliency to the creation of Matterfiles seen here:
cdp-backend/cdp_backend/pipeline/event_gather_pipeline.py
Lines 1444 to 1452 in 2e472b9
Specifically, we want to check that the supporting_file.uri used in the creation of the Matterfile actually exists before creating / uploading the Matterfile.
Use Case
Simply, the use case here is to create better Matterfile data by ensuring that the URLs used exist.
Solution
In the above linked code, before the call to create_matter_file, do the following:
1. Call try_url (using the default resolve_func) to see if the URL exists.
2. Add LookupError as an exception to watch for in that loop, as try_url will raise that exception on errors.
Alternatives
Let me know if any other options are better.
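The Solution steps above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the pipeline's actual code: `SupportingFile`, `try_url`, and `collect_valid_uris` are simplified, hypothetical stand-ins, and the resolver is passed in explicitly so the example runs without network access.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Iterable, List

log = logging.getLogger(__name__)


@dataclass
class SupportingFile:
    """Hypothetical, simplified stand-in for the ingestion model."""

    name: str
    uri: str


def try_url(url: str, resolve_func: Callable[[str], bool]) -> str:
    """Hypothetical stand-in: return the URL if it resolves, else raise LookupError."""
    if resolve_func(url):
        return url
    raise LookupError(f"the resource {url} could not be found")


def collect_valid_uris(
    supporting_files: Iterable[SupportingFile],
    resolve_func: Callable[[str], bool],
) -> List[str]:
    """Return the URIs that resolve, skipping (and logging) the ones that do not."""
    valid: List[str] = []
    for supporting_file in supporting_files:
        try:
            uri = try_url(supporting_file.uri, resolve_func)
        except LookupError as e:
            # Skip this file instead of creating a record with a dead link.
            log.warning("Skipping %s: %s", supporting_file.name, e)
            continue
        # In the pipeline, this is roughly where create_matter_file would be called.
        valid.append(uri)
    return valid
```

A quick usage example with a fake resolver that only accepts one file:

```python
files = [
    SupportingFile("Agenda", "https://example.com/agenda.pdf"),
    SupportingFile("Minutes", "https://example.com/minutes.pdf"),
]
print(collect_valid_uris(files, lambda uri: uri.endswith("agenda.pdf")))
```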