-
-
Notifications
You must be signed in to change notification settings - Fork 760
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unify scrape refactor #1630
Unify scrape refactor #1630
Conversation
Edit: you can ignore this comment. I compared it against develop where it is working fine, but my "production" system is broken as well (build based on #1604). So this should be 59c6fe0 / #1632 Did some quick testing. Scrape by URL and using "scrape with" seem to be working fine for both scenes and galleries. |
From a quick testing there seem to be issues when using the tagger interface ( i merged the dev version to avoid the above mrntioned bug)
For the scenes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a review, but I guess it's more like I learned from the code than actually having good feedback. Most comments are based on personal preferences which I would have done differently, so obviously not gonna make any rejections against them.
I guess the "biggest" issue is that ScrapedPerformer.image
is still being used by the client while it is marked as deprecated. But there might be more which I didn't check.
Thanks for the testing and feedback. The bugs reported should now be addressed. |
{"errors":[{"message":"unknown field","path":["variable","input","stash_ids",0,"__typename"],"extensions":{"code":"GRAPHQL_VALIDATION_FAILED"}}],"data":null}
|
Studio bug was an existing latent bug. Both issues should now be fixed. |
I'm guessing I should hold my hand up for that? Most likely a regression from the Studio edit/details refactor. Although I seem to be able to edit/save a studio just fine on the develop branch. (But issue being stash box related I might not be able to reproduce because of not having stash box). |
It doesn't matter - I've introduced many very similar bugs from similar code changes. The bug is only reproducible if you have a stash-id on the studio. |
Found a regression I think. On develop I can open a newly scanned scene, fill in some of the fields and run a scraper (in my case the SARJ-LLC Python scraper) and it would work fine and show the result. But on this branch the scraper throws an error. Don't have time to investigate this at the moment, but I guess the filled in data isn't submitted and it uses the data from the database which obviously is still empty. When saving and afterwards executing the scene scraper it does work fine (but I guess changing data in between will also send the old / saved data to the scraper instead of the unsaved data). |
Checked the code. For scenes, galleries and movies there is a |
This regression is related to the way scene scrapers work, which is different to the way the performer scraper works, and I believe I've painted myself into a corner a bit. In the original scene scraping code, the "Scrape With..." scrape interface accepts This is distinct from the performer scrapers, where the performer fragment input is assumed to be the from a scraped performer search result. In preparation of the scene scraper query change, I changed the way that scene scraping works. The I'm thinking that this may be a necessary regression in order to make it consistent with the performer scraper code, and allow scene scrape querying. This means that all scene fragment scrapers would operate on current scene state. |
Okay, good point about performer scraper having the search form when using "scrape with". But could you explain me then what future plans you have? :) Will scenes and galleries get a search form too, where you can enter title and date? Then I would be happy, as currently I use my SARJ-LLC scraper based on (just) those fields which works pretty nice as I don't have to look up the URL. Just fill in the title and date (which are part of filename and thus title after scanning), and start the scraper which will fill in all fields including URL etc. |
The same already goes for JSON scrapers. They work with what is in the DB not by input , so that wouldnt be that much of a regression EDIT
|
The WIP branch is here: https://github.com/WithoutPants/stash/tree/scrape-scene-by-name
Still looks like it works ok for me. Can you give me some example hashes? Assuming they're in the database you provided forever ago, I should be able to reproduce using that. |
Thanks for the link. Ran it and checked the code. Seems like it adds a new "scraping" method based on name to provide a list of hits, correct? (So identical to performers name based search / scraping) But of course this discussion doesn't really affect this PR. Especially since you both stated JSON and xpath scrapers already receive the fragment from the database and not the edited values. So there was already a difference between those which is now gone and can be seen as a bugfix. |
I managed to replicate that using a single scene. Even if you have the old db it doesnt have the phashes :-( Steps to reproduce.
Dev build returns a result while the build from here doesnt All the matches i had with missing results seem to be matches by |
Can you merge/rebase with current dev? I think there are some bugs left that were already fixed in the dev version (like the duration of the scenes that match using the scene tagger) and the |
Done, but I don't think any of the latest changes would have fixed anything, so you might be seeing new bugs. |
Yeah new bug confirmed. PR Build -> Duration uknown The plain search query from the tagger has also the same issue The |
Thanks. This should now be fixed. |
ScrapedScene*
graphql types intoScraped*
This PR lays the groundwork for some additional changes and PRs. I have a PR almost ready to submit that adds support for querying scene scrapers by string, but I need this PR tested and reviewed first.