Implement scraping movies by URL #709

woodgen · 2020-08-07T17:59:20Z

This is the first time I have really worked with Go or TypeScript, so a careful review would be appreciated.

This series of patches implements the ability to scrape movies by URL.
It supports fields such as duration, studio, and front and back images.
An example XPath scraper for Gamma Entertainment can be found on my community scrapers branch:
https://github.com/woodgen/CommunityScrapers/tree/movies

I tested the scraper with Evil Angel and it works well for creating new and updating existing movies by URL.

For more details, I am pasting the commit messages here, since GitHub will hide them:

    ui/movies: Add movie scrape dialog.
    
    Adds possibility to update existing movie entries with the URL scraper.
    
    For this the MovieScrapeDialog.tsx was implemented with Performers and
    Scenes as a reference. In addition DurationUtils needs to be called one
    time for converting seconds from the model to the string that is
    displayed in the component. This seemed the least intrusive to me as it
    kept a ScrapeResult<string> type compatible with ScrapedInputGroupRow.

    graphql+pkg+ui: Scrape movie studio.
    
    Extends and corrects the movie model for the ability to store and
    dereference studio IDs with received studio string from the scraper.
    This was done with Scenes as a reference. For simplicity the duplication
    of having `ScrapedMovieStudio` and `ScrapedSceneStudio` was kept, which
    should probably be refactored to be the same type in the model in the
    future.

    graphql+pkg+ui: Implement scraping movies by URL.
    
    This patch implements the missing required boilerplate for scraping
    movies by URL, using performers and scenes as a reference.
    
    Although this patch contains a big chunk of groundwork for enabling
    scraping movies by fragment, the feature would require additional
    changes to be completely implemented and was not tested.
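
The duration handling described in the ui/movies commit message above can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the patch: the `ScrapeResult` interface here is a simplified stand-in, and `secondsToString` and `durationResult` are hypothetical names chosen for the example. The idea is that the stored duration (in seconds) is converted to a display string once, so the scrape result stays a plain `ScrapeResult<string>`.

```typescript
// Hypothetical, simplified stand-in for the ScrapeResult<T> type
// used by the scrape dialogs (assumption, not the real definition).
interface ScrapeResult<T> {
  originalValue?: T;
  newValue?: T;
}

// Hypothetical helper: formats seconds as "h:mm:ss", or "m:ss" when under an hour.
function secondsToString(seconds: number): string {
  const total = Math.max(0, Math.floor(seconds));
  const h = Math.floor(total / 3600);
  const m = Math.floor((total % 3600) / 60);
  const s = total % 60;
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  return h > 0 ? `${h}:${pad(m)}:${pad(s)}` : `${m}:${pad(s)}`;
}

// Convert the stored duration (seconds) to a string exactly once, so the
// row component only ever deals with string values.
function durationResult(
  storedSeconds: number | undefined,
  scrapedDuration: string | undefined
): ScrapeResult<string> {
  return {
    originalValue:
      storedSeconds !== undefined ? secondsToString(storedSeconds) : undefined,
    newValue: scrapedDuration,
  };
}
```

With this shape, a row component that expects string results does not need to know that the model stores the duration as a number.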

WithoutPants

Looks good. Thanks for the submission.

woodgen force-pushed the scrape-movies branch from 6962571 to 16dd817 on August 7, 2020 21:52

woodgen added 4 commits August 8, 2020 00:16

api/urlbuilders/movie: Auto format.

0e957dc

woodgen force-pushed the scrape-movies branch from 16dd817 to 162cb77 on August 7, 2020 22:16

WithoutPants added the "feature" label (Pull requests that add a new feature) Aug 10, 2020

Merge remote-tracking branch 'upstream/develop' into prs/709

8ee52eb

WithoutPants added this to the Version 0.3.0 milestone Aug 10, 2020

WithoutPants added 3 commits August 10, 2020 14:36

Merge remote-tracking branch 'upstream/develop' into prs/709

73c4bbe

Update manual

fe97e9e

Add changelog entry

2bb58c0

WithoutPants approved these changes Aug 10, 2020

WithoutPants merged commit 4045ddf into stashapp:develop Aug 10, 2020
