Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Caption support #2462

Merged
merged 63 commits into from
May 6, 2022
Merged

Caption support #2462

merged 63 commits into from
May 6, 2022

Conversation

cj12312021
Copy link
Collaborator

@cj12312021 cj12312021 commented Apr 1, 2022

Fixes #495

This pull request adds caption support for DE, EN, ES, FR, IT, NL, and PT VTT files. As far as I'm aware those are the only languages that are captioned in adult content. This pull request only adds support for VTT files since video.js only supports VTT files but we could support SRT files later by converting the files to VTT on the fly, see videojs/video.js#5923.

Caption

@WithoutPants WithoutPants added this to the Version 0.15.0 milestone Apr 2, 2022
@bnkai
Copy link
Collaborator

bnkai commented Apr 2, 2022

Some minor diffs to support srts as well.
After applying them do a go mod tidy and go mod vendor

Instead of serving the subtitle files, the subtitle files (vtt,srt) are first parsed and converted to vtts using the astisub package. That means that we can support every sub type that that library supports and also dismiss non valid subs.

diff --git a/internal/api/routes_scene.go b/internal/api/routes_scene.go
index 3ba2ed3d..fc83d5a8 100644
--- a/internal/api/routes_scene.go
+++ b/internal/api/routes_scene.go
@@ -1,6 +1,7 @@
 package api
 
 import (
+	"bytes"
 	"context"
 	"net/http"
 	"strconv"
@@ -298,46 +299,49 @@ func (rs sceneRoutes) InteractiveHeatmap(w http.ResponseWriter, r *http.Request)
 	http.ServeFile(w, r, filepath)
 }
 
-func (rs sceneRoutes) CaptionDE(w http.ResponseWriter, r *http.Request) {
+func (rs sceneRoutes) Caption(w http.ResponseWriter, r *http.Request, lang string) {
 	s := r.Context().Value(sceneKey).(*models.Scene)
-	caption := scene.GetCaptionDEPath(s.Path)
-	serveFileNoCache(w, r, caption)
+
+	caption := scene.GetCaptionPath(s.Path, lang, "vtt") // first try reading vtt
+	sub, err := scene.ReadSubs(caption)
+	if err != nil {
+		caption = scene.GetCaptionPath(s.Path, lang, "srt") // switch to srt if vtt wasnt available or invalid
+		sub, err = scene.ReadSubs(caption)
+	}
+
+	if err == nil {
+		var b bytes.Buffer
+		err = sub.WriteToWebVTT(&b)
+		if err == nil {
+			w.Header().Set("Content-Type", "text/vtt")
+			w.Header().Add("Cache-Control", "no-cache")
+			b.WriteTo(w)
+		}
+		return
+	}
+	logger.Debugf("Error while reading subs: %v", err)
 }
 
+func (rs sceneRoutes) CaptionDE(w http.ResponseWriter, r *http.Request) {
+	rs.Caption(w, r, "de")
+}
 func (rs sceneRoutes) CaptionEN(w http.ResponseWriter, r *http.Request) {
-	s := r.Context().Value(sceneKey).(*models.Scene)
-	caption := scene.GetCaptionENPath(s.Path)
-	serveFileNoCache(w, r, caption)
+	rs.Caption(w, r, "en")
 }
-
 func (rs sceneRoutes) CaptionES(w http.ResponseWriter, r *http.Request) {
-	s := r.Context().Value(sceneKey).(*models.Scene)
-	caption := scene.GetCaptionESPath(s.Path)
-	serveFileNoCache(w, r, caption)
+	rs.Caption(w, r, "es")
 }
-
 func (rs sceneRoutes) CaptionFR(w http.ResponseWriter, r *http.Request) {
-	s := r.Context().Value(sceneKey).(*models.Scene)
-	caption := scene.GetCaptionFRPath(s.Path)
-	serveFileNoCache(w, r, caption)
+	rs.Caption(w, r, "fr")
 }
-
 func (rs sceneRoutes) CaptionIT(w http.ResponseWriter, r *http.Request) {
-	s := r.Context().Value(sceneKey).(*models.Scene)
-	caption := scene.GetCaptionITPath(s.Path)
-	serveFileNoCache(w, r, caption)
+	rs.Caption(w, r, "it")
 }
-
 func (rs sceneRoutes) CaptionNL(w http.ResponseWriter, r *http.Request) {
-	s := r.Context().Value(sceneKey).(*models.Scene)
-	caption := scene.GetCaptionNLPath(s.Path)
-	serveFileNoCache(w, r, caption)
+	rs.Caption(w, r, "nl")
 }
-
 func (rs sceneRoutes) CaptionPT(w http.ResponseWriter, r *http.Request) {
-	s := r.Context().Value(sceneKey).(*models.Scene)
-	caption := scene.GetCaptionPTPath(s.Path)
-	serveFileNoCache(w, r, caption)
+	rs.Caption(w, r, "pt")
 }
 
 func (rs sceneRoutes) VttThumbs(w http.ResponseWriter, r *http.Request) {
diff --git a/pkg/scene/caption.go b/pkg/scene/caption.go
index 995ea3aa..ca90c1c3 100644
--- a/pkg/scene/caption.go
+++ b/pkg/scene/caption.go
@@ -3,46 +3,19 @@ package scene
 import (
 	"path/filepath"
 	"strings"
-)
-
-func GetCaptionDEPath(path string) string {
-	ext := filepath.Ext(path)
-	fn := strings.TrimSuffix(path, ext)
-	return fn + ".de.vtt"
-}
-
-func GetCaptionENPath(path string) string {
-	ext := filepath.Ext(path)
-	fn := strings.TrimSuffix(path, ext)
-	return fn + ".en.vtt"
-}
 
-func GetCaptionESPath(path string) string {
-	ext := filepath.Ext(path)
-	fn := strings.TrimSuffix(path, ext)
-	return fn + ".es.vtt"
-}
-
-func GetCaptionFRPath(path string) string {
-	ext := filepath.Ext(path)
-	fn := strings.TrimSuffix(path, ext)
-	return fn + ".fr.vtt"
-}
-
-func GetCaptionITPath(path string) string {
-	ext := filepath.Ext(path)
-	fn := strings.TrimSuffix(path, ext)
-	return fn + ".it.vtt"
-}
+	"github.com/asticode/go-astisub"
+)
 
-func GetCaptionNLPath(path string) string {
+// GetCaptionPath generates the path of a subtitle
+// from a given file path wanted language and subtitle sufffix
+func GetCaptionPath(path, lang, suffix string) string {
 	ext := filepath.Ext(path)
 	fn := strings.TrimSuffix(path, ext)
-	return fn + ".nl.vtt"
+	return fn + "." + lang + "." + suffix
 }
 
-func GetCaptionPTPath(path string) string {
-	ext := filepath.Ext(path)
-	fn := strings.TrimSuffix(path, ext)
-	return fn + ".pt.vtt"
+// ReadSubs reads a subtitles file
+func ReadSubs(path string) (*astisub.Subtitles, error) {
+	return astisub.OpenFile(path)
 }
diff --git a/pkg/scene/scan.go b/pkg/scene/scan.go
index be2aa48e..c4fd7ce7 100644
--- a/pkg/scene/scan.go
+++ b/pkg/scene/scan.go
@@ -62,6 +62,9 @@ func (scanner *Scanner) ScanExisting(existing file.FileBased, file file.SourceFi
 	interactive := getInteractive(path)
 
 	captioned := getCaption(file.Path())
+	if captioned {
+		logger.Debugf("Found subtitle for file %s", path)
+	}
 
 	oldHash := s.GetHash(scanner.FileNamingAlgorithm)
 	changed := false
@@ -342,33 +345,16 @@ func getInteractive(path string) bool {
 }
 
 func getCaption(path string) bool {
-	_, err := os.Stat(GetCaptionDEPath(path))
-	if err == nil {
-		return true
-	}
-	_, err = os.Stat(GetCaptionENPath(path))
-	if err == nil {
-		return true
-	}
-	_, err = os.Stat(GetCaptionESPath(path))
-	if err == nil {
-		return true
-	}
-	_, err = os.Stat(GetCaptionFRPath(path))
-	if err == nil {
-		return true
-	}
-	_, err = os.Stat(GetCaptionITPath(path))
-	if err == nil {
-		return true
-	}
-	_, err = os.Stat(GetCaptionNLPath(path))
-	if err == nil {
-		return true
-	}
-	_, err = os.Stat(GetCaptionPTPath(path))
-	if err == nil {
-		return true
+	langs := []string{"de", "en", "es", "fr", "it", "nl", "pt"}
+	for _, l := range langs {
+		_, err := os.Stat(GetCaptionPath(path, l, "vtt"))
+		if err != nil {
+			_, err = os.Stat(GetCaptionPath(path, l, "srt"))
+		}
+		if err == nil {
+			return true
+		}
+
 	}
 	return false
 }

@cj12312021
Copy link
Collaborator Author

I think we'll need to save the caption file paths to the database so we aren't guessing where to find the caption in routes_scene. If we do that we can also avoid adding captions to video.js that don't exist since if (scene.paths.caption_de) in ScenePlayer.tsx always returns true.

@bnkai
Copy link
Collaborator

bnkai commented Apr 3, 2022

Pasting input from the channel

I just got around to applying your change. The library is great. Regarding other subtitles, apart from .vtt files we would also be looking for .srt, .smi, .ssa, and .ass files. I still think the plex approach is best. For each scene, we would check and see if a subtitle file exist with the following name pattern {video_name}.{lang_code}.{sub_ext} and {video_name}.{sub_ext}. If we find a subtitle file with language code then we can provide a label for it video.js. If not then we would label it Unknown.
The concern here is that for each scene we would be looking for the existence of 40 different files(5 different extensions, 8 different names). 75 different files if we decided we also want to look for files that use the 3 letter language code. I'm not sure how much that would impact scan time. If the impact is noticeable I'm thinking we could have toggle on the Scan task to enable stash to look for subtitle.

And

I'm not sure what's going on with the captions during transcodes. Seems like video.js loses track of the video time index and starts reading the captions from the beginning. The second you change back to a direct stream video.js immediately picks up the correct time index. I can't seem to find anything on using captions in video.js with transcoded videos. I don't think there is anything we can change regarding our use of captions in video.js to fix this issue. I see this warning WARN[2022-04-02 20:33:20] Error while deregistering ffmpeg stream: exit status 1 being thrown in VS Code during our transcodes as well as this error in the browser console:

Uncaught (in promise) DOMException: The play() request was interrupted by a new load request.
setTimeout (async)
t.setTimeout @ vendor.665f3ad8.js:745
Fst @ vendor.665f3ad8.js:745
r.handleSrc_ @ vendor.665f3ad8.js:757
r.src @ vendor.665f3ad8.js:757
t.currentTime @ index.849f272b.js:1824
r.userSeek_ @ vendor.665f3ad8.js:745
r.handleMouseMove @ vendor.665f3ad8.js:745
a.handleMouseMove_ @ vendor.665f3ad8.js:745
Ua.n.dispatcher.n.dispatcher @ vendor.665f3ad8.js:745

I'm not sure if those have anything to do with it. I unfortunately don't know enough about the transcoding process atm.

I think we have a few different options to choose from for the subtitles detection.

  • Treat the subtitles as a new type. When scanning they will be added and associated with scenes. When cleaning they get removed if missing. This will improve the scanning efficiency but it is probably an overkill and will add a lot of extra code for maintenance
  • Remove all the subtitle code from the scan task. Only do the detection when the scene is loaded in the player. If that can be done we can check for subtitles when the scene starts to load. It will probably delay the loading a bit, i am not sure if it will be noticable since a few os.stats should be a lot faster than the backend reading the start of the video file or starting to transcode using ffmpeg
    EDIT
  • Another option would be to add the paths as an extra field to the scene. This will simplify the code a bit, it won't help much with the detection during scanning though as we will probably have to check every scene during scan.

In any case only a list of available subtitles and their label/lang should be sent over to videojs.
Finding a list of subtitles files more efficently is a little tricky.
For cases where not a lot of files are in the scene/video folder we could instead get the directory content once and then try to regex match subtitle filenames. That should be faster than os stating all the combos.
If a user though has a single or only a few folders for all his files we could end up with a listing of thousands of files which might not be faster than the os stating of the combos.
If the scan added all the subtitles in the db it would be faster but is it worth the extra code needed?

Regarding the issue with live transcoding i think we might need to check if the files are live transcoded or directly streamed. If they are live transcoded then before serving the sub we can use the subtitle library to add a delay (equal to the seek point) to the subtitles. Adding the delay is easy not sure how to do the first part though

@cj12312021
Copy link
Collaborator Author

  • For option 1, I think making a type for subtitles might be overkill.
  • For option 2, I think there is value to being able to identify what scenes are being captioned outside of the scene details page.
  • Option 3 seems like the best way to go. Since most users likely won’t be providing subtitles for their scenes it probably makes sense to have a scan toggle to enable/disable subtitle detection.

What do you consider a lot of files in a directory? I imagine some scene folders could contain up to 90 supporting files.

If we didn’t add the subtitle paths into the DB, would there be another way in ScenePlayer.tsx to identify what languages were provided without creating a Captioned Boolean for every language. I’m genuinely asking. Stash is my first adventure into Full-Stack development, so my knowledge of what is possible here is currently limited. Like I mentioned earlier, the check we do, such as if (scene.paths.caption_de) in that file, always returns true we could be adding blanks into video.js for languages that weren’t actually provided.

For offsetting the subtitles, I think we could use https://github.com/funkyremi/vtt-live-edit.

@cj12312021
Copy link
Collaborator Author

I could use an extra set of eyes on this. I'm having issues with the 'vtt-live-edit' library I've imported. It looks like the library was imported properly since the IDE nor the web console is throwing any errors. However, it doesn't look like the calls I'm making to the libraries' functions are not doing anything. It's not clear if I'm missing something or if the library just doesn't work. For example, I made a simple function call to change the subtitle font color to red but the font color was not affected at all.

@bnkai
Copy link
Collaborator

bnkai commented Apr 4, 2022

pasting from the channel

My ts/js knowledge is very limited:-(... From a quick look it seems that the library you added to the PR is meant for video elements and the native vtt support. Afaik videojs uses its own vtt.js library for most cases and doesnt switch to native unless for specific browsers ...
For example i found the below

Unfortunately, we don't have a good way for developers to programmatically style text tracks. The way that Vtt.js works (the lib we use to display captions) doesn't support the ::cue pseudo selectors yet and we don't have an external API for our text track settings: #4543. The caption settings should be available though if you're not using the native text tracks

The library you used counts on the ::cue to set the foreground so thats why you dont get the color change
If that doesnt work i wouldnt count on the offset to work
For example to change the color of the subs something like the below seems to work

          var settings = (player as any).textTrackSettings;
          settings.setValues({
            "color": "#F00",
          });
          settings.updateDisplay();

instead of your

 vtt.setFontColor("#FF0000");

With some googling i found (in the changelog from version 4.6.0 of videojs) this
Added a player option to offset the subtitles/captions timing (view)
From what i understand adding the track option mentioned in the link should offset the subtitles with videojs own vtt library.
Not sure how that can be done though.

@bnkai
Copy link
Collaborator

bnkai commented Apr 4, 2022

btw can you apply

diff --git a/internal/api/routes_scene.go b/internal/api/routes_scene.go
index 706da306..47914085 100644
--- a/internal/api/routes_scene.go
+++ b/internal/api/routes_scene.go
@@ -301,7 +301,7 @@ func (rs sceneRoutes) InteractiveHeatmap(w http.ResponseWriter, r *http.Request)
 
 func (rs sceneRoutes) Caption(w http.ResponseWriter, r *http.Request, lang string) {
        s := r.Context().Value(sceneKey).(*models.Scene)
-       
+
        caption := scene.GetCaptionPath(s.Path, lang, "vtt") // first try reading vtt
        sub, err := scene.ReadSubs(caption)
        if err != nil {
@@ -315,7 +315,7 @@ func (rs sceneRoutes) Caption(w http.ResponseWriter, r *http.Request, lang strin
                if err == nil {
                        w.Header().Set("Content-Type", "text/vtt")
                        w.Header().Add("Cache-Control", "no-cache")
-                       b.WriteTo(w)
+                       _, _ = b.WriteTo(w)
                }
                return
        }
diff --git a/pkg/scene/caption.go b/pkg/scene/caption.go
index d971e8fd..ca90c1c3 100644
--- a/pkg/scene/caption.go
+++ b/pkg/scene/caption.go
@@ -18,4 +18,4 @@ func GetCaptionPath(path, lang, suffix string) string {
 // ReadSubs reads a subtitles file
 func ReadSubs(path string) (*astisub.Subtitles, error) {
        return astisub.OpenFile(path)
-}
\ No newline at end of file
+}

?
I forgot to validate the subtile code before pasting

EDIT

What do you consider a lot of files in a directory? I imagine some scene folders could contain up to 90 supporting files.

My worst case scenario would be a user having a flat directory structure for his collection eg all scenes (or say a big percentage) in a single directory! In that case the readdir and regex method i mentioned above would probably be inneficient for detecting subs.
I guess combining option 3 with a detect subtitles option for the scan task would be the best we can do? That way users that dont have subtitles wont mind. Users that have a lot of subtitles would probably enable that by default and users with a few could do a selective scan from time to time?

@cj12312021
Copy link
Collaborator Author

According to videojs/video.js#4212 it sounds like your find from version 4.6.0 was later removed. I can try your suggested alternative from discord:

If needed the go library that converts the subs supports offseting. For non direct streamed scenes unload all subtitle tracks and reload a new delayed version of them? That would be a bit more complex but it could be done if nothing else works

btw can you apply

No problem. I can go ahead and apply the change you posted.

That way users that dont have subtitles wont mind. Users that have a lot of subtitles would probably enable that by default and users with a few could do a selective scan from time to time?

Yes, that was what I was thinking. Users who don't use subtitles would always have that detection disabled. Whereas users that do would turn that option on most likely during selective scans if only a few scenes provide them.

@cj12312021
Copy link
Collaborator Author

I just tried viewing the captioned scene on my phone and it looks like the VTT library appears to be working there a changed the subtitle text to red. Still, a solution that works everywhere would be ideal.

@cj12312021
Copy link
Collaborator Author

What scenario is the filter breaking for you? From my testing, it works the same as how the Studio Alias filter works, which was the intent. If I wanted to find scenes with English filters, I would provide a Captions is "en" or Captions included "en" query to get all the scenes that provide an English filter. IMO providing more querying capabilities is preferable to having less.

@bnkai
Copy link
Collaborator

bnkai commented Apr 22, 2022

What scenario is the filter breaking for you? From my testing, it works the same as how the Studio Alias filter works, which was the intent. If I wanted to find scenes with English filters, I would provide a Captions is "en" or Captions included "en" query to get all the scenes that provide an English filter. IMO providing more querying capabilities is preferable to having less.

I think it was the is null / not null
I was mainly trying to see which scenes had subs so i can test the playback

@cj12312021
Copy link
Collaborator Author

Ah, okay, I thought that might be the case. Couldn't that be solved by making captions nullable in the database?

@bnkai
Copy link
Collaborator

bnkai commented Apr 22, 2022

is not, excludes, not matches regex are also broken or at least dont work as expected

@cj12312021
Copy link
Collaborator Author

Ah okay, I'll go ahead and limit the filter capability as you suggested so we can get this change merged sooner.

@bnkai
Copy link
Collaborator

bnkai commented Apr 22, 2022

Ah okay, I'll go ahead and limit the filter capability as you suggested so we can get this change merged sooner.

You could try the not nullable you mention. I think the issue is when you join on the language_code column (that is not nullable)

Edit:
The studio alias is also not nullable so not sure where exactly the issue is

@cj12312021
Copy link
Collaborator Author

The studio alias is also not nullable so not sure where exactly the issue is

Okay, when I get some time, I'll look at the filter and see if it can be fixed with minimal changes. If not, I'll just dumb down the filter.

@cj12312021
Copy link
Collaborator Author

cj12312021 commented Apr 23, 2022

I did some testing with the studio alias filter and saw that it also does not work correctly. After looking at the table again, I don't think it makes sense to make language_code nullable in the table. So, I went ahead and updated the caption filter to be a binary true or false.

@@ -174,6 +174,8 @@ input SceneFilterType {
interactive: Boolean
"""Filter by InteractiveSpeed"""
interactive_speed: IntCriterionInput
"""Filter to only include scenes which have captions. `true` or `false`"""
captioned: String
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the values are only true or false, then it should be a boolean. I think however, that the filtering should probably be a StringCriterionInput or something so that one can filter on caption language etc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a StringCriterionInput initially similar to the studio alias filter. However, we noticed that approach inherited some issues the studio alias filter has. The current implementation was copied from the has markers filter, which also uses strings to define a Boolean, but I can look into updating this to use a Boolean.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having had a quick look, I think the StringCriterionInput is probably better. I'd prefer to use that and fix the issues arising out of that than have to change the graphql schema later when we figure it out.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I'll revert the change to continue using the StringCriterionInput.

CREATE TABLE `scene_captions` (
`scene_id` integer,
`language_code` varchar(255) NOT NULL,
`path` varchar(255) NOT NULL,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is path really necessary? Can't we infer the path from the language code and caption type?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the path could be reconstructed with the language_code and caption type.

@@ -85,6 +85,8 @@ input ScanMetadataInput {
scanGeneratePhashes: Boolean
"""Generate image thumbnails during scan"""
scanGenerateThumbnails: Boolean
"""Detect scene captions"""
scanDetectCaptions: Boolean
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the benefit of making this optional?

Copy link
Collaborator Author

@cj12312021 cj12312021 May 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this was this intentional. I can update it to be required.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, what I mean is - is there any reason not to have the scan task always detect captions?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The detect captions toggle was introduced when the code was still looking for captions by trying various name combinations to see if we could get a match. That approach was expensive, which led to this toggle being added so users without captions wouldn't be impacted by that search when it didn't benefit them. The toggle could probably be removed now that the search code is more intelligent.

@cj12312021
Copy link
Collaborator Author

Perfect, your recent push addresses the regression. Also, I love what you did with the caption filter.

@WithoutPants WithoutPants merged commit c1a096a into stashapp:develop May 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Pull requests that add a new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] Subtitle Support
3 participants