[youtube] still download translated_subs with --extractor-args youtube:skip=translated_subs
#4090
Comments
They are auto-generated subs, not auto-translated ones. See the output of |
Duplicate of #3875 |
I checked those subs (the ja one), and it is auto-translated too. |
YouTube auto-generates subtitles in every language. If the video has normal subtitles, each of them can also be translated to all languages. The |
I checked the web player, and only the |
Maybe those subs are simply useless and not meant for real users? |
Because auto-translated subs are unreadable in most cases, and the only useful subs are
So we may need a way to download only those. |
This is a great explanation. Thank you. For a YouTube video that is in English only and has a single English subtitle track, what is the difference between, for example, the Zulu automatic caption (zu) and the Zulu-from-English automatic caption (zu-en)? |
My understanding is that zu is speech-to-text from the audio and zu-en is translated from the English subs |
Here is why I would doubt that: Example: https://youtu.be/SoEkCshMcOY is a short clip of Donald Trump talking in English.
content of
<p begin="00:00:00.030" end="00:00:04.440" style="s2">much of it going to farmers and</p>
<p begin="00:00:01.860" end="00:00:06.629" style="s2">manufacturers so I'll let you know I</p>
<p begin="00:00:04.440" end="00:00:11.519" style="s2">mean I hope they got honor the deal</p>
<p begin="00:00:06.629" end="00:00:12.990" style="s2">what are you working for China I work</p>
<p begin="00:00:11.519" end="00:00:16.020" style="s2">with China are you with in this paper</p>
<p begin="00:00:12.990" end="00:00:19.439" style="s2">who are you with Bennett TV who owns</p>
<p begin="00:00:16.020" end="00:00:23.400" style="s2">that China aims it all by China or is it</p>
<p begin="00:00:19.439" end="00:00:26.460" style="s2">owned by the state no it's not ok good</p>
<p begin="00:00:23.400" end="00:00:28.080" style="s2">ok look I'll let you know I'll give you</p>
<p begin="00:00:26.460" end="00:00:30.000" style="s2">a good answer to that in a few months I</p>
<p begin="00:00:28.080" end="00:00:32.130" style="s2">wanted to see what they do because it's</p>
<p begin="00:00:30.000" end="00:00:34.920" style="s2">time for them to help us ok it's time</p>
<p begin="00:00:32.130" end="00:00:37.380" style="s2">right now for China to help us and</p>
<p begin="00:00:34.920" end="00:00:39.980" style="s2">hopefully they do and if they don't</p>
<p begin="00:00:37.380" end="00:00:39.980" style="s2">that's okay too</p>
content of
<p begin="00:00:00.030" end="00:00:04.440" style="s2">okuningi kuya kubalimi nabakhiqizi</p>
<p begin="00:00:01.860" end="00:00:06.629" style="s2">ngakho ngizonazisa ngiqonde ukuthi ngithemba ukuthi</p>
<p begin="00:00:04.440" end="00:00:11.519" style="s2">bathole ukuhlonishwa isivumelwano</p>
<p begin="00:00:06.629" end="00:00:12.990" style="s2">usebenzela iShayina ngisebenza</p>
<p begin="00:00:11.519" end="00:00:16.020" style="s2">neChina ukhona nobani kuleli phepha</p>
<p begin="00:00:12.990" end="00:00:19.439" style="s2">wena noBennett TV ongumnikazi walelo</p>
<p begin="00:00:16.020" end="00:00:23.400" style="s2">China ihlose iShayina yonke noma</p>
<p begin="00:00:19.439" end="00:00:26.460" style="s2">iphethwe umbuso cha akulungile kulungile bheka ngizonazisa</p>
<p begin="00:00:23.400" end="00:00:28.080" style="s2">ngizoninika</p>
<p begin="00:00:26.460" end="00:00:30.000" style="s2">impendulo eqondile kulokho ezinyangeni ezimbalwa</p>
<p begin="00:00:28.080" end="00:00:32.130" style="s2">bengifuna ukubona ukuthi benzani ngoba</p>
<p begin="00:00:30.000" end="00:00:34.920" style="s2">sekuyisikhathi sokuthi ukuthi basisize kulungile sekuyisikhathi</p>
<p begin="00:00:32.130" end="00:00:37.380" style="s2">manje sokuthi iChina isisize futhi</p>
<p begin="00:00:34.920" end="00:00:39.980" style="s2">ngethemba ukuthi izosisiza futhi uma ingakwenzi</p>
<p begin="00:00:37.380" end="00:00:39.980" style="s2">lokho kulungile futhi</p>
Trump doesn't speak Zulu, so this must be auto-translated rather than speech-to-text. |
Obviously it is being translated. But from audio instead of from subs. Internally, they may be generating one auto sub first and then translating to other languages. Or they may have some system to do it directly from speech. I have no way of knowing. |
So if I get that right, you think it's the other way round:
|
Yes. Due to the way subs are extracted, I know for a fact that |
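For anyone who wants to separate the two kinds of keys discussed above (plain codes such as zu versus suffixed codes such as zu-en), here is a minimal sketch. The function name `group_caption_keys` and the `source_langs` parameter are hypothetical, and the sketch assumes the suffixed keys always have the form `<target>-<source>`. It only groups keys by shape; it does not settle which group comes from the audio and which from existing subtitles.

```python
from typing import Dict, Iterable, List


def group_caption_keys(
    keys: Iterable[str], source_langs: Iterable[str]
) -> Dict[str, List[str]]:
    """Split caption language keys into plain ones and '<target>-<source>' ones.

    Assumption (not confirmed by the thread): a key counts as suffixed only
    when its last '-'-separated part is one of the given source languages.
    """
    sources = set(source_langs)
    groups: Dict[str, List[str]] = {"plain": [], "from_source": []}
    for key in keys:
        # rpartition keeps codes like 'zh-Hans' intact when 'Hans' is not a source.
        target, sep, suffix = key.rpartition("-")
        if sep and target and suffix in sources:
            groups["from_source"].append(key)
        else:
            groups["plain"].append(key)
    return groups
```

Passing the video's subtitle languages as `source_langs` (here that would be `["en"]`) keeps region or script codes such as zh-Hans in the plain group.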
Checklist
Region
No response
Description
but other links no issue with
--extractor-args youtube:skip=translated_subs
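To check what the flag above actually leaves in the metadata, a rough sketch using yt-dlp's Python embedding follows. The helper names `build_opts` and `list_caption_keys` are made up for illustration; the options dict mirrors the command-line flag, and whether unwanted entries still show up under `automatic_captions` is exactly what this issue is about, so no particular output is implied.

```python
from typing import Any, Dict, List


def build_opts() -> Dict[str, Any]:
    """Options mirroring `--extractor-args youtube:skip=translated_subs`."""
    return {
        "skip_download": True,  # metadata only, no media download
        "extractor_args": {"youtube": {"skip": ["translated_subs"]}},
    }


def list_caption_keys(url: str) -> Dict[str, List[str]]:
    """Return the language keys yt-dlp reports for a video (needs network)."""
    import yt_dlp  # imported lazily so build_opts() works without the package

    with yt_dlp.YoutubeDL(build_opts()) as ydl:
        info = ydl.extract_info(url, download=False)
    return {
        "subtitles": sorted(info.get("subtitles") or {}),
        "automatic_captions": sorted(info.get("automatic_captions") or {}),
    }
```

Calling `list_caption_keys` with and without the `skip` entry in the options and comparing the two results shows which keys the flag removes for a given video.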
Verbose log