Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encoder improvements #127

Merged
merged 14 commits into from
Jun 22, 2024
Merged

Conversation

nihil-admirari
Copy link
Contributor

Images:

  1. The code no longer passes the options encoders doesn't expect to see: AVIF gets only AVIF-specific parameters, WEBP – only WEBP-specific ones etc.

  2. Added support for animated AVIFs.

  3. AVIF encoder's cpu-used is set to 6 compared to the default of 1. On my machine, values below 6 become increasing slower without much of an improvement when it comes to file size.

    FFmpeg CRF 15 (high quality) 937×400 still images: 3 MB per 10k images in exchange for 1.5 to 3 times slower encoding.

    cpu-used file size in bytes encoding time
    0 13776 18.535s
    1 13870 11.344s
    2 13800 11.296s
    3 13725 6.772s
    4 14046 5.515s
    5 14017 4.839s
    6 14177 4.433s
    7 14177 4.495s
    8 14177 4.479s

    FFmpeg CRF 32 (medium quality) 937×400 2 seconds animations: 1 MB per 100 images is more noticable, but the encoding time becomes obscene very very quickly.

    cpu-used file size in bytes encoding time
    0 84534 10m18.072s
    1 85875 3m30.353s
    2 89661 1m18.690s
    3 90883 30.647s
    4 91230 21.556s
    5 90856 12.933s
    6 92153 10.921s
    7 92153 10.892s
    8 92153 10.882s
  4. Still image scaling now uses sinc, which is the best downscaling algorithm: https://stackoverflow.com/a/6171860. Animated images still use Lanczos.

  5. Image scaling now uses accurate rounding.

  6. min()s in scaling filter prevent image upscaling.

  7. JPEG's quality is now correctly mapped into the range 2–31.

  8. Image aspect ratio preserving scaling now uses -1 instead of -2. According to FFmpeg docs, -n scale is here to ‘make sure that the calculated dimension is divisible by n.’ I don't quite understand while divisibility by 2 was needed in the first place.

Audio:

  1. Opus phase inversion is disabled, which improves mono quality: https://ffmpeg.org/ffmpeg-codecs.html#Option-Mapping.
  2. The point of Opus is Ogg is lost on me, considering that Opus has its own extension. Ogg files containing non-Vorbis data are confusing. The ability to choose an Opus container was added to fix this, though Ogg is still the default.
  3. Implementation of Opus in CAF was fixed. Previously it produced CAF-encoded files with Ogg extension.
  4. Support for M4A and Webm containers have been added as a CAF replacement for newer Apple devices.
  5. MP3-encoder now uses ABR, which is recommended for low bitrates: https://wiki.hydrogenaud.io/index.php?title=LAME#Recommended_encoder_settings.
  6. loudnorm now has a proper two-pass implementation. dual_mono is now enabled to account for mono downmixing: https://ffmpeg.org/ffmpeg-filters.html#loudnorm.

Miscellanea:

  1. append_user_audio_args no longer moves the last argument on every insert.
  2. Exit codes of FFmpeg and MPV are now checked before reporting a successful conversion. Checking for file presence is not enough: sometimes failed conversions result in zero-length files.
  3. Common MPV calling code was extracted into a separate function.
  4. MPV's libaom-av1 support is now checked.

@tatsumoto-ren
Copy link
Member

Hello. Thanks for the patch.

@tatsumoto-ren
Copy link
Member

AVIF encoder's cpu-used is set to 6 compared to the default of 1. On my machine, values below 6 become increasing slower without much of an improvement when it comes to file size.

If a computer has less cores than 6, will it automatically choose the right number? Or would it be better to set cpu-used to min(6, nproc)?

@tatsumoto-ren
Copy link
Member

tatsumoto-ren commented Jun 19, 2024

Image aspect ratio preserving scaling now uses -1 instead of -2. According to FFmpeg docs, -n scale is here to ‘make sure that the calculated dimension is divisible by n.’ I don't quite understand while divisibility by 2 was needed in the first place.

Setting the value to -2 is required to ensure that encoding doesn't fail. If set to -1, there might be problems. Encoding will fail with an error like "height/width not divisible by 2". This is probably not something that happens often though.

Comment on lines 290 to 297
# Perform loudness normalization.
# Parameter explanation can be found e.g. at:
# https://auphonic.com/blog/2013/01/07/loudness-targets-mobile-audio-podcasts-radio-tv/
# https://auphonic.com/blog/2019/08/19/dynamic-range-processing/
loudnorm=yes
loudnorm_target=-16
loudnorm_range=11
loudnorm_peak=-1.5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the new config parameters, the users will have to update their config files. For example, I have previously set loudnorm parameters using ffmpeg_audio_args, and now I get audio files with different loudness.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disabled two-pass loudnorm by default and brought back the custom filter. Added scary warnings that filter must be removed before enabling two-pass loudnorm.

Loudness will be the same if target, range and peak are set to the same parameters that previously were in ffmpeg_audio_args.

cfg_mgr.lua Outdated
Comment on lines 47 to 48
-- Animated webp images can only have .webp extension.
-- The user has no choice on this.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lost the comment here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this comment meant that there is only one animation format, compared to three formats for still pictures. Since I've added support for another animation format, I've removed the comment.

If the comment was actually supposed to state that WebP animations come with and only with .webp extensions, well, it's kinda obvious.

encoder.lua Outdated
Comment on lines 70 to 76
local function rescale_quality(quality, min_q, max_q)
local scaled = min_q + (max_q - min_q) * quality / 100
-- Round to the nearest integer that's better in quality.
if min_q > max_q then
return math.floor(scaled)
end
return math.ceil(scaled)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function was previously named quality_to_crf but with the new name it's hard to understand what it does.

Copy link
Contributor Author

@nihil-admirari nihil-admirari Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to fit_quality_percentage_to_range. JPEG function was also renamed since qscale is not CRF: https://stackoverflow.com/a/40673522. AVIF and JPEF constants were moved inside their respective functions.

encoder.lua Outdated
'-vf', string.format("scale='min(%d,iw)':'min(%d,ih)':flags=sinc+accurate_rnd",
self.config.snapshot_width, self.config.snapshot_height),
'-frames:v', '1',
table.unpack(encoder_args)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using table.unpack() crashes mpvacious.

attempt to call field 'unpack' (a nil value)

Use h.unpack() instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better be commented somewhere.

encoder.lua Outdated Show resolved Hide resolved
encoder.lua Outdated Show resolved Hide resolved
@nihil-admirari
Copy link
Contributor Author

If a computer has less cores than 6, will it automatically choose the right number? Or would it be better to set cpu-used to min(6, nproc)?

cpu-used has nothing to do with multithreading. It's a computational intensity parameter, kinda like compression_level for WebP. See https://ffmpeg.org/ffmpeg-codecs.html#Options-28:

cpu-used
Set the quality/encoding speed tradeoff. Valid range is from 0 to 8, higher numbers indicating greater speed and lower quality. The default value is 1, which will be slow and high quality.

end

mpv.make_audio_args = function(source_path, output_path, start_timestamp, end_timestamp)
mpv.make_audio_args = function(source_path, output_path,
start_timestamp, end_timestamp, args_consumer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's possible to unwind these callbacks (e.g. args_consumer) to make the code more readable. Especially considering that lua doesn't have static typing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two possible ways to do that:

  1. Replace mp.command_native_async with mp.command_native. Can block the execution for considerable time (see the timings in the head), but may not be a problem if the rest of MPV remains responsive. Have no idea how MPV handles a single blocked Lua script.
  2. Rewrite using Lua coroutines. Never worked with these.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant is you could make the args table first, and then execute the args without passing the function that executes them into the function that makes the args.

@tatsumoto-ren
Copy link
Member

cpu-used has nothing to do with multithreading. It's a computational intensity parameter, kinda like compression_level for WebP.

Alright. The name is confusing apparently.

@nihil-admirari
Copy link
Contributor Author

  1. Fixed user supplied filters not being appended after loudnorm.
  2. Refactored the use of scaling related functions.

@tatsumoto-ren tatsumoto-ren merged commit aa86597 into Ajatt-Tools:master Jun 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants