0
0
mirror of https://github.com/mpv-player/mpv.git synced 2024-09-20 12:02:23 +02:00
Commit Graph

380 Commits

Author SHA1 Message Date
Kacper Michajłow
0a4b139ddf osdep: add MP_FALLTHROUGH 2023-02-02 14:23:02 +00:00
sfan5
9b59d39a3a vo_gpu: implement VO_DR_FLAG_HOST_CACHED
For OpenGL, this is based on simply comparing GL_VENDOR strings against
a list of allowed vendors.

Co-authored-by: Nicolas F. <ovdev@fratti.ch>
Co-authored-by: Niklas Haas <git@haasn.dev>
2023-01-23 14:13:34 +01:00
Niklas Haas
f8c17f55f9 vo: add int flags to the get_image signature
This is a huge disgusting mess to thread through everywhere. Maybe I'm
stupid for attempting to solve the problem this way.
2023-01-23 14:13:34 +01:00
sfan5
1201d59f0b various: replace abort() with MP_ASSERT_UNREACHABLE() where appropriate
In debug mode the macro causes an assertion failure.
In release mode it works differently and tells the compiler that it can
assume the codepath will never execute. For this reason I was conversative
in replacing it, e.g. in mpv-internal code that exhausts all valid values
of an enum or when a condition is clear from directly preceding code.
2023-01-12 22:02:07 +01:00
sfan5
7b03cd367d various: replace if + abort() with MP_HANDLE_OOM()
MP_HANDLE_OOM also aborts but calls assert() first, which
will result in an useful message if compiled in debug mode.
2023-01-12 22:02:07 +01:00
Niklas Haas
e1a04cd8ac lcms: fix validate_3dlut_size_opt
Not only was this function needlessly convoluted, it was also broken -
since it returned a bool value where an error code was expected.
2022-11-21 17:39:37 +01:00
Niklas Haas
b9b3342369 lcms: always parse lcms2-related options
Currently, the lcms-related options are only defined when lcms2 is
enabled at build time. However, this causes issues (e.g. segfault) for
vo_gpu_next, which relies on the presence of these options (to forward
them to libplacebo).

(That libplacebo internally depends on lcms2 as well is an
implementation detail - compiling mpv *without* lcms against libplacebo
*with* lcms should be possible in principle)

Fixes: #10891
Closes: #10856
2022-11-21 17:39:37 +01:00
Philip Langdale
25fa52d237 vo: hwdec: remove legacy_names
These were introduced for configuration compatibility in 0.35 but we
don't want to retain them past that point.
2022-11-15 16:33:12 +01:00
Niklas Haas
33136c276c vo_gpu_next: add tunable shader parameters
This is a very simple but easy way of doing it. Ideally, it would be
nice if we could also add some sort of introspection about shader
parameters at runtime, ideally exposing the entire list of parameters as
a custom property dict. But that is a lot of effort for dubious gain.

It's worth noting that, as currently implemented, re-setting
`glsl-shader-opts` to a new value doesn't reset back previously mutated
values to their defaults.
2022-11-11 13:58:35 +01:00
sfan5
ac3966184b vo_gpu: mark --gamma-factor and --gamma-auto with deprecation warnings
This was forgotten in commit 2207236aaa.
2022-11-10 16:19:37 +01:00
Dudemanguy
2c53fb6a2b gpu/context: properly guard wldmabuf context
This should only be added if the build has dmabuf-wayland enabled. This
fixes #10828.
2022-11-03 17:24:09 -05:00
Dudemanguy
adb556bf15 vo_dmabuf_wayland: use special ra_ctx_create_by_name
vo_dmabuf_wayland has its own ra and context so it can handle all the
different hwdec correctly. Unfortunately, this API was pretty clearly
designed with vo_gpu/vo_gpu_next in mind and has a lot of concepts that
don't make sense for vo_dmabuf_wayland. We still want to bolt on a
ra_ctx, but instead let's rework the ra_ctx logic a little bit. First,
this introduces a hidden bool within the ra_ctx_fns that is used to hide
the wldmabuf context from users as an option (unlike the other usual
contexts). We also want to make sure that hidden contexts wouldn't be
autoprobed in the usual ra_ctx_create, so we be sure to skip those in
that function. Additionally, let's create a new ra_ctx_create_by_name
function which does exactly what says. It specifically selects a context
based on a passed string. This function has no probing or option logic
is simplified just for what vo_dmabuf_wayland needs. The api/context
validations functions are modified just a little bit to make sure hidden
contexts are skipped and the documentation is updated to remove this
entries. Fixes #10793.
2022-10-28 02:36:46 +00:00
Aaron Boxer
7358b9d371 vo_dmabuf_wayland: wayland VO displaying dmabuf buffers
Wayland VO that can display images from either vaapi or drm hwdec

The PR adds the following changes:

1. a context_wldmabuf context with no gl dependencies
2. no-op ra_wldmabuf and dmabuf_interop_wldmabuf objects
   no-op because there is no need to map/unmap the drmprime buffer,
    and there is no need to manage any textures.

Tested on both x86_64 and rk3399 AArch64
2022-10-26 18:41:47 +00:00
Philip Langdale
395b37b005 vo_gpu/hwdec: add NULL check for legacy_name 2022-10-22 12:56:18 -07:00
Philip Langdale
064059e6c3 vo_gpu/hwdec: rename and introduce legacy names for some interops
We've had some annoying names for interops, which we can't simply
rename because that would break config files and command lines. So we
need to put a little more effort in and add a concept of legacy names
that allow us to continue loading them, but with a warning.

The two I'm renaming here are:
* vaapi-egl -> vaapi (vaapi works with Vulkan too)
* drmprime-drm -> drmprime-overlay (actually describes what it does)
* cuda-nvdec -> cuda (cuda interop is not nvdec specific)
2022-10-11 10:07:48 -07:00
sfan5
5463d3eeff vo_gpu: hwdec: add Android hwdec utilizing AImageReader 2022-10-02 14:12:26 +02:00
Philip Langdale
989d873d6e filters: lavfi: allow hwdec_interop selection for filters
Today, lavfi filters are provided a hw_device from the first
hwdec_interop that was loaded, regardless of whether it's the right one
or not. In most situations where a hardware based filter is used, we
need more control over the device.

In this change, a `hwdec_interop` option is added to the lavfi wrapper
filter configuration and this is used to pick the correct hw_device to
inject into the filter or graph (in the case of a graph, all filters
get the same device).

Note that this requires the use of the explicit lavfi syntax to allow
for the extra configuration.

eg:

```
mpv --vf=hwupload
```

becomes

```
mpv --vf=lavfi=[hwupload]:hwdec_interop=cuda-nvdec
```

or

```
mpv --vf=lavfi-bridge=[hwupload]:hwdec_interop=cuda-nvdec
```
2022-09-21 09:39:34 -07:00
Philip Langdale
5629ed81ee filters: support loading new hwdec_interops from filters
If we want to be able to handle conversion between hw formats in filter
chains, then we need to be able to load hwdec_interops from filters, as
the VO is only ever going to initialise one interop, based on its
configuration. That means that in almost all situations, only one of
the required interops will be loaded at the time the filter is
initialised.

The existing code has some assumptions that new hwdec_interops will not
be loaded after the vo has picked one to use. This change fixes two
instances:

* Refusing to load a new hwdec_interop if there is at least one
  loaded already.
* Not recalculating the set of formats known to the autoconvert
  filter when a new output format shows up. This leads to autoconvert
  not knowing that a new format is supported when the hwdec interop is
  lazily loaded.
2022-09-21 09:39:34 -07:00
Philip Langdale
4081f45501 gpu/hwdec: reorder drmprime below drmprime_drm
The older overlay based drmprime hwdec should be preferred to the new
texture mapping one. This is for a few reasons:

1. In any situation where both hwdecs work, it's probably right to use
   the more mature one by default, for now.
2. It seems like the overlay path primarily works on older SoCs
   where the texture path is less performant, and in at least one
   tested case is visually buggy, so you definitely want it to be
   tried first.
3. In situations where the old hwdec doesn't work, it will fall through
   to the new one.
2022-08-09 22:19:45 -07:00
Philip Langdale
25fa1b0b45 hwdec/drmprime: add drmprime hwdec-interop
In the confusing landscape of hardware video decoding APIs, we have had
a long standing support gap for the v4l2 based APIs implemented for the
various SoCs from Rockship, Amlogic, Allwinner, etc. While VAAPI is the
defacto default for desktop GPUs, the developers who work on these SoCs
(who are not the vendors!) have preferred to implement kernel APIs
rather than maintain a userspace driver as VAAPI would require.

While there are two v4l2 APIs (m2m and requests), and multiple forks of
ffmpeg where support for those APIs languishes without reaching
upstream, we can at least say that these APIs export frames as DRMPrime
dmabufs, and that they use the ffmpeg drm hwcontext.

With those two constants, it is possible for us to write a
hwdec-interop without worrying about the mess underneath - for the most
part.

Accordingly, this change implements a hwdec-interop for any decoder
that produces frames as DRMPrime dmabufs. The bulk of the heavy
lifting is done by the dmabuf interop code we already had from
supporting vaapi, and which I refactored for reusability in a previous
set of changes.

When we combine that with the fact that we can't probe for supported
formats, the new code in this change is pretty simple.

This change also includes the hwcontext_fns that are required for us to
be able to configure the hwcontext used by `hwdec=drm-copy`. This is
technically unrelated, but it seemed a good time to fill this gap.

From a testing perspective, I have directly tested on a RockPRO64,
while others have tested with different flavours of Rockchip and on
Amlogic, providing m2m coverage.

I have some other SoCs that I need to spin up to test with, but I don't
expect big surprises, and when we inevitably need to account for new
special cases down the line, we can do so - we won't be able to support
every possible configuration blindly.
2022-08-09 10:19:18 -07:00
Niklas Haas
81c5ed5b13 vo_gpu: fix 3DLUT precision
Using cmsFLAGS_HIGHRESPRECALC results in Little-CMS generating an
internal 49x49x49 3DLUT, from which it then samples our own 3DLUT. This
is completely pointless and not only destroys the accuracy of the 3DLUT,
but also results in no additional gain from increasing the 3DLUT
precision further.

The correct flag for us to be using is cmsFLAGS_NOOPTIMIZE, which
suppresses this internal 3DLUT generation and gives us the full
precision. We can also specify cmsFLAGS_NOCACHE, which is negligible but
in theory prevents Little-CMS from unnecessary pixel equality tests.
2022-07-15 16:34:11 +02:00
Guido Cella
fe9e074752 various: remove trailing whitespace 2022-05-14 14:51:34 +00:00
Cœur
bb5b4b1ba6 various: fix typos 2022-04-25 09:07:18 -04:00
Niklas Haas
26a3a06861 vo_gpu_next: switch to unpooled hwdec mapping
This makes use of the new frame acquire/release callbacks to hold on to
hwdec images only as long as necessary. This should greatly improve the
smoothness/efficiency of hwdec interop, by not holding on to them for
longer than needed.

This also avoids the need to pool hwdec mappers altogether.

Should fix #10067 as well, since frames are now only mapped when we
actually use them.
2022-04-11 15:43:51 +02:00
Philip Langdale
fcc81cd940 vo_gpu[_next]: hwdec: fix logging regression when probing
When I introduced the concept of lazy loading of hwdecs by img format,
I did not propagate the probing flag correctly, leading to the new
normal loading path not runnng with probing set, meaning that any
errors would show up, creating unnecessary noise.

This change fixes this regression.
2022-03-21 09:53:37 -07:00
Niklas Haas
1c49d5735d hwdec: fix out-of-date preprocessor variable name
This was renamed to HAVE_VAAPI_LIBPLACEBO.
2022-03-04 14:33:26 +01:00
Niklas Haas
d4fc44e711 vo_gpu: move hwdec loading code to common helper
So I can reuse it in vo_gpu_next without having to reinvent the wheel.
In theory, a lot of the stuff could be made more private inside the
hwdec code itself, but for the time being I don't care about refactoring
this code, merely sharing it.
2022-03-03 13:06:05 +01:00
Niklas Haas
bb434a60ed hwdec: release images as soon as possible after mapping
We don't need to hold on to buffers longer than necessary. Doesn't
matter for vo_gpu but greatly matters for vo_gpu_next, since it persists
hwdec mapped textures for longer periods.

Unfortunately, only provides benefits for hwdecs which do explicit
copies in their decode path, which currently just means cuda and
d3d11va.
2022-03-03 13:06:05 +01:00
Philip Langdale
5186651f30 vo_gpu: hwdec: load hwdec interops on-demand by default
Historically, we have treated hwdec interop loading as a completely
separate step from loading the hwdecs themselves. Some hwdecs need an
interop, and some don't, and users generally configure the exact
hwdec they want, so interops that aren't relevant for that hwdec
shouldn't be loaded to save time and avoid warning/error spam.

The basic approach here is to recognise that interops are tied to
hwdecs by imgfmt. The hwdec outputs some format, and an interop is
needed to get that format to the vo without read back.

So, when we try to load an hwdec, instead of just blindly loading all
interops as we do today, let's pass the imgfmt in and only load
interops that work for that format. If more than one interop is
available for the format, the existing logic (whatever it is) will
continue to be used to pick one.

We also have one callsite in filters where we seem to pre-emptively
load all the interops. It's probably possible to trace down a specific
format but for now I'm just letting it keep loading all of them; it's
no worse than before.

You may notice there is no documentation update - and that's because
the current docs say that when the interop mode is `auto`, the interop
is loaded on demand. So reality now reflects the docs. How nice.
2022-02-17 20:02:32 -08:00
James Ross-Gowan
4629fe577f vo_gpu: d3d11_helpers: don't create UNORDERED_ACCESS backbuffers in Win7
We're getting bug reports that the recent change to add extra usage
flags to swapchain buffers (for gpu-next) breaks mpv on some Windows 7
systems, and it seems like this is only happening with flip-model
swapchains.

Creating swapchains with DXGI_USAGE_UNORDERED_ACCESS should be valid. At
least, it's not specifically disallowed, unlike some other flags[1]. So,
just disable it for flip-model swapchains in Windows 7, rather than
disabling it everywhere.

[1]: https://docs.microsoft.com/en-us/windows/win32/direct3ddxgi/dxgi-usage
2022-02-09 17:20:07 +02:00
Niklas Haas
1ba0547bfb vo_gpu: add HOOKED_gather
Can be used conditionally (via #ifdef HOOKED_gather) to use
textureGather in custom shaders.
2022-01-15 10:27:21 +01:00
Niklas Haas
9e2c0b8baa sub: rename SUBBITMAP_RGBA to SUBBITMAP_BGRA
This was a misnomer, the actual channel order is IMGFMT_BGRA (as the
comment explicitly point out). Rename the enum for consistency.
2022-01-11 23:45:08 +02:00
Niklas Haas
d09c73c7b2 vo_gpu: add --tone-mapping-mode
This merges the old desaturation control options into a single
enumeration, with the goal of both simplifying how these options work
and also making this list more extensible (including, notably, new
options only supported by vo_gpu_next).

For the hybrid option, I decided to port the (slightly tweaked) values
from libplacebo's pre-refactor defaults, rather than the old values we
had in mpv, to more visually match the look of the vo_gpu_next hybrid.
2022-01-07 06:28:14 +01:00
Niklas Haas
f3fccfc395 vo_gpu: add --gamut-mapping-mode
Merge --gamut-clipping and --gamut-warning into a single option,
--gamut-mapping-mode, better corresponding to the new vo_gpu_next APIs
and allowing us to easily extend this option as new modes are added in
the future.
2022-01-07 06:28:14 +01:00
Niklas Haas
a9cb2e2821 vo_gpu_next: update for new tone mapping options
This was significantly refactored upstream. Switch to new APIs and add
new tone mapping curves and options.

cf. https://code.videolan.org/videolan/libplacebo/-/merge_requests/212
2022-01-07 06:28:14 +01:00
sfan5
57bc5ba6d6 vo_gpu: move image2D precision qualifier to point of use
image2D is only defined from GLSL ES 3.1 onwards, so those statements
broke GLES 2.0. Move the qualifier to a place that is only reached with
the right version requirements.
fixes commit 584ab29c88
2022-01-02 14:01:33 +01:00
Philip Langdale
fd63bf398a vo_gpu: stop hard-coding max compute group threads
We've been assuming that maximum number of compute group threads is
never less than the 1024 defined by the desktop GL spec. Given that we
haven't had working compute shaders for GLES and I guess the Vulkan
spec defines at least as high a value, we've gotten away with it so
far.

But we should really look the value up and respect it.
2021-12-19 01:51:54 +01:00
sfan5
72915e8b76 {player,video}: remove references to obsolete opengl-cb API 2021-12-15 12:29:10 +01:00
Philip Langdale
584ab29c88 vo_gpu: opengl: some fixes to make compute shaders work with GLES
It's supposed to work with GLES >= 3.1 but we had all sorts of bad
assumptions in our version handling, and then our compute shaders
turn out not to be GLSL-ES compliant.

This change contains some necessary, but insufficient, tweaks to the
shaders. Perhaps we'll make it actually work some day.
2021-12-12 20:23:31 -08:00
Niklas Haas
8bd0dee531 osdep: rename MP_UNREACHABLE
It was pointed out on IRC that the name is misleading, since the actual
semantics of the macro is to assert first.
2021-11-03 15:15:20 +01:00
Niklas Haas
c704824b45 osdep: add MP_UNREACHABLE
This seems to work on gcc, clang and mingw as-is, but I made it
conditional on __GNUC__ just in case, even though I can't figure out
which compilers we care about that don't export this define.

Also replace all instances of assert(0) in the code by MP_UNREACHABLE(),
which is a strict improvement.
2021-11-03 14:09:27 +01:00
Niklas Haas
edb0caa441 vo_gpu: allow using bare windows as --(t)scale
A lot of people seem to do something like --tscale=box
--tscale-window=<function>. Just let them use --tscale=<function>
directly, by also accepting raw windows.

Kinda hacky but needed for feature parity with vo_gpu_next, which no
longer has `--tscale=box`. Note that because the option struct is still
shared, vo_gpu_next inherits the same option handling code, so we have
to export this feature for vo_gpu as well.
2021-11-03 14:09:27 +01:00
Niklas Haas
432581b604 vo_gpu: lift ra_ctx_* opts to a global struct
So I can re-use them for vo_gpu_next.
2021-11-03 14:09:27 +01:00
Niklas Haas
254730d891 vo_gpu: fix rotated compute shader vertex simulation
Upon re-examination I have no idea why this code was ever written or
what problem it was trying to solve. But, getting rid of it fixes #9291.

It might be a remnant from before 2af2fa7a27. Who knows. This code will
be replaced soon(tm) anyways.
2021-10-22 19:08:13 +02:00
Philip Langdale
dbbf4a415d vo_gpu: vulkan: implement a VkDisplayKHR backed context
This is the Vulkan equivalent of the drm context for OpenGL, with
the big difference that it's implemented purely in terms of Vulkan
calls and doesn't actually require drm or kms.

The basic idea is to identify a display, mode, and plane on a device,
and then create a display backed surface for the swapchain. In theory,
past that point, everything is the same, and this is in fact the case
on Intel hardware. I can get a video playing on a vt.

On nvidia, naturally, things don't work that way. Instead, nvidia only
implemented the extension for scenarios where a VR application is
stealing a display from a running window system, and not for
standalone scenarios. With additional code, I've got this scenario to
work but that's a separate incremental change.

Other people have tested on AMD, and report roughly the same behaviour
as on Intel.

Note, that in this change, the VT will not be correctly restored after
qutting. The only way to restore the VT is to introduce some drm
specific code which I will illustrate in a separate change.
2021-06-11 09:54:16 -07:00
Niklas Haas
353cccfa8c vo_gpu: replace --icc-contrast by --icc-force-contrast
Not only does this have semantics that make far more sense, it also has
a default that makes far more sense. (Equivalent to the old
`icc-contrast=inf`)

This removes the weird 1000:1 contrast default assumption which
especially broke perceptual profiles and also screws things up for
OLED/CRT/etc.

Should probably close some issues but I honestly can't be bothered to
figure out which of the thousands colorimetry-related issues are
affected.
2021-05-26 17:35:29 +02:00
Niklas Haas
6c1dd02f32 vo_gpu: fix extreme clipping with --gamut-clipping for HDR outputs
Checking against 1.0 is wrong for this code, because it's in an absolute
luminance scale relative to reference white. We should be normalizing
this by `dst.sig_peak`.
2021-05-22 21:18:51 +02:00
Your Name
d9008d2aa8 Revert "vo_gpu: revert 8a09299 and conditionally clear framebuffer again"
This reverts commit b8156a9a86.

Apparently there are two problems here. One on Linux (fixed by the
original change and this revert) and one on alternative-medicine-Job's
OS. Since the latter has deprecated OpenGL and OpenGL is a second-class
citizen, I think it's better to prefer the fix for a platform that is
still alive.
2021-05-07 15:01:15 +02:00
Niklas Haas
474ee003ed vo_gpu: greatly increase maximum shader cache size
See #8137 for justification.

This is not ideal, because a large cache results in a lot of `strcmp`
invocations for every single shader invocation. But it's way better than
resulting in a lot of shader recompilations for every single shader
invocation.

The only reason I don't want to uncap it altogether is because there are
conceivable edge cases in which users load dynamically generated shaders
with updated parameters (indeed, I've seen IRC discussions to this
effect), and in this case, we don't want to grow the cache infinitely as
a result of something like a floating point parameter being continuously
updated. (Never mind that this *would* actually trigger worst case
behavior for the `strcmp`, since the updated float constant is likely to
be near the bottom of the shader body)

Whatever. vo_placebo will liberate us all in the end.
2021-04-18 20:13:43 +02:00
LaserEyess
dd86f195a6 vo_gpu: adjust interpolation_threshold's default
When mpv attempts to play a video that is, on average, 60 FPS on a
display that is not exactly 60.00 Hz, two options try to fight each
other: `video-sync-max-video-change` and `interpolation-threshold`.
Normally, container FPS in something such as an .mp4 or a .mkv is
precise enough such that the video can be retimed exactly to the display
Hz and interpolation is not activated.

In the case of something like certain live streaming videos or other scenario
where container FPS is not known, the default option of 0.0001 for
`interpolation-threshold` is extremely low, and while
`video-sync-max-video-change` retimes the video to what it approximately
knows as the "real" FPS, this may or may not be outside of
`interpolation-threshold`'s logic at any given time, which causes
interpolation to be frequently flipped on and off giving an appearance
of stuttering or repeated frames that is oftern quite jarring and makes
a video unwatchable.

This commit changes the default of `interpolation-threshold` to 0.01,
which is the same value as `video-sync-max-video-change`, and guarantees
that if the user accepts a video being retimed to match the display,
they do not additionally have to worry about a much more
precise interpolation threshold randomly flipping on or off. No internal
logic is changed so setting `interpolation-threshold` to -1 will still
disable this logic entirely and always enable interpolation.

The documentation has been updated to reflect this change and give
context to the user for which scenarios they might want to disable
`interpolation-threshold` logic or change it to a smaller value.
2021-03-28 20:26:41 +03:00
Philip Langdale
3f006eced4 options: Make validation and help possible for all option types
Today, validation is only possible for string type options. But there's
no particular reason why it needs to be restricted in this way, and
there are potential uses, to allow other options to be validated
without forcing the option to have to reimplement parsing from
scratch.

The first part, simply making the validation function an explicit
field instead of overloading priv is simple enough. But if we only do
that, then the validation function still needs to deal with the raw
pre-parsed string. Instead, we want to allow the value to be parsed
before it is validated. That in turn leads to us having validator
functions that should be type aware. Unfortunately, that means we need
to keep the explicit macro like OPT_STRING_VALIDATE() as a way to
enforce the correct typing of the function. Otherwise, we'd have to
have the validator take a void * and hope the implementation can cast
it correctly.

For help, we don't have this problem, as help doesn't look at the
value.

Then, we turn validators that are really help generators into explicit
help functions and where a validator is help + validation, we split
them into two parts.

I have, however, left functions that need to query information for both
help and validation as single functions to avoid code duplication.

In this change, I have not added an other OPT_FOO_VALIDATE() macros as
they are not needed, but I will add some in a separate change to
illustrate the pattern.
2021-03-28 19:46:27 +03:00
Niklas Haas
c766e47b70 vo_gpu: don't abort() if plane tex creation fails
In this case, we just treat all uploads as automatically failing. That
error path already exists and is already handled correctly.

Fixes #8566
2021-02-16 14:22:29 +01:00
Mia Herkt
12ffce0f22
vo_gpu: lower default deband threshold
The previous default was found to be too aggressive for most video.
Change to a lower value to prevent destroying too much detail.
2021-02-05 15:15:55 +01:00
Niklas Haas
3e175dff4a vo_gpu: don't segfault if 3DLUT texture fails uploading
This failure path was never properly checked.
2021-01-01 17:14:57 +01:00
Niklas Haas
be167c227b vo_gpu: cast bvecN to vecN for mix() on older GLSL
Fixes https://github.com/mpv-player/mpv/issues/8415, among others
2020-12-28 19:39:41 +01:00
der richter
b8156a9a86 vo_gpu: revert 8a09299 and conditionally clear framebuffer again
in the original commit, that removed the conditional clearing, an
incorrect assumption was made that clearing "should be practically free"
and can be done always. though, at least on macOS + intel this can have
a performance impact of up to 50% increased usage. it might have an
impact on other platforms and setups as well, but this is unconfirmed.

the reason for removing the conditional clearing was to partially work
around a driver bug on very specific setups, X11 with amdgpu and OpenGL,
to clear garbled frames on start. though it still has issues with
garbled frames in other situation like fullscreening. there is also an
open bug report on the mesa bug tracker about this. setting the
radeonsi_zerovram flag works around all of those issues.

since the flag works around all these issues and the original fix
doesn't work completely we revert it and keep our optimisation.

Fixes #8273
2020-12-06 21:46:29 +02:00
Niklas Haas
dc0e9644cd vo_gpu: improve gamut warning bounds checks
Test for signals exceeding 0.5% of the permitted gamut, in either
direction. (Before, it was 1% above and 0% below)

Should fix https://github.com/mpv-player/mpv/issues/8161
2020-10-21 14:39:59 +02:00
Dudemanguy
aacefa4ae5 vo_gpu: update render options on runtime
vo_gpu has a small set of options for ra_ctx that can be set. In
practice, runtime toggling doesn't matter for most of these as they have
no effect while a video is playing. However, changing the alpha option
during runtime can actually work depending on the backend used. mpv
already detected when one of these options changed, but it made no
attempt to update the options in the ra_ctx accordingly (likely because
nothing made any use of this information). Another related change is to
add an update_render_opts to the fns and allow invidiual backends to
(optionally) use it.
2020-10-15 13:43:45 +00:00
Niklas Haas
c53a47eda3 vo_gpu: clip highlights before tone-mapping
Rather than after tone-mapping. This prevents overflow when the
pre-tonemapped signal contains inputs exceeding sig_peak. I also
realized that with this clipping in place, post-clipping no longer needs
to be done, so this isn't even particularly slower.

The only two exceptions to the rule are "clip" and "linear", which
relied on the post-clipping to do their tone mapping properly.

Fixes #7929
2020-07-19 08:07:48 +02:00
wm4
b93f142011 client API: add software rendering API
This can be used to make vo_libmpv render video to a memory buffer. It
only adds a new backend API that takes memory surfaces. All the render
API (such as frame rendering control and so on) is reused.

I'm not quite convinced of the usefulness of this, and until now I
always resisted providing something like this. It only seems to
facilitate inefficient implementation. But whatever.

Unfortunately, this duplicates the software rendering glue code yet
again (like it exists in vo_x11, vo_wlshm, vo_drm, and probably more).
But in theory, these could reuse this backend in the future, just like
vo_gpu could reuse the render_gl API.

Fixes: #7852
2020-07-08 22:42:05 +02:00
sfan5
afe2b83183 vo_gpu: fix typo in struct name 2020-06-24 08:58:50 +02:00
Stephen Salerno
5141662427 vo_gpu: use highp float if available for GLES
Using mediump float on GLES causes problems with kernel resampling,
PQ HDR, and possibly others. The issues are fixed by using highp,
which is available when GL_FRAGMENT_PRECISION_HIGH is defined.
2020-06-21 19:14:16 +03:00
Niklas Haas
dc24a437fb vo_gpu: add better gamut clipping option
See https://code.videolan.org/videolan/libplacebo/-/commit/d63eeb1ecc204

Enabled by default because I think it looks better. YMMV.
2020-06-19 08:09:19 +02:00
Niklas Haas
ae5ac7e90a vo_gpu: fix scaler/window validation to allow unsetting
--dscale= and --*scale-window= (i.e. an empty string) are respectively
valid settings for their options (and, in fact, the defaults).

This fixes the bug that it was impossible to reset e.g. tscale-window
back to the default "unset" setting after setting it once.

Credit goes to @CounterPillow for locating the cause of this bug.
2020-06-18 02:02:45 +02:00
Niklas Haas
c9f6c458ea vo_gpu: add BT.2390 tone-mapping
Implementation copy/pasted from:
https://code.videolan.org/videolan/libplacebo/-/commit/f793fc0480f

This brings mpv's tone mapping more in line with industry standard
practices, for a hopefully more consistent result across the board.

Note that we ignore the black point adjustment of the tone mapping
entirely. In theory we could revisit this, if we ever make black point
compensation part of the mpv rendering pipeline.
2020-06-15 01:24:09 +02:00
Niklas Haas
ef6bc8504a vo_gpu: reinterpret SDR white levels based on ITU-R BT.2408
This standard says we should use a value of 203 nits instead of 100 for
mapping between SDR and HDR.

Code copied from https://code.videolan.org/videolan/libplacebo/-/commit/9d9164773

In particular, that commit also includes a test case to make sure the
implementation doesn't break roundtrips.

Relevant to #4248 and #7357.
2020-06-15 01:24:04 +02:00
Niklas Haas
c7fe4ae73a vo_gpu: move coherent specifier to the correct location
glslang accepted this, perhaps erronneously, but mesa does not. It seems
to be incorrect. A caveat is that this means *all* SSBOs are now
coherent, but since we only use SSBOs for peak detection, that's a
non-issue. (And besides, marking something as coherent when we don't
perform any synchronization commands on it should be a no-op anyway)

Fixes #7823
2020-06-10 17:16:43 +02:00
wm4
5ad73ccbe9 vo_gpu: fix display corruption with window screenshots
The "screenshot window" command (ctrl+s by default) somehow broke video
colors with --gpu-api=vulkan --profile=gpu-hq when playback was paused.
I don't know the cause, but the rest of the code seems to imply
gl_video_reset_surfaces() needs to be called manually to flush some
caches, and it fixes the issue, so I assume there's no great mystery
here.
2020-06-06 12:26:37 +02:00
Niklas Haas
7174c063de vo_gpu: mark peak detection buffer coherent
This is required for buffer memory barriers to actually work
2020-06-06 02:20:43 +02:00
Niklas Haas
03171b19a9 vo_gpu: make storage images/buffers as restrict
This informs the GPU that we don't alias it with any other descriptors
(which we don't).
2020-06-06 02:19:32 +02:00
Niklas Haas
6d8b3f9333 vo_gpu: un-fix storable fbo format check
The original solution for #7017 was sort of a hack, but this hack is no
longer needed because c05e5d9d fixed the underlying issue causing this
error to be spammed in the first place. So just remove the "fix" that
apparently introduced about as many issueas it fixed.

Fixes #7719, hopefully.
2020-05-13 18:54:30 +02:00
wm4
0b09771ba9 vo_gpu: manually resolve user shader prefixes
This resolves prefixes such as "~/" and "~~/" at the caller, instead of
relying on stream_read_file() to do it. One of the following commits
will remove this from stream_read_file() itself.

Untested.
2020-05-10 16:36:47 +02:00
wm4
9e48085043 vo_gpu: fix green shit with float yuv input
This was incorrect at least because the colorspace matrix attempted to
center chroma at (conceptually) 0.5, instead of 0. Also, it tried to
apply the fixed point shift logic for component sizes > 8 bit.

There is no float yuv format in mpv/ffmpeg yet, but see next commit,
which enables zimg to output it. I'm assuming zimg defines this format
such that luma is in range [0,1] and chroma in range [-0.5,0.5], with
the levels flag being ignored. This is consistent with H264/5 Annex E (I
think...), and it sort of seems to look right, so that's it.
2020-05-09 18:02:57 +02:00
Niklas Haas
4f0206ab04 vo_gpu: enable frame caching for still frames
For some reason this was never done? Looking through the code, it was
never the case that the frame cache was hit for still frames. I have no
idea why not. It makes a lot of sense to do so.

Notably, this massively improves the performance of updating the OSC
when viewing e.g. large still images, or while paused. (Tested on a
4000x8000 image, the OSC now responds smoothly to user input)
2020-04-30 00:23:13 +02:00
wm4
71295fb872 video: add alpha type metadata
This is mostly for testing. It adds passing through the metadata through
the video chain. The metadata can be manipulated with vf_format. Support
for zimg alpha conversion (if built with zimg after it gained alpha
support) is implemented. Support premultiplied input in vo_gpu.

Some things still seem to be buggy.
2020-04-24 14:41:50 +02:00
wm4
0f8f6a665b video: change chroma_w/chroma_h fields to use shift instead of size
When I added mp_regular_imgfmt, I made the chroma subsampling use the
actual chroma division factor, instead of a shift (log2 of the actual
value). I had some ideas about how this was (probably?) more intuitive
and general. But nothing ever uses non-power of 2 subsampling (except
jpeg in rare cases apparently, because the world is a bad place).

Change the fields back to use shifts and rename them to avoid mistakes.
2020-04-23 13:24:35 +02:00
wm4
26f4f18c06 options: change option macros and all option declarations
Change all OPT_* macros such that they don't define the entire m_option
initializer, and instead expand only to a part of it, which sets certain
fields. This requires changing almost every option declaration, because
they all use these macros. A declaration now always starts with

   {"name", ...

followed by designated initializers only (possibly wrapped in macros).
The OPT_* macros now initialize the .offset and .type fields only,
sometimes also .priv and others.

I think this change makes the option macros less tricky. The old code
had to stuff everything into macro arguments (and attempted to allow
setting arbitrary fields by letting the user pass designated
initializers in the vararg parts). Some of this was made messy due to
C99 and C11 not allowing 0-sized varargs with ',' removal. It's also
possible that this change is pointless, other than cosmetic preferences.

Not too happy about some things. For example, the OPT_CHOICE()
indentation I applied looks a bit ugly.

Much of this change was done with regex search&replace, but some places
required manual editing. In particular, code in "obscure" areas (which I
didn't include in compilation) might be broken now.

In wayland_common.c the author of some option declarations confused the
flags parameter with the default value (though the default value was
also properly set below). I fixed this with this change.
2020-03-18 19:52:01 +01:00
Avi Halachmi (:avih)
8861bfa913 vo_gpu: warn if correct-downscaling is ignored
And document that it's ignored with bilinear scaler.
2020-03-14 19:47:12 +02:00
Niklas Haas
c05e5d9d78 vo_gpu: generally allow non-storable FBOs
We have this cap now thanks to e2976e662, but we don't actually make
sure our FBOs are storable before we blindly attempt using them with
compute shaders.

There's no more need to unconditionally set `storage_dst = true` as long
as we make sure to include an extra condition on the `fbo_format`
selection to prevent users from accidentally enabling
compute-shader-only features with non-storable FBOs, alongside some
other miscellaneous adjustments to eliminate instances of "assumed
storability" from vo_gpu.
2020-03-08 21:41:16 +01:00
Niklas Haas
c145c80085 vo_gpu: avoid error spam when ra_fbo fmt is non-storable
This simply makes the "is the destination FBO format bad?" check a tiny
bit less awful, by making sure we prefer storable FBO formats over
non-storable FBO formats. I'd love to make this also conditional on
whether or not we actually *need* a storable FBO format, but that logic
is decided later, in `pass_draw_to_screen`, and I don't want to
replicate the logic.

Fixes #7017.
2020-03-08 21:32:06 +01:00
wm4
7d11eda72e Remove remains of Libav compatibility
Libav seems rather dead: no release for 2 years, no new git commits in
master for almost a year (with one exception ~6 months ago). From what I
can tell, some developers resigned themselves to the horrifying idea to
post patches to ffmpeg-devel instead, while the rest of the developers
went on to greener pastures.

Libav was a better project than FFmpeg. Unfortunately, FFmpeg won,
because it managed to keep the name and website. Libav was pushed more
and more into obscurity: while there was initially a big push for Libav,
FFmpeg just remained "in place" and visible for most people. FFmpeg was
slowly draining all manpower and energy from Libav. A big part of this
was that FFmpeg stole code from Libav (regular merges of the entire
Libav git tree), making it some sort of Frankenstein mirror of Libav,
think decaying zombie with additional legs ("features") nailed to it.
"Stealing" surely is the wrong word; I'm just aping the language that
some of the FFmpeg members used to use. All that is in the past now, I'm
probably the only person left who is annoyed by this, and with this
commit I'm putting this decade long problem finally to an end. I just
thought I'd express my annoyance about this fucking shitshow one last
time.

The most intrusive change in this commit is the resample filter, which
originally used libavresample. Since the FFmpeg developer refused to
enable libavresample by default for drama reasons, and the API was
slightly different, so the filter used some big preprocessor mess to
make it compatible to libswresample. All that falls away now. The
simplification to the build system is also significant.
2020-02-16 15:14:55 +01:00
wm4
f612de1712 vo_gpu: fix crash if dither texture fails to allocate
Theoretically possible (and quite unlikely due to the small texture
size). The code was originally written with the assumption that texture
allocations can't fail, and it was never updated out of laziness.
Untested.
2020-01-08 03:45:17 +01:00
Dudemanguy
7d08491cd9 Revert "vo_gpu: move wayland below X11 in autoprobe order"
This reverts commit a6d8e9b7ff.
2020-01-01 20:27:54 +00:00
wm4
a6d8e9b7ff vo_gpu: move wayland below X11 in autoprobe order
I'm sick of mpv being accused of not doing it right.
You can't do it right on Wayland? Long live X11.

Fixes: #7307
2019-12-30 13:12:33 +01:00
Philip Langdale
9c05be8999 video: cuda: add explicit context creation for copy hwaccels
In the distant past, the cuviddec backed copy hwaccel could be
configured directly using lavc options. However, since that time,
we gained support for automatic hw ctx creation which ended up
bypassing the lavc options.

Rather than trying to find a way to pass those options again, a
better idea is to make the 'cuda-decode-device' option, used by
the interop hwaccels, work for the copy hwaccels too.

And that's pretty simple: we have to add a create function that
checks the option and passes it on to ffmpeg.

Note that this does require a slight re-jig to the configuration
flags, as we now have a scenario where we want to build with support
for the cuda copy hwaccels but not the interop ones. So we need
a distinct configuration flag for that combination.

Fixes #7295.
2019-12-29 14:32:47 -08:00
Philip Langdale
eb852dc50c vo_gpu: hwdec_vdpau: remove direct_mode
As we are less and less interested in vpdpau, with nvdec and vaapi
being better choices in general on nvidia and AMD respectively, we
might consider removing direct_mode, where we bypass the vdpau
mixer and work directly with yuv textures. Normally, working with
yuv textures would be great, but vdpau built in an assumption that
all frames are delivered as separate fields, causing us to have
to re-interleave them.

nvidia then introduces a new OpenGL extension that can return the
yuv frames as frames, but we can't just unconditionally switch to
that as we'd want to keep supporting older hardware where the drivers
are no longer getting new features. The end result is that we
wouldn't be able to get rid of the old code paths.

Removing direct_mode means we always use the mixer, and work with
rgba frame textures. There are some theoretical limitations to
this, but in practice they probably don't matter much - unsupported
colourspaces don't matter because without 10bit decoding support,
we can't use them anyway, and apparently we're not doing separate
chroma scaling these days, so scaling the rbga doesn't really lose
anything (and the vdpau hq scaling option remains available).
2019-12-28 14:31:06 -08:00
wm4
78f1629a53 vf_gpu: render subtitles
Pretty annoying affair. The vo_gpu code could of course not trigger
rendering from filters yet, so it needed to be extended. Also, this uses
some icky stuff made for vf_sub (and this was the reason I marked vf_sub
as deprecated), so everything is terrible.
2019-11-30 18:09:31 +01:00
Niklas Haas
b31f2f6cb9 vo_gpu: fix infinite scaler reinit spam
Handling the window with this function makes no sense, since windows
and kernels are not the same thing and don't share the same option list.

The only reason it's done is to make sure the char* points at the static
string rather than the dynamically allocated one, which we can do
manually in this function. Rewrite a bit for clarity/quality.
2019-11-23 11:46:52 +01:00
Michael Forney
bea582f383 video/out/gpu: Remove stray top-level ';' 2019-11-18 16:50:21 +01:00
wm4
73c3dc0a7b vo_gpu: sync duplicated condition on peak computation
pass_color_map() (in video_shaders.c) and pass_colormanage() (video.c)
both duplicate the condition on whether to do peak computation. Peak
computation requires a compute shader, so if the duplicated conditions
don't match, video_shaders.c will generate a compute shader, but video.c
will try to run it as fragment shader. This leads to a "blue screen".

This can be reproduced by playing a HDTV video with --target-peak=99.

It's not clear how to fix this. Should pass_tone_map() be only invoked
if mp_trc_is_hdr() == true (what pass_colormanage() uses to decide
whether to enable peak computation), or should pass_colormanage() just
tell pass_color_map() to skip peak computation? Decide for the latter,
as it's more robust.

Even if not correct, at least it gets rid of the blue shit.

Fixes: #7149
2019-11-16 19:02:36 +01:00
wm4
35de8ea0a8 vo_gpu: yuv alpha is always full range
Probably. It's not like these pixel formats are formally specified -
FFmpeg added them because _some_ file format or decoder supports it, and
while that format/codec may define it precisely, the pixel format is
sort of disconnected and just a FFmpeg thing.

In any case, the yuva sample I had at hand uses the full range the
component data type can provide. The old code used the same "shifted"
range as for Y/U/V components, which must have been wrong.

This will not work correctly for packed YUVA formats, but fortunately
they matter even less.
2019-11-09 23:56:44 +01:00
wm4
8a0929973d vo_gpu: unconditionally clear framebuffer on start of frame
For some reason, the first frame displayed on X11 with amdgpu and OpenGL
will be garbled. This is especially visible if the player starts,
displays a frame, but then still takes a while to properly start
playback.

With --interpolation, the behavior somehow changes (usually gets worse).
I'm not sure what exactly is going on, and the code in video.c is way
too abstruse. Maybe there is some slight possibility that a frame with
uncleared contents gets displayed, which somehow also corrupts another
frame that is displayed immediately after that.

If clear is unconditionally run, this somehow doesn't happen, and you
see a video frame. By any logic this shouldn't happen: a video frame
should always overwrite the background. So I can't exclude that this
isn't some sort of driver bug, or at least very obscure interaction.

Clearing should be practically free anyway, so always do it.

Fixes: #7105
2019-11-06 22:42:44 +01:00
wm4
6d92e55502 Replace uses of FFMIN/MAX with MPMIN/MAX
And remove libavutil includes where possible.
2019-10-31 11:24:20 +01:00
Jan Ekström
fc29620ec8 vo_gpu/d3d11: add support for configuring swap chain color space
By default utilizes the color space of the desktop on which the
swap chain is located. If a specific value is defined, it will be
instead be utilized.

Enables configuration of the PQ color space (BT.2020 primaries,
PQ transfer function) for HDR.

Additionally, signals the swap chain color space to the renderer,
so that the render looks correct without having to specify
target-trc or target-prim manually.

Due to all of the APIs being Win10+ only, will only work starting
with Windows 10.
2019-10-30 02:41:25 +02:00
Jan Ekström
93dd77b38e vo_gpu/d3d11: add helpers for getting names for DXGI formats & CSPs
Additionally, define the few enum values that are currently missing
in mingw-w64 headers.
2019-10-30 02:41:25 +02:00
Jan Ekström
4e712e627c vo_gpu: add and utilize color space information from ra_fbo
This lets us set primaries, transfer function and the target peak
based on what the presenting layer would want us to have.

Now that this mechanism is available, warn if the user has
overridden values such as primaries or transfer function.
2019-10-30 02:41:25 +02:00
James Ross-Gowan
8e50d7a746 vo_gpu: log ra_format.storable with the other flags
This seems to have been missed when the storable flag was added, since
all the other flags were logged here. It can be useful to know if an RA
format is storable, so log it as well.
2019-10-27 00:45:27 +11:00
wm4
a908101258 vo_gpu: attempt to fix 0bgr format
Using e.g. --vf=format=0bgr showed obviously wrong colors with --vo=gpu.
The reason is that leading padding wasn't handled correctly.

Try to hack fix it. While the code in copy_image() is somewhat
reasonable, I can't tell what the fuck is going on with that HOOKED
shit. For some reason this HOOKED shit doesn't use copy_image() (???),
or uses it incorrectly. It affects debanding. --deband=no works
correctly. If it's enabled, the crap in hook_prelude() is needed.

I bet there are many more bugs with this. For example, the deband shader
will try to deband the alpha channel if the format abgr is used (because
the correct component order is only established later). This can be
tested by inserting a "color.x = 0;" at the end of the deband shader,
and using --vf=format=rgba vs. abgr.

I cannot comprehend why it doesn't just store explicitly which
components a texture contains, and why it doesn't just read the
components always in an uniform way.

There's a big chance this fix works only by coincidence. This shouldn't
have been so hard either. Time for a complete rewrite?
2019-10-26 00:02:55 +02:00
wm4
77f309c94f vo_gpu, options: don't return NaN through API
Internally, vo_gpu uses NaN for some options to indicate a default value
that is different depending on the context (e.g. different scalers).
There are 2 problems with this:

1. you couldn't reset the options to their defaults
2. NaN is a damn mess and shouldn't be part of the API

The option parser already rejected NaN explicitly, which is why 1.
didn't work. Regarding 2., JSON might be a good example, and actually
caused a bug report.

Fix this by mapping NaN to the special value "default". I think I'd
prefer other mechanisms (maybe just having every scaler expose separate
options?), but for now this will do. See you in a future commit, which
painfully deprecates this and replaces it with something else.

I refrained from using "no" (my favorite magic value for "unset" etc.)
because then I'd have e.g. make --no-scale-param1 work, which in
addition to a lot of effort looks dumb and nobody will use it.

Here's also an apology for the shitty added test script.

Fixes: #6691
2019-10-25 00:25:05 +02:00
wm4
b7eae31834 vo_gpu: hwdec_d3d11eglrgb: remove this
Finally. Since with the previous commit we can (probably) handle
P010 directly, this hack isn't needed anymore.
2019-10-16 23:41:06 +02:00
Jan Ekström
eaa3c1c922 vo_gpu/d3d11: fix memleak of the adapter description string 2019-10-15 22:12:48 +03:00
Jan Ekström
03e7a36a73 vo_gpu/d3d11: remove unnecessary nullptr check
mp_to_utf8 will abort in case of either invalid input or OOM.
2019-10-15 22:12:48 +03:00
Jan Ekström
89f4ce9d6f vo_gpu/d3d11: switch adapter selection to case-insensitive startswith
This lets users set values such as "intel" or "nvidia" as the
adapter vendor is generally noted in the beginning of the
description string.
2019-10-15 22:12:48 +03:00
Jan Ekström
684ffd13b4 vo_gpu/d3d11: fixup adapter selection by switching it all to bstr
I did ponder if I should have done this right away, and it seems
like not doing it at first was a mistake.
2019-10-15 22:12:48 +03:00
Jan Ekström
648d785930 vo_gpu/d3d11: add support for configuring swap chain format
Query information on the system output most linked to the swap chain,
and either utilize a user-configured format, or either 8bit
RGBA or 10bit RGB with 2bit alpha depending on the system output's
bit depth.
2019-10-13 22:31:33 +11:00
Jan Ekström
1f76e69145 vo_gpu/d3d11: add adapter name validation and listing with "help"
Not the prettiest way to get it done, but seems to work.
2019-09-29 19:39:26 +03:00
Jan Ekström
bca6e14702 vo_gpu/d3d11: refactor pthread_once d3d11 loading to function
Lets us reuse this in the future.
2019-09-29 19:39:26 +03:00
Jan Ekström
b7438d3aff vo_gpu/d3d11: utilize the passed adapter name
Normalize nullptr and an empty string both to nullptr to simplify
handling. API users cannot set a value back to nullptr, so both
an empty string as well as nullptr should behave the same.
2019-09-29 19:39:26 +03:00
Jan Ekström
e6447e2e89 vo_gpu/d3d11: add an option for the adapter name
Set it from the adapter name in the d3d11 options.
2019-09-29 19:39:26 +03:00
Jan Ekström
e205e179e0 vo_gpu/d3d11_helpers: also load up CreateDXGIFactory1
Just a factory, without a device, is required for listing of devices.
2019-09-29 19:39:26 +03:00
Anton Kindestam
6290420380 vo: make swapchain-depth option generic for all VOs
In preparation for making vo_drm able to use swapchain-depth
2019-09-28 14:10:01 +03:00
Wessel Dankers
643417dd17 video: add pure gamma TRC curves for 2.0, 2.4 and 2.6. 2019-09-27 13:21:41 +02:00
Philip Sequeira
21a5c416d5 options: add M_OPT_FILE to some more options that take files 2019-09-27 13:19:29 +02:00
sfan5
e350ceef4c vo_gpu: vulkan: add Android context 2019-09-27 00:05:06 +03:00
Cameron Cawley
db09d77e46 rpi: Update for modern systems 2019-09-20 11:39:06 +02:00
wm4
c6773692ad vo_gpu: remove vdpau/GLX backend
Useless garbage.

This was once added to test whether vdpau presentation feedback could be
used. Results were always unsatisfactory, and now vdpau is dead.
2019-09-19 20:37:05 +02:00
wm4
83d7123dc3 vo_gpu: remove mali-fbdev
Useless at this point, I don't even know if it still works, or how to
test it.
2019-09-19 20:37:05 +02:00
Anton Kindestam
e08f235578 drm: fix libmpv ABI breakage introduced in 351c083487
Extending the client-allocated mpv_opengl_drm_params struct
constituted a break of ABI that could cause UB.

Create a clean break by deprecating "drm_params" and related structs
and enum values, and replacing it with "drm_params_v2".

Also fix some comments and code that wrongly assumed that open could
return any other negative number than -1 for failure.

This commit updates the libmpv version to 1.104
2019-09-18 23:59:32 +03:00
Philip Langdale
fa0a905ea0 vo_gpu: hwdec_vaapi: Refactor Vulkan and OpenGL interops for VAAPI
Like hwdec_cuda, you get a big #ifdef mess if you try and keep the
OpenGL and Vulkan interops in the same file. So, I've refactored
them into separate files in a similar way.
2019-09-15 17:51:47 -07:00
wm4
0abe34ed21 vo_gpu: x11: remove special vdpau probing, use EGL by default
Originally, vo_gpu/vo_opengl considered the case of Nvidia proprietary
drivers, which required vdpau/GLX, and Intel open source drivers, which
require vaapi/EGL. Since window creation and GPU context creation are
inseparable in mpv's internal API, it had to pick the correct API very
early, or hardware decoding wouldn't work. "x11probe" was introduced for
this reason. It created a GLX context (without showing the window yet),
and checked whether vdpau was available. If yes, it used GLX, if not, it
continued probing x11/EGL. (Obviously it couldn't always fail on GLX
without vdpau, which is why it was a separate "probe" backend.)

Years passed, and now the situation is different. Vdpau is dead. Nvidia
drivers and libavcodec now provide CUDA interop, which requires EGL, and
fixes some of the vdpau problems. AMD drivers now provide vaapi, which
generally works better than vdpau. Intel didn't change.

In particular, vaapi provides working HEVC Main10 support. In theory, it
should work on vdpau too, with quality reduction (no 10 bit surfaces),
but I couldn't get it to work.

So always prefer EGL. And suddenly hardware decoding works. This is
actually rather important, because HEVC is unfortunately on the rise,
despite shitty encoders and unoptimized decoders. The latter may mean
that hardware decoding works better than libavcodec.

This should have been done a long, long time ago.
2019-09-15 20:00:52 +03:00
Niklas Haas
a416b3f084 vo_gpu: correctly normalize src.sig_peak
In some cases, src.sig_peak remains undefined as 0, which was definitely
the case when using the OSD, since it never got passed through the usual
color space normalization process. Most robust work-around is to simply
force the normalization at the site where it's needed. This ensures this
value is always valid and defined, to make the peak-dependent logic in
these two functions always work.

Fixes 4b25ec3a9d
Fixes #6917
Fixes #6918
2019-09-15 01:33:27 +02:00
Niklas Haas
4b25ec3a9d vo/gpu: fix check on src/dst peak mismatch
In the past, src peak was always equal to or higher than dst peak. But
since `--target-peak` got introduced, this could no longer be the case.
This leads to an incorrect result (scaling for peak mismatch in gamma
light) unless some other option (CMS, --linear-scaling, etc.) forces the
linearization.

Fixes #6533
2019-09-05 19:13:44 +03:00
wnoun
ae8cb39ab2 vo_gpu: fix taking screenshots of rotated videos 2019-08-14 21:54:14 +02:00
Philip Langdale
e2976e662d video/out/gpu: Add a storable flag to ra_format
While `ra` supports the concept of a texture as a storage
destination, it does not support the concept of a texture format
being usable for a storage texture. This can lead to us attempting
to create a texture from an incompatible format, with undefined
results.

So, let's introduce an explicit format flag for storage and use
it. In `ra_pl` we can simply reflect the `storable` flag. For
GL and D3D, we'll need to write some new code to do the compatibility
checks. I'm not going to do it here because it's not a regression;
we were already implicitly assuming all formats were storable.

Fixes #6657
2019-07-08 00:59:28 +02:00
Bin Jin
c9e7473d67 vo_gpu: process three component together in error diffusion
This started as a desperate attempt to lower the memory requirement
of error diffusion, but later it turns out that this change also
improved the rendering performance a lot (by 40% as I tested).

Errors was stored in three uint before this change, each with 24bit
precision. This change encoded them into a single uint, each with 8bit
precision. This reduced the shared memory usage, as well as number of
atomic operations, all by three times.

Before this change, with the minimum required 32kb shared memory, only
the `simple` kernel can be used to render 1080p video, which is mostly
useless compare to `--dither=fruit`. After this change, 32kb can
handle `burkes` kernel for 1080p, or `sierra-lite` for 4K resolution.
2019-06-16 11:19:44 +02:00
Bin Jin
f6fd127fe8 vo_gpu: fix use of existing textures in error diffusion
error diffusion requires two texture rendering pass. The existing code
reuses `screen_tex` and creates another for such purpose. This works
generally well for opengl, but could potentially be problematic for
vulkan, due to its async natural.
2019-06-16 11:19:44 +02:00
Bin Jin
ca2f193671 vo_gpu: implement error diffusion for dithering
This is a straightforward parallel implementation of error diffusion
algorithms in compute shader. Basically we use single work group with
maximal possible size to process the whole image. After a shift
mapping we are able to process all pixels column by column.

A large ring buffer are allocated in shared memory to speed things up.
However the size of required shared memory depends linearly on the
height of video window (or screen height in fullscreen mode). In case
there is no enough shared memory, it will fallback to `--dither=fruit`.

The maximal allowed work group size is hardcoded as 1024. Ideally we
could query `GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS`. But for whatever
reason, it seems most high end card from nvidia and amd support only
the minimal required value, so I guess we can stick to it for now.
2019-06-16 11:19:44 +02:00
Bin Jin
fbe267150d vo_gpu: fix --scaler-resizes-only for fractional ratio scaling
The calculation of scale factor involves 32-bit float, and a strict
equality test will effectively ignore `--scaler-resizes-only` option
for some non-integer scale factor.

Fix this by using non-strict equality check.
2019-06-06 20:01:56 +02:00
Bin Jin
f2119d9d88 vo_gpu: expose texture_off to user shader
It will provide low level access to coordinate mapping other than
texmap().
2019-06-06 20:01:56 +02:00
Bin Jin
ae1c489b31 vo_gpu: allow user shader to fix texture offset
This commit essentially makes user shader able to fix offset (produced
by other prescaler, for example) like builtin `--scale`.
2019-06-06 20:01:56 +02:00
Philip Langdale
b74b39dfb5 vo_gpu: vulkan: Add back context_win for libplacebo
Feature parity with the original ra_vk obviously requires win32 support,
so let's put it back in.
2019-04-21 23:55:22 +03:00
Niklas Haas
7006d6752d vo_gpu: vulkan: use libplacebo instead
This commit rips out the entire mpv vulkan implementation in favor of
exposing lightweight wrappers on top of libplacebo instead, which
provides much of the same except in a more up-to-date and polished form.

This (finally) unifies the code base between mpv and libplacebo, which
is something I've been hoping to do for a long time.

Note: The ra_pl wrappers are abstract enough from the actual libplacebo
device type that we can in theory re-use them for other devices like
d3d11 or even opengl in the future, so I moved them to a separate
directory for the time being. However, the rest of the code is still
vulkan-specific, so I've kept the "vulkan" naming and file paths, rather
than introducing a new `--gpu-api` type. (Which would have been ended up
with significantly more code duplicaiton)

Plus, the code and functionality is similar enough that for most users
this should just be a straight-up drop-in replacement.

Note: This commit excludes some changes; specifically, the updates to
context_win and hwdec_cuda are deferred to separate commits for
authorship reasons.
2019-04-21 23:55:22 +03:00
Niklas Haas
a3c808c6c8 vo_gpu: fix segfault when OSD tex creation fails
If !osd->texture, then mpgl_osd_draw_prepare fails.
2019-04-21 23:55:22 +03:00
Niklas Haas
f0b6860d62 vo_gpu: index desc namespaces by ra
No reason to require them be constant. This allows them to depend on
runtime characteristics of the `ra`.
2019-04-21 23:55:22 +03:00
Bin Jin
dd83b66652 vo_gpu: increase user shader size limit
The old size limit was chosen before LUT texture was supported in user
shader. At that time, the whole user shader will be compiled and run
on GPU, which makes large user shader impractical to be used.

With the introduction of LUT texture, the old size limit doesn't make
any sense. For example, a 1024x1024 rgba16f LUT will cost 32MB shader
size.

Fix this by increasing the size limit to a value that's unlikely be
reached.
2019-03-13 21:47:24 +02:00
Jan Ekström
199aabddcc Merge branch 'master' into pr6360
Manual changes done:
  * Merged the interface-changes under the already master'd changes.
  * Moved the hwdec-related option changes to video/decode/vd_lavc.c.
2019-03-11 01:00:27 +02:00
Bin Jin
1d0349d3b5 vo_gpu: add two useful operators to user shader
modulo operator could be used to check if size is multiple of a
certain number.

equal operator could be used to verify if size of different textures
aligns.
2019-03-09 12:56:11 +01:00
Bin Jin
b3cbd46509 vo_gpu: make texture offset available to CHROMA hooks
Before this commit, texture offset is set after all source textures
are finalized. Which means CHROMA hooks won't be able to align with
luma planes. This could be problematic for chroma prescalers utilizing
information from luma plane.

Fix this by find the reference texture early, and set global texture
offset early.
2019-03-09 12:56:11 +01:00
zc62
e37c253b92 lcms: allow infinite contrast
Fixes #5980
2019-03-09 12:55:44 +01:00
Niklas Haas
8b563a0346 vo_gpu: fix initial seeding of the peak detect ssbo
This solves some edge cases when using files with very weird metadata
(e.g. MaxCLL 10k and so forth). Instead of just blindly seeding it with
the tagged metadata, forcibly set the initial state from the detected
values.
2019-02-18 01:54:06 +02:00
Niklas Haas
3f1bc25d4d vo_gpu: use dB units for scene change detection
Rather than the linear cd/m^2 units, these (relative) logarithmic units
lend themselves much better to actually detecting scene changes,
especially since the scene averaging was changed to also work
logarithmically.
2019-02-18 01:54:06 +02:00
Niklas Haas
b4b719e337 vo_gpu: clamp sigmoid function
Can explode on some clips otherwise
2019-02-18 01:54:06 +02:00
Niklas Haas
258ed5d471 vo_gpu: tone map before gamut mapping
Gamut mapping can take very bright out-of-gamut colors into the
negatives, which completely destroys the color balance (which tone
mapping tries its best to preserve).
2019-02-18 01:54:06 +02:00
Niklas Haas
677ae4f8fe vo_gpu: make --gamut-warning warn on negative colors
As is the case for actually out-of-gamut colors (rather than just too
bright colors).
2019-02-18 01:54:06 +02:00
Niklas Haas
11b58415d5 vo_gpu: improve numerical accuracy of PQ OETF constant
Not a huge deal, but we can do the division in C, which makes the float
constant larger.
2019-02-18 01:54:06 +02:00
Niklas Haas
4e8022da26 vo_gpu: allow color management in dumb mode
There's no point to disallow target-trc/prim in dumb mode, since they
still work fine.
2019-02-18 01:54:06 +02:00
Niklas Haas
fdd671188d vo_gpu: improve accuracy of HDR brightness estimation
This change switches to a logarithmic mean to estimate the average
signal brightness. This handles dark scenes with isolated highlights
much more faithfully than the linear mean did, since the log of the
signal roughly corresponds to the perceptual brightness.
2019-02-18 01:54:06 +02:00
Niklas Haas
12e58ff8a6 vo_gpu: allow boosting dark scenes when tone mapping
In theory our "eye adaptation" algorithm works in both ways, both
darkening bright scenes and brightening dark scenes. But I've always
just prevented the latter with a hard clamp, since I wanted to avoid
blowing up dark scenes into looking funny (and full of noise).

But allowing a tiny bit of over-exposure might be a good thing. I won't
change the default just yet (better let users test), but a moderate
value of 1.2 might be better than the current 1.0 limit. Needs testing
especially on dark scenes.
2019-02-18 01:54:06 +02:00
Niklas Haas
6179dcbb79 vo_gpu: redesign peak detection algorithm
The previous approach of using an FIR with tunable hard threshold for
scene changes had several problems:

- the FIR involved annoying hard-coded buffer sizes, high VRAM usage,
  and the FIR sum was prone to numerical overflow which limited the
  number of frames we could average over. We also totally redesign the
  scene change detection.

- the hard scene change detection was prone to both false positives and
  false negatives, each with their own (annoying) issues.

Scrap this entirely and switch to a dual approach of using a simple
single-pole IIR low pass filter to smooth out noise, while using a
softer scene change curve (with tunable low and high thresholds), based
on `smoothstep`. The IIR filter is extremely simple in its
implementation and has an arbitrarily user-tunable cutoff frequency,
while the smoothstep-based scene change curve provides a good, tunable
tradeoff between adaptation speed and stability - without exhibiting
either of the traditional issues associated with the hard cutoff.

Another way to think about the new options is that the "low threshold"
provides a margin of error within which we don't care about small
fluctuations in the scene (which will therefore be smoothed out by the
IIR filter).
2019-02-18 01:54:06 +02:00
Niklas Haas
3fe882d4ae vo_gpu: improve tone mapping desaturation
Instead of desaturating towards luma, we desaturate towards the
per-channel tone mapped version. This essentially proves a smooth
roll-off towards the "hollywood"-style (non-chromatic) tone mapping
algorithm, which works better for bright content, while continuing to
use the "linear" style (chromatic) tone mapping algorithm for primarily
in-gamut content.

We also split up the desaturation algorithm into strength and exponent,
which allows users to use less aggressive desaturation settings without
affecting the overall curve.
2019-02-18 01:54:06 +02:00