At least with gnome-shell (I know, I know), the compositor does
not provide the old window size when leaving the maximized state.
Instead, we get a toplevel_config event with a 0x0 size and no
additional states.
Today, we already save the window geometry to restore it when leaving
the fullscreen state, so we just need a small change for it to
kick in for leaving the maximized state. If I read this correctly,
we'll still respect the size passed by a compositor that actually
provides the old size.
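Roughly, the handling amounts to this (a sketch with hypothetical names; mpv's actual code differs):

    /* A 0x0 configure with no states means "pick your own size", so fall
     * back to the geometry saved before maximizing/fullscreening. */
    if (width == 0 && height == 0) {
        width  = mp_rect_w(wl->window_size);
        height = mp_rect_h(wl->window_size);
    }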
Rather than hard-coding the edge grab zone width, we can make it
user configurable. It seems worthwhile to have separate configs
for pointer and touch usage as the defaults should be different,
and a user might have both input methods in use.
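Hypothetically (option names and defaults made up purely for illustration), that could look like:

    --edge-grab-zone-pointer=8     # pointers are precise, keep it small
    --edge-grab-zone-touch=32      # fingers are not, make it bigger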
Today, we support resizing wayland windows when we detect a touch
event in a defined grab zone. As part of implementing
pseudo-decorations, we should have equivalent functionality for
mouse input. And if we detect support for actual decorations we
will not activate the grab zone as the decorations will provide this.
Use x11->opts instead of vo->opts. This doesn't matter currently, and
x11->opts is actually set to vo->opts. However, there's a chance that
either option access changes, or that the way backends integrate with
struct vo changes. This is just a preemptive change to make this less of
a mess, and it's generally a good idea to reduce accesses to struct vo
anyway.
This is preparation to get rid of the option-to-property bridge
(mp_on_set_option). This is a pretty insane thing that redirects
accesses to options to properties. It was needed in the ever ongoing
transition from something to... something else.
A good example for the need of this bridge is applying profiles at
runtime. This obviously goes through the config parser, but should also
make all changes effective, for which traditionally the property layer
is used.
There isn't much left that needs this bridge. This commit changes a
bunch of options (which also have a property implementation) to use
option change notifications instead. Many of the properties are still
left, but perform unrelated functions like OSD formatting.
This should be mostly compatible. There may be some subtle behavior
changes. For example, "hwdec" and "record-file" do not check for changes
anymore before applying them, so writing the current value to them
suddenly does something, while it was ignored before.
DVB changes untested, but should work.
Handling the window with this function makes no sense, since windows
and kernels are not the same thing and don't share the same option list.
The only reason it's done is to make sure the char* points at the static
string rather than the dynamically allocated one, which we can do
manually in this function. Rewrite a bit for clarity/quality.
We should only be printing errors that occur when not probing, to
avoid creating the impression that something is wrong - and errors
during probing aren't a problem.
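The usual mpv pattern for this is demoting the message level while probing; a minimal sketch (the probing flag is a stand-in for however the caller signals it):

    int msgl = probing ? MSGL_V : MSGL_ERR;
    MP_MSG(ctx, msgl, "Could not initialize backend.\n");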
Surprisingly, we've managed to get this far without context_glx ever
adding the X11 display as a native resource. But with the recent change
to attempt to enable vdpau when using EGL, the hwdec now requires the
display to be added. So let's add it.
eglGetDisplay() is a broken API, since it takes a windowing specific
argument, yet is supposed to work for multiple APIs at the same time. On
Linux, it can take both an X11 "Display" and a "wl_display". Obviously
there is no way to specify what kind of display the argument is (it's
just a void*).
Mesa has _eglNativePlatformDetectNativeDisplay, which does funny stuff
to try to guess the display type, including trying to call mincore() to
determine whether the pointer can be accessed at all. I guess this
recently accidentally broke (as a bug), but on the other hand, maybe
it's time to do this properly.
The fix is using eglGetPlatformDisplay(). This requires EGL 1.5, plus
Mesa needs to support the associated platform extension
(EGL_KHR_platform_x11).
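For reference, the EGL 1.5 call makes the display type explicit; minimal sketch, error handling omitted:

    #include <EGL/egl.h>
    #include <EGL/eglext.h>

    /* EGL_PLATFORM_X11_KHR comes from EGL_KHR_platform_x11. */
    EGLDisplay egl_display =
        eglGetPlatformDisplay(EGL_PLATFORM_X11_KHR, x11_display, NULL);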
Since I see no reasonable way to do this in a compatible way, just
require that EGL 1.5 is available. The problem is that EGL 1.4 seems to
require you to create a display to query EGL version and extension, and
you have a chicken-and-egg problem. It's very stupid. Maybe you could
jump through some more hoops to get something compatible, but fuck that.
Users on "too old" Mesa will fall back to GLX (which we keep around for
a regrettable company known by the name of Nvidia).
I think Wayland and GBM should do the same. They're sufficiently
bleeding-edge that you can expect them to have EGL 1.5. On the other
hand, the cursed RPI code will have to stay with eglGetDisplay().
Speculative fix for #7154.
(Rant about EGL follows. Actually I deleted it.)
pass_color_map() (in video_shaders.c) and pass_colormanage() (video.c)
both duplicate the condition on whether to do peak computation. Peak
computation requires a compute shader, so if the duplicated conditions
don't match, video_shaders.c will generate a compute shader, but video.c
will try to run it as fragment shader. This leads to a "blue screen".
This can be reproduced by playing a HDTV video with --target-peak=99.
It's not clear how to fix this. Should pass_tone_map() be only invoked
if mp_trc_is_hdr() == true (what pass_colormanage() uses to decide
whether to enable peak computation), or should pass_colormanage() just
tell pass_color_map() to skip peak computation? Decide for the latter,
as it's more robust.
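In other words, something like this (the extra parameter is hypothetical; the real signatures differ):

    /* video.c evaluates the condition exactly once and passes it down,
     * so video_shaders.c can never disagree about compute vs. fragment. */
    bool compute_peak = p->opts.compute_hdr_peak && mp_trc_is_hdr(src.gamma);
    pass_color_map(p->sc, src, dst, compute_peak);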
Even if not correct, at least it gets rid of the blue shit.
Fixes: #7149
This makes the condition for including each init match the condition for
compiling the file that defines it.
It's possible to e.g. HAVE_GL and HAVE_VAAPI without HAVE_VAAPI_EGL,
which resulted in "undefined reference to `vaapi_gl_init'" with the old
code.
Probably. It's not like these pixel formats are formally specified -
FFmpeg added them because _some_ file format or decoder supports it, and
while that format/codec may define it precisely, the pixel format is
sort of disconnected and just an FFmpeg thing.
In any case, the yuva sample I had at hand uses the full range the
component data type can provide. The old code used the same "shifted"
range as for Y/U/V components, which must have been wrong.
This will not work correctly for packed YUVA formats, but fortunately
they matter even less.
This is fragile enough that it warrants getting "monitored".
This takes the commented test program code from img_format.c, makes it
output to a text file, and then compares it to a "ref" file stored in
git.
Originally, I wanted to do the comparison etc. in a shell or Python
script. But why not do it in C. So mpv calls /usr/bin/diff as a
sub-process now.
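The gist of it, assuming cmocka-style asserts and hypothetical helper/path names:

    dump_img_formats(tmp_path);   /* write the table to a text file */
    char *cmd = talloc_asprintf(NULL, "diff -u %s %s", ref_path, tmp_path);
    assert_true(system(cmd) == 0);   /* fails if output deviates from the ref */
    talloc_free(cmd);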
This test will start producing different output if FFmpeg adds new pixel
formats or pixel format flags, or if mpv adds new IMGFMT (either aliases
to FFmpeg formats or own formats). That is unavoidable, and requires
manual inspection of the results, and then updating the ref file.
The changes in the non-test code are to guarantee that the format ID
conversion functions only translate between valid IDs.
The use of glXGetCurrentDisplay() restricted this to the GLX backend.
But actually it works under EGL as well. Removing the GLX-specific call
and using the general mpv-internal method to get the X "Display" makes
it work in mpv.
I didn't know this. Nvidia didn't list this as extension in the EGL
context when I still used their GPUs.
Note that this might in theory break use of vdpau in some libmpv clients
using the render API. But only if MPV_RENDER_PARAM_X11_DISPLAY is not
used, and they relied on mpv using glXGetCurrentDisplay(). EGL does not
provide such an API, and hwdec_vaapi.c also uses what hwdec_vdpau.c uses
now. Considering that vaapi is preferable these days, it's not bad at
all if these clients get "broken". They can be easily fixed by passing
the display to mpv correctly.
For some reason, the first frame displayed on X11 with amdgpu and OpenGL
will be garbled. This is especially visible if the player starts,
displays a frame, but then still takes a while to properly start
playback.
With --interpolation, the behavior somehow changes (usually gets worse).
I'm not sure what exactly is going on, and the code in video.c is way
too abstruse. Maybe there is some slight possibility that a frame with
uncleared contents gets displayed, which somehow also corrupts another
frame that is displayed immediately after that.
If clear is unconditionally run, this somehow doesn't happen, and you
see a video frame. By any logic this shouldn't happen: a video frame
should always overwrite the background. So I can't exclude that this
is some sort of driver bug, or at least a very obscure interaction.
Clearing should be practically free anyway, so always do it.
Fixes: #7105
They were used at some point, but then fell into disuse. In general,
these old flags are all a bit fuzzy, so it's a good idea to remove them
as much as possible.
The comment about MP_IMGFLAG_PAL isn't true anymore. The old meaning was
deprecated at some point, and the flag was removed from "pseudo
paletted" formats. I think mpv at one point changed its own flag from
AV_PIX_FMT_FLAG_PSEUDOPAL to AV_PIX_FMT_FLAG_PAL, when the former was
deprecated, and it became unnecessary to allocate a palette for
non-paletted formats. (The one who deprecated it in FFmpeg was me, if
you wonder.)
MP_IMGFLAG_PLANAR was used in command.c; use a relatively similar flag
as replacement.
This is slightly helpful for testing, and otherwise useless and without
consequence.
I'm not using the correct output format and using IMGFMT_RGB0 as
placeholder. This doesn't matter currently, as both sws and zimg support
this as output (and support any input for it). I'm doing this because
it's surprisingly tricky to get the correct output format at this point,
without digging deeper into x11 shit or refactoring parts of the VO. I
don't care enough about this.
The user can raise the number of tolerated hardware decoding errors. On
the other hand, we have a static limit on packets that are "saved" for
fallback handling (and that's a good idea to avoid unbounded memory
usage). In this case, it could happen that the start of a file was fine
after a fallback, but after that buffered amount of data, it would
suddenly skip.
It's more useful to skip buffering entirely if the number of tolerated
decoding errors exceeds the fixed buffer.
(And also, I'm sure nobody gives a shit about this feature.)
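So the logic reduces to roughly this (field names hypothetical):

    /* If more errors are tolerated than we can buffer packets for, the
     * fallback would skip anyway - don't bother saving packets at all. */
    if (opts->software_fallback > MAX_DELAY_QUEUE)
        ctx->save_packets = false;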
prepare_decoding() returned a bool that was supposed to tell whether
decoding could work, or if something was fucked. After recent changes to
the decoder loop, this did not work anymore, and caused an endless loop.
Redo it, so it makes more sense. avctx being NULL (software fallback
initialization failed) now signals EOF. hwdec_failed needs to be handled
on send_packet() only, where it probably never happens anyway.
(Who was the idiot who made libavcodec have two entrypoints for
decoding? Oh right, it was me. PEBKAC.)
Shovel the code around to make the data flow slightly simpler (?). At
least there's only one send_packet function now. The old code had the
problem that send_packet() could be called even if there were queued
packets; due to sending the queued packets in the receive_frame
function, this should not happen anymore (the code checking for this
case in send_packet should normally never be called).
Untested with actual full stream hw decoders (none available here); I
created a test case by making hwaccel decoding fail.
Forgotten in commit 5d5fdb7. This failed to return the error code
properly. In particular, if the decoder rejected the packet, this was
not properly detected. Normally, this mattered only in specific cases.
Fixes: #7115
Instead, use a YUV planar format. It doesn't matter, since we use the
format only internally and for "management" purposes. We're only
interested in the physical layout, not what colorspace FFmpeg "forcibly"
associates with it.
Also get rid of using the old and slightly sketchy mp_imgfmt_find()
function. Yep, the IMGFMT_RGB30 now "constructs" the planar format,
instead of using a pixfmt constant. Slightly inconvenient, tricky, and
fragile, but I like it, so bugger off.
This whole thing gets rid of some of the strange plane permutations that
were needed earlier.
According to the definition of the GL format, and the definition in
img_format.h, and the actual output by vo_gpu, the order of components
was probably wrong. It's exceedingly likely that the vo_drm format (for
which this was originally written) has the same layout, so this was
probably a bug from when the zimg wrapper code was refactored.
This was too hardcoded to libswscale. In particular, IMGFMT_RGB30 output
is only possible with the zimg wrapper, so the context needs to be taken
into account (since this depends on the --sws-allow-zimg option
dynamically). This is still slightly risky, because zimg currently will
still fall back to swscale in some cases, such as when it refuses to
initialize the particular color conversion that is requested.
f_autoconvert.c could actually handle this better, but I'm too fucking
lazy right now, and nobody cares anyway, so go away, OK?
This integrates it as "special" format, with no alpha component, as the
equivalent IMGFMT_RGB30 isn't meant to contain any.
Nothing can produce this format in the video chain yet, so the next
commits are needed to make this actually work.
This is mostly just because of the odd RGB default gamma issue, which
shouldn't have any real impact. This also sets allow_approximate_gamma,
which I hope is fine for normal use cases.
Normally, the Y plane can just be passed directly to zimg, and only the
chroma plane needs to be (de)interleaved. It still needs a copy if the Y
pointer is not aligned, though. (Whether this is actually a problem
depends on the CPU and probably zimg's compiler.)
This requires deciding per plane whether the plane should go through the
repack buffer or not. This logic is also active in non-nv12 cases,
because disabling it there would require extra code (maybe 2 lines or so).
repack_align is now always called, even if it's planar->planar with all
input aligned, but it won't actually do anything in that case. The
assumption is that zimg won't change behavior if you pass a callback
that does nothing versus passing NULL as callback.
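The per-plane decision boils down to something like this (simplified, hypothetical names):

    for (int n = 0; n < num_planes; n++) {
        /* Planar, sufficiently aligned planes go straight to zimg; packed
         * chroma or misaligned pointers go through the repack buffer. */
        bool direct = plane_is_planar(fmt, n) &&
                      !((uintptr_t)mpi->planes[n] % ZIMG_ALIGN) &&
                      !(mpi->stride[n] % ZIMG_ALIGN);
        r->use_buf[n] = !direct;
    }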
This is for formats like nv12 (including p010, nv24, etc.). Might be
important for hardware decoding. Previously, this would have forced a
libswscale fallback.
The genericism makes this only slightly more complicated. The main
complication is due to the fact that mixing planar and packed stuff is
insane (thanks, Nvidia).
P010 output will actually happily set any of the 6 bit "padding" LSB,
that are normally supposed to be 0 (for unpadded data there is P016).
Scaling happens with 16 bit precision. Not going to bother adding an
extra packer which zeros them out, or with shifting them in
packing/unpacking. Let's just hope nobody notices.
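For reference, P010 keeps its 10 significant bits in the MSBs of each 16-bit word:

    uint16_t word = (uint16_t)(v10 << 6); /* low 6 bits nominally 0; P016 uses all 16 */
    /* Since scaling runs at 16 bit precision here, those padding bits can
     * come out non-zero; strictly clean output would need a masking pass. */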
This is similar to mp_imgfmt_find(), but probably a bit saner. Used by
the next commit. The previous commit is required to map this
unambiguously between all formats.
As the code comment says, this is needed to disambiguate FFmpeg formats.
This struct only describes the "physical" layout of a format, while
FFmpeg also attaches part of the colorspace information to the format.
We've set all planes to the same zmask. But for subsampled chroma, the
zmask obviously needs to be smaller. This could lead to out of bounds
memory read and write accesses.
Move the align repacker to a single function, since this is now more
convenient.
This probably covers all packed formats which have byte-aligned
components, no alpha, and no subsampling. Everything else needs more
imgfmt metadata, or something even more complicated. Alpha is primarily
not supported, because zimg requires a second scaler instance for it,
and handling packing/unpacking with it is an unacceptable mess.
Raise swscale and zimg default parameters. This restores screenshot
quality settings (maybe) unset in the commit before. Also expose some
more libswscale and zimg options.
Since these options are also used for VOs like x11 and drm, this will
make x11/drm/etc. much slower. For compensation, provide a profile that
sets the old option values: sw-fast. I'm also enabling zimg here, just
as an experiment.
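So, to restore the old behavior for such VOs:

    mpv --vo=x11 --profile=sw-fast video.mkv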
The core problem is that we have a single set of command line options
which control the settings used for most swscale/zimg uses. This was
done in the previous commit. It cannot differentiate between the VOs,
which need to be realtime and may accept/require lower quality options,
and things like screenshots or vo_image, which can be slower, but should
not sacrifice quality by default.
Should this have two sets of options or something similar to do the
right thing depending on the code which calls libswscale? Maybe. Or
should I just ignore the problem, make it someone else's problem (users
who want to use software conversion VOs), provide a sub-optimal
solution, and call it a day? Definitely, sounds good, pushing to master,
goodbye.
Lots of dumb crap to do... something. Instead of adding yet another dumb
helper, just use the main" sws_utils API in both callers. (Which,
unfortunately, has been duplicated for glorious webp screenshots,
despite the fact that webp is crap.)
Good part: can enable zimg for screenshots (as far as needed).
Bad part: uses "default" swscale parameters instead of HQ now.
Purpose uncertain. I guess it's slightly better, maybe.
The move of the sws/zimg options from VO opts (vo_opt_list) to the
top-level option list is tricky. VO opts have some helper code in vo.c,
that sends VOCTRL_SET_PANSCAN to the VO on every VO opts change. That's
because updating certain VO options used to be this way (and not just
the panscan option). This isn't needed anymore for sws/zimg options, so
explicitly move them away.
By default utilizes the color space of the desktop on which the
swap chain is located. If a specific value is defined, it will be
utilized instead.
Enables configuration of the PQ color space (BT.2020 primaries,
PQ transfer function) for HDR.
Additionally, signals the swap chain color space to the renderer,
so that the render looks correct without having to specify
target-trc or target-prim manually.
Due to all of the APIs being Win10+ only, will only work starting
with Windows 10.
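The Win10+ calls involved look roughly like this (C COM macros, error handling omitted, the override check is hypothetical):

    DXGI_OUTPUT_DESC1 desc;
    IDXGIOutput6_GetDesc1(output6, &desc); /* desktop color space + bit depth */
    DXGI_COLOR_SPACE_TYPE csp = have_user_csp ? user_csp : desc.ColorSpace;
    IDXGISwapChain3_SetColorSpace1(swapchain3, csp);
    /* ...then report csp to the renderer, so target-prim/target-trc
     * don't need to be set manually. */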
This lets us set primaries, transfer function and the target peak
based on what the presenting layer would want us to have.
Now that this mechanism is available, warn if the user has
overridden values such as primaries or transfer function.
This seems to have been missed when the storable flag was added, since
all the other flags were logged here. It can be useful to know if an RA
format is storable, so log it as well.
Using e.g. --vf=format=0bgr showed obviously wrong colors with --vo=gpu.
The reason is that leading padding wasn't handled correctly.
Try to hack fix it. While the code in copy_image() is somewhat
reasonable, I can't tell what the fuck is going on with that HOOKED
shit. For some reason this HOOKED shit doesn't use copy_image() (???),
or uses it incorrectly. It affects debanding. --deband=no works
correctly. If it's enabled, the crap in hook_prelude() is needed.
I bet there are many more bugs with this. For example, the deband shader
will try to deband the alpha channel if the format abgr is used (because
the correct component order is only established later). This can be
tested by inserting a "color.x = 0;" at the end of the deband shader,
and using --vf=format=rgba vs. abgr.
I cannot comprehend why it doesn't just store explicitly which
components a texture contains, and why it doesn't just read the
components always in a uniform way.
There's a big chance this fix works only by coincidence. This shouldn't
have been so hard either. Time for a complete rewrite?
With hwdec copy modes, mp_image_copy_attributes() is used to transfer
metadata other than the image data when copying the image from the
hardware surface. It didn't copy the closed caption data.
Fix this. This makes closed captions in copy mode work.
Fixes: #6376
sdl_gamepad.c and vo_sdl.c both have their own event loops and run in
separate threads. They don't know of each other (and shouldn't). Since
SDL only has one global event loop (why didn't they fix this in SDL2?),
these obviously clash. The actual behavior is relatively subtle, with
events being randomly dispatched to either of the threads.
This is very regrettable. Very.
Work this around. "Fortunately" SDL exposes its global state to some
degree. SDL_WasInit() returns whether a "subsystem" was initialized, and
you could say the one who initialized it owns it. Both SDL_INIT_VIDEO
and SDL_INIT_GAMECONTROLLER implicitly enable SDL_INIT_EVENTS, and the
event loop is indeed the resource that cannot be shared.
Unfortunately, this is still racy, since SDL_InitSubSystem is a second
call, and succeeds if the subsystem is already initialized (increases a
refcount I think). But good enough. Blame SDL for everything.
(I think I made this commit message too long. Nobody cares even.)
Fixes: #7085
It seems some users try to use it (!). This VO was always an experiment,
and intended for low power devices. Whether this experiment succeeded or
not, it's a rather obscure VO. Recently I've seen a regrettable user,
who seemed to use this only because mpv was built without x11 support
(!). Add this warning, like other fallback VOs have it. (The message was
copied from vo_x11.)
Commit 5d5fdb77e9 changed details of the decoding control flow, and
called it a "high-risk" change. It turns out that this broke with
hwdec copy mode, where there is some sort of delay queue (supposedly
increases efficiency, but more likely worthless cargo-cult).
It simply used the wrong (basically inverted) condition for the draining
case.
This was the only case that did not work properly. Other tests,
including video/audio decoding errors, software decoding fallbacks,
etc., seemed to work well. Might still not be exhaustive, as there are
so many corner cases.
Also change two error code returns. These don't/shouldn't really matter,
though the second error code led it to return both a frame and
AVERROR_EOF, which is unexpected, and makes lavc_process() leak a frame.
But also see next commit.
Fixes: 5d5fdb77e9
If expected_sync_pc is greater than submit_count, the unsigned
subtraction will wraparound, which breaks playback. This bug was found
while experimenting with bit-blt model present, but it might be possible
to trigger it with the flip model as well, if there was a dropped frame.
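The failure mode in miniature:

    /* With unsigned types, a - b wraps to a huge value when b > a: */
    UINT64 frames_behind = submit_count - expected_sync_pc; /* bogus if expected > submit */
    /* One way to guard against it: */
    UINT64 behind = submit_count >= expected_sync_pc
                  ? submit_count - expected_sync_pc : 0;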
Internally, vo_gpu uses NaN for some options to indicate a default value
that is different depending on the context (e.g. different scalers).
There are 2 problems with this:
1. you couldn't reset the options to their defaults
2. NaN is a damn mess and shouldn't be part of the API
The option parser already rejected NaN explicitly, which is why 1.
didn't work. Regarding 2., JSON might be a good example, and actually
caused a bug report.
Fix this by mapping NaN to the special value "default". I think I'd
prefer other mechanisms (maybe just having every scaler expose separate
options?), but for now this will do. See you in a future commit, which
painfully deprecates this and replaces it with something else.
I refrained from using "no" (my favorite magic value for "unset" etc.)
because then I'd have to e.g. make --no-scale-param1 work, which in
addition to a lot of effort looks dumb and nobody will use it.
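Usage then looks like:

    # Reset a scaler tunable to its context-dependent default:
    mpv --scale-param1=default video.mkv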
Here's also an apology for the shitty added test script.
Fixes: #6691
ad_lavc and vd_lavc use the lavc_process() helper to translate the
FFmpeg push/pull API to the internal filter API (which completely
mismatch, even though I'm responsible for both, just fucking kill me).
This interface was "slightly" too tight. It returned only a bool
indicating "progress", which was not enough to handle some cases (see
following commit).
While we're at it, move all state into a struct. This is only a single
bool, but we get the chance to add more if needed.
This fixes mpv falling asleep if decoding returns an error during
draining. If decoding fails when we already sent EOF, the state machine
stopped making progress. This left mpv just sitting around and doing
nothing.
A test case can be created with: echo $RANDOM >> image.png
This makes libavformat read a proper packet plus a packet of garbage.
libavcodec will decode a frame, and then return an error code. The
lavc_process() wrapper could not deal with this, because there was no
way to differentiate between "retry" and "send new packet". Normally, it
would send a new packet, so decoding would make progress anyway. If
there was "progress", we couldn't just retry, because it'd retry
forever.
This is made worse by the fact that it tries to decode at least two
frames before starting display, meaning it will "sit around and do
nothing" before the picture is displayed.
Change it so that on error return, "receiving" a frame is retried. This
will make it return the EOF, so everything works properly.
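In pseudo-C, the receive side turns into roughly this (a sketch, not the literal code):

    int ret = avcodec_receive_frame(avctx, frame);
    if (ret < 0 && ret != AVERROR(EAGAIN) && ret != AVERROR_EOF) {
        /* Decode error: retry receiving right away instead of reporting
         * "progress" and demanding a new packet - during draining this
         * lets the decoder reach EOF instead of looping forever. */
        state->retry_receive = true;   /* hypothetical flag */
    }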
This is a high-risk change, because all these funny bullshit exceptions
for hardware decoding are in the way, and I didn't retest them. For
example, if hardware decoding is enabled, it keeps a list of packets,
that are fed into the decoder again if hardware decoding fails, and a
software fallback is performed. Another case of horrifying accidental
complexity.
Fixes: #6618
The code is very basic:
- only handles gamepads, could be extended for generic joysticks in the
future.
- only has button mappings for controllers natively supported by SDL2.
I heard more can be added through env vars (see the example after this
list); there are also ways to load mappings from text files, but I'd
rather not go there yet. Common ones
like Dualshock are supported natively.
- analog buttons (TRIGGER and AXIS) are mapped to discrete buttons using an
activation threshold.
- only supports one gamepad at a time. the feature is intended to use
gamepads as evolved remote controls, not play multiplayer games in mpv :)
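For the env var route mentioned in the list, SDL2 reads SDL_GAMECONTROLLERCONFIG (same mapping format as the community gamecontrollerdb.txt):

    SDL_GAMECONTROLLERCONFIG="$(cat my_pad_mapping.txt)" \
        mpv --input-gamepad=yes video.mkv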
For some reason (and because of my fault), vf_format converts image
formats, but nothing else. For example, setting the "colormatrix"
sub-parameter would not convert it to the new value, but instead
overwrite the metadata (basically "reinterpreting" the image data
without changing it).
Make the historical mistake worse, and go all the way and extend it such
that it can perform a conversion. For compatibility reasons, this needs
to be requested explicitly. (Maybe this would deserve a separate filter
to begin with, but things are messed up anyway. Feel free to suggest an
elegant and simple solution.)
This demonstrates how zimg can properly perform some conversions which
swscale cannot (see examples added to vf.rst).
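The distinction in practice (convert is the new opt-in sub-parameter):

    # Reinterpret only (historical behavior): metadata changes, pixels don't.
    mpv --vf=format=colormatrix=bt.601 video.mkv
    # Actually convert the image data (new, explicit opt-in):
    mpv --vf=format=convert=yes:colormatrix=bt.601 video.mkv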
Stupidly this requires 2 code paths, one for conversion, and one for
overriding the parameters.
Due to the filter bullshit (what was I thinking), this requires quite
some acrobatics that would not be necessary without these abstractions.
On the other hand, it'd definitely be more of a mess without it. Oh
whatever.
f_reset, which is called on seeks, was a good place for resetting the
warning flag (so the warning would be printed again). Except some other
code abused f_reset when all metadata was read (in both cases you want
to clear the metadata). Instead of spending more time on getting this
flag reset correctly, just never reset it.
There's 2 stupid things here that need to be fixed. First of all,
vulkan wasn't actually using presentation time because somehow the
get_vsync function in context.c disappeared. Secondly, if the mpv window
was hidden it was updating the ust time based on the refresh_usec but
really it should simply just not feed any information to the vsync info
structure. So this adds some logic to assume whether or not a window is
hidden.
Normally, input and output are orthogonal. But zimg may gain image
formats not supported by FFmpeg, which means the conversion will only
work if zimg is used at all. This on the other hand, depends on whether
the other format is also supported by zimg. (For example, a later commit
adds RGB30 output to zimg. libswscale does not support this format. But
if you have P010 as input, which the zimg wrapper does not support at
all, the conversion won't work.)
This makes such a function needed; so add it.
The FFmpeg version was last bumped a long time ago, except in commit
1638fa7b46, where it was used for some obscure pixel
format.
This is pretty annoying, so make it optional.
The RGB pack/unpack code in theory supports packed, non-subsampled YUV,
although in practice FFmpeg defines no such formats. (Only one with
alpha, but all alpha input is rejected by the current code.)
This would in theory have failed, because we would have selected a GBRP
format (instead of YUV), which makes no sense and would either have been
rejected by zimg (inconsistent parameters), or lead to broken output
(wrong permutation of planes).
Select the correct format and don't permute the planes in the YUV case.
As suggested by the zimg author. This is mostly related to XYZ support.
It's unclear whether this works. Using the only XYZ test sample we know of,
and the next commits to consume the pixfmt, it looks wrong.
According to the zimg author, YUV->GREY conversion does not even read
the chroma planes, as long as no matrix conversion is involved. Since we
try to avoid the latter anyway by forcing the source parameters on the
target image, passing only the Y plane will not help with anything.
An unscientific test seems to confirm this, so remove this.
This would probably help with libswscale (I didn't test this), but on
the other hand, libswscale will rarely be used in cases where we can
extract the Y plane. (Except nv12, which should probably be added to the
zimg wrapper's unpacking.)
And update the comment both explaining why this defaulting matters and
why we use BT.2020 instead.
tl;dr BT.709 clips even the one test file we *do* have, so if we don't
handle XYZ "natively" in vo_gpu we might as well at least handle it in a
way that runs less risk of clipping
This no longer hard-codes BT.709, it converts to whatever primaries are
tagged in the same metadata struct. The actual BT.709 defaulting comes
from `mp_image_params_guess_csp`.
This will perform conversion and scaling of video with zimg, if
--sws-allow-zimg is used.
The performance probably depends on how well the compiler optimizes the
RGB pack code in zimg.c, which is written in C.
Awful shit. I probably wouldn't accept this code from someone else, just
so you know.
The idea is that a sws_utils user can automatically use zimg without
large code changes. Basically, laziness. Since zimg support is still
very new, and I don't want that anything breaks just because zimg was
enabled at build time, an option needs to be set to enable it. (I have
especially obscure stuff in mind, which is all that libswscale is used
for in mpv.)
This _still_ doesn't cause zimg to be used anywhere, because the
sws_utils user has to opt-in by setting allow_zimg. This is because some
users depend on certain libswscale features.
This provides a very similar API to sws_utils.h, which can be used to
convert and scale from one mp_image to another.
This commit adds only the code, but does not use it anywhere.
The code is quite preliminary and barely tested. It supports only a few
pixel formats, and will return failure for many others. (Unlike
libswscale, which tries to support anything that FFmpeg knows.)
zimg itself accepts only planar formats. Supporting other formats
requires manual packing/unpacking. (Compared to libswscale, the zimg API
is generally lower level, but allows for more flexibility.) Only BGR0
output was actually tested. It appears to work.
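Intended usage mirrors sws_utils; a rough sketch (see the header for the real interface):

    struct mp_zimg_context *z = mp_zimg_alloc();
    if (!mp_zimg_convert(z, dst_img, src_img)) {
        /* unsupported format etc. - fall back to libswscale */
    }
    talloc_free(z);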
This used to be needed for the "GPU memcpy" (shitty Intel methods to
deal with certain uncached memory types). This is now done in FFmpeg,
and the code in mp_image.c was just unnecessarily convoluted.
The plane pointer checking assert() triggered at least on gray8, because
that has a "pseudo palette" in ffmpeg, which mpv refuses to allocate.
Remove a strange duplicated printf().
Log the component type where available.
(Why is this even here, I hate it when there are commented test programs
in source files.)
This was obviously missing from the recent commit, which probably broke
10 bit decoding. The original commit didn't test this for lack of
working hardware; this commit isn't tested either.
Fixes: a1c7d61393
This was speculatively added 2 years ago in preparation for something
that apparently never happened. The D3D code was added as an "example",
but this too was never used/finished.
No reason to keep this.
This syscall avoids the need to guess an unused filename in /dev/shm and
allows seals to be placed on it. We immediately return if no fd got
returned, as there isn’t anything we can do otherwise.
Seals especially allow the compositor to drop the SIGBUS protections,
since the kernel promises the fd won’t ever shrink.
This removes support for any platform but Linux from this vo.
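The pattern in short (Linux-only, needs _GNU_SOURCE):

    int fd = memfd_create("mpv", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0)
        return;   /* nothing else we can do */
    ftruncate(fd, size);
    /* The compositor can rely on the buffer never shrinking: */
    fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK | F_SEAL_SEAL);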
vo_wayland was removed during the wayland rewrite done in 0.28. However,
it is still useful for systems that do not have OpenGL.
The new wayland_common code makes vo_wayland much simpler, and
eliminates many of the issues the previous vo_wayland had.
With the previous commit, this is dead code.
This also makes the f_autoconvert.c code for this dead code
(fortunately). Will probably remove this later.
2 years ago, ANGLE removed the old NV12-specific extension, and added
a new one that supports a number of formats, including P010. Actually
they just renamed it and removed their initial annoying and obvious
design error (bravo, Google).
Since it broke 2 years ago, nobody should give a shit about this code,
and it should just be removed. But for some reason I still dived into
the shit-tank (Windows development).
I guess Intel code monkeys can't write drivers (or maybe the issue is
because we're doing zero-copy, which probably maybe is not actually
allowed by D3D11 due to array textures, see --d3d11va-zero-copy), so
the P010 path is completely untested. If it doesn't work, I'll delete all
this ANGLE hwdec code.
Fixes: #7054
Query information on the system output most linked to the swap chain,
and either utilize a user-configured format, or default to either 8bit
RGBA or 10bit RGB with 2bit alpha, depending on the system output's
bit depth.
The old way of using wayland in mpv relied on an external renderloop for
semi-accurate timings. This had multiple issues though. Display sync
would break whenever the window was hidden (since the frame callback
stopped being executed) which was really annoying. Also the entire
external renderloop logic was kind of fragile and didn't play well with
mpv's internal structure (i.e. using presentation time in that old
paradigm breaks stats.lua).
Basically the problem is that swap buffers blocks on wayland, which is
crap whenever you hide the mpv window, since it locks up the entire
player. So you have to make swap buffers not block, but this has a
different problem: timings will be terrible if you use the unblocked
swap buffers call.
Based on some discussion in #wayland, the trick here is relatively
simple and works well enough for our purposes. Instead we basically
build a way to block with a timeout in the wayland buffer swap
functions.
A bool is set in the frame callback function that indicates whether or
not mpv is waiting for a frame to be displayed. In the actual buffer
swap function, we enter into a while loop waiting for this flag to be
set. At the same time, the wl_display is polled to block the thread and
wakeup if it receives any events from the compositor. This loop only
breaks if enough time has passed or if the frame callback bool is set.
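Schematically (simplified; past_deadline/remaining_ms stand in for the timeout bookkeeping, and the real loop also has to flush the display):

    wl->frame_wait = true;   /* cleared by the frame callback */
    struct pollfd pfd = { .fd = wl_display_get_fd(wl->display), .events = POLLIN };
    while (wl->frame_wait && !past_deadline()) {
        if (poll(&pfd, 1, remaining_ms()) > 0)
            wl_display_dispatch(wl->display); /* may run the frame callback */
    }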
In the near future, it is better to set whether or not a frame has
been displayed in the presentation feedback. However, as a first pass,
doing it in the frame callback is more than good enough.
The "downside" is that we render frames that aren't actually shown on
screen when the player is hidden (it seems like wayland people don't
like that). But who cares. Accurate timings are way more important. It's
probably not too hard to add that behavior back in the player though.
The externally driven renderloop was originally added for the wayland
context (to make display sync somewhat work), but it has a lot of issues
with mpv's internal structure. A different approach should be used.
This reverts commit a743fef837.
This affects hwdec_dxva2dxgi, which uses ra_d3d11_wrap_tex to wrap RGB
video frames that are shared with a D3D9 device. Without it, mpv uses
nearest instead of bilinear scaling with --scale=bilinear (the default)
and --hwdec=dxva2. It's kind of hard to believe this bug has gone
unnoticed for almost two years, but that seems to have been the case.
Fixes: #7042
all the get_property_* usages were removed because in some circumstances
they can lead to deadlocks. they were replaced by accessing the vo and
mp_vo_opts structs directly, like on other vos.
additionally the mpv helper was split into an mpv and a libmpv helper, to
differentiate between private and public APIs and for future changes
like a macOS vulkan context for vo=gpu.
This is the proper fix to the memory leak @wm4 pointed out. It turns out
that when you autoprobe opengl and vo_wayland_init returns false,
vo_wayland_uninit is never actually executed. So you have a leftover
pointer. The vulkan context does this correctly which was why my old,
dumb "fix" broke it.
Dumb idea. The correct thing to do is to fix the preinit and context
creation so that the uninit is correctly executed when probing fails
(and then everything gets freed).
This reverts commit defc8f359c.
wm4 mentioned that the wayland autoprobe leaked. A simple oversight in
the wayland_common code forgot to free the vo_wayland_state if
vo_wayland_init returned false.
Leaks the entire zimg state on filter deinit. Not sure what I was
thinking; with some luck, I just didn't give a shit about this case, but
most likely I was thinking the same thing as always: nothing.
As pointed out by @olifre in #7016, this line of code was wrong. p->opts
at this point is a struct allocated and managed by m_config. opts->file
is a string, and m_config explicitly frees it on destruction. The line
of code in question replaced the opts->file value, and made both the old
and new value children of the talloc allocation, so they were _also_
freed on destruction.
This crashed due to a double-free. First, talloc auto-freeing freed the
string, then m_config explicitly called talloc_free() on the stale
pointer again.
As @v-fox pointed out, commit 36dd2348a1 seems to have triggered the
crash. I suspect this code merely worked out of coincidence, since it
allowed m_config to free the value first. This removed it from the
auto-free list, and thus did not result in a double-free. The change
in order of calling alloc destructors changed the order of these calls.
There is no strong reason why new behavior (as introduced by commit
36dd2348a1) would be wrong (it feels like cleaner behavior). On the
other hand, what the vf_vapoursynth code did is clearly unclean and
going by the m_config API, you're not at all supposed to do this.
m_config manages all memory referenced by option structs, the end.
@olifre's suggested fix also would have been correct (not just hiding
the issue), but I prefer my fix, as it doesn't mess with the option struct
in tricky ways.
This wouldn't have happened if mpv were written in Haskell.
I previously skipped creating the wl_output if the --fullscreen flag
was given with no --fs-screen option, so the fullscreen video lands on
the correct output (where mpv was launched). This breaks if someone
combines the --autofit flag (or other similar options) with it. Instead,
just actually read the xdg_shell spec and realize that you can pass NULL
to xdg_toplevel_set_fullscreen and let the compositor choose the output
if the user doesn't specify it. If this has issues, get a better
compositor.
Basically predicts what mp_image_hw_download() will do. It's pretty
simple if it gets the full mp_image. (Taking just a imgfmt would make
this pretty hard/impossible or inaccurate.)
Used in one of the following commits.
This reverts commit 6385a5fd1b, and in
addition removes the code that automatically inserts the vavpp filter.
The reason is the same as the commit that is being reverted: this
filter seems to trigger driver bugs. It can cause GPU freezes or
just doesn't work.
This variant of disabling the filter is better. There was no way to
add the "force" parameter to the automatically inserted filter, so
the old approach just made manual filter insertion (with the --vf
option or "vf" command) more cumbersome.
Normalize nullptr and an empty string both to nullptr to simplify
handling. API users cannot set a value back to nullptr, so both
an empty string as well as nullptr should behave the same.
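I.e., in the setter, something like:

    /* Both "" and NULL mean "unset": */
    value = (value && value[0]) ? value : NULL;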
Turns out clearing all framebuffers in reconfig isn't such a great idea
when you also end up here when setting pan/scan.
I guess this is just a leftover from a previous iteration of vo_drm
where doing this made sense.
Previously, there appeared to be implicit synchronisation in the
GL interop path, and we never observed any visual glitches. However,
recently, I started seeing stuttering in the GL path and on closer
examination it looked like read-before-write behaviour where GL
would display an old frame again rather than the current one.
After verifying that disabling hwdec made the problem go away,
I tried adding a cuStreamSynchronize() after the memcpys and that
also resolved the problem, so it's clearly sync related.
cuStreamSynchronize() is a CPU sync and so more heavy-weight than
you want, but it's the only tool we have. There is no mechanism
defined for synchronising GL to CUDA (It looks like there is a way
to synchronise CUDA to EGL but it appears one way and so wouldn't
directly address this problem).
Anyway, empirically, the output now looks the same as with hwdec
off.
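Concretely, the shape of the workaround (CUDA driver API; the stream handle is a stand-in):

    CUDA_MEMCPY2D cpy = { /* ...per-plane copy parameters... */ };
    cuMemcpy2DAsync(&cpy, stream);
    /* CPU-side sync: heavyweight, but the only defined way to order
     * these writes against GL's reads. */
    cuStreamSynchronize(stream);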
Swapchain depth currently hard-coded to 3 (4 buffers).
As we now avoid redrawing on repeat frames (we simply requeue the same fb
again), this should give a nice performance boost when playing videos with a
lower FPS than the display FPS in video-sync=display-resample mode.
Presentation feedback has also been implemented to help counter the
significant amounts of jitter we would otherwise be seeing.
This allows using drm hwaccels that require a hwdevice.
Tested with v4l2request hwaccel and cedrus driver on an allwinner device
running mpv with --vo=gpu --gpu-context=drm --hwdec=drm.
At first, this code used only 1 plane, so the compatibility stuff was
sufficient. But then use of planes 1 and 2 was added, without extending
the compatibility stuff.
I think I've seen a case recently where this broke the build and caused
users to apply invalid fixes, but I don't remember where.
It's possible that I didn't get all defines that are needed.
I think I was wrong about having to reset the stats when mpv stops
producing frames, eg. when it's paused. As long as the swapchain doesn't
underflow, last_queue_display_time will still be accurate, because the
next frame produced should still be presented one vsync after the
last one in the swapchain.
If the swapchain underflows (which is the common case when mpv is
paused for more than 150ms), the next predicted frame time should be in
the past. It should be fine to leave last_queue_display_time unset in
this case, since vo.c will use the current time instead, which is a
decent guess (though it doesn't take vsync phase into account).
last_sync_refresh_count and last_sync_qpc_time should be kept on
swapchain underflow as well. Assuming the display refresh rate doesn't
change while mpv is paused, they'll only provide a more accurate guess
of the vsync duration when mpv starts playing again. Also,
vsync_duration_qpc never needs to get reset. It will get overwritten
immediately in most cases, and when it doesn't, it's still a better
guess of the vsync duration than nothing.
The render API (vo_libmpv) had potential deadlock problems with
MPV_RENDER_PARAM_ADVANCED_CONTROL. This required vd-lavc-dr to be
enabled (the default). I never observed these deadlocks in the wild
(doesn't mean they didn't happen), although I could specifically provoke
them with some code changes.
The problem was mostly about DR (direct rendering, letting the video
decoder write to OpenGL buffer memory). Allocating/freeing a DR image
needs to be done on the OpenGL thread, even though _lots_ of threads are
involved with handling images. Freeing a DR image is a special case that
can happen any time. dr_helper.c does most of the evil magic of
achieving this. Unfortunately, there was a (sort of) circular lock
dependency: freeing an image while certain internal locks are held would
trigger the user's context update callback, which in turn would call
mpv_render_context_update(), which processed all pending free requests,
and then acquire an internal lock - which the caller might not release
until a further DR image could be freed.
"Solve" this by making freeing DR images asynchronous. This is slightly
risky, but actually not much. The DR images will be free'd eventually.
The biggest disadvantage is probably that debugging might get trickier.
Any solution to this problem will probably add images to free to some
sort of queue, and then process it later. I considered making this more
explicit (so there'd be a point where the caller forcibly waits for all
queued items to be free'd), but discarded these ideas as this probably
would only increase complexity.
Another consequence is that freeing DR images on the GL thread is not
synchronous anymore. Instead, mpv_render_context_update() will do it
with a delay. This seems roundabout, but doesn't actually change
anything, and avoids additional code.
This also fixes that the render API required the render API user to
remain on the same thread, even though this wasn't documented. As such,
it was a bug. OpenGL essentially forces you to do all GL usage on a
single thread, but in theory the API user could for example move the GL
context to another thread.
The API bump is because I think you can't make enough noise about this.
Since we don't backport fixes to old versions, I'm specifically stating
that old versions are broken, and I'm supplying workarounds.
Internally, dr_helper_create() does not use pthread_self() anymore, thus
the vo.c change. I think it's better to make binding to the current
thread as explicit as possible.
Of course it's not sure that this fixes all deadlocks (probably not).
This adds vsync reporting to the D3D11 backend using the presentation
feedback provided by DXGI, which is pretty similar to what's provided by
GLX_OML_sync_control in the GLX backend. In DirectX, PresentCount is the
SBC, PresentRefreshCount and SyncRefreshCount are kind of like the MSC
and SyncQPCTime is the UST.
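The rough correspondence, in C COM macro form:

    DXGI_FRAME_STATISTICS stats;
    IDXGISwapChain_GetFrameStatistics(swapchain, &stats);
    /* stats.PresentCount      ~ SBC
     * stats.SyncRefreshCount  ~ MSC
     * stats.SyncQPCTime       ~ UST (QueryPerformanceCounter units) */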
Unlike GLX, the DXGI API makes it possible for PresentCount and
SyncQPCTime to refer to different physical vsyncs, in which case
PresentRefreshCount and SyncRefreshCount will be different. The code
supports this possibility, even though it's not clear whether it can
happen when using flip-model presentation. The docs say for flip-model
apps, PresentRefreshCount is equal to SyncRefreshCount "when the app
presents on every vsync," but on my hardware, they're always equal, even
when mpv misses a vsync. They can definitely be different in exclusive
fullscreen bitblt mode, though, which mpv doesn't support now, but might
support in future.
Another difference to GLX is that, at least on my hardware,
PresentRefreshCount and SyncRefreshCount always refer to the last
physical vsync on which mpv presented a frame, but glxGetSyncValues can
apparently return a MSC and UST from the most recent physical vsync,
even if mpv didn't present a new frame on it. This might result in
different behaviour between the two backends after dropped frames or
brief pauses.
Also note, the docs for the DXGI presentation feedback APIs are pretty
awful, even by Microsoft standards. In particular the docs for
DXGI_FRAME_STATISTICS are misleading (PresentCount really is the number
of times Present() has been called for that frame, not "the running
total count of times that an image was presented to the monitor since
the computer booted.")
For good documentation, try these:
https://docs.microsoft.com/en-us/windows/win32/direct3ddxgi/dxgi-flip-model
https://docs.microsoft.com/en-us/windows/win32/direct3d9/d3dpresentstats
https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/d3dkmthk/ns-d3dkmthk-_d3dkmt_present_stats
(Yeah, the docs for the D3D9Ex and even the kernel-mode version of this
structure are better than the DXGI ones. It seems possible that they're
all rewordings of the same internal Microsoft docs, but whoever wrote
the DXGI one didn't really understand it.)
Certain mpv config options require wl->current_output to be created
before the video can actually start rendering. Just always create it
here if the current_output doesn't exist (the one exception being the
--fs option with no --fs-screen flag). Incidentally, this also fixes
--fs-screen not working on wayland.
To find the correct ICC profile X atom, the screen number was calculated
directly from the xrandr order of the screens.
But if a primary screen is set, it should be the first Xinerama screen,
even if it is not the first xrandr screen.
Calculate the proper atom id for each screen.
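The atom naming convention being matched here (screen 0 has no suffix):

    char prop[80] = "_ICC_PROFILE";
    if (screen > 0)
        snprintf(prop, sizeof(prop), "_ICC_PROFILE_%d", screen);
    Atom icc = XInternAtom(x11->display, prop, False);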
On wayland the cursor has to be configured each time the pointer enters.
Currently if the window (re)gains the focus, the pointer is not hidden,
even when configured. After the mouse has been moved the pointer hides
correctly.
https://wayland.freedesktop.org/docs/html/apa.html#protocol-spec-wl_pointer:
wl_pointer::enter - enter event
...
When a seat's focus enters a surface, the pointer image is undefined
and a client should respond to this event by setting an appropriate
pointer image with the set_cursor request.
Fixes #6185.
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Previously, the only mouse buttons supported in wayland were left,
right, and middle click. This adds the thumb back/forward buttons as
valid bindings. Also it removes the old, default behavior of always
sending a right click if an unrecognized mouse button is clicked.
In a related but different fix, the magnitude of an axis event in
wayland is not important to mpv since it internally handles all scaling.
The only thing we care about is getting the sign when the event occurs.
Check if eglGetPlatformDisplayEXT is available and try to
use it to obtain the display connection. Fall back to eglGetDisplay
if eglGetPlatformDisplayEXT is not available or failing.
From PR #5992
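The probe-and-fallback shape (the platform enum depends on the backend; sketch with error handling omitted):

    PFNEGLGETPLATFORMDISPLAYEXTPROC get_platform_display =
        (void *)eglGetProcAddress("eglGetPlatformDisplayEXT");
    EGLDisplay dpy = EGL_NO_DISPLAY;
    if (get_platform_display)
        dpy = get_platform_display(EGL_PLATFORM_WAYLAND_EXT, native_display, NULL);
    if (dpy == EGL_NO_DISPLAY)
        dpy = eglGetDisplay((EGLNativeDisplayType)native_display);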
This is mostly for the case when mpv_render_context_free() is called
while video is going on. This is supposed to gracefully stop video and
deinitialize everything properly. (I feel like it would put too much on
the API user to require that video is stopped before calling this
function. Whether video is running or not is a fairly highlevel thing,
and the API user could not do it in a race-free way.)
One problem was that uninit() accessed ctx after ctx->in_use was set to
false. The update(ctx) call was basically a racy use-after-free. It
needed that call to wake up the mpv_render_context_free() loop that
waited for VO uninit. Fix this by triggering the wakeup inside the lock,
and then doing "barrier" locking in mpv_render_context_free().
Another problem was that the wait loop didn't really wait properly. It
seems the had_kill_update field was a botched attempt to do that. It's
indeed quite hairy to do that with update(). Instead make use of the
dispatch queue (infinite timeout, using mp_dispatch_interrupt()), which
handles the problem of having to wait both for dispatch queue updates
and VO uninit at the same time.
Preparation for the next commit. Until now, it was only needed if DR was
involved. One reason for not always creating it was that you normally
must not use it if advanced_control is not enabled. This is why e.g.
VOCTRL_SCREENSHOT now checks for that variable; it still can't use
ctx->dispatch if the render API user did not enable it.
render api needs to wait for vo to be destroyed before freeing the context.
The purpose of kill_cb is to wake up render api after vo is destroyed,
but uninit did that before kill_cb, so kill_cb tries using the freed
memory. Remove kill_cb to fix the issue as uninit is able to do the
work.
Equalizer control was redone in 03cf150ff3 (over 2 years
ago). Ever since, the equalizer control structs and the GET voctrl have
been unused. Only the SET voctrl is still used as notification mechanism
(actually a bad hack to avoid some further option change handling
complexity).
Remove the unused parts.
I was assuming posix_memalign was the most portable function to use, but
MinGW does not provide it for some reason. Switch to C11 aligned_alloc()
which someone suggested was provided by MinGW (but actually isn't,
someone probably confused it with the incompatible _aligned_malloc),
and add a configure check.
Even though it turned out that MinGW doesn't provide it, the function
is slightly more elegant than posix_memalign(), so stay with it.
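With the C11 function, a call site is just this (rounding up because C11 requires the size to be a multiple of the alignment; MP_ALIGN_UP as in mpv's common helpers):

    void *ptr = aligned_alloc(64, MP_ALIGN_UP(size, 64));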
skip-logo.lua is just what I wanted to have. Explanations are on the top
of that file. As usual, all documentation threatens to remove this stuff
all the time, since this stuff is just for me, and unlike a normal user
I can afford the luxury of hacking the shit directly into the player.
vf_fingerprint is needed to support this script. It needs to scale down
video frames as part of its operation. For that, it uses zimg. zimg is
much faster than libswscale and generates more correct output. (The
filter includes a runtime fallback, but it doesn't even work because
libswscale fucks up and can't do YUV->Gray with range adjustment.)
Note on the algorithm: seems almost too simple, but was suggested to me.
It seems to be pretty effective, although long time experience with
false positives is missing. At first I wanted to use dHash [1][2], which
is also pretty simple and effective, but might actually be worse than
the implemented mechanism. dHash has the advantage that the fingerprint
is smaller. But exact matching is too unreliable, and you'd still need
to determine the number of different bits for fuzzier comparison. So
there wasn't really a reason to use it.
[1] https://pypi.org/project/dhash/
[2] http://www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html
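The comparison itself is just a sum of absolute differences over a small grayscale downscale (sketch; dimensions and threshold illustrative):

    int d = 0;
    for (int i = 0; i < 16 * 16; i++)
        d += abs(a[i] - b[i]);   /* a, b: tiny gray fingerprints */
    bool match = d < threshold;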
Generally, using x86 SIMD efficiently (or crash-free) requires aligning
all data on boundaries of 16, 32, or 64 (depending on instruction set
used). 64 bytes is needed for AVX-512, 32 for old AVX, 16 for SSE. Both
FFmpeg and zimg usually require aligned data for this reason.
FFmpeg is very unclear about alignment. Yes, it requires you to align
data pointers and strides. No, it doesn't tell you how much, except
sometimes (libavcodec has a legacy-looking avcodec_align_dimensions2()
API function, that requires a heavy-weight AVCodecContext as argument).
Sometimes, FFmpeg will take a shit on YOUR and ITS OWN alignment. For
example, vf_crop will randomly reduce alignment of data pointers,
depending on the crop parameters. On the other hand, some libavfilter
filters or libavcodec encoders may randomly crash if they get the wrong
alignment. I have no idea how this thing works at all.
FFmpeg usually doesn't seem to signal alignment internally anywhere, and
usually leaves it to av_malloc() etc. to allocate with proper alignment.
libavutil/mem.c currently has a ALIGN define, which is set to 64 if
FFmpeg is built with AVX-512 support, or as low as 16 if built without
any AVX support. The really funny thing is that a normal FFmpeg build
will e.g. align tiny string allocations to 64 bytes, even if the machine
does not support AVX at all.
For zimg use (in a later commit), we also want guaranteed alignment.
Modern x86 should actually not be much slower at unaligned accesses, but
that doesn't help. zimg's dumb intrinsic code apparently randomly
chooses between aligned or unaligned accesses (depending on compiler, I
guess), and on some CPUs these can even cause crashes. So just treat the
requirement to align as a fact of life.
All this means that we should probably make sure our own allocations are
64 byte aligned. This still doesn't guarantee alignment in all cases, but
it's slightly better than before.
This also makes me wonder whether we should always override libavcodec's
buffer pool, just so we have a guaranteed alignment. Currently, we only
do that if --vd-lavc-dr is used (and if that actually works). On the
other hand, it always uses DR on my machine, so who cares.
Dear diary,
today I fixed a shitty bug that was all my fault because I made a
horrible mess. (Except it was a horrible mess before I even touched
this shit, but let's not blame others.)
Sometimes, updates to VO options that control video sizing (like
panscan) didn't update the screen correctly. They were delayed until the
next option change or so.
It turns out that if the option update happens at the "same" time as a
VOCTRL, update_opts() doesn't actually notify the vo_driver of the
change. This in turn happened because run_control() called
m_config_cache_update(). The latter function returns true if the options
changed since the last call, and update_opts() also calls it (on the
same config cache) for the same purpose. The update_opts() call, which
is triggered by a third mechanism, comes later, but the cache update
call will return false (as it should). Basically, given the config API,
you can't act differently on multiple update calls and expect it to
work. The skipped handling in update_opts() meant that the notification
required to apply the changed option wasn't run.
Fix this by simply calling update_opts() directly instead. Now there's
only 1 m_config_cache_update() call on this specific instance. Fix the
call in run_reconfig() too, so the previous sentence isn't a lie (but it
probably doesn't make a difference in practice due to certain details).
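A hedged sketch of the pitfall (variable and helper names are
illustrative, not the actual mpv code; m_config_cache_update() is mpv's
real internal API):

    struct m_config_cache *cache = get_the_vo_option_cache(); // illustrative

    // First caller consumes the change notification:
    if (m_config_cache_update(cache)) {
        // ... acts on the changed options ...
    }

    // A second caller on the same cache now gets false, even though it
    // never handled the change itself - so its handling is silently
    // skipped (the bug described above):
    if (m_config_cache_update(cache)) {
        // never reached for this change
    }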
I'm not sure how I even ran into this sort-of race condition. The VOCTRL
that messed up the option update was VOCTRL_UPDATE_PLAYBACK_STATE, which
happens semi-regularly.
Why this config cache shit and all the other shit? Rediscovering this
crap wasn't pleasant. It's a bunch of hacks that became necessary when
the ancient MPlayer architecture made it hard to move the VO to a
separate thread.
All the VO code typically accesses vo->opts (whose fields all used to be
global variables in MPlayer). The frontend changes these on user input.
Putting locking around all the options would be a nightmare, and keeping
a copy of the options in the thread was much simpler. You need a way to
propagate option changes, notify the thread, and update the local copy
too. And the result of these thoughts was the config cache mechanism.
In this specific case, the relevant cache update call in update_opts()
triggers a VOCTRL_SET_PANSCAN to the VO driver, which isn't related to
its former function anymore. Instead, it causes the VO driver to update
the video sizing/placing options, which the generic VO code can't do.
(Mostly because the VO driver includes the windowing stuff and is
responsible for resizing etc. itself.)
VOCTRLs sent by the frontend are even worse. MPlayer had no real runtime
option change mechanism. Some options were vaguely duplicated by
properties, so you could effectively change those options at runtime.
Each of these options had its own VOCTRL, which still exist today, e.g.
VOCTRL_FULLSCREEN, or VOCTRL_ONTOP. I tried to make all options runtime
changeable, and to unify properties with options. But I couldn't be
bothered with updating all VO drivers to listen to option changes
directly, because that would be pretty tedious. So the property code is
still all there and sends the old VOCTRLs. But of course you need to
sync up the options, which is why the run_control() code did that.
(Unrelated: VO_EVENT_FULLSCREEN_STATE is the worst shithack of them all.
Currently, only the frontend can actually write to options (for awful
reasons), so if the fullscreen state changes due to outside interaction,
the VO driver can't update the corresponding option fields. So the VO
notifies the frontend with said VO_EVENT_, and the frontend then sends
VOCTRL_GET_FULLSCREEN, and updates the global copy of the option with
the value returned by that. I still like to think the situation is not
that bad considering the monstrous effort of converting single-threaded
code that had hundreds of options in global variables to multi-threaded
code with no global variables at all.)
m_geometry_apply() will read and modify the dummy variable. It's not
actually used for anything, but valgrind will still warn about
uninitialized data. I'm not sure whether this was UB, but in any case
it's annoying when running valgrind.
--hwdec=auto-copy was preferring vdpau over vaapi. In the HEVC 10 bit
case, this also led to hardware decoding not being enabled. (Probably
because the probing can't start over after enabling hw decoding fails at
runtime, or something like that.)
Possible that this subtly breaks on some setups. You can't always win.
During probing on a system with AMD GPU, mpv used to output the
following messages if hardware decoding was enabled:
[ffmpeg] AVHWFramesContext: Failed to create surface: 2 (resource
allocation failed).
[ffmpeg] AVHWFramesContext: Unable to allocate a surface from internal
buffer pool.
This commit removes the messages, hopefully with no other side effects.
Long explanations follow, better don't read them, it's just tedious
drivel about the details. People should learn to write concise commit
messages, not drone on and on endlessly all while they have no fucking
point.
The code probes supported hardware pixel formats, and checks whether they
can be mapped as textures. av_hwdevice_get_hwframe_constraints() returns
a list of hardware pixel formats in the valid_sw_formats field (the "sw"
means software, but they're still hardware pixel formats, makes sense).
This contained the format yuv420p, even though this is not a valid
hardware format. Trying to create a surface of this type results in VA
surface creation failure, upon which FFmpeg prints the error messages
above. We'd be fine with this, except FFmpeg has a global log callback,
and there's no way to suppress these messages without creating other
issues.
It turns out that FFmpeg's vaapi implementation returns all formats from
vaQueryImageFormats() if no "hwconfig" is provided. This list includes
yuv420p, which is probably supported for surface upload/download, but
not as native format. Following FFmpeg's logic, it should not appear in
the valid_sw_formats list, because formats for transfers are returned by
another roundabout API.
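For reference, a hedged sketch of the probing step using the real FFmpeg
API (device_ref, hwconfig, and the per-format probe callback are
placeholders for whatever the caller has):

    #include <libavutil/hwcontext.h>

    // device_ref: the opened AV hardware device; hwconfig: optional
    // API-specific config (may be NULL); probe_format(): placeholder
    // for the per-format texture-mapping probe.
    static void probe_formats(AVBufferRef *device_ref, void *hwconfig,
                              void (*probe_format)(enum AVPixelFormat))
    {
        AVHWFramesConstraints *cts =
            av_hwdevice_get_hwframe_constraints(device_ref, hwconfig);
        // valid_sw_formats may be NULL - the API documents this, and
        // not checking it was the latent bug mentioned further below.
        if (cts && cts->valid_sw_formats) {
            // The list is terminated by AV_PIX_FMT_NONE.
            for (int n = 0; cts->valid_sw_formats[n] != AV_PIX_FMT_NONE; n++)
                probe_format(cts->valid_sw_formats[n]);
        }
        av_hwframe_constraints_free(&cts);
    }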
Idiotically, there doesn't seem to be any vaapi call that determines
whether a format is a valid surface format. All mechanisms to do this
are bound to a VAConfigID (= video codec or video processor), all while
the actual surface creation API strangely does not take a VAConfigID (a
big WTF).
Also, calling the vaCreateSurfaces() API ourselves for probing is out of
the question, because that function is utterly and idiotically complex.
Look at the FFmpeg code and how much effort it requires to set up a
complete set of attributes - we can't duplicate this.
So the only way left to do this is the most idiotic and tedious way:
enumerating all VAProfile (and VAEntrypoints) to create all possible
VAConfigIDs. Each of the VAConfigIDs is associated with a list of
formats, which FFmpeg can return (by passing the ID along with the
"hwconfig"), and which is probed separately.
Note that VAConfigID actually refers to a dynamic instance of something,
and creating a VAConfigID takes not only the VAProfile and the
VAEntrypoint, but also an arbitrary attribute array. In theory, this
means our attempt to get to know all possible configurations cannot
work, but in practice this attribute array seems to be pointless for
decoding and video processing, and FFmpeg doesn't use it (though the
encoding path does use it). This probably just makes it _barely_ OK to
do it this way.
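A hedged sketch of this enumeration using the actual libva calls (error
handling trimmed; the per-config probing itself is a placeholder):

    #include <stdlib.h>
    #include <va/va.h>

    static void probe_all_configs(VADisplay dpy)
    {
        int num_profiles = vaMaxNumProfiles(dpy);
        VAProfile *profiles = calloc(num_profiles, sizeof(*profiles));
        if (profiles &&
            vaQueryConfigProfiles(dpy, profiles, &num_profiles) == VA_STATUS_SUCCESS)
        {
            for (int p = 0; p < num_profiles; p++) {
                int num_eps = vaMaxNumEntrypoints(dpy);
                VAEntrypoint *eps = calloc(num_eps, sizeof(*eps));
                if (eps && vaQueryConfigEntrypoints(dpy, profiles[p], eps,
                                                    &num_eps) == VA_STATUS_SUCCESS)
                {
                    for (int e = 0; e < num_eps; e++) {
                        VAConfigID config;
                        if (vaCreateConfig(dpy, profiles[p], eps[e], NULL, 0,
                                           &config) == VA_STATUS_SUCCESS)
                        {
                            // ... pass config to FFmpeg via an
                            // AVVAAPIHWConfig and probe its formats ...
                            vaDestroyConfig(dpy, config);
                        }
                    }
                }
                free(eps);
            }
        }
        free(profiles);
    }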
Could we discard all this probing shit, and somehow do it another way?
Probably not. The EGL API for mapping surfaces doesn't even seem to
provide a way to enumerate supported formats, we may not even know
whether DRM/dmabuf interop is actually supported (AFAIR the EGL
extensions are present even if they don't work), nor do we know whether
the VAAPI driver supports this interop (not sure). So actually trying is
the only way.
Further, mpv initializes the decoder on another thread, where you
can't just access OpenGL state. This suckage is mostly to be blamed on
OpenGL itself and its crazy thread-boundedness. In theory, this could be
done anyway (see how software decoding "direct rendering" tries to get
around this). But to make it worse, the decoder never cares about the
list of supported formats determined by this code; instead,
f_autoconvert.c tries to deal with it and insert a video processor
(well, good luck with this crap, I bet it doesn't even work). So this
whole endeavor might be pointless, other than the fact that failed
probing can disable use of vaapi (which is correct and necessary). But
if you have a shovel, you don't use it to smash the flat end on the heap
of shit that's piled up before you, or do you?
While this method probably works, it's still orgasmically tedious. It
was tedious before: we had to create a real surface, create a GL
texture, map the surface with it, then destroy everything again. But the
added code is tedious on its own. Highlights include the need to malloc
a FFmpeg struct just to pass a single damn integer, the need to
enumerate "entrypoints" for each VA profile, even though all profiles
have exactly 1 entrypoint, and the kind of obnoxious way how vaapi
requires you to preallocate arrays for returned things, even though they
could, for example, reasonably be returned as immutable arrays or have some
other simpler API.
The main grand fuckup is of course that vaapi requires a VAConfigID to
query surface properties, but not for creating surfaces. This
awkwardness even affected the FFmpeg API design, which has a "hwconfig"
concept that is only used by vaapi (vaapi is only 1 out of 10 hardware
decoding APIs supported by the FFmpeg hwcontext stuff). Maybe I'm just
missing something. It's as if vaapi required setting radioactive shit on
fire. Look how clean the native D3D11 code is instead. (Even the ANGLE
code manages to avoid being this fucked up. Or the VDPAU code, despite
supporting multiple mapping methods.)
Another only barely related change is that the valid_sw_formats field
can be NULL, and the API explicitly documents this. Technically, the mpv
code was buggy for not checking this, although the FFmpeg implementation
could not actually return NULL as long as we still passed NULL for the
hwconfig parameter.
No functional changes, just preparation for the next commit. Split the
probing into multiple functions. Prepare for the yet unused possibility
to pass AVVAAPIHWConfig to probing. try_format_pixfmt() now assumes it
can be called multiple times with the same format, so it filters out
duplicate formats.
The format probing is now something like O(n^2) for n formats, but n
will most likely remain something under 50 or so.
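A hedged sketch of what such duplicate filtering can look like (names
invented; this shows the shape of the idea, not mpv's actual code):

    #include <stdbool.h>
    #include <libavutil/pixfmt.h>

    // Linear scan over the formats probed so far; called once per
    // (config, format) pair, hence roughly O(n^2) overall.
    static bool already_probed(const enum AVPixelFormat *probed, int num,
                               enum AVPixelFormat fmt)
    {
        for (int n = 0; n < num; n++) {
            if (probed[n] == fmt)
                return true;
        }
        return false;
    }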
I once created this because someone wanted to use vapoursynth without
the Python dependency. No idea if anyone ever actually used it. It's
sort of icky (it calls itself "lazy" to preempt complaints about how
much it sucks), and complicates the build process. Kill it.
It seems much more promising to have something like this:
https://github.com/vapoursynth/vapoursynth/issues/386
This would solve the build distribution problem by relaxing the Python
dependency, and/or allow a Lua backend to be included without pain.
Useless garbage.
This was once added to test whether vdpau presentation feedback could be
used. Results were always unsatisfactory, and now vdpau is dead.
The semantics are a bit questionable. This is done for the OSC (next
commit), and a comment added to the manpage explicitly states this.
Meaning this is probably garbage, and will need to be revisited when the
OSC changes and/or someone wants to use this margin feature for
something else.
Not sure about the subtitle thing. It's imaginable that someone uses
these options to create empty borders for subtitles on the bottom, so
subtitles should be located there. On the other hand, this gives a
rather unpolished user experience when using the (later added) OSC
feature to not overlap with the video. There's not much of a point if
the OSC still overlaps the video. However, I'm too lazy to think about
this, so it stays like it is.
--video-margin-ratio-left=0.2 --video-margin-ratio-right=0.9 (added in
the next commit) will set f_w to inf, resulting in some garbage being
propagated. Later, the OSD margins are computed from values before
various sanity clamping is applied, which makes libass suffer from
bullshit values.
I'm fairly sure it's OK and more correct to compute the OSD margins
using the later values, but I'm not completely certain.
See manpage additions. This is a huge hack. You can bet there are shit
tons of bugs. It's literally forcing square pegs into round holes.
Hopefully, the manpage wall of text makes it clear enough that the whole
shit can easily crash and burn. (Although it shouldn't literally crash.
That would be a bug. It possibly _could_ start a fire by entering some
sort of endless loop, not a literal one, just something where it tries
to do work without making progress.)
(Some obvious bugs I simply ignored for this initial version, but
there's a number of potential bugs I can't even imagine. Normal playback
should remain completely unaffected, though.)
How this works is also described in the manpage. Basically, we demux in
reverse, then we decode in reverse, then we render in reverse.
The decoding part is the simplest: just reorder the decoder output. This
weirdly integrates with the timeline/ordered chapter code, which also
has special requirements on feeding the packets to the decoder in a
non-straightforward way (it doesn't conflict, although a mess of bugs
breaks correct slicing of segments, so EDL/ordered chapter playback is
broken in the backward direction).
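A heavily hedged sketch of the "reorder the decoder output" idea (all
names invented; real code works on keyframe-delimited segments and has
to deal with allocation, references, and errors):

    #define MAX_SEGMENT_FRAMES 256

    struct frame;
    struct frame *decode_next_frame(void);  // illustrative
    void output_frame(struct frame *f);     // illustrative

    static void play_segment_backward(void)
    {
        struct frame *buf[MAX_SEGMENT_FRAMES];
        int num = 0;

        // Decode forward through one keyframe-to-keyframe segment...
        while (num < MAX_SEGMENT_FRAMES && (buf[num] = decode_next_frame()))
            num++;

        // ...then hand the frames out in reverse order.
        for (int i = num - 1; i >= 0; i--)
            output_frame(buf[i]);
    }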
Backward demuxing is pretty involved. In theory, it could be much
easier: simply iterating the usual demuxer output backward. But this
just doesn't fit into our code, so there's a cthulhu nightmare of shit.
To be specific, each stream (audio, video) is reversed separately. At
least this means we can do backward playback within cached content (for
example, you could play backwards in a live stream; on that note, it
disables prefetching, which would lead to losing new live video, but
this could be avoided).
The fuckmess also meant that I didn't bother trying to support
subtitles. Subtitles are a problem because they're "sparse" streams.
They need to be "passively" demuxed: you don't try to read a subtitle
packet, you demux audio and video, and then look whether there was a
subtitle packet. This means to get subtitles for a time range, you need
to know that you demuxed video and audio over this range, which becomes
pretty messy when you demux audio and video backwards separately.
Backward display is the weirdest (and potentially buggiest) part. To
avoid having to touch a LOT of timing code, we negate all timestamps.
The basic idea is that due to the negation, all comparisons and
subtractions of timestamps keep working, and you don't need to touch
every single one of them to "reverse" them.
E.g.:

    bool before = pts_a < pts_b;

would need to be:

    bool before = forward
        ? pts_a < pts_b
        : pts_a > pts_b;

or:

    bool before = pts_a * dir < pts_b * dir;

or, as it's implemented now, you just do this after decoding:

    pts_a *= dir;
    pts_b *= dir;

and then in the normal timing/renderer code:

    bool before = pts_a < pts_b;
Consequently, we don't need many changes in the latter code. But some
assumptions inherently true for forward playback may have been broken
anyway. What is mainly needed is fixing places where values are passed
between positive and negative "domains". For example, seeking and
timestamp user display always uses positive timestamps. The main mess is
that it's not obvious which domain a given variable should or does use.
Well, in my tests with a single file, it suddenly started to work when I
did this. I'm honestly surprised that it did, and that I didn't have to
change a single line in the timing code past the decoder (just something
minor to make external/cached text subtitles display). I committed it
immediately while avoiding thinking about it. But there really likely
are subtle problems of all sorts.
As far as I'm aware, gstreamer also supports backward playback. When I
looked at this years ago, I couldn't find a way to actually try this,
and I didn't revisit it now. Back then I also read talk slides from the
person who implemented it, and I'm not sure if and which ideas I might
have taken from it. It's possible that the timestamp reversal is
inspired by it, but I didn't check. (I think it claimed that it could
avoid large changes by changing a sign?)
VapourSynth has some sort of reverse function, which provides a backward
view on a video. The function itself is trivial to implement, as
VapourSynth aims to provide random access to video by frame numbers (so
you just request decreasing frame numbers). From what I remember, it
wasn't exactly fluid, but it worked. It's implemented by creating an
index, and seeking to the target on demand, and a bunch of caching. mpv
could use it, but it would either require using VapourSynth as demuxer
and decoder for everything, or replacing the current file every time
something is supposed to be played backwards.
FFmpeg's libavfilter has reversal filters for audio and video. These
require buffering the entire media data of the file, and don't really
fit into mpv's architecture. It could be used by playing a libavfilter
graph that also demuxes, but that's like VapourSynth but worse.
This one is probably not terribly obvious from just the valgrind log,
but a wayland dev explained it to me just a second ago. Whenever mpv
dispatches events from the compositor with wl_display_dispatch, wayland
internally allocates memory for a struct wl_proxy object if a new id is
found. Quite a few more things happen to that proxy object, but
eventually mpv stores the data on the client side in a wrapper struct
(struct wl_data_offer). mpv's data_device_listener keeps track of those
proxies and frees the memory when appropriate. Of course, mpv keeps
dispatching events like this until the user quits the player. What
happens here is that one final wl_display_dispatch is called
right before the user quits the player and before mpv's
data_device_listener can handle that object. So the result is that you
always have one extra dangling proxy that doesn't get properly freed.
The solution is to just simply call wl_data_offer_destroy before closing
the wl_display to free that final dangling wl_proxy.
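A hedged sketch of the teardown order (the wl struct and its field
names are illustrative; the wayland calls are the real client API):

    // Destroy the last dangling wl_data_offer proxy before tearing
    // down the display connection, so no proxy is left allocated.
    if (wl->data_offer)
        wl_data_offer_destroy(wl->data_offer);
    wl_display_disconnect(wl->display);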
Extending the client-allocated mpv_opengl_drm_params struct
constituted a break of ABI that could cause UB.
Create a clean break by deprecating "drm_params" and related structs
and enum values, and replacing it with "drm_params_v2".
Also fix some comments and code that wrongly assumed that open could
return any other negative number than -1 for failure.
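For reference, the correct check: per POSIX, open() signals failure with
exactly -1, never some other negative value.

    #include <fcntl.h>

    int fd = open(path, O_RDWR | O_CLOEXEC);
    if (fd == -1) {
        // handle the error; checking "fd < 0" also works, but code
        // must not assume other negative values can be returned
    }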
This commit updates the libmpv version to 1.104
Like hwdec_cuda, you get a big #ifdef mess if you try and keep the
OpenGL and Vulkan interops in the same file. So, I've refactored
them into separate files in a similar way.
Originally, vo_gpu/vo_opengl considered the case of Nvidia proprietary
drivers, which required vdpau/GLX, and Intel open source drivers, which
require vaapi/EGL. Since window creation and GPU context creation are
inseparable in mpv's internal API, it had to pick the correct API very
early, or hardware decoding wouldn't work. "x11probe" was introduced for
this reason. It created a GLX context (without showing the window yet),
and checked whether vdpau was available. If yes, it used GLX, if not, it
continued probing x11/EGL. (Obviously it couldn't always fail on GLX
without vdpau, which is why it was a separate "probe" backend.)
Years passed, and now the situation is different. Vdpau is dead. Nvidia
drivers and libavcodec now provide CUDA interop, which requires EGL, and
fixes some of the vdpau problems. AMD drivers now provide vaapi, which
generally works better than vdpau. Intel didn't change.
In particular, vaapi provides working HEVC Main10 support. In theory, it
should work on vdpau too, with quality reduction (no 10 bit surfaces),
but I couldn't get it to work.
So always prefer EGL. And suddenly hardware decoding works. This is
actually rather important, because HEVC is unfortunately on the rise,
despite shitty encoders and unoptimized decoders. The latter may mean
that hardware decoding works better than libavcodec.
This should have been done a long, long time ago.
In some cases, src.sig_peak remains undefined as 0, which was definitely
the case when using the OSD, since it never got passed through the usual
color space normalization process. The most robust work-around is to
simply force the normalization at the site where it's needed. This
ensures the value is always valid and defined, making the peak-dependent
logic in these two functions always work.
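A hedged sketch of what forcing the normalization can look like,
assuming mpv's existing mp_trc_nom_peak() helper and the mp_colorspace
field names (not necessarily the exact code of this commit):

    // If sig_peak was never set (still 0), derive the nominal peak
    // from the transfer function before any peak-dependent logic.
    if (src.sig_peak <= 0.0)
        src.sig_peak = mp_trc_nom_peak(src.gamma);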
Fixes 4b25ec3a9d
Fixes #6917
Fixes #6918