For the sake of vaapi interop, we want to use EGL, but on the other
hand, but because driver developers are full of shit, vdpau interop will
not work on EGL (even if the driver supports EGL). The latter happens
with both nvidia and AMD Mesa drivers.
Additionally, EGL vaapi interop support can apparently only detected at
runtime by actually using it. While hwdec_vaegl.c already does this, it
would require initializing libva on _every_ system, which will cause
libav to print an unpreventable bullshit message to the terminal.
Try to counter these huge loads of bullshit by adding more fucking
bullshit.
We want the following behavior:
- VO probed, backend probed: only accept non-sw, fail completely
otherwise
- VO forced, backend probed: use the first non-sw, or if none is found,
fall back to the first working sw backend
- VO probed, backend forced: (I don't care about this case)
- VO forced, backend forced: just use that backend
Also, on backend probe failure the vo->probed field was left in its old
state.
This adds support for the progress indicator taskbar extension
that was introduced with Windows 7 and Windows Server 2008 R2.
I don’t like this solution because it keeps its own state and
introduces another VOCTRL, but I couldn’t come up with anything
less messy.
closes#2399
In the display-sync, non-interpolation case, and if the display refresh
rate is higher than the video framerate, we duplicate display frames by
rendering exactly the same screen again. The redrawing is cached with a
FBO to speed up the repeat.
Use glBlitFramebuffer() instead of another shader pass. It should be
faster.
For some reason, post-process was run again on each display refresh.
Stop doing this, which should also be slightly faster. The only
disadvantage is that temporal dithering will be run only once per video
frame, but I can live with this.
One aspect is messy: clearing the background is done at the start on the
target framebuffer, so to avoid clearing twice and duplicating the code,
only copy the part of the framebuffer that contains the rendered video.
(Which also gets slightly messy - needs to compensate for coordinate
system flipping.)
Currently, vo.c will always continue to render the currently queued
frame, which sets last_flip, which in turn confuses vo_get_delay(),
which in turn will show a bogus A/V desync message on unpause. So just
reset it again on unpause.
I guess the removed code is an old leftover, and makes no sense anymore.
Should fix weird A/V diff dropouts when frames are being dropped with
display-sync.
If the player sends a frame with duration==0 to the VO, it can trivially
underrun. Don't panic, but keep the correct time.
Also, returning the absolute time from vo_get_next_frame_start_time()
just to turn it into a float with relative time was silly. Rename it and
make it return what the caller needs.
"Missed" implies the frame was dropped, but what really happens is that
the following frame will be shown later than intended (due to the
current frame skipping a vsync).
(As of this commit, this property is still inactive and always
returns 0. See git blame for details.)
Apparently Windows treats windows that use OpenGL, cover an entire
screen and have the WS_POPUP style set or are topmost windows as
exclusive fullscreen windows that bypass DWM and cannot be covered
by other windows.
This means we can’t use dwmflush in fullscreen mode, and it also
means that no other window can cover mpv, and it makes the screen
flicker when switching to fullscreen mode.
This can be avoided by not setting the WS_POPUP flag.
Users can still access the old behavior by enabling stay-on-top
(which IMO at least makes sense—now we just need to get dwmflush
autodetection right to avoid nasty surprises).
fixes#2177
Until now, we've relied on the following things:
- you can send flush packets to the decoder even if it's fully flushed,
- you can send new packets to a flushed decoder,
- you can send new packers to a partially flushed decoder.
("flushing" refers to sending flush packets to the decoder until the
decoder does not return new pictures, not avcodec_flush_buffers().)
All of these are questionable. The libavcodec API probably doesn't
guarantee that these work well or at all, even though most decoders have
no issue with these. But especially with hardware decoding wrappers
(like MMAL), real problems can be expected. Isolate us from these corner
cases by handling them explicitly.
This applies to unexpected freezes or deadlocks, not e.g. normal
framedrops. The verbose messages also might remind an API user if the
API usage is incorrect, such as not calling mpv_opengl_cb_draw() when a
redraw request was issued.
The nnedi3 prescaler requires a normalized range to work properly,
but the original implementation did the range normalization after
the first step of the first pass. This could lead to severe quality
degradation when debanding is not enabled for NNEDI3.
Fix this issue by passing `tex_mul` into the shader code.
Fixes#2464
vo_opengl_cb is a special case, because we somehow have to render video
asynchronously, all while "trusting" the API user to do it correctly.
This didn't quite work, and a while ago a compromise using a timeout to
prevent theoretically possible deadlocks was added.
Make it even more synchronous. Basically, go all the way, and
synchronize rendering between VO and user renderer thread to the
full extent possible.
This means the silly frame queue is dropped, and we event attempt to
synchronize the GL SwapBuffer call (via mpv_opengl_cb_report_flip()).
The changes introduced with commit dc33eb56 are effectively dropped. I
don't even remember if they mattered.
In the future, we might make all VOs fetch asynchronously from a frame
queue, which would mostly remove the differences between vo_opengl and
vo_opengl_cb, but this will take a while (if it will even be done).
Pick the correct GLSL version from the GL_SHADING_LANGUAGE_VERSION
string. Might be somewhat questionable, as we expect the minor version
number not to have leading 0s.
Should help with cases when the reported GLSL version is much higher
than the equivalent of the reported GL version. This problem was
observed in combination with GL_ARB_uniform_buffer_object, which
can't be used if the declared GLSL version is too low.
Simplifies some auto detection matters.
I _still_ don't want to remove the lazy loading mechanism, because it's
still slightly useful for filters using the hwdec APIs. My main
motivation for not always preloading them is actually that libva prints
random useless crap to the terminal with no way to prevent this.
Notes:
- Unfortunately the only way to talk to EGL from within DRM I could find
involves linking with GBM (generic buffer management for Mesa.)
Because of this, I'm pretty sure it won't work with proprietary NVidia
drivers, but then again, last time I checked NVidia didn't offer
proper screen resolution for VT.
- VT switching doesn't seem to work at all. It's worth mentioning that
using vo_drm before introduction of VT switcher had an anomaly where
user could switch to another VT and input text to it, while video
played on top of that VT. However, that isn't the case with drm_egl:
I can't switch to other VT during playback like this. This makes me
think that it's either a limitation coming from my firmware or from
EGL/KMS itself rather than a bug with my code. Nonetheless, I still
left (untestable) VT switching code in place, in case it's useful to
someone else.
- The mode_id, connector_id and device_path should be configurable for
power users and people who wish to watch videos on nonprimary screen.
Unfortunately I didn't see anything that would allow OpenGL backends
to register their own set of options. At the same time, adding them to
global namespace is pointless.
- A few dozens of lines could be shared with vo_drm (setting up VT
switching, most of code behind page flipping). I don't have any strong
opinion on this.
- Sometimes I get minor visual glitches. I'm not sure if there's a race
condition of some sort, unitialized variable (doubtful), or if it's
buggy driver. (I'm using integrated Intel HD Graphics 4400 with Mesa)
- .config and .control are very minimal.
Signed-off-by: wm4 <wm4@nowhere>
glXCreateContextAttribsARB() by design can throw some X11 errors. We
ignore these, but we generally still print error messages to the
terminal. This was confusing/annoying users, so silence it. The stupid
part is that the Xlib error handler is global, so we have to be slightly
careful here.
They are evil and should be eradicated. Some of these were pretty dumb
anyway.
There are probably some more around in platform specific code or other
code not enabled by default on Linux.
This is based on an older patch by James Ross-Gowan. It was rebased and
cleaned up. Also, the DWM API usage present in the older patch was
removed, because DWM reports nonsense rates at least on Windows 8.1
(they are rounded to integers, just like with the old GDI API - except
the GDI API had a good excuse, as it could report only integers).
Signed-off-by: wm4 <wm4@nowhere>
This simplifies update_screen_rect a bit. Unless --fs-screen=all is
used, it will always get an HMONITOR and call GetMonitorInfo to
determine its dimensions. This will make it easier for the next few
commits to determine the colour profile and the refresh rate from the
HMONITOR.
There is a slight change in behaviour. When selecting a screen that is
out of range, such as --screen=9 on a machine with only two monitors,
the old code would silently select the last existing monitor. The new
code prints an error message and falls back to the default screen (same
as the Cocoa code.)
Signed-off-by: wm4 <wm4@nowhere>
The call to EnumDisplaySettings seems to be a relic from when MPlayer
ran on systems that didn't have GetMonitorInfo or SM_CX/CYVIRTUALSCREEN.
GetMonitorInfo was loaded dynamically, so it was possible for MPlayer to
run without it and use the values returned by EnumDisplaySettings.
These are always present in modern versions of Windows, so the values
returned from EnumDisplaySettings are always overwritten. Remove the
call to EnumDisplaySettings and assume SM_CX/CYVIRTUALSCREEN is always
present.
Signed-off-by: wm4 <wm4@nowhere>
Commit 27dc834f added it as such.
Also remove the check for glUniformBlockBinding() - it's part of an
extension, and the check glGetUniformBlockIndex() already checks whether
the extension is fully available.
Implement NNEDI3, a neural network based deinterlacer.
The shader is reimplemented in GLSL and supports both 8x4 and 8x6
sampling window now. This allows the shader to be licensed
under LGPL2.1 so that it can be used in mpv.
The current implementation supports uploading the NN weights (up to
51kb with placebo setting) in two different way, via uniform buffer
object or hard coding into shader source. UBO requires OpenGL 3.1,
which only guarantee 16kb per block. But I find that 64kb seems to be
a default setting for recent card/driver (which nnedi3 is targeting),
so I think we're fine here (with default nnedi3 setting the size of
weights is 9kb). Hard-coding into shader requires OpenGL 3.3, for the
"intBitsToFloat()" built-in function. This is necessary to precisely
represent these weights in GLSL. I tried several human readable
floating point number format (with really high precision as for
single precision float), but for some reason they are not working
nicely, bad pixels (with NaN value) could be produced with some
weights set.
We could also add support to upload these weights with texture, just
for compatibility reason (etc. upscaling a still image with a low end
graphics card). But as I tested, it's rather slow even with 1D
texture (we probably had to use 2D texture due to dimension size
limitation). Since there is always better choice to do NNEDI3
upscaling for still image (vapoursynth plugin), it's not implemented
in this commit. If this turns out to be a popular demand from the
user, it should be easy to add it later.
For those who wants to optimize the performance a bit further, the
bottleneck seems to be:
1. overhead to upload and access these weights, (in particular,
the shader code will be regenerated for each frame, it's on CPU
though).
2. "dot()" performance in the main loop.
3. "exp()" performance in the main loop, there are various fast
implementation with some bit tricks (probably with the help of the
intBitsToFloat function).
The code is tested with nvidia card and driver (355.11), on Linux.
Closes#2230
Add the Super-xBR filter for image doubling, and the prescaling framework
to support it.
The shader code was ported from MPDN extensions project, with
modification to process luma only.
This commit is largely inspired by code from #2266, with
`gl_transform_trans()` authored by @haasn taken directly.
The noframe event is logged whenever there is no new frame. This can
happen due to normal redraws, but also due to video frame queue
underflow.
The mpv_opengl_cb_report_flip() API function is currently pretty
useless, because blocking on the video frame queue is more reliable and
simpler. But at least we can log the actual vsync.
next_vsync/prev_vsync was only used to retrieve the vsync duration. We
can get this in a simpler way.
This also removes the vsync duration estimation from vo_opengl_cb.c,
which is probably worthless anyway. (And once interpolation is made
display-sync only, this won't matter at all.)
This affects only the display-sync code path, as for normal timing the
wakeup_pts stuff handles proper wakeup. It's probably mostly a
theoretical issue.
A hw decoder might fail to decode a frame for multiple reasons, and not
always just because decoding is impossible. We can't generally
distinguish these reasons well. Make it more tolerant by accepting
failures of 3 frames, but not more. The threshold can be adjusted by the
repurposed --vd-lavc-software-fallback option.
(This behavior was suggested much earlier in some PR, but at the time
the "proper" hwdec fallback was indistinguishable from decoding error.
With the current situation, "proper" fallback is still instantious.)
The uninit() function was called twice if the uninit() function failed
(once by init(), once by vd_lavc.c code), which caused crashes due to
double-free. (This failure is a corner case, and all other hwdec
backends appear to handle this case gracefully.)
I do not think this code should be able to deal with uninit() being
called other than once. Guarantee that it's called exactly once.
Quoting MSDN: "Notifies the Desktop Window Manager (DWM) to opt in to or
out of Multimedia Class Schedule Service (MMCSS) scheduling while the
calling process is alive.". Whatever this means. (An application can
change the scheduling priority of the window manager?)
Does this improve anything? I have no idea. Certainly this is a program
that does multimedia and graphics, so we seem to be a good match for
this.
Is it bad if we enable this even while playback is inactive or paused? I
have no idea either.
Is there a magic cargo cult function that will mark our renderer thread
as multimedia thing? I have no idea. (We use a function to enable MMCSS
for our audio thread in ao_wasapi.)
Enable it by default, but not unconditionally. Add an "auto" mode, which
disable DwmFlush if the compositor is (probably) inactive. Let's see how
this goes.
Since I accidentally enabled DwmFlush always by default (more or less)
in a previous commit touching this code, this is probably mostly just
cargo-culting, and it's uncertain whether it does anything.
Note that I still got bad vsync behavior when fullscreening mpv, and
making another window visible on the same screen. This happens even if
forcing DWM.
Commit acd5816a broke this. It was stopping playback occasionally.
Another case where the non-display-sync interpolation mode
(in->vsync_timed==true) is causing a lot of subtle issues and will be
removed soon.
PAL8 is the only format that is RGB, has only 1 component, is byte-
aligned. It was accidentally detected by the GBRP case as planar RGB.
(It would have been ok if it were gray; what ruins it is that it's
actually paletted, and the color values do not correspond to colors (but
palette entries).
Pseudo-pal formats are ok; in fact AV_PIX_FMT_GRAY is rightfully marked
as MP_IMGFLAG_YUV_P.
Regression since commit 93db4233. I think the bit that was forgotten
here was to remove the vo_w32_config() return value completely. The VO
failed to init because that function always returned 0. This commit
removes these bits and fixes the VO.
Fixes#2434.
Yet another relatively useless option that tries to make OpenGL's sync
behavior somewhat sane. The results are not too encouraging. With a
value of 1, vsync jitter is gone on nVidia, but there are frame drops
(less than with glfinish). With 2, I get the usual vsync jitter _and_
frame drops.
There's still some hope that it might prevent too deep queuing with some
GPUs, I guess.
The timeout for the wait call is 1 second. The value is pretty
arbitrary; it should just not be too high to freeze the process (if
the GPU is un-nice), and not too low to trigger the timeout in normal
cases, even if the GPU load is very high. So I guess 1 second is ok
as a timeout.
The idea to use fences this way to control the queue depth was stolen
from RetroArch:
df01279cf3/gfx/drivers/gl.c (L1856)
Commit a1315c76 broke this slightly. Frame drops got counted multiple
times, and also vo.c was actually trying to "render" the dropped frame
over and over again (normally not a problem, since frames are always
queued "tightly" in display-sync mode, but could have caused 100% CPU
usage in some rare corner cases).
Do not repeat already dropped frames, but still treat new frames with
num_vsyncs==0 as dropped frames. Also, strictly count dropped frames in
the VO. This means we don't count "soft" dropped frames anymore (frames
that are shown, but for fewer vsyncs than intended). This will be
adjusted in the next commit.
vo_frame.num_vsyncs can be != 1 in some cases in normal sync mode too.
This is not a very exact fix, but in exchange it's robust. (These
vo_frame flags are way too tricky in combination with redrawing and
such.)
There were occasional shader compilation and rendering failures if FBOs
were unavailable. This is caused by the FBO caching code getting active,
even though FBOs are unavailable (i.e. dumb-mode).
Boken by commit 97fc4f.
Fixes#2432.
Fixes linker failure. How did this ever work? Apparently it did most of
the time, but apparently we just got the first case where it didn't.
Fixes#2433.
av_free_packet() got finally deprecated. Use av_packet_unref() instead,
which has almost the same semantics, has existed for a while, and is
available in all FFmpeg and Libav versions we support.
This was not very reliable.
In the normal vo_opengl case, this didn't deal well enough with vsync
jitter. Vsync timings can jitter quite extremely, up to a whole vsync
duration, in which case the "missed" frame counter keeps growing, even
though nothing is wrong. This behavior also messes up the A/V difference
calculation, but as long as it's within tolerance, it won't provoke
extra frame dropping/repeating. Real misses are harder to detect, and I
might add such detection later.
In the vo_opengl_cb case, this was additionally broken due to the
asynchronity between renderer and VO threads.
This speeds up redraws considerably (improving eg. <60 Hz material on a 60 Hz
monitor with display-sync active, or redraws while paused), but slightly
slows down the worst case (eg. video FPS = display FPS).
The IME is not useful for key-bindings. Handle the base ASCII chars
instead and don't show the IME window. For the sake of libmpv users, the
IME should only be disabled on mpv's GUI thread and not application-
wide.
No IME on the GUI thread should also mean that VK_PROCESSKEY will never
have to be handled, so the logic for that can be removed as well.
Older systems have certain EGL extension definitions missing. We
redefine them to make the build system easier, and because it's trivial.
But we forgot to define the EGL_LINUX_DMA_BUF_EXT identifier. (I hope
it's the only missing one.)
The equalizer code as it exists in vo_opengl works perfectly fine. The
situation in vo_opengl_cb is pretty different. The playback thread can't
communicate with the renderer thread synchronously (essentially to give
the API user more flexibility). So the equalizer communication has to be
done in an asynchronous way too.
There were two problems. First, the eq capabilities can change with the
pixel format, and the renderer initializes them on config only. This
means equalizers were disabled on the first config run, and options like
--video-output-levels or --brightness would not work. So we just
initialize the caps with a known superset. The player will not correctly
indicate when setting an eq doesn't work, but we're fine with it, as it
is a relatively cosmetic issue.
Second, it copied back the eq settings in the "wrong" moment (what
for?), which overwrote the settings in some cases.
Third, the eq was not reset correctly on vo init. This is needed to make
it behave the same as vo_opengl.
It's great that the new algorithm supports multiple placebo iterations
and all, but it's really not necessary and hurts performance in the
general case for the sake of the 0.1% that actually pause the screen
and look for minute differences.
Signed-off-by: wm4 <wm4@nowhere>
This reverts commit c10fb4ce9f.
This is already done in vo_wayland.c:resize,324 doing it here makes the window bigger before the video resizes showing a black area while dragging the border.
The previous commit moved the av_frame_unref() after the got_picture
check. This accidentally also deferred the software fallback
reinitialization to until a software picture was decoded (instead of the
exact time of the fallback), which is not ideal.
Just rely on the fact that calling av_frame_unref() on a frame is ok
even if nothing was decoded.
Commit 12cd48a8 started setting the hwdec_failed field even if hwdec was
not active, and because it also checked this field even if hwdec was not
active, broke decoding forever.
Fix this, and also avoid a memory leak or API misuse by releasing the
decoded picture. Passing an unreleased frame to the decoder has as far
as I know no defined effects.
Adds support for AV_PIX_FMT_GBRP9, AV_PIX_FMT_GBRP10, AV_PIX_FMT_GBRP12,
AV_PIX_FMT_GBRP14, AV_PIX_FMT_GBRP16, AV_PIX_FMT_GBRAP, and
AV_PIX_FMT_GBRAP16.
(Not that it matters, because nobody uses these anyway.)
This VA_FOURCC isn't even defined by latest drivers, so I'm just
assuming it doesn't exist and never existed. For planar 4:2:0,
VA_FOURCC_YV12 is normally preferred, and there's even a VA_FOURCC_IYUV
for 4:2:0 with unswapped planes.
0.34 and 0.35 don't have the buffer API, such as vaAcquireBufferHandle.
This is only needed for the EGL interop, but why bother staying
compatible for such old things (0.36 was released over a year ago).
We also can drop some minor compatibility ifdeffery.
When seeking, the current frame will have the wrong timestamp, until the
seek is done and a new frame from the seek target is shown. We still use
the "old" frame for redrawing during seeks. This frame has to be
explicitly marked with still=true in order not to confuse the
interpolation queue. If this is not done, it will ignore frames until a
frame with approximately the same timestamp as the "old" frame is
reached. (Does this mean interpolation handles timestamp resets
incorrectly? I have no idea.)
Of course we also have to clear possibly queued frames on seeks.
Also, in pausing, explicitly let the frame redraw.
Newer nVidia drivers support EGL, but they seem to work badly,
apparently don't support some needed features or not in a form we want
(such as swap control), and vdpau interop is not available. Disable it
by default, because I'm tired of explaining this issue.
Can be reverted as soon as nVidia release working drivers.
The libavcodec h264 decoder contains some idiotic code with unknown
purpose (no sample or explanation known that necessitates its
existence), that causes the AVCodecContext.get_format callback to be
invoked at a time when hwaccels can't be initialized. By definition, the
get_format callback is supposed to initialize hwaccels (another idiotic
thing now part of the API, but different story). This causes hwdec
initialization sometimes to fail (WolfensteinTwitch.mp4): the first
get_format callback will mark it as failed, so the second get_format
(the "proper" normal one) will not bother restoring the state, and hwdec
init fails.
While this should be fixed in libavcodec (good luck with that), it's
quite easy to workaround.
This fixes a regression since commit f4d62da8. The original code run
vo_cocoa_config_window() once without creating the window, which had the
effect that the last part of the function was run at least once before
the actual window was created. Fix the regression by moving it to before
the window is created.
The regression itself is hard to describe. One test case: start mpv from
a fullscreened terminal window. It should switch to another desktop,
with the mpv window visible. This didn't happen anymore.
Use the first encountered packet PTS/DTS as base, instead of the last
one. This does not add the amount of frames buffered in the codec to the
PTS offset, and thus is better.
Also, don't add the frame time if there was no decoded frame yet. The
first frame should obviously have the timestamp of the first packet
(going by this heuristic).
While b-frame reordering limits the maximum required number to around
16, the number of additionally buffered frames can be much higher.
Guess when this actually matters? (For the libavcodec MMAL wrapper.)
Useless. Sometimes it might be useful to make some extremely broken
files work, but on the other hand --no-correct-pts is sufficient for
these cases.
While we still need some of the code for AVI, the "auto" mode in
particular inflated the size of the code.
This can't be handled correctly at all. Other cases when the decoder
might drop a frame (such as completely failing to decode a frame) will
shift timestamps by a frame, and it can't be avoided.
While we could maybe find a better way to handle this with libavcodec's
main decoders, this seems to be much harder if it should work with
certain HW decoders, which don't passthrough the DTS field (such as
MMAL). Another problem are .avi files with b-frames. So just leave it
as it is.
This was used only by the timestamp sorting code, which is a fallback
for avi files (as well as avi-muxed mkv files). This was supposed to
prevent accumulating timestamps in case the decoder consumes more
packets than it outputs frames (i.e. frames are dropped). This didn't
work very well (timestamps could be off by a large amount), the
estimation of the delay was fragile, and the interdependencies with the
decoder were annoying, so kill it.
This parameter has been unused for years (the last flag was removed in
commit d658b115). Get rid of it.
This affects the general VO API, as well as the vo_opengl backend API,
so it touches a lot of files.
The VOFLAGs are still used to control OpenGL context creation, so move
them to the OpenGL backend code.
This gets rid of an old hack, VOFLAG_HIDDEN. Although handling of it has
been sane for a while, it used to cause much pain, and is still
unintuitive and weird even today.
The main reason for this hack is that OpenGL selects a X11 Visual for
you, and you're supposed to use this Visual when creating the X window
for the OpenGL context. Which means the X window can't be created early
in the common X11 init code, but the OpenGL code needs to do something
before that. API-wise you need separate functions for X11 init and X11
window creation. The VOFLAG_HIDDEN hack conflated window creation and
the entrypoint for resizing on video resolution change into one
function, vo_x11_config_vo_window(). This required all platform backends
to handle this flag, even if they didn't need this mechanism.
Wayland still uses this for minor reasons (alpha support?), so the
wayland backend must be changed before the flag can be entirely removed.
If interpolation is enabled, then this causes heavy artifacts if done
while unpaused. It's preferable to allow a latency of a few frames for
the change to take full effect instead. If this is done paused, the
frame is fully redrawn anyway.
This reverts commit d11184a256.
Unfortunately, there was a lot of unexpected resistance.
Do note that this is still extremely slow, crappy, etc.
Note that vo_x11.c was further edited. Compared to the removed vo_x11.c,
an additional ~200 lines of code was removed in order to simplify it. I
tried to strip it down as much as possible. In particular, support for
odd non-32 bit formats (24, 16, 15, 8 bit) is dropped.
Closes#2300.
The vf_format suboption is replaced with --video-output-levels (a global
option and property). In particular, the parameter is removed from
mp_image_params. The mechanism is moved to the "video equalizer", which
also handles common video output customization like brightness and
contrast controls.
The new code is slightly cleaner, and the top-level option is slightly
more user-friendly than as vf_format sub-option.
This essentially reverts commit 009dfbe3. FFmpeg VideoToolbox support
is being wacky, and can cause major issues, such as not being able
to decode a single frame. (E.g. by playing a .ts file. This should be
fixed in FFmpeg eventually.)
This is not a straight revert of the commit; just a functional one. We
keep the slightly simpler code structure.
It doesn't deal with VDA at all anymore. Rename it to hwdec_osx.c. Not
using hwdec_videotoolbox.c, because that would give it the longest
source path in this project yet. (Also, this code isn't even
VideoToolox-specific, other than the name of the pixel format used.)
VideoToolbox is preferred. Now that FFmpeg released 2.8, there's no
reason to support VDA anymore. In fact, we had a bug that made VDA not
useable with older FFmpeg versions in some newer mpv releases.
VideoToolbox is supported even on slightly older OSX versions, and if
not, you still can run mpv without hw decoding.
Definitely not needed anymore, and fixes a crash in some weird corner-
cases.
The extradata freeing is apparently still needed, though. (Because a
codec context can be opened again, which makes no sense, but ok.)
There are at least 2 ways of using VAAPI without X11 (Wayland, DRM).
Remove the X11 requirement from the decoder part and the EGL interop.
This will be used by a following commit, which adds Wayland support.
The worst about this is the decoder part, which includes a bad hack for
using the decoder without any VO interop (also known as "vaapi-copy"
mode). Separate the X11 parts so that they're self-contained. For the
EGL interop code we do something similar (it's kept slightly simpler,
because it essentially only has to translate between our silly
MPGetNativeDisplay abstraction and the vaGetDisplay...() call).
It looks like my hope that we can unconditionally include EGL headers in
the OpenGL code is not coming true, because OSX does not support EGL at
all. So I prefer loading the VAAPI EGL/GL specific extensions manually,
because it's less of a mess. Partially reverts commit d47dff3f.
While EGL 1.4 seemed a bit ambiguous about this to me, it actually says
quite clearly that core functions are not supported with
eglGetProcAddress() in the following paragraph.
Normally, we prefer GLX on X11. But for the VAAPI EGL interop, we
obviously want EGL. Since nvidia does not provide EGL with desktop GL
yet, we can leave it to the autoprobing. Just make sure some failure
messages don't unnecessarily show up in the nvidia case.
This breaks VAAPI GLX interop by default, but I don't care much. If
you use --hwdec=auto (which you should if you want hw decoding), this
should fallback to vaapi-copy instead.
Probe the surface format, and check whether it's really something we
support. This also does a complete check whether the EGL interop works
at all (the only way to find this out is actually running this code).
Also, support YV12. Under some circumstances, vaapi (with Intel
drivers) can be made to use this format.
Unfortunately, the Intel drivers show some very weird behavior, which
is hopefully a bug. insane_hack() provides a very evil workaround (see
comments). A proper solution might be passing the hw format as part of
mp_image_params, but as long as hw surfaces appear to be able to change
the format on the fly, attempting this is probably not worth the extra
complexity and likely fragility. The hack allows us to pretend that
there is sane behavior for now.
This makes it much faster if the surface is really mapped from GPU
memory. It's slightly slower than system memcpy if used on system
memory. We don't really know definitely in which type of memory
it's located, so we use the GPU memcpy in all cases.
Fixes#2317.
Make the GPU memcpy from the dxva2 code generally useful to other parts
of the player.
We need to check at configure time whether SSE intrinsics work at all.
(At least in this form, they won't work on clang, for example. It also
won't work on non-x86.)
Introduce a mp_image_copy_gpu(), and make the dxva2 code use it. Do some
awkward stuff to share the existing code used by mp_image_copy(). I'm
hoping that FFmpeg will sooner or later provide a function like this, so
we can remove most of this again. (There is a patch, bit it's stuck in
limbo since forever.)
All this is used by the following commit.
Broken by commit d47dff3f. If something is going to include EGL.h,
header_fixes.h has to know. This definitely affected vo_rpi, and
probably affects wayland builds (with x11egl didabled) as well.
Checking and resetting the VAImage.buf field is non-sense, even if it
happened to work out in the normal case. buf is actually freed when
vaDestroyImage() is called (not quite intuitive), and we need an extra
field to know whether vaReleaseBufferHandle() has to be called.
Printing "Using vaDeriveImage()" every frame is too verbose, so raise
the log level.
mp_image strides are in int and not unsigned int; fix this. It's not
like it actually matters, though.
Finish a comment.
Add the upstream symbolic names as comments. Normally, these should be
defined in libdrm's drm_fourcc.h header. But DRM_FORMAT_R8 and
DRM_FORMAT_GR88 are not defined anywhere, except in the kernel userland
headers of Linux 4.3 (!). We don't want mpv to depend on bleeding-edge
Linux kernel headers, so this will have to do.
Also, just for completeness, add fourccs for the 3 and 4 channel
formats. I didn't manage to test them, though.
Reduces the amount of hardcoded assumptions about the layout
drastically. (Now adding yuv420 support would be just adjusting an if,
if you ignore the other problems, such as determining the hw format at
all early enough.)
Don't call eglDestroyImageKHR() on the same ID possibly more than once.
Clear the image reference on termination, or we would leak up to 1 image
per VO recreation.
Should work much better than the old GLX interop code. Requires Mesa 11,
and explicitly selecting the X11 EGL backend with:
--vo=opengl:backend=x11egl
Should it turn out that the new interop works well, we will try to
autodetect EGL by default.
This code still uses some bad assumptions, like expecting surfaces to be
in NV12. (This is probably ok, because virtually all HW will use this
format. But we should at least check this on init or so, instead of
failing to render an image if our assumption doesn't hold up.)
This repo was a lot of help: https://github.com/gbeauchesne/ffvademo
The kodi code was also helpful (the magic FourCC it uses for
EGL_LINUX_DRM_FOURCC_EXT are nowhere documented, and
EGL_IMAGE_INTERNAL_FORMAT_EXT as used in ffvademo does
not actually exist).
(This is the 3rd VAAPI GL interop that was implemented in this player.)
The VAAPI EGL interop code will need access to the X11 Display. While
GLX could return it from the current GLX context, EGL has no such
mechanism. (At least no standard one supported by all implementations.)
So mpv makes up such a mechanism.
For internal purposes, this is very rather awkward solution, but it's
needed for libmpv anyway.
These extensions use a bunch of EGL types, so we need to include the EGL
headers in common.h to use our GL function loader with this.
In the future, we should probably require presence of the EGL headers to
reduce the hacks. This might be not so simple at least with OSX, so for
now this has to do.
Surfaces used by hardware decoding formats can be mapped exactly like a
specific software pixel format, e.g. RGBA or NV12. p->image_params is
supposed to be set to this format, but it wasn't.
(How did this ever work?)
Also, setting params->imgfmt in the hwdec interop drivers is pointless
and redundant. (Change them to asserts, because why not.)
This is a pseudo-OpenGL extension for letting libmpv query native
windowing system handles from the API user. (It uses the OpenGL
extension mechanism because I'm lazy. In theory it would be nicer to let
the user pass them with mpv_opengl_cb_init_gl(), but this would require
a more intrusive API change to extend its argument list.)
The naming of the extension and associated function was unnecessarily
Windows specific (using "D3D"), even though it would work just fine for
other platforms. So deprecate the old names and introduce new ones. The
old ones still work.
This turns the old scalers (inherited from MPlayer) into a pre-
processing step (after color conversion and before scaling). The code
for the "sharpen5" scaler is reused for this.
The main reason MPlayer implemented this as scalers was perhaps because
FBOs were too expensive, and making it a scaler allowed to implement
this in 1 pass. But unsharp masking is not really a scaler, and I would
guess the result is more like combining bilinear scaling and unsharp
masking.