mirror/mpv - mpv - git.tjdev.de

mirror/mpv

mirror of https://github.com/mpv-player/mpv.git synced 2024-09-20 20:03:10 +02:00

Author	SHA1	Message	Date
Daniel Bermond	8a725ec951	vulkan/wayland: fix another build breakage Commit `07b0c18` introduced some build breakages. Some breakages were fixed on `c1fc535` and `a1adafe`. This one is still remaining. This commit fixes the following build error: [153/521] Compiling video/out/vulkan/context_wayland.c ../video/out/vulkan/context_wayland.c:26:10: fatal error: video/out/wayland/presentation-time.h: No such file or directory 26 \| #include "video/out/wayland/presentation-time.h" \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ compilation terminated. Relevant to: #7802	2020-06-05 13:43:50 +02:00
Dudemanguy	055a490cef	wayland: use mp_time deltas for presentation time One not-so-nice hack in the wayland code is the assumption of when a window is hidden (out of view from the compositor) and an arbitrary delay for enabling/disabling the usage of presentation time. Since you do not receive any presentation feedback when a window is hidden on wayland (a feature or misfeature depending on who you ask), the ust is updated based on the refresh_nsec statistic gathered from the previous feedback event. The flaw with this is that refresh_nsec basically just reports back the display's refresh rate (1 / refresh_rate * 10^9). It doesn't tell you how long the vsync interval really was. So as a video is left playing out of view, the wl->last_queue_display_time becomes increasingly inaccurate. This led to a vsync spike when bringing the mpv window back into sight after it was hidden for a period of time. The hack for working around this is to just wait a while before enabling presentation time again. The discrepancy between the "bogus" wl->last_queue_display_time and the actual value you get from the feedback only happens initially after a switch. If you just discard those values, you avoid the dramatic vsync spike. It turns out that there's a smarter way to do this. Just use mp_time_us deltas. The whole reason for these hacks is because wl->last_queue_display_time wasn't close enough to how long it would take for a frame to actually display if it wasn't hidden. Instead, mpv's internal timer can be used, and the difference between wayland_sync_swap calls is a close enough proxy for the vsync interval (certainly better than using the monitor's refresh rate). This avoids the entire conundrum of massive vsync spikes when bringing the player back into view, and it means we can get rid of extra crap like wl->hidden.	2020-04-20 21:02:02 +00:00
wm4	26f4f18c06	options: change option macros and all option declarations Change all OPT_* macros such that they don't define the entire m_option initializer, and instead expand only to a part of it, which sets certain fields. This requires changing almost every option declaration, because they all use these macros. A declaration now always starts with {"name", ... followed by designated initializers only (possibly wrapped in macros). The OPT_* macros now initialize the .offset and .type fields only, sometimes also .priv and others. I think this change makes the option macros less tricky. The old code had to stuff everything into macro arguments (and attempted to allow setting arbitrary fields by letting the user pass designated initializers in the vararg parts). Some of this was made messy due to C99 and C11 not allowing 0-sized varargs with ',' removal. It's also possible that this change is pointless, other than cosmetic preferences. Not too happy about some things. For example, the OPT_CHOICE() indentation I applied looks a bit ugly. Much of this change was done with regex search&replace, but some places required manual editing. In particular, code in "obscure" areas (which I didn't include in compilation) might be broken now. In wayland_common.c the author of some option declarations confused the flags parameter with the default value (though the default value was also properly set below). I fixed this with this change.	2020-03-18 19:52:01 +01:00
dudemanguy	b926f18938	wayland: remove wayland-frame-wait-offset option This originally existed as a hack for weston. In certain scenarios, a frame taking too long to render would cause vo_wayland_wait_frame to timeout which would result in a ton of dropped frames. The naive solution was to just to add a slight delay to the time value. If a frame took too long, it would likely to fall under the timeout value and all was well. This was exposed to the user since the default delay (1000) was completely arbitrary. However with presentation time, this doesn't appear to be neccesary. Fresh frames that take longer than the display's refresh rate (16.666 ms in most cases) behave well in Weston. In the other two main compositors without presentation time (GNOME and Plasma), they also do not experience any ill effects. It's better not to overcomplicate things, so this "feature" can be removed now.	2020-01-31 00:40:44 +00:00
Niklas Haas	2a70140ba8	vo_gpu: vulkan: set allow_suboptimal when possible This was added in libplacebo v1.29.0 and allows making resizes slightly smoother for clients that already handle resize events (such as mpv).	2019-12-22 03:55:07 +01:00
Dudemanguy911	9dead2b932	wayland: fix presentation time There's 2 stupid things here that need to be fixed. First of all, vulkan wasn't actually using presentation time because somehow the get_vsync function in context.c disappeared. Secondly, if the mpv window was hidden it was updating the ust time based on the refresh_usec but really it should simply just not feed any information to the vsync info structure. So this adds some logic to assume whether or not a window is hidden.	2019-10-20 19:50:10 +00:00
dudemanguy	027ca4fb85	wayland: add various render-related options The newest wayland changes have some new logic that make sense to expose to users as configurable options.	2019-10-20 15:34:57 +00:00
dudemanguy	bedca07a02	wayland: add presentation time Use ust/msc/refresh values from wayland's presentation time in mpv's ra_swapchain_fns.get_vsync for the wayland contexts.	2019-10-20 15:34:57 +00:00
dudemanguy	ea4685b233	wayland: use callback flag + poll for buffer swap The old way of using wayland in mpv relied on an external renderloop for semi-accurate timings. This had multiple issues though. Display sync would break whenever the window was hidden (since the frame callback stopped being executed) which was really annoying. Also the entire external renderloop logic was kind of fragile and didn't play well with mpv's internal structure (i.e. using presentation time in that old paradigm breaks stats.lua). Basically the problem is that swap buffers blocks on wayland which is crap whenever you hide the mpv window since it looks up the entire player. So you have to make swap buffers not block, but this has a different problem. Timings will be terrible if you use the unblocked swap buffers call. Based on some discussion in #wayland, the trick here is relatively simple and works well enough for our purposes. Instead we basically build a way to block with a timeout in the wayland buffer swap functions. A bool is set in the frame callback function that indicates whether or not mpv is waiting for a frame to be displayed. In the actual buffer swap function, we enter into a while loop waiting for this flag to be set. At the same time, the wl_display is polled to block the thread and wakeup if it receives any events from the compositor. This loop only breaks if enough time has passed or if the frame callback bool is received. In the near future, it is better to set whether or not frame a frame has been displayed in the presentation feedback. However as a first pass, doing it in the frame callback is more than good enough. The "downside" is that we render frames that aren't actually shown on screen when the player is hidden (it seems like wayland people don't like that). But who cares. Accurate timings are way more important. It's probably not too hard to add that behavior back in the player though.	2019-10-10 17:41:19 +00:00
Anton Kindestam	6290420380	vo: make swapchain-depth option generic for all VOs In preparation for making vo_drm able to use swapchain-depth	2019-09-28 14:10:01 +03:00
sfan5	e350ceef4c	vo_gpu: vulkan: add Android context	2019-09-27 00:05:06 +03:00
Philip Langdale	b539eb222b	vo/gpu: vulkan: Pass the device name option through to libplacebo We collect a 'vulkan-device' option today but then don't actually pass it on, so it's useless. Once that's fixed, it can be used to select a specific vulkan device by name. Tested with the new nvidia offload feature to select between the nvidia and intel GPUs.	2019-08-24 18:38:27 +02:00
Philip Langdale	b70ed35ba4	vo_gpu: hwdec_vaapi: Add Vulkan interop This change introduces a vulkan interop path for the vaapi hwdec. The basic principles are mostly the same as for EGL, with the exported dma_buf being imported by Vukan. The biggest difference is that we cannot reuse the texture as we do with OpenGL - there's no way to rebind a VkImage to a different piece of memory, as far as I can see. So, a new texture is created on each map call. I did not bother implementing a code path for the old libva API as I think it's safe to assume any system with a working vulkan driver will have access to a newer libva. Note that we are using separate layers for the vaapi surface, just as is done for EGL. This is because libplacebo doesn't support multiplane images. This change does not include format negotiation because no driver implements the vk_ext_image_drm_format_modifier extension that would be required to do that. In practice, the two formats we care about (nv12, p010) work correctly, so we are not blocked. A separate change had to be made in libplacebo to filter out non-fatal validation errors related to surface sizes due to the lack of format negotiation.	2019-07-08 01:57:02 +02:00
Philip Langdale	b74b39dfb5	vo_gpu: vulkan: Add back context_win for libplacebo Feature parity with the original ra_vk obviously requires win32 support, so let's put it back in.	2019-04-21 23:55:22 +03:00
Niklas Haas	7006d6752d	vo_gpu: vulkan: use libplacebo instead This commit rips out the entire mpv vulkan implementation in favor of exposing lightweight wrappers on top of libplacebo instead, which provides much of the same except in a more up-to-date and polished form. This (finally) unifies the code base between mpv and libplacebo, which is something I've been hoping to do for a long time. Note: The ra_pl wrappers are abstract enough from the actual libplacebo device type that we can in theory re-use them for other devices like d3d11 or even opengl in the future, so I moved them to a separate directory for the time being. However, the rest of the code is still vulkan-specific, so I've kept the "vulkan" naming and file paths, rather than introducing a new `--gpu-api` type. (Which would have been ended up with significantly more code duplicaiton) Plus, the code and functionality is similar enough that for most users this should just be a straight-up drop-in replacement. Note: This commit excludes some changes; specifically, the updates to context_win and hwdec_cuda are deferred to separate commits for authorship reasons.	2019-04-21 23:55:22 +03:00
Niklas Haas	f0b6860d62	vo_gpu: index desc namespaces by ra No reason to require them be constant. This allows them to depend on runtime characteristics of the `ra`.	2019-04-21 23:55:22 +03:00
Niklas Haas	5bcac8580d	spirv: remove --spirv-compiler=nvidia This option has been deprecated upstream for a long time, probably doesn't even work anymore, and won't work moving forwards as we replace the vulkan code by libplacebo wrappers. I haven't removed the option completely yet since in theory we could still add support for e.g. a native glslang wrapper in the future. But most likely the future of this code is deletion. As an aside, fix an issue where the man page didn't mention d3d11.	2018-12-01 15:50:23 +02:00
Niklas Haas	b8bb5329a5	vulkan: slightly improve vsync jitter measurements By design, some vulkan implementations block until vsync during vkAcquireNextImageKHR. Since mpv only considers the time that `swap_buffers` spent blocking as constituting part of the vsync, we can help it out a bit by pre-emptively calling this function here in order to improve the accuracy of vsync jitter measurements on vulkan. (If it fails, we just ignore the error and have the user call it a second time later - maybe it will work then) On my system this drops vsync-jitter from ~0.030 to ~0.007, an accuracy of +/- 100μs. (Which might have something to do with the fact that this is the polling interval for command polling)	2018-11-19 00:23:15 +02:00
Niklas Haas	34df6bd82f	vo_gpu: vulkan: only rotate the queues on swap Makes performance slightly better when using multiple queues by avoiding unnecessary semaphores due to bad queue selection. Also remove an aeons-old workaround for an nvidia bug that only ever existed in the earliest beta vulkan drivers anyway.	2018-11-19 00:22:41 +02:00
Philip Langdale	c8a065df12	vo_gpu: vulkan: Always use KHR suffix types and defines I was inconsistent about this originally, as the functionality was moved into the core spec in 1.1 and so both suffixed and unsuffixed versions of everything exist and can be mixed together. There's no reason to fail to build with 1.0.39+ so I'm fixing the names.	2018-11-03 23:53:08 +02:00
Philip Langdale	621389134a	vo_gpu: vulkan: Add a function to get the device UUID We need this to do device matching for the cuda interop.	2018-10-22 21:35:48 +02:00
Philip Langdale	a782197d21	vo_gpu: vulkan: Add arbitrary user data for an ra_vk_buf This is arguably a little contrived, but in the case of CUDA interop, we have to track additional state on the cuda side for each exported buffer. If we want to be able to manage buffers with an ra_buf_pool, we need some way to keep that CUDA state associated with each created buffer. The easiest way to do that is to attach it directly to the buffers.	2018-10-22 21:35:48 +02:00
Philip Langdale	93f800a00f	vo_gpu: vulkan: Add support for exporting buffer memory The CUDA/Vulkan interop works on the basis of memory being exported from Vulkan and then imported by CUDA. To enable this, we add a way to declare a buffer as being intended for export, and then add a function to do the export. For now, we support the fd and Handle based exports on Linux and Windows respectively. There are others, which we can support when a need arises. Also note that this is just for exporting buffers, rather than textures (VkImages). Image import on the CUDA side is supposed to work, but it is currently buggy and waiting for a new driver release. Finally, at least with my nvidia hardware and drivers, everything seems to work even if we don't initialise the buffer with the right exportability options. Nevertheless I'm enforcing it so that we're following the spec.	2018-10-22 21:35:48 +02:00
Niklas Haas	facc63b862	vo_gpu: vulkan: suppress bogus error message on --vulkan-device Since the code just broke out of the loop on a match rather than jumping straight to the end of the function body, it ended up hitting the code path for when the end of the list was reached.	2018-10-21 23:33:36 +02:00
Niklas Haas	0f3e25cb0a	vo_gpu: vulkan: fix the buffer size on partial upload This was pased on the texture height, which was a mistake. In some cases it could exceed the actual size of the buffer, leading to a vulkan API error. This didn't seem to cause any problems in practice, since a too-large synchronization is just bad for performance and shouldn't do any harm internally, but either way, it was still undefined behavior to submit a barrier outside of the buffer size. Fix the calculation, thus fixing this issue.	2018-10-19 22:58:16 +02:00
wm4	f17246fec1	vo_gpu: remove old window screenshot glue code and GL implementation There is now a better way. Reading the font framebuffer was always a hack. The new code via VOCTRL_SCREENSHOT renders it into a FBO, which does not come with the disadvantages of reading the front buffer (like not being supported by GLES, possibly black regions due to overlapping windows on some systems). For now keep VOCTRL_SCREENSHOT_WIN on the VO level, because there are still some lesser VOs and backends that use it.	2018-02-13 17:45:29 -08:00
Niklas Haas	0870859e3d	vo_gpu: add RA_CAP for gl_NumWorkGroups SPIRV-Cross doesn't support this for the time being. It's possible this could go away again at a later date.	2018-02-05 23:11:18 -08:00
Niklas Haas	5997248505	vo_gpu: vulkan: correctly enable textureGatherOffset This also requires a vulkan feature / SPIR-V capability to function	2018-02-05 02:49:03 -08:00
Niklas Haas	f151ac57cb	vo_gpu: vulkan: don't issue queries for unused timers The vulkan validation layers warn you if you try requesting a query result from a timer that hasn't even been started yet, so we have to do some extra bit of work to keep track of which indices we've seen so far, and avoid the queries on them.	2018-02-05 02:49:03 -08:00
Niklas Haas	f92e45bb8c	vo_gpu: vulkan: try enabling required features Instead of enabling every feature under the sun, make an effort to just whitelist the ones we actually might use. Turns out the extended storage format support is needed for some of the storage formats we use, in particular rgba16.	2018-02-05 02:49:03 -08:00
Niklas Haas	92778873ad	vo_gpu: vulkan: add missing buffer barrier fields These were accidentally omitted.	2018-02-05 02:49:03 -08:00
Niklas Haas	d588bdaaf7	vo_gpu: vulkan: fix segfault due to index mismatch The queue family index and the queue info index are not necessarily the same, so we're forced to do a check based on the queue family index itself. Fixes #5049	2017-12-25 00:47:53 +01:00
Niklas Haas	12c6700a3c	vo_gpu: vulkan: fix some image barrier oddities A vulkan validation layer update pointed out that this was wrong; we still need to use the access type corresponding to the stage mask, even if it means our code won't be able to skip the pipeline barrier (which would be wrong anyway). In additiona to this, we're also not allowed to specify any source access mask when transitioning from top_of_pipe, which doesn't make any sense anyway.	2017-12-25 00:47:53 +01:00
Niklas Haas	97b1482d53	vo_gpu: vulkan: fix sharing mode on malloc'd buffers Might explain some of the issues in multi-queue scenarios?	2017-12-25 00:47:53 +01:00
Niklas Haas	5abe0bd593	vo_gpu: vulkan: fix dummyPass creation This violates vulkan spec	2017-12-25 00:47:53 +01:00
Niklas Haas	58e201e6bd	vo_gpu: vulkan: fix the rgb565a1 names -> rgb5a1 This is 5 bits per channel, not 565	2017-12-25 00:47:53 +01:00
Niklas Haas	286d421666	vo_gpu: vulkan: allow disabling async tf/comp Async compute in particular seems to cause problems on some drivers, and even when supprted the benefits are not that massive from the tests I have seen, so it's probably safe to keep off by default. Async transfer on the other hand seems to work better and offers a more substantial improvement, so it's kept on.	2017-12-25 00:47:53 +01:00
Niklas Haas	a6aab5dfd6	vo_gpu: vulkan: refine queue family selection algorithm This gets confused by e.g. SPARSE_BIT on the TRANSFER_BIT, leading to situations where "more specialized" is ambiguous and the logic breaks down. So to fix it, only compare the subset we care about.	2017-12-25 00:47:53 +01:00
Niklas Haas	2d1769a534	vo_gpu: vulkan: prefer vkCmdCopyImage over vkCmdBlitImage blit() implies scaling, copy() is the equivalent command to use when the formats are compatible (same pixel size) and the rects have the same dimensions.	2017-12-25 00:47:53 +01:00
Niklas Haas	a42b8b1142	vo_gpu: attempt re-using the FBO format for p->output_tex This allows RAs with support for non-opaque FBO formats to use a more appropriate FBO format for the output tex, possibly enabling a more efficient blit operation. This requires distinguishing between real formats (which can be used to create textures) and fake formats (e.g. ra_gl's FBO hack).	2017-12-25 00:47:53 +01:00
Niklas Haas	80540be211	vo_gpu: vulkan: properly depend on the swapchain acquire semaphore This is now associated with the ra_tex directly and used in the correct way, rather than hackily done from submit_frame.	2017-12-25 00:47:53 +01:00
Niklas Haas	b138bdc01c	vo_gpu: vulkan: use correct access flag for present This needs VK_ACCESS_MEMORY_READ_BIT (spec)	2017-12-25 00:47:53 +01:00
Niklas Haas	8b0a111c59	vo_gpu: vulkan: make the swapchain more robust Now handles both VK_ERROR_OUT_OF_DATE_KHR and VK_SUBOPTIMAL_KHR for both vkAcquireNextImageKHR and vkQueuePresentKHR in the correct way.	2017-12-25 00:47:53 +01:00
Niklas Haas	dcda8bd36a	vo_gpu: aggressively prefer async compute On AMD devices, we only get one graphics pipe but several compute pipes which can (in theory) run independently. As such, we should prefer compute shaders over fragment shaders in scenarios where we expect them to be better for parallelism. This is amusingly trivial to do, and actually improves performance even in a single-queue scenario.	2017-12-25 00:47:53 +01:00
Niklas Haas	bded247fb5	vo_gpu: vulkan: support split command pools Instead of using a single primary queue, we generate multiple vk_cmdpools and pick the right one dynamically based on the intent. This has a number of immediate benefits: 1. We can use async texture uploads 2. We can use the DMA engine for buffer updates 3. We can benefit from async compute on AMD GPUs Unfortunately, the major downside is that due to the lack of QF ownership tracking, we need to use CONCURRENT sharing for all resources (buffers and images!). In theory, we could try figuring out a way to get rid of the concurrent sharing for buffers (which is only needed for compute shader UBOs), but even so, the concurrent sharing mode doesn't really seem to have a significant impact over here (nvidia). It's possible that other platforms may disagree. Our deadlock-avoidance strategy is stupidly simple: Just flush the command every time we need to switch queues, and make sure all submission and callbacks happen in FIFO order. This required lifting the cmds_pending and cmds_queued out from vk_cmdpool to mpvk_ctx, and some functions died/got moved as a result, but that's a relatively minor change. On my hardware this is a fairly significant performance boost, mainly due to async transfers. (Nvidia doesn't expose separate compute queues anyway). On AMD, this should be a performance boost as well due to async compute.	2017-12-25 00:47:53 +01:00
Niklas Haas	6186cc79e6	vo_gpu: allow invalidating FBO in renderpass_run This is especially interesting for vulkan since it allows completely skipping the layout transition as part of the renderpass. Unfortunately, that also means it needs to be put into renderpass_params, as opposed to renderpass_run_params (unlike #4777). Closes #4777.	2017-12-25 00:47:53 +01:00
Niklas Haas	fb1c7bde42	vo_gpu: vulkan: properly track image dependencies This uses the new vk_signal mechanism to order all access to textures. This has several advantageS: 1. It allows real synchronization of image access across multiple frames when using multiple queues for parallelism. 2. It allows using events instead of pipeline barriers, which is a finer-grained synchronization primitive that allows for more efficient layout transitions over longer durations. This commit also restructures some of the implicit transition code for renderpasses to be more flexible and correct. (Note: this technically drops the ability to transition the image out of undefined layout when not blending, but that was a bug anyway and needs to be done properly) vo_gpu: vulkan: remove no-longer-true optimization The change to the output_tex format makes this no longer true, and it actually seems to hurt performance now as well. So just don't do it anymore. I also realized it hurts performance when drawing an OSD, so it's probably not a good idea anyway.	2017-12-25 00:47:53 +01:00
Niklas Haas	f2f91cf570	vo_gpu: vulkan: add a vk_signal abstraction This combines VkSemaphores and VkEvents into a common umbrella abstraction which can resolve to either. We aggressively try to prefer VkEvents over VkSemaphores whenever the conditions are met (1. we can unsignal the semaphore, i.e. it comes from the same frame; and 2. it comes from the same queue).	2017-12-25 00:47:53 +01:00
Niklas Haas	5feaaba0fd	vo_gpu: vulkan: refactor command submission Instead of being submitted immediately, commands are appended into an internal submission queue, and the actual submission is done once per frame (at the same time as queue cycling). Again, the benefits are not immediately obvious because nothing benefits from this yet, but it will make more sense for an upcoming vk_signal mechanism. This also cleans up the way the ra_vk submission interacts with the synchronization/callbacks from the ra_vk_ctx. Although currently, the way the dependency is signalled is a bit hacky: normally it would be associated with the ra_tex itself and waited on in the appropriate stage implicitly. But that code is just temporary, so I'm keeping it in there for a better commit order.	2017-12-25 00:47:53 +01:00
Niklas Haas	885497a445	vo_gpu: vulkan: reorganize vk_cmd slightly Instead of associating a single VkSemaphore with every command buffer and allowing the user to ad-hoc wait on it during submission, make the raw semaphores-to-signal array work like the raw semaphores-to-wait-on array. Doesn't really provide a clear benefit yet, but it's required for upcoming modifications.	2017-12-25 00:47:53 +01:00

1 2

66 Commits