This handle type was only needed for backwards compatibility with
windows 7. Dropping it allows us to simplify the code: there is no
longer a need for runtime checks, as the handle type can now be
statically assigned based on the platform.
The motivating usecase here, apart from code simplification, is a
desired switch to timeline semaphores, which (in the CUDA API) only
supports the non-KMT win32 handles.
It's worth pointing out that we need no runtime check for
IsWindows8OrGreater(), because the `export_caps.sync` check will already
fail on versions of windows not supporting PL_HANDLE_WIN32.
Previously, there appeared to be implicit synchronisation in the
GL interop path, and we never observed any visual glitches. However,
recently, I started seeing stuttering in the GL path and on closer
examination it looked like read-before-write behaviour where GL
would display an old frame again rather than the current one.
After verifying that disabling hwdec made the problem go away,
I tried adding a cuStreamSynchronize() after the memcpys and that
also resolved the problem, so it's clearly sync related.
cuStreamSynchronize() is a CPU sync and so more heavy-weight than
you want, but it's the only tool we have. There is no mechanism
defined for synchronising GL to CUDA (It looks like there is a way
to synchronise CUDA to EGL but it appears one way and so wouldn't
directly address this problem).
Anyway, empirically, the output now looks the same as with hwdec
off.
The amount of code now present that's specific to Vulkan or OpenGL
has reached the point where we really want to split it out to
avoid a mess of #ifdefs.
At the same time, I'm moving the code to an api neutral location.