Implement VK_EXT_acquire_wl_display
c6cb9b19 — Bas Nieuwenhuizen 4 months ago
radv: Support VK_EXT_queue_family_foreign.

Basically the same as external for now. The only case we might need to
handle differently in the near future is Raven's case of displayable
DCC, which is not renderable. But we don't support that yet.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
8a053254 — Bas Nieuwenhuizen 4 months ago
radv: Fix interactions between variable descriptor count and inline uniform blocks.

Fixes: d7e6541cc72 "radv: Only allocate supplied number of descriptors when variable."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
11a3679e — Michel Dänzer 4 months ago
winsys/amdgpu: Make KMS handles valid for original DRM file descriptor

Getting a DMA-buf fd and converting it to a handle using our duplicate
of that file descriptor makes sure of this, since duplicated file
descriptors reference the same file description and therefore the same
GEM handle namespace. (Getting at the duplicate requires passing a
radeon_winsys pointer to the buffer_get_handle hook.)

This is necessary because libdrm_amdgpu may use a different DRM file
descriptor with a separate handle namespace internally, e.g. because it
always reuses any existing amdgpu_device_handle for the same device.
amdgpu_bo_export returns a handle which is valid for that internal
file descriptor.

Bugzilla: https://bugs.freedesktop.org/110903
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
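
The argument in the commit above rests on a POSIX property: a descriptor made with dup() shares the open file description of the original, while a second open() of the same path gets a separate one. The sketch below is not Mesa code and uses no DRM calls. It shows the same property on an ordinary temporary file, with the shared file offset standing in for the shared GEM handle namespace. The helper names shares_file_description and demo_dup_vs_open are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if moving the offset through fd_a is
 * visible through fd_b (same open file description), 0 if it is not,
 * and -1 on error. The offset of fd_a is restored before returning. */
static int shares_file_description(int fd_a, int fd_b)
{
    off_t before = lseek(fd_b, 0, SEEK_CUR);
    if (before < 0 || lseek(fd_a, before + 1, SEEK_SET) < 0)
        return -1;
    off_t after = lseek(fd_b, 0, SEEK_CUR);
    lseek(fd_a, before, SEEK_SET);
    return after == before + 1;
}

/* Hypothetical demo: returns 1 if a dup()ed descriptor shares state with
 * the original while a separately open()ed one does not. */
static int demo_dup_vs_open(void)
{
    char path[] = "/tmp/fd-demo-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    int dup_fd = dup(fd);              /* same file description as fd */
    int reopened = open(path, O_RDWR); /* new, independent description */

    int ok = dup_fd >= 0 && reopened >= 0 &&
             shares_file_description(fd, dup_fd) == 1 &&
             shares_file_description(fd, reopened) == 0;

    if (reopened >= 0)
        close(reopened);
    if (dup_fd >= 0)
        close(dup_fd);
    close(fd);
    unlink(path);
    return ok;
}
```

Calling demo_dup_vs_open() from a test program is expected to return 1 on a POSIX system. For DRM the shared state that matters is the per-file handle namespace rather than the offset, but the dup() versus open() distinction is the same.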
cb446dc0 — Michel Dänzer 4 months ago
winsys/amdgpu: Add amdgpu_screen_winsys

It extends pipe_screen / radeon_winsys and references amdgpu_winsys.
Multiple amdgpu_screen_winsys instances may reference the same
amdgpu_winsys instance, which corresponds to an amdgpu_device_handle.

The purpose of amdgpu_screen_winsys is to keep a duplicate of the DRM
file descriptor passed to amdgpu_winsys_create, which will be needed
in the next change.

v2:
* Add comment in amdgpu_winsys_unref explaining why it always returns
  true (Marek Olšák)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
6fce2964 — Michel Dänzer 4 months ago
winsys/amdgpu: Use amdgpu_winsys helper instead of open-coded casts

Cleanup to prevent breakage with the next change, no functional change
intended in this one.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
e06bc0b1 — Juan A. Suarez Romero 4 months ago
intel: fix wrong format usage

Do not use the view format when filling the surface state.

Fixes dEQP-VK.image.texel_view_compatible.compute.extended.texture.*

Fixes: fb1350c76f1 ("intel: Add and use helpers for level0 extent")

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
a7b6a869 — Samuel Pitoiset 4 months ago
radv: only allocate a 32-bit value for the TC-compat range metadata

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6baa453d — Samuel Pitoiset 4 months ago
radv: remove unused code in radv_update_tc_compat_zrange_metadata()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
a21f23c8 — Samuel Pitoiset 4 months ago
radv: add radv_get_depth_pipeline() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
e0054704 — Mike Blumenkrantz 5 months ago
iris: assert isl_surf_init success in resource_from_handle

This can fail unexpectedly due to bugs, so it's good to provide feedback
when this occurs.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
e708261c — Jason Ekstrand 5 months ago
anv: Advertise a more accurate minTexelBufferOffsetAlignment

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
0bc657f2 — Jason Ekstrand 5 months ago
anv: Implement VK_EXT_texel_buffer_alignment

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
465ec0b1 — Jason Ekstrand 5 months ago
vulkan: Update the XML and headers to 1.1.113

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
050eb638 — Caio Marcelo de Oliveira Filho 4 months ago
spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup

From OpPtrAccessChain description in the SPIR-V spec (1.4 rev 1):

    For objects in the Uniform, StorageBuffer, or PushConstant storage
    classes, the element’s address or location is calculated using a
    stride, which will be the Base-type’s Array Stride when the Base
    type is decorated with ArrayStride. For all other objects, the
    implementation will calculate the element’s address or location.

For non-CL shaders the driver should lay out the Workgroup storage
class, so override any explicitly set ArrayStride in the shader.  This
currently fixes only the lower_workgroup_access_to_offsets case, which
is used by anv.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
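
A small sketch of the rule quoted from the SPIR-V spec may help: the decorated ArrayStride is honoured only for the Uniform, StorageBuffer, and PushConstant storage classes, and the implementation picks the stride otherwise. This is not the Mesa implementation. The type and function names are hypothetical, and the tightly packed stride used for Workgroup is only an assumption about what a driver might choose.

```c
#include <assert.h>

/* Hypothetical, simplified model of the storage classes involved. */
enum storage_class {
    STORAGE_UNIFORM,
    STORAGE_STORAGE_BUFFER,
    STORAGE_PUSH_CONSTANT,
    STORAGE_WORKGROUP,
};

/* Hypothetical array type description. */
struct array_type {
    unsigned decorated_stride; /* ArrayStride from the shader, 0 if absent */
    unsigned element_size;     /* size of one element in bytes */
};

/* Pick the stride used for address calculation. */
static unsigned effective_stride(enum storage_class sc,
                                 const struct array_type *type)
{
    switch (sc) {
    case STORAGE_UNIFORM:
    case STORAGE_STORAGE_BUFFER:
    case STORAGE_PUSH_CONSTANT:
        /* Explicitly laid out classes: honour the decoration if present. */
        if (type->decorated_stride != 0)
            return type->decorated_stride;
        return type->element_size;
    default:
        /* Workgroup and others: ignore the decoration. Tight packing is
         * an assumption made for this sketch only. */
        return type->element_size;
    }
}

/* Byte offset of element `index` within the array. */
static unsigned element_offset(enum storage_class sc,
                               const struct array_type *type,
                               unsigned index)
{
    return index * effective_stride(sc, type);
}
```

With a 4-byte element decorated with ArrayStride 16, element 3 lands at byte 48 in a storage buffer but at byte 12 in Workgroup memory under this sketch's packing assumption.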
95a7fd0f — Karol Herbst 4 months ago
nouveau: handle new CAPS

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
fa869f45 — Jason Ekstrand 7 months ago
intel/fs: Use nir_lower_interpolation on gen11+

On gen11, they removed the PLN instruction so we have to emit a pile of
MAD to emulate it.  We may as well do that in NIR so we can optimize and
later schedule it.

Shader-db results on Ice Lake:

    total instructions in shared programs: 17145644 -> 16556440 (-3.44%)
    instructions in affected programs: 11507454 -> 10918250 (-5.12%)
    helped: 35763
    HURT: 42085
    helped stats (abs) min: 1 max: 140 x̄: 19.09 x̃: 18
    helped stats (rel) min: 0.04% max: 37.93% x̄: 15.40% x̃: 14.49%
    HURT stats (abs)   min: 1 max: 248 x̄: 2.22 x̃: 2
    HURT stats (rel)   min: 0.05% max: 50.00% x̄: 5.00% x̃: 2.47%
    95% mean confidence interval for instructions value: -7.67 -7.47
    95% mean confidence interval for instructions %-change: -4.46% -4.29%
    Instructions are helped.

    total loops in shared programs: 4370 -> 4370 (0.00%)
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total cycles in shared programs: 360624645 -> 368220857 (2.11%)
    cycles in affected programs: 269631244 -> 277227456 (2.82%)
    helped: 15583
    HURT: 65874
    helped stats (abs) min: 1 max: 28561 x̄: 78.45 x̃: 32
    helped stats (rel) min: <.01% max: 67.81% x̄: 5.38% x̃: 2.44%
    HURT stats (abs)   min: 1 max: 238638 x̄: 133.87 x̃: 20
    HURT stats (rel)   min: <.01% max: 306.25% x̄: 5.81% x̃: 3.97%
    95% mean confidence interval for cycles value: 67.42 119.09
    95% mean confidence interval for cycles %-change: 3.61% 3.73%
    Cycles are HURT.

    total spills in shared programs: 8943 -> 8981 (0.42%)
    spills in affected programs: 1925 -> 1963 (1.97%)
    helped: 44
    HURT: 14

    total fills in shared programs: 21815 -> 21925 (0.50%)
    fills in affected programs: 3511 -> 3621 (3.13%)
    helped: 41
    HURT: 18

    LOST:   70
    GAINED: 14

Reviewed-by: Matt Turner <mattst88@gmail.com>
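
To illustrate what emulating PLN with MADs amounts to, the sketch below evaluates a plane equation with two multiply-adds. It is plain C, not NIR and not the i965 backend. The coefficient layout in struct plane and the helper names are assumptions made for illustration and do not reflect the hardware's PLN operand layout.

```c
#include <assert.h>

/* Hypothetical plane coefficients for one interpolated input. */
struct plane {
    float dx;   /* change of the input per unit of x */
    float dy;   /* change of the input per unit of y */
    float base; /* value of the input at the origin */
};

/* Stand-in for a MAD instruction: a * b + c. */
static float mad(float a, float b, float c)
{
    return a * b + c;
}

/* Evaluate dx * x + dy * y + base with two chained MADs. */
static float interpolate_plane(const struct plane *p, float x, float y)
{
    float tmp = mad(p->dx, x, p->base);
    return mad(p->dy, y, tmp);
}
```

For example, with dx = 2, dy = 3, base = 1 and the point (0.5, 0.25), the first MAD gives 2.0 and the second gives 2.75. Once the work is expressed as separate multiply-adds, the usual optimization and scheduling passes can operate on them, which is the motivation the commit message gives.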
2b79a9e5 — Jason Ekstrand 7 months ago
intel/fs: Implement nir_intrinsic_load_fs_input_interp_deltas

Reviewed-by: Matt Turner <mattst88@gmail.com>
8e7d0666 — Jason Ekstrand 7 months ago
intel/fs: Actually implement the load_barycentric intrinsics

If they never get used, dead code should clean them up.  Also, we rework
the at_offset and at_sample intrinsics so they return a proper vec2
instead of returning things in PLN layout.  Fortunately, copy-prop is
pretty good at cleaning this up and it doesn't result in any actual
extra MOVs.

Reviewed-by: Matt Turner <mattst88@gmail.com>
5787a2df — Rob Clark 7 months ago
nir: add pass to lower load_interpolated_input

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>