Kernel Patch to fix amdgpu

Message ID c8d5a762b809424290cbf2fc6d60268d@bingner.com
State New

Commit Message

Sam Bingner Aug. 29, 2024, 9:16 a.m. UTC
Please apply the attached patch from https://github.com/torvalds/linux/commit/0cdb3f9740844b9d95ca413e3fcff11f81223ecf to the 6.8.0 kernel.  Without this fix, the kernel panics when the module is removed on some Qualcomm servers.  I have verified that the patch applies cleanly and that it corrects the problem.  As you can see, this is a very safe patch.

r/
Sam Bingner


From 0cdb3f9740844b9d95ca413e3fcff11f81223ecf Mon Sep 17 00:00:00 2001
From: Friedrich Vock <friedrich.vock@gmx.de>
Date: Tue, 14 May 2024 09:06:38 +0200
Subject: [PATCH] drm/amdgpu: Check if NBIO funcs are NULL in
 amdgpu_device_baco_exit
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The special case for VM passthrough doesn't check adev->nbio.funcs
before dereferencing it. If GPUs that don't have an NBIO block are
passed through, this leads to a NULL pointer dereference on startup.

Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de>
Fixes: 1bece222eabe ("drm/amdgpu: Clear doorbell interrupt status for Sienna Cichlid")
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Manuel Diewald Aug. 29, 2024, 2:05 p.m. UTC | #1
On Thu, Aug 29, 2024 at 09:16:09AM +0000, Sam Bingner wrote:
> Please apply the attached patch from https://github.com/torvalds/linux/commit/0cdb3f9740844b9d95ca413e3fcff11f81223ecf to the 6.8.0 kernel.  Without this fix, the kernel panics when the module is removed on some Qualcomm servers.  I have verified that the patch applies cleanly and that it corrects the problem.  As you can see, this is a very safe patch.

Please follow the guidelines for submitting stable patches:

https://wiki.ubuntu.com/Kernel/Dev/StablePatchFormat

Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 00fe3c2d54310f..e72e774d17e6a5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -6167,7 +6167,7 @@  int amdgpu_device_baco_exit(struct drm_device *dev)
 	    adev->nbio.funcs->enable_doorbell_interrupt)
 		adev->nbio.funcs->enable_doorbell_interrupt(adev, true);
 
-	if (amdgpu_passthrough(adev) &&
+	if (amdgpu_passthrough(adev) && adev->nbio.funcs &&
 	    adev->nbio.funcs->clear_doorbell_interrupt)
 		adev->nbio.funcs->clear_doorbell_interrupt(adev);
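
For readers unfamiliar with the pattern, the one-line change above can be illustrated outside the kernel. The following is a minimal sketch only, not the amdgpu code: every `toy_*` name is hypothetical, and the `passthrough` flag merely stands in for `amdgpu_passthrough(adev)`. It shows why the table pointer must be tested before the callback pointer stored inside it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; not the real amdgpu definitions. */
struct toy_device;

struct toy_nbio_funcs {
	void (*clear_doorbell_interrupt)(struct toy_device *dev);
};

struct toy_device {
	bool passthrough;                   /* stands in for amdgpu_passthrough(adev) */
	const struct toy_nbio_funcs *funcs; /* NULL when the device has no NBIO block */
	int cleared;                        /* counts how often the callback ran */
};

/* Example callback: only records that it was called. */
static void toy_clear(struct toy_device *dev)
{
	dev->cleared++;
}

/*
 * Mirrors the shape of the patched condition: check the table pointer
 * first, then the callback pointer, and only then make the call.
 * Without the middle test, a NULL funcs pointer would be dereferenced
 * while evaluating funcs->clear_doorbell_interrupt.
 * Returns true if the callback was invoked.
 */
static bool toy_baco_exit(struct toy_device *dev)
{
	if (dev->passthrough && dev->funcs &&
	    dev->funcs->clear_doorbell_interrupt) {
		dev->funcs->clear_doorbell_interrupt(dev);
		return true;
	}
	return false;
}
```

Because `&&` short-circuits left to right, a device whose `funcs` pointer is NULL never reaches the third operand, which is the only behavioral change the patch introduces.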