CS2 Crashing amdgpu GPU or driver #3960

Open
bjornpijnacker opened this issue Jan 5, 2025 · 0 comments

Your system information

  • Have you checked for system updates?: Yes

Please describe your issue in as much detail as possible:

Describe what you expected should happen and what did happen. Please link any large pastes as a Github Gist.

Steps for reproducing this issue:

  1. Launch CS2 on Fedora Linux 41
  2. Play for an hour or so in a competitive match
  3. The system freezes and I am dropped back to the GDM login screen. The game process keeps running, and a full system restart is required.

Log snippet, indicating an issue originating from CS2 Vulkan:

[ 4614.327849] amdgpu 0000:03:00.0: amdgpu: Dumping IP State
[ 4614.330348] amdgpu 0000:03:00.0: amdgpu: Dumping IP State Completed
[ 4614.340384] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=17735888, emitted seq=17735890
[ 4614.340386] amdgpu 0000:03:00.0: amdgpu: Process information: process cs2 pid 29348 thread VKRenderThread pid 29384
[ 4616.340522] amdgpu 0000:03:00.0: amdgpu: MES failed to respond to msg=RESET
[ 4616.340526] [drm:amdgpu_mes_reset_legacy_queue [amdgpu]] *ERROR* failed to reset legacy queue
[ 4616.340693] amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
[ 4618.580628] amdgpu 0000:03:00.0: amdgpu: MES failed to respond to msg=REMOVE_QUEUE
[ 4618.580631] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 4618.814047] [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
[ 4618.853764] amdgpu 0000:03:00.0: amdgpu: MODE1 reset
[ 4618.853766] amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
[ 4618.853815] amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset
[ 4619.359816] amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 4619.359887] [drm] PCIE GART of 512M enabled (table at 0x0000008001300000).
[ 4619.359915] [drm] VRAM is lost due to GPU reset!
[ 4619.359925] amdgpu 0000:03:00.0: amdgpu: PSP is resuming...
[ 4619.429546] amdgpu 0000:03:00.0: amdgpu: reserve 0x1300000 from 0x85fc000000 for PSP TMR
[ 4619.573925] amdgpu 0000:03:00.0: amdgpu: RAP: optional rap ta ucode is not available
[ 4619.573927] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[ 4619.573929] amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
[ 4619.573931] amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x0000003d, smu fw if version = 0x00000040, smu fw program = 0, smu fw version = 0x004e8000 (78.128.0)
[ 4619.573934] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
[ 4619.753693] amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
[ 4619.764286] [drm] DMUB hardware initialized: version=0x07002A00
[ 4619.893407] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 4619.893409] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 4619.893410] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 4619.893411] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 4619.893411] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 4619.893412] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 4619.893412] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 4619.893413] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 4619.893413] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 4619.893414] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 4619.893414] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
[ 4619.893415] amdgpu 0000:03:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[ 4619.893415] amdgpu 0000:03:00.0: amdgpu: ring vcn_unified_1 uses VM inv eng 1 on hub 8
[ 4619.893416] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 4 on hub 8
[ 4619.893417] amdgpu 0000:03:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 14 on hub 0
[ 4619.896692] amdgpu 0000:03:00.0: amdgpu: GPU reset(2) succeeded!
[ 4619.896978] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!

The line "[ 4614.340386] amdgpu 0000:03:00.0: amdgpu: Process information: process cs2 pid 29348 thread VKRenderThread pid 29384" is the part that pointed me to CS2. I haven't had it happen anywhere else.
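For anyone who wants to check their own kernel log for the same pattern, here is a rough Python sketch that pairs each "ring ... timeout" line with the "Process information" line that follows it. The function name `find_gpu_hangs` and the regular expressions are my own assumptions, written against the two log lines quoted above; other kernel versions may word these messages differently, so treat it as a starting point rather than a reliable parser.

```python
import re

# Patterns modelled on the two log lines quoted in this report (assumption:
# the wording may differ between kernel versions).
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
PROCESS_RE = re.compile(
    r"Process information: process (?P<process>\S+) pid (?P<pid>\d+) "
    r"thread (?P<thread>\S+) pid (?P<tid>\d+)"
)


def find_gpu_hangs(log_text):
    """Return one dict per ring timeout, with the offending process if logged."""
    hangs = []
    current = None  # most recent timeout still waiting for its process line
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            current = {
                "ring": match.group("ring"),
                "signaled": int(match.group("signaled")),
                "emitted": int(match.group("emitted")),
            }
            hangs.append(current)
            continue
        match = PROCESS_RE.search(line)
        if match and current is not None:
            current["process"] = match.group("process")
            current["pid"] = int(match.group("pid"))
            current["thread"] = match.group("thread")
            current["tid"] = int(match.group("tid"))
            current = None  # attach the process line to one timeout only
    return hangs


# The two relevant lines from the log snippet in this report.
SAMPLE_LOG = """\
[ 4614.340384] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=17735888, emitted seq=17735890
[ 4614.340386] amdgpu 0000:03:00.0: amdgpu: Process information: process cs2 pid 29348 thread VKRenderThread pid 29384
"""

for hang in find_gpu_hangs(SAMPLE_LOG):
    print(hang)
```

To run it on a real log, the text could come from saved `dmesg` or `journalctl -k` output passed to `find_gpu_hangs` as a string.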

I've looked at https://gitlab.freedesktop.org/drm/amd/-/issues/3131, where people report similar issues. However, CS2 does not push my GPU above its regular boost frequency at all; utilization sits at about 80%. I have also looked at #3386, which is a different issue: my log does not contain the [130882.405656] [drm:gfx_v10_0_priv_reg_irq [amdgpu]] ERROR Illegal register access in command stream line that @tadzik reports.
