Resuming from suspend and using DPMS often crashes compositors
Brief summary of the problem:
When the monitors turn back on after a DPMS off/on cycle or a resume from suspend, my DE crashes and the monitors show a black screen for ~30 seconds. This has been happening since I installed the GPU in November, so across multiple kernel releases. The issue is not specific to one DE: I've been able to reproduce it on all wlroots-based compositors and also on KDE. A common denominator of the crashes seems to be this line in dmesg:
[drm:dp_set_fec_ready [amdgpu]] *ERROR* dpcd write failed to set fec_ready
The DE errors vary, but they often seem to originate from functions that manipulate monitor parameters (Hyprland example shown below).
Note that the monitors take 10+ seconds to turn on from their power-saving mode. My theory is that some kind of timeout expires in the meantime, which leads compositors to treat the monitors as ready when they are not.
Might be related to #2359.
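For anyone who wants to check whether their own logs show the same pattern, here is a minimal Python sketch that looks for the fec_ready error and for kernel WARNING lines that follow it shortly afterwards. The function names and the 5-second window are my own assumptions for illustration, not anything taken from the driver.

```python
import re

# Kernel log lines look like "[13679.618942] message"; capture both parts.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")
FEC_MARKER = "dpcd write failed to set fec_ready"


def find_fec_errors(dmesg_text):
    """Return the timestamps (in seconds) of every fec_ready error line."""
    stamps = []
    for line in dmesg_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and FEC_MARKER in m.group(2):
            stamps.append(float(m.group(1)))
    return stamps


def warnings_after_fec(dmesg_text, window_s=5.0):
    """Return (timestamp, message) for each WARNING line that appears
    within window_s seconds after any fec_ready error.

    window_s is an arbitrary value chosen for this sketch."""
    fec = find_fec_errors(dmesg_text)
    hits = []
    for line in dmesg_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m or not m.group(2).startswith("WARNING:"):
            continue
        t = float(m.group(1))
        if any(0 <= t - f <= window_s for f in fec):
            hits.append((t, m.group(2)))
    return hits
```

The text to scan can be obtained with something like `subprocess.run(["dmesg"], capture_output=True, text=True).stdout` (this may need root, depending on the system's dmesg restrictions).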
Hardware description:
- CPU: AMD Ryzen 9 5950X
- GPU: 7900 XTX - Navi 31 [Radeon RX 7900 XT/7900 XTX/7900M] [1002:744c] (rev c8)
- System Memory: 64GB of DDR4
- Display(s): 2x Philips 27M1F5800
- Type of Display Connection: DP
System information:
- Distro name and Version: Arch Linux
- Kernel version: 6.9.1-arch1-1
- Custom kernel: N/A
- AMD official driver version: N/A
How to reproduce the issue:
Suspend and resume, or turn DPMS off and on.
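The DPMS variant can be scripted. Below is a rough Python sketch of one off/on cycle. It assumes the compositor provides a command-line way to toggle DPMS; the helper name `dpms_cycle` and the 30-second default are hypothetical, and the exact commands depend on the compositor.

```python
import subprocess
import time


def dpms_cycle(off_cmd, on_cmd, off_seconds=30.0,
               run=subprocess.run, sleep=time.sleep):
    """Turn the displays off, wait, then turn them back on.

    off_cmd / on_cmd are argument lists for the compositor's own tool.
    off_seconds should be long enough for the monitors to actually
    enter their power-saving mode (30 s is a guess).
    run and sleep are parameters only so the function can be dry-run."""
    run(off_cmd, check=True)   # raises CalledProcessError if the command fails
    sleep(off_seconds)
    run(on_cmd, check=True)
```

On Hyprland, for example, this could be called as `dpms_cycle(["hyprctl", "dispatch", "dpms", "off"], ["hyprctl", "dispatch", "dpms", "on"])`. Other compositors need their own equivalents.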
Attached files:
Log files
dmesg:
[13679.618942] [drm:dp_set_fec_ready [amdgpu]] *ERROR* dpcd write failed to set fec_ready
[13680.039577] [drm:dp_set_fec_ready [amdgpu]] *ERROR* dpcd write failed to set fec_ready
[13680.712845] ------------[ cut here ]------------
[13680.712874] WARNING: CPU: 0 PID: 180969 at drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dcn20/dcn20_dsc.c:272 dsc2_disable+0x108/0x180 [amdgpu]
[13680.713080] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device vxlan xt_policy iptable_mangle xt_mark xt_bpf xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_addrtype iptable_filter br_netfilter bridge stp llc overlay wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel rfkill nct6775 nct6775_core amd_atl intel_rapl_msr hwmon_vid intel_rapl_common kvm_amd snd_hda_codec_realtek kvm snd_hda_codec_generic snd_hda_scodec_component snd_hda_codec_hdmi nls_iso8859_1 crct10dif_pclmul crc32_pclmul vfat snd_hda_intel polyval_clmulni fat snd_intel_dspcfg polyval_generic gf128mul snd_intel_sdw_acpi ghash_clmulni_intel snd_hda_codec sha512_ssse3 sha256_ssse3 snd_hda_core sha1_ssse3 aesni_intel snd_hwdep crypto_simd snd_pcm cryptd igb snd_timer ptp joydev mousedev snd pps_core sp5100_tco ccp dca soundcore rapl
[13680.713130] acpi_cpufreq pcspkr i2c_piix4 k10temp wmi_bmof mac_hid uinput i2c_dev sg crypto_user dm_mod loop nfnetlink ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_generic usbhid amdgpu video amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_helper drm_buddy nvme drm_display_helper crc32c_intel nvme_core xhci_pci cec xhci_pci_renesas nvme_auth wmi
[13680.713155] CPU: 0 PID: 180969 Comm: kworker/0:4H Not tainted 6.9.1-arch1-1 #1 8721656fa781c58301f7268d475f3e6380e2b47c
[13680.713157] Hardware name: To Be Filled By O.E.M. X570 Pro4/X570 Pro4, BIOS P5.01 01/18/2023
[13680.713158] Workqueue: events_highpri dm_irq_work_func [amdgpu]
[13680.713343] RIP: 0010:dsc2_disable+0x108/0x180 [amdgpu]
[13680.713520] Code: 4c 24 0c 44 8b 43 10 48 8b 40 10 48 8b 30 48 85 f6 74 04 48 8b 76 08 48 c7 c1 f0 2e 28 c1 ba 08 00 00 00 31 ff e8 08 c6 76 d4 <0f> 0b 48 8b 53 20 48 8b 43 28 45 31 c9 48 8b 7b 08 0f b6 8a b4 00
[13680.713522] RSP: 0018:ffffaf4315b7f7a0 EFLAGS: 00010246
[13680.713523] RAX: 0000000000000000 RBX: ffff9ead0873fe00 RCX: ffffffffc1282ef0
[13680.713524] RDX: 0000000000000008 RSI: ffff9ead0220e0c8 RDI: 0000000000000000
[13680.713525] RBP: ffff9eb385c00be0 R08: 0000000000000001 R09: 0000000000000001
[13680.713527] R10: ffff9eb385c00be0 R11: 0000000000000001 R12: ffff9ead08b96b00
[13680.713527] R13: ffff9eb017e68000 R14: 0000000000000002 R15: 0000000000000000
[13680.713529] FS: 0000000000000000(0000) GS:ffff9ebbfe200000(0000) knlGS:0000000000000000
[13680.713530] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13680.713531] CR2: 00007cad0a200990 CR3: 0000000913a20000 CR4: 0000000000f50ef0
[13680.713532] PKRU: 55555554
[13680.713533] Call Trace:
[13680.713535] <TASK>
[13680.713536] ? dsc2_disable+0x108/0x180 [amdgpu 11785c3085e75bb1d1465c3bd7f7962d53ef457f]
[13680.713710] ? __warn.cold+0x8e/0xe8
[13680.713713] ? dsc2_disable+0x108/0x180 [amdgpu 11785c3085e75bb1d1465c3bd7f7962d53ef457f]
[13680.713886] ? report_bug+0xff/0x140
[13680.713889] ? handle_bug+0x3c/0x80
[13680.713891] ? exc_invalid_op+0x17/0x70
[13680.713893] ? asm_exc_invalid_op+0x1a/0x20
[13680.713897] ? dsc2_disable+0x108/0x180 [amdgpu 11785c3085e75bb1d1465c3bd7f7962d53ef457f]
[13680.714069] link_set_dsc_on_stream+0x409/0x480 [amdgpu 11785c3085e75bb1d1465c3bd7f7962d53ef457f]
[13680.714249] ? srso_alias_return_thunk+0x5/0xfbef5
[13680.714251] ? dm_helpers_dp_write_dsc_enable+0x28a/0x700 [amdgpu 11785c3085e75bb1d1465c3bd7f7962d53ef457f]
[13680.714433] link_set_dsc_enable+0x83/0x90 [amdgpu 11785c3085e75bb1d1465c3bd7f7962d53ef457f]
[13680.714613] link_set_dpms_off+0x1ab/0x730 [amdgpu 11785c3085e75bb1d1465c3bd7f7962d53ef457f]
[13680.714792] commit_planes_for_stream+0x588/0x1780 [amdgpu 11785c3085e75bb1d1465c3bd7f7962d53ef457f]
[13680.714952] ? srso_alias_return_thunk+0x5/0xfbef5
[13680.714954] ? dcn32_is_pipe_topology_transition_seamless+0x58/0x180 [amdgpu 11785c3085e75bb1d1465c3bd7f7962d53ef457f]
[13680.715135] dc_update_planes_and_stream+0x4c4/0xcb0 [amdgpu 11785c3085e75bb1d1465c3bd7f7962d53ef457f]
[13680.715297] dc_commit_updates_for_stream+0x449/0x520 [amdgpu 11785c3085e75bb1d1465c3bd7f7962d53ef457f]
[13680.715454] link_set_all_streams_dpms_off_for_link+0xc5/0x110 [amdgpu 11785c3085e75bb1d1465c3bd7f7962d53ef457f]
[13680.715638] link_detect+0x40b/0x4f0 [amdgpu 11785c3085e75bb1d1465c3bd7f7962d53ef457f]
[13680.715817] handle_hpd_irq_helper+0xf7/0x170 [amdgpu 11785c3085e75bb1d1465c3bd7f7962d53ef457f]
[13680.715998] process_one_work+0x18e/0x350
[13680.716002] worker_thread+0x2eb/0x410
[13680.716004] ? __pfx_worker_thread+0x10/0x10
[13680.716006] kthread+0xd2/0x100
[13680.716009] ? __pfx_kthread+0x10/0x10
[13680.716011] ret_from_fork+0x34/0x50
[13680.716013] ? __pfx_kthread+0x10/0x10
[13680.716015] ret_from_fork_asm+0x1a/0x30
[13680.716019] </TASK>
[13680.716020] ---[ end trace 0000000000000000 ]---
[13681.615293] snd_hda_intel 0000:0b:00.1: Refused to change power state from D0 to D3hot
Hyprland crash log sample:
Hyprland received signal 11(SEGV)
Backtrace:
#0 | Hyprland(_Z12getBacktracev+0x61) [0x5945ff609d21]
getBacktrace()
??:?
#1 | Hyprland(_ZN13CrashReporter18createAndSaveCrashEi+0xde9) [0x5945ff5a0729]
CrashReporter::createAndSaveCrash(int)
??:?
#2 | Hyprland(_Z25handleUnrecoverableSignali+0x71) [0x5945ff520281]
handleUnrecoverableSignal(int)
??:?
#3 | /usr/lib/libc.so.6(+0x3cae0) [0x7442b9565ae0]
??
??:0
#4 | Hyprland(+0x28b55b) [0x5945ff6a855b]
CGammaControlProtocol::applyGammaToState(CMonitor*)
??:?
#5 | Hyprland(+0x2fbda1) [0x5945ff718da1]
IHyprWindowDecoration::getDisplayName[abi:cxx11]()
??:?
#6 | /usr/lib/libffi.so.8(+0x7596) [0x7442b9b13596]
??
??:0
#7 | /usr/lib/libffi.so.8(+0x400e) [0x7442b9b1000e]
??
??:0
#8 | /usr/lib/libffi.so.8(ffi_call+0x123) [0x7442b9b12bd3]
??
??:0
#9 | /usr/lib/libwayland-server.so.0(+0x8ada) [0x7442b9fafada]
??
??:0
#10 | /usr/lib/libwayland-server.so.0(+0xd180) [0x7442b9fb4180]
??
??:0
#11 | /usr/lib/libwayland-server.so.0(wl_event_loop_dispatch+0xa2) [0x7442b9fb2ae2]
??
??:0
#12 | /usr/lib/libwayland-server.so.0(wl_display_run+0x27) [0x7442b9fb32d7]
??
??:0
#13 | Hyprland(_ZN17CEventLoopManager9enterLoopEP10wl_displayP13wl_event_loop+0x55) [0x5945ff661bc5]
CEventLoopManager::enterLoop(wl_display*, wl_event_loop*)
??:?
#14 | Hyprland(main+0xa4d) [0x5945ff4e959d]
main
??:?
#15 | /usr/lib/libc.so.6(+0x25c88) [0x7442b954ec88]
??
??:0
#16 | /usr/lib/libc.so.6(__libc_start_main+0x8c) [0x7442b954ed4c]
??
??:0
#17 | Hyprland(_start+0x25) [0x5945ff51cbb5]
_start
??:?