• Bug#1102175: inux-amd-64: Repeated lockups in AMDGPU DRM driver

    From Alison Chaiken@21:1/5 to All on Sun Apr 6 06:30:01 2025
    Package: linux-amd-64
    Version: linux-image-6.12.19-amd64
    Severity: important
    X-Debbugs-Cc: alison@she-devel.com

    Dear Maintainer,

    When the PC has been sitting idle with the monitor off and I try to
    interact
    with it, I sometimes find that it is locked up and won't respond to
    keystrokes,
    ping or sysrq. Based on the syslog, it looks like the PCIe endpoint
    has gone
    into the low-power D3cold state and can't be woken. The result is

    watchdog: Watchdog detected hard LOCKUP on cpu 20

    after which I must press the power button to shutdown and reboot. Below
    is some output from the syslog.

    Thanks for your hard work maintaing Debian.

    -- Alison Chaiken, alison@she-devel.com

    2025-03-30T14:54:01.842776-07:00 schallkreis kernel: pcieport
    0000:0f:00.0: Unable to change power state from D3cold to D0, device inaccessible
    2025-03-30T14:54:02.504644-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring sdma0
    2025-03-30T14:54:02.504652-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:02.945988-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring sdma0
    2025-03-30T14:54:03.387173-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:04.049130-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:04.049136-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring sdma1
    2025-03-30T14:54:04.931536-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:05.373985-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:09.785625-07:00 schallkreis kwin_wayland_wrapper[4583]: kwin_libinput: Libinput: event22 - Logitech USB Optical Mouse: client
    bug: event processing lagging behind by 220ms, your system is too slow 2025-03-30T14:54:10.888484-07:00 schallkreis kernel: watchdog: Watchdog detected hard LOCKUP on cpu 20
    2025-03-30T14:54:10.888491-07:00 schallkreis kernel: Modules linked in:
    dm_mod cpuid tls qrtr binfmt_misc nls_ascii nls_cp437 vfat fat amd_atl intel_rapl_msr intel_rapl_common mt7925e mt7925_common mt792x_lib mt76_connac_lib edac_mce_amd snd_hda_codec_realtek mt76 kvm_amd snd_hda_codec_generic mac80211 snd_hda_scodec_component
    snd_hda_codec_hdmi snd_hda_intel kvm snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec libarc4 snd_hda_core cfg80211 crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 snd_hwdep sha256_ssse3 sha1_ssse3
    snd_pcm aesni_intel spd5118 gf128mul crypto_simd cryptd snd_timer
    wmi_bmof sp5100_tco rapl snd ccp pcspkr watchdog rfkill k10temp
    soundcore joydev evdev nfsd auth_rpcgss nfs_acl lockd grace sunrpc
    parport_pc ppdev lp parport configfs efi_pstore nfnetlink efivarfs
    ip_tables x_tables autofs4 ext4 mbcache jbd2 crc32c_generic hid_generic
    usbhid hid amdgpu amdxcp drm_exec gpu_sched drm_buddy i2c_algo_bit drm_suballoc_helper drm_display_helper cec rc_core drm_ttm_helper ttm
    xhci_pci xhci_hcd drm_kms_helper ahci libahci r8169 libata drm nvme 2025-03-30T14:54:10.888493-07:00 schallkreis kernel: thunderbolt
    realtek usbcore crc32_pclmul mdio_devres scsi_mod crc32c_intel nvme_core
    libphy i2c_piix4 i2c_smbus usb_common scsi_common nvme_auth crc16 video
    wmi gpio_amdpt gpio_generic butto2025-03-30T14:54:10.888494-07:00
    schallkreis kernel: CPU: 20 UID: 1000 PID: 4645 Comm: HDMI-A-1 Tainted:
    G W 6.12.17-amd64 #1 Debian 6.12.17-1 2025-03-30T14:54:10.888501-07:00 schallkreis kernel: Tainted: [W]DWARN 2025-03-30T14:54:10.888502-07:00 schallkreis kernel: Hardware name:
    System76 Thelio Mira/Thelio Mira, BIOS 3.11.SP01 12/05/2024 2025-03-30T14:54:10.888503-07:00 schallkreis kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x70/0x2a0 2025-03-30T14:54:10.888503-07:00 schallkreis kernel: Code: 77 77 f0 0f
    ba 2b 08 0f 92 c2 8b 03 0f b6 d2 c1 e2 08 30 e4 09 d0 3d ff 00 00 00 77
    53 85 c0 74=
    10 0f b6 03 84 c0 74 09 f3 90 <0f> b6 03 84 c0 75 f7 b8 01 00 00 00 66
    89 03 5b 5d 41 5c 41 5d c3
    2025-03-30T14:54:10.888504-07:00 schallkreis kernel: RSP:
    0018:ffffb0d1cb007868 EFLAGS: 00000002
    2025-03-30T14:54:10.888510-07:00 schallkreis kernel: RAX:
    0000000000000001 RBX: ffff8b47eb080178 RCX: ffff8b4a20fd62d2025-03-30T14:54:10.888510-07:00 schallkreis kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8b47eb080172025-03-30T14:54:10.888511-07:00 schallkreis kernel: RBP: ffff8b4a20fd6280 R08: 0000000000000080 R09: ffff8b4a20fd6282025-03-30T14:54:10.888511-07:00 schallkreis kernel: R10: ffffb0d1cb0078e0 R11: 0000000000000004 R12: ffff8b47eb080172025-03-30T14:54:10.888511-07:00 schallkreis kernel: R13: ffff8b4a20fd62d8 R14: ffff8b4a20fd6080 R15: ffff8b4a20fd6282025-03-30T14:54:10.888511-07:00 schallkreis kernel: FS: 00007fc576ffd6c0(0000) GS:ffff8b565fc00000(0000) knlGS:0000000000000000 2025-03-30T14:54:10.888512-07:00 schallkreis kernel: CS: 0010 DS: 0000
    ES: 0000 CR0: 0000000080050033
    2025-03-30T14:54:10.888513-07:00 schallkreis kernel: CR2:
    00007f8f80d1aef0 CR3: 0000000129824000 CR4: 0000000000f50ef2025-03-30T14:54:10.888513-07:00 schallkreis kernel:
    PKRU: 55555554
    2025-03-30T14:54:10.888514-07:00 schallkreis kernel: Call Trace: 2025-03-30T14:54:10.888514-07:00 schallkreis kernel: <NMI> 2025-03-30T14:54:10.888514-07:00 schallkreis kernel: ? watchdog_hardlockup_check.cold+0x100/0x105
    2025-03-30T14:54:10.888514-07:00 schallkreis kernel: ? __perf_event_overflow+0x10c/0x320
    2025-03-30T14:54:10.888514-07:00 schallkreis kernel: ? amd_pmu_v2_handle_irq+0x2a1/0x3c0
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ? perf_event_nmi_handler+0x2a/0x50
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ?
    nmi_handle+0x5e/0x120
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ? default_do_nmi+0x40/0x130
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ?
    exc_nmi+0x122/0x1a0
    2025-03-30T14:54:10.888516-07:00 schallkreis kernel: ?
    end_repeat_nmi+0xf/0x53
    2025-03-30T14:54:10.888516-07:00 schallkreis kernel: ? native_queued_spin_lock_slowpath+0x70/0x2a0
    2025-03-30T14:54:10.888517-07:00 schallkreis kernel: ? native_queued_spin_lock_slowpath+0x70/0x2a0
    2025-03-30T14:54:10.888517-07:00 schallkreis kernel: ? native_queued_spin_lock_slowpath+0x70/0x2a0
    2025-03-30T14:54:10.888517-07:00 schallkreis kernel: </NMI2025-03-30T14:54:10.888518-07:00 schallkreis kernel: <TASK> 2025-03-30T14:54:10.888518-07:00 schallkreis kernel: _raw_spin_lock_irqsave+0x3d/0x50
    2025-03-30T14:54:10.888518-07:00 schallkreis kernel: drm_event_reserve_init+0x2f/0xc0 [drm]
    2025-03-30T14:54:10.888518-07:00 schallkreis kernel: drm_mode_atomic_ioctl+0x61d/0xcb0 [drm]
    2025-03-30T14:54:10.888519-07:00 schallkreis kernel: ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm]
    2025-03-30T14:54:10.888519-07:00 schallkreis kernel: drm_ioctl_kernel+0xad/0x100 [drm]
    2025-03-30T14:54:10.888519-07:00 schallkreis kernel:
    drm_ioctl+0x277/0x4f0 [drm]
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel: ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm]
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel:
    amdgpu_drm_ioctl+0x4b/0x80 [amdgpu]
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel:
    __x64_sys_ioctl+0x91/0xd0
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel:
    do_syscall_64+0x82/0x190
    2025-03-30T14:54:10.888521-07:00 schallkreis kernel: ? __x64_sys_poll+0xd0/0x180
    2025-03-30T14:54:10.888521-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888521-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888522-07:00 schallkreis kernel: ? restore_fpregs_from_fpstate+0x3c/0xa0
    2025-03-30T14:54:10.888522-07:00 schallkreis kernel: ? switch_fpu_return+0x4e/0xd0
    2025-03-30T14:54:10.888522-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x172/0x210
    2025-03-30T14:54:10.888523-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888523-07:00 schallkreis kernel: ?
    do_futex+0x125/0x190
    2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? __x64_sys_futex+0x127/0x1e0
    2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? __x64_sys_poll+0xd0/0x180
    2025-03-30T14:54:10.888525-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888525-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888525-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888526-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888526-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888526-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888527-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888527-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888527-07:00 schallkreis kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
    2025-03-30T14:54:10.888528-07:00 schallkreis kernel: RIP:
    0033:0x7fc5ce5168db
    2025-03-30T14:54:10.888528-07:00 schallkreis kernel: Code: 00 48 89 44
    24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44
    24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 =
    f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00 2025-03-30T14:54:10.888528-07:00 schallkreis kernel: RSP:
    002b:00007fc576ffc3b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000012025-03-30T14:54:10.888529-07:00 schallkreis kernel: RAX: ffffffffffffffda RBX: 00007fc56c00bf20 RCX: 00007fc5ce5168d2025-03-30T14:54:10.888529-07:00 schallkreis kernel: RDX: 00007fc576ffc4a0 RSI: 00000000c03864bc RDI: 0000000000000012025-03-30T14:54:10.888529-07:00 schallkreis kernel: RBP: 00007fc576ffc4a0 R08: 00007fc56c00ebd0 R09: 00007fc56c0084a2025-03-30T14:54:10.888530-07:00 schallkreis kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c03864bc 2025-03-30T14:54:10.888530-07:00 schallkreis kernel: R13:
    0000000000000012 R14: 00007fc56c01983c R15: 00007fc56c008520 2025-03-30T14:54:10.888530-07:00 schallkreis kernel: </TASK> 2025-03-30T14:54:10.888530-07:00 schallkreis kernel: watchdog: Watchdog detected hard LOCKUP on cpu 8
    2025-03-30T14:54:10.888531-07:00 schallkreis kernel: Modules linked in:
    dm_mod cpuid tls qrtr binfmt_misc nls_ascii nls_cp437 vfat fat amd_atl intel_rapl_msr intel_rapl_common mt7925e mt=
    7925_common mt792x_lib mt76_connac_lib edac_mce_amd
    snd_hda_codec_realtek mt76 kvm_amd snd_hda_codec_generic mac80211 snd_hda_scodec_component snd_hda_codec_hdmi snd_hda_intel kvm
    snd_intel_dspcfg =
    snd_intel_sdw_acpi snd_hda_codec libarc4 snd_hda_core cfg80211
    crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 snd_hwdep sha256_ssse3 sha1_ssse3 snd_pcm aesni_intel spd5118 gf128mul crypto_simd=
    cryptd snd_timer wmi_bmof sp5100_tco rapl snd ccp pcspkr watchdog
    rfkill k10temp soundcore joydev evdev nfsd auth_rpcgss nfs_acl lockd
    grace sunrpc parport_pc ppdev lp parpor=
    t configfs efi_pstore nfnetlink efivarfs ip_tables x_tables autofs4 ext4 mbcache jbd2 crc32c_generic hid_generic usbhid hid amdgpu amdxcp
    drm_exec gpu_sched drm_buddy i2c_algo_bit drm=
    _suballoc_helper drm_display_helper cec rc_core drm_ttm_helper ttm
    xhci_pci xhci_hcd drm_kms_helper ahci libahci r8169 libata drm nvme 2025-03-30T14:54:10.888532-07:00 schallkreis kernel: thunderbolt
    realtek usbcore crc32_pclmul mdio_devres scsi_mod crc32c_intel nvme_core
    libphy i2c_piix4 i2c_smbus usb_common scsi_common =
    nvme_auth crc16 video wmi gpio_amdpt gpio_generic butto2025-03-30T14:54:10.888532-07:00 schallkreis kernel: CPU: 8 UID: 0
    PID: 0 Comm: swapper/8 Tainted: G W=
    6.12.17-amd64 #1 Debian
    6.12.17-2025-03-30T14:54:10.888532-07:00 schallkreis kernel: Tainted: [W]=3DWARN
    2025-03-30T14:54:10.888533-07:00 schallkreis kernel: Hardware name:
    System76 Thelio Mira/Thelio Mira, BIOS 3.11.SP01 12/05/2024

    -- System Information:
    Debian Release: trixie/sid
    APT prefers testing-debug
    APT policy: (500, 'testing-debug'), (500, 'testing')
    Architecture: amd64 (x86_64)

    Kernel: Linux 6.12.19-amd64 (SMP w/24 CPU threads; PREEMPT)
    Kernel taint flags: TAINT_WARN
    Locale: LANGDen_US.UTF-8, LC_CTYPE=3Den_US.UTF-8 (charmap=3DUTF-8),
    LANGUAGE not set
    Shell: /bin/sh linked to /usr/bin/dash
    Init: systemd (via /run/systemd/system)
    LSM: AppArmor: enabled

    ---
    Alison Chaiken alison@she-devel.com
    https://she-devel.com
    "What respite from her thrilling toil did Beauty ever take -- But Work
    might be Electric Rest To those that Magic make" -- Emily Dickson <html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /></head><body style='font-size: 11pt; font-family: "Times New Roman",Times,serif'>
    <p>Package: linux-amd-64<br />Version: linux-image-6.12.19-amd64<br />Severity: important<br />X-Debbugs-Cc: alison@she-devel.com</p>
    <p>Dear Maintainer,</p>
    <p>When the PC has been sitting idle with the monitor off &nbsp;and I try to interact<br />with it, I sometimes find that it is locked up and won't respond to keystrokes,<br />ping or sysrq. &nbsp; Based on the syslog, it looks like the PCIe endpoint has
    gone<br />into the low-power D3cold state and can't be woken. &nbsp;The result is</p>
    <p>&nbsp; &nbsp; &nbsp;watchdog: Watchdog detected hard LOCKUP on cpu 20</p> <p>after which I must press the power button to shutdown
  • From Salvatore Bonaccorso@21:1/5 to Alison Chaiken on Thu Apr 10 23:40:01 2025
    XPost: linux.debian.kernel

    Control: tags -1 + moreinfo

    Hi Alison,

    On Sat, Apr 05, 2025 at 09:24:01PM -0700, Alison Chaiken wrote:
    Package: linux-amd-64
    Version: linux-image-6.12.19-amd64
    Severity: important
    X-Debbugs-Cc: alison@she-devel.com

    Dear Maintainer,

    When the PC has been sitting idle with the monitor off and I try to
    interact
    with it, I sometimes find that it is locked up and won't respond to keystrokes,
    ping or sysrq. Based on the syslog, it looks like the PCIe endpoint has gone
    into the low-power D3cold state and can't be woken. The result is

    watchdog: Watchdog detected hard LOCKUP on cpu 20

    after which I must press the power button to shutdown and reboot. Below is some output from the syslog.

    Thanks for your hard work maintaing Debian.

    -- Alison Chaiken, alison@she-devel.com

    2025-03-30T14:54:01.842776-07:00 schallkreis kernel: pcieport 0000:0f:00.0: Unable to change power state from D3cold to D0, device inaccessible 2025-03-30T14:54:02.504644-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring sdma0
    2025-03-30T14:54:02.504652-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:02.945988-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring sdma0
    2025-03-30T14:54:03.387173-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:04.049130-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:04.049136-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring sdma1
    2025-03-30T14:54:04.931536-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:05.373985-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:09.785625-07:00 schallkreis kwin_wayland_wrapper[4583]: kwin_libinput: Libinput: event22 - Logitech USB Optical Mouse: client bug: event processing lagging behind by 220ms, your system is too slow 2025-03-30T14:54:10.888484-07:00 schallkreis kernel: watchdog: Watchdog detected hard LOCKUP on cpu 20
    2025-03-30T14:54:10.888491-07:00 schallkreis kernel: Modules linked in: dm_mod cpuid tls qrtr binfmt_misc nls_ascii nls_cp437 vfat fat amd_atl intel_rapl_msr intel_rapl_common mt7925e mt7925_common mt792x_lib mt76_connac_lib edac_mce_amd snd_hda_codec_realtek mt76 kvm_amd snd_hda_codec_generic mac80211 snd_hda_scodec_component snd_hda_codec_hdmi snd_hda_intel kvm snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec libarc4 snd_hda_core cfg80211 crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 snd_hwdep sha256_ssse3 sha1_ssse3 snd_pcm aesni_intel spd5118 gf128mul crypto_simd cryptd snd_timer wmi_bmof sp5100_tco rapl snd ccp pcspkr
    watchdog rfkill k10temp soundcore joydev evdev nfsd auth_rpcgss nfs_acl
    lockd grace sunrpc parport_pc ppdev lp parport configfs efi_pstore nfnetlink efivarfs ip_tables x_tables autofs4 ext4 mbcache jbd2 crc32c_generic hid_generic usbhid hid amdgpu amdxcp drm_exec gpu_sched drm_buddy i2c_algo_bit drm_suballoc_helper drm_display_helper cec rc_core drm_ttm_helper ttm xhci_pci xhci_hcd drm_kms_helper ahci libahci r8169
    libata drm nvme
    2025-03-30T14:54:10.888493-07:00 schallkreis kernel: thunderbolt realtek usbcore crc32_pclmul mdio_devres scsi_mod crc32c_intel nvme_core libphy i2c_piix4 i2c_smbus usb_common scsi_common nvme_auth crc16 video wmi gpio_amdpt gpio_generic butto2025-03-30T14:54:10.888494-07:00 schallkreis kernel: CPU: 20 UID: 1000 PID: 4645 Comm: HDMI-A-1 Tainted: G W 6.12.17-amd64 #1 Debian 6.12.17-1
    2025-03-30T14:54:10.888501-07:00 schallkreis kernel: Tainted: [W]DWARN 2025-03-30T14:54:10.888502-07:00 schallkreis kernel: Hardware name: System76 Thelio Mira/Thelio Mira, BIOS 3.11.SP01 12/05/2024 2025-03-30T14:54:10.888503-07:00 schallkreis kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x70/0x2a0 2025-03-30T14:54:10.888503-07:00 schallkreis kernel: Code: 77 77 f0 0f ba 2b 08 0f 92 c2 8b 03 0f b6 d2 c1 e2 08 30 e4 09 d0 3d ff 00 00 00 77 53 85 c0 74=
    10 0f b6 03 84 c0 74 09 f3 90 <0f> b6 03 84 c0 75 f7 b8 01 00 00 00 66 89
    03 5b 5d 41 5c 41 5d c3
    2025-03-30T14:54:10.888504-07:00 schallkreis kernel: RSP: 0018:ffffb0d1cb007868 EFLAGS: 00000002
    2025-03-30T14:54:10.888510-07:00 schallkreis kernel: RAX: 0000000000000001 RBX: ffff8b47eb080178 RCX: ffff8b4a20fd62d2025-03-30T14:54:10.888510-07:00 schallkreis kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8b47eb080172025-03-30T14:54:10.888511-07:00 schallkreis kernel: RBP: ffff8b4a20fd6280 R08: 0000000000000080 R09: ffff8b4a20fd6282025-03-30T14:54:10.888511-07:00 schallkreis kernel: R10: ffffb0d1cb0078e0 R11: 0000000000000004 R12: ffff8b47eb080172025-03-30T14:54:10.888511-07:00 schallkreis kernel: R13: ffff8b4a20fd62d8 R14: ffff8b4a20fd6080 R15: ffff8b4a20fd6282025-03-30T14:54:10.888511-07:00 schallkreis kernel: FS: 00007fc576ffd6c0(0000) GS:ffff8b565fc00000(0000) knlGS:0000000000000000 2025-03-30T14:54:10.888512-07:00 schallkreis kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    2025-03-30T14:54:10.888513-07:00 schallkreis kernel: CR2: 00007f8f80d1aef0 CR3: 0000000129824000 CR4: 0000000000f50ef2025-03-30T14:54:10.888513-07:00 schallkreis kernel: PKRU: 55555554
    2025-03-30T14:54:10.888514-07:00 schallkreis kernel: Call Trace: 2025-03-30T14:54:10.888514-07:00 schallkreis kernel: <NMI> 2025-03-30T14:54:10.888514-07:00 schallkreis kernel: ? watchdog_hardlockup_check.cold+0x100/0x105
    2025-03-30T14:54:10.888514-07:00 schallkreis kernel: ? __perf_event_overflow+0x10c/0x320
    2025-03-30T14:54:10.888514-07:00 schallkreis kernel: ? amd_pmu_v2_handle_irq+0x2a1/0x3c0
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ? perf_event_nmi_handler+0x2a/0x50
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ?
    nmi_handle+0x5e/0x120
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ? default_do_nmi+0x40/0x130
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ? exc_nmi+0x122/0x1a0 2025-03-30T14:54:10.888516-07:00 schallkreis kernel: ? end_repeat_nmi+0xf/0x53
    2025-03-30T14:54:10.888516-07:00 schallkreis kernel: ? native_queued_spin_lock_slowpath+0x70/0x2a0
    2025-03-30T14:54:10.888517-07:00 schallkreis kernel: ? native_queued_spin_lock_slowpath+0x70/0x2a0
    2025-03-30T14:54:10.888517-07:00 schallkreis kernel: ? native_queued_spin_lock_slowpath+0x70/0x2a0
    2025-03-30T14:54:10.888517-07:00 schallkreis kernel: </NMI2025-03-30T14:54:10.888518-07:00 schallkreis kernel: <TASK> 2025-03-30T14:54:10.888518-07:00 schallkreis kernel: _raw_spin_lock_irqsave+0x3d/0x50
    2025-03-30T14:54:10.888518-07:00 schallkreis kernel: drm_event_reserve_init+0x2f/0xc0 [drm]
    2025-03-30T14:54:10.888518-07:00 schallkreis kernel: drm_mode_atomic_ioctl+0x61d/0xcb0 [drm]
    2025-03-30T14:54:10.888519-07:00 schallkreis kernel: ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm]
    2025-03-30T14:54:10.888519-07:00 schallkreis kernel: drm_ioctl_kernel+0xad/0x100 [drm]
    2025-03-30T14:54:10.888519-07:00 schallkreis kernel: drm_ioctl+0x277/0x4f0 [drm]
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel: ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm]
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel: amdgpu_drm_ioctl+0x4b/0x80 [amdgpu]
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel: __x64_sys_ioctl+0x91/0xd0
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel:
    do_syscall_64+0x82/0x190
    2025-03-30T14:54:10.888521-07:00 schallkreis kernel: ? __x64_sys_poll+0xd0/0x180
    2025-03-30T14:54:10.888521-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888521-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888522-07:00 schallkreis kernel: ? restore_fpregs_from_fpstate+0x3c/0xa0
    2025-03-30T14:54:10.888522-07:00 schallkreis kernel: ? switch_fpu_return+0x4e/0xd0
    2025-03-30T14:54:10.888522-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x172/0x210
    2025-03-30T14:54:10.888523-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888523-07:00 schallkreis kernel: ? do_futex+0x125/0x190 2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? __x64_sys_futex+0x127/0x1e0
    2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? __x64_sys_poll+0xd0/0x180
    2025-03-30T14:54:10.888525-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888525-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888525-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888526-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888526-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888526-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888527-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888527-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888527-07:00 schallkreis kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
    2025-03-30T14:54:10.888528-07:00 schallkreis kernel: RIP:
    0033:0x7fc5ce5168db
    2025-03-30T14:54:10.888528-07:00 schallkreis kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48
    89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 =
    f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00 2025-03-30T14:54:10.888528-07:00 schallkreis kernel: RSP: 002b:00007fc576ffc3b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000012025-03-30T14:54:10.888529-07:00 schallkreis kernel: RAX: ffffffffffffffda RBX: 00007fc56c00bf20 RCX: 00007fc5ce5168d2025-03-30T14:54:10.888529-07:00 schallkreis kernel: RDX: 00007fc576ffc4a0 RSI: 00000000c03864bc RDI: 0000000000000012025-03-30T14:54:10.888529-07:00 schallkreis kernel: RBP: 00007fc576ffc4a0 R08: 00007fc56c00ebd0 R09: 00007fc56c0084a2025-03-30T14:54:10.888530-07:00 schallkreis kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c03864bc 2025-03-30T14:54:10.888530-07:00 schallkreis kernel: R13: 0000000000000012 R14: 00007fc56c01983c R15: 00007fc56c008520
    2025-03-30T14:54:10.888530-07:00 schallkreis kernel: </TASK> 2025-03-30T14:54:10.888530-07:00 schallkreis kernel: watchdog: Watchdog detected hard LOCKUP on cpu 8
    2025-03-30T14:54:10.888531-07:00 schallkreis kernel: Modules linked in: dm_mod cpuid tls qrtr binfmt_misc nls_ascii nls_cp437 vfat fat amd_atl intel_rapl_msr intel_rapl_common mt7925e mt=
    7925_common mt792x_lib mt76_connac_lib edac_mce_amd snd_hda_codec_realtek mt76 kvm_amd snd_hda_codec_generic mac80211 snd_hda_scodec_component snd_hda_codec_hdmi snd_hda_intel kvm snd_intel_dspcfg =
    snd_intel_sdw_acpi snd_hda_codec libarc4 snd_hda_core cfg80211 crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 snd_hwdep sha256_ssse3 sha1_ssse3 snd_pcm aesni_intel spd5118 gf128mul crypto_simd=
    cryptd snd_timer wmi_bmof sp5100_tco rapl snd ccp pcspkr watchdog rfkill k10temp soundcore joydev evdev nfsd auth_rpcgss nfs_acl lockd grace sunrpc parport_pc ppdev lp parpor=
    t configfs efi_pstore nfnetlink efivarfs ip_tables x_tables autofs4 ext4 mbcache jbd2 crc32c_generic hid_generic usbhid hid amdgpu amdxcp drm_exec gpu_sched drm_buddy i2c_algo_bit drm=
    _suballoc_helper drm_display_helper cec rc_core drm_ttm_helper ttm xhci_pci xhci_hcd drm_kms_helper ahci libahci r8169 libata drm nvme 2025-03-30T14:54:10.888532-07:00 schallkreis kernel: thunderbolt realtek usbcore crc32_pclmul mdio_devres scsi_mod crc32c_intel nvme_core libphy i2c_piix4 i2c_smbus usb_common scsi_common =
    nvme_auth crc16 video wmi gpio_amdpt gpio_generic butto2025-03-30T14:54:10.888532-07:00 schallkreis kernel: CPU: 8 UID: 0 PID: 0 Comm: swapper/8 Tainted: G W=
    6.12.17-amd64 #1 Debian 6.12.17-2025-03-30T14:54:10.888532-07:00 schallkreis kernel: Tainted: [W]=3DWARN
    2025-03-30T14:54:10.888533-07:00 schallkreis kernel: Hardware name: System76 Thelio Mira/Thelio Mira, BIOS 3.11.SP01 12/05/2024

    As the kernel is tained, this is not the first warning. Can you please
    provide the full kernel log so we can see what is first happening?

    Is the issue actually reproducible for you? If yes could you as well
    test newer kernels as the above shows it's under 6.12.17-1? current in
    unstable is 6.12.22, or try 6.13.10 from experimental.

    Regards,
    Salvatore

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Alison Chaiken@21:1/5 to All on Thu Apr 17 07:10:02 2025
    XPost: linux.debian.kernel

    When the PC has been sitting idle with the monitor off and I try to
    interact with it, I sometimes find that it is locked up and won't
    respond to keystrokes, ping or sysrq. Based on the syslog, it looks
    like the PCIe endpoint has gone into the low-power D3cold state and
    can't be woken. The result is

    watchdog: Watchdog detected hard LOCKUP on cpu 20

    after which I must press the power button to shutdown and reboot. Below
    is some output from the syslog.

    *********************************************************************************************************

    2025-03-30T14:54:01.842776-07:00 schallkreis kernel: pcieport
    0000:0f:00.0: Unable to change power state from D3cold to D0, device inaccessible
    2025-03-30T14:54:02.504644-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring sdma0
    2025-03-30T14:54:02.504652-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:02.945988-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring sdma0
    2025-03-30T14:54:03.387173-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:04.049130-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:04.049136-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring sdma1
    2025-03-30T14:54:04.931536-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:05.373985-07:00 schallkreis kernel: [drm] Fence
    fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:09.785625-07:00 schallkreis kwin_wayland_wrapper[4583]: kwin_libinput: Libinput: event22 - Logitech USB Optical Mouse: client
    bug: event processing lagging behind by 220ms, your system is too slow 2025-03-30T14:54:10.888484-07:00 schallkreis kernel: watchdog: Watchdog detected hard LOCKUP on cpu 20
    2025-03-30T14:54:10.888491-07:00 schallkreis kernel: Modules linked in:
    dm_mod cpuid tls qrtr binfmt_misc nls_ascii nls_cp437 vfat fat amd_atl intel_rapl_msr intel_rapl_common mt7925e mt7925_common mt792x_lib mt76_connac_lib edac_mce_amd snd_hda_codec_realtek mt76 kvm_amd snd_hda_codec_generic mac80211 snd_hda_scodec_component
    snd_hda_codec_hdmi snd_hda_intel kvm snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec libarc4 snd_hda_core cfg80211 crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 snd_hwdep sha256_ssse3 sha1_ssse3
    snd_pcm aesni_intel spd5118 gf128mul crypto_simd cryptd snd_timer
    wmi_bmof sp5100_tco rapl snd ccp pcspkr watchdog rfkill k10temp
    soundcore joydev evdev nfsd auth_rpcgss nfs_acl lockd grace sunrpc
    parport_pc ppdev lp parport configfs efi_pstore nfnetlink efivarfs
    ip_tables x_tables autofs4 ext4 mbcache jbd2 crc32c_generic hid_generic
    usbhid hid amdgpu amdxcp drm_exec gpu_sched drm_buddy i2c_algo_bit drm_suballoc_helper drm_display_helper cec rc_core drm_ttm_helper ttm
    xhci_pci xhci_hcd drm_kms_helper ahci libahci r8169 libata drm nvme 2025-03-30T14:54:10.888493-07:00 schallkreis kernel: thunderbolt
    realtek usbcore crc32_pclmul mdio_devres scsi_mod crc32c_intel nvme_core
    libphy i2c_piix4 i2c_smbus usb_common scsi_common nvme_auth crc16 video
    wmi gpio_amdpt gpio_generic butto2025-03-30T14:54:10.888494-07:00
    schallkreis kernel: CPU: 20 UID: 1000 PID: 4645 Comm: HDMI-A-1 Tainted:
    G W 6.12.17-amd64 #1 Debian 6.12.17-1 2025-03-30T14:54:10.888501-07:00 schallkreis kernel: Tainted: [W]DWARN 2025-03-30T14:54:10.888502-07:00 schallkreis kernel: Hardware name:
    System76 Thelio Mira/Thelio Mira, BIOS 3.11.SP01 12/05/2024 2025-03-30T14:54:10.888503-07:00 schallkreis kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x70/0x2a0 2025-03-30T14:54:10.888503-07:00 schallkreis kernel: Code: 77 77 f0 0f
    ba 2b 08 0f 92 c2 8b 03 0f b6 d2 c1 e2 08 30 e4 09 d0 3d ff 00 00 00 77
    53 85 c0 74=
    10 0f b6 03 84 c0 74 09 f3 90 <0f> b6 03 84 c0 75 f7 b8 01 00 00 00 66
    89 03 5b 5d 41 5c 41 5d c3
    2025-03-30T14:54:10.888504-07:00 schallkreis kernel: RSP:
    0018:ffffb0d1cb007868 EFLAGS: 00000002
    2025-03-30T14:54:10.888510-07:00 schallkreis kernel: RAX:
    0000000000000001 RBX: ffff8b47eb080178 RCX: ffff8b4a20fd62d2025-03-30T14:54:10.888510-07:00 schallkreis kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8b47eb080172025-03-30T14:54:10.888511-07:00 schallkreis kernel: RBP: ffff8b4a20fd6280 R08: 0000000000000080 R09: ffff8b4a20fd6282025-03-30T14:54:10.888511-07:00 schallkreis kernel: R10: ffffb0d1cb0078e0 R11: 0000000000000004 R12: ffff8b47eb080172025-03-30T14:54:10.888511-07:00 schallkreis kernel: R13: ffff8b4a20fd62d8 R14: ffff8b4a20fd6080 R15: ffff8b4a20fd6282025-03-30T14:54:10.888511-07:00 schallkreis kernel: FS: 00007fc576ffd6c0(0000) GS:ffff8b565fc00000(0000) knlGS:0000000000000000 2025-03-30T14:54:10.888512-07:00 schallkreis kernel: CS: 0010 DS: 0000
    ES: 0000 CR0: 0000000080050033
    2025-03-30T14:54:10.888513-07:00 schallkreis kernel: CR2:
    00007f8f80d1aef0 CR3: 0000000129824000 CR4: 0000000000f50ef2025-03-30T14:54:10.888513-07:00 schallkreis kernel:
    PKRU: 55555554
    2025-03-30T14:54:10.888514-07:00 schallkreis kernel: Call Trace: 2025-03-30T14:54:10.888514-07:00 schallkreis kernel: <NMI> 2025-03-30T14:54:10.888514-07:00 schallkreis kernel: ? watchdog_hardlockup_check.cold+0x100/0x105
    2025-03-30T14:54:10.888514-07:00 schallkreis kernel: ? __perf_event_overflow+0x10c/0x320
    2025-03-30T14:54:10.888514-07:00 schallkreis kernel: ? amd_pmu_v2_handle_irq+0x2a1/0x3c0
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ? perf_event_nmi_handler+0x2a/0x50
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ?
    nmi_handle+0x5e/0x120
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ? default_do_nmi+0x40/0x130
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ?
    exc_nmi+0x122/0x1a0
    2025-03-30T14:54:10.888516-07:00 schallkreis kernel: ?
    end_repeat_nmi+0xf/0x53
    2025-03-30T14:54:10.888516-07:00 schallkreis kernel: ? native_queued_spin_lock_slowpath+0x70/0x2a0
    2025-03-30T14:54:10.888517-07:00 schallkreis kernel: ? native_queued_spin_lock_slowpath+0x70/0x2a0
    2025-03-30T14:54:10.888517-07:00 schallkreis kernel: ? native_queued_spin_lock_slowpath+0x70/0x2a0
    2025-03-30T14:54:10.888517-07:00 schallkreis kernel: </NMI2025-03-30T14:54:10.888518-07:00 schallkreis kernel: <TASK> 2025-03-30T14:54:10.888518-07:00 schallkreis kernel: _raw_spin_lock_irqsave+0x3d/0x50
    2025-03-30T14:54:10.888518-07:00 schallkreis kernel: drm_event_reserve_init+0x2f/0xc0 [drm]
    2025-03-30T14:54:10.888518-07:00 schallkreis kernel: drm_mode_atomic_ioctl+0x61d/0xcb0 [drm]
    2025-03-30T14:54:10.888519-07:00 schallkreis kernel: ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm]
    2025-03-30T14:54:10.888519-07:00 schallkreis kernel: drm_ioctl_kernel+0xad/0x100 [drm]
    2025-03-30T14:54:10.888519-07:00 schallkreis kernel:
    drm_ioctl+0x277/0x4f0 [drm]
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel: ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm]
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel:
    amdgpu_drm_ioctl+0x4b/0x80 [amdgpu]
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel:
    __x64_sys_ioctl+0x91/0xd0
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel:
    do_syscall_64+0x82/0x190
    2025-03-30T14:54:10.888521-07:00 schallkreis kernel: ? __x64_sys_poll+0xd0/0x180
    2025-03-30T14:54:10.888521-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888521-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888522-07:00 schallkreis kernel: ? restore_fpregs_from_fpstate+0x3c/0xa0
    2025-03-30T14:54:10.888522-07:00 schallkreis kernel: ? switch_fpu_return+0x4e/0xd0
    2025-03-30T14:54:10.888522-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x172/0x210
    2025-03-30T14:54:10.888523-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888523-07:00 schallkreis kernel: ?
    do_futex+0x125/0x190
    2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? __x64_sys_futex+0x127/0x1e0
    2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? __x64_sys_poll+0xd0/0x180
    2025-03-30T14:54:10.888525-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888525-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888525-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888526-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888526-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888526-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888527-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888527-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888527-07:00 schallkreis kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
    2025-03-30T14:54:10.888528-07:00 schallkreis kernel: RIP:
    0033:0x7fc5ce5168db
    2025-03-30T14:54:10.888528-07:00 schallkreis kernel: Code: 00 48 89 44
    24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44
    24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 =
    f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00 2025-03-30T14:54:10.888528-07:00 schallkreis kernel: RSP:
    002b:00007fc576ffc3b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000012025-03-30T14:54:10.888529-07:00 schallkreis kernel: RAX: ffffffffffffffda RBX: 00007fc56c00bf20 RCX: 00007fc5ce5168d2025-03-30T14:54:10.888529-07:00 schallkreis kernel: RDX: 00007fc576ffc4a0 RSI: 00000000c03864bc RDI: 0000000000000012025-03-30T14:54:10.888529-07:00 schallkreis kernel: RBP: 00007fc576ffc4a0 R08: 00007fc56c00ebd0 R09: 00007fc56c0084a2025-03-30T14:54:10.888530-07:00 schallkreis kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c03864bc 2025-03-30T14:54:10.888530-07:00 schallkreis kernel: R13:
    0000000000000012 R14: 00007fc56c01983c R15: 00007fc56c008520 2025-03-30T14:54:10.888530-07:00 schallkreis kernel: </TASK> 2025-03-30T14:54:10.888530-07:00 schallkreis kernel: watchdog: Watchdog detected hard LOCKUP on cpu 8
    2025-03-30T14:54:10.888531-07:00 schallkreis kernel: Modules linked in:
    dm_mod cpuid tls qrtr binfmt_misc nls_ascii nls_cp437 vfat fat amd_atl intel_rapl_msr intel_rapl_common mt7925e mt=
    7925_common mt792x_lib mt76_connac_lib edac_mce_amd
    snd_hda_codec_realtek mt76 kvm_amd snd_hda_codec_generic mac80211 snd_hda_scodec_component snd_hda_codec_hdmi snd_hda_intel kvm
    snd_intel_dspcfg =
    snd_intel_sdw_acpi snd_hda_codec libarc4 snd_hda_core cfg80211
    crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 snd_hwdep sha256_ssse3 sha1_ssse3 snd_pcm aesni_intel spd5118 gf128mul crypto_simd=
    cryptd snd_timer wmi_bmof sp5100_tco rapl snd ccp pcspkr watchdog
    rfkill k10temp soundcore joydev evdev nfsd auth_rpcgss nfs_acl lockd
    grace sunrpc parport_pc ppdev lp parpor=
    t configfs efi_pstore nfnetlink efivarfs ip_tables x_tables autofs4 ext4 mbcache jbd2 crc32c_generic hid_generic usbhid hid amdgpu amdxcp
    drm_exec gpu_sched drm_buddy i2c_algo_bit drm=
    _suballoc_helper drm_display_helper cec rc_core drm_ttm_helper ttm
    xhci_pci xhci_hcd drm_kms_helper ahci libahci r8169 libata drm nvme 2025-03-30T14:54:10.888532-07:00 schallkreis kernel: thunderbolt
    realtek usbcore crc32_pclmul mdio_devres scsi_mod crc32c_intel nvme_core
    libphy i2c_piix4 i2c_smbus usb_common scsi_common =
    nvme_auth crc16 video wmi gpio_amdpt gpio_generic butto2025-03-30T14:54:10.888532-07:00 schallkreis kernel: CPU: 8 UID: 0
    PID: 0 Comm: swapper/8 Tainted: G W=
    6.12.17-amd64 #1 Debian
    6.12.17-2025-03-30T14:54:10.888532-07:00 schallkreis kernel: Tainted: [W]=3DWARN
    2025-03-30T14:54:10.888533-07:00 schallkreis kernel: Hardware name:
    System76 Thelio Mira/Thelio Mira, BIOS 3.11.SP01 12/05/2024

    -- System Information:
    Debian Release: trixie/sid
    APT prefers testing-debug
    APT policy: (500, 'testing-debug'), (500, 'testing')
    Architecture: amd64 (x86_64)

    Kernel: Linux 6.12.19-amd64 (SMP w/24 CPU threads; PREEMPT)
    Kernel taint flags: TAINT_WARN
    Locale: LANGDen_US.UTF-8, LC_CTYPE=3Den_US.UTF-8 (charmap=3DUTF-8),
    LANGUAGE not set
    Shell: /bin/sh linked to /usr/bin/dash
    Init: systemd (via /run/systemd/system)
    LSM: AppArmor: enabled

    ************************************************************************************

    Since the system is locking up, I have no dmesg. I uploaded a
    compressed syslog which contains 2 occurrences of the same failure to

    https://she-devel.com/Thelio-Mira-syslog.tar.xz

    The problem has persisted with 6.12.21-amd64. Since the failure starts
    with PCIe at 0000:0f, below is some more related output in which that
    PCIe link appears to attach to USB.

    [alison@schallkreis ~]$ lspci -vv -t
    -[0000:00]-+-00.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite
    Ridge Root Complex
    +-00.2 Advanced Micro Devices, Inc. [AMD] Raphael/Granite
    Ridge IOMMU
    +-01.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite
    Ridge Dummy Host Bridge
    +-01.1-[01-03]----00.0-[02-03]----00.0-[03]--+-00.0 Advanced
    Micro Devices, Inc. [AMD/ATI] Navi 33 [Radeon RX 7600/7600 XT/7600M XT/7600S/7700S / PRO W7600]
    | \-00.1 Advanced
    Micro Devices, Inc. [AMD/ATI] Navi 31 HDMI/DP Audio
    +-01.2-[04]----00.0 Micron/Crucial Technology T700 NVMe PCIe
    SSD
    +-02.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite
    Ridge Dummy Host Bridge
    +-02.1-[05-0e]----00.0-[06-0e]--+-00.0-[07]--
    | +-04.0-[08]--
    | +-08.0-[09]--
    | +-09.0-[0a]----00.0 ASMedia Technology Inc. ASM1061/ASM1062 Serial ATA Controller
    | +-0a.0-[0b]----00.0 MEDIATEK
    Corp. Device 0717
    | +-0b.0-[0c]----00.0 Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller
    | +-0c.0-[0d]----00.0 Advanced
    Micro Devices, Inc. [AMD] Device 43fc
    | \-0d.0-[0e]----00.0 Advanced
    Micro Devices, Inc. [AMD] 600 Series Chipset SATA Controller
    +-02.2-[0f-72]----00.0-[10-72]--+-00.0-[11-40]--
    | +-01.0-[41-70]--
    | +-02.0-[71]----00.0 ASMedia Technology Inc. ASM4242 USB 3.2 xHCI Controller
    | \-03.0-[72]----00.0 ASMedia Technology Inc. ASM4242 USB 4 / Thunderbolt 3 Host Router
    +-03.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite
    Ridge Dummy Host Bridge
    +-04.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite
    Ridge Dummy Host Bridge
    +-08.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite
    Ridge Dummy Host Bridge
    +-08.1-[73]--+-00.0 Advanced Micro Devices, Inc. [AMD/ATI]
    Granite Ridge [Radeon Graphics]
    | +-00.1 Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller
    | +-00.2 Advanced Micro Devices, Inc. [AMD]
    Family 19h PSP/CCP
    | +-00.3 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge USB 3.1 xHCI
    | +-00.4 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge USB 3.1 xHCI
    | \-00.6 Advanced Micro Devices, Inc. [AMD]
    Family 17h/19h/1ah HD Audio Controller
    +-08.3-[74]----00.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge USB 2.0 xHCI
    +-14.0 Advanced Micro Devices, Inc. [AMD] FCH SMBus
    Controller
    +-14.3 Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge
    +-18.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite
    Ridge Data Fabric; Function 0
    +-18.1 Advanced Micro Devices, Inc. [AMD] Raphael/Granite
    Ridge Data Fabric; Function 1
    +-18.2 Advanced Micro Devices, Inc. [AMD] Raphael/Granite
    Ridge Data Fabric; Function 2
    +-18.3 Advanced Micro Devices, Inc. [AMD] Raphael/Granite
    Ridge Data Fabric; Function 3
    +-18.4 Advanced Micro Devices, Inc. [AMD] Raphael/Granite
    Ridge Data Fabric; Function 4
    +-18.5 Advanced Micro Devices, Inc. [AMD] Raphael/Granite
    Ridge Data Fabric; Function 5
    +-18.6 Advanced Micro Devices, Inc. [AMD] Raphael/Granite
    Ridge Data Fabric; Function 6
    \-18.7 Advanced Micro Devices, Inc. [AMD] Raphael/Granite
    Ridge Data Fabric; Function 7

    To investigate, I wrote a bpftrace script and a systemd unit to run it
    which are at

    https://github.com/chaiken/BPF-sandbox/commit/e0dda39cbe92e0f80805a2a06aa80d6fb3b065b2

    With these I'm 99% sure that apcupsd is triggering the problem by waking
    the USB bus every 15 seconds. I started the tracer and stopped
    apcupsd. If the system lasts a week with locking up, I'll restart
    apcupsd and see if I can figure out how it's triggering the kernel
    failure. So far the system has been stable for 4 days.

    Thanks again,
    Alison Chaiken

    ---
    Alison Chaiken alison@she-devel.com
    https://she-devel.com
    "What respite from her thrilling toil did Beauty ever take -- But Work
    might be Electric Rest To those that Magic make" -- Emily Dickson <html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /></head><body style='font-size: 11pt; font-family: "Times New Roman",Times,serif'>
    <p>When the PC has been sitting idle with the monitor off &nbsp;and I try to interact with it, I sometimes find that it is locked up and won't respond to keystrokes, ping or sysrq. &nbsp; Based on the syslog, it looks like the PCIe endpoint has gone into
    the low-power D3cold state and can't be woken. &nbsp;The result is</p> <p>&nbsp; &nbsp; &nbsp;watchdog: Watchdog detected hard LOCKUP on cpu 20</p> <p>after which I must press the power button to shutdown and reboot. &nbsp;Below is some output from the syslog.</p>
    <p>*********************************************************************************************************</p>
  • From Salvatore Bonaccorso@21:1/5 to Alison Chaiken on Fri Apr 18 18:00:01 2025
    XPost: linux.debian.kernel

    Hi Alison,

    On Wed, Apr 16, 2025 at 09:53:18PM -0700, Alison Chaiken wrote:
    When the PC has been sitting idle with the monitor off and I try to
    interact with it, I sometimes find that it is locked up and won't respond to keystrokes, ping or sysrq. Based on the syslog, it looks like the PCIe endpoint has gone into the low-power D3cold state and can't be woken. The result is

    watchdog: Watchdog detected hard LOCKUP on cpu 20

    after which I must press the power button to shutdown and reboot. Below is some output from the syslog.

    *********************************************************************************************************

    2025-03-30T14:54:01.842776-07:00 schallkreis kernel: pcieport 0000:0f:00.0: Unable to change power state from D3cold to D0, device inaccessible 2025-03-30T14:54:02.504644-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring sdma0
    2025-03-30T14:54:02.504652-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:02.945988-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring sdma0
    2025-03-30T14:54:03.387173-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:04.049130-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:04.049136-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring sdma1
    2025-03-30T14:54:04.931536-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:05.373985-07:00 schallkreis kernel: [drm] Fence fallback timer expired on ring gfx_0.0.0
    2025-03-30T14:54:09.785625-07:00 schallkreis kwin_wayland_wrapper[4583]: kwin_libinput: Libinput: event22 - Logitech USB Optical Mouse: client bug: event processing lagging behind by 220ms, your system is too slow 2025-03-30T14:54:10.888484-07:00 schallkreis kernel: watchdog: Watchdog detected hard LOCKUP on cpu 20
    2025-03-30T14:54:10.888491-07:00 schallkreis kernel: Modules linked in: dm_mod cpuid tls qrtr binfmt_misc nls_ascii nls_cp437 vfat fat amd_atl intel_rapl_msr intel_rapl_common mt7925e mt7925_common mt792x_lib mt76_connac_lib edac_mce_amd snd_hda_codec_realtek mt76 kvm_amd snd_hda_codec_generic mac80211 snd_hda_scodec_component snd_hda_codec_hdmi snd_hda_intel kvm snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec libarc4 snd_hda_core cfg80211 crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 snd_hwdep sha256_ssse3 sha1_ssse3 snd_pcm aesni_intel spd5118 gf128mul crypto_simd cryptd snd_timer wmi_bmof sp5100_tco rapl snd ccp pcspkr
    watchdog rfkill k10temp soundcore joydev evdev nfsd auth_rpcgss nfs_acl
    lockd grace sunrpc parport_pc ppdev lp parport configfs efi_pstore nfnetlink efivarfs ip_tables x_tables autofs4 ext4 mbcache jbd2 crc32c_generic hid_generic usbhid hid amdgpu amdxcp drm_exec gpu_sched drm_buddy i2c_algo_bit drm_suballoc_helper drm_display_helper cec rc_core drm_ttm_helper ttm xhci_pci xhci_hcd drm_kms_helper ahci libahci r8169
    libata drm nvme
    2025-03-30T14:54:10.888493-07:00 schallkreis kernel: thunderbolt realtek usbcore crc32_pclmul mdio_devres scsi_mod crc32c_intel nvme_core libphy i2c_piix4 i2c_smbus usb_common scsi_common nvme_auth crc16 video wmi gpio_amdpt gpio_generic butto2025-03-30T14:54:10.888494-07:00 schallkreis kernel: CPU: 20 UID: 1000 PID: 4645 Comm: HDMI-A-1 Tainted: G W 6.12.17-amd64 #1 Debian 6.12.17-1
    2025-03-30T14:54:10.888501-07:00 schallkreis kernel: Tainted: [W]DWARN 2025-03-30T14:54:10.888502-07:00 schallkreis kernel: Hardware name: System76 Thelio Mira/Thelio Mira, BIOS 3.11.SP01 12/05/2024 2025-03-30T14:54:10.888503-07:00 schallkreis kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x70/0x2a0 2025-03-30T14:54:10.888503-07:00 schallkreis kernel: Code: 77 77 f0 0f ba 2b 08 0f 92 c2 8b 03 0f b6 d2 c1 e2 08 30 e4 09 d0 3d ff 00 00 00 77 53 85 c0 74=
    10 0f b6 03 84 c0 74 09 f3 90 <0f> b6 03 84 c0 75 f7 b8 01 00 00 00 66 89
    03 5b 5d 41 5c 41 5d c3
    2025-03-30T14:54:10.888504-07:00 schallkreis kernel: RSP: 0018:ffffb0d1cb007868 EFLAGS: 00000002
    2025-03-30T14:54:10.888510-07:00 schallkreis kernel: RAX: 0000000000000001 RBX: ffff8b47eb080178 RCX: ffff8b4a20fd62d2025-03-30T14:54:10.888510-07:00 schallkreis kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8b47eb080172025-03-30T14:54:10.888511-07:00 schallkreis kernel: RBP: ffff8b4a20fd6280 R08: 0000000000000080 R09: ffff8b4a20fd6282025-03-30T14:54:10.888511-07:00 schallkreis kernel: R10: ffffb0d1cb0078e0 R11: 0000000000000004 R12: ffff8b47eb080172025-03-30T14:54:10.888511-07:00 schallkreis kernel: R13: ffff8b4a20fd62d8 R14: ffff8b4a20fd6080 R15: ffff8b4a20fd6282025-03-30T14:54:10.888511-07:00 schallkreis kernel: FS: 00007fc576ffd6c0(0000) GS:ffff8b565fc00000(0000) knlGS:0000000000000000 2025-03-30T14:54:10.888512-07:00 schallkreis kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    2025-03-30T14:54:10.888513-07:00 schallkreis kernel: CR2: 00007f8f80d1aef0 CR3: 0000000129824000 CR4: 0000000000f50ef2025-03-30T14:54:10.888513-07:00 schallkreis kernel: PKRU: 55555554
    2025-03-30T14:54:10.888514-07:00 schallkreis kernel: Call Trace: 2025-03-30T14:54:10.888514-07:00 schallkreis kernel: <NMI> 2025-03-30T14:54:10.888514-07:00 schallkreis kernel: ? watchdog_hardlockup_check.cold+0x100/0x105
    2025-03-30T14:54:10.888514-07:00 schallkreis kernel: ? __perf_event_overflow+0x10c/0x320
    2025-03-30T14:54:10.888514-07:00 schallkreis kernel: ? amd_pmu_v2_handle_irq+0x2a1/0x3c0
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ? perf_event_nmi_handler+0x2a/0x50
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ?
    nmi_handle+0x5e/0x120
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ? default_do_nmi+0x40/0x130
    2025-03-30T14:54:10.888515-07:00 schallkreis kernel: ? exc_nmi+0x122/0x1a0 2025-03-30T14:54:10.888516-07:00 schallkreis kernel: ? end_repeat_nmi+0xf/0x53
    2025-03-30T14:54:10.888516-07:00 schallkreis kernel: ? native_queued_spin_lock_slowpath+0x70/0x2a0
    2025-03-30T14:54:10.888517-07:00 schallkreis kernel: ? native_queued_spin_lock_slowpath+0x70/0x2a0
    2025-03-30T14:54:10.888517-07:00 schallkreis kernel: ? native_queued_spin_lock_slowpath+0x70/0x2a0
    2025-03-30T14:54:10.888517-07:00 schallkreis kernel: </NMI2025-03-30T14:54:10.888518-07:00 schallkreis kernel: <TASK> 2025-03-30T14:54:10.888518-07:00 schallkreis kernel: _raw_spin_lock_irqsave+0x3d/0x50
    2025-03-30T14:54:10.888518-07:00 schallkreis kernel: drm_event_reserve_init+0x2f/0xc0 [drm]
    2025-03-30T14:54:10.888518-07:00 schallkreis kernel: drm_mode_atomic_ioctl+0x61d/0xcb0 [drm]
    2025-03-30T14:54:10.888519-07:00 schallkreis kernel: ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm]
    2025-03-30T14:54:10.888519-07:00 schallkreis kernel: drm_ioctl_kernel+0xad/0x100 [drm]
    2025-03-30T14:54:10.888519-07:00 schallkreis kernel: drm_ioctl+0x277/0x4f0 [drm]
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel: ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm]
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel: amdgpu_drm_ioctl+0x4b/0x80 [amdgpu]
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel: __x64_sys_ioctl+0x91/0xd0
    2025-03-30T14:54:10.888520-07:00 schallkreis kernel:
    do_syscall_64+0x82/0x190
    2025-03-30T14:54:10.888521-07:00 schallkreis kernel: ? __x64_sys_poll+0xd0/0x180
    2025-03-30T14:54:10.888521-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888521-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888522-07:00 schallkreis kernel: ? restore_fpregs_from_fpstate+0x3c/0xa0
    2025-03-30T14:54:10.888522-07:00 schallkreis kernel: ? switch_fpu_return+0x4e/0xd0
    2025-03-30T14:54:10.888522-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x172/0x210
    2025-03-30T14:54:10.888523-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888523-07:00 schallkreis kernel: ? do_futex+0x125/0x190 2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? __x64_sys_futex+0x127/0x1e0
    2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888524-07:00 schallkreis kernel: ? __x64_sys_poll+0xd0/0x180
    2025-03-30T14:54:10.888525-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888525-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888525-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888526-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888526-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-03-30T14:54:10.888526-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888527-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888527-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-03-30T14:54:10.888527-07:00 schallkreis kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
    2025-03-30T14:54:10.888528-07:00 schallkreis kernel: RIP:
    0033:0x7fc5ce5168db
    2025-03-30T14:54:10.888528-07:00 schallkreis kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48
    89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 =
    f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00 2025-03-30T14:54:10.888528-07:00 schallkreis kernel: RSP: 002b:00007fc576ffc3b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000012025-03-30T14:54:10.888529-07:00 schallkreis kernel: RAX: ffffffffffffffda RBX: 00007fc56c00bf20 RCX: 00007fc5ce5168d2025-03-30T14:54:10.888529-07:00 schallkreis kernel: RDX: 00007fc576ffc4a0 RSI: 00000000c03864bc RDI: 0000000000000012025-03-30T14:54:10.888529-07:00 schallkreis kernel: RBP: 00007fc576ffc4a0 R08: 00007fc56c00ebd0 R09: 00007fc56c0084a2025-03-30T14:54:10.888530-07:00 schallkreis kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c03864bc 2025-03-30T14:54:10.888530-07:00 schallkreis kernel: R13: 0000000000000012 R14: 00007fc56c01983c R15: 00007fc56c008520
    2025-03-30T14:54:10.888530-07:00 schallkreis kernel: </TASK> 2025-03-30T14:54:10.888530-07:00 schallkreis kernel: watchdog: Watchdog detected hard LOCKUP on cpu 8
    2025-03-30T14:54:10.888531-07:00 schallkreis kernel: Modules linked in: dm_mod cpuid tls qrtr binfmt_misc nls_ascii nls_cp437 vfat fat amd_atl intel_rapl_msr intel_rapl_common mt7925e mt=
    7925_common mt792x_lib mt76_connac_lib edac_mce_amd snd_hda_codec_realtek mt76 kvm_amd snd_hda_codec_generic mac80211 snd_hda_scodec_component snd_hda_codec_hdmi snd_hda_intel kvm snd_intel_dspcfg =
    snd_intel_sdw_acpi snd_hda_codec libarc4 snd_hda_core cfg80211 crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 snd_hwdep sha256_ssse3 sha1_ssse3 snd_pcm aesni_intel spd5118 gf128mul crypto_simd=
    cryptd snd_timer wmi_bmof sp5100_tco rapl snd ccp pcspkr watchdog rfkill k10temp soundcore joydev evdev nfsd auth_rpcgss nfs_acl lockd grace sunrpc parport_pc ppdev lp parpor=
    t configfs efi_pstore nfnetlink efivarfs ip_tables x_tables autofs4 ext4 mbcache jbd2 crc32c_generic hid_generic usbhid hid amdgpu amdxcp drm_exec gpu_sched drm_buddy i2c_algo_bit drm=
    _suballoc_helper drm_display_helper cec rc_core drm_ttm_helper ttm xhci_pci xhci_hcd drm_kms_helper ahci libahci r8169 libata drm nvme 2025-03-30T14:54:10.888532-07:00 schallkreis kernel: thunderbolt realtek usbcore crc32_pclmul mdio_devres scsi_mod crc32c_intel nvme_core libphy i2c_piix4 i2c_smbus usb_common scsi_common =
    nvme_auth crc16 video wmi gpio_amdpt gpio_generic butto2025-03-30T14:54:10.888532-07:00 schallkreis kernel: CPU: 8 UID: 0 PID: 0 Comm: swapper/8 Tainted: G W=
    6.12.17-amd64 #1 Debian 6.12.17-2025-03-30T14:54:10.888532-07:00 schallkreis kernel: Tainted: [W]=3DWARN
    2025-03-30T14:54:10.888533-07:00 schallkreis kernel: Hardware name: System76 Thelio Mira/Thelio Mira, BIOS 3.11.SP01 12/05/2024

    -- System Information:
    Debian Release: trixie/sid
    APT prefers testing-debug
    APT policy: (500, 'testing-debug'), (500, 'testing')
    Architecture: amd64 (x86_64)

    Kernel: Linux 6.12.19-amd64 (SMP w/24 CPU threads; PREEMPT)
    Kernel taint flags: TAINT_WARN
    Locale: LANGDen_US.UTF-8, LC_CTYPE=3Den_US.UTF-8 (charmap=3DUTF-8),
    LANGUAGE not set
    Shell: /bin/sh linked to /usr/bin/dash
    Init: systemd (via /run/systemd/system)
    LSM: AppArmor: enabled

    ************************************************************************************

    Since the system is locking up, I have no dmesg. I uploaded a compressed syslog which contains 2 occurrences of the same failure to

    https://she-devel.com/Thelio-Mira-syslog.tar.xz

    The problem has persisted with 6.12.21-amd64. Since the failure starts with PCIe at 0000:0f, below is some more related output in which that PCIe link appears to attach to USB.

    [alison@schallkreis ~]$ lspci -vv -t
    -[0000:00]-+-00.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Root Complex
    +-00.2 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge IOMMU
    +-01.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Dummy Host Bridge
    +-01.1-[01-03]----00.0-[02-03]----00.0-[03]--+-00.0 Advanced Micro Devices, Inc. [AMD/ATI] Navi 33 [Radeon RX 7600/7600 XT/7600M XT/7600S/7700S / PRO W7600]
    | \-00.1 Advanced Micro Devices, Inc. [AMD/ATI] Navi 31 HDMI/DP Audio
    +-01.2-[04]----00.0 Micron/Crucial Technology T700 NVMe PCIe SSD
    +-02.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Dummy Host Bridge
    +-02.1-[05-0e]----00.0-[06-0e]--+-00.0-[07]--
    | +-04.0-[08]--
    | +-08.0-[09]--
    | +-09.0-[0a]----00.0 ASMedia Technology Inc. ASM1061/ASM1062 Serial ATA Controller
    | +-0a.0-[0b]----00.0 MEDIATEK Corp. Device 0717
    | +-0b.0-[0c]----00.0 Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller
    | +-0c.0-[0d]----00.0 Advanced Micro Devices, Inc. [AMD] Device 43fc
    | \-0d.0-[0e]----00.0 Advanced Micro Devices, Inc. [AMD] 600 Series Chipset SATA Controller
    +-02.2-[0f-72]----00.0-[10-72]--+-00.0-[11-40]--
    | +-01.0-[41-70]--
    | +-02.0-[71]----00.0 ASMedia Technology Inc. ASM4242 USB 3.2 xHCI Controller
    | \-03.0-[72]----00.0 ASMedia Technology Inc. ASM4242 USB 4 / Thunderbolt 3 Host Router
    +-03.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Dummy Host Bridge
    +-04.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Dummy Host Bridge
    +-08.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Dummy Host Bridge
    +-08.1-[73]--+-00.0 Advanced Micro Devices, Inc. [AMD/ATI] Granite Ridge [Radeon Graphics]
    | +-00.1 Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller
    | +-00.2 Advanced Micro Devices, Inc. [AMD] Family
    19h PSP/CCP
    | +-00.3 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge USB 3.1 xHCI
    | +-00.4 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge USB 3.1 xHCI
    | \-00.6 Advanced Micro Devices, Inc. [AMD] Family 17h/19h/1ah HD Audio Controller
    +-08.3-[74]----00.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge USB 2.0 xHCI
    +-14.0 Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller
    +-14.3 Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge
    +-18.0 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 0
    +-18.1 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 1
    +-18.2 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 2
    +-18.3 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 3
    +-18.4 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 4
    +-18.5 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 5
    +-18.6 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 6
    \-18.7 Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 7

    To investigate, I wrote a bpftrace script and a systemd unit to run it which are at

    https://github.com/chaiken/BPF-sandbox/commit/e0dda39cbe92e0f80805a2a06aa80d6fb3b065b2

    With these I'm 99% sure that apcupsd is triggering the problem by waking the USB bus every 15 seconds. I started the tracer and stopped apcupsd. If the system lasts a week with locking up, I'll restart apcupsd and see if I can figure out how it's triggering the kernel failure. So far the system
    has been stable for 4 days.

    Thanks for providing those updates. Please let us know once you have
    further confidence about the above.

    Regards,
    Salvatore

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Alison Chaiken@21:1/5 to All on Fri Apr 25 06:00:02 2025
    XPost: linux.debian.kernel

    The tracer at https://github.com/chaiken/BPF-sandbox/commit/e0dda39cbe92e0f80805a2a06aa80d6fb3b065b2
    finally was running when apcupsd triggered the lockup. The stack below
    has appeared over and over for days. The time output in the trace has a
    1000x math error:

    2025-04-24T11:20:23.394893-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: pci_mmcfg_read() for comm
    apcupsd with tid 3183 took 222794 s
    2025-04-24T11:20:38.809232-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: apcupsd: lock contended for
    222810s
    2025-04-24T11:20:38.809380-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: __down_common+521 2025-04-24T11:20:38.809396-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: __down_common+521 2025-04-24T11:20:38.809406-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: down_timeout+85 2025-04-24T11:20:38.809420-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]:
    acpi_os_wait_semaphore+76
    2025-04-24T11:20:38.809431-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]:
    acpi_ut_acquire_mutex+58
    2025-04-24T11:20:38.809444-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]:
    acpi_ex_enter_interpreter+16
    2025-04-24T11:20:38.809458-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]:
    acpi_ex_system_wait_mutex+82
    2025-04-24T11:20:38.809470-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]:
    acpi_ds_begin_method_execution+137
    2025-04-24T11:20:38.809482-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]:
    acpi_ds_call_control_method+79
    2025-04-24T11:20:38.809493-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: acpi_ps_parse_aml+718 2025-04-24T11:20:38.809504-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]:
    acpi_ps_execute_method+319
    2025-04-24T11:20:38.809515-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: acpi_ns_evaluate+318 2025-04-24T11:20:38.809528-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]:
    acpi_evaluate_object+334
    2025-04-24T11:20:38.809539-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]:
    acpi_device_set_power+495
    2025-04-24T11:20:38.809551-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]:
    acpi_pci_set_power_state+109
    2025-04-24T11:20:38.809562-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: pci_power_up+48 2025-04-24T11:20:38.809573-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]:
    pci_pm_runtime_resume+51
    2025-04-24T11:20:38.809584-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: __rpm_callback+65 2025-04-24T11:20:38.809596-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: rpm_callback+85 2025-04-24T11:20:38.809608-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: rpm_resume+1235 2025-04-24T11:20:38.809618-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: rpm_resume+723 2025-04-24T11:20:38.809630-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: __pm_runtime_resume+75 2025-04-24T11:20:38.809641-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]:
    usb_autoresume_device+30
    2025-04-24T11:20:38.809651-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: usbdev_open+259 2025-04-24T11:20:38.809662-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: chrdev_open+178 2025-04-24T11:20:38.809673-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: do_dentry_open+332 2025-04-24T11:20:38.809685-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: vfs_open+46 2025-04-24T11:20:38.809698-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: path_openat+2094 2025-04-24T11:20:38.809708-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: do_filp_open+196 2025-04-24T11:20:38.809719-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: do_sys_openat2+174 2025-04-24T11:20:38.809730-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: __x64_sys_openat+85 2025-04-24T11:20:38.809742-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: do_syscall_64+130 2025-04-24T11:20:38.809753-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]:
    entry_SYSCALL_64_after_hwframe+118

    Here is the failure shortly thereafter. It happened while the system
    was idle except for browser, IRC etc. left running. Perhaps he
    pulseaudio failure is related?

    2025-04-24T11:22:43.698237-07:00 schallkreis plasmashell[3874]: org.kde.pulseaudio: context kaput
    2025-04-24T11:22:43.698345-07:00 schallkreis plasmashell[3874]: org.kde.pulseaudio: No object for name "alsa_output.pci-0000_73_00.6.analog-stereo"
    2025-04-24T11:22:43.698364-07:00 schallkreis plasmashell[3874]: org.kde.pulseaudio: No object for name "alsa_output.pci-0000_73_00.6.analog-stereo.monitor" 2025-04-24T11:22:44.139668-07:00 schallkreis kded6[3770]:
    org.kde.pulseaudio: context kaput
    2025-04-24T11:22:44.139719-07:00 schallkreis kded6[3770]:
    org.kde.pulseaudio: No object for name "alsa_output.pci-0000_73_00.6.analog-stereo"
    2025-04-24T11:22:44.139733-07:00 schallkreis kded6[3770]:
    org.kde.pulseaudio: No object for name "alsa_output.pci-0000_73_00.6.analog-stereo.monitor" 2025-04-24T11:22:44.139817-07:00 schallkreis systemd[3492]:
    pulseaudio.service: Main process exited, code=killed, status=9/KILL 2025-04-24T11:22:44.580911-07:00 schallkreis systemd[3492]:
    pulseaudio.service: Failed with result 'signal'. 2025-04-24T11:22:44.581246-07:00 schallkreis systemd[3492]:
    pulseaudio.service: Consumed 33min 45.125s CPU time, 11.4M memory peak. 2025-04-24T11:22:44.801449-07:00 schallkreis systemd[3492]:
    pulseaudio.service: Scheduled restart job, restart counter is at 1. 2025-04-24T11:22:45.022289-07:00 schallkreis systemd[3492]: Starting pulseaudio.service - Sound Service...
    2025-04-24T11:22:47.669907-07:00 schallkreis plasmashell[3874]: pw.conf:
    can't load config client.conf: No such file or directory 2025-04-24T11:22:47.669986-07:00 schallkreis plasmashell[3874]: pw.conf:
    can't load config client.conf: No such file or directory 2025-04-24T11:22:47.670011-07:00 schallkreis plasmashell[3874]: kpipewire_logging: Failed to create PipeWire context 2025-04-24T11:22:47.670024-07:00 schallkreis plasmashell[3874]: received
    error while creating the stream "Failed to create PipeWire context"
    Media monitor will not work.
    2025-04-24T11:22:52.523562-07:00 schallkreis plasmashell[3874]: pw.conf:
    can't load config client.conf: No such file or directory 2025-04-24T11:22:52.523678-07:00 schallkreis plasmashell[3874]: pw.conf:
    can't load config client.conf: No such file or directory 2025-04-24T11:22:52.523700-07:00 schallkreis plasmashell[3874]: kpipewire_logging: Failed to create PipeWire context 2025-04-24T11:22:52.523713-07:00 schallkreis plasmashell[3874]: received
    error while creating the stream "Failed to create PipeWire context"
    Media monitor will not work.
    2025-04-24T11:22:52.744151-07:00 schallkreis kernel: watchdog: BUG: soft
    lockup - CPU#16 stuck for 23s! [apcupsd:3183]
    2025-04-24T11:22:52.744159-07:00 schallkreis kernel: Modules linked in:
    qrtr nfsd auth_rpcgss nfs_acl lockd grace binfmt_misc sunrpc nls_ascii nls_cp437 vfat fat amd_atl intel_rapl_msr intel_rapl_common mt7925e mt7925_common mt792x_lib mt76_connac_lib edac_mce_amd
    snd_hda_codec_realtek mt76 kvm_amd snd_hda_codec_generic mac80211 snd_hda_scodec_component snd_hda_codec_hdmi kvm snd_hda_intel
    snd_intel_dspcfg libarc4 snd_intel_sdw_acpi snd_hda_codec cfg80211 crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 snd_hda_core
    sha256_ssse3 sha1_ssse3 snd_hwdep aesni_intel snd_pcm gf128mul
    crypto_simd cryptd wmi_bmof snd_timer spd5118 rapl ccp snd pcspkr
    k10temp rfkill soundcore joydev evdev parport_pc ppdev lp parport
    efi_pstore configfs nfnetlink efivarfs ip_tables x_tables autofs4 ext4
    mbcache jbd2 crc32c_generic hid_generic usbhid hid amdgpu amdxcp
    drm_exec gpu_sched drm_buddy i2c_algo_bit drm_suballoc_helper drm_display_helper cec rc_core drm_ttm_helper ttm xhci_pci xhci_hcd drm_kms_helper ahci libahci libata drm thunderbolt r8169 usbcore nvme sp5100_tco realtek
    2025-04-24T11:22:52.744161-07:00 schallkreis kernel: scsi_mod watchdog mdio_devres nvme_core libphy crc32_pclmul crc32c_intel i2c_piix4
    i2c_smbus usb_common scsi_common nvme_auth video crc16 gpio_amdpt wmi gpio_generic button
    2025-04-24T11:22:52.744162-07:00 schallkreis kernel: CPU: 16 UID: 0 PID:
    3183 Comm: apcupsd Not tainted 6.12.21-amd64 #1 Debian 6.12.21-1 2025-04-24T11:22:52.744162-07:00 schallkreis kernel: Hardware name:
    System76 Thelio Mira/Thelio Mira, BIOS 3.11.SP01 12/05/2024 2025-04-24T11:22:52.744163-07:00 schallkreis kernel: RIP: 0010:pci_mmcfg_read+0xa4/0xe0
    2025-04-24T11:22:52.744164-07:00 schallkreis kernel: Code: fe 01 75 0b
    4c 01 e0 8a 00 0f b6 c0 89 45 00 e8 a2 6a 51 ff 31 c0 5b 5d 41 5c 41 5d
    41 5e 41 5f c3 cc cc cc cc 4c 01 e0 8b 00 <89> 45 00 eb e0 4c 01 e0 66
    8b 00 0f b7 c0 89 45 00 eb d2 e8 74 6a
    2025-04-24T11:22:52.744164-07:00 schallkreis kernel: RSP:
    0018:ffff9d0645f9f4b8 EFLAGS: 00000286
    2025-04-24T11:22:52.744164-07:00 schallkreis kernel: RAX:
    00000000ffffffff RBX: 0000000000f00000 RCX: 0000000000000ffc 2025-04-24T11:22:52.744165-07:00 schallkreis kernel: RDX:
    00000000000000ff RSI: 000000000000000f RDI: 0000000000000000 2025-04-24T11:22:52.744165-07:00 schallkreis kernel: RBP:
    ffff9d0645f9f4f4 R08: 0000000000000004 R09: ffff9d0645f9f4f4 2025-04-24T11:22:52.744170-07:00 schallkreis kernel: R10:
    000000000000000f R11: ffff9d0645f9f2e0 R12: 0000000000000ffc 2025-04-24T11:22:52.744170-07:00 schallkreis kernel: R13:
    0000000000000000 R14: 0000000000000004 R15: 0000000000000000 2025-04-24T11:22:52.744170-07:00 schallkreis kernel: FS:
    00007f8da22fc0c0(0000) GS:ffff8c8b5fa00000(0000) knlGS:0000000000000000 2025-04-24T11:22:52.744171-07:00 schallkreis kernel: CS: 0010 DS: 0000
    ES: 0000 CR0: 0000000080050033
    2025-04-24T11:22:52.744171-07:00 schallkreis kernel: CR2:
    00007f9fdd253680 CR3: 0000000111982000 CR4: 0000000000f50ef0 2025-04-24T11:22:52.744171-07:00 schallkreis kernel: PKRU: 55555554 2025-04-24T11:22:52.744172-07:00 schallkreis kernel: Call Trace: 2025-04-24T11:22:52.744172-07:00 schallkreis kernel: <IRQ> 2025-04-24T11:22:52.744172-07:00 schallkreis kernel: ? watchdog_timer_fn.cold+0x3d/0xa1
    2025-04-24T11:22:52.744173-07:00 schallkreis kernel: ? __pfx_watchdog_timer_fn+0x10/0x10
    2025-04-24T11:22:52.744174-07:00 schallkreis kernel: ? __hrtimer_run_queues+0x12f/0x2a0
    2025-04-24T11:22:52.744174-07:00 schallkreis kernel: ? hrtimer_interrupt+0xfa/0x210
    2025-04-24T11:22:52.744174-07:00 schallkreis kernel: ? __sysvec_apic_timer_interrupt+0x52/0x100
    2025-04-24T11:22:52.744175-07:00 schallkreis kernel: ? sysvec_apic_timer_interrupt+0x6c/0x90
    2025-04-24T11:22:52.744176-07:00 schallkreis kernel: </IRQ> 2025-04-24T11:22:52.744176-07:00 schallkreis kernel: <TASK> 2025-04-24T11:22:52.744176-07:00 schallkreis kernel: ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
    2025-04-24T11:22:52.744177-07:00 schallkreis kernel: ? pci_mmcfg_read+0xa4/0xe0
    2025-04-24T11:22:52.744177-07:00 schallkreis kernel: ? pci_mmcfg_read+0x4f/0xe0
    2025-04-24T11:22:52.744177-07:00 schallkreis kernel: trace_clock_x86_tsc+0x20/0x20
    2025-04-24T11:22:52.744178-07:00 schallkreis kernel: pci_find_next_ext_capability+0x89/0xf0
    2025-04-24T11:22:52.744178-07:00 schallkreis kernel: pci_restore_ltr_state+0x28/0x50
    2025-04-24T11:22:52.744179-07:00 schallkreis kernel: pci_restore_state.part.0+0x29/0x370
    2025-04-24T11:22:52.744179-07:00 schallkreis kernel: pci_pm_runtime_resume+0x45/0xf0
    2025-04-24T11:22:52.744179-07:00 schallkreis kernel: ? __pfx_pci_pm_runtime_resume+0x10/0x10
    2025-04-24T11:22:52.744179-07:00 schallkreis kernel:
    __rpm_callback+0x41/0x170
    2025-04-24T11:22:52.744180-07:00 schallkreis kernel: ? __pfx_pci_pm_runtime_resume+0x10/0x10
    2025-04-24T11:22:52.744180-07:00 schallkreis kernel:
    rpm_callback+0x55/0x60
    2025-04-24T11:22:52.744180-07:00 schallkreis kernel: ? __pfx_pci_pm_runtime_resume+0x10/0x10
    2025-04-24T11:22:52.744181-07:00 schallkreis kernel:
    rpm_resume+0x4d3/0x700
    2025-04-24T11:22:52.744181-07:00 schallkreis kernel: ? sched_clock_cpu+0xf/0x1d0
    2025-04-24T11:22:52.744181-07:00 schallkreis kernel: ? psi_group_change+0x138/0x300
    2025-04-24T11:22:52.744182-07:00 schallkreis kernel:
    rpm_resume+0x2d3/0x700
    2025-04-24T11:22:52.744183-07:00 schallkreis kernel: ?
    update_curr+0x98/0x280
    2025-04-24T11:22:52.744183-07:00 schallkreis kernel:
    rpm_resume+0x2d3/0x700
    2025-04-24T11:22:52.744183-07:00 schallkreis kernel:
    rpm_resume+0x2d3/0x700
    2025-04-24T11:22:52.744184-07:00 schallkreis kernel: __pm_runtime_resume+0x4b/0x80
    2025-04-24T11:22:52.744184-07:00 schallkreis kernel: usb_autoresume_device+0x1e/0x50 [usbcore]
    2025-04-24T11:22:52.744184-07:00 schallkreis kernel:
    usbdev_open+0x103/0x260 [usbcore]
    2025-04-24T11:22:52.744185-07:00 schallkreis kernel: ? bpf_prog_e8932b6bae2b9745_restrict_filesystems+0xb7/0x132 2025-04-24T11:22:52.744185-07:00 schallkreis kernel:
    chrdev_open+0xb2/0x230
    2025-04-24T11:22:52.744186-07:00 schallkreis kernel: ? __pfx_chrdev_open+0x10/0x10
    2025-04-24T11:22:52.744186-07:00 schallkreis kernel:
    do_dentry_open+0x14c/0x440
    2025-04-24T11:22:52.744186-07:00 schallkreis kernel: vfs_open+0x2e/0xe0 2025-04-24T11:22:52.744186-07:00 schallkreis kernel:
    path_openat+0x82e/0x12d0
    2025-04-24T11:22:52.744187-07:00 schallkreis kernel: ?
    kick_pool+0x5c/0x140
    2025-04-24T11:22:52.744187-07:00 schallkreis kernel:
    do_filp_open+0xc4/0x170
    2025-04-24T11:22:52.744188-07:00 schallkreis kernel:
    do_sys_openat2+0xae/0xe0
    2025-04-24T11:22:52.744188-07:00 schallkreis kernel:
    __x64_sys_openat+0x55/0xa0
    2025-04-24T11:22:52.744188-07:00 schallkreis kernel:
    do_syscall_64+0x82/0x190
    2025-04-24T11:22:52.744188-07:00 schallkreis kernel: ?
    vfs_read+0x15e/0x370
    2025-04-24T11:22:52.744189-07:00 schallkreis kernel: ?
    vfs_read+0x15e/0x370
    2025-04-24T11:22:52.744190-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-04-24T11:22:52.744190-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-04-24T11:22:52.744190-07:00 schallkreis kernel: ? terminate_walk+0x61/0x100
    2025-04-24T11:22:52.744190-07:00 schallkreis kernel: ? path_openat+0x495/0x12d0
    2025-04-24T11:22:52.744191-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-04-24T11:22:52.744191-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210
    2025-04-24T11:22:52.744191-07:00 schallkreis kernel: ?
    do_filp_open+0xc4/0x170
    2025-04-24T11:22:52.744192-07:00 schallkreis kernel: ? __rseq_handle_notify_resume+0xa2/0x4a0
    2025-04-24T11:22:52.744192-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x172/0x210
    2025-04-24T11:22:52.744193-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-04-24T11:22:52.744194-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190
    2025-04-24T11:22:52.744194-07:00 schallkreis kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
    2025-04-24T11:22:52.744194-07:00 schallkreis kernel: RIP:
    0033:0x7f8da23979ee
    2025-04-24T11:22:52.744195-07:00 schallkreis kernel: Code: 08 0f 85 f5
    4b ff ff 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44
    24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> 66 2e 0f 1f 84 00 00 00
    00 00 0f 1f 80 00 00 00 00 48 83 ec 08
    2025-04-24T11:22:52.744195-07:00 schallkreis kernel: RSP:
    002b:00007fff105c8718 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 2025-04-24T11:22:52.744196-07:00 schallkreis kernel: RAX:
    ffffffffffffffda RBX: 00007f8da22fc0c0 RCX: 00007f8da23979ee 2025-04-24T11:22:52.964717-07:00 schallkreis kernel: RDX:
    0000000000080002 RSI: 00005608b22ab500 RDI: ffffffffffffff9c 2025-04-24T11:22:52.964718-07:00 schallkreis kernel: RBP:
    0000000000000007 R08: 0000000000000000 R09: 0000000000000000 2025-04-24T11:22:52.964718-07:00 schallkreis kernel: R10:
    0000000000000000 R11: 0000000000000246 R12: 00007fff105c8850 2025-04-24T11:22:52.964719-07:00 schallkreis kernel: R13:
    00007fff105c8800 R14: 0000560886d392cb R15: 0000000000000004 2025-04-24T11:22:52.964719-07:00 schallkreis kernel: </TASK>

    Perhaps this

    2025-04-24T11:22:52.744179-07:00 schallkreis kernel: pci_pm_runtime_resume+0x45/0xf0
    2025-04-24T11:22:52.744179-07:00 schallkreis kernel: ? __pfx_pci_pm_runtime_resume+0x10/0x10
    2025-04-24T11:22:52.744179-07:00 schallkreis kernel:
    __rpm_callback+0x41/0x170
    2025-04-24T11:22:52.744180-07:00 schallkreis kernel: ? __pfx_pci_pm_runtime_resume+0x10/0x10
    2025-04-24T11:22:52.744180-07:00 schallkreis kernel:
    rpm_callback+0x55/0x60
    2025-04-24T11:22:52.744180-07:00 schallkreis kernel: ? __pfx_pci_pm_runtime_resume+0x10/0x10

    is the output of the actual bug? That's where the stack which printed
    out over and over and the failure stack originated.

    Best wishes,

    Alison Chaiken

    ---
    Alison Chaiken alison@she-devel.com
    https://she-devel.com
    "What respite from her thrilling toil did Beauty ever take -- But Work
    might be Electric Rest To those that Magic make" -- Emily Dickson <html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /></head><body style='font-size: 11pt; font-family: "Times New Roman",Times,serif'>
    <p>The tracer at <a href="https://github.com/chaiken/BPF-sandbox/commit/e0dda39cbe92e0f80805a2a06aa80d6fb3b065b2">https://github.com/chaiken/BPF-sandbox/commit/e0dda39cbe92e0f80805a2a06aa80d6fb3b065b2</a> finally was running when apcupsd triggered the
    lockup.&nbsp;&nbsp; The stack below has appeared over and over for days.&nbsp; The time output in the trace has a 1000x math error:</p>
    <p>2025-04-24T11:20:23.394893-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: pci_mmcfg_read() for comm apcupsd with tid 3183 took 222794 s<br />2025-04-24T11:20:38.809232-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: apcup