How to fix i801 SMBus interrupt storm

Today I installed Ubuntu Jammy Jellyfish (release 22.04) on an Acer Aspire One (A114-32-P991), but an i801 SMBus interrupt storm made the system unusable.

Operating system version.

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu Jammy Jellyfish (development branch)
Release:	22.04
Codename:	jammy

Kernel version and architecture.

$ uname -a
Linux laptop 5.15.0-25-generic #25-Ubuntu SMP Wed Mar 30 15:54:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

The issue shows up as constant operating system freezes caused by CPU soft lockups.

Apr 17 20:15:16 laptop kernel: [ 1732.712998] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [swapper/3:0]
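
To see how often these lockups happen and on which CPU, the kernel log can be filtered for the watchdog message. This is a minimal sketch, assuming the kernel log is available through journalctl -k (dmesg output would work as well); the count_soft_lockups helper name is made up for illustration.

```shell
# Hypothetical helper: count soft lockup reports per CPU.
# Reads kernel log text from standard input.
count_soft_lockups() {
    grep -o 'soft lockup - CPU#[0-9]*' | sort | uniq -c
}

# Usage:
#   journalctl -k --no-pager | count_soft_lockups
```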

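These lockups are caused by an unusually high number of i801_smbus interrupts: in the output below, IRQ 20 has fired 2937054 times on CPU3, while no other device line exceeds a few thousand.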

$ cat /proc/interrupts 
            CPU0       CPU1       CPU2       CPU3       
   0:         12          0          0          0  IR-IO-APIC    2-edge      timer
   1:          0          0        114          0  IR-IO-APIC    1-edge      i8042
   8:          0          0          0          1  IR-IO-APIC    8-fasteoi   rtc0
   9:          0         35          0          0  IR-IO-APIC    9-fasteoi   acpi
  14:          0          2          0          0  IR-IO-APIC   14-fasteoi   INT3453:00, INT3453:01, INT3453:03
  15:          0          0          0          0  IR-IO-APIC   15-fasteoi   INT3453:02
  20:          0          0          0    2937054  IR-IO-APIC   20-fasteoi   i801_smbus
  31:        911          0          0          0  IR-IO-APIC   31-fasteoi   idma64.0, i2c_designware.0
  39:          0          0       5148          0  IR-IO-APIC   39-fasteoi   mmc0
 120:          0          0          0          0  DMAR-MSI    0-edge      dmar0
 121:          0          0          0          0  DMAR-MSI    1-edge      dmar1
 122:          0          0          0          0  IR-PCI-MSI 311296-edge      PCIe PME
 123:          0          0          0          0  IR-PCI-MSI 315392-edge      PCIe PME
 124:          0          0          0          0  IR-PCI-MSI 317440-edge      PCIe PME
 125:          0          0          0          3  IR-PCI-MSI 1048576-edge      rtsx_pci
 126:       2143          0          0          0  IR-PCI-MSI 344064-edge      xhci_hcd
 128:          0          0          0          0  IR-PCI-MSI 294912-edge      ahci[0000:00:12.0]
 129:          0          1          0          0  IR-PCI-MSI 1050624-edge      enp2s0f1
 130:          0        102          0          0  IR-PCI-MSI 32768-edge      i915
 131:          0          0         36          0  IR-PCI-MSI 245760-edge      mei_me
 132:          0          0          0         29  IR-PCI-MSI 1572864-edge      ath10k_pci
 133:        419          0          0          0  IR-PCI-MSI 229376-edge      snd_hda_intel:card0
 NMI:          4          9          1          0   Non-maskable interrupts
 LOC:      10190       5306      10085       9476   Local timer interrupts
 SPU:          0          0          0          0   Spurious interrupts
 PMI:          4          9          1          0   Performance monitoring interrupts
 IWI:          8         19          0          0   IRQ work interrupts
 RTR:          0          0          0          0   APIC ICR read retries
 RES:       1095       1011       1710       1309   Rescheduling interrupts
 CAL:       6118       9384       7753       5489   Function call interrupts
 TLB:          0          6          3          1   TLB shootdowns
 TRM:          0          0          0          0   Thermal event interrupts
 THR:          0          0          0          0   Threshold APIC interrupts
 DFR:          0          0          0          0   Deferred Error APIC interrupts
 MCE:          0          0          0          0   Machine check exceptions
 MCP:          1          2          2          2   Machine check polls
 ERR:          0
 MIS:          0
 PIN:          0          0          0          0   Posted-interrupt notification event
 NPI:          0          0          0          0   Nested posted-interrupt event
 PIW:          0          0          0          0   Posted-interrupt wakeup event
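
In a listing this long the offending line is easy to miss, so it can help to sum the per-CPU counters and sort by total. The following is a rough sketch, not part of the original procedure; the top_irqs helper name and its arguments are hypothetical.

```shell
# Hypothetical helper: print "total label last-field" for the busiest
# interrupt lines. Reads a /proc/interrupts-style file (default: the live one).
top_irqs() {
    file=${1:-/proc/interrupts}
    count=${2:-5}
    awk 'NR == 1 { ncpu = NF; next }   # header row: one column per CPU
         {
             total = 0
             # Columns 2..ncpu+1 hold per-CPU counts; stop at the first non-number.
             for (i = 2; i <= ncpu + 1 && $i ~ /^[0-9]+$/; i++) total += $i
             print total, $1, $NF
         }' "$file" | sort -rn | head -n "$count"
}

# Usage:
#   top_irqs                      # five busiest lines on the running system
#   top_irqs /proc/interrupts 1   # only the busiest one
```

On the listing above, the first line printed should be the i801_smbus entry for IRQ 20.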

A temporary solution is described in "Soft lockup due to interrupt storm from smbus".

Inspect the i2c-i801 module parameters.

$ modinfo --parameters i2c-i801 
disable_features:Disable selected driver features:
		  0x01  disable SMBus PEC
		  0x02  disable the block buffer
		  0x08  disable the I2C block read functionality
		  0x10  don't use interrupts
		  0x20  disable SMBus Host Notify  (uint)
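
The disable_features value is a bitmask, so several flags can be combined by adding them (for example 0x10 and 0x20 give 0x30). As a small illustration, the sketch below decodes a value into the descriptions listed by modinfo above; the decode_disable_features helper is hypothetical and only knows the five flags shown in that output.

```shell
# Hypothetical helper: list the features a disable_features value turns off.
# Flag values and descriptions are copied from the modinfo output.
decode_disable_features() {
    value=$(( $1 ))   # accepts decimal or 0x-prefixed hexadecimal
    while read -r flag description; do
        if [ $(( $value & $flag )) -ne 0 ]; then
            echo "$flag $description"
        fi
    done <<'EOF'
0x01 disable SMBus PEC
0x02 disable the block buffer
0x08 disable the I2C block read functionality
0x10 don't use interrupts
0x20 disable SMBus Host Notify
EOF
}

# Usage:
#   decode_disable_features 0x10
```

The value used in this post, 0x10, maps to the single "don't use interrupts" flag.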

You can stop this module from using interrupts by defining a modprobe configuration, but this came too late for me, as I couldn't decrypt the system disk at boot, so this file could not be read in time.

$ cat <<EOF | sudo tee /etc/modprobe.d/i2c-i801.conf
options i2c-i801 disable_features=0x10
EOF
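
Once the module is loaded, it should be possible to read the active value back from sysfs. This sketch assumes the parameter is exposed at /sys/module/i2c_i801/parameters/disable_features (an assumption about the running kernel, so the helper returns a distinct status if the file is missing); the interrupts_disabled name is made up for illustration.

```shell
# Hypothetical helper: succeed if the "don't use interrupts" bit (0x10) is set
# in the value stored in the given parameter file.
# Returns 2 if the file cannot be read (module not loaded or path differs).
interrupts_disabled() {
    param_file=${1:-/sys/module/i2c_i801/parameters/disable_features}
    [ -r "$param_file" ] || return 2
    value=$(cat "$param_file")
    [ $(( $value & 0x10 )) -ne 0 ]
}

# Usage:
#   interrupts_disabled && echo "i2c-i801 is not using interrupts"
```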

The better solution is to alter the default GRUB configuration so that the same option is passed on the kernel command line.

$ cat /etc/default/grub
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT_STYLE=hidden
GRUB_TIMEOUT=0
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="i2c-i801.disable_features=0x10 quiet splash"
GRUB_CMDLINE_LINUX=""

# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

# Uncomment to disable graphical terminal (grub-pc only)
#GRUB_TERMINAL=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true

# Uncomment to disable generation of recovery mode menu entries
#GRUB_DISABLE_RECOVERY="true"

# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"
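
The only change in the file above is the extra option at the start of GRUB_CMDLINE_LINUX_DEFAULT. I edited it by hand, but the same edit could be scripted. The following is a sketch only: the add_kernel_param helper is hypothetical, it assumes the value is enclosed in double quotes as shown above, and it assumes the parameter contains no characters that are special to sed (such as & or |).

```shell
# Hypothetical helper: prepend a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT
# in the given file unless it is already there. Leaves a .bak copy of the file.
add_kernel_param() {
    file=$1
    param=$2
    # Skip the edit if the parameter is already on that line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi
    # "&" re-inserts the matched prefix, then the parameter and a space follow.
    sed -i.bak "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"|&$param |" "$file"
}

# Usage (as root):
#   add_kernel_param /etc/default/grub 'i2c-i801.disable_features=0x10'
```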

Update the GRUB configuration.

$ sudo update-grub
Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/grub.d/init-select.cfg'
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.15.0-25-generic
Found initrd image: /boot/initrd.img-5.15.0-25-generic
Memtest86+ needs a 16-bit boot, that is not available on EFI, exiting
Warning: os-prober will not be executed to detect other bootable partitions.
Systems on them will not be added to the GRUB boot configuration.
Check GRUB_DISABLE_OS_PROBER documentation entry.
Adding boot menu entry for UEFI Firmware Settings ...
done

Reboot the operating system.

$ sudo reboot

Everything should now work as expected, as there is no interrupt storm: the i801_smbus line (IRQ 20) is gone from the output.

$ cat /proc/interrupts
            CPU0       CPU1       CPU2       CPU3       
   0:         12          0          0          0  IR-IO-APIC    2-edge      timer
   1:          0          0       4805          0  IR-IO-APIC    1-edge      i8042
   8:          0          0          0          1  IR-IO-APIC    8-fasteoi   rtc0
   9:          0       1652          0          0  IR-IO-APIC    9-fasteoi   acpi
  14:          0     183955          0          0  IR-IO-APIC   14-fasteoi   INT3453:00, INT3453:01, INT3453:03
  15:          0          0          0          0  IR-IO-APIC   15-fasteoi   INT3453:02
  31:    2248499          0          0          0  IR-IO-APIC   31-fasteoi   idma64.0, i2c_designware.0
  39:          0          0     363801          0  IR-IO-APIC   39-fasteoi   mmc0
 120:          0          0          0          0  DMAR-MSI    0-edge      dmar0
 121:          0          0          0          0  DMAR-MSI    1-edge      dmar1
 122:          0          0          0          0  IR-PCI-MSI 311296-edge      PCIe PME
 123:          0          0          0          0  IR-PCI-MSI 315392-edge      PCIe PME
 124:          0          0          0          0  IR-PCI-MSI 317440-edge      PCIe PME
 125:          0          0          0          3  IR-PCI-MSI 1048576-edge      rtsx_pci
 126:          0       4875          0          0  IR-PCI-MSI 344064-edge      xhci_hcd
 127:          0     183174          0          0  INT3453:00   18  ELAN0503:00
 128:          0          0          0          0  IR-PCI-MSI 294912-edge      ahci[0000:00:12.0]
 129:          0          0          0          0  IR-PCI-MSI 1050624-edge      enp2s0f1
 130:        487          0     987502          0  IR-PCI-MSI 32768-edge      i915
 131:          0         38          0          0  IR-PCI-MSI 245760-edge      mei_me
 132:          0      78403        500          0  IR-PCI-MSI 1572864-edge      ath10k_pci
 133:          0          0          0       1767  IR-PCI-MSI 229376-edge      snd_hda_intel:card0
 NMI:        103        111        115        107   Non-maskable interrupts
 LOC:     740998     739429     839688     708769   Local timer interrupts
 SPU:          0          0          0          0   Spurious interrupts
 PMI:        103        111        115        107   Performance monitoring interrupts
 IWI:      10798      10253     324600       9684   IRQ work interrupts
 RTR:          0          0          0          0   APIC ICR read retries
 RES:      64975      71046     133738      60757   Rescheduling interrupts
 CAL:      56447      91761      63249      58625   Function call interrupts
 TLB:      19908      28139      27394      26391   TLB shootdowns
 TRM:          0          0          0          0   Thermal event interrupts
 THR:          0          0          0          0   Threshold APIC interrupts
 DFR:          0          0          0          0   Deferred Error APIC interrupts
 MCE:          0          0          0          0   Machine check exceptions
 MCP:          9         10         10         10   Machine check polls
 ERR:          0
 MIS:          0
 PIN:          0          0          0          0   Posted-interrupt notification event
 NPI:          0          0          0          0   Nested posted-interrupt event
 PIW:          0          0          0          0   Posted-interrupt wakeup event
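
The same check can be done without reading the whole table. The sketch below, with a hypothetical check_i801_fix helper, looks for the option in /proc/cmdline and confirms that no i801_smbus line is left in /proc/interrupts; both paths can be overridden for testing.

```shell
# Hypothetical helper: report whether the option reached the kernel and
# whether i801_smbus still owns an interrupt line.
check_i801_fix() {
    cmdline=${1:-/proc/cmdline}
    interrupts=${2:-/proc/interrupts}
    if ! grep -q 'i2c-i801\.disable_features=' "$cmdline"; then
        echo "parameter missing from kernel command line"
        return 1
    fi
    if grep -q 'i801_smbus' "$interrupts"; then
        echo "i801_smbus still has an interrupt line"
        return 1
    fi
    echo "ok"
}

# Usage:
#   check_i801_fix
```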

Nice!