How to handle low memory conditions on desktop

Handle low memory conditions on a desktop using nohang, a sophisticated low memory handler for Linux written in Python.

Inspect the nohang package.

$ apt info nohang
Package: nohang
Version: 0.2.0-1
Priority: optional
Section: admin
Maintainer: Yangfl 
Installed-Size: 269 kB
Pre-Depends: init-system-helpers (>= 1.54~)
Depends: python3:any
Suggests: libnotify-bin, sudo
Homepage: https://github.com/hakavlad/nohang
Download-Size: 51.5 kB
APT-Sources: http://deb.debian.org/debian bullseye/main amd64 Packages
Description: sophisticated low memory handler for Linux
 nohang is a highly configurable daemon for Linux which is able to correctly
 prevent out of memory (OOM) and keep system responsiveness in low memory
 conditions.

Install the nohang utility.

$ sudo apt install nohang

Inspect service status.

$ sudo systemctl status nohang
[Service]  
ExecStart=/usr/sbin/nohang --monitor --config /etc/nohang/nohang-desktop.conf
$ sudo systemctl status nohang
● nohang.service - Sophisticated low memory handler
     Loaded: loaded (/lib/systemd/system/nohang.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2021-07-29 17:25:20 UTC; 8s ago
       Docs: man:nohang(8)
             https://github.com/hakavlad/nohang
   Main PID: 1795 (nohang)
      Tasks: 1 (limit: 25)
     Memory: 10.8M (max: 100.0M swap max: 100.0M)
        CPU: 124ms
     CGroup: /hostcritical.slice/nohang.service
             └─1795 /usr/bin/python3 /usr/sbin/nohang --monitor --config /etc/nohang/nohang.conf

Jul 29 17:25:20 bullseye systemd[1]: Started Sophisticated low memory handler.
Jul 29 17:25:20 bullseye nohang[1795]: Starting nohang with config /etc/nohang/nohang.conf
Jul 29 17:25:20 bullseye nohang[1795]: Monitoring has started!

Inspect configuration files.

$ ls /etc/nohang/
nohang-desktop.conf  nohang.conf

On a desktop machine, start with the dedicated desktop configuration. Back up the default configuration file and copy the desktop one over it, as this is much easier than creating a service override.

$ sudo cp /etc/nohang/{nohang.conf,nohang.conf.backup}
$ sudo cp /etc/nohang/{nohang-desktop.conf,nohang.conf}
$ sudo systemctl restart nohang

Check configuration after every modification.

$ sudo nohang --check --config /etc/nohang/nohang.conf 
Starting nohang with config /etc/nohang/nohang.conf

0. Check kernel messages for OOM events
    @check_kmsg:    
    @debug_kmsg:    

1. Common zram settings
    zram_checking_enabled:   False

2. Common PSI settings
    psi_checking_enabled:    True
    psi_path:                /proc/pressure/memory
    psi_metrics:             full_avg10
    psi_excess_duration:     30.0 sec
    psi_post_action_delay:   15.0 sec

3. Poll rate
    fill_rate_mem:   6000.0
    fill_rate_swap:  2000.0
    fill_rate_zram:  4000.0
    max_sleep:       3.0 sec
    min_sleep:       0.1 sec

4. Warnings and notifications
    post_action_gui_notifications:  True
    hide_corrective_action_type:    False
    low_memory_warnings_enabled:    True
    warning_exe:                    
    warning_threshold_min_mem:      3182 MiB, 20.0 %
    warning_threshold_min_swap:     25 %
    warning_threshold_max_zram:     7160 MiB, 45.0 %
    warning_threshold_max_psi:      10.0
    min_post_warning_delay:         60.0 sec
    env_cache_time:                 300.0

5. Soft threshold
    soft_threshold_min_mem:   796 MiB, 5.0 %
    soft_threshold_min_swap:  10 %
    soft_threshold_max_zram:  8751 MiB, 55.0 %
    soft_threshold_max_psi:   40.0

6. Hard threshold
    hard_threshold_min_mem:   318 MiB, 2.0 %
    hard_threshold_min_swap:  4 %
    hard_threshold_max_zram:  9547 MiB, 60.0 %
    hard_threshold_max_psi:   90.0

7. Customize victim selection: adjusting badness of processes

7.1. Ignore positive oom_score_adj
    ignore_positive_oom_score_adj:  False

7.2. Adjusting badness of processes by matching with regular expressions
7.2.1. Matching process names with RE patterns
    badness_adj:  regexp:
             200  ^(Web Content|Privileged Cont|file:// Content)$
            -200  ^(dnf|yum|packagekitd)$
7.2.2. Matching CGroup_v1-line with RE patterns
    (not set)
7.2.3. Matching CGroup_v2-line with RE patterns
    (not set)
7.2.4. Matching eUIDs with RE patterns
    (not set)
7.2.5. Matching realpath with RE patterns
    badness_adj:  regexp:
            -200  ^(/usr/libexec/Xorg|/usr/lib/xorg/Xorg|/usr/lib/Xorg|/usr/bin/X|/usr/bin/Xorg|/usr/bin/Xwayland|/usr/bin/weston|/usr/bin/sway)$
            -200  ^(/usr/bin/gnome-shell|/usr/bin/metacity|/usr/bin/mutter|/usr/lib/gnome-session/gnome-session-binary|/usr/libexec/gnome-session-binary|/usr/libexec/gnome-session-ctl)$
            -200  ^(/usr/bin/plasma-desktop|/usr/bin/plasmashell|/usr/bin/plasma_session|/usr/bin/kwin|/usr/bin/kwin_x11|/usr/bin/kwin_wayland)$
            -200  ^(/usr/bin/startplasma-wayland|/usr/lib/x86_64-linux-gnu/libexec/startplasma-waylandsession|/usr/bin/ksmserver)$
            -200  ^(/usr/bin/cinnamon|/usr/bin/muffin|/usr/bin/cinnamon-session|/usr/bin/cinnamon-launcher)$
            -200  ^(/usr/bin/xfwm4|/usr/bin/xfce4-session|/usr/bin/xfce4-panel|/usr/bin/xfdesktop)$
            -200  ^(/usr/bin/marco|/usr/bin/mate-session|/usr/bin/caja|/usr/bin/mate-panel)$
            -200  ^(/usr/bin/lxqt-panel|/usr/bin/pcmanfm-qt|/usr/bin/lxqt-session)$
            -200  ^(/usr/bin/budgie-wm|/usr/bin/budgie-panel)$
            -200  ^(/usr/bin/compiz|/usr/bin/openbox|/usr/bin/fluxbox|/usr/bin/awesome|/usr/bin/icewm|/usr/bin/enlightenment|/usr/bin/gala|/usr/bin/wingpanel|/usr/bin/i3)$
            -200  ^(/usr/sbin/gdm|/usr/sbin/gdm3|/usr/sbin/sddm|/usr/bin/sddm|/usr/lib/x86_64-linux-gnu/sddm/sddm-helper|/usr/bin/slim|/usr/sbin/lightdm|/usr/libexec/gdm-session-worker|/usr/libexec/gdm-wayland-session|/usr/lib/gdm3/gdm-wayland-session|/usr/lib/gdm3/gdm-session-worker)$
            -200  ^/usr/lib/gdm3/
            -200  ^(/lib/systemd/systemd-logind|/usr/lib/systemd/systemd-logind)$
            -200  ^(/lib/systemd/systemd|/usr/lib/systemd/systemd)$
            -200  ^(/usr/bin/dbus-daemon|/usr/bin/dbus-run-session|/usr/bin/dbus-broker-launcher|/usr/bin/dbus-broker)$
            -200  ^(/usr/bin/calamares|/usr/bin/dpkg|/usr/bin/pacman|/usr/bin/yay|/usr/bin/pamac|/usr/bin/pamac-daemon|/usr/bin/pamac-manager)$
7.2.6. Matching /proc/[pid]/cwd realpath with RE patterns
    (not set)
7.2.7. Matching cmdlines with RE patterns
    (not set)
7.2.8. Matching environ with RE patterns
    (not set)

8. Customize soft corrective actions
    (not set)

9. Misc
    max_soft_exit_time:         10.0 sec
    post_kill_exe:              
    min_badness:                1
    post_soft_action_delay:     3.0 sec
    post_zombie_delay:          0.1 sec
    victim_cache_time:          10.0 sec
    exe_timeout:                20.0 sec

10. Verbosity
    print_config_at_startup:    False
    print_mem_check_results:    False
    min_mem_report_interval:    60.0 sec
    print_proc_table:           False
    extra_table_info:           None
    print_victim_status:        True
    print_victim_cmdline:       False
    max_victim_ancestry_depth:  3
    print_statistics:           True
    debug_gui_notifications:    False
    debug_psi:                  False
    debug_sleep:                False
    debug_threading:            False
    separate_log:               False

config is OK
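
The PSI settings in section 2 point at /proc/pressure/memory and the full_avg10 metric. Below is a minimal Python sketch of how such a file can be parsed and how a value can be required to stay above a threshold for a given time. It is not nohang's actual code. The names parse_psi and ExcessTimer are hypothetical, the sample values are taken from the journal output in this post, the total= counters are placeholders, and the reading of psi_excess_duration as "continuously above the threshold for that long" is my assumption.

```python
def parse_psi(text):
    """Parse PSI file contents into a dict such as {"full_avg10": 7.77}."""
    metrics = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind = fields[0]  # "some" or "full"
        for field in fields[1:]:
            key, _, value = field.partition("=")
            if key.startswith("avg"):  # skip the cumulative "total=" counter
                metrics[f"{kind}_{key}"] = float(value)
    return metrics


class ExcessTimer:
    """Return True once a value has stayed at or above a threshold long enough."""

    def __init__(self, threshold, excess_duration):
        self.threshold = threshold
        self.excess_duration = excess_duration
        self.exceeded_since = None  # time of the first sample above the threshold

    def update(self, value, now):
        if value < self.threshold:
            self.exceeded_since = None  # pressure dropped, start over
            return False
        if self.exceeded_since is None:
            self.exceeded_since = now
        return now - self.exceeded_since >= self.excess_duration


# Sample in the /proc/pressure/memory layout; total= values are placeholders.
SAMPLE = """\
some avg10=14.06 avg60=2.60 avg300=0.54 total=0
full avg10=7.77 avg60=1.43 avg300=0.30 total=0
"""

# soft_threshold_max_psi is 40.0 and psi_excess_duration is 30.0 sec above.
timer = ExcessTimer(threshold=40.0, excess_duration=30.0)
pressure = parse_psi(SAMPLE)["full_avg10"]
print(pressure, timer.update(pressure, now=0.0))
```

On a real system the text would come from reading /proc/pressure/memory, and now would be a monotonic clock value.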

Inspect tasks.

$ sudo nohang --tasks --config /etc/nohang/nohang.conf 
Starting nohang with config /etc/nohang/nohang.conf
Tasks state (memory values in mebibytes):
###########################################################################################################
#    PID     PPID  badness  oom_score  oom_score_adj        eUID  S  VmSize  VmRSS  VmSwap  Name             
#-------  -------  -------  ---------  -------------  ----------  -  ------  -----  ------  ---------------
#    384        1      512        512           -250           0  S      39     16       0  systemd-journal  
#    410        1        0          0          -1000           0  S      22      6       0  systemd-udevd    
#    435        1      668        668              0           0  S       5      3       0  cron             
#    436        1        0         70           -900         104  S       8      4       0  dbus-daemon      
#    442        1      677        677              0           0  S      20     16       0  nohang           
#    444        1      669        669              0           0  S     216      4       0  rsyslogd         
#    449        1      471        671              0           0  S      21      7       0  systemd-logind   
#    478        1      670        670              0           0  S      97      5       0  dhclient         
#    587        1      678        678              0           0  S      26     17       0  upgrade-check-nexus  
#    588        1      667        667              0           0  S       3      2       0  agetty           
#    590        1      668        668              0         105  S      19      3       0  chronyd          
#    591        1       36         36           -999           0  S    1022     50       0  containerd       
#    594      590      668        668              0         105  S      11      3       0  chronyd          
#    600        1        0          0          -1000           0  S      13      7       0  sshd             
#    643        1      399        399           -500           0  S    1359     95       0  dockerd          
#    807      600      672        672              0           0  S      14      9       0  sshd             
#    810        1      472        672              0        1000  S      15      8       0  systemd          
#    811      810      468        668              0        1000  S      99      2       0  (sd-pam)         
#    835      807      670        670              0        1000  S      14      6       0  sshd             
#    836      835      669        669              0        1000  S       7      5       0  bash             
#    849      836      669        669              0           0  S       7      4       0  sudo             
###########################################################################################################
Found 21 tasks with non-zero VmRSS (except init and self)
Process with highest badness (found in 9ms):
  PID: 587, Name: upgrade-check-nexus, badness: 678
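
The table shows badness differing from oom_score for a few processes, for example systemd-logind (671 and 471) and dbus-daemon (70 and 0). A minimal Python sketch of the arithmetic that would produce these numbers follows. It assumes that the adjustments of all matching rules are added to oom_score and that the result is floored at zero, which fits the table but is not taken from nohang's source. The function name badness and the realpaths used in the example are assumptions. The two rules are copied from the configuration check output.

```python
import re

# Two of the realpath rules printed by `nohang --check`.
REALPATH_RULES = [
    (-200, re.compile(
        r"^(/lib/systemd/systemd-logind|/usr/lib/systemd/systemd-logind)$")),
    (-200, re.compile(
        r"^(/usr/bin/dbus-daemon|/usr/bin/dbus-run-session"
        r"|/usr/bin/dbus-broker-launcher|/usr/bin/dbus-broker)$")),
]


def badness(oom_score, realpath, rules=REALPATH_RULES):
    """Add the adjustment of every matching rule, never going below zero."""
    adjustment = sum(adj for adj, pattern in rules if pattern.search(realpath))
    return max(0, oom_score + adjustment)


# Realpaths below are assumed; oom_score values come from the table.
print(badness(671, "/lib/systemd/systemd-logind"))  # systemd-logind
print(badness(70, "/usr/bin/dbus-daemon"))          # dbus-daemon
print(badness(668, "/usr/sbin/cron"))               # no rule matches
```

The process with the highest resulting value is the one nohang reports as the candidate victim.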

Perform a test.

$ sudo nohang --memload
Enter the numbers 298700 to confirm that you are not a robot: 298700
--------------------------------------------------------------------
Warning! The process will consume memory until 40 MiB of memory
(MemAvailable + SwapFree) remain free, and it will be terminated via SIGUSR1
at the end. This may cause the system to freeze and processes to terminate.
Do you want to continue? [No/Yes] Yes
Memory consumption has started!
MemAvailable: 761 MiB, SwapFree: 57 MiB             
zsh: terminated  sudo nohang --memload
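
The test stops when the sum of MemAvailable and SwapFree gets low. Both values come from /proc/meminfo, so the remaining headroom can be watched independently of nohang. This is a minimal Python sketch with hypothetical function names (parse_meminfo, headroom_mib). The sample text reproduces the 761 MiB and 57 MiB shown above, converted to the kB units used by /proc/meminfo.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo contents into {field name: integer value}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts:
            values[key.strip()] = int(parts[0])  # value in kB for memory fields
    return values


def headroom_mib(text):
    """Return MemAvailable + SwapFree in MiB."""
    info = parse_meminfo(text)
    return (info["MemAvailable"] + info["SwapFree"]) / 1024


# 761 MiB and 57 MiB from the memload output, expressed in kB.
SAMPLE = """\
MemAvailable:     779264 kB
SwapFree:          58368 kB
"""

print(headroom_mib(SAMPLE))
```

On a live system, pass the contents of /proc/meminfo instead of the sample text.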

Inspect logs.

$ sudo journalctl -u nohang
-- Journal begins at Sun 2021-05-16 20:11:42 CEST, ends at Thu 2021-07-29 20:08:25 CEST. --
lip 29 18:02:29 desktop systemd[1]: Started Sophisticated low memory handler.
lip 29 18:02:29 desktop nohang[4432]: Starting nohang with config /etc/nohang/nohang.conf
lip 29 18:02:29 desktop nohang[4432]: Monitoring has started!
lip 29 18:03:41 desktop nohang[4432]: >>=== STARTING implement_corrective_action() ====>>
lip 29 18:03:41 desktop nohang[4432]: Memory status that requires corrective actions:
lip 29 18:03:41 desktop nohang[4432]:   MemAvailable [25 MiB, 0.2 %] <= soft_threshold_min_mem [796 MiB, 5.0 %]
lip 29 18:03:41 desktop nohang[4432]:   SwapFree [94 MiB, 9.6 %] <= soft_threshold_min_swap [98 MiB, 10.0 %]
lip 29 18:03:41 desktop nohang[4432]: Found 132 tasks with non-zero oom_score (except init and self) in 3ms
lip 29 18:03:41 desktop nohang[4432]: TOP-15 tasks by badness:
lip 29 18:03:41 desktop nohang[4432]:   Name                PID badness
lip 29 18:03:41 desktop nohang[4432]:   --------------- ------- -------
lip 29 18:03:41 desktop nohang[4432]:   nohang             5470    1202
lip 29 18:03:41 desktop nohang[4432]:   firefox            3448     684
lip 29 18:03:41 desktop nohang[4432]:   Web Content        3731     676
lip 29 18:03:41 desktop nohang[4432]:   snap-store         2713     675
lip 29 18:03:41 desktop nohang[4432]:   Web Content        3757     674
lip 29 18:03:41 desktop nohang[4432]:   gnome-shell        2458     673
lip 29 18:03:41 desktop nohang[4432]:   WebExtensions      3978     673
lip 29 18:03:41 desktop nohang[4432]:   Web Content        3690     672
lip 29 18:03:41 desktop nohang[4432]:   Web Content        3776     672
lip 29 18:03:41 desktop nohang[4432]:   Xorg               2292     670
lip 29 18:03:41 desktop nohang[4432]:   Web Content        3679     670
lip 29 18:03:41 desktop nohang[4432]:   Web Content        3710     670
lip 29 18:03:41 desktop nohang[4432]:   Web Content        3737     670
lip 29 18:03:41 desktop nohang[4432]:   Web Content        3751     670
lip 29 18:03:41 desktop nohang[4432]:   Privileged Cont    3564     668
lip 29 18:03:41 desktop nohang[4432]: TOP printed in 0ms; process with highest badness:
lip 29 18:03:41 desktop nohang[4432]:   PID: 5470, name: nohang, badness: 1202
lip 29 18:03:41 desktop nohang[4432]: Recheck memory levels...
lip 29 18:03:41 desktop nohang[4432]: Memory status that requires corrective actions:
lip 29 18:03:41 desktop nohang[4432]:   MemAvailable [25 MiB, 0.2 %] <= soft_threshold_min_mem [796 MiB, 5.0 %]
lip 29 18:03:41 desktop nohang[4432]:   SwapFree [90 MiB, 9.2 %] <= soft_threshold_min_swap [98 MiB, 10.0 %]
lip 29 18:03:41 desktop nohang[4432]: Victim status (found in 2ms):
lip 29 18:03:41 desktop nohang[4432]:   PID:       5470, name: nohang, state: D (disk sleep), EUID: 1000, SID: 3633 (zsh), lifetime: 32.4s
lip 29 18:03:41 desktop nohang[4432]:   badness:   1202, oom_score:  1202, oom_score_adj:  0
lip 29 18:03:41 desktop nohang[4432]:   Vm, MiB:   Size: 13580, RSS: 13568 (Anon: 13562, File: 6, Shmem: 0), Swap: 0
lip 29 18:03:41 desktop nohang[4432]:   cgroup_v1: /user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-4a52582e-1de>
lip 29 18:03:41 desktop nohang[4432]:   cgroup_v2: /user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-4a52582e-1de>
lip 29 18:03:41 desktop nohang[4432]:   ancestry:  PID 3633 (zsh) <= PID 3632 (tmux: server) <= PID 2191 (systemd)
lip 29 18:03:41 desktop nohang[4432]:   exe realpath: /usr/bin/python3.9
lip 29 18:03:41 desktop nohang[4432]:   cwd realpath: /home/milosz
lip 29 18:03:41 desktop nohang[4432]: Memory info, MiB:
lip 29 18:03:41 desktop nohang[4432]:   total=15911, used=15508, free=131, available=27, shared=146, buffers=2, cache=270,
lip 29 18:03:41 desktop nohang[4432]:   swap_total=980, swap_used=890, swap_free=89
lip 29 18:03:41 desktop nohang[4432]: Memory pressure (system-wide):
lip 29 18:03:41 desktop nohang[4432]:   some avg10=14.06 avg60=2.60 avg300=0.54
lip 29 18:03:41 desktop nohang[4432]:   full avg10=7.77 avg60=1.43 avg300=0.30
lip 29 18:03:41 desktop nohang[4432]: Implementing a corrective action:
lip 29 18:03:41 desktop nohang[4432]:   Sending SIGTERM to the victim
lip 29 18:03:41 desktop nohang[4432]: OK; total response time: 12ms
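
The log lists two conditions, one for memory and one for swap, before SIGTERM is sent. The numbers can be checked with a short Python sketch. It assumes that both conditions must hold at the same time and that the hard threshold leads to SIGKILL, which is my reading of the log and the configuration, not nohang's actual code. The function names are hypothetical. The percentages come from the soft and hard threshold sections of the configuration check.

```python
def percent_of(total_mib, percent):
    """Convert a percentage threshold into MiB."""
    return total_mib * percent / 100


def pick_signal(mem_available, swap_free, mem_total, swap_total):
    """Return the signal name for the current memory state, or None."""
    # (signal, min_mem percent, min_swap percent), hard threshold first.
    thresholds = [("SIGKILL", 2.0, 4.0), ("SIGTERM", 5.0, 10.0)]
    for signal_name, mem_percent, swap_percent in thresholds:
        low_mem = mem_available <= percent_of(mem_total, mem_percent)
        low_swap = swap_free <= percent_of(swap_total, swap_percent)
        if low_mem and low_swap:  # assumption: both conditions are required
            return signal_name
    return None


# Values from the log: total=15911, swap_total=980, MemAvailable 25, SwapFree 94.
print(round(percent_of(15911, 5.0)))       # soft_threshold_min_mem in MiB
print(pick_signal(25, 94, 15911, 980))
```

With 94 MiB of free swap the hard swap threshold (4 % of 980 MiB) is not reached, so only the soft threshold applies, which matches the SIGTERM in the log.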

Additional notes

There are additional utilities included in this package, but the commands shown above cover the essentials.

Customize the configuration file, which is well documented, and run the check after each change to verify it. Remember to restart the service afterward.

This or a similar OOM helper is a must on a Linux desktop. I have lost countless hours due to a lack of system responsiveness during low memory conditions.