Increase the system-wide number of available file handles (open files).

Recently, I could not connect to the example.org server with the OpenSSH client: the session hung right after authentication instead of giving me a shell prompt.

$ ssh example.org -vv
[...]
debug2: languages ctos:
debug2: languages stoc:
debug2: first_kex_follows 0
debug2: reserved 0
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: ecdsa-sha2-nistp256
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: Server host key: ecdsa-sha2-nistp256 SHA256:02f7XfMGsmlj94TAm9LqxS1rOHcL0GUc1Phwz0gXkigw
debug1: Host 'example.org' is known and matches the ECDSA host key.
debug1: Found key in /home/milosz/.ssh/known_hosts:885
debug2: set_newkeys: mode 1
debug1: rekey after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug2: set_newkeys: mode 0
debug1: rekey after 134217728 blocks
debug2: key: /home/milosz/.ssh/id_rsa (0x55ed2a3ecee0)
debug2: key: /home/milosz/.ssh/id_dsa ((nil))
debug2: key: /home/milosz/.ssh/id_ecdsa ((nil))
debug2: key: /home/milosz/.ssh/id_ed25519 ((nil))
debug1: SSH2_MSG_EXT_INFO received
debug1: kex_input_ext_info: server-sig-algs=<rsa-sha2-256,rsa-sha2-512>
debug2: service_accept: ssh-userauth
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic
debug1: Next authentication method: gssapi-keyex
debug1: No valid Key exchange context
debug2: we did not send a packet, disable method
debug1: Next authentication method: gssapi-with-mic
debug1: Unspecified GSS failure.  Minor code may provide more information
No Kerberos credentials available (default cache: FILE:/tmp/krb5cc_1000)
debug1: Unspecified GSS failure.  Minor code may provide more information
No Kerberos credentials available (default cache: FILE:/tmp/krb5cc_1000)
debug2: we did not send a packet, disable method
debug1: Next authentication method: publickey
debug1: Offering public key: RSA SHA256:wqGEmK0hE2F0rQaTEg1v0sVXGIN913A8aGZEd3Zmz6QQ /home/milosz/.ssh/id_rsa
debug2: we sent a publickey packet, wait for reply
debug1: Server accepts key: pkalg rsa-sha2-512 blen 535
debug2: input_userauth_pk_ok: fp SHA256:wqGEmK0hE2F0rQaTEg1v0sVXGIN913A8aGZEd3Zmz6QQ
Enter passphrase for key '/home/milosz/.ssh/id_rsa': ****************
debug1: Authentication succeeded (publickey).
Authenticated to example.org ([172.16.0.18]:22).
debug1: channel 0: new [client-session]
debug2: channel 0: send open
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug1: pledge: network
[There is nothing more.]

The real culprit was the system-wide limit on file handles, which had been reached.

$ dmesg
[...]
[5512385.576209] VFS: file-max limit 100000 reached
[5512385.635161] VFS: file-max limit 100000 reached
[5512385.650384] VFS: file-max limit 100000 reached
[...]

Other running services were affected as well.

[2020-01-17T23:56:48,463][WARN ][i.n.c.DefaultChannelPipeline] [NJj8BTd] An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.io.IOException: Too many open files in system
	at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) ~[?:?]
	at sun.nio.ch.ServerSocketChannelImpl.accept(Unknown Source) ~[?:?]
	at sun.nio.ch.ServerSocketChannelImpl.accept(Unknown Source) ~[?:?]
	at io.netty.util.internal.SocketUtils$5.run(SocketUtils.java:110) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.util.internal.SocketUtils$5.run(SocketUtils.java:107) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:?]
	at io.netty.util.internal.SocketUtils.accept(SocketUtils.java:107) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.socket.nio.NioServerSocketChannel.doReadMessages(NioServerSocketChannel.java:143) ~[netty-transport-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:75) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:656) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) [netty-common-4.1.32.Final.jar:4.1.32.Final]
	at java.lang.Thread.run(Unknown Source) [?:?]

Display the system-wide maximum number of file handles.

$ sudo sysctl fs.file-max
fs.file-max = 100000

Display the number of file handles in use. The first value is the number of allocated file handles, the second is the number of allocated but unused file handles, and the third is the maximum number of file handles.

$ sudo sysctl fs.file-nr
fs.file-nr = 100128     0       100000
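
To see how close the system is to the limit, the three fields can be turned into a percentage. This is only a minimal sketch and was not part of the original troubleshooting session; the function name file_handle_usage is hypothetical, and it assumes the three-field format shown above.

```shell
#!/bin/sh
# Print the percentage of the system-wide file handle limit that is in use.
# $1: the three fs.file-nr fields (allocated, unused, maximum).
#     When omitted, the live values are read from /proc/sys/fs/file-nr.
file_handle_usage() {
    fields="${1:-$(cat /proc/sys/fs/file-nr)}"
    # Word-split the fields into positional parameters.
    set -- $fields
    allocated=$1
    unused=$2
    maximum=$3
    # Handles in use = allocated minus allocated-but-unused (integer math).
    echo $(( (allocated - unused) * 100 / maximum ))
}

# Values taken from the output above.
file_handle_usage "100128 0 100000"
```

With the values from the output above, the result is 100, which matches the "file-max limit 100000 reached" kernel messages. Called without an argument, the function reads the current values.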

Increase the maximum number of file handles.

$ sudo sysctl -w fs.file-max=500000
fs.file-max = 500000

Make this change persistent across reboots.

$ echo "fs.file-max = 500000" | sudo tee /etc/sysctl.d/local-file-max.conf
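
Before relying on the next reboot, it may be worth checking that the file really contains the expected value. The helper below is a rough sketch under the assumption that the file uses the plain "key = value" format written above; the function name configured_sysctl is hypothetical.

```shell
#!/bin/sh
# Print the value assigned to a sysctl key in a sysctl.d-style file.
# $1: key, for example fs.file-max
# $2: configuration file to read
configured_sysctl() {
    awk -F= -v key="$1" '
        /^[ \t]*[#;]/ { next }    # skip comment lines
        {
            k = $1; v = $2
            # Trim whitespace around the key and the value.
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            # Remember the last assignment for the requested key.
            if (k == key) value = v
        }
        END { if (value != "") print value }
    ' "$2"
}

conf=/etc/sysctl.d/local-file-max.conf
if [ -r "$conf" ]; then
    configured_sysctl fs.file-max "$conf"
fi
```

The printed value can then be compared with the running one, for example with sysctl -n fs.file-max.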

Verify the system-wide maximum number of file handles.

$ sudo sysctl fs.file-max
fs.file-max = 500000

Display the eight commands with the most open files to determine the next step.

$ lsof | awk '{print $1}' | sort | uniq -c | sort -rn | head -8
  89712 mysqld
  61587 java
  14475 varnishd
   1102 proxysql
    760 php-fpm
    688 redis-server
    410 nginx
    348 tuned
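
Keep in mind that lsof also lists entries that are not file descriptors, such as the current directory, program text, and memory-mapped files, so these counts are only a rough guide. As a cross-check, the entries in each /proc/<pid>/fd directory can be counted instead. The following is a sketch rather than a definitive tool; the function name top_fd_users and its second parameter are hypothetical, and it needs root privileges to see descriptors of other users' processes.

```shell
#!/bin/sh
# Print "count pid command" for the processes with the most open descriptors,
# by counting the entries in each /proc/<pid>/fd directory.
# $1: how many processes to show (default 8)
# $2: proc mount point (default /proc); a parameter only to ease testing
top_fd_users() {
    limit="${1:-8}"
    proc_root="${2:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        [ -d "$dir/fd" ] || continue
        # Unreadable fd directories (other users' processes) count as 0.
        count=$(( $(ls "$dir/fd" 2>/dev/null | wc -l) ))
        name=$(cat "$dir/comm" 2>/dev/null)
        echo "$count ${dir##*/} ${name:-?}"
    done | sort -rn | head -n "$limit"
}

top_fd_users 8
```

Unlike the lsof pipeline, this reports individual processes rather than totals per command name, so a service that runs many workers shows up as several smaller entries.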

Additional information

See the kernel documentation for /proc/sys/fs/* for more information.
