Increase the system-wide number of available file handles (open files).
Recently, I could not connect to the example.org server with the OpenSSH client. After authentication succeeded, the session simply stopped responding instead of giving me a shell prompt.
$ ssh example.org -vv
[...]
debug2: languages ctos:
debug2: languages stoc:
debug2: first_kex_follows 0
debug2: reserved 0
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: ecdsa-sha2-nistp256
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: Server host key: ecdsa-s2a2-nistp256 SHA256:02f7XfMGsmlj94TAm9LqxS1rOHcL0GUc1Phwz0gXkigw
debug1: Host 'example.org' is known and matches the ECDSA host key.
debug1: Found key in /home/milosz/.ssh/known_hosts:885
debug2: set_newkeys: mode 1
debug1: rekey after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug2: set_newkeys: mode 0
debug1: rekey after 134217728 blocks
debug2: key: /home/milosz/.ssh/id_rsa (0x55ed2a3ecee0)
debug2: key: /home/milosz/.ssh/id_dsa ((nil))
debug2: key: /home/milosz/.ssh/id_ecdsa ((nil))
debug2: key: /home/milosz/.ssh/id_ed25519 ((nil))
debug1: SSH2_MSG_EXT_INFO received
debug1: kex_input_ext_info: server-sig-algs=<rsa-sha2-256,rsa-sha2-512>
debug2: service_accept: ssh-userauth
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic
debug1: Next authentication method: gssapi-keyex
debug1: No valid Key exchange context
debug2: we did not send a packet, disable method
debug1: Next authentication method: gssapi-with-mic
debug1: Unspecified GSS failure. Minor code may provide more information
No Kerberos credentials available (default cache: FILE:/tmp/krb5cc_1000)
debug1: Unspecified GSS failure. Minor code may provide more information
No Kerberos credentials available (default cache: FILE:/tmp/krb5cc_1000)
debug2: we did not send a packet, disable method
debug1: Next authentication method: publickey
debug1: Offering public key: RSA SHA256:wqGEmK0hE2F0rQaTEg1v0sVXGIN913A8aGZEd3Zmz6QQ /home/milosz/.ssh/id_rsa
debug2: we sent a publickey packet, wait for reply
debug1: Server accepts key: pkalg rsa-sha2-512 blen 535
debug2: input_userauth_pk_ok: fp SHA256:wqGEmK0hE2F0rQaTEg1v0sVXGIN913A8aGZEd3Zmz6QQ
Enter passphrase for key '/home/milosz/.ssh/id_rsa': ****************
debug1: Authentication succeeded (publickey).
Authenticated to example.org ([172.16.0.18]:22).
debug1: channel 0: new [client-session]
debug2: channel 0: send open
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug1: pledge: network
[There is nothing more.]
The real culprit was that the system-wide limit on open file handles had been reached.
$ dmesg
[...]
[5512385.576209] VFS: file-max limit 100000 reached
[5512385.635161] VFS: file-max limit 100000 reached
[5512385.650384] VFS: file-max limit 100000 reached
[...]
Other running services were affected as well.
[2020-01-17T23:56:48,463][WARN ][i.n.c.DefaultChannelPipeline] [NJj8BTd] An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.io.IOException: Too many open files in system
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) ~[?:?]
    at sun.nio.ch.ServerSocketChannelImpl.accept(Unknown Source) ~[?:?]
    at sun.nio.ch.ServerSocketChannelImpl.accept(Unknown Source) ~[?:?]
    at io.netty.util.internal.SocketUtils$5.run(SocketUtils.java:110) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.util.internal.SocketUtils$5.run(SocketUtils.java:107) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:?]
    at io.netty.util.internal.SocketUtils.accept(SocketUtils.java:107) ~[netty-common-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.socket.nio.NioServerSocketChannel.doReadMessages(NioServerSocketChannel.java:143) ~[netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:75) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:656) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) [netty-common-4.1.32.Final.jar:4.1.32.Final]
    at java.lang.Thread.run(Unknown Source) [?:?]
Display the system-wide number of available file handles.
$ sudo sysctl fs.file-max
fs.file-max = 100000
Display the number of file handles in use. The first value is the number of allocated file handles, the second is the number of allocated but unused file handles, and the third is the maximum number of file handles (the same value as fs.file-max).
$ sudo sysctl fs.file-nr
fs.file-nr = 100128 0 100000
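The three fs.file-nr values can be turned into a single usage figure, which is easier to watch than the raw numbers. The following is a minimal sketch, not part of the original procedure; the function name file_handle_usage is made up for illustration, and it assumes the standard procfs path /proc/sys/fs/file-nr.

```shell
#!/bin/sh
# Hypothetical helper: summarize file handle usage.
# Arguments: allocated, unused, maximum (the three fs.file-nr fields).
file_handle_usage() {
    allocated=$1
    unused=$2
    maximum=$3
    in_use=$((allocated - unused))
    # Integer arithmetic, so the percentage is rounded down.
    echo "$in_use of $maximum file handles in use ($((in_use * 100 / maximum))%)"
}

# On Linux, read the live values from procfs when they are available.
if [ -r /proc/sys/fs/file-nr ]; then
    read -r allocated unused maximum < /proc/sys/fs/file-nr
    file_handle_usage "$allocated" "$unused" "$maximum"
fi
```

With the values shown above, calling file_handle_usage 100128 0 100000 reports 100%, which matches the "file-max limit 100000 reached" kernel messages.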
Increase the maximum number of file handles.
$ sudo sysctl -w fs.file-max=500000
fs.file-max = 500000
Make this change persist across reboots.
$ echo "fs.file-max = 500000" | sudo tee /etc/sysctl.d/local-file-max.conf
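Before relying on the next reboot, it may be worth checking that the value stored in the file matches the running one. This is a rough sketch under a few assumptions: the helper name conf_value is hypothetical, the file uses the simple "key = value" form written above, and the running value is read from /proc/sys/fs/file-max.

```shell
#!/bin/sh
# Hypothetical helper: print the last value assigned to a key
# in a sysctl.d-style "key = value" file.
conf_value() {
    awk -F= -v key="$1" '
        { k = $1; gsub(/[ \t]/, "", k) }
        k == key { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); found = v }
        END { if (found != "") print found }
    ' "$2"
}

conf_file=/etc/sysctl.d/local-file-max.conf
# Compare only when both the file and the procfs entry are readable.
if [ -r "$conf_file" ] && [ -r /proc/sys/fs/file-max ]; then
    persisted=$(conf_value fs.file-max "$conf_file")
    running=$(cat /proc/sys/fs/file-max)
    if [ "$persisted" = "$running" ]; then
        echo "fs.file-max: persisted value $persisted matches the running value"
    else
        echo "fs.file-max: persisted $persisted, running $running"
    fi
fi
```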
Verify the system-wide number of available file handles.
$ sudo sysctl fs.file-max
fs.file-max = 500000
Display the eight commands with the most open files, as reported by lsof, to determine the next step. The counts are sorted numerically.
$ lsof | awk '{print $1}' | sort | uniq -c | sort -rn | head -8
89712 mysqld
61587 java
14475 varnishd
1102 proxysql
760 php-fpm
688 redis-server
410 nginx
348 tuned
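lsof prints one row per open file entry, including entries such as the working directory and memory-mapped files, so its totals can overstate how many descriptors a process actually holds. As an alternative, here is a rough sketch that counts the entries in /proc/<pid>/fd for each process. The function name count_fds is made up for illustration, and the output is still an approximation of file handle usage, because several descriptors can share one file handle.

```shell
#!/bin/sh
# Hypothetical helper: print "<count> <pid> <command>" for each process,
# based on the entries in <proc_root>/<pid>/fd.
# Run as root to see every process; unreadable directories count as 0.
count_fds() {
    proc_root=${1:-/proc}
    for dir in "$proc_root"/[0-9]*; do
        [ -d "$dir/fd" ] || continue
        count=$(ls "$dir/fd" 2>/dev/null | wc -l | tr -d ' ')
        name=$(cat "$dir/comm" 2>/dev/null)
        echo "$count ${dir##*/} ${name:-unknown}"
    done
}

# Show the eight processes with the most open file descriptors.
count_fds | sort -rn | head -8
```

Unlike the lsof pipeline, this lists individual processes rather than command names, so several rows can belong to the same service.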
Additional information
Please read the documentation for /proc/sys/fs/* for more information.