In almost every programming language you can quickly run into the problem that an app opens connections but never closes them. Some programming languages close unused connections “automatically”, and this sometimes works better, sometimes worse. It is good practice to check whether the vendors’ “promises” actually hold.
Such open connections (I/O, databases, net/HTTP requests, etc.) accumulate over time until there are no more free connections. Boom, in simple words: the app stops working. The annoying thing is that the app may not even crash (depending on your error handling!); often it just can’t establish new connections and doesn’t work as expected.
I don’t want to go into the details of how to solve such problems here, because they depend on the programming language you choose and, above all, on your programming style and how you handle errors. But the most important thing is to recognize such errors.
In Golang, for example, you might get error messages like the following:
accept4: too many open files; retrying in 1s
And especially newbies might now be completely lost.
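What the message means: the process has reached its limit of open file descriptors, and every open connection counts as one. At some point the server was full and no more free connections were possible.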
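To see where that limit sits for a given process, you can read it from /proc. This is only a minimal sketch, assuming Linux (where /proc/<PID>/limits exists); the function name fd_limit is my own and not part of any tool.

```shell
# fd_limit PID: print the soft "Max open files" limit of a process.
# Assumption: Linux, where /proc/<PID>/limits is available.
fd_limit() {
    # The line looks like: "Max open files   1024   524288   files"
    # so the soft limit is the fourth whitespace-separated field.
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Limit of the current shell; this is the same value `ulimit -n` reports.
fd_limit $$
```

If the number of open connections of your app creeps toward this value, the "too many open files" error is only a matter of time.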
The following command lists everything a specific process currently has open, including its connections:
lsof -p <PID> -r 10
The PID is the ID of the process, which can be found using the top command, for example. The 10 stands for 10 seconds in this example and means that the command is executed every 10 seconds.
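lsof prints one line per open file, so if you only want a number you have to count the lines. Here is a rough sketch of two ways to do that; the function names count_lsof and count_fds are made up for this example, and the second one assumes Linux with a readable /proc/<PID>/fd directory.

```shell
# count_lsof PID: number of entries lsof reports for the process.
# Skips the header line. Note that lsof also lists the working
# directory, the binary and mapped libraries, not only descriptors.
count_lsof() {
    lsof -p "$1" | tail -n +2 | wc -l
}

# count_fds PID: number of open file descriptors, read from /proc.
# Assumption: Linux, and permission to read /proc/<PID>/fd.
count_fds() {
    ls "/proc/$1/fd" | wc -l
}

# Usage (1234 is a placeholder PID):
#   count_lsof 1234
#   count_fds 1234
```

The two numbers will differ slightly for the same process. For spotting a leak that does not matter; what counts is whether the number keeps growing.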
Most of the time, however, we want to check services that have a name. In this case, we can combine the pgrep command with lsof:
lsof -p $(pgrep -o -x bes) -r 10
My service is called “bes”. The command above looks up the PID of “bes” (-x matches the name exactly, -o picks the oldest matching process) and then, every 10 seconds, runs lsof with that PID and lists what “bes” has open, connections included. You only need to replace “bes” with the name of the service you want to monitor, for instance nginx.
In this case there are “only” 8. Yesterday it was still 2000 and growing until the app broke overnight ;-). I had created a leak in one of my apps because I forgot to close a MongoDB connection.
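If you would rather log one number per interval than scroll through the full lsof list, a small loop does the job. This is just a sketch under a few assumptions: Linux with /proc, pgrep installed, and a function name (watch_fds) and a ROUNDS parameter that I invented for this example.

```shell
# watch_fds NAME [INTERVAL] [ROUNDS]
# Prints a timestamped count of open file descriptors for the oldest
# process called NAME. INTERVAL is in seconds (default 10).
# ROUNDS limits the number of samples; 0 (default) means run forever.
watch_fds() {
    name="$1"
    interval="${2:-10}"
    rounds="${3:-0}"
    i=0
    while [ "$rounds" -eq 0 ] || [ "$i" -lt "$rounds" ]; do
        # Same lookup as in the lsof command: oldest exact match.
        pid=$(pgrep -o -x "$name") || {
            echo "no process named $name" >&2
            return 1
        }
        count=$(ls "/proc/$pid/fd" | wc -l)
        printf '%s %s pid=%s fds=%s\n' "$(date +%T)" "$name" "$pid" "$count"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Usage: sample "bes" every 10 seconds until interrupted with Ctrl+C
#   watch_fds bes 10
```

Redirect the output to a file and you can see the next morning whether the count stayed flat or climbed steadily, which is the typical signature of a leak.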
Good luck identifying such leaks in your programs!