Two processes can shadow the same port

This is part of the Semicolon&Sons Code Diary - consisting of lessons learned on the job. You're in the networking category.

Last Updated: 2021-05-16

I got this error when running the Project M main service via Docker:

{"message":"connectDatabase failed at event sorcerer with the following error: role \"user\" does not exist","error":...}

This was despite running the accompanying postgres DB container they provided. The issue was that their postgres Docker container was published (port mapped) on localhost:5432, which is also the default port for a postgres server running directly on the host. The service was therefore talking to the wrong postgres instance (the one on my host rather than the one in Docker), and that instance had no "user" role.

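Here is the output of lsof relating to postgres on my host after stopping docker. It listens on TCP port 5432, on both the IPv4 and IPv6 loopback addresses. The 60902 entries are a UDP socket shared by the postgres child processes, not a TCP listener.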

# Command | PID  | ...
postgres  36880           jack    5u  IPv6 0x99adf8619ab18b8d      0t0  TCP [::1]:5432 (LISTEN)
postgres  36880           jack    6u  IPv4 0x99adf86188cb0a8d      0t0  TCP 127.0.0.1:5432 (LISTEN)
postgres  36880           jack   10u  IPv6 0x99adf8617e115575      0t0  UDP [::1]:60902->[::1]:60902
postgres  36916           jack   10u  IPv6 0x99adf8617e115575      0t0  UDP [::1]:60902->[::1]:60902
postgres  36917           jack   10u  IPv6 0x99adf8617e115575      0t0  UDP [::1]:60902->[::1]:60902
postgres  36918           jack   10u  IPv6 0x99adf8617e115575      0t0  UDP [::1]:60902->[::1]:60902
postgres  36920           jack   10u  IPv6 0x99adf8617e115575      0t0  UDP [::1]:60902->[::1]:60902
postgres  36921           jack   10u  IPv6 0x99adf8617e115575      0t0  UDP [::1]:60902->[::1]:60902
postgres  36925           jack   10u  IPv6 0x99adf8617e115575      0t0  UDP [::1]:60902->[::1]:60902

To demonstrate that two processes can listen on the same port, I ran the following Ruby server on port 5432, the same port postgres was using:

require "socket"
server = TCPServer.new(5432)
loop do
  client = server.accept    # Wait for a client to connect
  client.puts "Hello !"
  client.puts "Time is #{Time.now}"
  client.close
end

This Ruby program should send back the current time when I curl localhost:5432, but it does not, presumably because postgres is shadowing the port.

If I run lsof, we see another entry corresponding to this process on the same port (5432)

... # postgres stuff from above
ruby      38127           jack   12u  IPv6 0x99adf8619ab1914d      0t0  TCP *:5432 (LISTEN)

If I shut down my local postgres, then curling this Ruby server will work:

$ brew services stop postgresql
# Service stopped
$ curl localhost:5432
Hello !
Time is 2020-03-04 17:14:39 +0700

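Inspecting the two sets of output from lsof above more closely, we see that the Ruby process was listening on the wildcard address *:5432, whereas postgres was listening more specifically on 127.0.0.1:5432 and [::1]:5432. When a connection comes in for a loopback address, the more specific binding wins, so postgres received every connection to localhost:5432 and the Ruby server never saw one.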

We can confirm that specificity was the issue by changing the Ruby script to bind to 127.0.0.1 explicitly. With postgres still running, the bind is rejected:

require "socket"
socket = TCPServer.new("127.0.0.1", 5432)

Errno::EADDRINUSE (Address already in use - bind(2) for "127.0.0.1" port 5432)
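
The same failure can be reproduced without touching postgres. Below is a minimal sketch, assuming nothing beyond Ruby's standard socket library: it opens one listener on a free loopback port and then tries to bind a second listener to exactly the same address and port. The helper name second_bind_result is my own invention for illustration.

```ruby
require "socket"

# Hypothetical helper: tries to bind another listener to host:port and
# reports whether the OS allowed it.
def second_bind_result(host, port)
  TCPServer.new(host, port).close
  :bound
rescue Errno::EADDRINUSE
  :address_in_use
end

first = TCPServer.new("127.0.0.1", 0)  # port 0 asks the OS for any free port
port = first.addr[1]                   # addr is [family, port, hostname, ip]
puts "second bind to 127.0.0.1:#{port} -> #{second_bind_result("127.0.0.1", port)}"
first.close
```

Because both listeners ask for the identical address and port, the second bind should be refused, which matches the error above.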

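Overall, this shows one way in which multiple processes can end up bound to the same TCP port without the OS raising an error: one binds to the wildcard address and the other to a specific address, and the more specific one shadows the other. I saw this on macOS; as far as I know, other systems (Linux, for one) are stricter and may reject the wildcard bind instead. Another way would be if the sockets set an option such as SO_REUSEPORT before binding to the port.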
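
To illustrate the SO_REUSEPORT case, here is a rough sketch using Ruby's lower-level Socket class. It assumes a platform where Socket::SO_REUSEPORT is defined (recent Linux and macOS should qualify) and that both sockets belong to the same user. The helper name reuseport_listener is hypothetical.

```ruby
require "socket"

# Hypothetical helper: opens a TCP listener on host:port with SO_REUSEPORT
# set before bind, so other sockets that set the option can share the port.
def reuseport_listener(host, port)
  sock = Socket.new(:INET, :STREAM)
  sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEPORT, true)
  sock.bind(Addrinfo.tcp(host, port))
  sock.listen(5)
  sock
end

a = reuseport_listener("127.0.0.1", 0)     # port 0: let the OS choose
port = a.local_address.ip_port
b = reuseport_listener("127.0.0.1", port)  # same address and port as a
puts "two listeners share port #{port}"
[a, b].each(&:close)
```

Unlike the wildcard case, here both sockets opt in explicitly, and the OS decides which listener receives each incoming connection.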

Lesson:

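Be careful about the possibility of two processes listening on the same port. Before assuming a port is free, check what is bound to it with $ sudo lsof -i -P -n (use -i :5432 to narrow the output to a single port) and compare the bound addresses, e.g. *:5432 vs 127.0.0.1:5432.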
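
As a rough complement to lsof, the check can also be scripted. The sketch below only tells you whether something accepts TCP connections on a given address and port, not which process it is, so lsof is still needed for that. The helper name accepting? is an assumption of mine, and port 5432 is just the port from this story.

```ruby
require "socket"

# Hypothetical helper: true if something accepts TCP connections on host:port.
def accepting?(host, port)
  Socket.tcp(host, port, connect_timeout: 1) { true }
rescue SystemCallError, SocketError
  false  # refused, unreachable, timed out, or unresolvable
end

# Probe both loopback addresses, since a listener may be bound to only one.
%w[127.0.0.1 ::1].each do |host|
  puts "#{host} port 5432 accepting connections? #{accepting?(host, 5432)}"
end
```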