Connection refused
Sun, Mar 26, 2017
TCP connections fail to establish for a number of reasons: congestion in the network, incorrect destination (ip:port), incompatible firewall rules, busy server, etc. Two common failures you’ll see are Connection refused and Connection timed out. Let’s look at Connection refused in greater detail.
In a nutshell:
Connection refused means “I heard back from the host, and the host isn’t allowing me to connect”.
Connection timed out means “I didn’t hear back from the host.”
The important takeaway here is that connection refused tells you several things:
- the target host exists and is reachable over the network
- the target is not dropping your request packets
- the target is responding with a TCP packet, most likely with the RST flag set
- the target either isn’t listening on the specified port, or is refusing to take new connections on that port.
By contrast, connection timed out means you didn’t hear any response back from the host before some arbitrary timer expired. You might explicitly set a timeout, or you could have one built into whatever tool / request library you’re using.
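The difference between the two failures can be seen from client code as well. The following is a minimal Python sketch, not taken from the original post: the helper names `classify_connect` and `unused_local_port` are hypothetical, and the 2 second timeout is an arbitrary choice for illustration.

```python
import socket


def classify_connect(host: str, port: int, timeout: float = 2.0) -> str:
    """Try a TCP connect and report how the attempt ended (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        # The host answered, and the answer was "no": nothing accepts on this port.
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer arrived before a timer (ours or the kernel's) expired.
        return "timed out"


def unused_local_port() -> int:
    """Ask the kernel for a free port, then release it so nothing listens there."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind(("127.0.0.1", 0))
        return probe.getsockname()[1]


if __name__ == "__main__":
    print(classify_connect("127.0.0.1", unused_local_port()))
```

On a typical Linux machine the refused case comes back almost instantly, while the timed-out case only returns once the timer runs out.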
Connection Refused: I’m explicitly not taking connections
Let’s look at a successful HTTP request to a port where nginx is listening for new connections:
❯ curl -I 127.0.0.1:80
HTTP/1.1 200 OK
...
and an unsuccessful request to a port where nothing is listening:
❯ curl -I 127.0.0.1:81
curl: (7) Failed to connect to 127.0.0.1 port 81: Connection refused
The above (failed) request to port 81 returns immediately, even though it didn’t successfully establish a connection. To see why, we can look at what’s happening at the TCP layer in the successful and unsuccessful cases.
Let’s capture one half of the TCP conversation during the successful request to port 80 using tcpdump. We’ll snoop traffic on the loopback interface (lo) since we’re issuing the request locally via 127.0.0.1 (i.e., the request never actually hits a real network interface). curl will issue the request using some high-numbered ephemeral port given to it by the kernel, but the request destination will be 127.0.0.1:80. Similarly, responses will originate from 127.0.0.1:80, so we can monitor the traffic that is issued from source port 80 to see the response half of the conversation:
❯ sudo tcpdump -i lo -n src port 80
# meanwhile, issue `curl -I 127.0.0.1:80`
...
20:56:37.780057 IP 127.0.0.1.80 > 127.0.0.1.45522: Flags [S.], seq 657845710, ack 1014454379, win 43690, options [mss 65495,sackOK,TS val 278491 ecr 278491,nop,wscale 6], length 0
20:56:37.781185 IP 127.0.0.1.80 > 127.0.0.1.45522: Flags [.], ack 75, win 683, options [nop,nop,TS val 278491 ecr 278491], length 0
20:56:37.782287 IP 127.0.0.1.80 > 127.0.0.1.45522: Flags [P.], seq 1:240, ack 75, win 683, options [nop,nop,TS val 278492 ecr 278491], length 239
20:56:37.797218 IP 127.0.0.1.80 > 127.0.0.1.45522: Flags [F.], seq 240, ack 76, win 683, options [nop,nop,TS val 278495 ecr 278493], length 0
Look at the TCP flags set on the response packets:
- [S.]: SYN + ACK - this is part of the three-way handshake to establish the TCP connection
- [.]: ACK - this ACK acknowledges the request from curl
- [P.]: PUSH + ACK - this packet contains the HTTP response from nginx; the PUSH flag asks the receiving TCP stack to deliver the data to the application without further buffering
- [F.]: FIN + ACK - cleanly terminate the connection
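If the flag abbreviations are hard to keep straight, a small lookup table helps. This is an illustrative Python sketch only: `FLAG_NAMES` and `decode_flags` are hypothetical names, and the mapping follows the letters listed in the tcpdump man page.

```python
from typing import List

# Letter-to-name mapping as documented in the tcpdump man page.
FLAG_NAMES = {
    "S": "SYN",
    "F": "FIN",
    "P": "PUSH",
    "R": "RST",
    "U": "URG",
    "W": "CWR",
    "E": "ECE",
    ".": "ACK",
}


def decode_flags(field: str) -> List[str]:
    """Expand a tcpdump flags field such as "[S.]" into flag names (hypothetical helper)."""
    letters = field.strip("[]")
    if letters == "none":  # tcpdump prints "none" when no flags are set
        return []
    return [FLAG_NAMES[ch] for ch in letters]


if __name__ == "__main__":
    for field in ("[S.]", "[.]", "[P.]", "[F.]"):
        print(field, decode_flags(field))
```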
Now contrast this with the result of curling a port (81) where no process is listening:
❯ sudo tcpdump -i lo -n src port 81
# meanwhile, issue `curl -I 127.0.0.1:81`
21:10:24.508842 IP 127.0.0.1.81 > 127.0.0.1.44138: Flags [R.], seq 0, ack 404185781, win 0, length 0
In this case, we see a single response packet with a 0-length data payload and the [R.] flags (RESET + ACK). The reset flag notifies the connecting client that the connection will be immediately destroyed. In this case, since no TCP handshake was ever completed, it tells the client that the server will not be establishing the requested connection.
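On the client side, that reset surfaces as the ECONNREFUSED error from connect(2), which is where curl's "Connection refused" message comes from. A rough Python sketch of this, assuming a Linux-like host where an unbound loopback port answers with a reset (the function name `connect_errno_on_closed_port` is made up for illustration):

```python
import errno
import os
import socket


def connect_errno_on_closed_port(timeout: float = 2.0) -> int:
    """Connect to a local port with no listener and return the raw errno (hypothetical helper)."""
    # Borrow a free port number from the kernel, then release it again.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind(("127.0.0.1", 0))
        port = probe.getsockname()[1]

    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
        client.settimeout(timeout)
        # connect_ex() returns an errno value instead of raising an exception.
        return client.connect_ex(("127.0.0.1", port))


if __name__ == "__main__":
    err = connect_errno_on_closed_port()
    print(errno.errorcode.get(err, err), os.strerror(err))
```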
The Response is from the Kernel
Consider the fact that no process is actually bound to port 81. This means that the operating system alone must be issuing the reset packet; in fact, connection management in Linux is completely handled by the kernel’s TCP stack in either case (successful handshake or otherwise) before the connection is handed off to the application at all.
We can observe this with a bit of strace-ing in the positive case. I’ll attach strace to nginx and its worker processes, filtering for the accept4 system call that nginx uses to accept inbound connections. I’ll also use -tt to print wall-clock timestamps with microsecond precision. At the same time, I’ll capture a tcpdump so we can compare the order of operations:
# set up tcpdump
❯ sudo tcpdump -i lo -n src port 80
# in another terminal, set up strace
❯ pgrep nginx
345
346
347
348
351
❯ sudo strace -f -p 345,346,347,348,351 -e trace=accept4 -tt
# in a third terminal, issue the curl
❯ curl -I 127.0.0.1:80
Interleaving the strace and tcpdump results in chronological order, we get:
21:25:57.939622 IP 127.0.0.1.80 > 127.0.0.1.45542: Flags [S.], seq 3721359522, ack 2397758014, win 43690, options [mss 65495,sackOK,TS val 630523 ecr 630523,nop,wscale 6], length 0
[pid 346] 21:25:57.940386 accept4(6, {sa_family=AF_INET, sin_port=htons(45542), sin_addr=inet_addr("127.0.0.1")}, [16], SOCK_NONBLOCK) = 12
21:25:57.943629 IP 127.0.0.1.80 > 127.0.0.1.45542: Flags [.], ack 75, win 683, options [nop,nop,TS val 630524 ecr 630524], length 0
21:25:57.945494 IP 127.0.0.1.80 > 127.0.0.1.45542: Flags [P.], seq 1:240, ack 75, win 683, options [nop,nop,TS val 630524 ecr 630524], length 239
21:25:57.957252 IP 127.0.0.1.80 > 127.0.0.1.45542: Flags [F.], seq 240, ack 76, win 683, options [nop,nop,TS val 630527 ecr 630525], length 0
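The strace output shows that nginx doesn’t actually call accept4(2) until after the TCP stack has already replied with the SYN-ACK packet. The kernel carries out the handshake on its own and only then hands the established connection to nginx. (The client’s final ACK of the handshake isn’t visible here because we only captured traffic from source port 80.)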
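The same behaviour can be reproduced without nginx. In the Python sketch below (an illustration under stated assumptions, not the original experiment; `connect_without_accept` is a hypothetical name), a server socket calls listen() but never accept(), and a client connect still succeeds because the kernel completes the handshake and queues the connection.

```python
import socket


def connect_without_accept(timeout: float = 2.0) -> bool:
    """Return True if a client can connect to a listener that never calls accept()."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        server.listen(1)               # from here on the kernel answers SYNs for us
        port = server.getsockname()[1]

        # We deliberately never call server.accept().
        with socket.create_connection(("127.0.0.1", port), timeout=timeout) as client:
            # getpeername() only succeeds on a connected socket.
            return client.getpeername() == ("127.0.0.1", port)


if __name__ == "__main__":
    print(connect_without_accept())
```

The connection simply waits in the listener's accept queue until the application gets around to picking it up.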