Saturday, April 1, 2017

TIL: Amazon NAT Gateway drops your connections after about 5 minutes idle

Yes, after many hours of head scratching, I found out that if you're inside an Amazon VPC and use a NAT gateway or a NAT instance as your route to the internet, connections that stay silent for more than about 5 minutes get dropped. Or, even "better", they are just silenced: nothing tells you the connection is gone, you only get silence.

Long-lived streaming connections that go 'mute' for a while, for example during maintenance, won't recover automatically.


WOW: Is it possible that in 2024 I just hit this again?


Here is where I found the info about what I hit back in 2017:
https://repost.aws/knowledge-center/vpc-troubleshoot-nat-gateway-connection
https://docs.aws.amazon.com/vpc/latest/userguide/nat-gateway-troubleshooting.html

The relevant text:

IdleTimeoutCount error to release capacity

If a connection that uses a NAT gateway is idle for 350 seconds or more, then the connection times out. You also will see a spike on the IdleTimeoutCount metric. When a connection times out, a NAT gateway returns an RST packet to any resources behind the NAT gateway that attempt to continue the connection. The NAT gateway doesn't send a FIN packet.

To resolve or work around the IdleTimeoutCount error, complete the following tasks:

  • Use the IdleTimeoutCount metric in Amazon CloudWatch to monitor for increases in idle connections. Make sure that you configure CloudWatch Contributor Insights to get visibility on the top contributors of clients with processes in the Idle state.
  • Close idle connections from clients to release capacity.
  • Initiate more traffic over the connection.
  • Turn on TCP keepalive on the instance with a value that's less than 350 seconds.
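
For the last bullet, here is a minimal sketch of what turning on TCP keepalive can look like from application code, assuming Python and a plain TCP socket. The helper name enable_keepalive and the default values (300 s idle, 30 s between probes, 3 probes) are my own picks for illustration, not something from the AWS docs; the only hard requirement is that the idle value stays under 350 seconds. The per-socket option names differ by OS, so the sketch checks which ones the socket module exposes.

```python
import socket

NAT_IDLE_TIMEOUT_S = 350  # idle timeout quoted above from the AWS docs


def enable_keepalive(sock, idle_s=300, interval_s=30, probes=3):
    """Enable TCP keepalive so probes start after idle_s seconds of silence.

    enable_keepalive and its defaults are illustrative assumptions.
    """
    if idle_s >= NAT_IDLE_TIMEOUT_S:
        raise ValueError("idle_s must be below the 350 s NAT gateway idle timeout")

    # Switch keepalive on for this socket (portable across platforms).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Idle time before the first probe: the option name depends on the OS.
    if hasattr(socket, "TCP_KEEPIDLE"):  # e.g. Linux
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    elif hasattr(socket, "TCP_KEEPALIVE"):  # e.g. macOS
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPALIVE, idle_s)

    # Seconds between probes, and how many unanswered probes mean "dead".
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock
```

You would call it right after creating or connecting the socket, e.g. enable_keepalive(socket.create_connection((host, port))), where host and port are whatever your streaming endpoint is. If the platform exposes none of the per-socket options, only SO_KEEPALIVE is set and the OS-wide defaults apply (on Linux the default idle time is typically 2 hours, far too long here), so check those too.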
