Monday, October 5, 2020

Weird AWS Problem, driving me mad

Okay, so I have about lost my shit trying to figure this out, but here goes.

This isn't anything complicated. I have a VPC with SUBNET A and SUBNET B. The VPC has 5 VPN connections to 5 different firewalls around the world.

Today an outage hit only the devices in SUBNET B, across all 5 VPNs. Everything in SUBNET A worked fine, no issues at all. Then, after 2.5 hours, all of a sudden every instance in SUBNET B worked as well.

Naturally I started a ticket with AWS, and they asked to see flow logs. I checked the logs and, sure enough, I have entries for the pings from my devices through the VPN to the instance in SUBNET B. I even see the entry for the return traffic, but it never reaches my firewall.
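For anyone digging through flow logs the same way, here is a rough Python sketch of how the ICMP entries between two hosts could be picked out. It assumes the default (version 2) flow log record format, and every address, account ID, and interface ID in the sample is made up for illustration, not taken from my actual logs. The helper names `parse_record` and `icmp_between` are my own. Keep in mind that an ACCEPT entry only says the security groups and ACLs allowed the packet; it doesn't prove the packet made it across the tunnel.

```python
# Field order of the default (version 2) VPC flow log record.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()


def parse_record(line):
    """Split one flow log line into a dict, or return None if it is malformed."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def icmp_between(lines, host_a, host_b):
    """Return ICMP records (protocol 1) exchanged between the two hosts, either direction."""
    hits = []
    for line in lines:
        rec = parse_record(line)
        if rec is None or rec["protocol"] != "1":
            continue
        if {rec["srcaddr"], rec["dstaddr"]} == {host_a, host_b}:
            hits.append(rec)
    return hits


# Hypothetical sample records: office host 192.168.10.20, SUBNET B instance 10.0.2.15.
SAMPLE = [
    "2 123456789012 eni-0abc1234 192.168.10.20 10.0.2.15 0 0 1 4 336 1601900000 1601900060 ACCEPT OK",
    "2 123456789012 eni-0abc1234 10.0.2.15 192.168.10.20 0 0 1 4 336 1601900000 1601900060 ACCEPT OK",
    "2 123456789012 eni-0abc1234 10.0.1.10 10.0.2.15 49152 22 6 20 4249 1601900060 1601900120 ACCEPT OK",
    "2 123456789012 eni-0abc1234 - - - - - - - 1601900120 1601900180 - NODATA",
]

if __name__ == "__main__":
    for rec in icmp_between(SAMPLE, "10.0.2.15", "192.168.10.20"):
        print(rec["srcaddr"], "->", rec["dstaddr"], rec["action"])
```

Seeing both the request and the reply as ACCEPT, like in the sample, matches what I saw: the instance answers, and the reply gets lost somewhere after it leaves the interface.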

I get a brilliant idea to SSH from a host in SUBNET A into SUBNET B. I'M IN! I do a ping from INSTANCE B to my stuff in my office and I get "destination net unreachable" from an AWS IP of 169.254.255.41. Can you believe this?
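A note on that address: 169.254.0.0/16 is the link-local range, and as far as I know it is also the range AWS uses for the inside addresses of Site-to-Site VPN tunnels. So the "unreachable" reply probably came from the AWS end of a tunnel rather than from anything in my subnets, though that is my reading and not something AWS confirmed. A small Python sketch of the check, using only the standard library (the `classify` helper and its labels are my own):

```python
import ipaddress

# Link-local range; assumed here to be where AWS VPN tunnel inside addresses live.
TUNNEL_INSIDE_RANGE = ipaddress.ip_network("169.254.0.0/16")


def classify(ip_text):
    """Give a rough label for where an address comes from."""
    ip = ipaddress.ip_address(ip_text)
    if ip in TUNNEL_INSIDE_RANGE:
        return "link-local (possible VPN tunnel inside address)"
    if ip.is_private:
        return "private"
    return "public"


if __name__ == "__main__":
    print(classify("169.254.255.41"))
```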

Everything is set up the same across us-east-1a and us-east-1b: the subnets have the same route tables and ACLs, and the instances have the same security groups.
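Rather than eyeballing the console, the "same route tables" claim could be checked by diffing two lists of routes, either SUBNET A's table against SUBNET B's or one table captured during and after the outage. The sketch below assumes route dicts shaped like the `Routes` entries returned by the EC2 DescribeRouteTables API (keys such as `DestinationCidrBlock`, `GatewayId`, `State`, `Origin`). The helper names and the sample data, including the gateway ID, are made up.

```python
# Keys that may hold a route's target, depending on the route type.
TARGET_KEYS = ("GatewayId", "NatGatewayId", "TransitGatewayId", "NetworkInterfaceId")


def route_key(route):
    """Reduce a route dict to a comparable (destination, target, state, origin) tuple."""
    target = next((route[k] for k in TARGET_KEYS if k in route), None)
    return (
        route.get("DestinationCidrBlock"),
        target,
        route.get("State"),
        route.get("Origin"),
    )


def diff_routes(routes_a, routes_b):
    """Return (routes only in A, routes only in B) as sets of tuples."""
    keys_a = {route_key(r) for r in routes_a}
    keys_b = {route_key(r) for r in routes_b}
    return keys_a - keys_b, keys_b - keys_a


# Hypothetical data: table B is missing the propagated route back to the office.
LOCAL = {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local",
         "State": "active", "Origin": "CreateRouteTable"}
OFFICE = {"DestinationCidrBlock": "192.168.10.0/24", "GatewayId": "vgw-0example",
          "State": "active", "Origin": "EnableVgwRoutePropagation"}
TABLE_A = [LOCAL, OFFICE]
TABLE_B = [LOCAL]

if __name__ == "__main__":
    only_a, only_b = diff_routes(TABLE_A, TABLE_B)
    print("only in A:", only_a)
    print("only in B:", only_b)
```

A route that shows up on one side only, or flips from `active` to `blackhole`, would be the first thing I'd show AWS support.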

After 2.5 hours, BAM, everything is working, and of course I'm all dandy with AWS. But now the issue is occurring again. I'm on the phone with AWS, but I'm not sure even they can figure it out.


