Monday, August 30, 2021

UPDATE: Can we talk about a strange problem on my network?

UPDATE: It turns out the host site had some "undesirable" traffic out of Mexico last week. So, they hardened their traffic laws. We got caught up in it because, apparently, our circuit runs through Mexican pipes. We had them adjust their posture, as it relates to us, specifically, and all is well, now.

I want to *really* thank the community for all of the help and effort that went into your responses.

It was partially your efforts that allowed me to present the problem to the destination as "their" problem, rather than "my problem."

So, I want to preface this issue with the understanding that I'm a reasonably capable sysadmin with a fair bit of experience in troubleshooting networking issues BUT, I'm not a "network guy."

Now, I've been thrown into a new environment with ZERO legacy knowledge and ZERO knowledge transfer from the last guy who left. Which is fine. Whatever. It just means there are a LOT of controls, routes and hardware that I'm unsure of. Unsure of exactly what all is in place, what all we're currently relying on and what all has been mothballed but not decommissioned and is still there, quietly screwing stuff up.

I have a good approximation, but nothing definitive.

ALL that said.....

So, yesterday, a specific website and all of its subdomains started timing out on us. No changes to the traffic rules or routes that anyone knows of (and, honestly, *I'd* have been the one making changes, ostensibly). I don't hear about this issue until this morning, around 1000 hours. So, we're 24 hours behind the trail and now it's a real problem because it's been broken for a whole day and nobody has fixed it.

I test the site, and it's not coming up. I test it from my phone and, boom, it's alive. I fire up a laptop and connect to the enterprise WIFI, no joy. Connect to a hotspot, it works (so it's not the OS doing it). I connect to VPN, it works. I google, but the results for "a website doesn't work from my corporate network" are a bit .... voluminous.

What I have is a site that works, but not if I'm connected to my regular production network. However, it's fine from the VPN interface on the ASA, which, naturally, bypasses a lot of our corporate controls. (I do not know why)

Since it's *always* DNS, I start there. Our DNS is controlled through an Umbrella VA and it's running well. The VA dashboard assures me that not only is this domain intentionally whitelisted but, added to that, the traffic is being allowed, according to the logs.
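For anyone retracing this step, here is a minimal Python sketch of a DNS sanity check that can be run from both the broken network and a working one (hotspot, VPN) to compare answers. It only uses the standard library. The function name `resolve_host` and the `resolver` parameter are my own inventions for illustration, and `example.com` stands in for the real site, which I'm not naming.

```python
import socket


def resolve_host(hostname, port=443, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPs for hostname, or [] if lookup fails.

    `resolver` defaults to the system resolver; it is injectable only so the
    function can be exercised without touching the network.
    """
    try:
        infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # The name did not resolve at all: this *would* point at DNS.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Hypothetical target; substitute the site that is timing out.
    print(resolve_host("example.com"))
```

If the name resolves on both networks, DNS is probably not the thing that fell over. The answers don't have to be identical (CDNs hand out different addresses), so a difference on its own doesn't prove much.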

On to the firewall! That's not the culprit, either. We ran traps at the interface and found the traffic flowing with reckless abandon into the ether; we just aren't getting anything back.
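The "traffic goes out, nothing comes back" symptom can also be seen from a client with a plain TCP connect. This is a rough sketch, not what we actually ran: `probe_tcp` is a made-up helper name, port 443 and the 5 second timeout are assumptions, and `example.com` is a placeholder for the real site.

```python
import socket


def probe_tcp(host, port=443, timeout=5.0):
    """Attempt one TCP connection and classify the result as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        # No SYN-ACK and no reset within the timeout: packets are being
        # dropped silently somewhere along the path, or never answered.
        return "timeout"
    except ConnectionRefusedError:
        # Something actively answered with a reset.
        return "refused"
    except OSError as exc:
        # Anything else (no route, name resolution failure, ...).
        return f"error: {exc}"


if __name__ == "__main__":
    # Hypothetical target; run it once per network and compare the results.
    print(probe_tcp("example.com"))
```

A "timeout" on the production network alongside "connected" from the hotspot or VPN fits what the interface showed: our packets leave, and the replies never arrive. It doesn't say *who* is dropping them, only that it's happening somewhere past the client.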

Now I'm starting to feel stupid. I will happily and readily admit that networking isn't my strong suit. I'm "OK" at it. I have a deep understanding of protocols, ports, traffic and controls but, in practice, troubleshooting packets isn't what I'm good at.

I *can* say these things:

It doesn't *appear* to be DNS

It almost certainly isn't the firewall

It definitely isn't something on the desktop

I simply have no idea where to go from here.

I don't expect anyone to say "this is your problem, obviously" and offer me the magic ticket. I would greatly appreciate anyone chiming in with some ideas on where I might look for the "thing that fell over" so I can put it back where it was.


