Monday, June 1, 2020

Internet dropping out at exact same times every hour for months

Hi all,

I have a really annoying issue at the moment: we are getting internet dropouts at 32 and 12 minutes past the hour, every hour. It's also difficult to troubleshoot because it's at a remote location, so I can't be onsite to look at it. Here is what I have done so far:

  • Ran pings from the LAN to the WAN interface and to the 1st hop. The WAN interface isn't dropping anything, but the 1st hop does at those times. I'm pinging IPs, not hostnames, so it's not DNS related (I did provide the ISP with these timestamped ping logs to assist on their side)
  • Replaced the modem (brand new) and the router (spare). It improved things but didn't fix them. A colleague went down 5-6 weeks ago and swapped them out whilst he was there for a meeting
  • Checked whether the hardware was connected to a UPS that might be dropping power for any reason. It isn't; it's directly connected to mains
  • Power cycled everything (one of the 1st things we did)
  • Firmware updates on the router
  • Went through the router logs (WatchGuard) and couldn't find any errors or anything obvious happening at those times
  • Did a tcpdump for 3 hours and went over the packet capture. The only thing I can see is that during the dropouts our router is sending ARP requests to the 1st hop and the 1st hop stops responding. I added a static ARP entry thinking this might shed more light on the issue, but it actually made the dropouts longer, so I removed it and the dropouts are now back to 4-5 seconds instead of 30-60 seconds. When the connection is up I can see that ARP requests between our router and the 1st hop succeed, so I know ARP is working. Other than that, I'm not seeing anything that points to the issue
  • It's an NBN connection, so all the router has is a static IP. The modem is there just to convert the signal from the copper to digital for the router, so there's no PPPoE happening either; it's purely a static IP
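
For anyone wanting to check ping logs for the same kind of pattern, here is a minimal Python sketch that tallies failed pings by minute of the hour. The log line format (`YYYY-MM-DD HH:MM:SS OK|FAIL`), the sample data, and the function names are all assumptions made up for illustration; they are not the format of the logs sent to the ISP.

```python
from collections import Counter
from datetime import datetime


def parse_line(line):
    """Parse one line of the assumed format 'YYYY-MM-DD HH:MM:SS OK|FAIL'."""
    date_part, time_part, status = line.split()
    stamp = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S")
    return stamp, status == "OK"


def failures_by_minute(lines):
    """Count failed pings per minute-of-hour (0-59)."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        stamp, ok = parse_line(line)
        if not ok:
            counts[stamp.minute] += 1
    return counts


# Made-up sample data, only to show the shape of the input.
sample = [
    "2020-06-01 10:11:59 OK",
    "2020-06-01 10:12:00 FAIL",
    "2020-06-01 10:12:01 FAIL",
    "2020-06-01 10:12:05 OK",
    "2020-06-01 10:32:00 FAIL",
    "2020-06-01 11:12:02 FAIL",
]

for minute, count in sorted(failures_by_minute(sample).items()):
    print(f"minute :{minute:02d} -> {count} failed pings")
```

If the failures really are tied to fixed minutes, nearly all of the counts should pile up in one or two buckets rather than being spread across the hour.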

I've taken all this to the ISP and they are giving me the most useless information: "Service is stable and has been for the last 48 hours". I'm asking them to check whether they can see any errors on their end at 32 and 12 minutes past the hour, but they are just giving generic responses again.

I'm running out of ideas and things to check, so I'm turning to you all.

Appreciate any feedback you guys may have.

EDIT: The ISP has asked for "drop logs with Start and stop sessions to investigate if this drop issue is due to the Physical Line or Layer 3 drop issue". I called and asked them what they were referring to, as I've never heard that before, but the guy sounded like he was talking through a pillow, so I couldn't hear a word he was saying.
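
It's not clear what the ISP means by this. One possible reading is a list of outage windows, each with a start and a stop time. Under that assumption, here is a rough Python sketch that collapses timestamped ping results into such windows. The sample data and the names `outage_sessions` and `t` are hypothetical, and this may not be the format the ISP actually wants.

```python
from datetime import datetime


def outage_sessions(samples):
    """Collapse (timestamp, ok) ping samples into (start, stop) outage windows.

    start is the first failed ping of a run, stop is the first successful
    ping after it. A run still failing when the data ends gets stop=None.
    Samples are assumed to be sorted by timestamp.
    """
    sessions = []
    start = None
    for stamp, ok in samples:
        if not ok and start is None:
            start = stamp  # a new outage begins
        elif ok and start is not None:
            sessions.append((start, stamp))  # outage ended at this reply
            start = None
    if start is not None:
        sessions.append((start, None))
    return sessions


def t(text):
    """Helper: build a datetime from 'YYYY-MM-DD HH:MM:SS'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


# Made-up sample data, only to show the shape of the input.
samples = [
    (t("2020-06-01 10:11:59"), True),
    (t("2020-06-01 10:12:00"), False),
    (t("2020-06-01 10:12:03"), False),
    (t("2020-06-01 10:12:05"), True),
    (t("2020-06-01 10:32:01"), False),
    (t("2020-06-01 10:32:05"), True),
]

for start, stop in outage_sessions(samples):
    duration = (stop - start).total_seconds() if stop else None
    print(f"drop start={start} stop={stop} duration_s={duration}")
```

The stop time here is only as precise as the ping interval, so a one-second interval would be needed to tell a 4-5 second drop from a longer one.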


