Monday, August 5, 2019

Major Internet speed degradation over wavelength circuit.

Hello all,

First time posting on this subreddit (actually, posting on Reddit at all). I've been banging my head against a bandwidth issue we've been experiencing over a wavelength circuit here in the Seattle region. We're about 1.5 months into troubleshooting at this point, with our provider heavily involved, and they're starting to get stumped.

Here's our situation in brief: we have rack space at a colocation provider in the Lynnwood area (Location A). At this rack, we are delivered a 1Gb/s symmetrical IP transit circuit over 1310nm fiber going into our Juniper EX3400-48T with a Fiber Store optic (coded for Juniper). Local speedtests from this point, plugged into an RJ-45 port on the Juniper, show 750-940Mb/s down and almost always 940-1000Mb/s up against various servers (on-net with the ISP and off-net with other servers/providers peered with the ISP). Latency to Seattle servers is around 1-2ms.

Also plugged into one of the Juniper's SFP+ slots is a 1310nm 10Gb optic (also Fiber Store, coded for Juniper), which is one end of our 10Gb wavelength circuit. The wavelength circuit is basically dormant at this point and is dedicated to the IP transit, so theoretically we have 9Gb/s of available headroom. This 1310nm fiber heads to the Westin in Seattle (Location B, approx. 17 miles south of Location A), where it goes into the ISP's DWDM equipment. From there, their DWDM bundle comes back up (~45 miles) to their other DWDM equipment, which is about 13 miles from our HQ (Location C). This last leg of the wavelength circuit to our HQ is fed over 1550nm into another Juniper EX3400-48T.

Plugging into that Juniper EX3400-48T at HQ yields speedtests of approximately 200-400Mb/s down, though occasionally, depending on the server, they reach the full 940Mb/s of the IP transit feed. Typical latency here is 4-5ms to Seattle servers. However, gig speeds are uncommon at HQ. Known facts and a diagram for visual reference are below.
I've looked into the bandwidth-delay product, but I'm not convinced that's what's at play here, since we're seeing conflicting results (some speeds are full rate at both locations despite the latency difference). The ISP has been VERY helpful in troubleshooting this, but they're running out of ideas. Any ideas or pointers are GREATLY appreciated.
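For what it's worth, here's the back-of-the-envelope bandwidth-delay math I've been looking at (a minimal sketch in Python; the 5ms RTT is from our HQ-to-Seattle pings, but the window sizes are illustrative assumptions, not measured values):

```python
def bdp_bytes(rate_bps, rtt_s):
    """Bandwidth-delay product: bytes that must be in flight to fill the pipe."""
    return rate_bps * rtt_s / 8

def window_limited_bps(window_bytes, rtt_s):
    """Ceiling on single-stream TCP throughput for a given receive window."""
    return window_bytes * 8 / rtt_s

# 1 Gb/s circuit at the ~5 ms RTT we see from HQ to Seattle servers
print(bdp_bytes(1e9, 0.005))             # 625000.0 bytes (~610 KiB) to fill the pipe
print(window_limited_bps(65535, 0.005))  # ~105 Mb/s with an unscaled 64 KB window
print(window_limited_bps(262144, 0.005)) # ~419 Mb/s with a 256 KB effective window
```

If a single TCP stream from HQ were effectively capped around a 256 KB window, that would roughly line up with the 200-400Mb/s we see; a parallel-stream test (e.g. iperf3 with -P) would help distinguish a per-stream window limit from an actual path problem. It doesn't explain why some servers still hit full rate, though.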

  • MTU on wavelength circuit is set to 9000 (have tried 1518).
  • Pings from HQ to the Juniper in our rack at the colo are a steady 2ms.
  • We've completely swapped the Juniper switches at each end with Dells (just for testing), with the same results.
  • We have NOT swapped the 1550nm optic at HQ, but I doubt that's the issue; still going to order one to test.
  • No framing errors on the switches for the corresponding ports in play.
  • The switches are doing pure layer 2 at this point. Very basic config, no QoS or anything. Two VLANs are involved but we removed them as being a possibility when we tested using the Dell switches (no VLANs on the Dell switch test).
  • Installed IIS (web server) on a server in our rack and tested downloads at HQ and consistently got ~90MB/s (720Mb/s).
  • ISP has validated the wavelength for 10Gb with an RFC test with various framing sizes.

Diagram:

https://imgur.com/hQ3pckc


