Hi everyone, I've been working with a really small DC for a few months. This is not your high-end DC; it's basically used for hosting VPSes and managed servers for our clients. We don't have much east-west traffic: our customers use their VPSes for many things (VPNs, storing files, DVRs, etc.), so our traffic is almost 100% north-south and is limited by our total capacity towards the internet.
This DC is simple: a couple of core switches connect to every single access switch via trunk ports, and the core switches connect to the edge routers, which in turn connect to the internet through a couple of transit providers. So, as you can see, everything is managed with VLANs (think of the Cisco three-tier campus model). Before I continue, I have to add that this DC is located in Hong Kong, where bandwidth towards China using “direct physical paths” is REALLY expensive (yes, even though Hong Kong is part of China…). This means we have upstreams that are used mainly for traffic towards China, but due to the pricing we don’t have a lot of bandwidth available on them.
This was "ok" because our customers used our address space for everything: we just added a VPS or a server to a VLAN (whose subnet still had available IPs) and queued at the edge routers based on source IP (or rather, source subnet) and exit interface. But now management tells me they're getting requests from resellers of our service where:
A.- Customer wants us to originate their prefixes
B.- Customer wants to peer with us so they can originate their prefixes
Due to these requests, they now want a way "to throttle traffic to all subnets the customer broadcasts to us (or that we broadcast in their name), and then per each IP those subnets have, when traffic follows a specific upstream". In other words, if a customer originates:
10.10.10.0/24
10.11.0.0/24
192.168.1.0/24
They want to throttle the customer's total speed to N Mbps when traffic uses our Upstream A (and only Upstream A), and then they want each IP in all of these subnets to get its own specific limit.
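To make the requirement concrete, here is a minimal sketch of the two-level logic in Python: one aggregate token bucket per tenant covering all of their prefixes, plus one token bucket per source IP. This only illustrates the behaviour management is asking for; it is not a router configuration. The names `TokenBucket`, `TenantPolicer` and `burst_seconds` are made up for the example, it polices (drops) rather than queues, and it assumes the caller has already selected only the traffic leaving via Upstream A.

```python
import ipaddress


class TokenBucket:
    """Plain token bucket: rate in bits per second, burst in bits."""

    def __init__(self, rate_bps, burst_bits, now=0.0):
        self.rate_bps = rate_bps
        self.burst_bits = burst_bits
        self.tokens = burst_bits  # start with a full bucket
        self.last = now

    def refill(self, now):
        # Add tokens for the elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst_bits, self.tokens + elapsed * self.rate_bps)
        self.last = now


class TenantPolicer:
    """Hypothetical two-level policer: tenant aggregate plus per-IP limits."""

    def __init__(self, prefixes, tenant_rate_bps, per_ip_rate_bps, burst_seconds=0.1):
        self.networks = [ipaddress.ip_network(p) for p in prefixes]
        self.per_ip_rate_bps = per_ip_rate_bps
        self.burst_seconds = burst_seconds
        self.aggregate = TokenBucket(tenant_rate_bps, tenant_rate_bps * burst_seconds)
        self.per_ip = {}  # source IP string -> TokenBucket

    def owns(self, src_ip):
        addr = ipaddress.ip_address(src_ip)
        return any(addr in net for net in self.networks)

    def allow(self, src_ip, size_bytes, now):
        """Return True if the packet conforms to both limits (or is not ours)."""
        if not self.owns(src_ip):
            return True  # not this tenant's traffic, so not policed here
        bits = size_bytes * 8
        bucket = self.per_ip.get(src_ip)
        if bucket is None:
            burst = self.per_ip_rate_bps * self.burst_seconds
            bucket = TokenBucket(self.per_ip_rate_bps, burst, now)
            self.per_ip[src_ip] = bucket
        bucket.refill(now)
        self.aggregate.refill(now)
        # The packet must fit in BOTH the per-IP and the aggregate bucket.
        if bucket.tokens < bits or self.aggregate.tokens < bits:
            return False
        bucket.tokens -= bits
        self.aggregate.tokens -= bits
        return True


policer = TenantPolicer(
    ["10.10.10.0/24", "10.11.0.0/24", "192.168.1.0/24"],
    tenant_rate_bps=10_000_000,  # the "N Mbps" for the whole tenant
    per_ip_rate_bps=1_000_000,   # limit for each individual IP
)
print(policer.allow("10.10.10.5", 1500, now=0.0))
```

The point of the sketch is that the state grows with the number of active IPs rather than with hand-written per-tenant queues, which is the part I'd like the hardware (or a dedicated box) to handle for me.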
Why?
Well, because the resellers complain about the “network being slow” when one of their customers starts using all the available bandwidth. Management wants to be able to avoid this (up to a certain point) and to be able to quickly tell them “your connection is slow because your user with IP X.X.X.X is consuming all your quota”.
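The "tell them which IP is eating the quota" part could look something like the sketch below. It assumes we can export per-flow byte counts from the edge (the `(source IP, bytes)` tuple format and the `top_talkers` function are hypothetical, just for illustration) and simply sums usage per source IP inside the tenant's prefixes.

```python
import ipaddress
from collections import Counter


def top_talkers(flows, prefixes, n=3):
    """Sum bytes per source IP inside the tenant's prefixes; return the top n."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    usage = Counter()
    for src_ip, byte_count in flows:
        addr = ipaddress.ip_address(src_ip)
        if any(addr in net for net in networks):
            usage[src_ip] += byte_count
    return usage.most_common(n)


# Made-up flow records: (source IP, bytes sent via the upstream in question)
flows = [
    ("10.10.10.5", 900),
    ("10.11.0.7", 100),
    ("10.10.10.5", 500),
    ("8.8.8.8", 9999),  # outside the tenant's prefixes, ignored
]
print(top_talkers(flows, ["10.10.10.0/24", "10.11.0.0/24", "192.168.1.0/24"], n=1))
```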
Setting aside the other requirements, like BGP peering zones and reworking the network to be fully routed instead of VLAN-based, I don’t really have a clue how to approach this queuing/QoS in a way that’s scalable and easy to deploy. I don’t want to log into the edge router tied to upstream A and add queues for every single tenant like this; it’d take a lot of time and also a lot of the router’s hardware resources.
Any ideas? We are willing to purchase hardware for this.
My 2 cents: I think this is too much, but maybe there's a way...