Saturday, December 12, 2020

Help a simple programmer understand: How does the networking for CDN providers work (AS/BGP)?

So lately I've been thinking about how a CDN works, and how a company could build something similar themselves. I've written below how I think it works; any corrections or more info would be much appreciated!

Let's say we want 3 datacenters around the country, for both availability and performance, serving static content from static.acme.com.

My understanding is that the company would get its hands on an IPv4 block and register an AS number. We would then, via our DC operator/ISP, announce that block from our AS at each datacenter via BGP, basically saying "IPs in this range can be routed here". We would have some kind of beefy router at each DC receiving that traffic and load balancing it across the servers in the DC (at either L4 or L7).
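To make that mental model concrete, here is a toy Python sketch of the same block being announced from three datacenters. Everything in it is an assumption for illustration: the prefix is a documentation range, the AS numbers are from the documentation and private-use ranges, and `best_datacenter` is a hypothetical helper. It compares only AS path length, which is just one of several attributes real BGP routers look at.

```python
import ipaddress

# Hypothetical values: a documentation prefix and a private-use AS number.
PREFIX = ipaddress.ip_network("203.0.113.0/24")
OUR_ASN = 64512

# Every datacenter originates the same prefix. Each list is the AS path
# one made-up client network would see towards that datacenter.
announcements = {
    "DC1": [64496, OUR_ASN],
    "DC2": [64497, 64498, OUR_ASN],
    "DC3": [64499, 64500, 64501, OUR_ASN],
}

def best_datacenter(dst_ip, routes):
    """Return the DC with the shortest AS path, or None if dst_ip is outside our block."""
    if ipaddress.ip_address(dst_ip) not in PREFIX:
        return None
    # Simplification: real routers check local preference and other
    # attributes before falling back to AS path length.
    return min(routes, key=lambda dc: len(routes[dc]))

print(best_datacenter("203.0.113.10", announcements))
```

In this model the client never chooses a datacenter itself. The A record stays the same everywhere, and whichever announcement looks best from the client's network decides where the packets end up.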

On the DNS side of things, we would have an A record for static.acme.com pointing at one of the IPs in our block.

Is it correct so far?

Then, what happens if one of the DCs becomes unavailable? Can we withdraw that BGP announcement somehow, and how is that cached etc.? What happens if the load becomes very uneven? E.g. people with the cheapest route to DC1 use the service much more than users "close" to the other DCs, making it preferable to route some DC1 users to DC2/DC3, even though DC1 is "closest"?
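The two scenarios in that question can be played out in a toy Python model. This is only a sketch under loud assumptions: the AS numbers are made up, `pick` is a hypothetical helper, route choice is reduced to AS path length, and AS-path prepending is shown as one commonly used knob rather than as the way any particular CDN does it.

```python
OUR_ASN = 64512  # hypothetical private-use AS number

def pick(routes):
    """Return the DC with the shortest AS path, or None if no route is left."""
    if not routes:
        return None
    return min(routes, key=lambda dc: len(routes[dc]))

# AS paths as seen from one made-up client network.
routes = {
    "DC1": [64496, OUR_ASN],
    "DC2": [64497, 64498, OUR_ASN],
    "DC3": [64499, 64500, 64501, OUR_ASN],
}

before = pick(routes)

# Scenario 1: DC1 goes down and its announcement is withdrawn, so the
# client network is left with only the DC2 and DC3 routes.
after_withdraw = pick({dc: path for dc, path in routes.items() if dc != "DC1"})

# Scenario 2: DC1 is overloaded, so it repeats our own AS number in its
# announcement (prepending) to make its path look longer.
prepended = dict(routes, DC1=routes["DC1"] + [OUR_ASN, OUR_ASN])
after_prepend = pick(prepended)

print(before, after_withdraw, after_prepend)
```

In both scenarios the client keeps using the same IP address; only the route its network picks changes.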

I'm trying to understand the black magic of large-scale networking and would appreciate any pointers!


