Friday, June 19, 2020

Web scraping a website that uses edge servers

Hello, I am trying to scrape a website that uses Cloudflare edge servers. Because their edge servers are far from my area, I have resorted to using a VPS. I have optimized my web scraper code so that it adds very little to the POST request RTT. However, since the request is dynamic, it is not cached by the edge servers, which means the RTT includes the trip between the edge server and the origin. What server specifications should I be looking at to reduce the delay between the VPS and the edge server?
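To see how much of the total is the VPS-to-edge leg, one option is to time connection setup separately from the request itself. The following is a minimal sketch using only the Python standard library. The function name `time_request_ms`, the host, and the path are placeholders made up for illustration, not anything specific to the target site. Connection setup costs a few round trips to the edge only, while the request time on the already-open connection also includes the edge-to-origin trip and the origin's processing time.

```python
import http.client
import time


def time_request_ms(host, path="/", port=443, use_tls=True, body=b""):
    """Time connection setup and one POST on the open connection, in ms.

    Hypothetical helper for illustration; host and path are placeholders.
    """
    if use_tls:
        conn = http.client.HTTPSConnection(host, port, timeout=10)
    else:
        conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        t0 = time.perf_counter()
        conn.connect()  # TCP handshake, plus TLS handshake when use_tls is True
        t1 = time.perf_counter()
        conn.request("POST", path, body=body)
        response = conn.getresponse()
        response.read()  # wait for the full response body
        t2 = time.perf_counter()
        status = response.status
    finally:
        conn.close()
    return {
        "connect_ms": (t1 - t0) * 1000.0,  # VPS <-> edge only
        "request_ms": (t2 - t1) * 1000.0,  # includes edge <-> origin and origin work
        "status": status,
    }


# Example call (placeholder host, requires network access):
# print(time_request_ms("example.com", "/"))
```

If `request_ms` stays much larger than `connect_ms` across many runs, most of the delay is probably behind the edge, and moving the VPS closer to the edge would only help with the smaller part.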

I am trying to shave milliseconds off the RTT, so even the slightest difference matters. I have a few ideas, and I would appreciate it if anyone could confirm whether they would make it faster:

-Using a server with high bandwidth

-Choosing a server location as geographically close as possible to the edge server

-Having private peering with the edge server's data center

-Placing my VPS near the edge server that is closest to the origin server
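One way to compare these options in practice is to run the same probe from each candidate VPS and look at the numbers. Below is a rough sketch, assuming TCP connect time is an acceptable stand-in for one round trip to the edge. The names `tcp_connect_ms`, `summarize`, and `probe` are hypothetical, and the host is a placeholder.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port=443, timeout=5.0):
    """Return the time in milliseconds to complete one TCP handshake."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0


def summarize(samples_ms):
    """Summarize latency samples; the minimum approximates the network floor."""
    ordered = sorted(samples_ms)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }


def probe(host, port=443, count=10):
    """Take several connect-time samples against one host and summarize them."""
    return summarize([tcp_connect_ms(host, port) for _ in range(count)])


# Example call from each candidate VPS (placeholder host, requires network access):
# print(probe("example.com"))
```

The minimum is usually the most useful figure for comparing locations, since it is the least affected by transient queuing. The median and maximum give an idea of how much jitter each provider adds.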

Thanks for your time.


