We run a REST API server that handles 25 million calls per day. Our stack consists of HAProxy + Gunicorn + Flask, plus a MongoDB database used by the REST API. We monitor it with Netdata and view the statistics in Elasticsearch. The server has 64 GB of RAM, an AMD Ryzen 7 1700X Pro CPU and SSD storage. From time to time, Netdata alerted us about "Accept Queue Overflow" and "Listen Queue Overflow". Searching for these alarms on Google, we found that some values in sysctl.conf needed to be changed, so we increased the relevant values little by little. After the changes, the alarms stopped. Even so, when we look at our sysctl.conf, we have the feeling that the values we set are absurd. We would be glad if you could take a look at our sysctl.conf and comment on it. Thank you.
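To put the traffic figure in perspective, here is a rough sketch of the average request rate implied by 25 million calls per day. It only gives the daily mean; the real peak rate is not stated here and is likely higher.

```python
# Average request rate implied by the daily call volume quoted above.
CALLS_PER_DAY = 25_000_000
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(calls_per_day: int) -> float:
    """Return the mean number of requests per second over a full day."""
    return calls_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"{average_rate(CALLS_PER_DAY):.0f} requests/s on average")
```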
net.ipv4.tcp_max_syn_backlog = 1000000
net.core.somaxconn = 8192
net.core.netdev_max_backloag = 900000
net.netfilter.nf_conntrack_max = 1024288
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 20
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 30
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 20
net.core.wmem_default=8388608
net.core.rmem_default=8388608
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.ipv4.tcp_rmem=4096 8388608 16777216
net.ipv4.tcp_wmem=4096 8388608 16777216
net.ipv4.tcp_mem=4096 8388608 10388608
net.ipv4.route.flush=1
net.ipv4.ip_local_port_range = 10000 61000
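One value that is easy to misread: net.ipv4.tcp_mem is expressed in memory pages, not bytes (unlike tcp_rmem and tcp_wmem). The sketch below converts our three tcp_mem values to GiB, assuming the usual 4096-byte page size on x86-64; the helper name tcp_mem_to_gib is only illustrative.

```python
# Assumption: 4096-byte pages (typical on x86-64; verify with `getconf PAGESIZE`).
PAGE_SIZE = 4096


def tcp_mem_to_gib(pages: int, page_size: int = PAGE_SIZE) -> float:
    """Convert a tcp_mem threshold given in pages to GiB."""
    return pages * page_size / 2**30


if __name__ == "__main__":
    # The three fields of net.ipv4.tcp_mem are min, pressure and max.
    for name, pages in zip(("min", "pressure", "max"), (4096, 8388608, 10388608)):
        print(f"{name:>8}: {pages} pages = {tcp_mem_to_gib(pages):.2f} GiB")
```

Under that assumption the "pressure" threshold works out to 32 GiB, which is half of the 64 GB of RAM in the server.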
And our TXQUEUELEN value is 4000.
Output of netstat -s | grep -i list:
7273 SYNs to LISTEN sockets dropped
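As far as we understand, netstat -s takes this figure from the TcpExt counters in /proc/net/netstat, where "SYNs to LISTEN sockets dropped" corresponds to ListenDrops and the related overflow message to ListenOverflows. A minimal sketch for reading both counters directly, so they can be sampled over time, might look like this (the function names are our own):

```python
from pathlib import Path


def parse_netstat(text: str) -> dict:
    """Parse /proc/net/netstat style text: a header line of counter
    names followed by a line of values, repeated per protocol group."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    stats = {}
    for header, values in zip(lines[0::2], lines[1::2]):
        group, names = header.split(":", 1)
        _, numbers = values.split(":", 1)
        stats[group] = dict(zip(names.split(), map(int, numbers.split())))
    return stats


def listen_counters(text: str) -> dict:
    """Return the two listen-queue counters, defaulting to 0 if absent."""
    tcp_ext = parse_netstat(text).get("TcpExt", {})
    return {key: tcp_ext.get(key, 0) for key in ("ListenOverflows", "ListenDrops")}


if __name__ == "__main__":
    path = Path("/proc/net/netstat")  # Linux only
    if path.exists():
        print(listen_counters(path.read_text()))
```

Both counters are cumulative since boot, so the difference between two samples is more informative than the absolute value.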
We currently see no problems because we moved our REST API to WebSockets, but we are still curious and would like to know whether what we are doing is wrong. (We have around 1200-1500 concurrent connections over WebSockets.)
Edit: We have no CPU or RAM problems. CPU usage is around 10% and RAM consumption is around 50-55%.
haproxy.cfg parameters:
global
    maxconn 60000

defaults
    retries 3
    backlog 10000
    timeout client 35s
    timeout connect 5s
    timeout server 35s
    timeout tunnel 3600s
    timeout http-keep-alive 100s
    timeout http-request 15s
    timeout queue 30s
    timeout tarpit 60s
    default-server inter 3s rise 2 fall 3
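One interaction between the two configs is worth spelling out: on Linux the backlog passed to listen() is silently capped at net.core.somaxconn, so the backlog an application asks for is not necessarily what it gets. The sketch below applies that rule to our values. The Gunicorn figure is an assumption (its documented default of 2048), since we have not listed our Gunicorn settings here.

```python
SOMAXCONN = 8192  # net.core.somaxconn from our sysctl.conf


def effective_backlog(requested: int, somaxconn: int = SOMAXCONN) -> int:
    """Return the accept-queue limit the kernel actually applies:
    the requested listen() backlog, capped at net.core.somaxconn."""
    return min(requested, somaxconn)


if __name__ == "__main__":
    # HAProxy: 'backlog 10000' in the defaults section.
    print("haproxy :", effective_backlog(10000))
    # Assumption: Gunicorn left at its default --backlog of 2048.
    print("gunicorn:", effective_backlog(2048))
```

If that assumption holds, HAProxy's accept queue is limited by somaxconn rather than by its own backlog setting, while Gunicorn's is limited by its own default.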