Question

我正在尝试调整WS以支持约2万个并发用户。

无论我更改哪种配置，当我的测试遇到2（k）k个用户和各种502/504错误时，每个端点的平均响应时间仍为6秒。

WebService：

CloudFlare <-> Nginx <-> Gunicorn <-> Django / DRF <-> Memcache <---> Postgres

这是我尝试过的：

将枪械工人从4名增加到10名
将服务（pod）实例从3个增加到10个
将Gunicorn工作者的超时时间增加到120
将Nginx proxy_pass超时增加到120

大多数端点每100秒访问数据库一次，其他请求从内存缓存中获取数据。

有人可以指出我应该更改哪种配置吗？

我应该在哪里寻找延误/瓶颈？

独角兽公司的工人显然正在被淘汰，因为WS的观点没有逻辑，所以我不明白这一点。它应该只从memcache获取查询并返回。

Nginx日志：

latforms/android HTTP/1.1", upstream: "http://10.0.1.17:9090/endpoints/platforms/android", host: "myhost.co"
2018/08/13 23:43:25 [error] 8893#8893: *2809163 upstream timed out (110: Connection timed out) while connecting to upstream, client: 200.211.198.133, server: myhost.co, request: "GET /endpoints/store/products/729 HTTP/1.1", upstream: "http://10.0.1.18:9090/endpoints/store/products/729", host: "myhost.co"
200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/categories/?cat_pk=13081 HTTP/1.1" 200 1718 "-" "python-requests/2.18.4" 627 80.840 [production-service-api-80] 10.0.0.112:9090, 10.0.1.13:9090, 10.0.0.113:9090 0, 0, 11150 40.000, 40.000, 0.840 504, 504, 200
200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/categories/?cat_pk=13081 HTTP/1.1" 200 1718 "-" "python-requests/2.18.4" 689 80.857 [production-service-api-80] 10.0.0.112:9090, 10.0.1.12:9090, 10.0.0.113:9090 0, 0, 11150 40.000, 40.000, 0.857 504, 504, 200
200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/home/ HTTP/1.1" 200 10072 "-" "python-requests/2.18.4" 670 80.580 [production-service-api-80] 10.0.1.13:9090, 10.0.1.11:9090, 10.0.0.112:9090 0, 0, 66511 40.001, 40.002, 0.577 504, 504, 200
200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/products/691/ HTTP/1.1" 200 703 "-" "python-requests/2.18.4" 646 80.486 [production-service-api-80] 10.0.1.8:9090, 10.0.1.13:9090, 10.0.1.12:9090 0, 0, 1968 40.000, 40.000, 0.486 504, 504, 200
200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/products/5458 HTTP/1.1" 301 0 "-" "python-requests/2.18.4" 678 80.444 [production-service-api-80] 10.0.1.13:9090, 10.0.1.12:9090, 10.0.1.17:9090 0, 0, 0 40.000, 40.002, 0.442 504, 504, 301
....
90, 10.0.1.11:9090, 10.0.1.8:9090 0, 0, 1968 40.000, 40.000, 0.584 504, 504, 200
200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/products/5458/ HTTP/1.1" 200 241 "-" "python-requests/2.18.4" 647 80.709 [production-service-api-80] 10.0.0.113:9090, 10.0.1.8:9090, 10.0.0.112:9090 0, 0, 327 40.001, 40.000, 0.708 504, 504, 200
--
2018/08/13 23:43:25 [error] 8766#8766: *2809243 upstream timed out (110: Connection timed out) while connecting to upstream, client: 200.211.198.133, server: myhost.co, request: "GET /endpoints/store/categories/?cat_pk=13081 HTTP/1.1", upstream: "http://10.0.1.13:9090/endpoints/store/categories/?cat_pk=13081", host: "myhost.co"
200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/products/692 HTTP/1.1" 301 0 "-" "python-requests/2.18.4" 677 80.672 [production-service-api-80] 10.0.1.17:9090, 10.0.1.10:9090, 10.0.0.113:9090 0, 0, 0 40.001, 40.001, 0.670 504, 504, 301
200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/products/4608/ HTTP/1.1" 200 553 "-" "python-requests/2.18.4" 647 80.591 [production-service-api-80] 10.0.1.11:9090, 10.0.1.17:9090, 10.0.1.8:9090 0, 0, 1090 40.000, 40.003, 0.588 504, 504, 200

独角兽原木：

{"asctime": "2018-08-13 23:42:55,145", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:55 +0000] \"GET /endpoints/store/products/691/ HTTP/1.1\" 200 1968 \"-\" \"python-requests/2.18.4\""}
{"asctime": "2018-08-13 23:42:55,167", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:55 +0000] \"GET /endpoints/store/products/729 HTTP/1.1\" 301 - \"-\" \"python-requests/2.18.4\""}
[2018-08-13 23:42:55 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:36)
[2018-08-13 23:42:55 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:37)
[2018-08-13 23:42:55 +0000] [382] [INFO] Booting worker with pid: 382
[2018-08-13 23:42:55 +0000] [383] [INFO] Booting worker with pid: 383
{"asctime": "2018-08-13 23:42:55,403", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:55 +0000] \"GET /endpoints/store/products/691/ HTTP/1.1\" 200 1968 \"-\" \"python-requests/2.18.4\""}
....
{"asctime": "2018-08-13 23:42:55,184", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:55 +0000] \"GET /endpoints/store/categories/?cat_pk=13081 HTTP/1.1\" 200 11150 \"-\" \"python-requests/2.18.4\""}
{"asctime": "2018-08-13 23:42:55,262", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:55 +0000] \"GET /endpoints/platforms/android HTTP/1.1\" 200 48 \"-\" \"python-requests/2.18.4\""}
{"asctime": "2018-08-13 23:42:55,439", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:55 +0000] \"GET /endpoints/platforms/android HTTP/1.1\" 200 48 \"-\" \"python-requests/2.18.4\""}
--
[2018-08-13 23:42:56 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:31)
{"asctime": "2018-08-13 23:42:56,689", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:56 +0000] \"GET /endpoints/store/products/729/ HTTP/1.1\" 200 2163 \"-\" \"python-requests/2.18.4\""}
{"asctime": "2018-08-13 23:42:56,799", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:56 +0000] \"GET /endpoints/store/products/5458/ HTTP/1.1\" 200 327 \"-\" \"python-requests/2.18.4\""}

Answer 1

为什么不使用uwsgi？

为了更好地工作，请执行此操作

减少代码中的数据库匹配次数
增加工作人员的枪支数量
gunicorn和nginx的可用信息记录

如果这些配置不适合您，则必须更改设置配置或增加服务器资源。

Nginx + Gunicorn + Django高延迟

1 个答案: