paste.httpserver and slowdown with HTTP/1.1 Keep-alive; tested with httperf and ab

【Question title】: paste.httpserver and slowdown with HTTP/1.1 Keep-alive; tested with httperf and ab
【Posted】: 2009-11-23 08:22:12
【Question description】:

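I have a web server based on paste.httpserver that acts as an adapter between HTTP and WSGI. When I measure performance with httperf, I can do over 1,000 requests per second if I open a new connection for each request using --num-conn. If I instead reuse the connection using --num-call, I get about 11 requests per second, 1/100th of the rate.

If I try ab, I get a timeout.

My tests are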

% ./httperf --server localhost --port 8080 --num-conn 100
...
Request rate: 1320.4 req/s (0.8 ms/req)
...

and

% ./httperf --server localhost --port 8080 --num-call 100
...
Request rate: 11.2 req/s (89.4 ms/req)
...

Here is a simple server that reproduces the problem

from paste import httpserver

def echo_app(environ, start_response):
    n = 10000
    start_response("200 Ok", [("Content-Type", "text/plain"),
                              ("Content-Length", str(n))])
    return ["*" * n]

httpserver.serve(echo_app, protocol_version="HTTP/1.1")

That is a multi-threaded server, which is hard to profile. Here is a single-threaded variant:

from paste import httpserver

class MyHandler(httpserver.WSGIHandler):
    sys_version = None
    server_version = "MyServer/0.0"
    protocol_version = "HTTP/1.1"

    def log_request(self, *args, **kwargs):
        pass


def echo_app(environ, start_response):
    n = 10000
    start_response("200 Ok", [("Content-Type", "text/plain"),
                              ("Content-Length", str(n))])
    return ["*" * n]

# WSGIServerBase is single-threaded
server = httpserver.WSGIServerBase(echo_app, ("localhost", 8080), MyHandler)
server.handle_request()

Profiling it with

% python2.6 -m cProfile -o paste.prof paste_slowdown.py

and then hitting it with

% httperf --client=0/1 --server=localhost --port=8080 --uri=/ \
   --send-buffer=4096 --recv-buffer=16384 --num-conns=1 --num-calls=500

I get a profile like

>>> p=pstats.Stats("paste.prof")
>>> p.strip_dirs().sort_stats("cumulative").print_stats()
Sun Nov 22 21:31:57 2009    paste.prof

         109749 function calls in 46.570 CPU seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   46.571   46.571 {execfile}
        1    0.001    0.001   46.570   46.570 paste_slowdown.py:2(<module>)
        1    0.000    0.000   46.115   46.115 SocketServer.py:250(handle_request)
        1    0.000    0.000   44.675   44.675 SocketServer.py:268(_handle_request_noblock)
        1    0.000    0.000   44.675   44.675 SocketServer.py:301(process_request)
        1    0.000    0.000   44.675   44.675 SocketServer.py:318(finish_request)
        1    0.000    0.000   44.675   44.675 SocketServer.py:609(__init__)
        1    0.000    0.000   44.675   44.675 httpserver.py:456(handle)
        1    0.001    0.001   44.675   44.675 BaseHTTPServer.py:325(handle)
      501    0.006    0.000   44.674    0.089 httpserver.py:440(handle_one_request)
     2001    0.020    0.000   44.383    0.022 socket.py:373(readline)
      501   44.354    0.089   44.354    0.089 {method 'recv' of '_socket.socket' objects}
        1    1.440    1.440    1.440    1.440 {select.select}
         ....

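You can see that nearly all of the time is spent in recv: 501 calls account for 44.354 of the 46.570 seconds, about 89 ms per call.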
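The profile above is sorted by cumulative time, which lists every caller of recv before recv itself. Sorting by "tottime" instead puts the function that actually holds the time at the top. The following is a minimal sketch, assuming Python 3; `slow_wait` and `handle` are hypothetical stand-ins for the blocking call and the request loop, not code from the server in question.

```python
import cProfile
import io
import pstats
import time


def slow_wait():
    # Hypothetical stand-in for a blocking call such as recv.
    time.sleep(0.05)


def handle():
    # Hypothetical stand-in for a request loop that blocks three times.
    for _ in range(3):
        slow_wait()


profiler = cProfile.Profile()
profiler.enable()
handle()
profiler.disable()

# Write the report into a string so it can be inspected or logged.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
# "tottime" excludes time spent in callees, so the blocking call rises to the top.
stats.strip_dirs().sort_stats("tottime").print_stats(5)
report = buf.getvalue()
print(report)
```

With a saved profile file, the same sort key can be passed to `sort_stats` on `pstats.Stats("paste.prof")`.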

I decided to set httperf aside, write my own HTTP/1.1-with-keep-alive requests, and send them with netcat:

GET / HTTP/1.1
Location: localhost
Connection: Keep-Alive
Content-Length: 0

GET / HTTP/1.1
Location: localhost
Connection: Keep-Alive
Content-Length: 0

 ... repeat 97 more times, to have 99 keep-alives in total ...

GET / HTTP/1.1
Location: localhost
Connection: Close
Content-Length: 0

which I sent with

nc localhost 8080 < ~/src/send_to_paste.txt

Total time for the 100 requests was 0.03 seconds, so performance is very good.
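The netcat test can also be sketched in Python, which makes it easier to vary the number of requests. This is a minimal sketch, assuming Python 3; `build_requests` and `send_all_at_once` are hypothetical helper names, and the sketch sends a `Host` header (required by HTTP/1.1) where the hand-written file above used `Location`. Like netcat, it writes every request before reading any response, so it does not reproduce the send-then-wait pattern of a benchmark client such as httperf, which may be why it shows no slowdown.

```python
import socket
import time


def build_requests(n, host="localhost"):
    """Return n GET requests as bytes; all but the last ask to keep the connection open."""
    parts = []
    for i in range(n):
        connection = "Keep-Alive" if i < n - 1 else "Close"
        parts.append(
            "GET / HTTP/1.1\r\n"
            "Host: %s\r\n"
            "Connection: %s\r\n"
            "\r\n" % (host, connection)
        )
    return "".join(parts).encode("ascii")


def send_all_at_once(host, port, payload):
    """Send the whole payload, read until the server closes the connection,
    and return (elapsed_seconds, bytes_received)."""
    start = time.perf_counter()
    received = 0
    with socket.create_connection((host, port)) as sock:
        sock.sendall(payload)
        while True:
            chunk = sock.recv(65536)
            if not chunk:
                break
            received += len(chunk)
    return time.perf_counter() - start, received


payload = build_requests(100)
# Against a running server: elapsed, size = send_all_at_once("localhost", 8080, payload)
```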

This suggests that httperf is doing something wrong (but it is a widely used and respected piece of code), so I tried 'ab'

% ab -n 100 -k localhost:8080/
This is ApacheBench, Version 1.3d <$Revision: 1.73 $> apache-1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)...
Server timed out

: Operation now in progress

Instrumenting the server shows that it handles one request and then waits for a second one.

Any idea what is going on?

【Discussion】:

Tags: python paste keep-alive httpserver httperf

【Solution 1】:

After some effort, it seems to be either Nagle's algorithm or delayed ACK, or the interaction between them. It goes away if I do something like

server.socket.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
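The line above sets the option on the listening socket; whether accepted connections inherit it may depend on the platform, so setting it per connection is a more explicit alternative. The following is a minimal sketch, assuming Python 3's standard `socketserver` module rather than paste; `NoDelayHandler` and `check_nodelay` are hypothetical names. It relies on the `disable_nagle_algorithm` class attribute of `StreamRequestHandler`, which makes `setup()` enable TCP_NODELAY on each accepted connection.

```python
import socket
import socketserver


class NoDelayHandler(socketserver.StreamRequestHandler):
    # Standard-library switch: setup() sets TCP_NODELAY on each connection.
    disable_nagle_algorithm = True

    def handle(self):
        # Report whether TCP_NODELAY is set on this accepted connection.
        flag = self.connection.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
        self.wfile.write(b"nodelay=%d\n" % (1 if flag else 0))


def check_nodelay():
    """Start a one-shot server on a free port, connect once, return the reply line."""
    with socketserver.TCPServer(("127.0.0.1", 0), NoDelayHandler) as server:
        host, port = server.server_address
        with socket.create_connection((host, port)) as client:
            # The connection is already queued, so this handles it and returns.
            server.handle_request()
            return client.makefile("rb").readline()


result = check_nodelay()
print(result)
```

Disabling Nagle's algorithm trades a few more small packets for lower latency, which suits a request/response protocol where the server writes headers and body separately.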

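How did I track it down? First, I instrumented every 'recv' in socket.py so that I could figure out which recv was waiting. I saw that about 5 out of every 11 recv calls had a delay of almost 200 ms, and I could not figure out why there was any delay at all. I then used Wireshark to watch the messages and noticed that it was actually the send from the server to the client that had the delay. That meant the cause was something in the TCP layer affecting the outgoing messages from my client.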
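The instrumentation step described above can be approximated without editing socket.py. The following is a minimal sketch, assuming Python 3; `timed_recv` and the 0.1-second threshold are hypothetical choices for illustration, not what the author used. It records how long each recv call blocks so that slow calls stand out.

```python
import socket
import time


def timed_recv(sock, bufsize, log):
    """Call sock.recv and append the elapsed wall-clock time in seconds to log."""
    start = time.perf_counter()
    data = sock.recv(bufsize)
    log.append(time.perf_counter() - start)
    return data


# Demonstrate on a connected socket pair; a real test would wrap the server's reads.
a, b = socket.socketpair()
delays = []
a.sendall(b"ping")
data = timed_recv(b, 4096, delays)
a.close()
b.close()

# Flag calls that blocked longer than a (hypothetical) 0.1 s threshold.
slow = [d for d in delays if d > 0.1]
print("recv calls: %d, slow: %d" % (len(delays), len(slow)))
```

A cluster of delays near one fixed value, such as the roughly 200 ms seen here, hints at a timer in the network stack rather than at application work.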

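A friend suggested the obvious, so I searched for "200ms socket delay" and found descriptions of this problem.

The paste trac report is at http://trac.pythonpaste.org/pythonpaste/ticket/392, along with a patch that enables TCP_NODELAY when the handler uses HTTP/1.1.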

【Discussion】:

I learned this the hard way myself: github.com/williame/hellepoll/blob/…