Performance Tuning a Test Simulator: Pick the Right Library

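Another Simulator Component

The previous several posts all covered tuning a simulator component called log_generator. This time we look at a different component, client_simulator.

The logic of client_simulator is very simple. It simulates a large number of clients that all request encoded chunk data from a single HTTP server at a fixed rate. The fixed rate here means 10 requests per second, each one for a URL that keeps changing. The component exists to verify that the HTTP server's memory replacement strategy is effective.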
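To make the request pattern concrete, here is a minimal sketch of one simulated client's loop. It assumes the 10 requests per second apply to each client, and the URL layout, `chunk_url`, `run_client`, and the injected `fetch` callable are all hypothetical names for illustration, not the component's real code.

```python
import time

REQUESTS_PER_SECOND = 10  # rate stated in the article (assumed to be per client)


def chunk_url(base_url, client_id, seq):
    """Build a URL that changes on every request (hypothetical URL layout)."""
    return "%s/chunk/%d/%d" % (base_url, client_id, seq)


def run_client(fetch, base_url, client_id, total_requests,
               rate=REQUESTS_PER_SECOND, clock=time.time, sleep=time.sleep):
    """Call fetch(url) `total_requests` times, paced at `rate` per second."""
    interval = 1.0 / rate
    start = clock()
    for seq in range(total_requests):
        # Wait for this request's scheduled slot, measured from the start time,
        # so that a slow request does not push every later request back.
        delay = start + seq * interval - clock()
        if delay > 0:
            sleep(delay)
        fetch(chunk_url(base_url, client_id, seq))
```

Passing `fetch` in as a parameter keeps the pacing logic independent of whichever HTTP library does the actual request.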

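The Problem

The colleague on our team who owns this component wrote the Python code, ran it on a real machine, and the performance was painful to look at: it could simulate only about 100 clients. Keep in mind that this ran on a virtual machine with 8 cores and 4 GB of RAM, and it pushed only about 130M of bandwidth in total. That is far short of our target of 700 nodes.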

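Refactoring and Analysis

I looked at the component's code and, sure enough, it was the dreaded multithreading again. In CPython the global interpreter lock lets only one thread execute Python bytecode at a time, so threads alone cannot use all 8 cores. I changed it straight to a multi-process plus multi-thread model: the number of processes equals the number of CPUs, and each process starts (desired node count / process count) threads.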
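The process and thread layout described here can be sketched roughly as follows. The names `split_clients`, `run_process`, and `launch` are hypothetical, and `client_fn` stands in for whatever each simulated client actually does. The split hands out any remainder one client at a time, since 700 does not divide evenly by 8.

```python
import multiprocessing
import threading


def split_clients(total_clients, process_count):
    """Spread clients across processes as evenly as possible."""
    base, extra = divmod(total_clients, process_count)
    return [base + 1 if i < extra else base for i in range(process_count)]


def run_process(thread_count, client_fn):
    """Inner loop: start one thread per simulated client, then wait for all."""
    threads = [threading.Thread(target=client_fn, args=(i,))
               for i in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def launch(total_clients, client_fn, process_count=None):
    """Outer loop: one process per CPU, each running its share of threads."""
    process_count = process_count or multiprocessing.cpu_count()
    procs = [multiprocessing.Process(target=run_process, args=(n, client_fn))
             for n in split_clients(total_clients, process_count)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

With 700 clients and 8 processes, `split_clients` gives four processes 88 threads and four processes 87. `client_fn` should be a module-level function so it can be handed to a child process on every platform.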

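After the refactor I ran it again. It did improve somewhat: the CPUs were now close to saturated, but overall performance did not rise much. It could simulate roughly 300 clients.

What now? Read the code one more time? The code is very simple: just two nested loops, the outer one spawning processes and the inner one spawning threads, with each thread fetching data through the much-loved networking library requests. There is no other complex logic.

Since this is yet another IO-bound program, the system resources must all be going into network transfer. Do I have to refactor requests itself? I certainly don't have the skill for that.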

Data to the Rescue

As a last resort, I ran yappi on it once more and got the following result:

Clock type: CPU
Ordered by: totaltime, desc

name                                                                                                  ncall                 tsub        ttot        tavg
/usr/lib64/python2.7/threading.py:754 Thread.run                                                      10                    0.000704    14.011758   1.401176
/home/admin/comrade/comrade/faker.py:23 fake_sdk                                                      10                    0.124799    14.010940   1.401094
/usr/lib/python2.7/site-packages/requests/api.py:61 get                                               2070                  0.045381    13.767441   0.006651
/usr/lib/python2.7/site-packages/requests/api.py:16 request                                           2070                  0.041756    13.719202   0.006628
/usr/lib/python2.7/site-packages/requests/sessions.py:441 Session.request                             2070                  0.059906    11.679531   0.005642
/usr/lib/python2.7/site-packages/requests/sessions.py:589 Session.send                                2070                  0.089826    7.909129    0.003821
/usr/lib/python2.7/site-packages/requests/adapters.py:388 HTTPAdapter.send                            2070                  0.064437    6.610118    0.003193
/usr/lib/python2.7/site-packages/urllib3/connectionpool.py:447 HTTPConnectionPool.urlopen             2070                  0.085589    3.781414    0.001827
/usr/lib/python2.7/site-packages/requests/sessions.py:401 Session.prepare_request                     2070                  0.078741    3.018275    0.001458
/usr/lib/python2.7/site-packages/urllib3/connectionpool.py:322 HTTPConnectionPool._make_request       2070                  0.077418    2.430796    0.001174
/usr/lib/python2.7/site-packages/requests/models.py:299 PreparedRequest.prepare                       2070                  0.038006    1.528752    0.000739
/usr/lib/python2.7/site-packages/requests/adapters.py:290 HTTPAdapter.get_connection                  2070                  0.024181    1.526116    0.000737
/usr/lib/python2.7/site-packages/urllib3/poolmanager.py:266 PoolManager.connection_from_url           2070                  0.017157    1.346354    0.000650
/usr/lib/python2.7/site-packages/urllib3/poolmanager.py:206 PoolManager.connection_from_host          2070                  0.015797    1.236867    0.000598
/usr/lib/python2.7/site-packages/requests/structures.py:42 CaseInsensitiveDict.__init__               12420                 0.108715    1.226514    0.000099
/usr/lib/python2.7/site-packages/urllib3/poolmanager.py:229 PoolManager.connection_from_context       2070                  0.018828    1.213145    0.000586
/usr/lib64/python2.7/httplib.py:1053 HTTPConnection.getresponse                                       2070                  0.034232    1.161452    0.000561
/usr/lib/python2.7/site-packages/requests/sessions.py:340 Session.__init__                            2070                  0.060470    1.106475    0.000535
/usr/lib/python2.7/site-packages/requests/sessions.py:50 merge_setting                                14490                 0.103324    1.102030    0.000076
/usr/lib/python2.7/site-packages/urllib3/poolmanager.py:242 PoolManager.connection_from_pool_key      2070                  0.034751    1.082293    0.000523
/usr/lib64/python2.7/httplib.py:1015 HTTPConnection.request                                           2070                  0.010323    1.073503    0.000519
/usr/lib64/python2.7/httplib.py:1036 HTTPConnection._send_request                                     2070                  0.074046    1.063180    0.000514
/usr/lib64/python2.7/httplib.py:437 HTTPResponse.begin                                                2070                  0.056034    1.041176    0.000503
/usr/lib64/python2.7/_abcoll.py:526 update                                                            22770                 0.277791    1.022363    0.000045
/usr/lib/python2.7/site-packages/requests/adapters.py:253 HTTPAdapter.build_response                  2070                  0.050085    1.008311    0.000487
/usr/lib/python2.7/site-packages/requests/sessions.py:398 Session.__exit__                            2070                  0.007221    0.889536    0.000430
/usr/lib/python2.7/site-packages/requests/sessions.py:705 Session.close                               2070                  0.016573    0.882314    0.000426
/usr/lib/python2.7/site-packages/urllib3/poolmanager.py:170 PoolManager._new_pool                     2070                  0.038002    0.854147    0.000413
/usr/lib/python2.7/site-packages/requests/adapters.py:313 HTTPAdapter.close                           4140                  0.021252    0.841980    0.000203
/usr/lib/python2.7/site-packages/requests/models.py:810 Response.content                              2070                  0.023343    0.821442    0.000397
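For reference, a profile like the one above can be collected along these lines. This is only a sketch: `profile_threads` and `worker` are hypothetical names, and the article does not show how yappi was actually invoked. The clock type and sort order are chosen to match the header of the output.

```python
import threading


def profile_threads(worker, thread_count=10):
    """Run `worker` in several threads under yappi and print CPU-time stats."""
    # yappi is a third-party profiler (pip install yappi). It is imported here
    # so this module still loads on machines where it is not installed.
    import yappi

    yappi.set_clock_type("cpu")  # matches "Clock type: CPU" in the output
    yappi.start()                # threads started from here on are profiled

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    yappi.stop()
    stats = yappi.get_func_stats()
    stats.sort("totaltime", "desc")  # matches "Ordered by: totaltime, desc"
    stats.print_all()
```

Measuring CPU time rather than wall time matters here: time spent waiting on the network is left out, so what remains is the work the Python code itself does per request.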

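Just as expected, the requests library is the biggest consumer of CPU. But wait, why are there so many calls into the sessions.py module? Session.__init__ and Session.close each ran 2070 times, once per request, so every requests.get call builds and tears down a whole Session. All we need is a simple HTTP GET request.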
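A quick back-of-the-envelope check on the table makes the overhead visible. The numbers below are copied from the ttot column above; treating HTTPConnection.request plus HTTPConnection.getresponse as "the actual HTTP work" is a simplifying assumption of this sketch, not something the profiler reports.

```python
# ttot values (seconds of CPU time) copied from the yappi table above.
TTOT = {
    "requests.api.get": 13.767441,
    "HTTPConnection.request": 1.073503,
    "HTTPConnection.getresponse": 1.161452,
    "Session.__init__": 1.106475,
    "Session.close": 0.882314,
}
REQUESTS = 2070  # ncall of requests.api.get


def share(name):
    """Fraction of the CPU time under requests.get spent in `name`."""
    return TTOT[name] / TTOT["requests.api.get"]


cpu_per_request_ms = TTOT["requests.api.get"] / REQUESTS * 1000
httplib_share = share("HTTPConnection.request") + share("HTTPConnection.getresponse")
session_churn_share = share("Session.__init__") + share("Session.close")

print("CPU per request: %.2f ms" % cpu_per_request_ms)
print("httplib send + receive: %.1f%%" % (httplib_share * 100))
print("Session create + close: %.1f%%" % (session_churn_share * 100))
```

By this rough measure each request costs about 6.65 ms of CPU, of which only around 16% is spent in httplib sending the request and reading the response, while creating and closing a Session for every call takes around 14% on its own.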

Swapping Libraries Works Wonders

Ah. Were we using a sledgehammer to crack a nut? Quick, let's replace requests with urllib2.
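The swap itself can be as small as the sketch below. `fetch_chunk` and its timeout value are hypothetical, since the article does not show the replacement code. The fallback import is there only so the same snippet also runs on Python 3, where urllib2's urlopen lives in urllib.request.

```python
try:
    from urllib2 import urlopen         # Python 2, as used in this article
except ImportError:
    from urllib.request import urlopen  # Python 3 equivalent


def fetch_chunk(url, timeout=5.0):
    """Issue one plain HTTP GET and return the response body as bytes."""
    response = urlopen(url, timeout=timeout)
    try:
        return response.read()
    finally:
        response.close()
```

Compared with requests.get, this path skips the per-call Session and connection pool setup seen in the profile, which is all a simple GET of chunk data needs.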

After the swap I ran it again, and the performance was excellent: the bandwidth was saturated instantly.