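Let's raise the level of efficiency:
The most efficient approach (if not prohibited) is to remove the "\n" instances beforehand (using smart, efficient O/S tools) and only then process the rest of the file I/O. Note that Python internally, by definition, appends "\n" again once the file is used as the aFileINPUT iterator, as documented, regardless of which os.linesep == { "\n" | "\r\n" | "\r" | ... } was actually used for the line-separation step on the iterator's input stream.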
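The line-ending behaviour described above can be checked with a minimal sketch. It assumes Python 3 text-mode files with the default `newline=None` (universal newlines); the helper name `read_lines` is hypothetical and not part of the original answer.

```python
import os
import tempfile


def read_lines(raw_bytes):
    """Write raw_bytes to a temporary file and return the lines
    that a text-mode file iterator yields for it."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw_bytes)
        # newline=None (the default) turns "\r\n" and "\r" into "\n" on input
        with open(path, "r", newline=None) as a_file_input:
            return list(a_file_input)
    finally:
        os.remove(path)


# Three different on-disk separators, one uniform "\n" seen by the iterator
lines = read_lines(b"alpha\r\nbeta\rgamma\n")
stripped = [line.strip("\n") for line in lines]
```

Whatever separator was stored on disk, each iterated line ends in a single "\n", which is why every variant below strips exactly that character.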
Let's measure the level of efficiency by decoding the actual flow of operations:
On using map( lambda ):
############################################################# EFFICIENCY LIMITS :
#                                                           - pure-[SERIAL]
#                                                           - local-GIL-lock
#                                                           - local-CPU
#                                                           - local-RAM-I/O :
>>> import dis
>>> def a_map_lambda_loop( aFileINPUT ):
...     for line in map( lambda s: s.strip( "\n" ), aFileINPUT ):
...         do_something( line )
...
>>> dis.dis( a_map_lambda_loop )
  2           0 SETUP_LOOP              36 (to 39)
              3 LOAD_GLOBAL              0 (map)
              6 LOAD_CONST               1 (<code object <lambda> at 0x7ff8fee7b930, file "<stdin>", line 2>)
              9 MAKE_FUNCTION            0
             12 LOAD_FAST                0 (aFileINPUT)
             15 CALL_FUNCTION            2
             18 GET_ITER
        >>   19 FOR_ITER                16 (to 38)
             22 STORE_FAST               1 (line)
  3          25 LOAD_GLOBAL              1 (do_something)
             28 LOAD_FAST                1 (line)
             31 CALL_FUNCTION            1
             34 POP_TOP
             35 JUMP_ABSOLUTE           19
        >>   38 POP_BLOCK
        >>   39 LOAD_CONST               0 (None)
             42 RETURN_VALUE
On using the @chepner-promoted loop:
############################################################# EFFICIENCY LIMITS :
#                                                           - pure-[SERIAL]
#                                                           - local-GIL-lock
#                                                           - local-CPU
#                                                           - local-RAM-I/O :
>>> def a_loop_runner( aFileINPUT ):
...     for line in aFileINPUT:
...         line = line.strip( "\n" )
...         do_something( line )
...
>>> dis.dis( a_loop_runner )
  2           0 SETUP_LOOP              39 (to 42)
              3 LOAD_FAST                0 (aFileINPUT)
              6 GET_ITER
        >>    7 FOR_ITER                31 (to 41)
             10 STORE_FAST               1 (line)
  3          13 LOAD_FAST                1 (line)
             16 LOAD_ATTR                0 (strip)
             19 LOAD_CONST               1 ('\n')
             22 CALL_FUNCTION            1
             25 STORE_FAST               1 (line)
  4          28 LOAD_GLOBAL              1 (do_something)
             31 LOAD_FAST                1 (line)
             34 CALL_FUNCTION            1
             37 POP_TOP
             38 JUMP_ABSOLUTE            7
        >>   41 POP_BLOCK
        >>   42 LOAD_CONST               0 (None)
             45 RETURN_VALUE
On using methodcaller():
############################################################# EFFICIENCY LIMITS :
#                                                           - pure-[SERIAL]
#                                                           - local-GIL-lock
#                                                           - local-CPU
#                                                           - local-RAM-I/O :
>>> from operator import methodcaller
>>> def a_methodcaller_loop( aFileINPUT ):
...     for line in map( methodcaller( "strip", "\n" ), aFileINPUT ):
...         do_something( line )
...
>>> dis.dis( a_methodcaller_loop )
  2           0 SETUP_LOOP              42 (to 45)
              3 LOAD_GLOBAL              0 (map)
              6 LOAD_GLOBAL              1 (methodcaller)
              9 LOAD_CONST               1 ('strip')
             12 LOAD_CONST               2 ('\n')
             15 CALL_FUNCTION            2
             18 LOAD_FAST                0 (aFileINPUT)
             21 CALL_FUNCTION            2
             24 GET_ITER
        >>   25 FOR_ITER                16 (to 44)
             28 STORE_FAST               1 (line)
  3          31 LOAD_GLOBAL              2 (do_something)
             34 LOAD_FAST                1 (line)
             37 CALL_FUNCTION            1
             40 POP_TOP
             41 JUMP_ABSOLUTE           25
        >>   44 POP_BLOCK
        >>   45 LOAD_CONST               0 (None)
             48 RETURN_VALUE
On using an ALAP (as-late-as-possible) .strip() call, in case the .strip() cannot be deferred further into do_something() itself, where it might also get distributed for a higher processing efficiency - { pure-[SERIAL] | just-[CONCURRENT] }, { local | independent }-GIL-lock, { local | distributed }-CPU, { local | distributed }-RAM-I/O:
############################################################# EFFICIENCY LIMITS :
#                                                           - pure-[SERIAL]  |+ just-[CONCURRENT]
#                                                           - local-GIL-lock |+ independent-GIL-lock
#                                                           - local-CPU      |+ independent-CPUs
#                                                           - local-RAM-I/O  |+ independent-RAM-I/O
>>> def ALAP_runner( aFileINPUT ):
...     for line in aFileINPUT:
...         do_something( line.strip( "\n" ) )
...
>>> dis.dis( ALAP_runner )
  2           0 SETUP_LOOP              33 (to 36)
              3 LOAD_FAST                0 (aFileINPUT)
              6 GET_ITER
        >>    7 FOR_ITER                25 (to 35)
             10 STORE_FAST               1 (line)
  3          13 LOAD_GLOBAL              0 (do_something)
             16 LOAD_FAST                1 (line)
             19 LOAD_ATTR                1 (strip)
             22 LOAD_CONST               1 ('\n')
             25 CALL_FUNCTION            1
             28 CALL_FUNCTION            1
             31 POP_TOP
             32 JUMP_ABSOLUTE            7
        >>   35 POP_BLOCK
        >>   36 LOAD_CONST               0 (None)
             39 RETURN_VALUE
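Bytecode listings show what gets executed, not how long it takes. A minimal Python 3 sketch for timing the four variants side by side might look like the following. The `do_something()` stand-in, the `make_input()` helper and the function names are assumptions for illustration only, and an in-memory `io.StringIO` replaces a real file.

```python
import io
import timeit
from operator import methodcaller


def do_something(line, sink):
    # Hypothetical stand-in for the real per-line work: just collect the line
    sink.append(line)


def via_map_lambda(a_file_input):
    out = []
    for line in map(lambda s: s.strip("\n"), a_file_input):
        do_something(line, out)
    return out


def via_plain_loop(a_file_input):
    out = []
    for line in a_file_input:
        line = line.strip("\n")
        do_something(line, out)
    return out


def via_methodcaller(a_file_input):
    out = []
    for line in map(methodcaller("strip", "\n"), a_file_input):
        do_something(line, out)
    return out


def via_alap(a_file_input):
    out = []
    for line in a_file_input:
        do_something(line.strip("\n"), out)
    return out


VARIANTS = (via_map_lambda, via_plain_loop, via_methodcaller, via_alap)


def make_input(n_lines=1000):
    # In-memory file-like object; iterating it yields "\n"-terminated lines
    return io.StringIO("".join("row %d\n" % i for i in range(n_lines)))


def time_variants(repeats=20):
    # Total seconds per variant; input construction is included in each run
    return {f.__name__: timeit.timeit(lambda f=f: f(make_input()), number=repeats)
            for f in VARIANTS}
```

Calling `time_variants()` returns a dict of total seconds per variant. The absolute numbers depend on the interpreter version and on what the real `do_something()` does, so they are best read as a relative comparison only.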
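Further details depend heavily on the nature of do_something() and on the actual costs under the overhead-strict re-formulation of Amdahl's Law: account for all the add-on overheads that appear when going from pure-[SERIAL] to just-[CONCURRENT], and even more so if the work gets { process | node }-distributed.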
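To make the overhead argument concrete, here is a rough sketch of an overhead-aware Amdahl-style estimate. The formula, the function name and the parameter names are assumptions chosen for illustration, not the exact re-formulation referred to above. All quantities are expressed as fractions of the original pure-[SERIAL] run time.

```python
def overhead_aware_speedup(parallel_fraction, n_workers,
                           setup_overhead=0.0, teardown_overhead=0.0):
    """Estimate speedup = 1 / (serial + setup + parallel / N + teardown).

    parallel_fraction : share of the original run time that can be split
    n_workers         : number of workers sharing the parallel part
    *_overhead        : add-on costs, as fractions of the original run time
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    new_run_time = (serial_fraction
                    + setup_overhead
                    + parallel_fraction / n_workers
                    + teardown_overhead)
    return 1.0 / new_run_time
```

With zero overheads this reduces to the classical Amdahl's Law. Once the overheads exceed the time saved by splitting the work, the estimate drops below 1.0, meaning the "faster" version is slower than the serial loop.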
On a list comprehension with an if-based member filter - pure-[SERIAL], local-GIL-lock, local-CPU, local-RAM-I/O (and not protected at all against an unrecoverable MemoryError crash, as the on-the-fly syntax constructor keeps allocating memory while the list grows):
############################################################# EFFICIENCY LIMITS :
#                                                           - pure-[SERIAL]
#                                                           - local-GIL-lock
#                                                           - local-CPU
#                                                           - local-RAM-I/O :
>>> def anOnTheFlyGrowingListComprehension( self ):
...     results = [x for x in list(set(self.database.values())) if x.startswith(text)]
...
>>> dis.dis( anOnTheFlyGrowingListComprehension )
  2           0 BUILD_LIST               0
              3 LOAD_GLOBAL              0 (list)
              6 LOAD_GLOBAL              1 (set)
              9 LOAD_FAST                0 (self)
             12 LOAD_ATTR                2 (database)
             15 LOAD_ATTR                3 (values)
             18 CALL_FUNCTION            0
             21 CALL_FUNCTION            1
             24 CALL_FUNCTION            1
             27 GET_ITER
        >>   28 FOR_ITER                27 (to 58)
             31 STORE_FAST               1 (x)
             34 LOAD_FAST                1 (x)
             37 LOAD_ATTR                4 (startswith)
             40 LOAD_GLOBAL              5 (text)
             43 CALL_FUNCTION            1
             46 POP_JUMP_IF_FALSE       28
             49 LOAD_FAST                1 (x)
             52 LIST_APPEND              2
             55 JUMP_ABSOLUTE           28
        >>   58 STORE_FAST               2 (results)
             61 LOAD_CONST               0 (None)
             64 RETURN_VALUE
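One way to reduce the memory pressure of the comprehension above is to filter lazily. The sketch below is an assumption-labelled illustration: `iter_matching_values` is a hypothetical helper that takes a plain dict in place of `self.database` and passes `text` explicitly instead of reading it as a global.

```python
def iter_matching_values(database, text):
    """Lazily yield each distinct value of `database` that starts with `text`.

    No intermediate list(set(...)) copy and no result list are built;
    only the set of values seen so far is kept in memory.
    """
    seen = set()
    for value in database.values():
        if value in seen:
            continue          # skip duplicates, like set() would
        seen.add(value)
        if value.startswith(text):
            yield value


# Usage: consume matches one at a time, or materialise them only if needed
database = {"a": "apple", "b": "apricot", "c": "apple", "d": "banana"}
matches = list(iter_matching_values(database, "ap"))
```

The `seen` set still grows with the number of distinct values, so this trades the two extra full copies for a single one. It does not make the loop any less pure-[SERIAL].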
Or, another, closer iterator-based formulation of a pure-[SERIAL] "front"-end .strip()-er:
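############################################################# EFFICIENCY LIMITS :
#                                                           - pure-[SERIAL]
#                                                           - local-GIL-lock
#                                                           - local-CPU
#                                                           - local-RAM-I/O :
>>> # NOTE: in Python 2, dis.dis() given a str treats its bytes as raw bytecode,
>>> #       so the listing below does NOT describe the generator expression;
>>> #       pass a code object instead, e.g. dis.dis( compile( src, "<stdin>", "eval" ) )
>>> dis.dis( '( do_something( line.strip( "\n" ) ) for line in aFileINPUT )' )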
0 STORE_SLICE+0
1 SLICE+2
2 LOAD_CONST 24431 (24431)
5 POP_JUMP_IF_TRUE 28015
8 LOAD_NAME 26740 (26740)
11 BUILD_MAP 26478
14 STORE_SLICE+0
15 SLICE+2
16 IMPORT_NAME 28265 (28265)
19 LOAD_NAME 29486 (29486)
22 LOAD_GLOBAL 26994 (26994)
25 JUMP_IF_TRUE_OR_POP 8232
28 <34>
29 UNARY_POSITIVE
30 <34>
31 SLICE+2
32 STORE_SLICE+1
33 SLICE+2
34 STORE_SLICE+1
35 SLICE+2
36 BUILD_TUPLE 29295
39 SLICE+2
40 IMPORT_NAME 28265 (28265)
43 LOAD_NAME 26912 (26912)
46 JUMP_FORWARD 24864 (to 24913)
49 PRINT_EXPR
50 BUILD_MAP 25964
53 PRINT_ITEM_TO
54 INPLACE_XOR
55 BREAK_LOOP
56 EXEC_STMT
57 IMPORT_STAR
58 SLICE+2
59 STORE_SLICE+1