32.12. dis — Disassembler for Python bytecode

Source code: Lib/dis.py


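The dis module supports the analysis of CPython bytecode by disassembling it. The CPython bytecode which this module takes as input is defined in the file Include/opcode.h and is used by the compiler and the interpreter.

CPython implementation detail: Bytecode is an implementation detail of the CPython interpreter. No guarantees are made that bytecode will not be added, removed, or changed between versions of Python. Use of this module should not be considered to work across Python VMs or Python releases.

Example: Given the function myfunc():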

def myfunc(alist):
    return len(alist)
					

the following command can be used to display the disassembly of myfunc():

>>> import dis
>>> dis.dis(myfunc)
  2           0 LOAD_GLOBAL              0 (len)
              3 LOAD_FAST                0 (alist)
              6 CALL_FUNCTION            1
              9 RETURN_VALUE
					

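(The "2" is a line number.)

32.12.1. Bytecode analysis

New in version 3.4.

The bytecode analysis API allows pieces of Python code to be wrapped in a Bytecode object that provides easy access to details of the compiled code.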

class dis. Bytecode ( x , * , first_line=None , current_offset=None )

Analyse the bytecode corresponding to a function, generator, method, string of source code, or a code object (as returned by compile() ).

This is a convenience wrapper around many of the functions listed below, most notably get_instructions(), as iterating over a Bytecode instance yields the bytecode operations as Instruction instances.

If first_line is not None, it indicates the line number that should be reported for the first source line in the disassembled code. Otherwise, the source line information (if any) is taken directly from the disassembled code object.

If current_offset is not None, it refers to an instruction offset in the disassembled code. Setting this means dis() will display a “current instruction” marker against the specified opcode.

classmethod from_traceback ( tb )

Construct a Bytecode instance from the given traceback, setting current_offset to the instruction responsible for the exception.

codeobj

The compiled code object.

first_line

The first source line of the code object (if available).

dis ( )

Return a formatted view of the bytecode operations (the same as printed by dis.dis() , but returned as a multi-line string).

info ( )

Return a formatted multi-line string with detailed information about the code object, like code_info() .

Example:

>>> bytecode = dis.Bytecode(myfunc)
>>> for instr in bytecode:
...     print(instr.opname)
...
LOAD_GLOBAL
LOAD_FAST
CALL_FUNCTION
RETURN_VALUE
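
The attributes and methods listed above can also be used without iterating. The following is a minimal sketch that redefines myfunc() so it is self-contained; the value 100 passed as first_line is an arbitrary choice for illustration.

```python
import dis


def myfunc(alist):
    return len(alist)


# Wrap the function once, then query the wrapper in several ways.
bytecode = dis.Bytecode(myfunc, first_line=100)

listing = bytecode.dis()    # the text dis.dis() would print, returned as a string
summary = bytecode.info()   # the text dis.code_info() would return

print(bytecode.codeobj is myfunc.__code__)
print(bytecode.first_line)
print(listing)
print(summary)
```

Because dis() and info() return strings rather than printing, the results can be logged or compared in tests. The exact listing depends on the interpreter release.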
					

32.12.2. Analysis functions

The dis module also defines the following analysis functions that convert the input directly to the desired output. They can be useful if only a single operation is being performed, so the intermediate analysis object isn’t useful:

dis. code_info ( x )

Return a formatted multi-line string with detailed code object information for the supplied function, generator, method, source code string or code object.

Note that the exact contents of code info strings are highly implementation dependent and they may change arbitrarily across Python VMs or Python releases.

New in version 3.2.

dis. show_code ( x , * , file=None )

Print detailed code object information for the supplied function, method, source code string or code object to file (or sys.stdout if file is not specified).

This is a convenient shorthand for print(code_info(x), file=file), intended for interactive exploration at the interpreter prompt.

New in version 3.2.

Changed in version 3.4: Added file parameter.
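
As a small sketch of the file parameter, the report can be captured in an in-memory buffer instead of being printed. The function myfunc() is repeated here only to keep the example self-contained.

```python
import dis
import io


def myfunc(alist):
    return len(alist)


# Send the report to a text buffer rather than sys.stdout.
buffer = io.StringIO()
dis.show_code(myfunc, file=buffer)

report = buffer.getvalue()
print(report)
```

The captured text is what code_info() returns, followed by the newline that print() adds. Its contents are implementation dependent, as noted above.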

dis. dis ( x=None , * , file=None )

Disassemble the x object. x can denote either a module, a class, a method, a function, a generator, a code object, a string of source code or a byte sequence of raw bytecode. For a module, it disassembles all functions. For a class, it disassembles all methods (including class and static methods). For a code object or sequence of raw bytecode, it prints one line per bytecode instruction. Strings are first compiled to code objects with the compile() built-in function before being disassembled. If no object is provided, this function disassembles the last traceback.

The disassembly is written as text to the supplied file argument if provided and to sys.stdout otherwise.

Changed in version 3.4: Added file parameter.
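
The sketch below passes a string of source code, which dis() compiles before disassembling, and collects the output through the file argument. The statement being disassembled is an arbitrary example; the names in it are never evaluated.

```python
import dis
import io

source = "total = sum(values)"

# dis() compiles the string itself; nothing in it is executed.
buffer = io.StringIO()
dis.dis(source, file=buffer)

listing = buffer.getvalue()
print(listing)
```

The opcode names in the listing vary between releases, but the referenced names (sum, values, total) appear in the parenthesised interpretation column.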

dis. distb ( tb=None , * , file=None )

Disassemble the top-of-stack function of a traceback, using the last traceback if none was passed. The instruction causing the exception is indicated.

The disassembly is written as text to the supplied file argument if provided and to sys.stdout otherwise.

Changed in version 3.4: Added file parameter.

dis. disassemble ( code , lasti=-1 , * , file=None )
dis. disco ( code , lasti=-1 , * , file=None )

Disassemble a code object, indicating the last instruction if lasti was provided. The output is divided in the following columns:

  1. the line number, for the first instruction of each line
  2. the current instruction, indicated as --> ,
  3. a labelled instruction, indicated with >> ,
  4. the address of the instruction,
  5. the operation code name,
  6. operation parameters, and
  7. interpretation of the parameters in parentheses.

The parameter interpretation recognizes local and global variable names, constant values, branch targets, and compare operators.

The disassembly is written as text to the supplied file argument if provided and to sys.stdout otherwise.

Changed in version 3.4: Added file parameter.

dis. get_instructions ( x , * , first_line=None )

Return an iterator over the instructions in the supplied function, method, source code string or code object.

The iterator generates a series of Instruction named tuples giving the details of each operation in the supplied code.

If first_line is not None, it indicates the line number that should be reported for the first source line in the disassembled code. Otherwise, the source line information (if any) is taken directly from the disassembled code object.

New in version 3.4.

dis. findlinestarts ( code )

This generator function uses the co_firstlineno and co_lnotab attributes of the code object code to find the offsets which are starts of lines in the source code. They are generated as (offset, lineno) pairs.
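
For illustration, the following sketch compiles a three-line snippet and lists the (offset, lineno) pairs. The source text and the file name "<example>" are arbitrary.

```python
import dis

source = "a = 1\nb = 2\nc = a + b\n"
code = compile(source, "<example>", "exec")

# Each pair gives the bytecode offset at which a source line begins.
line_starts = list(dis.findlinestarts(code))
for offset, lineno in line_starts:
    print(offset, lineno)
```

The offsets, and any extra entries for interpreter bookkeeping instructions, differ between releases; the three source lines should each appear once.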

dis. findlabels ( code )

Detect all offsets in the raw compiled bytecode string code (for example, the co_code attribute of a code object) which are jump targets, and return a list of these offsets.
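
A short sketch of findlabels(), assuming it is given the raw bytecode of a function (its __code__.co_code), which is the byte sequence the implementation walks. The function sign() is a hypothetical example chosen because its if statement needs a conditional jump.

```python
import dis


def sign(number):
    if number < 0:
        return "negative"
    return "non-negative"


# Offsets that some jump instruction in the bytecode points at.
labels = dis.findlabels(sign.__code__.co_code)

# The same information is exposed per instruction by get_instructions().
jump_targets = [
    ins.offset for ins in dis.get_instructions(sign) if ins.is_jump_target
]

print(labels)
print(jump_targets)
```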

dis. stack_effect ( opcode [ , oparg ] )

Compute the stack effect of opcode with argument oparg.

New in version 3.4.
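
As a brief sketch, the stack effect of two opcodes can be queried by looking up their numeric values in opmap. POP_TOP takes no argument; LOAD_CONST requires one, and 0 is used here only as a placeholder index.

```python
import dis

# POP_TOP removes one item from the stack.
pop_effect = dis.stack_effect(dis.opmap["POP_TOP"])

# LOAD_CONST pushes one item; the argument value does not change the effect.
load_effect = dis.stack_effect(dis.opmap["LOAD_CONST"], 0)

print(pop_effect, load_effect)
```

A negative result means the instruction shrinks the stack, a positive one means it grows.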

32.12.3. Python Bytecode Instructions

The get_instructions() function and Bytecode class provide details of bytecode instructions as Instruction instances:

class dis. Instruction

Details for a bytecode operation

opcode

numeric code for operation, corresponding to the opcode values listed below and the bytecode values in the Opcode collections.

opname

human readable name for operation

arg

numeric argument to operation (if any), otherwise None

argval

resolved arg value (if known), otherwise same as arg

argrepr

human readable description of operation argument

offset

start index of operation within bytecode sequence

starts_line

line started by this opcode (if any), otherwise None

is_jump_target

True if other code jumps to here, otherwise False

New in version 3.4.
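
The fields above can be read like attributes of any named tuple. This sketch prints a few of them for myfunc() (repeated here for self-containment) and then uses opname and argval to collect the global names the function loads.

```python
import dis


def myfunc(alist):
    return len(alist)


# One row per instruction: where it is, what it is, and its argument.
for ins in dis.get_instructions(myfunc):
    print(ins.offset, ins.opname, ins.arg, ins.argrepr)

# argval holds the resolved argument, here the name being looked up.
global_names = [
    ins.argval
    for ins in dis.get_instructions(myfunc)
    if ins.opname == "LOAD_GLOBAL"
]
print(global_names)
```

Filtering on opname and reading argval avoids decoding the raw numeric arg by hand, whose meaning can differ between releases.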

The Python compiler currently generates the following bytecode instructions.

General instructions

NOP

Do nothing code. Used as a placeholder by the bytecode optimizer.

POP_TOP

Removes the top-of-stack (TOS) item.

ROT_TWO

Swaps the two top-most stack items.

ROT_THREE

Lifts second and third stack item one position up, moves top down to position three.

DUP_TOP

Duplicates the reference on top of the stack.

DUP_TOP_TWO

Duplicates the two references on top of the stack, leaving them in the same order.

Unary operations

Unary operations take the top of the stack, apply the operation, and push the result back on the stack.
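
To see a unary instruction in context, one can disassemble a function that applies a unary operator. The function negate() is a hypothetical example; other instructions in the listing depend on the release.

```python
import dis


def negate(value):
    return -value


# The unary minus compiles to a single UNARY_NEGATIVE instruction.
opnames = [ins.opname for ins in dis.get_instructions(negate)]
print(opnames)
```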

UNARY_POSITIVE

Implements TOS = +TOS.

UNARY_NEGATIVE

Implements TOS = -TOS.

UNARY_NOT

Implements TOS = not TOS.

UNARY_INVERT

Implements TOS = ~TOS.

GET_ITER

Implements TOS = iter(TOS).

GET_YIELD_FROM_ITER

If TOS is a generator iterator or coroutine object it is left as is. Otherwise, implements TOS = iter(TOS).

New in version 3.5.

Binary operations

Binary operations remove the top of the stack (TOS) and the second top-most stack item (TOS1) from the stack. They perform the operation, and put the result back on the stack.
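
The operand order matters for non-commutative operators. The following is a toy model, not the interpreter's implementation: a Python list stands in for the value stack (last item is TOS), and binary_subtract() is a hypothetical helper mirroring TOS = TOS1 - TOS.

```python
def binary_subtract(stack):
    """Toy model of BINARY_SUBTRACT on a list used as a stack."""
    tos = stack.pop()         # top of stack: the right-hand operand
    tos1 = stack.pop()        # second item: the left-hand operand
    stack.append(tos1 - tos)  # the result becomes the new TOS


stack = [10, 3]  # 10 was pushed first, so TOS is 3 and TOS1 is 10
binary_subtract(stack)
print(stack)  # [7]
```

The left operand is pushed first and therefore ends up below the right operand.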

BINARY_POWER

Implements TOS = TOS1 ** TOS.

BINARY_MULTIPLY

Implements TOS = TOS1 * TOS.

BINARY_MATRIX_MULTIPLY

Implements TOS = TOS1 @ TOS.

New in version 3.5.

BINARY_FLOOR_DIVIDE

Implements TOS = TOS1 // TOS.

BINARY_TRUE_DIVIDE

Implements TOS = TOS1 / TOS.

BINARY_MODULO

Implements TOS = TOS1 % TOS.

BINARY_ADD

Implements TOS = TOS1 + TOS.

BINARY_SUBTRACT

Implements TOS = TOS1 - TOS.

BINARY_SUBSCR

Implements TOS = TOS1[TOS].

BINARY_LSHIFT

Implements TOS = TOS1 << TOS.

BINARY_RSHIFT

Implements TOS = TOS1 >> TOS.

BINARY_AND

Implements TOS = TOS1 & TOS.

BINARY_XOR

Implements TOS = TOS1 ^ TOS.

BINARY_OR

Implements TOS = TOS1 | TOS.

In-place operations

In-place operations are like binary operations, in that they remove TOS and TOS1, and push the result back on the stack, but the operation is done in-place when TOS1 supports it, and the resulting TOS may be (but does not have to be) the original TOS1.
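
The phrase "may be (but does not have to be) the original TOS1" can be observed from Python code. This sketch uses operator.iadd(), the function form of +=, as a stand-in for the in-place addition instruction; it illustrates the semantics only.

```python
import operator

numbers = [1, 2]
result = operator.iadd(numbers, [3])  # what `numbers += [3]` computes
same_list = result is numbers         # lists are extended in place

pair = (1, 2)
extended = operator.iadd(pair, (3,))  # tuples fall back to ordinary +
same_tuple = extended is pair         # so a new object is produced

print(same_list, same_tuple)
```

A list supports in-place addition, so the result is the original object; a tuple does not, so a new tuple takes its place.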

INPLACE_POWER

Implements in-place TOS = TOS1 ** TOS .

INPLACE_MULTIPLY

Implements in-place TOS = TOS1 * TOS .

INPLACE_MATRIX_MULTIPLY

Implements in-place TOS = TOS1 @ TOS .

New in version 3.5.

INPLACE_FLOOR_DIVIDE

Implements in-place TOS = TOS1 // TOS .

INPLACE_TRUE_DIVIDE

Implements in-place TOS = TOS1 / TOS .

INPLACE_MODULO

Implements in-place TOS = TOS1 % TOS .

INPLACE_ADD

Implements in-place TOS = TOS1 + TOS .

INPLACE_SUBTRACT

Implements in-place TOS = TOS1 - TOS .

INPLACE_LSHIFT

Implements in-place TOS = TOS1 << TOS .

INPLACE_RSHIFT

Implements in-place TOS = TOS1 >> TOS .

INPLACE_AND

Implements in-place TOS = TOS1 & TOS .

INPLACE_XOR

Implements in-place TOS = TOS1 ^ TOS .

INPLACE_OR

Implements in-place TOS = TOS1 | TOS .

STORE_SUBSCR

Implements TOS1[TOS] = TOS2.

DELETE_SUBSCR

Implements del TOS1[TOS].

Coroutine opcodes

GET_AWAITABLE

Implements TOS = get_awaitable(TOS), where get_awaitable(o) returns o if o is a coroutine object or a generator object with the CO_ITERABLE_COROUTINE flag, or resolves o.__await__.

GET_AITER

Implements TOS = get_awaitable(TOS.__aiter__()). See GET_AWAITABLE for details about get_awaitable.

GET_ANEXT

Implements PUSH(get_awaitable(TOS.__anext__())). See GET_AWAITABLE for details about get_awaitable.

BEFORE_ASYNC_WITH

Resolves __aenter__ and __aexit__ from the object on top of the stack. Pushes __aexit__ and result of __aenter__() to the stack.

SETUP_ASYNC_WITH

Creates a new frame object.

Miscellaneous opcodes

PRINT_EXPR

Implements the expression statement for the interactive mode. TOS is removed from the stack and printed. In non-interactive mode, an expression statement is terminated with POP_TOP .

BREAK_LOOP

Terminates a loop due to a break statement.

CONTINUE_LOOP ( target )

Continues a loop due to a continue statement. target is the address to jump to (which should be a FOR_ITER instruction).

SET_ADD ( i )

Calls set.add(TOS1[-i], TOS). Used to implement set comprehensions.

LIST_APPEND ( i )

Calls list.append(TOS1[-i], TOS). Used to implement list comprehensions.

MAP_ADD ( i )

Calls dict.setitem(TOS1[-i], TOS, TOS1). Used to implement dict comprehensions.

For all of the SET_ADD , LIST_APPEND and MAP_ADD instructions, while the added value or key/value pair is popped off, the container object remains on the stack so that it is available for further iterations of the loop.
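
To locate LIST_APPEND in practice, the sketch below disassembles a list comprehension. Depending on the release, the comprehension body is compiled either inline or as a nested code object stored in co_consts, so the hypothetical helper all_opnames() searches both.

```python
import dis
import types


def squares(values):
    return [value * value for value in values]


def all_opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = [ins.opname for ins in dis.get_instructions(code)]
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names.extend(all_opnames(const))
    return names


opnames = all_opnames(squares.__code__)
print("LIST_APPEND" in opnames)
```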

RETURN_VALUE

Returns with TOS to the caller of the function.

YIELD_VALUE

Pops TOS and yields it from a generator.

YIELD_FROM

Pops TOS and delegates to it as a subiterator from a generator.

New in version 3.3.

IMPORT_STAR

Loads all symbols not starting with '_' directly from the module TOS to the local namespace. The module is popped after loading all names. This opcode implements from module import * .

POP_BLOCK

Removes one block from the block stack. Per frame, there is a stack of blocks, denoting nested loops, try statements, and such.

POP_EXCEPT

Removes one block from the block stack. The popped block must be an exception handler block, as implicitly created when entering an except handler. In addition to popping extraneous values from the frame stack, the last three popped values are used to restore the exception state.

END_FINALLY

Terminates a finally clause. The interpreter recalls whether the exception has to be re-raised, or whether the function returns, and continues with the outer-next block.

LOAD_BUILD_CLASS

Pushes builtins.__build_class__() onto the stack. It is later called by CALL_FUNCTION to construct a class.

SETUP_WITH ( delta )

This opcode performs several operations before a with block starts. First, it loads __exit__() from the context manager and pushes it onto the stack for later use by WITH_CLEANUP_START. Then, __enter__() is called, and a finally block pointing to delta is pushed. Finally, the result of calling the enter method is pushed onto the stack. The next opcode will either ignore it (POP_TOP), or store it in one or more variables (STORE_FAST, STORE_NAME, or UNPACK_SEQUENCE).

WITH_CLEANUP_START

Cleans up the stack when a with statement block exits. TOS is the context manager’s __exit__() bound method. Below TOS are 1–3 values indicating how/why the finally clause was entered:

  • SECOND = None
  • (SECOND, THIRD) = ( WHY_{RETURN,CONTINUE} ), retval
  • SECOND = WHY_* ; no retval below it
  • (SECOND, THIRD, FOURTH) = exc_info()

In the last case, TOS(SECOND, THIRD, FOURTH) is called, otherwise TOS(None, None, None). Pushes SECOND and result of the call to the stack.

WITH_CLEANUP_FINISH

Pops exception type and result of ‘exit’ function call from the stack.

If the stack represents an exception, and the function call returns a ‘true’ value, this information is “zapped” and replaced with a single WHY_SILENCED to prevent END_FINALLY from re-raising the exception. (But non-local gotos will still be resumed.)

All of the following opcodes expect arguments. An argument is two bytes, with the more significant byte last.

STORE_NAME ( namei )

Implements name = TOS. namei is the index of name in the attribute co_names of the code object. The compiler tries to use STORE_FAST or STORE_GLOBAL if possible.

DELETE_NAME ( namei )

Implements del name, where namei is the index into co_names attribute of the code object.

UNPACK_SEQUENCE ( count )

Unpacks TOS into count individual values, which are put onto the stack right-to-left.

UNPACK_EX ( counts )

Implements assignment with a starred target: Unpacks an iterable in TOS into individual values, where the total number of values can be smaller than the number of items in the iterable: one of the new values will be a list of all leftover items.

The low byte of counts is the number of values before the list value, the high byte of counts the number of values after it. The resulting values are put onto the stack right-to-left.
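
The two counts can be recovered from the instruction argument with a mask and a shift. In this sketch, split_ends() is a hypothetical function with one target before and one after the starred name; the combined argument is read from the Instruction reported by get_instructions().

```python
import dis


def split_ends(items):
    first, *middle, last = items
    return first, middle, last


unpack = next(
    ins for ins in dis.get_instructions(split_ends) if ins.opname == "UNPACK_EX"
)

before = unpack.arg & 0xFF        # low byte: targets before the starred name
after = (unpack.arg >> 8) & 0xFF  # high byte: targets after the starred name
print(before, after)
```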

STORE_ATTR ( namei )

Implements TOS.name = TOS1, where namei is the index of name in co_names.

DELETE_ATTR ( namei )

Implements del TOS.name, using namei as index into co_names.

STORE_GLOBAL ( namei )

Works as STORE_NAME , but stores the name as a global.

DELETE_GLOBAL ( namei )

Works as DELETE_NAME , but deletes a global name.

LOAD_CONST ( consti )

Pushes co_consts[consti] onto the stack.
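
The relationship between the argument and co_consts can be checked directly. In this sketch, greet() is a hypothetical function whose string literal has to be loaded as a constant.

```python
import dis


def greet(name):
    return "hello, " + name


code = greet.__code__
loads = [ins for ins in dis.get_instructions(greet) if ins.opname == "LOAD_CONST"]

# arg is the index; co_consts[arg] is the object that gets pushed.
for ins in loads:
    print(ins.arg, repr(code.co_consts[ins.arg]))
```

The Instruction field argval already holds the looked-up constant, so indexing co_consts by hand is only needed when working with raw arguments.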

LOAD_NAME ( namei )

Pushes the value associated with co_names[namei] onto the stack.

BUILD_TUPLE ( count )

Creates a tuple consuming count items from the stack, and pushes the resulting tuple onto the stack.

BUILD_LIST ( count )

Works as BUILD_TUPLE , but creates a list.

BUILD_SET ( count )

Works as BUILD_TUPLE , but creates a set.

BUILD_MAP ( count )

Pushes a new dictionary object onto the stack. Pops 2 * count items so that the dictionary holds count entries: {..., TOS3: TOS2, TOS1: TOS} .

Changed in version 3.5: The dictionary is created from stack items instead of creating an empty dictionary pre-sized to hold count items.

BUILD_TUPLE_UNPACK ( count )

Pops count iterables from the stack, joins them in a single tuple, and pushes the result. Implements iterable unpacking in tuple displays (*x, *y, *z).

New in version 3.5.

BUILD_LIST_UNPACK ( count )

This is similar to BUILD_TUPLE_UNPACK, but pushes a list instead of tuple. Implements iterable unpacking in list displays [*x, *y, *z].

New in version 3.5.

BUILD_SET_UNPACK ( count )

This is similar to BUILD_TUPLE_UNPACK, but pushes a set instead of tuple. Implements iterable unpacking in set displays {*x, *y, *z}.

New in version 3.5.

BUILD_MAP_UNPACK ( count )

Pops count mappings from the stack, merges them into a single dictionary, and pushes the result. Implements dictionary unpacking in dictionary displays {**x, **y, **z}.

New in version 3.5.

BUILD_MAP_UNPACK_WITH_CALL ( oparg )

This is similar to BUILD_MAP_UNPACK, but is used for f(**x, **y, **z) call syntax. The lowest byte of oparg is the count of mappings, the relative position of the corresponding callable f is encoded in the second byte of oparg.

New in version 3.5.

LOAD_ATTR ( namei )

Replaces TOS with getattr(TOS, co_names[namei]) .

COMPARE_OP ( opname )

Performs a Boolean operation. The operation name can be found in cmp_op[opname] .
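
As a sketch, the comparison performed by a COMPARE_OP instruction can be read from its Instruction record. The function less_than() is a hypothetical example. How the numeric argument maps onto cmp_op is release specific, so the example reads the resolved argval instead of indexing cmp_op by hand.

```python
import dis


def less_than(left, right):
    return left < right


compare = next(
    ins for ins in dis.get_instructions(less_than) if ins.opname == "COMPARE_OP"
)

# argval is the operator name taken from dis.cmp_op.
print(compare.argval)
```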

IMPORT_NAME ( namei )

Imports the module co_names[namei] . TOS and TOS1 are popped and provide the fromlist and level arguments of __import__() . The module object is pushed onto the stack. The current namespace is not affected: for a proper import statement, a subsequent STORE_FAST instruction modifies the namespace.

IMPORT_FROM ( namei )

Loads the attribute co_names[namei] from the module found in TOS. The resulting object is pushed onto the stack, to be subsequently stored by a STORE_FAST instruction.

JUMP_FORWARD ( delta )

Increments bytecode counter by delta .

POP_JUMP_IF_TRUE ( target )

If TOS is true, sets the bytecode counter to target . TOS is popped.

POP_JUMP_IF_FALSE ( target )

If TOS is false, sets the bytecode counter to target . TOS is popped.

JUMP_IF_TRUE_OR_POP ( target )

If TOS is true, sets the bytecode counter to target and leaves TOS on the stack. Otherwise (TOS is false), TOS is popped.

JUMP_IF_FALSE_OR_POP ( target )

If TOS is false, sets the bytecode counter to target and leaves TOS on the stack. Otherwise (TOS is true), TOS is popped.

JUMP_ABSOLUTE ( target )

Sets the bytecode counter to target.

FOR_ITER ( delta )

TOS is an iterator. Call its __next__() method. If this yields a new value, push it on the stack (leaving the iterator below it). If the iterator indicates it is exhausted, TOS is popped, and the bytecode counter is incremented by delta.

LOAD_GLOBAL ( namei )

Loads the global named co_names[namei] onto the stack.

SETUP_LOOP ( delta )

Pushes a block for a loop onto the block stack. The block spans from the current instruction with a size of delta bytes.

SETUP_EXCEPT ( delta )

Pushes a try block from a try-except clause onto the block stack. delta points to the first except block.

SETUP_FINALLY ( delta )

Pushes a try block from a try-finally clause onto the block stack. delta points to the finally block.

LOAD_FAST ( var_num )

Pushes a reference to the local co_varnames[var_num] onto the stack.

STORE_FAST ( var_num )

Stores TOS into the local co_varnames[var_num].

DELETE_FAST ( var_num )

Deletes local co_varnames[var_num].

LOAD_CLOSURE ( i )

Pushes a reference to the cell contained in slot i of the cell and free variable storage. The name of the variable is co_cellvars[i] if i is less than the length of co_cellvars . Otherwise it is co_freevars[i - len(co_cellvars)] .

LOAD_DEREF ( i )

Loads the cell contained in slot i of the cell and free variable storage. Pushes a reference to the object the cell contains on the stack.

LOAD_CLASSDEREF ( i )

Much like LOAD_DEREF but first checks the locals dictionary before consulting the cell. This is used for loading free variables in class bodies.

STORE_DEREF ( i )

Stores TOS into the cell contained in slot i of the cell and free variable storage.

DELETE_DEREF ( i )

Empties the cell contained in slot i of the cell and free variable storage. Used by the del statement.

RAISE_VARARGS ( argc )

Raises an exception. argc indicates the number of parameters to the raise statement, ranging from 0 to 3. The handler will find the traceback as TOS2, the parameter as TOS1, and the exception as TOS.

CALL_FUNCTION ( argc )

Calls a function. The low byte of argc indicates the number of positional parameters, the high byte the number of keyword parameters. On the stack, the opcode finds the keyword parameters first. For each keyword argument, the value is on top of the key. Below the keyword parameters, the positional parameters are on the stack, with the right-most parameter on top. Below the parameters, the function object to call is on the stack. Pops all function arguments, and the function itself off the stack, and pushes the return value.
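
The packing of argc described above amounts to a mask and a shift. The following sketch applies to the encoding documented here (other CPython releases encode calls differently), and decode_call_argc() is a hypothetical helper, not part of the dis module.

```python
def decode_call_argc(argc):
    """Split a CALL_FUNCTION argument into (positional, keyword) counts."""
    positional = argc & 0xFF      # low byte
    keyword = (argc >> 8) & 0xFF  # high byte
    return positional, keyword


# A call such as f(a, b, key=value) has two positional arguments and one
# keyword argument, so argc = 2 | (1 << 8) = 258.
print(decode_call_argc(258))  # (2, 1)
```

With these counts, the instruction pops 2 * keyword + positional argument items plus the function object itself.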

MAKE_FUNCTION ( argc )

Pushes a new function object on the stack. From bottom to top, the consumed stack must consist of

  • argc & 0xFF default argument objects in positional order
  • (argc >> 8) & 0xFF pairs of name and default argument, with the name just below the object on the stack, for keyword-only parameters
  • (argc >> 16) & 0x7FFF parameter annotation objects
  • a tuple listing the parameter names for the annotations (only if there are any annotation objects)
  • the code associated with the function (at TOS1)
  • the qualified name of the function (at TOS)

MAKE_CLOSURE ( argc )

Creates a new function object, sets its __closure__ slot, and pushes it on the stack. TOS is the qualified name of the function, TOS1 is the code associated with the function, and TOS2 is the tuple containing cells for the closure’s free variables. argc is interpreted as in MAKE_FUNCTION; the annotations and defaults are also in the same order below TOS2.

BUILD_SLICE ( argc )

Pushes a slice object on the stack. argc must be 2 or 3. If it is 2, slice(TOS1, TOS) is pushed; if it is 3, slice(TOS2, TOS1, TOS) is pushed. See the slice() built-in function for more information.

EXTENDED_ARG ( ext )

Prefixes any opcode which has an argument too big to fit into the default two bytes. ext holds two additional bytes which, taken together with the subsequent opcode’s argument, comprise a four-byte argument, ext being the two most-significant bytes.
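
The byte arithmetic can be sketched as follows, assuming the layout described in this section: a two-byte argument stored with the more significant byte last, optionally extended by the two bytes of a preceding EXTENDED_ARG. Other CPython releases use a different instruction encoding, and decode_arg() is a hypothetical helper.

```python
def decode_arg(low_byte, high_byte, ext=0):
    """Combine argument bytes into one integer.

    low_byte and high_byte are the two bytes following the opcode, in the
    order they appear; ext is the argument of a preceding EXTENDED_ARG.
    """
    arg = low_byte | (high_byte << 8)  # more significant byte comes last
    return (ext << 16) | arg           # ext supplies the two upper bytes


print(decode_arg(0x34, 0x12))         # 4660, i.e. 0x1234
print(decode_arg(0x34, 0x12, ext=1))  # 70196, i.e. 0x11234
```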

CALL_FUNCTION_VAR ( argc )

Calls a function. argc is interpreted as in CALL_FUNCTION . The top element on the stack contains the variable argument list, followed by keyword and positional arguments.

CALL_FUNCTION_KW ( argc )

Calls a function. argc is interpreted as in CALL_FUNCTION . The top element on the stack contains the keyword arguments dictionary, followed by explicit keyword and positional arguments.

CALL_FUNCTION_VAR_KW ( argc )

Calls a function. argc is interpreted as in CALL_FUNCTION . The top element on the stack contains the keyword arguments dictionary, followed by the variable-arguments tuple, followed by explicit keyword and positional arguments.

HAVE_ARGUMENT

This is not really an opcode. It identifies the dividing line between opcodes which don’t take arguments (< HAVE_ARGUMENT) and those which do (>= HAVE_ARGUMENT).

32.12.4. Opcode collections

These collections are provided for automatic introspection of bytecode instructions:

dis. opname

Sequence of operation names, indexable using the bytecode.

dis. opmap

Dictionary mapping operation names to bytecodes.

dis. cmp_op

Sequence of all compare operation names.

dis. hasconst

Sequence of bytecodes that have a constant parameter.

dis. hasfree

Sequence of bytecodes that access a free variable (note that ‘free’ in this context refers to names in the current scope that are referenced by inner scopes or names in outer scopes that are referenced from this scope. It does not include references to global or builtin scopes).

dis. hasname

Sequence of bytecodes that access an attribute by name.

dis. hasjrel

Sequence of bytecodes that have a relative jump target.

dis. hasjabs

Sequence of bytecodes that have an absolute jump target.

dis. haslocal

Sequence of bytecodes that access a local variable.

dis. hascompare

Sequence of bytecodes of Boolean operations.
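
A few of these collections used together, as a minimal sketch. NOP and LOAD_CONST are used because they are long-standing opcode names; the numeric values printed are release specific.

```python
import dis

# opmap and opname are inverse lookups: name -> number -> name.
nop_code = dis.opmap["NOP"]
round_trip = dis.opname[nop_code]

# The has* sequences classify opcodes by the kind of argument they take.
load_const_has_constant = dis.opmap["LOAD_CONST"] in dis.hasconst

print(nop_code, round_trip, load_const_has_constant)
print(dis.cmp_op)
```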