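3. Data model ¶
3.1. Objects, values and types ¶
Objects are Python's abstraction for data. All data in a Python program is represented by objects or by relations between objects. (In a sense, and in conformance to Von Neumann's model of a "stored program computer", code is also represented by objects.)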
Every object has an identity, a type and a value. An object's
identity
never changes once it has been created; you may think of it as the object’s address in memory. The
is
operator compares the identity of two objects; the
id()
function returns an integer representing its identity.
CPython implementation detail:
For CPython,
id(x)
is the memory address where
x
is stored.
An object’s type determines the operations that the object supports (e.g., “does it have a length?”) and also defines the possible values for objects of that type. The
type()
function returns an object’s type (which is an object itself). Like its identity, an object’s
type
is also unchangeable.
[1]
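As a minimal sketch of the three properties described above (the variable names are illustrative only), the following compares identity, value and type for a list and a copy of it:

```python
first = [1, 2, 3]
alias = first            # a second name for the same object
duplicate = list(first)  # a new object with an equal value

print(alias is first)                 # True: same identity
print(duplicate == first)             # True: equal value
print(duplicate is first)             # False: different identity
print(id(alias) == id(first))         # True: id() reflects identity
print(type(first) is list)            # True
print(isinstance(type(first), type))  # True: a type is itself an object
```

The concrete integers returned by id() differ between runs and implementations, so only comparisons between them are shown.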
The value of some objects can change. Objects whose value can change are said to be mutable; objects whose value is unchangeable once they are created are called immutable. (The value of an immutable container object that contains a reference to a mutable object can change when the latter's value is changed; however the container is still considered immutable, because the collection of objects it contains cannot be changed. So, immutability is not strictly the same as having an unchangeable value, it is more subtle.) An object's mutability is determined by its type; for instance, numbers, strings and tuples are immutable, while dictionaries and lists are mutable.
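The parenthetical remark about immutable containers can be illustrated with a short sketch (names are illustrative): the tuple always refers to the same list object, yet what the tuple displays changes when that list is mutated.

```python
inner = [1, 2]
container = (inner, "fixed")   # immutable tuple holding a mutable list
identity_before = id(container[0])

inner.append(3)                # mutate the contained list
print(container)               # ([1, 2, 3], 'fixed')
print(id(container[0]) == identity_before)  # True: same contained object

try:
    container[0] = []          # rebinding an item of the tuple is rejected
except TypeError as exc:
    print("tuple item assignment failed:", exc)
```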
Objects are never explicitly destroyed; however, when they become unreachable they may be garbage-collected. An implementation is allowed to postpone garbage collection or omit it altogether — it is a matter of implementation quality how garbage collection is implemented, as long as no objects are collected that are still reachable.
CPython implementation detail:
CPython currently uses a reference-counting scheme with (optional) delayed detection of cyclically linked garbage, which collects most objects as soon as they become unreachable, but is not guaranteed to collect garbage containing circular references. See the documentation of the
gc
module for information on controlling the collection of cyclic garbage. Other implementations act differently and CPython may change. Do not depend on immediate finalization of objects when they become unreachable (so you should always close files explicitly).
Note that the use of the implementation’s tracing or debugging facilities may keep objects alive that would normally be collectable. Also note that catching an exception with a
try
…
except
statement may keep objects alive.
Some objects contain references to “external” resources such as open files or windows. It is understood that these resources are freed when the object is garbage-collected, but since garbage collection is not guaranteed to happen, such objects also provide an explicit way to release the external resource, usually a
close()
method. Programs are strongly recommended to explicitly close such objects. The
try
…
finally
statement and the
with
statement provide convenient ways to do this.
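Both ways of releasing an external resource explicitly can be sketched as follows. The temporary file is only a stand-in for "some external resource"; any object with a close() method follows the same pattern.

```python
import os
import tempfile

# Create a scratch file so the example is self-contained.
fd, path = tempfile.mkstemp(text=True)
os.close(fd)

# Variant 1: try ... finally guarantees close() even if the body raises.
handle = open(path, "w", encoding="utf-8")
try:
    handle.write("hello\n")
finally:
    handle.close()

# Variant 2: the with statement closes the file when the block exits.
with open(path, encoding="utf-8") as reader:
    content = reader.read()

print(handle.closed, reader.closed)  # True True
os.remove(path)
```

Neither variant depends on when, or whether, the file objects are garbage-collected.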
Some objects contain references to other objects; these are called containers. Examples of containers are tuples, lists and dictionaries. The references are part of a container's value. In most cases, when we talk about the value of a container, we imply the values, not the identities of the contained objects; however, when we talk about the mutability of a container, only the identities of the immediately contained objects are implied. So, if an immutable container (like a tuple) contains a reference to a mutable object, its value changes if that mutable object is changed.
Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. For example, after
a = 1; b = 1
,
a
and
b
may or may not refer to the same object with the value one, depending on the implementation. This is because
int
is an immutable type, so the reference to
1
can be reused. This behaviour depends on the implementation used, so should not be relied upon, but is something to be aware of when making use of object identity tests. However, after
c = []; d = []
,
c
and
d
are guaranteed to refer to two different, unique, newly created empty lists. (Note that
e = f = []
assigns the
same
object to both
e
and
f
.)
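The guarantees in this paragraph can be checked directly. This sketch deliberately does not print the result of a is b, since that outcome is implementation-dependent:

```python
a = 1; b = 1
c = []; d = []
e = f = []

print(a == b)   # True: equal values (whether "a is b" holds is unspecified)
print(c is d)   # False: two distinct, newly created lists
print(e is f)   # True: one list bound to two names

e.append("x")
print(f)        # ['x']: the change is visible through both names
```

For this reason, value comparisons of immutable objects should use == rather than is.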
3.2. The standard type hierarchy ¶
Below is a list of the types that are built into Python. Extension modules (written in C, Java, or other languages, depending on the implementation) can define additional types. Future versions of Python may add types to the type hierarchy (e.g., rational numbers, efficiently stored arrays of integers, etc.), although such additions will often be provided via the standard library instead.
Some of the type descriptions below contain a paragraph listing ‘special attributes.’ These are attributes that provide access to the implementation and are not intended for general use. Their definition may change in the future.
3.2.1. None ¶
This type has a single value. There is a single object with this value. This object is accessed through the built-in name
None
. It is used to signify the absence of a value in many situations, e.g., it is returned from functions that don’t explicitly return anything. Its truth value is false.
3.2.2. NotImplemented ¶
This type has a single value. There is a single object with this value. This object is accessed through the built-in name
NotImplemented
. Numeric methods and rich comparison methods should return this value if they do not implement the operation for the operands provided. (The interpreter will then try the reflected operation, or some other fallback, depending on the operator.) It should not be evaluated in a boolean context.
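See Implementing the arithmetic operations for more details.
Changed in version 3.9:
Evaluating
NotImplemented
in a boolean context is deprecated. While it currently evaluates as true, it will emit a
DeprecationWarning
. It will raise a
TypeError
in a future version of Python.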
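A minimal sketch of the protocol, using a hypothetical Meters class that is not part of any library: each method returns NotImplemented for operand types it does not handle and leaves the fallback to the interpreter.

```python
class Meters:
    """Hypothetical value type used only for illustration."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Meters):
            return NotImplemented   # let the interpreter try other options
        return self.value == other.value

    def __add__(self, other):
        if not isinstance(other, Meters):
            return NotImplemented
        return Meters(self.value + other.value)


print(Meters(2) == Meters(2))   # True
print(Meters(2) == 2)           # False: both sides decline, identity fallback
print((Meters(1) + Meters(2)).value)  # 3

try:
    Meters(1) + 1               # no operand implements the addition
except TypeError as exc:
    print("unsupported:", exc)
```

Note that the methods return the NotImplemented object; they never test it in a boolean context.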
3.2.3. Ellipsis ¶
This type has a single value. There is a single object with this value. This object is accessed through the literal
...
or the built-in name
Ellipsis
. Its truth value is true.
3.2.4. numbers.Number ¶
These are created by numeric literals and returned as results by arithmetic operators and arithmetic built-in functions. Numeric objects are immutable; once created their value never changes. Python numbers are of course strongly related to mathematical numbers, but subject to the limitations of numerical representation in computers.
The string representations of the numeric classes, computed by
__repr__()
and
__str__()
, have the following properties:
- They are valid numeric literals which, when passed to their class constructor, produce an object having the value of the original numeric.
- The representation is in base 10, when possible.
- Leading zeros, possibly excepting a single zero before a decimal point, are not shown.
- Trailing zeros, possibly excepting a single zero after a decimal point, are not shown.
- A sign is shown only when the number is negative.
Python distinguishes between integers, floating-point numbers, and complex numbers:
3.2.4.1. numbers.Integral ¶
These represent elements from the mathematical set of integers (positive and negative).
Note
The rules for integer representation are intended to give the most meaningful interpretation of shift and mask operations involving negative integers.
There are two types of integers:
- Integers (int)
  These represent numbers in an unlimited range, subject to available (virtual) memory only. For the purpose of shift and mask operations, a binary representation is assumed, and negative numbers are represented in a variant of 2's complement which gives the illusion of an infinite string of sign bits extending to the left.
- Booleans (bool)
  These represent the truth values False and True. The two objects representing the values False and True are the only Boolean objects. The Boolean type is a subtype of the integer type, and Boolean values behave like the values 0 and 1, respectively, in almost all contexts, the exception being that when converted to a string, the strings "False" or "True" are returned, respectively.
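The following sketch illustrates the points above: Booleans acting as the integers 0 and 1, the unlimited integer range, and the "infinite sign bits" view of negative numbers in shift and mask operations.

```python
print(issubclass(bool, int))     # True: bool is a subtype of int
print(True + True)               # 2
print(["no", "yes"][True])       # yes: True indexes like 1
print(str(True), str(1))         # True 1: only the string form differs

print((2 ** 200).bit_length())   # 201: far beyond a machine word
print(-1 >> 100)                 # -1: sign bits extend indefinitely
print(-5 & 0xFF)                 # 251: low 8 bits of the 2's complement form
```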
3.2.4.2. numbers.Real (float) ¶
These represent machine-level double precision floating-point numbers. You are at the mercy of the underlying machine architecture (and C or Java implementation) for the accepted range and handling of overflow. Python does not support single-precision floating-point numbers; the savings in processor and memory usage that are usually the reason for using these are dwarfed by the overhead of using objects in Python, so there is no reason to complicate the language with two kinds of floating-point numbers.
3.2.4.3. numbers.Complex (complex) ¶
These represent complex numbers as a pair of machine-level double precision floating-point numbers. The same caveats apply as for floating-point numbers. The real and imaginary parts of a complex number
z
can be retrieved through the read-only attributes
z.real
and
z.imag
.
3.2.5. Sequences ¶
These represent finite ordered sets indexed by non-negative numbers. The built-in function
len()
returns the number of items of a sequence. When the length of a sequence is
n
, the index set contains the numbers 0, 1, …,
n
-1. Item
i
of sequence
a
is selected by
a[i]
. Some sequences, including built-in sequences, interpret negative subscripts by adding the sequence length. For example,
a[-2]
is equivalent to
a[n-2]
, the second to last item of sequence a with length
n
.
Sequences also support slicing:
a[i:j]
selects all items with index
k
such that
i
<=
k
<
j
. When used as an expression, a slice is a sequence of the same type. The comment above about negative indexes also applies to negative slice positions.
Some sequences also support “extended slicing” with a third “step” parameter:
a[i:j:k]
selects all items of
a
with index
x
where
x = i + n*k
,
n
>=
0
and
i
<=
x
<
j
.
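The indexing and slicing rules above can be sketched on a small list (the sample data is arbitrary). The final lines recompute an extended slice from the index formula x = i + n*k:

```python
a = ["p", "q", "r", "s", "t"]
n = len(a)

print(a[-2] == a[n - 2])     # True: negative index counts from the end
print(a[1:3])                # ['q', 'r']: indexes k with 1 <= k < 3
print(a[-3:-1])              # ['r', 's']: negative slice positions
print(a[0:5:2])              # ['p', 'r', 't']: extended slice with step 2
print(type((1, 2, 3)[0:2]))  # <class 'tuple'>: a slice keeps the sequence type

# The same extended slice, computed from x = i + n*k with i <= x < j.
i, j, k = 0, 5, 2
picked = [a[x] for x in range(i, j, k)]
print(picked == a[i:j:k])    # True
```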
Sequences are distinguished according to their mutability:
3.2.5.1. Immutable sequences ¶
An object of an immutable sequence type cannot change once it is created. (If the object contains references to other objects, these other objects may be mutable and may be changed; however, the collection of objects directly referenced by an immutable object cannot change.)
The following types are immutable sequences:
- Strings
  A string is a sequence of values that represent Unicode code points. All the code points in the range U+0000 - U+10FFFF can be represented in a string. Python doesn't have a char type; instead, every code point in the string is represented as a string object with length 1. The built-in function ord() converts a code point from its string form to an integer in the range 0 - 10FFFF; chr() converts an integer in the range 0 - 10FFFF to the corresponding length 1 string object. str.encode() can be used to convert a str to bytes using the given text encoding, and bytes.decode() can be used to achieve the opposite.
- Tuples
  The items of a tuple are arbitrary Python objects. Tuples of two or more items are formed by comma-separated lists of expressions. A tuple of one item (a 'singleton') can be formed by affixing a comma to an expression (an expression by itself does not create a tuple, since parentheses must be usable for grouping of expressions). An empty tuple can be formed by an empty pair of parentheses.
- Bytes
  A bytes object is an immutable array. The items are 8-bit bytes, represented by integers in the range 0 <= x < 256. Bytes literals (like b'abc') and the built-in bytes() constructor can be used to create bytes objects. Also, bytes objects can be decoded to strings via the decode() method.
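A short sketch of the three immutable sequence types. The sample text is arbitrary, and UTF-8 is chosen here only as an example encoding:

```python
text = "h\u00e9llo"                # contains U+00E9
print(type(text[1]), len(text[1]))  # <class 'str'> 1: no separate char type
print(ord(text[1]))                # 233
print(chr(233) == text[1])         # True

encoded = text.encode("utf-8")     # str -> bytes
print(len(text), len(encoded))     # 5 6: U+00E9 takes two bytes in UTF-8
print(encoded[0])                  # 104: items of bytes are integers
print(encoded.decode("utf-8") == text)  # True

singleton = 1,                     # the trailing comma makes the tuple
grouped = (1)                      # parentheses alone only group
empty = ()
print(type(singleton), type(grouped), len(empty))
```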
3.2.5.2. Mutable sequences ¶
Mutable sequences can be changed after they are created. The subscription and slicing notations can be used as the target of assignment and
del
(delete) statements.
Note
The
collections
and
array
modules provide additional examples of mutable sequence types.
There are currently two intrinsic mutable sequence types:
- Lists
  The items of a list are arbitrary Python objects. Lists are formed by placing a comma-separated list of expressions in square brackets. (Note that there are no special cases needed to form lists of length 0 or 1.)
- Byte Arrays
  A bytearray object is a mutable array. They are created by the built-in bytearray() constructor. Aside from being mutable (and hence unhashable), byte arrays otherwise provide the same interface and functionality as immutable bytes objects.
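Subscription and slicing as targets of assignment and del can be sketched as follows, first on a list and then on a bytearray (the values are arbitrary):

```python
items = [0, 1, 2, 3, 4]
items[0] = "zero"              # subscription as assignment target
items[1:3] = ["a", "b", "c"]   # slice assignment may change the length
print(items)                   # ['zero', 'a', 'b', 'c', 3, 4]

del items[-1]                  # delete a single item
del items[::2]                 # delete every second remaining item
print(items)                   # ['a', 'c']

buffer = bytearray(b"abc")
buffer[0] = ord("x")           # items are integers in range(256)
buffer.extend(b"!")
print(bytes(buffer))           # b'xbc!'

try:
    hash(buffer)               # mutable, hence unhashable
except TypeError as exc:
    print("unhashable:", exc)
```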
3.2.6. Set types ¶
These represent unordered, finite sets of unique, immutable objects. As such, they cannot be indexed by any subscript. However, they can be iterated over, and the built-in function
len()
returns the number of items in a set. Common uses for sets are fast membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference.
For set elements, the same immutability rules apply as for dictionary keys. Note that numeric types obey the normal rules for numeric comparison: if two numbers compare equal (e.g.,
1
and
1.0
), only one of them can be contained in a set.
There are currently two intrinsic set types:
- Sets
  These represent a mutable set. They are created by the built-in set() constructor and can be modified afterwards by several methods, such as add().
- Frozen sets
  These represent an immutable set. They are created by the built-in frozenset() constructor. As a frozenset is immutable and hashable, it can be used again as an element of another set, or as a dictionary key.
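The behaviour described in this section can be sketched with a few small sets. Results are passed through sorted() for display because set iteration order is not something to rely on:

```python
numbers = {1, 1.0}
print(len(numbers))                       # 1: 1 and 1.0 compare equal

print(sorted(set([3, 1, 3, 2, 1])))       # [1, 2, 3]: duplicates removed

evens, small = {0, 2, 4}, {0, 1, 2}
print(sorted(evens & small))              # [0, 2]        intersection
print(sorted(evens | small))              # [0, 1, 2, 4]  union
print(sorted(evens - small))              # [4]           difference
print(sorted(evens ^ small))              # [1, 4]        symmetric difference

pair = frozenset({1, 2})
groups = {pair, frozenset({3})}           # frozensets as set elements
labels = {pair: "pair"}                   # and as dictionary keys
print(labels[frozenset({2, 1})])          # pair

try:
    {set()}                               # a mutable set is unhashable
except TypeError as exc:
    print("rejected:", exc)
```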
3.2.7. Mappings ¶
These represent finite sets of objects indexed by arbitrary index sets. The subscript notation
a[k]
selects the item indexed by
k
from the mapping
a
; this can be used in expressions and as the target of assignments or
del
statements. The built-in function
len()
returns the number of items in a mapping.
There is currently a single intrinsic mapping type:
3.2.7.1. Dictionaries ¶
These represent finite sets of objects indexed by nearly arbitrary values. The only types of values not acceptable as keys are values containing lists or dictionaries or other mutable types that are compared by value rather than by object identity, the reason being that the efficient implementation of dictionaries requires a key’s hash value to remain constant. Numeric types used for keys obey the normal rules for numeric comparison: if two numbers compare equal (e.g.,
1
and
1.0
) then they can be used interchangeably to index the same dictionary entry.
Dictionaries preserve insertion order, meaning that iterating over a dictionary produces its keys in the order in which they were first added. Replacing the value of an existing key does not change the order; however, removing a key and re-inserting it adds it to the end instead of keeping its old place.
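A brief sketch of the key and ordering rules for dictionaries (keys and values are arbitrary):

```python
order = {"a": 1, "b": 2, "c": 3}
order["a"] = 10                 # replacing a value keeps the position
print(list(order))              # ['a', 'b', 'c']

del order["a"]
order["a"] = 1                  # re-inserting moves the key to the end
print(list(order))              # ['b', 'c', 'a']

numeric = {1: "int"}
numeric[1.0] = "float"          # 1 and 1.0 index the same entry
print(len(numeric), numeric[1]) # 1 float

try:
    {[1, 2]: "x"}               # a list is mutable and compared by value
except TypeError as exc:
    print("unusable key:", exc)
```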
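Dictionaries are mutable; they can be created by the
{}
notation (see section
Dictionary displays
).
The extension modules
dbm.ndbm
and
dbm.gnu
provide additional examples of mapping types, as does the
collections
module.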
Changed in version 3.7: Dictionaries did not preserve insertion order in versions of Python before 3.6. In CPython 3.6, insertion order was preserved, but it was considered an implementation detail at that time rather than a language guarantee.
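3.2.8. Callable types ¶
These are the types to which the function call operation (see section Calls) can be applied:
3.2.8.1. User-defined functions ¶
A user-defined function object is created by a function definition (see section Function definitions). It should be called with an argument list containing the same number of items as the function's formal parameter list.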
3.2.8.1.1. Special read-only attributes ¶
| Attribute | Meaning |
|---|---|
| | A reference to the |
| | A cell object has the attribute |
3.2.8.1.2. Special writable attributes ¶
Most of these attributes check the type of the assigned value:
| Attribute | Meaning |
|---|---|
| | The function's documentation string, or |
| | The function's name. See also: |
| | The function's qualified name. See also: Added in version 3.3. |
| | The name of the module the function was defined in, or |
| | A |
| | The code object representing the compiled function body. |
| | The namespace supporting arbitrary function attributes. See also: |
| | A |
| | A |
| | A Added in version 3.12. |
Function objects also support getting and setting arbitrary attributes, which can be used, for example, to attach metadata to functions. Regular attribute dot-notation is used to get and set such attributes.
CPython 实现细节: CPython’s current implementation only supports function attributes on user-defined functions. Function attributes on built-in functions may be supported in the future.
Additional information about a function’s definition can be retrieved from its
code object
(accessible via the
__code__
attribute).
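As a sketch of how such attributes are read and written, the example below inspects a small hypothetical function, greet. The attribute names used (__name__, __doc__, __defaults__, __code__, __dict__) are standard function attributes; the function and the audience attribute are assumptions made for illustration.

```python
def greet(name, punctuation="!"):
    """Return a greeting."""
    return f"Hello, {name}{punctuation}"


print(greet.__name__)                # greet
print(greet.__doc__)                 # Return a greeting.
print(greet.__defaults__)            # ('!',)

# Details of the definition are available from the code object.
print(greet.__code__.co_argcount)    # 2
print(greet.__code__.co_varnames)    # ('name', 'punctuation')

# Arbitrary attributes are stored in the function's namespace.
greet.audience = "demo"
print(greet.__dict__)                # {'audience': 'demo'}
```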
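3.2.8.2. Instance methods ¶
An instance method object combines a class, a class instance and any callable object (normally a user-defined function).
Special read-only attributes: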
| Attribute | Meaning |
|---|---|
| | Refers to the class instance object to which the method is bound |
| | Refers to the original function object |
| | The method's documentation (same as |
| | The name of the method (same as |
| | The name of the module the method was defined in, or |
Methods also support accessing (but not setting) the arbitrary function attributes on the underlying function object.
User-defined method objects may be created when getting an attribute of a class (perhaps via an instance of that class), if that attribute is a user-defined
function object
or
classmethod
object.
When an instance method object is created by retrieving a user-defined
function object
from a class via one of its instances, its
__self__
attribute is the instance, and the method object is said to be
bound
. The new method’s
__func__
attribute is the original function object.
When an instance method object is created by retrieving a
classmethod
object from a class or instance, its
__self__
attribute is the class itself, and its
__func__
attribute is the function object underlying the class method.
When an instance method object is called, the underlying function (
__func__
) is called, inserting the class instance (
__self__
) in front of the argument list. For instance, when
C
is a class which contains a definition for a function
f()
, and
x
is an instance of
C
, calling
x.f(1)
is equivalent to calling
C.f(x, 1)
.
When an instance method object is derived from a
classmethod
object, the “class instance” stored in
__self__
will actually be the class itself, so that calling either
x.f(1)
or
C.f(1)
is equivalent to calling
f(C,1)
where
f
is the underlying function.
It is important to note that user-defined functions which are attributes of a class instance are not converted to bound methods; this only happens when the function is an attribute of the class.
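The binding rules above can be sketched with a small class C. The method names f and g and the helper standalone are illustrative only:

```python
import types


class C:
    def f(self, value):
        return (self, value)

    @classmethod
    def g(cls, value):
        return (cls, value)


x = C()

bound = x.f
print(bound.__self__ is x)          # True: bound to the instance
print(bound.__func__ is C.f)        # True: the original function object
print(x.f(1) == C.f(x, 1))          # True

print(x.g.__self__ is C)            # True: a classmethod binds the class
print(x.g(1) == C.g(1) == (C, 1))   # True


def standalone(value):
    return value


x.h = standalone                    # function stored on the instance
print(isinstance(x.h, types.FunctionType))  # True: not a bound method
print(x.h(5))                       # 5: no instance is inserted
```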
3.2.8.3. Generator functions ¶
A function or method which uses the
yield
statement (see section
The yield statement
) is called a
generator function
. Such a function, when called, always returns an
iterator
object which can be used to execute the body of the function: calling the iterator’s
iterator.__next__()
method will cause the function to execute until it provides a value using the
yield
statement. When the function executes a
return
statement or falls off the end, a
StopIteration
exception is raised and the iterator will have reached the end of the set of values to be returned.
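A minimal sketch of this protocol, using a hypothetical countdown generator function and driving the iterator by hand with next():

```python
def countdown(start):
    """Yield start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1


iterator = countdown(2)     # calling the function runs none of its body yet
print(next(iterator))       # 2: runs until the first yield
print(next(iterator))       # 1

try:
    next(iterator)          # the body falls off the end
except StopIteration:
    print("exhausted")

print(list(countdown(3)))   # [3, 2, 1]: a for loop or list() handles this
```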
3.2.8.4. Coroutine functions ¶
A function or method which is defined using
async def
is called a
coroutine function
. Such a function, when called, returns a
coroutine
object. It may contain
await
expressions, as well as
async with
and
async for
statements. See also the
Coroutine Objects
section.
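A small sketch, assuming asyncio as the event loop (the language itself does not prescribe one) and a hypothetical fetch_value function:

```python
import asyncio
import inspect


async def fetch_value():
    await asyncio.sleep(0)   # an await expression inside the body
    return 42


print(inspect.iscoroutinefunction(fetch_value))  # True

coroutine = fetch_value()    # calling returns a coroutine object
print(inspect.iscoroutine(coroutine))            # True

result = asyncio.run(coroutine)  # the body only runs when driven by a loop
print(result)                # 42
```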
3.2.8.5. Asynchronous generator functions ¶
A function or method which is defined using
async def
and which uses the
yield
statement is called an
asynchronous generator function
. Such a function, when called, returns an
asynchronous iterator
object which can be used in an
async for
statement to execute the body of the function.
Calling the asynchronous iterator's
aiterator.__anext__
method will return an
awaitable
which when awaited will execute until it provides a value using the
yield
expression. When the function executes an empty
return
statement or falls off the end, a
StopAsyncIteration
exception is raised and the asynchronous iterator will have reached the end of the set of values to be yielded.
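Both ways of consuming an asynchronous generator can be sketched as follows. The names ticker, collect and manual are illustrative, and asyncio is assumed as the event loop:

```python
import asyncio


async def ticker(limit):
    for number in range(limit):
        await asyncio.sleep(0)
        yield number             # async def + yield: asynchronous generator


async def collect():
    # async for drives the asynchronous iterator to completion.
    return [number async for number in ticker(3)]


async def manual():
    agen = ticker(1)
    first = await agen.__anext__()   # awaitable that runs to the next yield
    try:
        await agen.__anext__()       # the body falls off the end
    except StopAsyncIteration:
        finished = True
    else:
        finished = False
    return first, finished


values = asyncio.run(collect())
print(values)                        # [0, 1, 2]
print(asyncio.run(manual()))         # (0, True)
```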
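3.2.8.6. Built-in functions ¶
A built-in function object is a wrapper around a C function. Examples of built-in functions are
len()
and
math.sin()
(
math
is a standard built-in module). The number and type of the arguments are determined by the C function. Special read-only attributes: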
- __doc__ is the function's documentation string, or None if unavailable. See function.__doc__.
- __name__ is the function's name. See function.__name__.
- __self__ is set to None (but see the next item).
- __module__ is the name of the module the function was defined in, or None if unavailable. See function.__module__.
3.2.8.7. Built-in methods ¶
This is really a different disguise of a built-in function, this time containing an object passed to the C function as an implicit extra argument. An example of a built-in method is
alist.append()
, assuming
alist
is a list object. In this case, the special read-only attribute
__self__
is set to the object denoted by
alist
. (The attribute has the same semantics as it does with
other instance methods
.)
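The list example from the text, as a short sketch; math.sin and len are included only to show the __name__ attribute of built-in functions:

```python
import math

alist = [1, 2]
method = alist.append            # a built-in method bound to alist

print(method.__self__ is alist)  # True: the implicit extra argument
method(3)
print(alist)                     # [1, 2, 3]

print(len.__name__, math.sin.__name__)  # len sin
```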
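3.2.8.8. Classes ¶
Classes are callable. These objects normally act as factories for new instances of themselves, but variations are possible for class types that override
__new__()
. The arguments of the call are passed to
__new__()
and, in the typical case, to
__init__()
to initialize the new instance.
3.2.8.9. Class instances ¶
Instances of arbitrary classes can be made callable by defining a
__call__()
method in their class.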
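The two kinds of callables described above can be sketched with two hypothetical classes: Point, which overrides __new__() and __init__(), and Multiplier, whose instances are callable through __call__().

```python
class Point:
    def __new__(cls, x, y):
        # Receives the same arguments as the call Point(x, y).
        instance = super().__new__(cls)
        instance.created_by_new = True
        return instance

    def __init__(self, x, y):
        # Typical case: initialises the instance returned by __new__().
        self.x = x
        self.y = y


class Multiplier:
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        return value * self.factor


point = Point(1, 2)                  # calling the class creates an instance
print(point.created_by_new, point.x, point.y)  # True 1 2

double = Multiplier(2)
print(callable(double), double(21))  # True 42
```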
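3.2.9. Modules ¶
Modules are a basic organizational unit of Python code, and are created by the
import system
as invoked either by the
import
statement, or by calling functions such as
importlib.import_module()
and built-in
__import__()
. A module object has a namespace implemented by a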
dictionary
object (this is the dictionary referenced by the
__globals__
attribute of functions defined in the module). Attribute references are translated to lookups in this dictionary, e.g.,
m.x
is equivalent to
m.__dict__["x"]
. A module object does not contain the code object used to initialize the module (since it isn’t needed once the initialization is done).
Attribute assignment updates the module's namespace dictionary, e.g.,
m.x = 1
is equivalent to
m.__dict__["x"] = 1
.
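The equivalence between attribute access and the namespace dictionary can be sketched with a module object created by hand. The module name demo_module is hypothetical; json is used only as a convenient standard-library module whose functions are defined in Python.

```python
import json
import types

m = types.ModuleType("demo_module")   # hypothetical module, not imported

m.x = 1                               # attribute assignment ...
print(m.__dict__["x"])                # 1  ... updates the namespace dict

m.__dict__["y"] = 2                   # writing to the dict ...
print(m.y)                            # 2  ... is visible as an attribute

# Functions defined in a module reference that module's namespace.
print(json.dumps.__globals__ is json.__dict__)  # True
```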
- module.__name__ ¶
  The name used to uniquely identify the module in the import system. For a directly executed module, this will be set to "__main__".
  This attribute must be set to the fully qualified name of the module. It is expected to match the value of module.__spec__.name.
- module.__spec__ ¶
  A record of the module's import-system-related state.
  Set to the module spec that was used when importing the module. See Module specs for more details.
  Added in version 3.4.
- module.__package__ ¶
  The package a module belongs to.
  If the module is top-level (that is, not a part of any specific package) then the attribute should be set to '' (the empty string). Otherwise, it should be set to the name of the module's package (which can be equal to module.__name__ if the module itself is a package). See PEP 366 for further details.
  This attribute is used instead of __name__ to calculate explicit relative imports for main modules. It defaults to None for modules created dynamically using the types.ModuleType constructor; use importlib.util.module_from_spec() instead to ensure the attribute is set to a str.
  It is strongly recommended that you use module.__spec__.parent instead of module.__package__. __package__ is now only used as a fallback if __spec__.parent is not set, and this fallback path is deprecated.
  Changed in version 3.4: This attribute now defaults to None for modules created dynamically using the types.ModuleType constructor. Previously the attribute was optional.
  Changed in version 3.6: The value of __package__ is expected to be the same as __spec__.parent. __package__ is now only used as a fallback during import resolution if __spec__.parent is not defined.
  Changed in version 3.10: ImportWarning is raised if an import resolution falls back to __package__ instead of __spec__.parent.
  Changed in version 3.12: Raise DeprecationWarning instead of ImportWarning when falling back to __package__ during import resolution.
  Deprecated since version 3.13, will be removed in version 3.15: __package__ will cease to be set or taken into consideration by the import system or standard library.
- module.__loader__ ¶
  The loader object that the import machinery used to load the module.
  This attribute is mostly useful for introspection, but can be used for additional loader-specific functionality, for example getting data associated with a loader.
  __loader__ defaults to None for modules created dynamically using the types.ModuleType constructor; use importlib.util.module_from_spec() instead to ensure the attribute is set to a loader object.
  It is strongly recommended that you use module.__spec__.loader instead of module.__loader__.
  Changed in version 3.4: This attribute now defaults to None for modules created dynamically using the types.ModuleType constructor. Previously the attribute was optional.
  Deprecated since version 3.12, will be removed in version 3.16: Setting __loader__ on a module while failing to set __spec__.loader is deprecated. In Python 3.16, __loader__ will cease to be set or taken into consideration by the import system or the standard library.
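Following the recommendation to prefer the spec over the legacy attributes, this sketch reads the import-related state through __spec__. The json package is used only as a convenient standard-library example, and "scratch" is a hypothetical module name.

```python
import importlib.machinery
import importlib.util
import json.decoder

# A package: its spec name matches __name__, and its parent is itself.
print(json.__spec__.name == json.__name__)      # True
print(json.__spec__.parent)                     # json

# A submodule: the parent is the containing package.
print(json.decoder.__spec__.name)               # json.decoder
print(json.decoder.__spec__.parent)             # json
print(json.decoder.__spec__.loader is not None) # True

# A module built from a spec rather than from types.ModuleType directly.
spec = importlib.machinery.ModuleSpec("scratch", None)
module = importlib.util.module_from_spec(spec)
print(module.__spec__ is spec)                  # True
print(repr(module.__spec__.parent))             # '': a top-level module
```

The deprecated __package__ and __loader__ attributes are deliberately not read here, since their presence depends on the Python version.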