Python allows the writer of a C extension module to define new types that can be manipulated from Python code, much like the built-in
str
and
list
types. The code for all extension types follows a pattern, but there are some details that you need to understand before you can get started. This document is a gentle introduction to the topic.
The CPython
runtime sees all Python objects as variables of type
PyObject*
, which serves as a “base type” for all Python objects. The
PyObject
structure itself only contains the object’s
reference count
and a pointer to the object’s “type object”. This is where the action is; the type object determines which (C) functions get called by the interpreter when, for instance, an attribute gets looked up on an object, a method called, or it is multiplied by another object. These C functions are called “type methods”.
So, if you want to define a new extension type, you need to create a new type object.
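The same two pieces of per-object state can be observed from Python code. The sketch below is illustrative only: the `Scaled` class is a hypothetical example that is not part of this tutorial's module, and `sys.getrefcount()` reports a value that includes the temporary reference created by the call itself, so only the difference between two readings is meaningful.

```python
import sys


class Scaled:
    """Hypothetical example type; its __mul__ plays the role of a type method."""

    def __init__(self, value):
        self.value = value

    def __mul__(self, other):
        # The interpreter finds this function through the object's type.
        return Scaled(self.value * other)


x = Scaled(3)
obj_type = type(x)          # the type object that x points to
product = x * 4             # dispatched through type(x), i.e. Scaled.__mul__

refs_before = sys.getrefcount(x)
alias = x                   # one more reference to the same object
refs_after = sys.getrefcount(x)

print(obj_type.__name__, product.value, refs_after - refs_before)
```

An extension type fills the same role in C: the multiplication above ends up in a function pointer stored in the type object.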
This sort of thing can only be explained by example, so here’s a minimal, but complete, module that defines a new type named
Custom
inside a C extension module
custom
:
Note
What we’re showing here is the traditional way of defining
static
extension types. It should be adequate for most uses. The C API also allows defining heap-allocated extension types using the
PyType_FromSpec()
function, which isn’t covered in this tutorial.
#include <Python.h>
typedef struct {
PyObject_HEAD
/* Type-specific fields go here. */
} CustomObject;
static PyTypeObject CustomType = {
PyVarObject_HEAD_INIT(NULL, 0)
.tp_name = "custom.Custom",
.tp_doc = "Custom objects",
.tp_basicsize = sizeof(CustomObject),
.tp_itemsize = 0,
.tp_flags = Py_TPFLAGS_DEFAULT,
.tp_new = PyType_GenericNew,
};
static PyModuleDef custommodule = {
PyModuleDef_HEAD_INIT,
.m_name = "custom",
.m_doc = "Example module that creates an extension type.",
.m_size = -1,
};
PyMODINIT_FUNC
PyInit_custom(void)
{
PyObject *m;
if (PyType_Ready(&CustomType) < 0)
return NULL;
m = PyModule_Create(&custommodule);
if (m == NULL)
return NULL;
Py_INCREF(&CustomType);
PyModule_AddObject(m, "Custom", (PyObject *) &CustomType);
return m;
}
Now that’s quite a bit to take in at once, but hopefully bits will seem familiar from the previous chapter. This file defines three things:
What a Custom object contains: this is the CustomObject struct, which is allocated once for each Custom instance.
How the Custom type behaves: this is the CustomType struct, which defines a set of flags and function pointers that the interpreter inspects when specific operations are requested.
How to initialize the custom module: this is the PyInit_custom function and the associated custommodule struct.
The first bit is:
typedef struct {
PyObject_HEAD
} CustomObject;
This is what a Custom object will contain.
PyObject_HEAD
is mandatory at the start of each object struct and defines a field called
ob_base
of type
PyObject
, containing a pointer to a type object and a reference count (these can be accessed using the macros
Py_TYPE
and
Py_REFCNT
respectively). The reason for the macro is to abstract away the layout and to enable additional fields in debug builds.
Note
There is no semicolon above after the
PyObject_HEAD
macro. Be wary of adding one by accident: some compilers will complain.
Of course, objects generally store additional data besides the standard
PyObject_HEAD
boilerplate; for example, here is the definition for standard Python floats:
typedef struct {
PyObject_HEAD
double ob_fval;
} PyFloatObject;
The second bit is the definition of the type object.
static PyTypeObject CustomType = {
PyVarObject_HEAD_INIT(NULL, 0)
.tp_name = "custom.Custom",
.tp_doc = "Custom objects",
.tp_basicsize = sizeof(CustomObject),
.tp_itemsize = 0,
.tp_flags = Py_TPFLAGS_DEFAULT,
.tp_new = PyType_GenericNew,
};
Note
We recommend using C99-style designated initializers as above, to avoid listing all the
PyTypeObject
fields that you don’t care about and also to avoid caring about the fields’ declaration order.
The actual definition of
PyTypeObject
in
object.h
has many more
fields
than the definition above. The remaining fields will be filled with zeros by the C compiler, and it’s common practice to not specify them explicitly unless you need them.
We’re going to pick it apart, one field at a time:
PyVarObject_HEAD_INIT(NULL, 0)
This line is mandatory boilerplate to initialize the
ob_base
field mentioned above.
.tp_name = "custom.Custom",
The name of our type. This will appear in the default textual representation of our objects and in some error messages, for example:
>>> "" + custom.Custom()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "custom.Custom") to str
Note that the name is a dotted name that includes both the module name and the name of the type within the module. The module in this case is
custom
and the type is
Custom
, so we set the type name to
custom.Custom
. Using the real dotted import path is important to make your type compatible with the
pydoc
and
pickle
modules.
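The effect of a correct dotted name can be checked on an existing type. The sketch below uses `collections.OrderedDict` purely as a stand-in (it is not part of this tutorial): pickle records a type by its module and name and imports it again when loading, so both parts must match the real import path.

```python
import pickle
from collections import OrderedDict

# The module part and the type part of the dotted name are exposed separately.
module_name = OrderedDict.__module__
type_name = OrderedDict.__name__

# pickle stores the type by reference (module + name) and looks it up on load.
original = OrderedDict(a=1, b=2)
restored = pickle.loads(pickle.dumps(original))

print(module_name, type_name, restored == original)
```

A type whose name does not match its import path would fail at the lookup step when unpickling.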
.tp_basicsize = sizeof(CustomObject),
.tp_itemsize = 0,
This is so that Python knows how much memory to allocate when creating new
Custom
instances.
tp_itemsize
is only used for variable-sized objects and should otherwise be zero.
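Both sizes are visible from Python as the `__basicsize__` and `__itemsize__` attributes of a type object. The following sketch compares a fixed-size built-in type with a variable-sized one; the exact byte counts depend on the platform and build, so it only prints them.

```python
# float is a fixed-size type: it adds one C double to the object header.
float_basic = float.__basicsize__
float_item = float.__itemsize__

# tuple is variable-sized: each element adds __itemsize__ bytes.
tuple_item = tuple.__itemsize__

print(float_basic, float_item, tuple_item)
print(float_basic > object.__basicsize__)
```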
Note
If you want your type to be subclassable from Python, and your type has the same
tp_basicsize
as its base type, you may have problems with multiple inheritance. A Python subclass of your type will have to list your type first in its
__bases__
, or else it will not be able to call your type’s
__new__()
method without getting an error. You can avoid this problem by ensuring that your type has a larger value for
tp_basicsize
than its base type does. Most of the time, this will be true anyway, because either your base type will be
object
, or else you will be adding data members to your base type, and therefore increasing its size.
We set the class flags to
Py_TPFLAGS_DEFAULT
.
.tp_flags = Py_TPFLAGS_DEFAULT,
All types should include this constant in their flags. It enables all of the members defined until at least Python 3.3. If you need further members, you will need to OR the corresponding flags.
We provide a doc string for the type in
tp_doc
.
.tp_doc = "Custom objects",
To enable object creation, we have to provide a
tp_new
handler. This is the equivalent of the Python method
__new__()
, but has to be specified explicitly. In this case, we can just use the default implementation provided by the API function
PyType_GenericNew()
.
.tp_new = PyType_GenericNew,
Everything else in the file should be familiar, except for some code in
PyInit_custom()
:
if (PyType_Ready(&CustomType) < 0)
return NULL;
This initializes the
Custom
type, filling in a number of members to the appropriate default values, including
ob_type
that we initially set to
NULL
.
PyModule_AddObject(m, "Custom", (PyObject *) &CustomType);
This adds the type to the module dictionary. This allows us to create
Custom
instances by calling the
Custom
class:
>>> import custom
>>> mycustom = custom.Custom()
That’s it! All that remains is to build it; put the above code in a file called
custom.c
and:
from distutils.core import setup, Extension
setup(name="custom", version="1.0",
ext_modules=[Extension("custom", ["custom.c"])])
in a file called
setup.py
; then typing
$ python setup.py build
at a shell should produce a file
custom.so
in a subdirectory; move to that directory and fire up Python — you should be able to
import
custom
and play around with Custom objects.
That wasn’t so hard, was it?
Of course, the current Custom type is pretty uninteresting. It has no data and doesn’t do anything. It can’t even be subclassed.
Note
While this documentation showcases the standard
distutils
module for building C extensions (deprecated since Python 3.10 and removed in Python 3.12), it is recommended in real-world use cases to use the newer and better-maintained
setuptools
library. Documentation on how to do this is out of scope for this document and can be found in the
Python Packaging User’s Guide
.
Let’s extend the basic example to add some data and methods. Let’s also make the type usable as a base class. We’ll create a new module,
custom2
that adds these capabilities:
#include <Python.h>
#include "structmember.h"
typedef struct {
PyObject_HEAD
PyObject *first; /* first name */
PyObject *last; /* last name */
int number;
} CustomObject;
static void
Custom_dealloc(CustomObject *self)
{
Py_XDECREF(self->first);
Py_XDECREF(self->last);
Py_TYPE(self)->tp_free((PyObject *) self);
}
static PyObject *
Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
CustomObject *self;
self = (CustomObject *) type->tp_alloc(type, 0);
if (self != NULL) {
self->first = PyUnicode_FromString("");
if (self->first == NULL) {
Py_DECREF(self);
return NULL;
}
self->last = PyUnicode_FromString("");
if (self->last == NULL) {
Py_DECREF(self);
return NULL;
}
self->number = 0;
}
return (PyObject *) self;
}
static int
Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
{
static char *kwlist[] = {"first", "last", "number", NULL};
PyObject *first = NULL, *last = NULL, *tmp;
if (!PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
&first, &last,
&self->number))
return -1;
if (first) {
tmp = self->first;
Py_INCREF(first);
self->first = first;
Py_XDECREF(tmp);
}
if (last) {
tmp = self->last;
Py_INCREF(last);
self->last = last;
Py_XDECREF(tmp);
}
return 0;
}
static PyMemberDef Custom_members[] = {
{"first", T_OBJECT_EX, offsetof(CustomObject, first), 0,
"first name"},
{"last", T_OBJECT_EX, offsetof(CustomObject, last), 0,
"last name"},
{"number", T_INT, offsetof(CustomObject, number), 0,
"custom number"},
{NULL} /* Sentinel */
};
static PyObject *
Custom_name(CustomObject *self, PyObject *Py_UNUSED(ignored))
{
if (self->first == NULL) {
PyErr_SetString(PyExc_AttributeError, "first");
return NULL;
}
if (self->last == NULL) {
PyErr_SetString(PyExc_AttributeError, "last");
return NULL;
}
return PyUnicode_FromFormat("%S %S", self->first, self->last);
}
static PyMethodDef Custom_methods[] = {
{"name", (PyCFunction) Custom_name, METH_NOARGS,
"Return the name, combining the first and last name"
},
{NULL} /* Sentinel */
};
static PyTypeObject CustomType = {
PyVarObject_HEAD_INIT(NULL, 0)
.tp_name = "custom2.Custom",
.tp_doc = "Custom objects",
.tp_basicsize = sizeof(CustomObject),
.tp_itemsize = 0,
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
.tp_new = Custom_new,
.tp_init = (initproc) Custom_init,
.tp_dealloc = (destructor) Custom_dealloc,
.tp_members = Custom_members,
.tp_methods = Custom_methods,
};
static PyModuleDef custommodule = {
PyModuleDef_HEAD_INIT,
.m_name = "custom2",
.m_doc = "Example module that creates an extension type.",
.m_size = -1,
};
PyMODINIT_FUNC
PyInit_custom2(void)
{
PyObject *m;
if (PyType_Ready(&CustomType) < 0)
return NULL;
m = PyModule_Create(&custommodule);
if (m == NULL)
return NULL;
Py_INCREF(&CustomType);
PyModule_AddObject(m, "Custom", (PyObject *) &CustomType);
return m;
}
This version of the module has a number of changes.
We’ve added an extra include:
#include "structmember.h"
This include provides declarations that we use to handle attributes, as described a bit later.
The
Custom
type now has three data attributes in its C struct,
first
,
last
, and
number
. The
first
and
last
variables are Python strings containing first and last names. The
number
attribute is a C integer.
The object structure is updated accordingly:
typedef struct {
PyObject_HEAD
PyObject *first; /* first name */
PyObject *last; /* last name */
int number;
} CustomObject;
Because we now have data to manage, we have to be more careful about object allocation and deallocation. At a minimum, we need a deallocation method:
static void
Custom_dealloc(CustomObject *self)
{
Py_XDECREF(self->first);
Py_XDECREF(self->last);
Py_TYPE(self)->tp_free((PyObject *) self);
}
which is assigned to the
tp_dealloc
member:
.tp_dealloc = (destructor) Custom_dealloc,
This method first decrements the reference counts of the two Python attributes.
Py_XDECREF()
correctly handles the case where its argument is
NULL
(which might happen here if
tp_new
failed midway). It then calls the
tp_free
member of the object’s type (computed by
Py_TYPE(self)
) to free the object’s memory. Note that the object’s type might not be
CustomType
, because the object may be an instance of a subclass.
Note
The explicit cast to
destructor
above is needed because we defined
Custom_dealloc
to take a
CustomObject
*
argument, but the
tp_dealloc
function pointer expects to receive a
PyObject
*
argument. Otherwise, the compiler will emit a warning. This is object-oriented polymorphism, in C!
We want to make sure that the first and last names are initialized to empty strings, so we provide a
tp_new
implementation:
static PyObject *
Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
CustomObject *self;
self = (CustomObject *) type->tp_alloc(type, 0);
if (self != NULL) {
self->first = PyUnicode_FromString("");
if (self->first == NULL) {
Py_DECREF(self);
return NULL;
}
self->last = PyUnicode_FromString("");
if (self->last == NULL) {
Py_DECREF(self);
return NULL;
}
self->number = 0;
}
return (PyObject *) self;
}
and install it in the
tp_new
member:
.tp_new = Custom_new,
The
tp_new
handler is responsible for creating (as opposed to initializing) objects of the type. It is exposed in Python as the
__new__()
method. It is not required to define a
tp_new
member, and indeed many extension types will simply reuse
PyType_GenericNew()
as done in the first version of the
Custom
type above. In this case, we use the
tp_new
handler to initialize the
first
and
last
attributes to non-
NULL
default values.
tp_new
is passed the type being instantiated (not necessarily
CustomType
, if a subclass is instantiated) and any arguments passed when the type was called, and is expected to return the instance created.
tp_new
handlers always accept positional and keyword arguments, but they often ignore the arguments, leaving the argument handling to initializer (a.k.a.
tp_init
in C or
__init__
in Python) methods.
Note
tp_new
shouldn’t call
tp_init
explicitly, as the interpreter will do it itself.
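The split between creation and initialization is the same one that `__new__` and `__init__` have in Python code. The sketch below is a hypothetical pure-Python analogue (the `Person` class and the `calls` list are illustrative names, not part of the extension module) that records the order in which the interpreter invokes the two steps.

```python
calls = []


class Person:
    """Hypothetical pure-Python analogue of the Custom type."""

    def __new__(cls, *args, **kwargs):
        # Creation: allocate the object and give attributes safe defaults.
        calls.append("new")
        self = super().__new__(cls)
        self.first = ""
        self.last = ""
        return self

    def __init__(self, first="", last=""):
        # Initialization: called by the interpreter after __new__ returns.
        calls.append("init")
        self.first = first
        self.last = last


p = Person("Ada", "Lovelace")
bare = Person.__new__(Person)   # creation only; __init__ is not run
print(calls, p.first, repr(bare.first))
```

Because `__new__` already stored defaults, the object created without `__init__` still has usable attributes, which is the reason the C version initializes `first` and `last` in its `tp_new` handler.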
The
tp_new
implementation calls the
tp_alloc
slot to allocate memory:
self = (CustomObject *) type->tp_alloc(type, 0);
Since memory allocation may fail, we must check the
tp_alloc
result against
NULL
before proceeding.
Note
We didn’t fill the
tp_alloc
slot ourselves. Rather
PyType_Ready()
fills it for us by inheriting it from our base class, which is
object
by default. Most types use the default allocation strategy.
Note
If you are creating a co-operative
tp_new
(one that calls a base type’s
tp_new
or
__new__()
), you must
not
try to determine what method to call using method resolution order at runtime. Always statically determine what type you are going to call, and call its
tp_new
directly, or via
type->tp_base->tp_new
. If you do not do this, Python subclasses of your type that also inherit from other Python-defined classes may not work correctly. (Specifically, you may not be able to create instances of such subclasses without getting a
TypeError
.)
We also define an initialization function which accepts arguments to provide initial values for our instance:
static int
Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
{
static char *kwlist[] = {"first", "last", "number", NULL};
PyObject *first = NULL, *last = NULL, *tmp;
if (!PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
&first, &last,
&self->number))
return -1;
if (first) {
tmp = self->first;
Py_INCREF(first);
self->first = first;
Py_XDECREF(tmp);
}
if (last) {
tmp = self->last;
Py_INCREF(last);
self->last = last;
Py_XDECREF(tmp);
}
return 0;
}
by filling the
tp_init
slot.
.tp_init = (initproc) Custom_init,
The
tp_init
slot is exposed in Python as the
__init__()
method. It is used to initialize an object after it’s created. Initializers always accept positional and keyword arguments, and they should return either
0
on success or
-1
on error.
Unlike the
tp_new
handler, there is no guarantee that
tp_init
is called at all (for example, the
pickle
module by default doesn’t call
__init__()
on unpickled instances). It can also be called multiple times. Anyone can call the
__init__()
method on our objects. For this reason, we have to be extra careful when assigning the new attribute values. We might be tempted, for example to assign the
first
member like this:
if (first) {
Py_XDECREF(self->first);
Py_INCREF(first);
self->first = first;
}
But this would be risky. Our type doesn’t restrict the type of the
first
member, so it could be any kind of object. It could have a destructor that causes code to be executed that tries to access the
first
member; or that destructor could release the
global interpreter lock
and let arbitrary code run in other threads that accesses and modifies our object.
To be paranoid and protect ourselves against this possibility, we almost always reassign members before decrementing their reference counts. When don’t we have to do this? One such case is when decrementing a reference count in a
tp_dealloc
handler on a type which doesn’t support cyclic garbage collection
[2]
.
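Both properties of initializers mentioned above (they may be skipped, and they may run more than once) can be reproduced from Python. This is a minimal sketch with a hypothetical `Person` class; it uses `copy.copy()`, which rebuilds objects through the same `__reduce_ex__` protocol that pickle relies on, so that the example does not depend on the class being importable.

```python
import copy

init_calls = []


class Person:
    """Hypothetical stand-in for the Custom type."""

    def __init__(self, first="", last=""):
        init_calls.append((first, last))
        self.first = first
        self.last = last


p = Person("Ada", "Lovelace")

# copy.copy() rebuilds the object through __reduce_ex__, like pickle does:
# it calls __new__ and restores the state without running __init__.
clone = copy.copy(p)

# Nothing stops a caller from running the initializer a second time.
p.__init__("Grace", "Hopper")

print(init_calls, clone.first, p.first)
```

The second call replaces attribute values on a live object, which is exactly the situation in which the order of reassigning and decrementing matters in the C code.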
We want to expose our instance variables as attributes. There are a number of ways to do that. The simplest way is to define member definitions:
static PyMemberDef Custom_members[] = {
{"first", T_OBJECT_EX, offsetof(CustomObject, first), 0,
"first name"},
{"last", T_OBJECT_EX, offsetof(CustomObject, last), 0,
"last name"},
{"number", T_INT, offsetof(CustomObject, number), 0,
"custom number"},
{NULL} /* Sentinel */
};
and put the definitions in the
tp_members
slot:
.tp_members = Custom_members,
Each member definition has a member name, type, offset, access flags and documentation string. See the Generic Attribute Management section below for details.
A disadvantage of this approach is that it doesn’t provide a way to restrict the types of objects that can be assigned to the Python attributes. We expect the first and last names to be strings, but any Python objects can be assigned. Further, the attributes can be deleted, which sets the C pointers to NULL. So even though we can make sure the members are initialized to non-NULL values, they can still become NULL later.
We define a single method,
Custom.name()
, that outputs the object’s name as the concatenation of the first and last names.
static PyObject *
Custom_name(CustomObject *self, PyObject *Py_UNUSED(ignored))
{
if (self->first == NULL) {
PyErr_SetString(PyExc_AttributeError, "first");
return NULL;
}
if (self->last == NULL) {
PyErr_SetString(PyExc_AttributeError, "last");
return NULL;
}
return PyUnicode_FromFormat("%S %S", self->first, self->last);
}
The method is implemented as a C function that takes a
Custom
(or
Custom
subclass) instance as the first argument. Methods always take an instance as the first argument. Methods often take positional and keyword arguments as well, but in this case we don’t take any and don’t need to accept a positional argument tuple or keyword argument dictionary. This method is equivalent to the Python method:
def name(self):
return "%s %s" % (self.first, self.last)
Note that we have to check for the possibility that our
first
and
last
members are
NULL
. This is because they can be deleted, in which case they are set to
NULL
. It would be better to prevent deletion of these attributes and to restrict the attribute values to be strings. We’ll see how to do that in the next section.
Now that we’ve defined the method, we need to create an array of method definitions:
static PyMethodDef Custom_methods[] = {
{"name", (PyCFunction) Custom_name, METH_NOARGS,
"Return the name, combining the first and last name"
},
{NULL} /* Sentinel */
};
(note that we used the
METH_NOARGS
flag to indicate that the method is expecting no arguments other than
self
)
and assign it to the
tp_methods
slot:
.tp_methods = Custom_methods,
Finally, we’ll make our type usable as a base class for subclassing. We’ve written our methods carefully so far so that they don’t make any assumptions about the type of the object being created or used, so all we need to do is to add the
Py_TPFLAGS_BASETYPE
to our class flag definition:
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
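The effect of this flag can be seen on built-in types, some of which set it and some of which do not. The sketch below is illustrative: `try_subclass` is a hypothetical helper, and the numeric value used for `BASETYPE` is an assumption taken from CPython's `object.h` rather than something exposed by name to Python code.

```python
def try_subclass(base):
    """Return the error message if `base` cannot be subclassed, else None."""
    try:
        type("Derived", (base,), {})
    except TypeError as exc:
        return str(exc)
    return None


# Value of Py_TPFLAGS_BASETYPE in CPython's object.h (an assumption here).
BASETYPE = 1 << 10

bool_error = try_subclass(bool)     # bool does not set the flag
int_error = try_subclass(int)       # int does

print(bool_error)
print(bool(bool.__flags__ & BASETYPE), bool(int.__flags__ & BASETYPE))
```

A type defined without the flag, like the first version of the example type, is rejected in the same way as `bool`.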
We rename
PyInit_custom()
to
PyInit_custom2()
, update the module name in the
PyModuleDef
struct, and update the full class name in the
PyTypeObject
struct.
Finally, we update our
setup.py
file to build the new module:
from distutils.core import setup, Extension
setup(name="custom", version="1.0",
ext_modules=[
Extension("custom", ["custom.c"]),
Extension("custom2", ["custom2.c"]),
])
In this section, we’ll provide finer control over how the
first
and
last
attributes are set in the
Custom
example. In the previous version of our module, the instance variables
first
and
last
could be set to non-string values or even deleted. We want to make sure that these attributes always contain strings.
#include <Python.h>
#include "structmember.h"
typedef struct {
PyObject_HEAD
PyObject *first; /* first name */
PyObject *last; /* last name */
int number;
} CustomObject;
static void
Custom_dealloc(CustomObject *self)
{
Py_XDECREF(self->first);
Py_XDECREF(self->last);
Py_TYPE(self)->tp_free((PyObject *) self);
}
static PyObject *
Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
CustomObject *self;
self = (CustomObject *) type->tp_alloc(type, 0);
if (self != NULL) {
self->first = PyUnicode_FromString("");
if (self->first == NULL) {
Py_DECREF(self);
return NULL;
}
self->last = PyUnicode_FromString("");
if (self->last == NULL) {
Py_DECREF(self);
return NULL;
}
self->number = 0;
}
return (PyObject *) self;
}
static int
Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
{
static char *kwlist[] = {"first", "last", "number", NULL};
PyObject *first = NULL, *last = NULL, *tmp;
if (!PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
&first, &last,
&self->number))
return -1;
if (first) {
tmp = self->first;
Py_INCREF(first);
self->first = first;
Py_DECREF(tmp);
}
if (last) {
tmp = self->last;
Py_INCREF(last);
self->last = last;
Py_DECREF(tmp);
}
return 0;
}
static PyMemberDef Custom_members[] = {
{"number", T_INT, offsetof(CustomObject, number), 0,
"custom number"},
{NULL} /* Sentinel */
};
static PyObject *
Custom_getfirst(CustomObject *self, void *closure)
{
Py_INCREF(self->first);
return self->first;
}
static int
Custom_setfirst(CustomObject *self, PyObject *value, void *closure)
{
PyObject *tmp;
if (value == NULL) {
PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
return -1;
}
if (!PyUnicode_Check(value)) {
PyErr_SetString(PyExc_TypeError,
"The first attribute value must be a string");
return -1;
}
tmp = self->first;
Py_INCREF(value);
self->first = value;
Py_DECREF(tmp);
return 0;
}
static PyObject *
Custom_getlast(CustomObject *self, void *closure)
{
Py_INCREF(self->last);
return self->last;
}
static int
Custom_setlast(CustomObject *self, PyObject *value, void *closure)
{
PyObject *tmp;
if (value == NULL) {
PyErr_SetString(PyExc_TypeError, "Cannot delete the last attribute");
return -1;
}
if (!PyUnicode_Check(value)) {
PyErr_SetString(PyExc_TypeError,
"The last attribute value must be a string");
return -1;
}
tmp = self->last;
Py_INCREF(value);
self->last = value;
Py_DECREF(tmp);
return 0;
}
static PyGetSetDef Custom_getsetters[] = {
{"first", (getter) Custom_getfirst, (setter) Custom_setfirst,
"first name", NULL},
{"last", (getter) Custom_getlast, (setter) Custom_setlast,
"last name", NULL},
{NULL} /* Sentinel */
};
static PyObject *
Custom_name(CustomObject *self, PyObject *Py_UNUSED(ignored))
{
return PyUnicode_FromFormat("%S %S", self->first, self->last);
}
static PyMethodDef Custom_methods[] = {
{"name", (PyCFunction) Custom_name, METH_NOARGS,
"Return the name, combining the first and last name"
},
{NULL} /* Sentinel */
};
static PyTypeObject CustomType = {
PyVarObject_HEAD_INIT(NULL, 0)
.tp_name = "custom3.Custom",
.tp_doc = "Custom objects",
.tp_basicsize = sizeof(CustomObject),
.tp_itemsize = 0,
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
.tp_new = Custom_new,
.tp_init = (initproc) Custom_init,
.tp_dealloc = (destructor) Custom_dealloc,
.tp_members = Custom_members,
.tp_methods = Custom_methods,
.tp_getset = Custom_getsetters,
};
static PyModuleDef custommodule = {
PyModuleDef_HEAD_INIT,
.m_name = "custom3",
.m_doc = "Example module that creates an extension type.",
.m_size = -1,
};
PyMODINIT_FUNC
PyInit_custom3(void)
{
PyObject *m;
if (PyType_Ready(&CustomType) < 0)
return NULL;
m = PyModule_Create(&custommodule);
if (m == NULL)
return NULL;
Py_INCREF(&CustomType);
PyModule_AddObject(m, "Custom", (PyObject *) &CustomType);
return m;
}
To provide greater control over the
first
and
last
attributes, we’ll use custom getter and setter functions. Here are the functions for getting and setting the
first
attribute:
static PyObject *
Custom_getfirst(CustomObject *self, void *closure)
{
Py_INCREF(self->first);
return self->first;
}
static int
Custom_setfirst(CustomObject *self, PyObject *value, void *closure)
{
PyObject *tmp;
if (value == NULL) {
PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
return -1;
}
if (!PyUnicode_Check(value)) {
PyErr_SetString(PyExc_TypeError,
"The first attribute value must be a string");
return -1;
}
tmp = self->first;
Py_INCREF(value);
self->first = value;
Py_DECREF(tmp);
return 0;
}
The getter function is passed a
Custom
object and a “closure”, which is a void pointer. In this case, the closure is ignored. (The closure supports an advanced usage in which definition data is passed to the getter and setter. This could, for example, be used to allow a single set of getter and setter functions that decide the attribute to get or set based on data in the closure.)
The setter function is passed the
Custom
object, the new value, and the closure. The new value may be
NULL
, in which case the attribute is being deleted. In our setter, we raise an error if the attribute is deleted or if its new value is not a string.
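For comparison, the same policy can be written in Python with a property. This is only a hypothetical analogue (the `Person` class and the `error_of` helper are illustrative names); it mirrors the C setter by rejecting deletion and non-string values with the same messages.

```python
class Person:
    """Hypothetical pure-Python analogue of the getter/setter pair."""

    def __init__(self, first=""):
        self.first = first          # goes through the setter below

    @property
    def first(self):
        return self._first

    @first.setter
    def first(self, value):
        if not isinstance(value, str):
            raise TypeError("The first attribute value must be a string")
        self._first = value

    @first.deleter
    def first(self):
        raise TypeError("Cannot delete the first attribute")


def error_of(action):
    """Run `action` and return the TypeError message it raises, or None."""
    try:
        action()
    except TypeError as exc:
        return str(exc)
    return None


p = Person("Ada")
set_error = error_of(lambda: setattr(p, "first", 42))
del_error = error_of(lambda: delattr(p, "first"))
print(p.first, set_error, del_error)
```

In the C API, deletion and assignment arrive at the same setter function, with deletion signalled by a NULL value; the property splits them into a setter and a deleter.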
We create an array of
PyGetSetDef
structures:
static PyGetSetDef Custom_getsetters[] = {
{"first", (getter) Custom_getfirst, (setter) Custom_setfirst,
"first name", NULL},
{"last", (getter) Custom_getlast, (setter) Custom_setlast,
"last name", NULL},
{NULL} /* Sentinel */
};
and register it in the
tp_getset
slot:
.tp_getset = Custom_getsetters,
The last item in a
PyGetSetDef
structure is the “closure” mentioned above. In this case, we aren’t using a closure, so we just pass
NULL
.
We also remove the member definitions for these attributes:
static PyMemberDef Custom_members[] = {
{"number", T_INT, offsetof(CustomObject, number), 0,
"custom number"},
{NULL} /* Sentinel */
};
We also need to update the
tp_init
handler to only allow strings
[3]
to be passed:
static int
Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
{
static char *kwlist[] = {"first", "last", "number", NULL};
PyObject *first = NULL, *last = NULL, *tmp;
if (!PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
&first, &last,
&self->number))
return -1;
if (first) {
tmp = self->first;
Py_INCREF(first);
self->first = first;
Py_DECREF(tmp);
}
if (last) {
tmp = self->last;
Py_INCREF(last);
self->last = last;
Py_DECREF(tmp);
}
return 0;
}
With these changes, we can ensure that the
first
and
last
members are never
NULL
so we can remove checks for
NULL
values in almost all cases. This means that most of the
Py_XDECREF()
calls can be converted to
Py_DECREF()
calls. The only place we can’t change these calls is in the
tp_dealloc
implementation, where there is the possibility that the initialization of these members failed in
tp_new
.
We also rename the module initialization function and module name in the initialization function, as we did before, and we add an extra definition to the
setup.py
file.
Python has a cyclic garbage collector (GC) that can identify unneeded objects even when their reference counts are not zero. This can happen when objects are involved in cycles. For example, consider:
>>> l = []
>>> l.append(l)
>>> del l
In this example, we create a list that contains itself. When we delete it, it still has a reference from itself. Its reference count doesn’t drop to zero. Fortunately, Python’s cyclic garbage collector will eventually figure out that the list is garbage and free it.
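The collector's work can be observed with a weak reference. Lists do not support weak references, so this sketch uses a hypothetical `Node` class instead of the list above, and it disables automatic collection for the duration of the experiment so that the timing is under our control.

```python
import gc
import weakref


class Node:
    """Hypothetical container; unlike list, it supports weak references."""


gc.disable()                 # keep automatic collection out of the picture
node = Node()
node.me = node               # reference cycle: the object refers to itself
probe = weakref.ref(node)    # lets us observe the object without owning it

del node
alive_after_del = probe() is not None    # the cycle keeps it alive

gc.collect()                             # run the cyclic collector explicitly
alive_after_collect = probe() is not None
gc.enable()

print(alive_after_del, alive_after_collect)
```

Deleting the name alone does not free the object; only the explicit collection pass does.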
In the second version of the
Custom
example, we allowed any kind of object to be stored in the
first
or
last
attributes
[4]
. Besides, in the second and third versions, we allowed subclassing
Custom
, and subclasses may add arbitrary attributes. For either of those two reasons,
Custom
objects can participate in cycles:
>>> import custom3
>>> class Derived(custom3.Custom): pass
...
>>> n = Derived()
>>> n.some_attribute = n
To allow a
Custom
instance participating in a reference cycle to be properly detected and collected by the cyclic GC, our
Custom
type needs to fill two additional slots and to enable a flag that enables these slots:
#include <Python.h>
#include "structmember.h"
typedef struct {
PyObject_HEAD
PyObject *first; /* first name */
PyObject *last; /* last name */
int number;
} CustomObject;
static int
Custom_traverse(CustomObject *self, visitproc visit, void *arg)
{
Py_VISIT(self->first);
Py_VISIT(self->last);
return 0;
}
static int
Custom_clear(CustomObject *self)
{
Py_CLEAR(self->first);
Py_CLEAR(self->last);
return 0;
}
static void
Custom_dealloc(CustomObject *self)
{
PyObject_GC_UnTrack(self);
Custom_clear(self);
Py_TYPE(self)->tp_free((PyObject *) self);
}
static PyObject *
Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
CustomObject *self;
self = (CustomObject *) type->tp_alloc(type, 0);
if (self != NULL) {
self->first = PyUnicode_FromString("");
if (self->first == NULL) {
Py_DECREF(self);
return NULL;
}
self->last = PyUnicode_FromString("");
if (self->last == NULL) {
Py_DECREF(self);
return NULL;
}
self->number = 0;
}
return (PyObject *) self;
}
static int
Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
{
static char *kwlist[] = {"first", "last", "number", NULL};
PyObject *first = NULL, *last = NULL, *tmp;
if (!PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
&first, &last,
&self->number))
return -1;
if (first) {
tmp = self->first;
Py_INCREF(first);
self->first = first;
Py_DECREF(tmp);
}
if (last) {
tmp = self->last;
Py_INCREF(last);
self->last = last;
Py_DECREF(tmp);
}
return 0;
}
static PyMemberDef Custom_members[] = {
{"number", T_INT, offsetof(CustomObject, number), 0,
"custom number"},
{NULL} /* Sentinel */
};
static PyObject *
Custom_getfirst(CustomObject *self, void *closure)
{
Py_INCREF(self->first);
return self->first;
}
static int
Custom_setfirst(CustomObject *self, PyObject *value, void *closure)
{
if (value == NULL) {
PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
return -1;
}
if (!PyUnicode_Check(value)) {
PyErr_SetString(PyExc_TypeError,
"The first attribute value must be a string");
return -1;
}
Py_INCREF(value);
Py_CLEAR(self->first);
self->first = value;
return 0;
}
static PyObject *
Custom_getlast(CustomObject *self, void *closure)
{
Py_INCREF(self->last);
return self->last;
}
static int
Custom_setlast(CustomObject *self, PyObject *value, void *closure)
{
if (value == NULL) {
PyErr_SetString(PyExc_TypeError, "Cannot delete the last attribute");
return -1;
}
if (!PyUnicode_Check(value)) {
PyErr_SetString(PyExc_TypeError,
"The last attribute value must be a string");
return -1;
}
Py_INCREF(value);
Py_CLEAR(self->last);
self->last = value;
return 0;
}
static PyGetSetDef Custom_getsetters[] = {
{"first", (getter) Custom_getfirst, (setter) Custom_setfirst,
"first name", NULL},
{"last", (getter) Custom_getlast, (setter) Custom_setlast,
"last name", NULL},
{NULL} /* Sentinel */
};
static PyObject *
Custom_name(CustomObject *self, PyObject *Py_UNUSED(ignored))
{
return PyUnicode_FromFormat("%S %S", self->first, self->last);
}
static PyMethodDef Custom_methods[] = {
{"name", (PyCFunction) Custom_name, METH_NOARGS,
"Return the name, combining the first and last name"
},
{NULL} /* Sentinel */
};
static PyTypeObject CustomType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "custom4.Custom",
    .tp_doc = "Custom objects",
    .tp_basicsize = sizeof(CustomObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC,
    .tp_new = Custom_new,
    .tp_init = (initproc) Custom_init,
    .tp_dealloc = (destructor) Custom_dealloc,
    .tp_traverse = (traverseproc) Custom_traverse,
    .tp_clear = (inquiry) Custom_clear,
    .tp_members = Custom_members,
    .tp_methods = Custom_methods,
    .tp_getset = Custom_getsetters,
};
static PyModuleDef custommodule = {
    PyModuleDef_HEAD_INIT,
    .m_name = "custom4",
    .m_doc = "Example module that creates an extension type.",
    .m_size = -1,
};
PyMODINIT_FUNC
PyInit_custom4(void)
{
    PyObject *m;
    if (PyType_Ready(&CustomType) < 0)
        return NULL;
    m = PyModule_Create(&custommodule);
    if (m == NULL)
        return NULL;
    Py_INCREF(&CustomType);
    /* PyModule_AddObject() steals the reference only on success. */
    if (PyModule_AddObject(m, "Custom", (PyObject *) &CustomType) < 0) {
        Py_DECREF(&CustomType);
        Py_DECREF(m);
        return NULL;
    }
    return m;
}
First, the traversal method lets the cyclic GC know about subobjects that could participate in cycles:
static int
Custom_traverse(CustomObject *self, visitproc visit, void *arg)
{
    int vret;
    if (self->first) {
        vret = visit(self->first, arg);
        if (vret != 0)
            return vret;
    }
    if (self->last) {
        vret = visit(self->last, arg);
        if (vret != 0)
            return vret;
    }
    return 0;
}
For each subobject that can participate in cycles, we need to call the visit() function, which is passed to the traversal method. The visit() function takes as arguments the subobject and the extra argument arg passed to the traversal method. It returns an integer value; if that value is non-zero, the traversal method must return it immediately.
Python provides a Py_VISIT() macro that automates calling visit functions. With Py_VISIT(), we can minimize the amount of boilerplate in Custom_traverse:
static int
Custom_traverse(CustomObject *self, visitproc visit, void *arg)
{
    Py_VISIT(self->first);
    Py_VISIT(self->last);
    return 0;
}
Note
A tp_traverse implementation must name its arguments exactly visit and arg in order to use Py_VISIT().
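The naming requirement is easier to see with a small stand-alone model. The sketch below does not include Python.h; ToyObject, toy_visitproc, TOY_VISIT and the helper functions are hypothetical names invented for illustration, and the macro only imitates the general shape of Py_VISIT(). Because the macro body refers to visit and arg by name, it compiles only inside a function whose parameters carry exactly those names, and it returns early as soon as a visitor reports a non-zero value.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real API uses PyObject, visitproc and Py_VISIT. */
typedef struct ToyObject {
    struct ToyObject *first;
    struct ToyObject *last;
} ToyObject;

typedef int (*toy_visitproc)(ToyObject *child, void *arg);

/* The macro body mentions `visit` and `arg` literally, so the enclosing
   function must use exactly those parameter names. */
#define TOY_VISIT(op)                       \
    do {                                    \
        if (op) {                           \
            int vret = visit((op), arg);    \
            if (vret)                       \
                return vret;                \
        }                                   \
    } while (0)

static int
toy_traverse(ToyObject *self, toy_visitproc visit, void *arg)
{
    TOY_VISIT(self->first);   /* skipped when the member is NULL */
    TOY_VISIT(self->last);
    return 0;
}

/* Visitor that counts the children it sees and never aborts. */
static int
count_visit(ToyObject *child, void *arg)
{
    (void) child;
    ++*(int *) arg;
    return 0;
}

/* Visitor that counts, then aborts the traversal with code 7. */
static int
abort_visit(ToyObject *child, void *arg)
{
    (void) child;
    ++*(int *) arg;
    return 7;
}

/* Count the children reported for an owner with one or two members set. */
static int
toy_count_children(int with_last)
{
    ToyObject a = {NULL, NULL}, b = {NULL, NULL};
    ToyObject owner = {&a, with_last ? &b : NULL};
    int count = 0;
    toy_traverse(&owner, count_visit, &count);
    return count;
}

/* Run the aborting visitor; *visited receives how many children it saw. */
static int
toy_abort_result(int *visited)
{
    ToyObject a = {NULL, NULL}, b = {NULL, NULL};
    ToyObject owner = {&a, &b};
    *visited = 0;
    return toy_traverse(&owner, abort_visit, visited);
}
```

In this model the aborting visitor is called once: its non-zero result is returned from toy_traverse before the second member is visited, which mirrors the rule that a non-zero visit result must be propagated.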
Second, we need to provide a method for clearing any subobjects that can participate in cycles:
static int
Custom_clear(CustomObject *self)
{
    Py_CLEAR(self->first);
    Py_CLEAR(self->last);
    return 0;
}
Notice the use of the Py_CLEAR() macro. It is the recommended and safe way to clear data attributes of arbitrary types while decrementing their reference counts. If you were to call Py_XDECREF() on the attribute before setting it to NULL, the attribute’s destructor could call back into code that reads the attribute again (especially if there is a reference cycle), at a point where the attribute still refers to an object that is being destroyed.
Note
You could emulate Py_CLEAR() by writing:
PyObject *tmp;
tmp = self->first;
self->first = NULL;
Py_XDECREF(tmp);
Nevertheless, it is much easier and less error-prone to always use Py_CLEAR() when deleting an attribute. Don’t try to micro-optimize at the expense of robustness!
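To make the ordering issue concrete, here is a toy reference-counting model in plain C. It is only a sketch and does not use the CPython API: ToyObj, ToyOwner, toy_decref and the other names are hypothetical, and the on_destroy callback stands in for a destructor that runs arbitrary code. The callback looks back at the owner's attribute; it sees a stale pointer when the reference is dropped before the attribute is set to NULL, and sees NULL when the order is reversed, as in the emulation of Py_CLEAR() above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy object with a reference count and a destructor hook. */
typedef struct ToyObj {
    int refcnt;
    void (*on_destroy)(struct ToyObj *self, void *ctx);
    void *ctx;
} ToyObj;

typedef struct {
    ToyObj *first;
    int saw_stale;  /* set if the hook observed a non-NULL `first` */
} ToyOwner;

/* Drop one reference; run the hook and free the object at zero. */
static void
toy_decref(ToyObj *op)
{
    if (op != NULL && --op->refcnt == 0) {
        if (op->on_destroy)
            op->on_destroy(op, op->ctx);
        free(op);
    }
}

/* Destructor hook that reads the owner's attribute again. */
static void
peek_at_owner(ToyObj *self, void *ctx)
{
    ToyOwner *owner = ctx;
    (void) self;
    if (owner->first != NULL)
        owner->saw_stale = 1;  /* attribute still points at a dying object */
}

static ToyObj *
toy_new(ToyOwner *owner)
{
    ToyObj *op = malloc(sizeof *op);
    if (op == NULL)
        return NULL;
    op->refcnt = 1;
    op->on_destroy = peek_at_owner;
    op->ctx = owner;
    return op;
}

/* Unsafe order: drop the reference first, reset the attribute afterwards. */
static int
clear_unsafe(void)
{
    ToyOwner owner = {NULL, 0};
    owner.first = toy_new(&owner);
    if (owner.first == NULL)
        return -1;
    toy_decref(owner.first);
    owner.first = NULL;
    return owner.saw_stale;
}

/* Safe order: reset the attribute first, then drop the saved reference. */
static int
clear_safe(void)
{
    ToyOwner owner = {NULL, 0};
    ToyObj *tmp;
    owner.first = toy_new(&owner);
    if (owner.first == NULL)
        return -1;
    tmp = owner.first;
    owner.first = NULL;
    toy_decref(tmp);
    return owner.saw_stale;
}
```

In this model clear_unsafe() reports that the hook saw the stale attribute, while clear_safe() does not; the real macro exists so that you never have to reason about this ordering by hand.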
The deallocator Custom_dealloc may call arbitrary code when clearing attributes, which means the cyclic GC can be triggered inside the function. Since the GC assumes that the reference count is not zero, we need to untrack the object from the GC by calling PyObject_GC_UnTrack() before clearing members. Here is our reimplemented deallocator using PyObject_GC_UnTrack() and Custom_clear:
static void
Custom_dealloc(CustomObject *self)
{
    PyObject_GC_UnTrack(self);
    Custom_clear(self);
    Py_TYPE(self)->tp_free((PyObject *) self);
}
Finally, we add the Py_TPFLAGS_HAVE_GC flag to the class flags:
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC,
That’s pretty much it. If we had written custom tp_alloc or tp_free handlers, we’d need to modify them for cyclic garbage collection. Most extensions will use the versions automatically provided.
It is possible to create new extension types that are derived from existing types. It is easiest to inherit from the built-in types, since an extension can easily use the PyTypeObject it needs. It can be difficult to share these PyTypeObject structures between extension modules.
In this example we will create a SubList type that inherits from the built-in list type. The new type will be completely compatible with regular lists, but will have an additional increment() method that increases an internal counter:
>>> import sublist
>>> s = sublist.SubList(range(3))
>>> s.extend(s)
>>> print(len(s))
6
>>> print(s.increment())
1
>>> print(s.increment())
2
#include <Python.h>
typedef struct {
    PyListObject list;
    int state;
} SubListObject;
static PyObject *
SubList_increment(SubListObject *self, PyObject *unused)
{
    self->state++;
    return PyLong_FromLong(self->state);
}
static PyMethodDef SubList_methods[] = {
    {"increment", (PyCFunction) SubList_increment, METH_NOARGS,
     PyDoc_STR("increment state counter")},
    {NULL},  /* Sentinel */
};
static int
SubList_init(SubListObject *self, PyObject *args, PyObject *kwds)
{
    if (PyList_Type.tp_init((PyObject *) self, args, kwds) < 0)
        return -1;
    self->state = 0;
    return 0;
}
static PyTypeObject SubListType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "sublist.SubList",
    .tp_doc = "SubList objects",
    .tp_basicsize = sizeof(SubListObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
    .tp_init = (initproc) SubList_init,
    .tp_methods = SubList_methods,
};
static PyModuleDef sublistmodule = {
    PyModuleDef_HEAD_INIT,
    .m_name = "sublist",
    .m_doc = "Example module that creates an extension type.",
    .m_size = -1,
};
PyMODINIT_FUNC
PyInit_sublist(void)
{
    PyObject *m;
    SubListType.tp_base = &PyList_Type;
    if (PyType_Ready(&SubListType) < 0)
        return NULL;
    m = PyModule_Create(&sublistmodule);
    if (m == NULL)
        return NULL;
    Py_INCREF(&SubListType);
    /* PyModule_AddObject() steals the reference only on success. */
    if (PyModule_AddObject(m, "SubList", (PyObject *) &SubListType) < 0) {
        Py_DECREF(&SubListType);
        Py_DECREF(m);
        return NULL;
    }
    return m;
}
As you can see, the source code closely resembles the Custom examples in previous sections. We will break down the main differences between them.
typedef struct {
    PyListObject list;
    int state;
} SubListObject;
The primary difference for derived type objects is that the base type’s object structure must be the first member. The base type will already include the PyObject_HEAD at the beginning of its structure.
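The layout rule can be checked with ordinary C structs. The following is a minimal sketch that does not include Python.h: ToyHead, ToyList and ToySubList are hypothetical stand-ins for PyObject, PyListObject and SubListObject. It relies only on the C guarantee that a pointer to a structure, suitably converted, points to its first member and vice versa.

```c
#include <stddef.h>

/* Hypothetical stand-ins for PyObject, PyListObject and SubListObject. */
typedef struct { int refcnt; } ToyHead;
typedef struct { ToyHead head; int size; } ToyList;
typedef struct { ToyList list; int state; } ToySubList;

/* A "base class" function that only knows about ToyList. */
static int
toy_list_size(ToyList *l)
{
    return l->size;
}

/* Return 1 if all three views of the same object agree, else 0. */
static int
toy_views_agree(void)
{
    ToySubList s = {{{1}, 3}, 0};

    /* The same address, viewed at three levels of the hierarchy. */
    ToyHead *as_head = (ToyHead *) &s;
    ToyList *as_list = (ToyList *) as_head;
    ToySubList *as_sub = (ToySubList *) as_head;

    as_sub->state++;  /* only the derived view knows about `state` */

    return as_head->refcnt == 1
        && toy_list_size(as_list) == 3
        && as_sub->state == 1
        && s.state == 1;
}
```

Because the base structure sits at offset zero, code written for the base type can operate on a derived object without knowing that extra fields follow it.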
When a Python object is a SubList instance, its PyObject * pointer can be safely cast to both PyListObject * and SubListObject *:
static int
SubList_init(SubListObject *self, PyObject *args, PyObject *kwds)
{
    if (PyList_Type.tp_init((PyObject *) self, args, kwds) < 0)
        return -1;
    self->state = 0;
    return 0;
}
We see above how to call through to the __init__ method of the base type.
This pattern is important when writing a type with custom tp_new and tp_dealloc members. The tp_new handler should not allocate the memory for the object itself through tp_alloc, but should let the base class handle it by calling the base class’s tp_new.
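The delegation described above can be modelled without the CPython API. In this sketch every name (ToyType, ToyBase, ToyDerived, toy_base_new, toy_derived_new) is hypothetical. The base constructor allocates according to the size recorded in the type it is given, so a derived constructor can call it with the derived type and receive a block large enough for its own fields, then initialize only what it adds.

```c
#include <stdlib.h>

/* Hypothetical type descriptor: only the instance size matters here. */
typedef struct ToyType { size_t basicsize; } ToyType;

typedef struct { const ToyType *type; int length; } ToyBase;
typedef struct { ToyBase base; int state; } ToyDerived;

static const ToyType ToyBase_Type = { sizeof(ToyBase) };
static const ToyType ToyDerived_Type = { sizeof(ToyDerived) };

/* Base constructor: allocates `type->basicsize` bytes, so it also works
   when called on behalf of a subtype. */
static void *
toy_base_new(const ToyType *type)
{
    ToyBase *self = calloc(1, type->basicsize);
    if (self == NULL)
        return NULL;
    self->type = type;
    self->length = 0;
    return self;
}

/* Derived constructor: performs no allocation of its own. */
static void *
toy_derived_new(const ToyType *type)
{
    ToyDerived *self = toy_base_new(type);
    if (self == NULL)
        return NULL;
    self->state = 42;  /* initialize only the fields this type adds */
    return self;
}

/* Return 1 if both constructors produce correctly initialized objects. */
static int
toy_new_demo(void)
{
    ToyBase *b = toy_base_new(&ToyBase_Type);
    ToyDerived *d = toy_derived_new(&ToyDerived_Type);
    int ok = b != NULL && d != NULL
          && b->type == &ToyBase_Type
          && d->base.type == &ToyDerived_Type
          && d->base.length == 0
          && d->state == 42;
    free(b);
    free(d);
    return ok;
}
```

The design point is that allocation stays in one place: the derived constructor never needs to know how the base type obtains or lays out its memory.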
The PyTypeObject struct supports a tp_base field specifying the type’s concrete base class. Due to cross-platform compiler issues, you can’t fill that field directly with a reference to PyList_Type; it should be done later in the module initialization function:
PyMODINIT_FUNC
PyInit_sublist(void)
{
    PyObject *m;
    SubListType.tp_base = &PyList_Type;
    if (PyType_Ready(&SubListType) < 0)
        return NULL;
    m = PyModule_Create(&sublistmodule);
    if (m == NULL)
        return NULL;
    Py_INCREF(&SubListType);
    /* PyModule_AddObject() steals the reference only on success. */
    if (PyModule_AddObject(m, "SubList", (PyObject *) &SubListType) < 0) {
        Py_DECREF(&SubListType);
        Py_DECREF(m);
        return NULL;
    }
    return m;
}
Before calling PyType_Ready(), the type structure must have the tp_base slot filled in. When we are deriving from an existing type, it is not necessary to fill in the tp_new slot with PyType_GenericNew() as the earlier Custom examples did; the constructor and the tp_alloc allocation function will be inherited from the base type.
After that, calling PyType_Ready() and adding the type object to the module is the same as with the basic Custom examples.
Footnotes
| [1] | This is true when we know that the object is a basic type, like a string or a float. |
| [2] | We relied on this in the tp_dealloc handler in this example, because our type doesn’t support garbage collection. |
| [3] | We now know that the first and last members are strings, so perhaps we could be less careful about decrementing their reference counts; however, we accept instances of string subclasses. Even though deallocating normal strings won’t call back into our objects, we can’t guarantee that deallocating an instance of a string subclass won’t call back into our objects. |
| [4] | Also, even with our attributes restricted to string instances, the user could pass arbitrary str subclasses and therefore still create reference cycles. |