Python Source Code Study (5): Integer Objects in Python

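In CPython 2.x, integers are implemented by the PyIntObject object. Python objects are either fixed-length or variable-length, and, depending on whether the data they hold can change, either mutable or immutable. An integer object is immutable, and integer objects are kept in an integer object pool; almost every built-in Python object has its own pooling mechanism. The meta-information for an object is stored in its corresponding type object. For integers, intobject.c contains the following code: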
[code lang="C"]
PyTypeObject PyInt_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "int",
    sizeof(PyIntObject),
    0,
    (destructor)int_dealloc,                    /* tp_dealloc */
    (printfunc)int_print,                       /* tp_print */
    0,                                          /* tp_getattr */
    0,                                          /* tp_setattr */
    (cmpfunc)int_compare,                       /* tp_compare */
    (reprfunc)int_to_decimal_string,            /* tp_repr */
    &int_as_number,                             /* tp_as_number */
    0,                                          /* tp_as_sequence */
    0,                                          /* tp_as_mapping */
    (hashfunc)int_hash,                         /* tp_hash */
    0,                                          /* tp_call */
    (reprfunc)int_to_decimal_string,            /* tp_str */
    PyObject_GenericGetAttr,                    /* tp_getattro */
    0,                                          /* tp_setattro */
    0,                                          /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_CHECKTYPES |
        Py_TPFLAGS_BASETYPE | Py_TPFLAGS_INT_SUBCLASS,  /* tp_flags */
    int_doc,                                    /* tp_doc */
    0,                                          /* tp_traverse */
    0,                                          /* tp_clear */
    0,                                          /* tp_richcompare */
    0,                                          /* tp_weaklistoffset */
    0,                                          /* tp_iter */
    0,                                          /* tp_iternext */
    int_methods,                                /* tp_methods */
    0,                                          /* tp_members */
    int_getset,                                 /* tp_getset */
    0,                                          /* tp_base */
    0,                                          /* tp_dict */
    0,                                          /* tp_descr_get */
    0,                                          /* tp_descr_set */
    0,                                          /* tp_dictoffset */
    0,                                          /* tp_init */
    0,                                          /* tp_alloc */
    int_new,                                    /* tp_new */
    (freefunc)int_free,                         /* tp_free */
};
[/code]
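Here is a brief overview of the operations PyIntObject supports:
int_dealloc: destructor for PyIntObject objects
int_free: deallocation function for PyIntObject objects
int_to_decimal_string: conversion to a PyStringObject (fills both the tp_repr and tp_str slots)
int_hash: computes the hash value
int_print: prints a PyIntObject
int_compare: comparison
int_as_number: the set of numeric operations
int_methods: the set of member functions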

Let's look at some of the source code in intobject.c:
[code lang="C"]
#define CONVERT_TO_LONG(obj, lng)               \
    if (PyInt_Check(obj)) {                     \
        lng = PyInt_AS_LONG(obj);               \
    }                                           \
    else {                                      \
        Py_INCREF(Py_NotImplemented);           \
        return Py_NotImplemented;               \
    }

static PyObject *
int_add(PyIntObject *v, PyIntObject *w)
{
    register long a, b, x;
    CONVERT_TO_LONG(v, a);
    CONVERT_TO_LONG(w, b);
    /* casts in the line below avoid undefined behaviour on overflow */
    x = (long)((unsigned long)a + b);
    if ((x^a) >= 0 || (x^b) >= 0)
        return PyInt_FromLong(x);
    return PyLong_Type.tp_as_number->nb_add((PyObject *)v, (PyObject *)w);
}
[/code]
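This code adds the ob_ival fields of the two integer objects and returns a pointer to a PyObject holding the sum. Neither operand is modified; the result is a separate object obtained from PyInt_FromLong. If the sum does not fit in a C long, the addition is handed over to the long type's nb_add instead.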
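The overflow test in int_add is easy to miss, so here is a minimal standalone sketch of the same idea. The helper name checked_add is hypothetical and not part of CPython; it also assumes the usual two's-complement behaviour when an unsigned long is converted back to long.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not in CPython): adds a and b the way int_add does
   and returns 1 if the signed result overflowed, 0 otherwise. */
static int
checked_add(long a, long b, long *result)
{
    long x;
    assert(result != NULL);
    /* Add as unsigned so that wraparound is well defined, then convert back.
       Assumption: the conversion wraps, as on common two's-complement
       platforms. */
    x = (long)((unsigned long)a + (unsigned long)b);
    *result = x;
    /* The sum is valid if it shares a sign with at least one operand.
       Overflow means x differs in sign from both a and b. */
    return ((x ^ a) >= 0 || (x ^ b) >= 0) ? 0 : 1;
}
```

Calling checked_add(LONG_MAX, 1, &r) reports an overflow, which is the case where int_add falls back to long arithmetic.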
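So how are PyIntObject objects created and maintained? An int object can be created in three ways: from a long value, from a string, and from a Py_UNICODE object. In CPython 2.7 the latter two first parse the text into a C long and then go through PyInt_FromLong (falling back to the long type if the value does not fit). When an integer object is created, small integers and large integers are handled differently. Open intobject.c and find the following code: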
[code lang="C"]
#ifndef NSMALLPOSINTS
#define NSMALLPOSINTS 257
#endif
#ifndef NSMALLNEGINTS
#define NSMALLNEGINTS 5
#endif
#if NSMALLNEGINTS + NSMALLPOSINTS > 0
/* References to small integers are saved in this array so that they
can be shared.
The integers that are saved are those in the range
-NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive).
*/
static PyIntObject *small_ints[NSMALLNEGINTS + NSMALLPOSINTS];
#endif
#ifdef COUNT_ALLOCS
Py_ssize_t quick_int_allocs;
Py_ssize_t quick_neg_int_allocs;
#endif
[/code]
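From this code we can see that the set of small integers in Python is [-5, 257). Since the bounds are just macros, you can change them and recompile Python to get a different small-integer set. Next come large integers, which are kept in memory in a structure called PyIntBlock. Consider the following code: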
[code lang="C"]
#define BLOCK_SIZE      1000    /* 1K less typical malloc overhead */
#define BHEAD_SIZE      8       /* Enough for a 64-bit pointer */
#define N_INTOBJECTS    ((BLOCK_SIZE - BHEAD_SIZE) / sizeof(PyIntObject))

struct _intblock {
    struct _intblock *next;
    PyIntObject objects[N_INTOBJECTS];
};

typedef struct _intblock PyIntBlock;

static PyIntBlock *block_list = NULL;
static PyIntObject *free_list = NULL;
[/code]
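As a rough check of the N_INTOBJECTS arithmetic, the sketch below repeats the macro's calculation for a given object size. The helper ints_per_block is hypothetical, and the two sizes mentioned afterwards are assumptions about typical builds rather than values taken from the source.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 1000
#define BHEAD_SIZE 8

/* Hypothetical helper: how many objects of obj_size bytes fit in one block,
   using the same integer division as the N_INTOBJECTS macro. */
static size_t
ints_per_block(size_t obj_size)
{
    assert(obj_size > 0);   /* guard against division by zero */
    return (BLOCK_SIZE - BHEAD_SIZE) / obj_size;
}
```

Assuming a 12-byte PyIntObject (a typical 32-bit build), ints_per_block(12) gives 82; assuming 24 bytes (a typical 64-bit build), ints_per_block(24) gives 41.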
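PyIntObject objects[N_INTOBJECTS] means that each _intblock holds N_INTOBJECTS objects. That works out to 82 when sizeof(PyIntObject) is 12 bytes, as on a typical 32-bit build (a 64-bit build with a 24-byte PyIntObject gets 41), and the constants can be changed. In other words, creating 1 large integer or 80 of them costs Python the same amount of memory, because the whole block is allocated up front and each new object is simply placed inside it. Only when the 83rd object is created is a new block allocated. Next, let's walk through how integer objects are created, added and removed, starting with this code in intobject.c: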
[code lang="C"]
PyObject *
PyInt_FromLong(long ival)
{
    register PyIntObject *v;
#if NSMALLNEGINTS + NSMALLPOSINTS > 0
    if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS) {
        v = small_ints[ival + NSMALLNEGINTS];
        Py_INCREF(v);
#ifdef COUNT_ALLOCS
        if (ival >= 0)
            quick_int_allocs++;
        else
            quick_neg_int_allocs++;
#endif
        return (PyObject *) v;
    }
#endif
    if (free_list == NULL) {
        if ((free_list = fill_free_list()) == NULL)
            return NULL;
    }
    /* Inline PyObject_New */
    v = free_list;
    free_list = (PyIntObject *)Py_TYPE(v);
    PyObject_INIT(v, &PyInt_Type);
    v->ob_ival = ival;
    return (PyObject *) v;
}
[/code]
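Reading this code, we can see that PyIntObject creation takes one of two paths:
1. If the small-integer pool is enabled and the value falls within its range, the shared object from the small-integer pool is returned;
2. Otherwise the object is taken from the general integer object pool.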
When free_list is empty, fill_free_list is called to set up a new block:
[code lang="C"]
static PyIntObject *
fill_free_list(void)
{
    PyIntObject *p, *q;
    /* Python's object allocator isn't appropriate for large blocks. */
    p = (PyIntObject *) PyMem_MALLOC(sizeof(PyIntBlock));
    if (p == NULL)
        return (PyIntObject *) PyErr_NoMemory();
    ((PyIntBlock *)p)->next = block_list;
    block_list = (PyIntBlock *)p;
    /* Link the int objects together, from rear to front, then return
       the address of the last int object in the block. */
    p = &((PyIntBlock *)p)->objects[0];
    q = p + N_INTOBJECTS;
    while (--q > p)
        Py_TYPE(q) = (struct _typeobject *)(q-1);
    Py_TYPE(q) = NULL;
    return p + N_INTOBJECTS - 1;
}
[/code]
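If the small-integer pool is not enabled, even small integers go into the general integer object pool. The general pool is a very important concept, and it works roughly as follows:
1. When PyInt_FromLong is called and free_list is NULL, fill_free_list is called to allocate a new block and chain its objects into a free list;
2. Each new int object is taken from the head of free_list, which then moves on to the next free slot. Once every slot in the block has been handed out (N_INTOBJECTS objects, 82 on a typical 32-bit build), free_list is NULL again;
3. The next integer creation then repeats steps 1 and 2.
When an integer object is released, its slot goes back onto free_list for reuse instead of being returned to the system. The block itself is not freed at that point; in CPython 2.7, blocks that no longer hold any live integers are released only when PyInt_ClearFreeList runs.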
A diagram to illustrate: [figure]
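The allocate, pop and push cycle described above can be sketched as a small standalone toy. Everything prefixed with toy_ or Toy is hypothetical and only mirrors the shape of the CPython code: the next field stands in for the ob_type pointer that CPython reuses as a link while an object is free, plain malloc replaces PyMem_MALLOC, and the block size is shrunk to 4 so the behaviour is easy to follow.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_N 4   /* objects per block; tiny on purpose */

/* Toy stand-in for PyIntObject. */
typedef struct toy_int {
    struct toy_int *next;   /* link used only while the object is free */
    long ival;
} ToyInt;

/* Toy stand-in for PyIntBlock. */
typedef struct toy_block {
    struct toy_block *next;
    ToyInt objects[TOY_N];
} ToyBlock;

static ToyBlock *toy_block_list = NULL;
static ToyInt *toy_free_list = NULL;
static int toy_block_count = 0;   /* bookkeeping for the demo only */

/* Allocate one block and chain its objects from rear to front. */
static ToyInt *
toy_fill_free_list(void)
{
    ToyBlock *b = malloc(sizeof(ToyBlock));
    ToyInt *p, *q;
    if (b == NULL)
        return NULL;
    b->next = toy_block_list;
    toy_block_list = b;
    toy_block_count++;
    p = &b->objects[0];
    q = p + TOY_N;
    while (--q > p)
        q->next = q - 1;
    q->next = NULL;            /* q == p here: end of the chain */
    return p + TOY_N - 1;
}

/* Pop one object off the free list, refilling it first if it is empty. */
static ToyInt *
toy_from_long(long ival)
{
    ToyInt *v;
    if (toy_free_list == NULL &&
        (toy_free_list = toy_fill_free_list()) == NULL)
        return NULL;
    v = toy_free_list;
    toy_free_list = v->next;
    v->ival = ival;
    return v;
}

/* Push the object back onto the free list; the block is not freed. */
static void
toy_dealloc(ToyInt *v)
{
    assert(v != NULL);
    v->next = toy_free_list;
    toy_free_list = v;
}
```

With TOY_N set to 4, the first four calls to toy_from_long share one block, a released slot is reused by the next allocation, and a new block appears only once the free list runs dry again.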

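Finally, a quick word on how the small-integer pool is initialized. small_ints holds only pointers to PyIntObject objects. It is initialized by _PyInt_Init: when Python starts up, _PyInt_Init is called, memory is allocated, and the small integer objects are created. They then stay alive for as long as Python is running.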
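A minimal sketch of that start-up step, under simplifying assumptions: the names ToySmallInt, toy_small_init and toy_small_lookup are hypothetical, and the objects live in a static array here purely to keep the example short, which is not how CPython obtains their memory.

```c
#include <assert.h>
#include <stddef.h>

#define NSMALLPOSINTS 257
#define NSMALLNEGINTS 5
#define NSMALL (NSMALLNEGINTS + NSMALLPOSINTS)

typedef struct { long ival; } ToySmallInt;    /* toy stand-in for PyIntObject */

static ToySmallInt toy_storage[NSMALL];       /* backing objects (simplification) */
static ToySmallInt *toy_small_ints[NSMALL];   /* the pointer table */

/* Fill the table once at start-up, mirroring the role of _PyInt_Init. */
static void
toy_small_init(void)
{
    long ival;
    for (ival = -NSMALLNEGINTS; ival < NSMALLPOSINTS; ival++) {
        ToySmallInt *v = &toy_storage[ival + NSMALLNEGINTS];
        v->ival = ival;
        toy_small_ints[ival + NSMALLNEGINTS] = v;
    }
}

/* Return the shared object for a small value, or NULL if ival is not cached. */
static ToySmallInt *
toy_small_lookup(long ival)
{
    assert(toy_small_ints[0] != NULL);   /* toy_small_init must run first */
    if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS)
        return toy_small_ints[ival + NSMALLNEGINTS];
    return NULL;
}
```

Two lookups of the same small value return the same pointer, which is the sharing the pool exists to provide.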
