multiprocessing.shared_memory: Shared memory for direct access across processes
Source code: Lib/multiprocessing/shared_memory.py
Added in version 3.8.
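This module provides a class, SharedMemory, for the allocation and management of shared memory to be accessed by one or more processes on a multicore or symmetric multiprocessor (SMP) machine. To assist with the life-cycle management of shared memory, especially across distinct processes, a BaseManager subclass, SharedMemoryManager, is also provided in the multiprocessing.managers module.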
In this module, shared memory refers to “POSIX style” shared memory blocks (though is not necessarily implemented explicitly as such) and does not refer to “distributed shared memory”. This style of shared memory permits distinct processes to potentially read and write to a common (or shared) region of volatile memory. Processes are conventionally limited to only have access to their own process memory space but shared memory permits the sharing of data between processes, avoiding the need to instead send messages between processes containing that data. Sharing data directly via memory can provide significant performance benefits compared to sharing data via disk or socket or other communications requiring the serialization/deserialization and copying of data.
Create an instance of the SharedMemory class for either creating a new shared memory block or attaching to an existing shared memory block. Each shared memory block is assigned a unique name. In this way, one process can create a shared memory block with a particular name and a different process can attach to that same shared memory block using that same name.
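As a resource for sharing data across processes, shared memory blocks may outlive the original process that created them. When one process no longer needs access to a shared memory block that might still be needed by other processes, the close() method should be called. When a shared memory block is no longer needed by any process, the unlink() method should be called to ensure proper cleanup.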
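The close()/unlink() life cycle described above can be sketched with a try/finally block. This is a minimal single-process illustration, assuming the creating process is also the last user of the block; the variable names are illustrative only.

```python
from multiprocessing import shared_memory

# Create a small block; no name is given, so one is generated.
shm = shared_memory.SharedMemory(create=True, size=16)
try:
    shm.buf[:5] = b"hello"
    # Copy the bytes out while the block is still open.
    first_bytes = bytes(shm.buf[:5])
finally:
    shm.close()   # this instance no longer needs access
    shm.unlink()  # no process needs the block any more: remove it
```

In a multi-process program, every process calls close() on its own instance, while only one process (typically the creator) calls unlink().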
name (str | None) – The unique name for the requested shared memory, specified as a string. When creating a new shared memory block, if None (the default) is supplied for the name, a novel name will be generated.
create (bool) – Control whether a new shared memory block is created (True) or an existing shared memory block is attached (False).
size (int) – The requested number of bytes when creating a new shared memory block. Because some platforms choose to allocate chunks of memory based upon that platform’s memory page size, the exact size of the shared memory block may be larger than or equal to the size requested. When attaching to an existing shared memory block, the size parameter is ignored.
track (bool) – When True, register the shared memory block with a resource tracker process on platforms where the OS does not do this automatically. The resource tracker ensures proper cleanup of the shared memory even if all other processes with access to the memory exit without doing so. Python processes created from a common ancestor using multiprocessing facilities share a single resource tracker process, and the lifetime of shared memory segments is handled automatically among these processes. Python processes created in any other way will receive their own resource tracker when accessing shared memory with track enabled. This will cause the shared memory to be deleted by the resource tracker of the first process that terminates. To avoid this issue, users of subprocess or standalone Python processes should set track to False when there is already another process in place that does the bookkeeping. track is ignored on Windows, which has its own tracking and automatically deletes shared memory when all handles to it have been closed.
Changed in version 3.13: Added the track parameter.
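The parameters above can be exercised together in one short sketch. It is an illustration under stated assumptions rather than a recommended pattern: everything runs in a single process, and track is passed only when the interpreter is 3.13 or newer, since the parameter does not exist before that.

```python
import sys
from multiprocessing import shared_memory

# name=None (the default) requests a generated name; size matters when creating.
creator = shared_memory.SharedMemory(create=True, size=100)
try:
    # The platform may round the allocation up, so only ">=" is guaranteed.
    allocated = creator.size

    # Attaching: create defaults to False and size is ignored.
    # The creator already does the bookkeeping, so tracking is turned off
    # for the second handle where the parameter is available (3.13+).
    extra = {"track": False} if sys.version_info >= (3, 13) else {}
    attached = shared_memory.SharedMemory(name=creator.name, **extra)
    try:
        creator.buf[0] = 42
        seen_by_attached = attached.buf[0]  # same memory, so also 42
    finally:
        attached.close()

    # Creating a second block under a name that is already in use fails.
    try:
        duplicate = shared_memory.SharedMemory(
            name=creator.name, create=True, size=100
        )
    except FileExistsError:
        duplicate_rejected = True
    else:
        duplicate.close()
        duplicate_rejected = False
finally:
    creator.close()
    creator.unlink()
```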
Close the file descriptor/handle to the shared memory from this instance. close() should be called once access to the shared memory block from this instance is no longer needed. Depending on the operating system, the underlying memory may or may not be freed even if all handles to it have been closed. To ensure proper cleanup, use the unlink() method.
Delete the underlying shared memory block. This should be called only once per shared memory block regardless of the number of handles to it, even in other processes. unlink() and close() can be called in any order, but trying to access data inside a shared memory block after unlink() may result in memory access errors, depending on platform.
This method has no effect on Windows, where the only way to delete a shared memory block is to close all handles.
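A small sketch of what "deleted" means in practice: once unlink() has been called, the name can no longer be used to attach. The behavior shown is what POSIX platforms are expected to do; on Windows, where unlink() has no effect, the block disappears when the last handle is closed instead.

```python
from multiprocessing import shared_memory

shm = shared_memory.SharedMemory(create=True, size=8)
name = shm.name

shm.close()   # release this instance's handle
shm.unlink()  # called exactly once for the block

# Attaching by the old name is now expected to fail.
try:
    stray = shared_memory.SharedMemory(name=name)
except FileNotFoundError:
    name_is_gone = True
else:
    stray.close()
    name_is_gone = False
```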
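buf – A memoryview of the contents of the shared memory block.
name – Read-only access to the unique name of the shared memory block.
size – Read-only access to the size in bytes of the shared memory block.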
The following example demonstrates low-level use of SharedMemory instances:
>>> from multiprocessing import shared_memory
>>> shm_a = shared_memory.SharedMemory(create=True, size=10)
>>> type(shm_a.buf)
<class 'memoryview'>
>>> buffer = shm_a.buf
>>> len(buffer)
10
>>> buffer[:4] = bytearray([22, 33, 44, 55])  # Modify multiple at once
>>> buffer[4] = 100                           # Modify single byte at a time
>>> # Attach to an existing shared memory block
>>> shm_b = shared_memory.SharedMemory(shm_a.name)
>>> import array
>>> array.array('b', shm_b.buf[:5])  # Copy the data into a new array.array
array('b', [22, 33, 44, 55, 100])
>>> shm_b.buf[:5] = b'howdy'  # Modify via shm_b using bytes
>>> bytes(shm_a.buf[:5])      # Access via shm_a
b'howdy'
>>> shm_b.close()   # Close each SharedMemory instance
>>> shm_a.close()
>>> shm_a.unlink()  # Call unlink only once to release the shared memory
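Because buf is an ordinary writable memoryview, the struct module can lay out typed fields in it without going through single bytes. The record layout below (an int32, an int64, and a float64) is a hypothetical one chosen for illustration only.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical fixed-size record: little-endian int32, int64, float64.
RECORD = struct.Struct("<iqd")

shm = shared_memory.SharedMemory(create=True, size=RECORD.size)
try:
    # Write the three fields directly into shared memory at offset 0.
    RECORD.pack_into(shm.buf, 0, 7, 123456789012, 2.5)
    # Any process attached to the block could read them back the same way.
    record = RECORD.unpack_from(shm.buf, 0)
finally:
    shm.close()
    shm.unlink()
```

Passing shm.buf itself, rather than keeping a slice of it in a variable, avoids leaving an exported view alive when close() is called.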
The following example demonstrates a practical use of the SharedMemory class with NumPy arrays, accessing the same numpy.ndarray from two distinct Python shells:
>>> # In the first Python interactive shell
>>> import numpy as np
>>> a = np.array([1, 1, 2, 3, 5, 8]) # Start with an existing NumPy array
>>> from multiprocessing import shared_memory
>>> shm = shared_memory.SharedMemory(create=True, size=a.nbytes)
>>> # Now create a NumPy array backed by shared memory
>>> b = np.ndarray(a.shape, dtype=a.dtype, buffer=shm.buf)
>>> b[:] = a[:] # Copy the original data into shared memory
>>> b
array([1, 1, 2, 3, 5, 8])
>>> type(b)
<class 'numpy.ndarray'>
>>> type(a)
<class 'numpy.ndarray'>
>>> shm.name # We did not specify a name so one was chosen for us
'psm_21467_46075'
>>> # In either the same shell or a new Python shell on the same machine
>>> import numpy as np
>>> from multiprocessing import shared_memory
>>> # Attach to the existing shared memory block
>>> existing_shm = shared_memory.SharedMemory(name='psm_21467_46075')
>>> # Note that a.shape is (6,) and a.dtype is np.int64 in this example
>>> c = np.ndarray((6,), dtype=np.int64, buffer=existing_shm.buf)
>>> c
array([1, 1, 2, 3, 5, 8])
>>> c[-1] = 888
>>> c
array([  1,   1,   2,   3,   5, 888])
>>> # Back in the first Python interactive shell, b reflects this change
>>> b
array([  1,   1,   2,   3,   5, 888])
>>> # Clean up from within the second Python shell
>>> del c # Unnecessary; merely emphasizing the array is no longer used
>>> existing_shm.close()
>>> # Clean up from within the first Python shell
>>> del b # Unnecessary; merely emphasizing the array is no longer used
>>> shm.close()
>>> shm.unlink() # Free and release the shared memory block at the very end
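Where NumPy is not available, a similar typed view can be approximated with memoryview.cast() from the standard library. This is a sketch rather than an equivalent of the NumPy example: it runs in one process, and it assumes the cast view is released explicitly before close() is called.

```python
from multiprocessing import shared_memory

values = [1, 1, 2, 3, 5, 8]
nbytes = 8 * len(values)  # six signed 64-bit integers

shm = shared_memory.SharedMemory(create=True, size=nbytes)
# Slice to the exact byte count (the block may be larger), then
# reinterpret those bytes as signed 64-bit integers ('q').
ints = shm.buf[:nbytes].cast("q")
try:
    for i, value in enumerate(values):
        ints[i] = value
    ints[5] = 888             # same update as in the NumPy example
    snapshot = ints.tolist()  # copy out as a regular list
finally:
    ints.release()  # views onto the block must be released before close()
    shm.close()
    shm.unlink()
```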
SharedMemoryManager is a subclass of multiprocessing.managers.BaseManager which can be used for the management of shared memory blocks across processes.
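A call to start() on a SharedMemoryManager instance causes a new process to be started. This new process’s sole purpose is to manage the life cycle of all shared memory blocks created through it. To trigger the release of all shared memory blocks managed by that process, call shutdown() on the instance. This triggers an unlink() call on all SharedMemory objects managed by that process and then stops the process itself. Creating SharedMemory instances through a SharedMemoryManager avoids the need to manually track and trigger the freeing of shared memory resources.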
This class provides methods for creating and returning SharedMemory instances and for creating a list-like object (ShareableList) backed by shared memory.
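Refer to BaseManager for a description of the inherited address and authkey optional input arguments and how they may be used to connect to an existing SharedMemoryManager service from other processes.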
SharedMemory(size) – Create and return a new SharedMemory object with the specified size in bytes.
ShareableList(sequence) – Create and return a new ShareableList object, initialized by the values from the input sequence.
The following example demonstrates the basic mechanisms of a SharedMemoryManager:
>>> from multiprocessing.managers import SharedMemoryManager
>>> smm = SharedMemoryManager()
>>> smm.start() # Start the process that manages the shared memory blocks
>>> sl = smm.ShareableList(range(4))
>>> sl
ShareableList([0, 1, 2, 3], name='psm_6572_7512')
>>> raw_shm = smm.SharedMemory(size=128)
>>> another_sl = smm.ShareableList('alpha')
>>> another_sl
ShareableList(['a', 'l', 'p', 'h', 'a'], name='psm_6572_12221')
>>> smm.shutdown() # Calls unlink() on sl, raw_shm, and another_sl
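The following example depicts a potentially more convenient pattern for using SharedMemoryManager objects via the with statement to ensure that all shared memory blocks are released after they are no longer needed: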
>>> from multiprocessing import Process
>>> from multiprocessing.managers import SharedMemoryManager
>>> # do_work(shared_list, start, stop) is a user-supplied worker, not defined here
>>> with SharedMemoryManager() as smm:
...     sl = smm.ShareableList(range(2000))
...     # Divide the work among two processes, storing partial results in sl
...     p1 = Process(target=do_work, args=(sl, 0, 1000))
...     p2 = Process(target=do_work, args=(sl, 1000, 2000))
...     p1.start()
...     p2.start()  # A multiprocessing.Pool might be more efficient
...     p1.join()
...     p2.join()   # Wait for all work to complete in both processes
...     total_result = sum(sl)  # Consolidate the partial results now in sl
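When using a SharedMemoryManager in a with statement, the shared memory blocks created using that manager are all released when the with statement’s code block finishes execution.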
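The with-statement pattern can be turned into a complete script along the following lines. The worker do_work and the wrapper main are hypothetical names introduced here for illustration; the worker simply stores the square of each index. The `if __name__ == "__main__"` guard is included because start methods that re-import the main module need it.

```python
from multiprocessing import Process
from multiprocessing.managers import SharedMemoryManager


def do_work(shared_list, start, stop):
    """Hypothetical worker: overwrite each slot in [start, stop) with its square."""
    for i in range(start, stop):
        shared_list[i] = i * i


def main():
    with SharedMemoryManager() as smm:
        sl = smm.ShareableList(range(2000))
        # Each process receives the list by name and attaches to the same block.
        p1 = Process(target=do_work, args=(sl, 0, 1000))
        p2 = Process(target=do_work, args=(sl, 1000, 2000))
        p1.start()
        p2.start()
        p1.join()
        p2.join()
        # Read the results before the with block ends and the memory is released.
        return sum(sl)


if __name__ == "__main__":
    print(main())
```

The sum is computed inside the with block on purpose: once the block exits, the manager unlinks the shared memory and sl must not be used any more.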
Provide a mutable list-like object where all values stored within are stored in a shared memory block. This constrains storable values to the following built-in data types:
int (signed 64-bit)
float
bool
str (less than 10M bytes each when encoded as UTF-8)
bytes (less than 10M bytes each)
None
It also notably differs from the built-in list type in that these lists cannot change their overall length (i.e. no append(), insert(), etc.) and do not support the dynamic creation of new ShareableList instances via slicing.
sequence is used in populating a new ShareableList full of values. Set to None to instead attach to an already existing ShareableList by its unique shared memory name.
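name is the unique name for the requested shared memory, as described in the definition for SharedMemory. When attaching to an existing ShareableList, specify its shared memory block’s unique name while leaving sequence set to None.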
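The fixed-length and no-slicing restrictions can be worked around by copying into a regular list first. The sketch below is one possible approach, not the only one: to "grow" a ShareableList it builds a second, larger one from the old contents.

```python
from multiprocessing import shared_memory

sl = shared_memory.ShareableList([1, 2, 3])
try:
    # There is no append(): the length is fixed at creation time.
    has_append = hasattr(sl, "append")

    # Slice a plain-list copy instead of the ShareableList itself.
    head = list(sl)[:2]

    # "Growing" means allocating a new ShareableList with the extra item.
    grown = shared_memory.ShareableList(list(sl) + [4])
    try:
        grown_values = list(grown)
    finally:
        grown.shm.close()
        grown.shm.unlink()
finally:
    sl.shm.close()
    sl.shm.unlink()
```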
Note: A known issue exists for bytes and str values. If they end with \x00 nul bytes or characters, those may be silently stripped when fetching them by index from the ShareableList. This .rstrip(b'\x00') behavior is considered a bug and may go away in the future. See gh-106939.
For applications where rstripping of trailing nulls is a problem, work around it by always unconditionally appending an extra non-0 byte to the end of such values when storing and unconditionally removing it when fetching:
>>> from multiprocessing import shared_memory
>>> nul_bug_demo = shared_memory.ShareableList(['?\x00', b'\x03\x02\x01\x00\x00\x00'])
>>> nul_bug_demo[0]
'?'
>>> nul_bug_demo[1]
b'\x03\x02\x01'
>>> nul_bug_demo.shm.unlink()
>>> padded = shared_memory.ShareableList(['?\x00\x07', b'\x03\x02\x01\x00\x00\x00\x07'])
>>> padded[0][:-1]
'?\x00'
>>> padded[1][:-1]
b'\x03\x02\x01\x00\x00\x00'
>>> padded.shm.unlink()
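The same workaround can be wrapped in two small helpers so that callers do not have to remember the sentinel. The helper names pad and unpad, and the choice of \x07 as the sentinel, are illustrative assumptions; any non-zero byte or character would do.

```python
from multiprocessing import shared_memory

SENTINEL_STR = "\x07"     # any non-nul character works
SENTINEL_BYTES = b"\x07"  # any non-zero byte works


def pad(value):
    """Append a non-nul sentinel so trailing nulls are never at the end."""
    if isinstance(value, str):
        return value + SENTINEL_STR
    return value + SENTINEL_BYTES


def unpad(value):
    """Drop the sentinel added by pad()."""
    return value[:-1]


sl = shared_memory.ShareableList(
    [pad("?\x00"), pad(b"\x03\x02\x01\x00\x00\x00")]
)
try:
    text = unpad(sl[0])
    blob = unpad(sl[1])
finally:
    sl.shm.close()
    sl.shm.unlink()
```

Because the sentinel is added and removed unconditionally, the helpers keep working whether or not the stripping behavior is present in a given Python version.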
Return the number of occurrences of value.
Return the first index position of value. Raise ValueError if value is not present.
format – Read-only attribute containing the struct packing format used by all currently stored values.
shm – The SharedMemory instance where the values are stored.
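A brief sketch of the lookup methods and attributes just described, including the ValueError raised for a missing value. The exact contents of format are not shown because they depend on the stored values; only its type is checked here.

```python
from multiprocessing import shared_memory

sl = shared_memory.ShareableList(["howdy", 100, None, 100])
try:
    occurrences = sl.count(100)    # the value appears twice
    first_position = sl.index(100) # position of the first match

    try:
        sl.index("absent")
    except ValueError:
        missing_raises = True
    else:
        missing_raises = False

    format_is_str = isinstance(sl.format, str)  # a struct format string
    block_name = sl.shm.name  # name other processes would attach with
finally:
    sl.shm.close()
    sl.shm.unlink()
```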
The following example demonstrates basic use of a ShareableList instance:
>>> from multiprocessing import shared_memory
>>> a = shared_memory.ShareableList(['howdy', b'HoWdY', -273.154, 100, None, True, 42])
>>> [ type(entry) for entry in a ]
[<class 'str'>, <class 'bytes'>, <class 'float'>, <class 'int'>, <class 'NoneType'>, <class 'bool'>, <class 'int'>]
>>> a[2]
-273.154
>>> a[2] = -78.5
>>> a[2]
-78.5
>>> a[2] = 'dry ice' # Changing data types is supported as well
>>> a[2]
'dry ice'
>>> a[2] = 'larger than previously allocated storage space'
Traceback (most recent call last):
...
ValueError: exceeds available storage for existing str
>>> a[2]
'dry ice'
>>> len(a)
7
>>> a.index(42)
6
>>> a.count(b'howdy')
0
>>> a.count(b'HoWdY')
1
>>> a.shm.close()
>>> a.shm.unlink()
>>> del a # Use of a ShareableList after call to unlink() is unsupported
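The following example depicts how one, two, or many processes may access the same ShareableList by supplying the name of the shared memory block behind it: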
>>> b = shared_memory.ShareableList(range(5)) # In a first process
>>> c = shared_memory.ShareableList(name=b.shm.name) # In a second process
>>> c
ShareableList([0, 1, 2, 3, 4], name='...')
>>> c[-1] = -999
>>> b[-1]
-999
>>> b.shm.close()
>>> c.shm.close()
>>> c.shm.unlink()
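The following example demonstrates that ShareableList (and underlying SharedMemory) objects can be pickled and unpickled if needed. Note that it will still be the same shared object. This happens because the deserialized object has the same unique name and simply attaches to the existing object with that name (if the object is still alive):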
>>> import pickle
>>> from multiprocessing import shared_memory
>>> sl = shared_memory.ShareableList(range(10))
>>> list(sl)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> deserialized_sl = pickle.loads(pickle.dumps(sl))
>>> list(deserialized_sl)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> sl[0] = -1
>>> deserialized_sl[1] = -2
>>> list(sl)
[-1, -2, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(deserialized_sl)
[-1, -2, 2, 3, 4, 5, 6, 7, 8, 9]
>>> sl.shm.close()
>>> sl.shm.unlink()