codecs — Codec registry and base classes

Source code: Lib/codecs.py


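This module defines base classes for standard Python codecs (encoders and decoders) and provides access to the internal Python codec registry, which manages the codec and error handling lookup process. Most standard codecs are text encodings, which encode text to bytes (and decode bytes to text), but codecs that encode text to text and bytes to bytes are also provided. Custom codecs may encode and decode between arbitrary types, but some module features are restricted to text encodings or to codecs that encode to bytes.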

The module defines the following functions for encoding and decoding with any codec:

codecs.encode(obj, encoding='utf-8', errors='strict')

Encodes obj using the codec registered for encoding.

The errors argument may be given to set the desired error handling scheme. The default error handler is 'strict', meaning that encoding errors raise ValueError (or a more codec-specific subclass, such as UnicodeEncodeError). Refer to Codec Base Classes for more information on codec error handling.
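As an illustration of the description above, the following minimal sketch calls codecs.encode() with a text encoding, a bytes-to-bytes codec and a text-to-text codec, and then contrasts the 'strict' and 'replace' error handlers. The sample strings and variable names are arbitrary choices made for this example.

```python
import codecs

# Text encoding: str -> bytes, using the default 'utf-8' codec.
utf8_bytes = codecs.encode("café")

# Bytes-to-bytes codec: the 'hex' codec maps bytes to their hex digits.
hex_bytes = codecs.encode(b"abc", "hex")

# Text-to-text codec: 'rot13' maps str to str.
rot13_text = codecs.encode("Hello", "rot13")

# With the default 'strict' handler, an unencodable character raises
# UnicodeEncodeError, which is a subclass of ValueError.
try:
    codecs.encode("café", "ascii")
except UnicodeEncodeError as exc:
    strict_error = exc

# The 'replace' handler substitutes '?' for what cannot be encoded.
replaced = codecs.encode("café", "ascii", errors="replace")
```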

codecs.decode(obj, encoding='utf-8', errors='strict')

Decodes obj using the codec registered for encoding.

The errors argument may be given to set the desired error handling scheme. The default error handler is 'strict', meaning that decoding errors raise ValueError (or a more codec-specific subclass, such as UnicodeDecodeError). Refer to Codec Base Classes for more information on codec error handling.
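A short sketch of codecs.decode() and its error handling follows. The byte strings are made up for the example; b"\xff" is used because it can never start a valid UTF-8 sequence.

```python
import codecs

# Decode UTF-8 bytes back to text using the default codec.
text = codecs.decode(b"caf\xc3\xa9")

# Under 'strict', malformed input raises UnicodeDecodeError (a ValueError).
try:
    codecs.decode(b"caf\xff", "utf-8")
except UnicodeDecodeError as exc:
    strict_error = exc

# 'replace' inserts U+FFFD for the bad byte; 'ignore' silently drops it.
replaced = codecs.decode(b"caf\xff", "utf-8", errors="replace")
ignored = codecs.decode(b"caf\xff", "utf-8", errors="ignore")
```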

The full details for each codec can also be looked up directly:

codecs.lookup(encoding)

Looks up the codec info in the Python codec registry and returns a CodecInfo object as defined below.

Encodings are first looked up in the registry’s cache. If not found, the list of registered search functions is scanned. If no CodecInfo object is found, a LookupError is raised. Otherwise, the CodecInfo object is stored in the cache and returned to the caller.
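The lookup behaviour described above can be tried out as follows. This is only a sketch: the name "no-such-codec-example" is a placeholder assumed not to be registered on the system running the code.

```python
import codecs

# Different spellings of the same encoding resolve to a codec
# whose CodecInfo reports the same name.
info = codecs.lookup("UTF-8")
same_info = codecs.lookup("utf8")

# An unknown encoding name raises LookupError.
try:
    codecs.lookup("no-such-codec-example")  # placeholder name
except LookupError as exc:
    lookup_error = exc
else:
    lookup_error = None
```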

class codecs.CodecInfo(encode, decode, streamreader=None, streamwriter=None, incrementalencoder=None, incrementaldecoder=None, name=None)

Codec details when looking up the codec registry. The constructor arguments are stored in attributes of the same name:

name

The name of the encoding.

encode
decode

The stateless encoding and decoding functions. These must be functions or methods which have the same interface as the encode() and decode() methods of Codec instances (see Codec Interface). The functions or methods are expected to work in a stateless mode.

incrementalencoder
incrementaldecoder

Incremental encoder and decoder classes or factory functions. These have to provide the interface defined by the base classes IncrementalEncoder and IncrementalDecoder, respectively. Incremental codecs can maintain state.

streamwriter
streamreader

Stream writer and reader classes or factory functions. These have to provide the interface defined by the base classes StreamWriter and StreamReader, respectively. Stream codecs can maintain state.
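To make the attributes listed above concrete, the sketch below inspects the CodecInfo object for the built-in 'utf-8' codec. It relies on the Codec interface convention that the stateless functions return a pair of (output, length of input consumed). The sample text is arbitrary.

```python
import codecs

info = codecs.lookup("utf-8")

# Stateless functions return a tuple: (result, amount of input consumed).
encoded, chars_consumed = info.encode("héllo")
decoded, bytes_consumed = info.decode(encoded)

# The incremental and stream attributes are classes (or factories) that
# implement the corresponding base-class interfaces.
has_incremental = (
    issubclass(info.incrementalencoder, codecs.IncrementalEncoder)
    and issubclass(info.incrementaldecoder, codecs.IncrementalDecoder)
)
has_streams = (
    issubclass(info.streamwriter, codecs.StreamWriter)
    and issubclass(info.streamreader, codecs.StreamReader)
)
```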

To simplify access to the various codec components, the module provides these additional functions which use lookup() for the codec lookup:

codecs.getencoder(encoding)

Look up the codec for the given encoding and return its encoder function.

Raises a LookupError in case the encoding cannot be found.

codecs.getdecoder(encoding)

Look up the codec for the given encoding and return its decoder function.

Raises a LookupError in case the encoding cannot be found.
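A brief sketch of getencoder() and getdecoder(): the returned functions are the codec's stateless encode and decode functions, so each call returns a (result, length consumed) pair. 'latin-1' is chosen here only because every character maps to exactly one byte.

```python
import codecs

encode_latin1 = codecs.getencoder("latin-1")
decode_latin1 = codecs.getdecoder("latin-1")

# Each call returns (output, number of input items consumed).
data, chars_consumed = encode_latin1("ñ")
text, bytes_consumed = decode_latin1(data)
```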

codecs.getincrementalencoder(encoding)

Look up the codec for the given encoding and return its incremental encoder class or factory function.

Raises a LookupError in case the encoding cannot be found or the codec doesn't support an incremental encoder.

codecs.getincrementaldecoder(encoding)

Look up the codec for the given encoding and return its incremental decoder class or factory function.

Raises a LookupError in case the encoding cannot be found or the codec doesn't support an incremental decoder.
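The value of incremental codecs shows up when input arrives in pieces. In the sketch below, a two-byte UTF-8 sequence is split across two chunks and the decoder carries the pending byte over between calls. The UTF-16 encoder in turn remembers that it has already written its byte order mark. The chunk boundaries are invented for the example.

```python
import codecs

# Instantiate the class (or call the factory) returned by the lookup.
decoder = codecs.getincrementaldecoder("utf-8")()

# "é" is b"\xc3\xa9" in UTF-8; here it is split across two chunks.
chunks = [b"caf\xc3", b"\xa9!"]
pieces = [decoder.decode(chunk) for chunk in chunks]
# final=True signals the end of input and flushes any buffered state.
pieces.append(decoder.decode(b"", final=True))
text = "".join(pieces)

# The 'utf-16' incremental encoder emits the BOM only on the first call.
encoder = codecs.getincrementalencoder("utf-16")()
first = encoder.encode("a")
second = encoder.encode("b", final=True)
```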

codecs.getreader(encoding)

Look up the codec for the given encoding and return its StreamReader class or factory function.

Raises a LookupError in case the encoding cannot be found.

codecs.getwriter(encoding)

Look up the codec for the given encoding and return its StreamWriter class or factory function.

Raises a LookupError in case the encoding cannot be found.
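A possible use of getwriter() and getreader() is to wrap a binary stream so that text can be written to it and read from it. The sketch below uses in-memory io.BytesIO buffers as stand-ins for real files or sockets.

```python
import codecs
import io

# Wrap a binary stream in a StreamWriter that encodes text on write.
raw = io.BytesIO()
writer = codecs.getwriter("utf-8")(raw)
writer.write("naïve\n")
written = raw.getvalue()

# Wrap a binary stream in a StreamReader that decodes bytes on read.
reader = codecs.getreader("utf-8")(io.BytesIO(written))
text = reader.read()
```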

Custom codecs are made available by registering a suitable codec search function:

codecs.register(search_function)

Register a codec search function. Search functions are expected to take one argument, being the encoding name in all lower case letters with hyphens and spaces converted to underscores, and return a CodecInfo object. In case a search function cannot find a given encoding, it should return None.

Changed in version 3.9: Hyphens and spaces are converted to underscore.
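The registration mechanism can be sketched with a toy codec. Everything named here is hypothetical and exists only for illustration: the codec name "demo_reversed", the helper functions and the reversing behaviour. The example assumes Python 3.9 or later, where "Demo-Reversed" reaches the search function as "demo_reversed".

```python
import codecs

def _demo_encode(text, errors="strict"):
    # Hypothetical stateless encoder: reverse the text, then UTF-8 encode.
    # Returns (output, length of input consumed).
    return text[::-1].encode("utf-8", errors), len(text)

def _demo_decode(data, errors="strict"):
    # Hypothetical stateless decoder: UTF-8 decode, then reverse.
    data = bytes(data)
    return data.decode("utf-8", errors)[::-1], len(data)

def demo_search(name):
    # The registry passes the normalized name: lower case, with hyphens
    # and spaces converted to underscores (Python 3.9+).
    if name == "demo_reversed":
        return codecs.CodecInfo(
            encode=_demo_encode,
            decode=_demo_decode,
            name="demo-reversed",
        )
    return None  # not ours; let other search functions try

codecs.register(demo_search)

encoded = codecs.encode("abc", "Demo-Reversed")
decoded = codecs.decode(encoded, "demo_reversed")
found_name = codecs.lookup("demo reversed").name
```

Since registration changes process-wide state, a search function like this one is usually registered once, at import time of the package that provides the codec.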