email.headerregistry: Custom Header Objects

Source code: Lib/email/headerregistry.py


New in version 3.6: [1]

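Headers are represented by customized subclasses of str. The particular class used to represent a given header is determined by the header_factory of the policy in effect when the headers are created. This section documents the particular header_factory implemented by the email package for handling RFC 5322 compliant email messages, which not only provides customized header objects for various header types, but also provides an extension mechanism for applications to add their own custom header types.

When using any of the policy objects derived from EmailPolicy, all headers are produced by HeaderRegistry and have BaseHeader as their last base class. Each header class has an additional base class that is determined by the type of the header. For example, many headers have the class UnstructuredHeader as their other base class. The specialized second class for a header is determined by the name of the header, using a lookup table stored in the HeaderRegistry. All of this is managed transparently for the typical application program, but interfaces are provided for modifying the default behavior for use by more complex applications.

The sections below first document the header base classes and their attributes, followed by the API for modifying the behavior of HeaderRegistry, and finally the support classes used to represent the data parsed from structured headers.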

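class email.headerregistry.BaseHeader(name, value)

name and value are passed to BaseHeader from the header_factory call. The string value of any header object is the value fully decoded to unicode.

This base class defines the following read-only properties:

name

The name of the header (the portion of the field before the ':'). This is exactly the value passed in the header_factory call for name; that is, case is preserved.

defects

A tuple of HeaderDefect instances reporting any RFC compliance problems found during parsing. The email package tries to be complete about detecting compliance issues. See the errors module for a discussion of the types of defects that may be reported.

max_count

The maximum number of headers of this type that can have the same name. A value of None means unlimited. The BaseHeader value for this attribute is None; it is expected that specialized header classes will override this value as needed.

BaseHeader also provides the following method, which is called by the email library code and should not in general be called by application programs:

fold(*, policy)

Return a string containing linesep characters as required to correctly fold the header according to policy. A cte_type of 8bit will be treated as if it were 7bit, since headers may not contain arbitrary binary data. If utf8 is False, non-ASCII data will be RFC 2047 encoded.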
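As a rough illustration of the properties and method above, the following sketch builds a Subject header through a HeaderRegistry instance and inspects it. The header text is invented for the example, and fold is called directly only to show what it returns; applications would normally leave that call to the library.

```python
from email import policy
from email.headerregistry import HeaderRegistry

# Calling a HeaderRegistry instance builds a header object for the given name.
registry = HeaderRegistry()
subject = registry('Subject', 'Hello world')

header_name = subject.name    # case is preserved: 'Subject'
defects = subject.defects     # tuple of HeaderDefect instances (empty here)
limit = subject.max_count     # 1, because Subject maps to a Unique class

# fold() takes policy as a keyword-only argument and returns the full header line.
folded = subject.fold(policy=policy.default)
print(repr(folded))           # 'Subject: Hello world\n'
```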

BaseHeader by itself cannot be used to create a header object. It defines a protocol that each specialized header cooperates with in order to produce the header object. Specifically, BaseHeader requires that the specialized class provide a classmethod() named parse. This method is called as follows:

parse(string, kwds)

kwds is a dictionary containing one pre-initialized key, defects . defects is an empty list. The parse method should append any detected defects to this list. On return, the kwds dictionary must contain values for at least the keys decoded and defects . decoded should be the string value for the header (that is, the header value fully decoded to unicode). The parse method should assume that string may contain content-transfer-encoded parts, but should correctly handle all valid unicode characters as well so that it can parse un-encoded header values.

BaseHeader’s __new__ then creates the header instance, and calls its init method. The specialized class only needs to provide an init method if it wishes to set additional attributes beyond those provided by BaseHeader itself. Such an init method should look like this:

def init(self, /, *args, **kw):
    # Remove the extra key that this class's parse method added to kwds.
    self._myattr = kw.pop('myattr')
    # Pass everything that remains on to the BaseHeader init method.
    super().init(*args, **kw)

That is, anything extra that the specialized class puts into the kwds dictionary should be removed and handled, and the remaining contents of kw (and args) passed to the BaseHeader init method.
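The parse/init protocol can be sketched with a small specialized class. Everything named here (PriorityHeader, the X-Priority header name, the level attribute) is hypothetical and exists only for illustration. The sketch subclasses UnstructuredHeader and delegates to its parse method so that the standard parser fills in the keys BaseHeader expects; that is a convenience chosen for this example, not a requirement stated by the protocol.

```python
from email.headerregistry import HeaderRegistry, UnstructuredHeader


class PriorityHeader(UnstructuredHeader):
    """Hypothetical specialized class for an 'X-Priority' header."""

    max_count = 1

    @classmethod
    def parse(cls, value, kwds):
        # Let the unstructured parser populate 'decoded' and its own bookkeeping.
        super().parse(value, kwds)
        # Add one extra key; init() below must remove it again.
        try:
            kwds['level'] = int(kwds['decoded'].strip())
        except ValueError:
            kwds['level'] = None

    def init(self, /, *args, **kw):
        self._level = kw.pop('level')
        super().init(*args, **kw)

    @property
    def level(self):
        return self._level


registry = HeaderRegistry()
registry.map_to_type('X-Priority', PriorityHeader)
header = registry('X-Priority', '3')
not_a_number = registry('X-Priority', 'high')
print(header.name, header.level, not_a_number.level)
```

The extra level key travels from parse to init through kwds, and init pops it before delegating, which is the cooperation the text describes.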

class email.headerregistry.UnstructuredHeader

An “unstructured” header is the default type of header in RFC 5322. Any header that does not have a specified syntax is treated as unstructured. The classic example of an unstructured header is the Subject header.

In RFC 5322, an unstructured header is a run of arbitrary text in the ASCII character set. RFC 2047, however, has an RFC 5322 compatible mechanism for encoding non-ASCII text as ASCII characters within a header value. When a value containing encoded words is passed to the constructor, the UnstructuredHeader parser converts such encoded words into unicode, following the RFC 2047 rules for unstructured text. The parser uses heuristics to attempt to decode certain non-compliant encoded words. Defects are registered in such cases, as well as defects for issues such as invalid characters within the encoded words or the non-encoded text.

This header type provides no additional attributes.
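A minimal sketch of the encoded-word handling described above, using an invented Subject value that consists of a single RFC 2047 encoded word:

```python
from email.headerregistry import HeaderRegistry

registry = HeaderRegistry()

# One RFC 2047 encoded word: UTF-8 charset, quoted-printable ("q") encoding.
subject = registry('Subject', '=?utf-8?q?caf=C3=A9?=')

decoded = str(subject)        # the header's string value is fully decoded
print(decoded)                # café
print(subject.defects)        # ()
```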

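class email.headerregistry.DateHeader

RFC 5322 specifies a very specific format for dates within email headers. The DateHeader parser recognizes that date format, as well as a number of variant forms that are sometimes found “in the wild”.

This header type provides the following additional attributes:

datetime

If the header value can be recognized as a valid date of one form or another, this attribute will contain a datetime instance representing that date. If the timezone of the input date is specified as -0000 (indicating it is in UTC but contains no information about the source timezone), then datetime will be a naive datetime. If a specific timezone offset is found (including +0000), then datetime will contain an aware datetime that uses datetime.timezone to record the timezone offset.

The decoded value of the header is determined by formatting the datetime according to the RFC 5322 rules; that is, it is set to:

email.utils.format_datetime(self.datetime)

When creating a DateHeader, value may be a datetime instance. This means, for example, that the following code is valid and does what one would expect:

msg['Date'] = datetime(2011, 7, 15, 21)

Because this is a naive datetime it will be interpreted as a UTC timestamp, and the resulting value will have a timezone of -0000. Much more useful is to use the localtime() function from the utils module:

msg['Date'] = utils.localtime()

This example sets the date header to the current time and date using the current timezone offset.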
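The naive versus aware distinction can be checked with a short, self-contained sketch. It creates the headers through a HeaderRegistry instance rather than a message object, and the date values are arbitrary.

```python
from datetime import datetime, timedelta
from email.headerregistry import HeaderRegistry

registry = HeaderRegistry()

# -0000 means "UTC, source timezone unknown", so the result is a naive datetime.
unknown = registry('Date', 'Fri, 15 Jul 2011 21:00:00 -0000')
print(unknown.datetime.tzinfo)        # None

# An explicit offset, even +0000, yields an aware datetime.
aware = registry('Date', 'Fri, 15 Jul 2011 21:00:00 +0000')
print(aware.datetime.utcoffset())     # 0:00:00

# A naive datetime passed as the value is formatted as UTC with -0000.
from_naive = registry('Date', datetime(2011, 7, 15, 21))
print(str(from_naive))                # Fri, 15 Jul 2011 21:00:00 -0000
```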

class email.headerregistry.AddressHeader

Address headers are one of the most complex structured header types. The AddressHeader class provides a generic interface to any address header.

This header type provides the following additional attributes:

groups

A tuple of Group objects encoding the addresses and groups found in the header value. Addresses that are not part of a group are represented in this list as single-address Groups whose display_name is None.

addresses

A tuple of Address objects encoding all of the individual addresses from the header value. If the header value contains any groups, the individual addresses from the group are included in the list at the point where the group occurs in the value (that is, the list of addresses is “flattened” into a one dimensional list).

The decoded value of the header will have all encoded words decoded to unicode. idna encoded domain names are also decoded to unicode. The decoded value is set by joining the str value of the elements of the groups attribute with ', '.

A list of Address and Group objects in any combination may be used to set the value of an address header. Group objects whose display_name is None will be interpreted as single addresses, which allows an address list to be copied with groups intact by using the list obtained from the groups attribute of the source header.
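The relationship between groups and addresses is easiest to see on a value that mixes a lone address with a group. The sketch below uses invented example.com addresses and a HeaderRegistry instance to build the headers.

```python
from email.headerregistry import Address, HeaderRegistry

registry = HeaderRegistry()
to = registry('To', 'Fred <fred@example.com>, team: a@example.com, b@example.com;')

# Two top-level entries: a lone address (display_name None) and the group.
group_names = [group.display_name for group in to.groups]

# addresses flattens the group members into one tuple, in order of appearance.
specs = [address.addr_spec for address in to.addresses]
print(group_names, specs)

# A list of Address and/or Group objects can also be used as the value.
cc = registry('Cc', [Address('Fred', 'fred', 'example.com')])
print(str(cc))                # Fred <fred@example.com>
```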

class email.headerregistry.SingleAddressHeader

A subclass of AddressHeader that adds one additional attribute:

address

The single address encoded by the header value. If the header value actually contains more than one address (which would be a violation of the RFC under the default policy), accessing this attribute will result in a ValueError.
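A brief sketch of the address attribute, using the Sender header (which the default mapping assigns to a single-address class) and invented addresses:

```python
from email.headerregistry import HeaderRegistry

registry = HeaderRegistry()

sender = registry('Sender', 'Fred <fred@example.com>')
print(sender.address.addr_spec)       # fred@example.com

# Creating the header with two addresses succeeds; reading .address raises.
two = registry('Sender', 'a@example.com, b@example.com')
error = None
try:
    two.address
except ValueError as exc:
    error = exc
print(type(error).__name__)           # ValueError
```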

Many of the above classes also have a Unique variant (for example, UniqueUnstructuredHeader). The only difference is that in the Unique variant, max_count is set to 1.

class email.headerregistry.MIMEVersionHeader

There is really only one valid value for the MIME-Version header, and that is 1.0. For future proofing, this header class supports other valid version numbers. If a version number has a valid value per RFC 2045, then the header object will have non-None values for the following attributes:

version

The version number as a string, with any whitespace and/or comments removed.

major

The major version number as an integer.

minor

The minor version number as an integer.
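For instance, a MIME-Version value carrying a comment might be inspected as in this sketch; the comment text is invented.

```python
from email.headerregistry import HeaderRegistry

registry = HeaderRegistry()
mime_version = registry('MIME-Version', '1.0 (produced by hand)')

# The comment in parentheses is dropped from version; major/minor are ints.
print(mime_version.version, mime_version.major, mime_version.minor)   # 1.0 1 0
```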

class email.headerregistry.ParameterizedMIMEHeader

MIME headers all start with the prefix ‘Content-’. Each specific header has a certain value, described under the class for that header. Some can also take a list of supplemental parameters, which have a common format. This class serves as a base for all the MIME headers that take parameters.

params

A dictionary mapping parameter names to parameter values.

class email.headerregistry.ContentTypeHeader

A ParameterizedMIMEHeader class that handles the Content-Type header.

content_type

The content type string, in the form maintype/subtype.

maintype

subtype

class email.headerregistry.ContentDispositionHeader

A ParameterizedMIMEHeader class that handles the Content-Disposition header.

content_disposition

inline and attachment are the only valid values in common use.

class email.headerregistry.ContentTransferEncodingHeader

Handles the Content-Transfer-Encoding header.

cte

Valid values are 7bit, 8bit, base64, and quoted-printable. See RFC 2045 for more information.
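The attributes of the three Content- header classes can be sketched together. The header values below, including the file name, are invented for the example.

```python
from email.headerregistry import HeaderRegistry

registry = HeaderRegistry()

ctype = registry('Content-Type', 'text/plain; charset="utf-8"')
print(ctype.content_type, ctype.maintype, ctype.subtype)   # text/plain text plain
print(dict(ctype.params))                                  # {'charset': 'utf-8'}

disposition = registry('Content-Disposition', 'attachment; filename="report.pdf"')
print(disposition.content_disposition, disposition.params['filename'])

encoding = registry('Content-Transfer-Encoding', 'base64')
print(encoding.cte)                                        # base64
```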

class email.headerregistry.HeaderRegistry(base_class=BaseHeader, default_class=UnstructuredHeader, use_default_map=True)

This is the factory used by EmailPolicy by default. HeaderRegistry builds the class used to create a header instance dynamically, using base_class and a specialized class retrieved from a registry that it holds. When a given header name does not appear in the registry, the class specified by default_class is used as the specialized class. When use_default_map is True (the default), the standard mapping of header names to classes is copied into the registry during initialization. base_class is always the last class in the generated class’s __bases__ list.

The default mappings are:

subject

UniqueUnstructuredHeader

date

UniqueDateHeader

resent-date

DateHeader

orig-date

UniqueDateHeader

sender

UniqueSingleAddressHeader

resent-sender

SingleAddressHeader

to

UniqueAddressHeader

resent-to

AddressHeader

cc

UniqueAddressHeader

resent-cc

AddressHeader

bcc

UniqueAddressHeader

resent-bcc

AddressHeader

from

UniqueAddressHeader

resent-from

AddressHeader

reply-to

UniqueAddressHeader

mime-version

MIMEVersionHeader

content-type

ContentTypeHeader

content-disposition

ContentDispositionHeader

content-transfer-encoding

ContentTransferEncodingHeader

message-id

MessageIDHeader

HeaderRegistry has the following methods:

map_to_type(self, name, cls)

name is the name of the header to be mapped. It will be converted to lower case in the registry. cls is the specialized class to be used, along with base_class, to create the class used to instantiate headers that match name.

__getitem__(name)

Construct and return a class to handle creating a name header.

__call__(name, value)

Retrieves the specialized header associated with name from the registry (using default_class if name does not appear in the registry) and composes it with base_class to produce a class, calls the constructed class’s constructor, passing it the same argument list, and finally returns the class instance created thereby.
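The three methods can be exercised on a private registry, as in the sketch below. X-Expires and X-Unregistered are made-up header names, and a separate HeaderRegistry instance is used so that the default policy is left alone.

```python
from email.headerregistry import (
    BaseHeader, DateHeader, HeaderRegistry, UnstructuredHeader)

registry = HeaderRegistry()

# map_to_type: the name is stored lower-cased, so lookups ignore case.
registry.map_to_type('X-Expires', DateHeader)

# __getitem__: builds a class from the specialized class plus base_class.
expires_cls = registry['x-expires']
bases = expires_cls.__bases__         # (DateHeader, BaseHeader)

# __call__: builds the class and instantiates it in one step.
expires = registry('X-Expires', 'Fri, 15 Jul 2011 21:00:00 +0000')
other = registry('X-Unregistered', 'free text')   # falls back to default_class

print(bases, expires.datetime, isinstance(other, UnstructuredHeader))
```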

The following classes are the classes used to represent data parsed from structured headers and can, in general, be used by an application program to construct structured values to assign to specific headers.

class email.headerregistry.Address(display_name='', username='', domain='', addr_spec=None)

The class used to represent an email address. The general form of an address is:

[display_name] <username@domain>

or:

username@domain

where each part must conform to specific syntax rules spelled out in RFC 5322.

As a convenience addr_spec can be specified instead of username and domain, in which case username and domain will be parsed from the addr_spec. An addr_spec must be a properly RFC quoted string; if it is not, Address will raise an error. Unicode characters are allowed and will be properly encoded when serialized. However, per the RFCs, unicode is not allowed in the username portion of the address.

display_name

The display name portion of the address, if any, with all quoting removed. If the address does not have a display name, this attribute will be an empty string.

username

The username portion of the address, with all quoting removed.

domain

The domain portion of the address.

addr_spec

The username@domain portion of the address, correctly quoted for use as a bare address (the second form shown above). This attribute is not mutable.

__str__()

The str value of the object is the address quoted according to RFC 5322 rules, but with no Content Transfer Encoding of any non-ASCII characters.

To support SMTP (RFC 5321), Address handles one special case: if username and domain are both the empty string (or None), then the string value of the Address is <>.
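A short sketch of the two ways to construct an Address and of the empty-address special case, using invented names and example.com addresses:

```python
from email.headerregistry import Address

# Built from separate parts.
fred = Address('Fred Flintstone', 'fred', 'example.com')
print(str(fred))                      # Fred Flintstone <fred@example.com>
print(fred.addr_spec)                 # fred@example.com

# Built from an addr_spec; username and domain are parsed out of it.
wilma = Address(addr_spec='wilma@example.com')
print(wilma.username, wilma.domain)   # wilma example.com

# Empty username and domain give the SMTP null address.
null_address = Address()
print(str(null_address))              # <>
```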

class email.headerregistry.Group(display_name=None, addresses=None)

The class used to represent an address group. The general form of an address group is:

display_name: [address-list];

As a convenience for processing lists of addresses that consist of a mixture of groups and single addresses, a Group may also be used to represent single addresses that are not part of a group by setting display_name to None and providing a list of the single address as addresses.

display_name

The display_name of the group. If it is None and there is exactly one Address in addresses, then the Group represents a single address that is not in a group.

addresses

A possibly empty tuple of Address objects representing the addresses in the group.

__str__()

The str value of a Group is formatted according to RFC 5322, but with no Content Transfer Encoding of any non-ASCII characters. If display_name is None and there is a single Address in the addresses list, the str value will be the same as the str of that single Address.
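The string forms of a named group, a single-address Group, and an empty group can be compared in a few lines. The group names and addresses are invented.

```python
from email.headerregistry import Address, Group

members = [Address(addr_spec='a@example.com'), Address(addr_spec='b@example.com')]

team = Group('team', members)
print(str(team))                      # team: a@example.com, b@example.com;

# display_name None plus exactly one address stands for a plain address.
single = Group(addresses=[members[0]])
print(str(single))                    # a@example.com

# A group may have no members at all.
empty = Group('undisclosed')
print(str(empty))                     # undisclosed:;
```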

Footnotes

[1]

Originally added in 3.3 as a provisional module.
