18.1.5. email.header: Internationalized headers
The base email standard, RFC 2822, derives from the older <span class="target" id="index-1"></span><a class="rfc reference external" href="http://tools.ietf.org/html/rfc822.html"><strong>RFC 822</strong></a> standard, which came into widespread use at a time when most email was composed of ASCII characters only. <span class="target" id="index-2"></span><a class="rfc reference external" href="http://tools.ietf.org/html/rfc2822.html"><strong>RFC 2822</strong></a> is a specification written assuming email contains only 7-bit ASCII characters.</p> <p>Of course, as email has been deployed worldwide, it has become internationalized, such that language-specific character sets can now be used in email messages. The base standard still requires email messages to be transferred using only 7-bit ASCII characters, so a slew of RFCs have been written describing how to encode email containing non-ASCII characters into <span class="target" id="index-3"></span><a class="rfc reference external" href="http://tools.ietf.org/html/rfc2822.html"><strong>RFC 2822</strong></a>-compliant format. These RFCs include <span class="target" id="index-4"></span><a class="rfc reference external" href="http://tools.ietf.org/html/rfc2045.html"><strong>RFC 2045</strong></a>, <span class="target" id="index-5"></span><a class="rfc reference external" href="http://tools.ietf.org/html/rfc2046.html"><strong>RFC 2046</strong></a>, <span class="target" id="index-6"></span><a class="rfc reference external" href="http://tools.ietf.org/html/rfc2047.html"><strong>RFC 2047</strong></a>, and <span class="target" id="index-7"></span><a class="rfc reference external" href="http://tools.ietf.org/html/rfc2231.html"><strong>RFC 2231</strong></a>. 
The <a class="reference internal" href="email.html#module-email" title="email: Package supporting the parsing, manipulating, and generating email messages, including MIME documents."><tt class="xref py py-mod docutils literal"><span class="pre">email</span></tt></a> package supports these standards in its <a class="reference internal" href="#module-email.header" title="email.header: Representing non-ASCII headers"><tt class="xref py py-mod docutils literal"><span class="pre">email.header</span></tt></a> and <a class="reference internal" href="email.charset.html#module-email.charset" title="email.charset: Character Sets"><tt class="xref py py-mod docutils literal"><span class="pre">email.charset</span></tt></a> modules.</p> <p>If you want to include non-ASCII characters in your email headers, say in the <em class="mailheader">Subject</em> or <em class="mailheader">To</em> fields, you should use the <a class="reference internal" href="#email.header.Header" title="email.header.Header"><tt class="xref py py-class docutils literal"><span class="pre">Header</span></tt></a> class and assign the field in the <a class="reference internal" href="email.message.html#email.message.Message" title="email.message.Message"><tt class="xref py py-class docutils literal"><span class="pre">Message</span></tt></a> object to an instance of <a class="reference internal" href="#email.header.Header" title="email.header.Header"><tt class="xref py py-class docutils literal"><span class="pre">Header</span></tt></a> instead of using a string for the header value. Import the <a class="reference internal" href="#email.header.Header" title="email.header.Header"><tt class="xref py py-class docutils literal"><span class="pre">Header</span></tt></a> class from the <a class="reference internal" href="#module-email.header" title="email.header: Representing non-ASCII headers"><tt class="xref py py-mod docutils literal"><span class="pre">email.header</span></tt></a> module. 
For example:</p> <div class="highlight-python"><div class="highlight"><pre><span class="gp">>>> </span><span class="kn">from</span> <span class="nn">email.message</span> <span class="kn">import</span> <span class="n">Message</span>
<span class="gp">>>> </span><span class="kn">from</span> <span class="nn">email.header</span> <span class="kn">import</span> <span class="n">Header</span>
<span class="gp">>>> </span><span class="n">msg</span> <span class="o">=</span> <span class="n">Message</span><span class="p">()</span>
<span class="gp">>>> </span><span class="n">h</span> <span class="o">=</span> <span class="n">Header</span><span class="p">(</span><span class="s">'p</span><span class="se">\xf6</span><span class="s">stal'</span><span class="p">,</span> <span class="s">'iso-8859-1'</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">msg</span><span class="p">[</span><span class="s">'Subject'</span><span class="p">]</span> <span class="o">=</span> <span class="n">h</span>
<span class="gp">>>> </span><span class="k">print</span> <span class="n">msg</span><span class="o">.</span><span class="n">as_string</span><span class="p">()</span>
<span class="go">Subject: =?iso-8859-1?q?p=F6stal?=</span>
</pre></div> </div> <p>Here we wanted the <em class="mailheader">Subject</em> field to contain a non-ASCII character. We did this by creating a <a class="reference internal" href="#email.header.Header" title="email.header.Header"><tt class="xref py py-class docutils literal"><span class="pre">Header</span></tt></a> instance and passing in the character set that the byte string was encoded in. 
When the subsequent <a class="reference internal" href="email.message.html#email.message.Message" title="email.message.Message"><tt class="xref py py-class docutils literal"><span class="pre">Message</span></tt></a> instance was flattened, the <em class="mailheader">Subject</em> field was properly <span class="target" id="index-8"></span><a class="rfc reference external" href="http://tools.ietf.org/html/rfc2047.html"><strong>RFC 2047</strong></a> encoded. MIME-aware mail readers would show this header using the embedded ISO-8859-1 character.</p> <p class="versionadded"> <span class="versionmodified">New in version 2.2.2.</span></p> <p>Here is the <a class="reference internal" href="#email.header.Header" title="email.header.Header"><tt class="xref py py-class docutils literal"><span class="pre">Header</span></tt></a> class description:</p> <dl class="class"> <dt id="email.header.Header"> <em class="property">class </em><tt class="descclassname">email.header.</tt><tt class="descname">Header</tt><big>(</big><span class="optional">[</span><em>s</em><span class="optional">[</span>, <em>charset</em><span class="optional">[</span>, <em>maxlinelen</em><span class="optional">[</span>, <em>header_name</em><span class="optional">[</span>, <em>continuation_ws</em><span class="optional">[</span>, <em>errors</em><span class="optional">]</span><span class="optional">]</span><span class="optional">]</span><span class="optional">]</span><span class="optional">]</span><span class="optional">]</span><big>)</big><a class="headerlink" href="#email.header.Header" title="Permalink to this definition">¶</a></dt> <dd><p>Create a MIME-compliant header that can contain strings in different character sets.</p> <p>Optional <em>s</em> is the initial header value. If <tt class="docutils literal"><span class="pre">None</span></tt> (the default), the initial header value is not set. 
You can later append to the header with <a class="reference internal" href="#email.header.Header.append" title="email.header.Header.append"><tt class="xref py py-meth docutils literal"><span class="pre">append()</span></tt></a> method calls. <em>s</em> may be a byte string or a Unicode string, but see the <a class="reference internal" href="#email.header.Header.append" title="email.header.Header.append"><tt class="xref py py-meth docutils literal"><span class="pre">append()</span></tt></a> documentation for semantics.</p> <p>Optional <em>charset</em> serves two purposes: it has the same meaning as the <em>charset</em> argument to the <a class="reference internal" href="#email.header.Header.append" title="email.header.Header.append"><tt class="xref py py-meth docutils literal"><span class="pre">append()</span></tt></a> method, and it sets the default character set for all subsequent <a class="reference internal" href="#email.header.Header.append" title="email.header.Header.append"><tt class="xref py py-meth docutils literal"><span class="pre">append()</span></tt></a> calls that omit the <em>charset</em> argument. If <em>charset</em> is not provided in the constructor (the default), the <tt class="docutils literal"><span class="pre">us-ascii</span></tt> character set is used both as <em>s</em>’s initial charset and as the default for subsequent <a class="reference internal" href="#email.header.Header.append" title="email.header.Header.append"><tt class="xref py py-meth docutils literal"><span class="pre">append()</span></tt></a> calls.</p> <p>The maximum line length can be specified explicitly via <em>maxlinelen</em>. To split the first line to a shorter length (to account for the field name, e.g. <em class="mailheader">Subject</em>, which isn’t included in <em>s</em>), pass the name of the field in <em>header_name</em>. 
The default <em>maxlinelen</em> is 76, and the default value for <em>header_name</em> is <tt class="docutils literal"><span class="pre">None</span></tt>, meaning it is not taken into account for the first line of a long, split header.</p> <p>Optional <em>continuation_ws</em> must be <span class="target" id="index-9"></span><a class="rfc reference external" href="http://tools.ietf.org/html/rfc2822.html"><strong>RFC 2822</strong></a>-compliant folding whitespace, and is usually either a space or a hard tab character. This character will be prepended to continuation lines. <em>continuation_ws</em> defaults to a single space character (" ").</p> <p>Optional <em>errors</em> is passed straight through to the <a class="reference internal" href="#email.header.Header.append" title="email.header.Header.append"><tt class="xref py py-meth docutils literal"><span class="pre">append()</span></tt></a> method.</p> <dl class="method"> <dt id="email.header.Header.append"> <tt class="descname">append</tt><big>(</big><em>s</em><span class="optional">[</span>, <em>charset</em><span class="optional">[</span>, <em>errors</em><span class="optional">]</span><span class="optional">]</span><big>)</big><a class="headerlink" href="#email.header.Header.append" title="Permalink to this definition">¶</a></dt> <dd><p>Append the string <em>s</em> to the MIME header.</p> <p>Optional <em>charset</em>, if given, should be a <a class="reference internal" href="email.charset.html#email.charset.Charset" title="email.charset.Charset"><tt class="xref py py-class docutils literal"><span class="pre">Charset</span></tt></a> instance (see <a class="reference internal" href="email.charset.html#module-email.charset" title="email.charset: Character Sets"><tt class="xref py py-mod docutils literal"><span class="pre">email.charset</span></tt></a>) or the name of a character set, which will be converted to a <a class="reference internal" href="email.charset.html#email.charset.Charset" 
title="email.charset.Charset"><tt class="xref py py-class docutils literal"><span class="pre">Charset</span></tt></a> instance. A value of <tt class="docutils literal"><span class="pre">None</span></tt> (the default) means that the <em>charset</em> given in the constructor is used.</p> <p><em>s</em> may be a byte string or a Unicode string. If it is a byte string (i.e. <tt class="docutils literal"><span class="pre">isinstance(s,</span> <span class="pre">str)</span></tt> is true), then <em>charset</em> is the encoding of that byte string, and a <a class="reference internal" href="exceptions.html#exceptions.UnicodeError" title="exceptions.UnicodeError"><tt class="xref py py-exc docutils literal"><span class="pre">UnicodeError</span></tt></a> will be raised if the string cannot be decoded with that character set.</p> <p>If <em>s</em> is a Unicode string, then <em>charset</em> is a hint specifying the character set of the characters in the string. In this case, when producing an <span class="target" id="index-10"></span><a class="rfc reference external" href="http://tools.ietf.org/html/rfc2822.html"><strong>RFC 2822</strong></a>-compliant header using <span class="target" id="index-11"></span><a class="rfc reference external" href="http://tools.ietf.org/html/rfc2047.html"><strong>RFC 2047</strong></a> rules, the Unicode string will be encoded using the following charsets in order: <tt class="docutils literal"><span class="pre">us-ascii</span></tt>, the <em>charset</em> hint, <tt class="docutils literal"><span class="pre">utf-8</span></tt>. 
The first character set that does not provoke a <a class="reference internal" href="exceptions.html#exceptions.UnicodeError" title="exceptions.UnicodeError"><tt class="xref py py-exc docutils literal"><span class="pre">UnicodeError</span></tt></a> is used.</p> <p>Optional <em>errors</em> is passed through to any <a class="reference internal" href="functions.html#unicode" title="unicode"><tt class="xref py py-func docutils literal"><span class="pre">unicode()</span></tt></a> or <tt class="xref py py-func docutils literal"><span class="pre">ustr.encode()</span></tt> call, and defaults to “strict”.</p> </dd></dl> <dl class="method"> <dt id="email.header.Header.encode"> <tt class="descname">encode</tt><big>(</big><span class="optional">[</span><em>splitchars</em><span class="optional">]</span><big>)</big><a class="headerlink" href="#email.header.Header.encode" title="Permalink to this definition">¶</a></dt> <dd><p>Encode a message header into an RFC-compliant format, possibly wrapping long lines and encapsulating non-ASCII parts in base64 or quoted-printable encodings. Optional <em>splitchars</em> is a string containing characters to split long ASCII lines on, in rough support of <span class="target" id="index-12"></span><a class="rfc reference external" href="http://tools.ietf.org/html/rfc2822.html"><strong>RFC 2822</strong></a>’s <em>highest level syntactic breaks</em>. 
This doesn’t affect <span class="target" id="index-13"></span><a class="rfc reference external" href="http://tools.ietf.org/html/rfc2047.html"><strong>RFC 2047</strong></a> encoded lines.</p> </dd></dl> <p>The <a class="reference internal" href="#email.header.Header" title="email.header.Header"><tt class="xref py py-class docutils literal"><span class="pre">Header</span></tt></a> class also provides a number of methods to support standard operators and built-in functions.</p> <dl class="method"> <dt id="email.header.Header.__str__"> <tt class="descname">__str__</tt><big>(</big><big>)</big><a class="headerlink" href="#email.header.Header.__str__" title="Permalink to this definition">¶</a></dt> <dd><p>A synonym for <a class="reference internal" href="#email.header.Header.encode" title="email.header.Header.encode"><tt class="xref py py-meth docutils literal"><span class="pre">Header.encode()</span></tt></a>. Useful for <tt class="docutils literal"><span class="pre">str(aHeader)</span></tt>.</p> </dd></dl> <dl class="method"> <dt id="email.header.Header.__unicode__"> <tt class="descname">__unicode__</tt><big>(</big><big>)</big><a class="headerlink" href="#email.header.Header.__unicode__" title="Permalink to this definition">¶</a></dt> <dd><p>A helper for the built-in <a class="reference internal" href="functions.html#unicode" title="unicode"><tt class="xref py py-func docutils literal"><span class="pre">unicode()</span></tt></a> function. 
Returns the header as a Unicode string.</p> </dd></dl> <dl class="method"> <dt id="email.header.Header.__eq__"> <tt class="descname">__eq__</tt><big>(</big><em>other</em><big>)</big><a class="headerlink" href="#email.header.Header.__eq__" title="Permalink to this definition">¶</a></dt> <dd><p>This method allows you to compare two <a class="reference internal" href="#email.header.Header" title="email.header.Header"><tt class="xref py py-class docutils literal"><span class="pre">Header</span></tt></a> instances for equality.</p> </dd></dl> <dl class="method"> <dt id="email.header.Header.__ne__"> <tt class="descname">__ne__</tt><big>(</big><em>other</em><big>)</big><a class="headerlink" href="#email.header.Header.__ne__" title="Permalink to this definition">¶</a></dt> <dd><p>This method allows you to compare two <a class="reference internal" href="#email.header.Header" title="email.header.Header"><tt class="xref py py-class docutils literal"><span class="pre">Header</span></tt></a> instances for inequality.</p> </dd></dl> </dd></dl> <p>The <a class="reference internal" href="#module-email.header" title="email.header: Representing non-ASCII headers"><tt class="xref py py-mod docutils literal"><span class="pre">email.header</span></tt></a> module also provides the following convenient functions.</p> <dl class="function"> <dt id="email.header.decode_header"> <tt class="descclassname">email.header.</tt><tt class="descname">decode_header</tt><big>(</big><em>header</em><big>)</big><a class="headerlink" href="#email.header.decode_header" title="Permalink to this definition">¶</a></dt> <dd><p>Decode a message header value without converting the character set. The header value is in <em>header</em>.</p> <p>This function returns a list of <tt class="docutils literal"><span class="pre">(decoded_string,</span> <span class="pre">charset)</span></tt> pairs containing each of the decoded parts of the header. 
<em>charset</em> is <tt class="docutils literal"><span class="pre">None</span></tt> for non-encoded parts of the header, otherwise a lower case string containing the name of the character set specified in the encoded string.</p> <p>Here’s an example:</p> <div class="highlight-python"><div class="highlight"><pre><span class="gp">>>> </span><span class="kn">from</span> <span class="nn">email.header</span> <span class="kn">import</span> <span class="n">decode_header</span>
<span class="gp">>>> </span><span class="n">decode_header</span><span class="p">(</span><span class="s">'=?iso-8859-1?q?p=F6stal?='</span><span class="p">)</span>
<span class="go">[('p\xf6stal', 'iso-8859-1')]</span>
</pre></div> </div> </dd></dl> <dl class="function"> <dt id="email.header.make_header"> <tt class="descclassname">email.header.</tt><tt class="descname">make_header</tt><big>(</big><em>decoded_seq</em><span class="optional">[</span>, <em>maxlinelen</em><span class="optional">[</span>, <em>header_name</em><span class="optional">[</span>, <em>continuation_ws</em><span class="optional">]</span><span class="optional">]</span><span class="optional">]</span><big>)</big><a class="headerlink" href="#email.header.make_header" title="Permalink to this definition">¶</a></dt> <dd><p>Create a <a class="reference internal" href="#email.header.Header" title="email.header.Header"><tt class="xref py py-class docutils literal"><span class="pre">Header</span></tt></a> instance from a sequence of pairs as returned by <a class="reference internal" href="#email.header.decode_header" title="email.header.decode_header"><tt class="xref py py-func docutils literal"><span class="pre">decode_header()</span></tt></a>.</p> <p><a class="reference internal" href="#email.header.decode_header" title="email.header.decode_header"><tt class="xref py py-func docutils literal"><span class="pre">decode_header()</span></tt></a> takes a header value string and returns a sequence of pairs of the format <tt class="docutils 
literal"><span class="pre">(decoded_string,</span> <span class="pre">charset)</span></tt> where <em>charset</em> is the name of the character set.</p> <p>This function takes one of those sequences of pairs and returns a <a class="reference internal" href="#email.header.Header" title="email.header.Header"><tt class="xref py py-class docutils literal"><span class="pre">Header</span></tt></a> instance. 