The modern email header parser (email._header_value_parser, used under EmailPolicy / email.policy.default) raises a bare IndexError on two malformed inputs instead of raising HeaderParseError or registering a defect. Both instances share one root cause: an empty token list or string is indexed (value[0] / res[-1]) without first checking that it is non-empty, so the exception bypasses the parser's own error-recovery contract.
Instance 1 — MIME parameter name ending with *
import email, email.policy
email.message_from_string("Content-Type: text/plain; name*\n\n",
policy=email.policy.default)['content-type'].params
get_parameter consumes the extended-parameter marker *, leaving value == '', then evaluates value[0] → IndexError. parse_mime_parameters only catches HeaderParseError, so the IndexError escapes. Also reachable via name*0* (sectioned) and a trailing x=1; name*.
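The three variants named above can be checked side by side with a small probe. This is only a sketch: `probe` is a hypothetical helper, not part of the email package. It reports the exception type name rather than asserting one, so it runs unchanged on both affected and fixed interpreters.

```python
import email
import email.policy


def probe(raw, header):
    """Return the name of the exception raised while parsing `header`, or None."""
    try:
        msg = email.message_from_string(raw, policy=email.policy.default)
        value = msg[header]
        # Touch .params when present so parameter parsing is fully exercised.
        getattr(value, "params", None)
    except Exception as exc:
        return type(exc).__name__
    return None


variants = [
    "Content-Type: text/plain; name*\n\n",       # bare extended marker
    "Content-Type: text/plain; name*0*\n\n",     # sectioned form
    "Content-Type: text/plain; x=1; name*\n\n",  # trailing after a valid parameter
]
results = {raw.strip(): probe(raw, "content-type") for raw in variants}
for raw, outcome in results.items():
    print(f"{raw!r}: {outcome}")
```

On an affected build each line is expected to report IndexError; on a build with the fix, None.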
Instance 2 — address display name that is only a comment
email.message_from_string("To: (c):\n\n", policy=email.policy.default)['to']
email.message_from_string("Cc: (x): a@b.com;\n\n", policy=email.policy.default)['cc']
DisplayName.display_name guards the empty-list case, but when the name is a single cfws token, res.pop(0) empties res and the subsequent res[-1] raises IndexError (this also propagates through the .value property). The second example is a syntactically valid RFC 5322 group whose display name is a comment.
Both are reachable from ordinary parsing of untrusted/malformed address and parameter headers under the modern policies.
Fix
Add minimal guards so each input degrades to the existing recovery path (a parse defect / empty display name), matching how the parser already treats other malformed input.
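The shape of such a guard can be illustrated with a deliberately simplified model. This is not the stdlib code and not the patch from the PR: tokens are modelled as plain `(token_type, text)` tuples, and both function names are hypothetical.

```python
def display_name_unguarded(tokens):
    """Model of the failing pattern: only the initially-empty case is guarded."""
    res = list(tokens)
    if len(res) == 0:
        return ""
    if res[0][0] == "cfws":
        res.pop(0)              # a lone comment leaves res empty here
    if res[-1][0] == "cfws":    # IndexError when res is now empty
        res.pop()
    return "".join(text for _, text in res)


def display_name_guarded(tokens):
    """Same logic, re-checking non-emptiness before each index."""
    res = list(tokens)
    if res and res[0][0] == "cfws":
        res.pop(0)
    if res and res[-1][0] == "cfws":
        res.pop()
    return "".join(text for _, text in res)


comment_only = [("cfws", "(c)")]
try:
    display_name_unguarded(comment_only)
except IndexError:
    print("unguarded: IndexError")
print("guarded:", repr(display_name_guarded(comment_only)))
```

In this model the guarded version degrades to an empty display name, which is the recovery behaviour described above.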
After the fix, 100k+ randomized message_from_string parses over malformed address and parameter headers produce no escaping non-HeaderParseError exception. (The empty-string IndexError still produced by directly calling the low-level get_* token parsers is their documented non-empty precondition and is not reachable from public header parsing — every caller checks non-empty first.)
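A randomized check of this kind might look like the sketch below. It is not the harness used for the numbers above: the alphabet, header names, length limit, and the `fuzz` function are all assumptions chosen for illustration.

```python
import email
import email.policy
import random

# Hypothetical alphabet, biased toward characters that are special in
# address and parameter headers. Newlines are excluded on purpose.
ALPHABET = list("ab1 *=;:,<>()@.\"'\\")
HEADERS = ["To", "Cc", "Content-Type"]


def fuzz(iterations, seed=0, max_len=12):
    """Parse random header values; return (raw, exception name) for each escape."""
    rng = random.Random(seed)
    failures = []
    for _ in range(iterations):
        name = rng.choice(HEADERS)
        length = rng.randint(0, max_len)
        body = "".join(rng.choice(ALPHABET) for _ in range(length))
        raw = f"{name}: {body}\n\n"
        try:
            msg = email.message_from_string(raw, policy=email.policy.default)
            value = msg[name]
            # Force parameter parsing where the header type supports it.
            getattr(value, "params", None)
        except Exception as exc:
            failures.append((raw, type(exc).__name__))
    return failures


if __name__ == "__main__":
    for raw, exc_name in fuzz(10_000):
        print(f"{exc_name}: {raw!r}")
```

Any entry in the returned list is an exception that escaped public header parsing; an empty list corresponds to the post-fix result described above.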
Affected versions
Both instances reproduce on main and on the maintained 3.13 / 3.14 / 3.15 bugfix branches; the affected code (get_parameter and DisplayName.display_name) has been present since well before 3.9.
Linked PR
A PR fixing both instances follows.