Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-121650 : detect newlines in headers #121812

Conversation

basbloemsaat
Copy link
Contributor

@basbloemsaat basbloemsaat commented Jul 15, 2024

@encukou as promised, the fix for this issue.

This PR fixes both issues addressed in the issue, both the newlines at the end as well as the encoded newlines.

As this has security implication, it may need to be backported as well.

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't able to confirm that this PR fixes #121650. The original reproducer still contains the embedded newline:

from email import message_from_string
from email.policy import default

email_in = """\
To: [email protected]
From: External Sender <[email protected]>
Subject: Here's an =?UTF-8?Q?embedded_newline=0A?=
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0

<html>
<head><title>An embeded newline</title></head>
<body>
  <p>I sent you an embedded newline in the subject. How do you like that?!</p>
</body>
</html>
"""

msg = message_from_string(email_in, policy=default)
msg = message_from_string(email_in, policy=default)
for header, value in msg.items():
    del msg[header]
    msg[header] = value
email_out = str(msg)
print(email_out)

@basbloemsaat
Copy link
Contributor Author

basbloemsaat commented Jul 16, 2024

I wasn't able to confirm that this PR fixes #121650. The original reproducer still contains the embedded newline:
...

I missed headers that were parsed from a message, as in original issue. I updated the fix and the tests.

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed that this fixes #121650. I'm pretty sure this is a security fix (as you could previously inject email headers using this method), so this should need a backport all the way to 3.8

@encukou
Copy link
Member

encukou commented Jul 16, 2024

@warsaw, @bitdancer, @maxking: as the email experts, do you have any comments?

encukou
encukou previously approved these changes Jul 16, 2024
Copy link
Member

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix looks good to me! Thank you for digging into it!

@encukou
Copy link
Member

encukou commented Jul 19, 2024

I take back the review. There's more to this, unfortunately :(

Here's another reproducer:

from email import message_from_string
from email.policy import default

email_in = """\
To: [email protected]
From: External Sender <[email protected]>  =?UTF-8?Q?embedded_newline=0A?=Smuggled-Data: Bad
Subject: foo <bar> Here's an
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0

<html>
<head><title>An embeded newline</title></head>
<body>
  <p>I sent you an embedded newline in the subject. How do you like that?!</p>
</body>
</html>
"""

msg = message_from_string(email_in, policy=default)
print(msg)
for header, value in msg.items():
    del msg[header]
    msg[header] = value
email_out = str(msg)
print(email_out)

@basbloemsaat
Copy link
Contributor Author

I take back the review. There's more to this, unfortunately :(

I'll look into this...

@basbloemsaat basbloemsaat requested a review from encukou July 19, 2024 22:07
@basbloemsaat
Copy link
Contributor Author

basbloemsaat commented Jul 19, 2024

@encukou I tried all header types. This eliminates all newlines. Two notes:

  • date headers (and derivatives) parse the date, and the offending code is eliminated that way
  • the MIME-Version header doesn't decode the encoded newline, so it doesn't break the message. Improving the parsing of that header would mean rewriting the parsing code to do so, but I think that goes beyond the scope of this ticket.

@encukou
Copy link
Member

encukou commented Jul 24, 2024

After reading up on the email module, I propose to fix the issue in a different part of the code: see #122233.

@encukou
Copy link
Member

encukou commented Aug 1, 2024

Closing in favour of #122233.
Thank you for the work here, @basbloemsaat! And sorry that I “stole” the issue.

@encukou encukou closed this Aug 1, 2024
@basbloemsaat basbloemsaat deleted the fix-issue-121650-detect-newlines-in-headers branch August 1, 2024 15:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants