PT-2024-35391 · Lxml+1

Jorianwoltjer · Published 2024-11-19 · Updated 2024-12-13 · CVE-2024-52595

CVSS v3.1: 7.7 (High)
Vector: AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:L/A:H
Name of the Vulnerable Software and Affected Versions

lxml_html_clean versions prior to 0.4.0
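A quick way to see whether an environment falls in the affected range is to compare the installed version against 0.4.0. The sketch below is only an illustration: the helper names (`parse_version`, `is_affected`, `installed_version`) are hypothetical, the distribution name `lxml_html_clean` is assumed to match how the package was installed, and the simplified parser ignores pre-release suffixes.

```python
from importlib import metadata

# First fixed release according to the advisory text above.
FIXED_VERSION = (0, 4, 0)


def parse_version(text):
    """Turn '0.3.1' into (0, 3, 1), padding short versions with zeros.

    Simplification: parsing stops at the first non-numeric character,
    so a suffix such as 'rc1' is ignored.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < len(FIXED_VERSION):
        parts.append(0)
    return tuple(parts)


def is_affected(version_text):
    """Return True if the version is older than the fixed release."""
    return parse_version(version_text) < FIXED_VERSION


def installed_version(dist_name="lxml_html_clean"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("lxml_html_clean is not installed")
    else:
        print(found, "affected" if is_affected(found) else "not affected")
```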
Description

The HTML parser in lxml does not properly handle context switching for special HTML tags such as <svg>, <math>, and <noscript>, so its view of a document can differ from how web browsers parse and interpret the same markup. Specifically, lxml_html_clean ignores content inside CSS comments, while a browser may interpret that content differently. Malicious scripts can use this mismatch to bypass the cleaning process, which can lead to Cross-Site Scripting (XSS) attacks against users of applications that rely on lxml_html_clean in its default configuration to sanitize untrusted HTML.
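The general idea of a context-switching mismatch can be illustrated without lxml. The sketch below uses Python's standard-library `html.parser` as a stand-in (an assumption made for illustration; it is not the lxml parser and the sample string is not the advisory's payload). Like other parsers that are not context-aware, it treats everything inside <style> as raw text, even when the <style> sits inside <svg>, where a browser would parse the nested markup as elements.

```python
from html.parser import HTMLParser


class TagRecorder(HTMLParser):
    """Records the start tags and text that the parser reports."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.text_chunks = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_data(self, data):
        self.text_chunks.append(data)


def tags_seen(markup):
    """Parse markup and return (list of start tags, concatenated text)."""
    recorder = TagRecorder()
    recorder.feed(markup)
    recorder.close()
    return recorder.start_tags, "".join(recorder.text_chunks)


# Illustrative sample only, not the payload from the advisory.
SAMPLE = "<svg><style><img src=x onerror=alert(1)></style></svg>"

tags, text = tags_seen(SAMPLE)
print(tags)  # the <img> is not reported as an element
print(text)  # it shows up as plain text inside <style> instead
```

A sanitizer built on such a view would see harmless style text, while a browser would create an element from the same bytes. That gap is the kind of divergence the description refers to.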
Recommendations

If you are running a version prior to 0.4.0, upgrade lxml_html_clean to 0.4.0 or later to address this issue. As a temporary mitigation, configure lxml_html_clean with the following settings:
  • Use remove_tags to specify tags to remove; their content is moved to their parents' tags.
  • Use kill_tags to specify tags to be removed completely, together with their content.
  • Use allow_tags to restrict the set of permissible tags, excluding context-switching tags like <svg>, <math>, and <noscript>.
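The mitigation settings listed above could be wired up roughly as follows. This is a minimal sketch, not an official configuration: the helper names are hypothetical, the tag set is taken from the examples in this advisory and may not be exhaustive, and `build_cleaner` assumes the `lxml_html_clean` package is installed and exposes `Cleaner` with the `kill_tags`, `allow_tags`, and `remove_unknown_tags` options.

```python
# Context-switching tags named in the advisory (possibly not exhaustive).
CONTEXT_SWITCHING_TAGS = frozenset({"svg", "math", "noscript"})


def kill_tags_settings():
    """Settings that drop the risky tags together with their content."""
    return {"kill_tags": CONTEXT_SWITCHING_TAGS}


def allow_tags_settings(permitted):
    """Allow-list settings with the context-switching tags filtered out.

    remove_unknown_tags is set to False explicitly because it is not
    meant to be combined with allow_tags.
    """
    safe = frozenset(permitted) - CONTEXT_SWITCHING_TAGS
    return {"allow_tags": safe, "remove_unknown_tags": False}


def build_cleaner(settings):
    """Create a Cleaner from a settings dict (requires lxml_html_clean)."""
    from lxml_html_clean import Cleaner  # imported lazily on purpose

    return Cleaner(**settings)
```

With the package installed, `build_cleaner(kill_tags_settings()).clean_html(untrusted_html)` would apply the kill-list variant. The allow-list variant is stricter, since anything not explicitly permitted is removed. Either one is a stopgap until the upgrade is in place.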

Exploit

Fix

Weakness Enumeration

XSS
Incomplete List of Disallowed Inputs

Related Identifiers

CVE-2024-52595
GHSA-5JFW-GQ64-Q45F
PYSEC-2024-160

Affected Products

Lxml
Lxml Html Clean