Unknown · html-sanitizer · CVE-2024-34078
**Name of the Vulnerable Software and Affected Versions**
html-sanitizer versions prior to 2.4.2
**Description**
html-sanitizer is an allowlist-based HTML cleaner. With `keep_typographic_whitespace=False` (the default), the sanitizer normalizes Unicode to NFKC form as its final step. Some Unicode characters normalize to chevrons (`<` and `>`), so specially crafted HTML can turn into markup after the allowlist has already been applied and thereby escape sanitization.
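The normalization behaviour behind the issue can be reproduced with the Python standard library alone. The following is a minimal sketch, not code from the library: the `LOOKALIKES` table, the `nfkc` helper and the sample payload are illustrative names chosen here. It shows that a string containing no ASCII chevrons can become ordinary markup once NFKC is applied.

```python
import unicodedata

# Characters whose NFKC compatibility mapping is an ASCII chevron.
# (Illustrative selection, not an exhaustive list.)
LOOKALIKES = {
    "\uff1c": "<",  # FULLWIDTH LESS-THAN SIGN
    "\uff1e": ">",  # FULLWIDTH GREATER-THAN SIGN
    "\ufe64": "<",  # SMALL LESS-THAN SIGN
    "\ufe65": ">",  # SMALL GREATER-THAN SIGN
}


def nfkc(text: str) -> str:
    """Return text normalized to Unicode NFKC form."""
    return unicodedata.normalize("NFKC", text)


# A hypothetical payload that contains no ASCII '<' or '>' at all.
payload = "\uff1cscript\uff1ealert(1)\uff1c/script\uff1e"

print("<" in payload)  # False: nothing here looks like a tag to an HTML parser
print(nfkc(payload))   # <script>alert(1)</script>
```

A cleaner that inspects the payload before normalization sees only text, which is why normalizing as the last step is the risky ordering.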
**Recommendations**
Update to version 2.4.2 or later, which resolves the issue.
As a temporary workaround, set `keep_typographic_whitespace=True` explicitly, or normalize the input to NFKC yourself before passing it to the sanitizer.
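The second workaround, normalizing before sanitizing, might look like the sketch below. The function name `sanitize_prenormalized` is hypothetical, and `html.escape` is used only as a stand-in for a real sanitizer so that the example runs without third-party packages. Because NFKC is idempotent, a later normalization pass cannot introduce new chevrons once the input has already been normalized.

```python
import html
import unicodedata
from typing import Callable


def sanitize_prenormalized(markup: str, sanitize: Callable[[str], str]) -> str:
    """Normalize to NFKC first, then hand the result to the sanitizer.

    Lookalike chevrons become real '<' and '>' before the sanitizer runs,
    so it sees (and can filter) the markup they would otherwise produce.
    """
    return sanitize(unicodedata.normalize("NFKC", markup))


# Stand-in sanitizer for demonstration only: escapes every chevron.
untrusted = "\uff1cscript\uff1ealert(1)\uff1c/script\uff1e"
print(sanitize_prenormalized(untrusted, html.escape))
# &lt;script&gt;alert(1)&lt;/script&gt;
```

With html-sanitizer itself, the `sanitize` argument would presumably be the `sanitize` method of a `Sanitizer` instance, optionally configured with `{"keep_typographic_whitespace": True}` to apply the first workaround as well. Treat that wiring as an assumption and check it against the library's documentation for the installed version.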