PT-2026-28763 · Pypi · Justhtml

Published

2026-03-18

·

Updated

2026-03-18

CVSS v4.0

5.3

Medium

Vector: AV:N/AC:L/AT:N/PR:N/UI:P/VC:N/VI:N/VA:N/SC:L/SI:L/SA:N

Summary

to_markdown() does not sufficiently escape text content that looks like HTML. As a result, untrusted input that is safe in to_html() can become raw HTML in Markdown output.
This is not specific to tokenizer raw-text states like <title>, <noscript>, or <plaintext>, although those states can trigger the behavior. The root cause is broader: Markdown text serialization leaves angle brackets unescaped in text nodes.

Details

When converting a parsed document to Markdown, text nodes are escaped for a small set of Markdown metacharacters, but HTML-significant characters such as < and > are preserved. That means content parsed as text, including entity-decoded text or text produced by RCDATA/RAWTEXT-style parsing, can be emitted into Markdown as raw HTML.
Examples of affected input include:
  • Text produced from entity-decoded input such as &lt;script&gt;...&lt;/script&gt;
  • Text inside elements like <title>, <textarea>, <noscript> (when parsed as raw text), and <plaintext>
This is distinct from actual <script> or <style> elements in the DOM. Those are already dropped by default in to_markdown() unless html_passthrough=True.
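The gap described above can be illustrated with a minimal sketch. The function name escape_markdown_text and the set of Markdown metacharacters below are hypothetical and are not taken from justhtml's source. The sketch only shows the general idea of escaping HTML-significant characters alongside Markdown metacharacters when serializing a text node, and it is not presented as the library's actual fix.

```python
import re

# Hypothetical set of Markdown metacharacters; not justhtml's actual list.
_MD_META = re.compile(r"([\\`*_\[\]])")


def escape_markdown_text(text):
    """Escape a text node for Markdown output (illustrative sketch only)."""
    # Backslash-escape the Markdown metacharacters first.
    escaped = _MD_META.sub(r"\\\1", text)
    # Then neutralise HTML-significant characters so the text cannot be
    # picked up as raw HTML by a Markdown renderer. "&" goes first so the
    # entities produced for "<" and ">" are not escaped a second time.
    return (
        escaped.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
    )


print(escape_markdown_text("<img src=x onerror=alert(1)> *bold*"))
# &lt;img src=x onerror=alert(1)&gt; \*bold\*
```

Entity escaping is used here because Markdown renderers that follow CommonMark decode entity references in ordinary text, so the reader still sees the original characters. Backslash-escaping the angle brackets would be another option.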

Proof of Concept

General case

python
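from justhtml import JustHTML

# Entity-encoded markup: the parser decodes it into a plain text node.
doc = JustHTML("<p>&lt;img src=x onerror=alert(1)&gt;</p>", fragment=True)

# The text node is safe in the HTML serialization.
print(doc.to_html())
print()
# The Markdown serialization leaves the angle brackets unescaped.
print(doc.to_markdown())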
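The advisory does not reproduce the output of this script. As a rough way to compare the two serializations, the sketch below uses only the standard library's html.parser to list the start tags an HTML consumer would see in a string. The two sample strings are assumptions standing in for the to_html() and to_markdown() results. They are not captured output from justhtml, and TagCollector and start_tags are hypothetical helper names.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect the names of start tags an HTML consumer would see."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def start_tags(markup):
    """Return the start tags found in markup, in document order."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.tags


# Assumed stand-ins for the two serializations; not captured library output.
escaped_html = "<p>&lt;img src=x onerror=alert(1)&gt;</p>"
unescaped_markdown = "<img src=x onerror=alert(1)>"

print(start_tags(escaped_html))        # ['p']
print(start_tags(unescaped_markdown))  # ['img']
```

In the escaped string the payload stays text, so only the p element is seen. In the unescaped string the same characters form a real img element, which is what a Markdown renderer that passes raw HTML through would hand to the browser.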

Fix

XSS


Weakness Enumeration

Related Identifiers

GHSA-3RCM-VJRC-P45J

Affected Products

Justhtml