PT-2024-15407 · Langchain Ai · Langchain

Eyurtsev · Published 2024-02-24 · Updated 2025-02-25 · CVE-2024-0243

CVSS v3.1: 8.1 (High)
Vector: AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H
Name of the Vulnerable Software and Affected Versions

langchain versions prior to the release that includes the fix from https://github.com/langchain-ai/langchain/pull/15559
Description

An attacker who controls the contents of a website, such as https://example.com, can place a malicious HTML file there that links to an external site, for example https://example.completely.different/my_file.html. Even with prevent_outside=True set in the crawler configuration, RecursiveUrlLoader follows the link and downloads the file from the external site. The prevent_outside restriction is therefore not reliably enforced for links the loader finds in crawled HTML content.
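To make the failure mode concrete, the sketch below contrasts two ways a crawler might decide whether a link is "inside" the starting site. It is an illustration only: the helper names naive_is_inside and host_is_inside are hypothetical, and the assumption that the check behaves like a plain string-prefix comparison is ours, suggested by the example URLs above rather than taken from the loader's source code.

```python
from urllib.parse import urlparse

BASE_URL = "https://example.com"
EXTERNAL_LINK = "https://example.completely.different/my_file.html"


def naive_is_inside(base: str, link: str) -> bool:
    """Hypothetical flawed check: treats any link that starts with the
    base URL string as belonging to the same site."""
    return link.startswith(base)


def host_is_inside(base: str, link: str) -> bool:
    """Stricter check: compares the parsed hostnames instead of raw strings."""
    return urlparse(link).hostname == urlparse(base).hostname


if __name__ == "__main__":
    # "https://example.com" is a string prefix of the external link,
    # so the naive check accepts it while the hostname check rejects it.
    print("naive check accepts external link:", naive_is_inside(BASE_URL, EXTERNAL_LINK))
    print("host check accepts external link:", host_is_inside(BASE_URL, EXTERNAL_LINK))
```

The point of the comparison is that "same site" has to be decided on parsed URL components; any check that operates on the raw string can be satisfied by a look-alike hostname.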
Recommendations

Update langchain to a version that includes the fix from https://github.com/langchain-ai/langchain/pull/15559. If updating is not immediately possible, a temporary workaround is to restrict the url parameter of RecursiveUrlLoader to trusted domains. Additionally, be cautious when using the extractor parameter with lambda functions that parse HTML content, as this could lead to unintended downloads.
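The trusted-domain workaround could be approximated with a small allowlist check that runs before a URL is handed to the loader. This is a minimal sketch under stated assumptions: TRUSTED_HOSTS, is_trusted_url, and require_trusted are hypothetical names introduced here, the host list is a placeholder, and the code uses only the Python standard library rather than any langchain API.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the hosts you actually trust.
TRUSTED_HOSTS = frozenset({"example.com", "docs.example.com"})


def is_trusted_url(url: str, trusted_hosts=TRUSTED_HOSTS) -> bool:
    """Return True only for http(s) URLs whose hostname is on the allowlist."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    return host in trusted_hosts


def require_trusted(url: str) -> str:
    """Raise ValueError for untrusted URLs; otherwise return the URL unchanged."""
    if not is_trusted_url(url):
        raise ValueError(f"URL is not on the trusted-host allowlist: {url}")
    return url


if __name__ == "__main__":
    for candidate in (
        "https://example.com/index.html",
        "https://example.completely.different/my_file.html",
    ):
        print(candidate, "->", is_trusted_url(candidate))
```

A caller would pass the starting URL through require_trusted before constructing the loader. Note that this only vets URLs the caller supplies; it does not stop an unpatched loader from following external links it discovers while crawling, so it complements the upgrade rather than replacing it.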

Exploit

Fix

SSRF


Weakness Enumeration

Related Identifiers

CVE-2024-0243
GHSA-H9J7-5XVC-QHG5
PYSEC-2024-235

Affected Products

Langchain