PT-2025-19362 · Pypi · Vllm

Publicado

2025-04-15

·

Atualizado

2025-04-15

CVSS v3.1

6.5

Média

VetorAV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H

Impact

This report is to highlight a vulnerability in XGrammar, a library used by the structured output feature in vLLM. The XGrammar advisory is here: https://github.com/mlc-ai/xgrammar/security/advisories/GHSA-389x-67px-mjg3
The xgrammar library is the default backend used by vLLM to support structured output (a.k.a. guided decoding). Xgrammar provides a required, built-in cache for its compiled grammars stored in RAM. xgrammar is available by default through the OpenAI compatible API server with both the V0 and V1 engines.
A malicious user can send a stream of very short decoding requests with unique schemas, resulting in an addition to the cache for each request. This can result in a Denial of Service by consuming all of the system's RAM.
Note that even if vLLM was configured to use a different backend by default, it is still possible to choose xgrammar on a per-request basis using the guided decoding backend key of the extra body field of the request with the V0 engine. This per-request choice is not available when using the V1 engine.

Patches

Workarounds

There is no way to workaround this issue in existing versions of vLLM other than preventing untrusted access to the OpenAI compatible API server.

References

Correção

Allocation of Resources Without Limits

Encontrou algum problema na descrição? Tem algo a acrescentar? Fique à vontade para nos escrever 👾

Enumeração de Fraquezas

Identificadores relacionados

GHSA-HF3C-WXG2-49Q9

Produtos afetados

Vllm