PT-2026-50489 · PyPI · vLLM
Published: 2026-06-17 · Updated: 2026-06-17
CVE-2026-54233
CVSS v3.1: 6.5 (Medium)
Vector: AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
Summary
vLLM's /v1/audio/transcriptions endpoint limits the compressed upload size but not the size of the decoded PCM output. A 25MB OPUS file can drive peak memory to ~14.9GB while it is decoded to float32 PCM. Tested on vLLM v0.19.0.
Details
SpeechToTextProcessor rejects uploads over VLLM_MAX_AUDIO_CLIP_FILESIZE_MB (default 25MB) based on compressed byte length, but the audio decoder in audio.py accumulates all decoded frames in memory, with no size limit, before returning:

    # speech_to_text.py L184-189
    if len(audio_data) / 1024 ** 2 > self.max_audio_filesize_mb:
        raise VLLMValidationError(...)
    y, sr = load_audio(buf, sr=self.asr_config.sample_rate)  # decoded size unchecked

    # audio.py L77-107
    chunks: list[npt.NDArray] = []
    for frame in container.decode(stream):
        chunks.append(frame.to_ndarray())
    audio = np.concatenate(chunks, axis=-1).astype(np.float32)  # single contiguous allocation

A 25MB OPUS file at 6kbps encodes ~8.7 hours of audio. Decoding produces ~5.7GB of float32 PCM (232x amplification), and np.concatenate then allocates a second contiguous array, bringing peak RSS to ~14.9GB from a single request. SpeechToTextConfig.max_audio_clip_s (default 30s) applies only after the full decode and does not prevent the allocation.
Impact
An unauthenticated attacker can exhaust server memory with a small number of concurrent requests, each a valid upload within the documented size limit. Severity was assessed with reference to prior OOM vulnerability reports in vLLM.
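To make "a small number of concurrent requests" concrete, the following is a rough back-of-the-envelope sketch of the arithmetic. It assumes the decoder emits 48 kHz mono float32 samples and treats the advisory's GB figures as GiB; both are assumptions, as is the 64 GiB host, and the helper names are hypothetical. Only the 25MB limit, the ~8.7 hour duration, and the ~14.9GB peak come from the advisory.

```python
def decoded_pcm_bytes(duration_s: float, sample_rate_hz: int = 48_000,
                      channels: int = 1, bytes_per_sample: int = 4) -> int:
    """Size of the decoded float32 PCM for a clip of the given duration."""
    return int(duration_s * sample_rate_hz) * channels * bytes_per_sample


def requests_to_exceed(host_ram_bytes: int, peak_bytes_per_request: int) -> int:
    """Smallest number of concurrent requests whose combined peak exceeds host RAM."""
    return host_ram_bytes // peak_bytes_per_request + 1


GIB = 1024 ** 3
upload_bytes = 25 * 1024 ** 2            # upload limit from the advisory
duration_s = 8.7 * 3600                  # ~8.7 hours, figure from the advisory

pcm = decoded_pcm_bytes(duration_s)      # assumption: 48 kHz mono output
amplification = pcm / upload_bytes
peak = int(14.9 * GIB)                   # per-request peak RSS from the advisory
n = requests_to_exceed(64 * GIB, peak)   # 64 GiB host is a hypothetical example

print(f"decoded PCM: {pcm / GIB:.1f} GiB, "
      f"amplification: {amplification:.0f}x, "
      f"requests to exceed 64 GiB: {n}")
```

Under these assumptions the estimate lands close to the decoded size and amplification quoted in Details, and a handful of parallel uploads is enough to exceed the memory of the example host.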
Fix
A fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/44970
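The linked pull request is the authoritative fix and is not reproduced here. As a general illustration of this class of mitigation, the sketch below enforces a budget on decoded bytes while frames are still being accumulated, rather than after the full decode. All names (collect_bounded, DecodedAudioTooLarge, fake_decoder, max_decoded_bytes) are hypothetical and are not part of vLLM.

```python
from typing import Iterable, Iterator, List


class DecodedAudioTooLarge(ValueError):
    """Raised when decoded audio would exceed the configured byte budget."""


def collect_bounded(chunks: Iterable[bytes], max_decoded_bytes: int) -> bytes:
    """Accumulate decoded chunks, aborting once the budget would be exceeded."""
    kept: List[bytes] = []
    total = 0
    for chunk in chunks:
        total += len(chunk)
        if total > max_decoded_bytes:
            # Reject before buffering the offending chunk, so buffered data
            # never exceeds max_decoded_bytes (the final join below makes
            # one additional copy of at most that size).
            raise DecodedAudioTooLarge(
                f"decoded audio exceeds {max_decoded_bytes} bytes"
            )
        kept.append(chunk)
    return b"".join(kept)


def fake_decoder(n_chunks: int, chunk_bytes: int) -> Iterator[bytes]:
    """Stand-in for a streaming decoder: lazily yields fixed-size PCM chunks."""
    for _ in range(n_chunks):
        yield bytes(chunk_bytes)


# A short clip fits within the budget and is returned whole.
small = collect_bounded(fake_decoder(4, 1024), max_decoded_bytes=8192)

# A very long clip is rejected after a few chunks instead of being fully decoded.
try:
    collect_bounded(fake_decoder(1_000_000, 1024), max_decoded_bytes=8192)
    rejected = False
except DecodedAudioTooLarge:
    rejected = True
```

With NumPy frames the same accounting could use each array's nbytes attribute in place of len(chunk). Because the decoder is consumed lazily, the oversized input is rejected after a few chunks instead of after hours of audio have been buffered.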
Affected products
vLLM