PT-2026-51839 · Linux · Linux
Published
2026-06-24
·
Updated
2026-06-24
·
CVE-2026-52945
None
No severity ratings or metrics are available. When they become available, we will update the corresponding information on this page.
In the Linux kernel, the following vulnerability has been resolved:
Revert "wireguard: device: enable threaded NAPI"
This reverts commit 933466fc50a8e4eb167acbd0d8ec96a078462e9c which is
commit db9ae3b6b43c79b1ba87eea849fd65efa05b4b2e upstream.
We have received three independent production reports, all from
deployments running Cilium with WireGuard as the underlying encryption,
in which k8s Pod E/W (east-west) traffic to certain peer nodes fully
stalled. The situation appears as follows:
- Occurs very rarely but at random times under heavy networking load.
- Once the issue triggers, the decryption side stops working completely for that WireGuard peer, while other peers keep working fine. The stall also affects newly initiated connections towards that particular WireGuard peer.
- Only the decryption side is affected, never the encryption side.
- Once it triggers, it never recovers and remains in this state. CPU and memory usage on that node look normal: no leak, busy loop or crash.
- bpftrace on the affected system shows that wg_prev_queue_enqueue fails, which means the MAX_QUEUED_PACKETS limit (1024 skbs!) for the peer's rx queue has been reached.
- bpftrace also shows that wg_packet_rx_poll is never called again for that peer after reaching this state. For other peers wg_packet_rx_poll does get called normally.
- Commit db9ae3b ("wireguard: device: enable threaded NAPI") switched WireGuard to threaded NAPI by default. The default was not changed to trigger the issue, nor did CPU hotplugging occur (i.e. 5bd8de2 ("wireguard: queueing: always return valid online CPU in wg_cpumask_choose_online()")).
- The issue has been observed with stable kernels of v5.15 as well as v6.1. It was reported to us that v5.10 stable is working fine, and no report on v6.6 stable either (somewhat related discussion in [0] though).
- In the WireGuard driver the only material difference between v5.10 stable and v5.15 stable is the switch to threaded NAPI by default.
[0] https://lore.kernel.org/netdev/CA+wXwBTT74RErDGAnj98PqS=wvdh8eM1pi4q6tTdExtjnokKqA@mail.gmail.com/
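The queue-full symptom in the list above can be illustrated with a small user-space model. This is only a sketch and not kernel code: `PeerRxQueue` and `fill_without_consumer` are hypothetical names, and the only detail taken from the report is the 1024-packet limit that wg_prev_queue_enqueue enforces. It shows that once nothing drains the peer's queue any more, every further enqueue is refused.

```python
from collections import deque

MAX_QUEUED_PACKETS = 1024  # per-peer limit quoted in the report


class PeerRxQueue:
    """Toy stand-in for a per-peer rx queue with a packet limit (hypothetical)."""

    def __init__(self, limit=MAX_QUEUED_PACKETS):
        self.limit = limit
        self.packets = deque()

    def enqueue(self, packet):
        # Refuse the packet once the limit is reached, mirroring the
        # failure the report observes for wg_prev_queue_enqueue.
        if len(self.packets) >= self.limit:
            return False
        self.packets.append(packet)
        return True

    def dequeue(self):
        # Remove and return the oldest packet, or None if the queue is empty.
        return self.packets.popleft() if self.packets else None


def fill_without_consumer(queue, attempts):
    """Try to enqueue `attempts` packets while no poller drains the queue."""
    accepted = sum(1 for seq in range(attempts) if queue.enqueue(seq))
    return accepted, attempts - accepted
```

With the default limit, `fill_without_consumer(PeerRxQueue(), 1500)` accepts the first 1024 packets and refuses the rest. A single `dequeue()` would make room for one more packet, which is why a poller that never runs again leaves the peer permanently stalled in this model.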
Breakdown of the problem:
- skbs arriving for decryption are enqueued to the peer->rx_queue in wg_packet_consume_data via wg_queue_enqueue_per_device_and_peer.
- The latter only moves the skb into the MPSC peer queue if it does not surpass MAX_QUEUED_PACKETS (1024), which is tracked in an atomic counter via wg_prev_queue_enqueue.
- If enqueueing was successful, the skb is also queued up in the device queue, round-robin picks the next online CPU, and the decryption worker is scheduled.
- The wg_packet_decrypt_worker, once scheduled, picks these up from the queue, decrypts the packets and once done calls into wg_queue_enqueue_per_peer_rx.
- The latter updates the state to PACKET_STATE_CRYPTED on success and calls napi_schedule on the per peer->napi instance.
- NAPI then polls via wg_packet_rx_poll. wg_prev_queue_peek checks on the peer->rx_queue. It will call wg_prev_queue_dequeue if the queue->peeked skb was not cached yet, or just return the latter otherwise. (wg_prev_queue_drop_peeked later clears the cache.)
- From an ordering perspective, the peer->rx_queue has skbs in order, while the device queue with the per-CPU worker threads can, from a global ordering PoV, finish the decryption and signal the skb PACKET_STATE_CRYPTED out of order.
- A situation can be observed that the first packet coming in will be stuck waiting for the decryption worker to be scheduled for a longer time when the system is under pressure.
- While this is the case, the other CPUs in the meantime finish decryption and call into napi schedule.
- Now in wg_packet_rx_poll it picks up the first in-order skb from the peer->rx_queue and sees that its state is still PACKET_STATE_UNCRYPTED. The NAPI poll routine then exits e ---truncated---
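The ordering described in the breakdown above (in-order per-peer queue, out-of-order completion by the per-CPU workers) can be sketched as a user-space toy model. It is not the driver's code: `Packet` and `rx_poll` are hypothetical names, the states are plain strings standing in for PACKET_STATE_UNCRYPTED and PACKET_STATE_CRYPTED, and the model deliberately covers only the steps stated before the truncation. It does not attempt to model threaded NAPI scheduling or whatever the truncated text goes on to describe.

```python
from collections import deque

UNCRYPTED = "UNCRYPTED"  # stand-in for PACKET_STATE_UNCRYPTED
CRYPTED = "CRYPTED"      # stand-in for PACKET_STATE_CRYPTED


class Packet:
    """Toy packet carrying a sequence number and a decryption state."""

    def __init__(self, seq):
        self.seq = seq
        self.state = UNCRYPTED


def rx_poll(peer_queue, budget):
    """Deliver packets from the head of the queue while they are decrypted.

    Returns the sequence numbers delivered in this poll, at most `budget`.
    """
    delivered = []
    while peer_queue and len(delivered) < budget:
        head = peer_queue[0]       # peek at the first in-order packet
        if head.state != CRYPTED:  # head is still waiting for its worker
            break                  # later packets cannot overtake it
        delivered.append(peer_queue.popleft().seq)
    return delivered
```

If packets 1 to 3 are marked `CRYPTED` while packet 0 is still `UNCRYPTED`, `rx_poll` delivers nothing, because the head of the queue blocks everything behind it. Once packet 0 is marked `CRYPTED` as well, a later poll delivers all four in order. In this model progress therefore depends on the poll being run again after the head packet completes.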
Related identifiers
Affected products
Linux