0x00. Before We Start
CVE-2024-0582 is a use-after-free vulnerability in the Linux kernel’s io_uring subsystem, caused by a missing check on whether a ring buffer’s memory is still mapped into user space when it is freed. An unprivileged attacker can exploit it by registering a ring buffer whose memory is allocated by the kernel via IORING_REGISTER_PBUF_RING in an io_uring instance, calling mmap() on it, and then freeing the ring buffer. This flaw allows an unprivileged local user to crash the system or escalate privileges.
The CVSS score of this vulnerability is 7.8, detailed as follows.
Score | Severity | Version | Vector String |
---|---|---|---|
7.8 | High | 3.1 | CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H |
0x01. Analysis of the vulnerability
In this article, we use version 6.5 of the Linux kernel source code for the detailed analysis.
io_uring provides three new system calls:
- io_uring_setup(): Creates a new io_uring context, which mainly consists of an SQ (submission queue) and a CQ (completion queue) with a specified number of entries. A file descriptor is returned for further operations.
- io_uring_register(): Configures a specific io_uring instance. Available operations include registering new buffers, updating the contents of buffers, and unregistering buffers.
- io_uring_enter(): Submits new I/O requests; the caller can choose whether to wait synchronously for the I/O to complete.
The prototype of the io_uring_register() syscall is as follows:
1 | SYSCALL_DEFINE4(io_uring_register, unsigned int, fd, unsigned int, opcode, |
The core function of this system call, __io_uring_register(), contains a large switch statement that handles each opcode by calling the corresponding function. We mainly focus on the case for IORING_REGISTER_PBUF_RING.
PBUF_RING Internal
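The pbuf (i.e., provided buffer) is a feature of io_uring that lets an application hand the kernel a pool of buffers in advance, so that the kernel can pick one when a request needs it. A PBUF_RING is a ring-shaped, shared-memory variant of this mechanism.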
I. Ring Registration: IORING_REGISTER_PBUF_RING
io_uring allows users to create a ring buffer by passing the opcode IORING_REGISTER_PBUF_RING to io_uring_register(), which eventually calls io_register_pbuf_ring():
1 | int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg) |
Ignoring the parameter checks, let us look at its core logic:
- First, it calls io_buffer_get_list() to obtain the existing io_buffer_list structure, or allocates a new one if none exists.
- If the IOU_PBUF_RING_MMAP bit is set in the flags of the request, it calls io_alloc_pbuf_ring() to allocate contiguous pages; otherwise it calls io_pin_pbuf_ring() to pin user-space pages for the ring.
- Once that is done, the result is written into the io_buffer_list structure, which is then saved into the current context.
As the vulnerability lies on the code path related to mmap(), we focus on the path that calls io_alloc_pbuf_ring(), which eventually calls __get_free_pages() to allocate pages.
1 | static int io_alloc_pbuf_ring(struct io_uring_buf_reg *reg, |
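To get a feel for how much memory this path allocates, here is a small user-space sketch of the size arithmetic. It assumes that each ring entry occupies 16 bytes (a u64 address, a u32 length, and two u16 fields), that pages are 4096 bytes, and that the page order is the smallest power-of-two page count covering the ring, as get_order() computes. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE      4096u /* assumed page size */
#define SKETCH_BUF_ENTRY_SIZE 16u   /* assumed size of one ring entry */

/* Size in bytes of a ring with the given (power-of-two) number of entries. */
size_t pbuf_ring_bytes(uint32_t ring_entries)
{
    assert(ring_entries != 0 && (ring_entries & (ring_entries - 1)) == 0);
    assert(ring_entries < 65536);
    return (size_t)ring_entries * SKETCH_BUF_ENTRY_SIZE;
}

/* Smallest order such that (SKETCH_PAGE_SIZE << order) >= size. */
unsigned int pages_order(size_t size)
{
    unsigned int order = 0;
    size_t span = SKETCH_PAGE_SIZE;

    assert(size > 0);
    while (span < size) {
        span <<= 1;
        order++;
    }
    return order;
}
```

Under these assumptions, a ring with 256 entries fits exactly into one page (order 0), which is a convenient size when the freed page is later meant to be reused by another single-page allocation.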
The structure of io_buffer_list is shown in the following figure.
II. Unregistration: IORING_UNREGISTER_PBUF_RING
As the counterpart to registration, io_uring allows users to unregister a PBUF_RING with the opcode IORING_UNREGISTER_PBUF_RING, which is handled by io_unregister_pbuf_ring().
1 | int io_unregister_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg) |
Its core logic is:
- First, it calls io_buffer_get_list() to obtain the existing io_buffer_list structure; if none exists, it returns.
- Then it calls __io_remove_buffers() to release the pages stored in the io_buffer_list structure.
- Finally, it calls xa_erase() to remove this io_buffer_list from the context and frees it as well.
Before looking into __io_remove_buffers(), let’s take another quick look at io_alloc_pbuf_ring(). Notice that some members of the io_buffer_list are assigned specific values.
1 | static int io_alloc_pbuf_ring(struct io_uring_buf_reg *reg, |
Hence __io_remove_buffers() takes the following path to release the pages allocated earlier.
1 | static int __io_remove_buffers(struct io_ring_ctx *ctx, |
In newer versions of this function, put_page_testzero() is replaced by folio_put(virt_to_folio(bl->buf_ring));, but the core logic is the same.
III. Usage: io_uring_mmap
How can we access the pages of a PBUF_RING? An easy way is to call mmap() on the io_uring file descriptor, which ends up in io_uring_mmap().
1 | static __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma) |
io_uring_validate_mmap_request() first determines the specific operation from the offset parameter of the mmap() syscall. This value is therefore not a conventional file offset: its higher bits encode the type and its lower bits encode the value. We mainly focus on the path related to PBUF_RING.
1 | static void *io_uring_validate_mmap_request(struct file *file, |
The logic of io_pbuf_get_address() is much simpler: it just returns the buf_ring allocated earlier.
1 | void *io_pbuf_get_address(struct io_ring_ctx *ctx, unsigned long bgid) |
Root Cause
From the code analysis above, it is clear that the code releasing a PBUF_RING does not check whether the ring is still mapped via mmap(). We can therefore still access the freed pages through the memory-mapped region after releasing the ring buffer, which is a use-after-free vulnerability.
Proof Of Concept
The following code is a proof of concept that I wrote. It simply exploits the UAF to overwrite seq_file::seq_operations and cause a kernel panic. Note that it must be compiled and linked against the liburing library.
1 | /** |
0x02. Exploitation
The vulnerability gives us the ability to read and write the use-after-free memory with almost no restrictions, so it is easy to exploit with many different techniques.
The following exploit, which I wrote, reallocates the UAF page as pipe_buffer objects and overwrites pipe_buffer::page to gain arbitrary kernel memory read and write. It then uses this capability to overwrite the cred of the current process and complete a local privilege escalation.
1 | /** |
0x03. Patch
This vulnerability was fixed in commit c392cbecd8eca4c53f2bf508731257d9d0a21c2d, which makes the following changes:
- Add a linked list to record the data needed for the delayed release.
- Delay the release of the ring buffer until the io_uring is closed (i.e., when file_operations::release() is called in the kernel), so the memory is reclaimed only after the mmap() region has been destroyed.
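The idea of the fix can be modelled in user space as follows. This is a rough sketch of the deferred-release pattern described above, not the patch itself: the fixed-size array stands in for the kernel's linked list, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_RINGS 8

/* Abstract state of one ring; not a kernel structure. */
struct deferred_ring {
    bool is_mmap;         /* memory was allocated by the kernel for mmap() */
    bool pages_allocated; /* the pages have not been released yet */
};

/* Abstract io_uring context holding the rings whose release is deferred. */
struct ctx_model {
    struct deferred_ring *deferred[MODEL_MAX_RINGS];
    size_t nr_deferred;
};

/* Patched behaviour: mmap-capable rings are parked instead of freed. */
void model_unregister_patched(struct ctx_model *ctx, struct deferred_ring *ring)
{
    assert(ctx != NULL && ring != NULL && ring->pages_allocated);

    if (ring->is_mmap) {
        assert(ctx->nr_deferred < MODEL_MAX_RINGS);
        ctx->deferred[ctx->nr_deferred++] = ring;
        return;
    }
    ring->pages_allocated = false;
}

/* Called when the io_uring is closed: now the parked pages are released. */
void model_release(struct ctx_model *ctx)
{
    assert(ctx != NULL);

    for (size_t i = 0; i < ctx->nr_deferred; i++)
        ctx->deferred[i]->pages_allocated = false;
    ctx->nr_deferred = 0;
}
```

In this model, unregistering an mmap-capable ring no longer changes pages_allocated; only model_release() does, which corresponds to the point where no mapping of the ring can remain.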