Customer Service
kitctf
Task: a holpy (HOL theorem prover) proof checker accepts a user-supplied proof and prints the flag if you can prove the const false with no hypotheses. Solution: chain an SMT-translation soundness bug (real_divide done in Python float, 2^54+2 / 2 collides to 2^53) to get a HOL-false-but-Z3-valid lemma, then exploit a dead axiom guard (list == int) to define false as that lemma, and discharge it with primitive kernel rules.
$ ls tags/ techniques/
Customer Service — GPNCTF 2024 (kitctf / gpn24)
Description
The customer is always right. RIGHT? Experienced staff will tell you that customers are always worst case users. Just last week one customer proclaimed that pineapple does not belong on sushi pizza. Yeah I know, how could he? But the customer is always right. My friend fears for his sanity. So please help me work out the logic details for such an argument.
English summary: we are given checker.py, a proof checker built on a holpy fork (the HOL theorem prover in Python). It reads a hex-encoded JSON "proof" from the user, checks it, and if the checked theorem is the constant false with no hypotheses and no assumptions, it calls win() and prints the flag.
The flavor text is the spec: "the customer is always right" = the checker trusts the claimed theorem; "worst case users" = adversarial input; "work out the logic details for such an argument" = construct a (bogus) logical proof of falsehood. This is a soundness/explosion challenge: prove ⊢ false.
Win condition
checker.py:
hex→ bytes → UTF-8 JSON{"imports": [...], "content": [item, ...]}.- Each import →
basic.load_theory(imp)(failures only warned). - Each content item →
items.parse_item, then:thmitem:monitor.check_proof(item, rewrite=False). If status isOK/ProofOK, runreport = theory.thy.checked_extend(item.get_extension()).- else: same
checked_extend.
- Axiom guards on each fresh
report:if len(report.get_axioms()) > 1: exit(1)elif report.get_axioms() == 1 and item.ty != "thm": exit(1)← dead code (see Bug 2)
- Fetch the theorem and call
theorem_proves_false_unconditioned(thm), which is true iff the conclusion is the constfalse(prints asfalse/?false) with no assumptions and no hypotheses. If true →win().
So: produce a thm item whose proof checks and whose theorem is ⊢ false (empty hyps/assums).
Analysis — the "no library" environment
The critical, intentional environment fact: the holpy fork's wheel packaging
([tool.hatch.build.targets.wheel]) lists the code packages but excludes the
library/ directory. In the deployed uv run checker.py install there is
therefore no theory library at all — every basic.load_theory('nat' | 'real' | 'int' | ...) FAILS.
What remains usable:
- The
EmptyTheory: typesbool,fun; constsequals,implies,all. - Whatever macros were registered at import time. Proof checking runs at the
default
check_level=0, which means level-0 macros are TRUSTED (theirevalresult is taken as-is, the macro is not expanded and re-verified from kernel primitives).
The only usable trusted level-0 macros are z3 and nat_eval.
data.real / data.integer cannot even be imported (they load_theory at
module top level and crash). This restriction is what shapes the whole solve:
you cannot just import arithmetic theorems and rewrite — you must manufacture
everything from the empty theory plus the two trusted SMT-ish macros.
Bug 1 — z3 macro: real division done in Python float (SMT translation soundness hole)
The z3 macro is level = 0 ⇒ TRUSTED. Its eval(args, prevs):
- builds
assms = [prev.prop for prev in prevs], - calls Z3
solve(Implies(*(assms + [args]))), - if Z3 says the implication is valid, returns
Thm(args, *(th.hyps for th in prevs)).
I.e. if Z3 considers it valid, the macro hands back the claimed conclusion
args with the union of the prevs' hypotheses (empty when prevs are no-hyp).
The soundness of this hinges entirely on z3wrapper.convert faithfully
translating HOL terms to Z3.
It does not. For a HOL real_divide term a / b:
- if the whole divide term
is_number(a normalized fraction, gcd handling),convertuses exactdest_number/Fraction— safe; - otherwise (a non-normalized fraction, gcd ≠ 1) it recurses:
rec(t.arg1) / rec(t.arg2)— that is Pythonint / int, producing a float.
So pick numbers where is_number is false and 64-bit float division loses
precision. With
N = 2**53 = 9007199254740992M = 2*(N+1) = 2**54 + 2 = 18014398509481986
then:
- In HOL,
(M::real)/2is exactlyN+1. - In Python,
float(M)/2rounds to2**53 = N(the 53-bit mantissa cannot representN+1).
Therefore the proposition (M::real)/2 ≤ N:
- is HOL-FALSE (
N+1 ≤ N), - but Z3 sees it as a tautology (after the float collision:
2^53 ≤ 2^53).
The trusted z3 macro thus proves, with no hypotheses:
⊢ ((18014398509481986::real) / 2 ≤ (9007199254740992::real))
a bogus, HOL-false but checker-accepted lemma. Call its RHS RHS.
The consistency wall — why z3 alone cannot reach false
You might hope to ask the z3 macro directly for ⊢ false. You can't:
- Every sentence the
z3macro returns is, by construction, a Z3-tautology. - The HOL const
falseconverts to Z3BoolVal(False), which is not a tautology, sosolve(Implies(false))is not valid — the macro refuses. - There is no no-hyp route to a Z3-False theorem either.
So z3 gives you a HOL-false proposition (RHS), never the literal const
false. nat_eval doesn't bridge the gap either: it gates on is_number
exactly like convert, so on terms it accepts it never disagrees with z3.
The missing link is a way to equate the bogus proposition RHS with the const
false. That bridge is Bug 2.
Bug 2 — the dead axiom guard lets us def false := RHS
The checker's second guard:
elif report.get_axioms() == 1 and item.ty != "thm": sys.exit(1)
get_axioms() returns a list; comparing a list to the int 1 is always
False. The elif is dead. Only len(report.get_axioms()) > 1 is enforced,
and report is fresh per item, so each item may freely introduce exactly 1
axiom.
A def (Definition) item emits [Constant(...), Theorem(name+"_def", Thm(prop))].
The Theorem has no proof, so checked_extend records it as 1 axiom
(allowed: len == 1, not > 1). Also note: def.ax constants and type.ax
types are not axioms at all, so we may declare arbitrarily many supporting
constants/types for free.
Therefore we can define a fresh const false :: bool together with the
"free" axiom:
false_def : false ⟷ RHS
That is the bridge. Now false is provably equal to a proposition we can prove
with the z3 macro.
Solution — the content array
Items, in order:
-
type.ax: declare typesnatandreal. The EmptyTheory has neither, and our consts mention them (e.g.of_nat :: nat ⇒ real). Gotcha: first attempts failed withUnknown type nat— these declarations are mandatory. -
def.axaxiomatic constants (FREE, not axioms) so the numerals / division /≤parse with concrete real types (avoids fragile overload resolution through the pipeline):zero :: real,one :: realbit0 :: real ⇒ real,bit1 :: real ⇒ realof_nat :: nat ⇒ realreal_divide :: real ⇒ real ⇒ realless_eq :: real ⇒ real ⇒ bool
-
defitem: namefalse, typebool, propfalse ⟷ ((18014398509481986::real) / 2 ≤ (9007199254740992::real))(M=2^54+2 over 2, vs N=2^53). Adds the constfalseand axiomfalse_def. -
thmitempwn, propfalse, with an explicit low-level proof (list of proof-term steps):step rule args / prevs resulting thm 0 z3args = RHS, prevs[]⊢ RHS(Bug 1 float collision)1 theoremargs false_def⊢ false ⟷ RHS2 symmetricprevs [1]⊢ RHS ⟷ false3 equal_elimprevs [2, 0]⊢ falseStep 3 is a pure kernel rule (no z3):
equal_elimtakesA ⟷ BandAto giveB. HereA = RHS(from step 0),A ⟷ B = RHS ⟷ false(from step 2), yielding⊢ false— the constfalse, no hyps, no assums. -
Result:
monitor.check_proof→ProofOK;checked_extendreportslen(axioms) == 1(the thm itself, allowed becausety == "thm");theorem_proves_false_unconditionedreturns True →win()prints the flag.
Notes on the term strings: holpy uses unicode operators ⟷ (iff / bool
equality), ≤ (less_eq), ⊢ (turnstile), ⇒ for function types. Bare
numerals need explicit ::real annotations; the LHS of / forces real
inference. Internally numbers are of_nat (bit1/bit0 ... one), which is why the
bit0/bit1/of_nat/one/zero consts must be declared.
Exploit script
#!/usr/bin/env python3 # Customer Service — GPNCTF 2024 (kitctf) # Builds a holpy proof of `false` via: # Bug 1: z3 macro real_divide -> Python float collision (2^54+2)/2 == 2^53 # Bug 2: dead axiom guard (list == 1) lets a `def` inject `false_def` import json, ssl, socket N = 9007199254740992 # 2**53 M = 18014398509481986 # 2**54 + 2 RHS = f"(({M}::real) / 2 ≤ ({N}::real))" content = [ # 1) types (EmptyTheory has neither) -- fixes "Unknown type nat" {"ty": "type.ax", "name": "nat", "args": []}, {"ty": "type.ax", "name": "real", "args": []}, # 2) free axiomatic constants so numerals / "/" / "<=" parse over real {"ty": "def.ax", "name": "zero", "type": "real"}, {"ty": "def.ax", "name": "one", "type": "real"}, {"ty": "def.ax", "name": "bit0", "type": "real ⇒ real"}, {"ty": "def.ax", "name": "bit1", "type": "real ⇒ real"}, {"ty": "def.ax", "name": "of_nat", "type": "nat ⇒ real"}, {"ty": "def.ax", "name": "real_divide", "type": "real ⇒ real ⇒ real"}, {"ty": "def.ax", "name": "less_eq", "type": "real ⇒ real ⇒ bool"}, # 3) define `false` as the bogus (HOL-false) proposition -> axiom false_def {"ty": "def", "name": "false", "type": "bool", "prop": f"false ⟷ {RHS}"}, # 4) prove `⊢ false` with 4 proof-term steps {"ty": "thm", "name": "pwn", "prop": "false", "proof": [ {"id": "0", "rule": "z3", "args": RHS, "prevs": [], "th": f"⊢ {RHS}"}, {"id": "1", "rule": "theorem", "args": "false_def","prevs": [], "th": f"⊢ false ⟷ {RHS}"}, {"id": "2", "rule": "symmetric", "args": "", "prevs": ["1"], "th": f"⊢ {RHS} ⟷ false"}, {"id": "3", "rule": "equal_elim", "args": "", "prevs": ["2", "0"], "th": "⊢ false"}, ]}, ] payload = {"imports": [], "content": content} hexproof = json.dumps(payload).encode("utf-8").hex() print(f"[*] hex length: {len(hexproof)}") HOST = "<instance-host>.gpn24.ctf.kitctf.de" PORT = 443 ctx = ssl.create_default_context() ctx.check_hostname = False ctx.verify_mode = ssl.CERT_NONE with socket.create_connection((HOST, PORT)) as raw: with ctx.wrap_socket(raw, server_hostname=HOST) as s: # Delivery gotcha: read the banner/prompt FIRST, then send one line. banner = s.recv(4096) print(banner.decode(errors="replace")) s.sendall(hexproof.encode() + b"\n") data = b"" while True: chunk = s.recv(4096) if not chunk: break data += chunk print(data.decode(errors="replace"))
Delivery quirk
The JSON is UTF-8 then hex encoded (~2658 hex chars) and sent as a single line. The live service is reached with TLS on port 443:
ncat --ssl <instance-host>.gpn24.ctf.kitctf.de 443
It terminates TLS into a line service that prints give me your hex proof and
then reads one line. Read the banner prompt first, then send hex + "\n".
Sending the hex with a trailing newline before reading the banner, or piping
via plain ncat, sometimes hit a front HTTP gateway returning 400 Bad Request;
reading the prompt first avoids that race.
Final response:
✓ Proof check passed
Congratulations! You've found the flag: GPNCTF{Ex-UN4-LINEa-v4cuA-s3qui7UR-Qu0DLiB37}
$ cat /etc/motd
Liked this one?
Pro unlocks every writeup, every flag, and API access. $9/mo.
$ cat pricing.md$ grep --similar
Similar writeups
- [crypto][free]Guess the Taste— kitctf
- [crypto][free]COMpetition— gpn24
- [pwn][Pro]Taste— grodno_new_year_2026
- [reverse][free]Auto Cooker— GPNCTF 2026
- [crypto][Pro]Gambler's Fallacy— uoftctf2026