reversefreemedium

roulette

umdctf

Task: a stripped static ELF64 validator checks a 106-byte input in 27 four-byte chunks using several .rodata lookup tables, a custom helper routine, and a tiny bytecode VM. Solution: extract the tables and VM bytecode with pyelftools, evaluate the helper concretely with angr, reimplement the VM and rolling state update in Python, and reconstruct the flag directly from the expected chunk values.

$ ls tags/ techniques/

vm static_binary stripped_binary elf_x86_64 table_driven_validator rodata_tables angr pyelftools

table_extractionobjdump_reversingvm_reimplementationconcrete_function_evaluationscripted_validator_recovery

roulette — umdctf

Description

No separate organizer description was present in the provided workspace files.

English summary: the challenge provides a stripped static Linux ELF called roulette. The program asks for a roulette number, but the actual validation logic is a scripted reverse-engineering problem: it checks a 106-byte string through a helper arithmetic routine, several lookup tables in .rodata, a tiny bytecode VM, and a rolling per-chunk state.

Analysis

Recon

The first step was simple triage:

file roulette
checksec --file=roulette
strings roulette

That immediately gave the important high-level picture.

file showed a stripped static ELF64 x86-64 binary.

checksec showed:

Partial RELRO
no canary
NX enabled
no PIE
SHSTK enabled
IBT enabled

strings exposed the user-facing messages and confirmed that this was a normal validator rather than a crypto service or packed binary. The most useful strings were:

lets go gambling!
submit roulette number:
accepted
rejected
JACKPOT! You guessed the winning number.

The environment mattered here. I did not have qemu, gdb, rizin, or readelf, so the solve path had to rely on:

objdump
pyelftools
angr

That ended up being enough.

As a precedent, I also looked at the knowledge-base writeup 20260128_htb_flagcasino. That challenge was not the same, but it was a useful reminder that some gambling-themed reverse tasks are easiest to solve by scripting the validator instead of trying to manually recover every intermediate constant.

Main validation routine

Using objdump -d on the interesting region showed that the main validation routine starts at 0x401690.

Important observations from that function:

Input is read with fgets into a 0x100-byte buffer.
The newline is stripped with strcspn.
The program requires the resulting length to be exactly 0x6a = 106 bytes.
Validation then proceeds in 27 chunks of 4 bytes each.

The last point looks odd at first, because 27 * 4 = 108, not 106. The reason becomes clear once the solver is reconstructed: the recovered byte stream is 108 bytes long, but the last two bytes are 0x00 0x00. The program only expects the first 106 non-NUL user-controlled bytes, so the printable flag string is exactly 106 bytes long and naturally stops before those two trailing zeros.

The important `.rodata` tables

The validator pulls from several fixed tables stored in .rodata:

target table: 0x499ce0
table A: 0x499d60
table B: 0x499de0
table C: 0x499e60
table D: 0x499ee0

Each table contains 27 dwords, one per 4-byte chunk.

The main loop computes two helper inputs from the current chunk index and rolling state:

one branch uses r15 ^ 0x28d88f59, rotates it, reduces modulo tblC[i], then XORs with tblA[i]
the other branch uses (i * 0x45d9f3b) ^ 0x8aa115a0, rotates it, reduces modulo tblD[i], then XORs with tblB[i]

Those values are fed into a helper arithmetic routine at 0x401cd0.

The helper function at `0x401cd0`

The helper is the annoying part of the challenge.

Disassembly shows a deeply nested arithmetic routine with repeated divisions and remainder handling. It is absolutely possible to reimplement it by hand, but it is exactly the kind of code where one wrong signedness assumption or one missed quotient/remainder dependency wastes a lot of time.

That made angr the pragmatic choice here.

Instead of symbolically solving anything hard, I used angr in the simplest possible way: create a blank state with zero-filled unconstrained memory/registers, wrap the helper entry point as a callable, and ask for the concrete return value for each chunk.

This keeps the solver short and avoids writing a fragile manual port of the helper.

The embedded VM

After the helper returns, the validator runs a tiny bytecode VM.

The relevant handler addresses are:

copy at 0x40182b
rol-by-reg at 0x4018a0
mul-imm at 0x401988
add-imm at 0x4019a0
add-reg at 0x4019b8
xor-imm at 0x4019d0
xor-reg at 0x4019e8
shr-imm at 0x401a00

The bytecode starts at 0x499be1. Instructions are 8 bytes each, and dispatch is controlled by the previous record's last byte. The stream terminates with opcode 0xff, and the VM returns local slot 6.

The only slightly unusual detail is the startup behavior: there is an initial forced copy before normal dispatch begins. Once that is replicated correctly, the VM is straightforward to port.

The VM uses eight 32-bit local slots. For each chunk:

loc[0] = helper_out
loc[1] = ebx (the rolling state)
loc[2] = chunk index

The VM output is then XORed with the corresponding dword from the target table.

That gives the expected plaintext 4-byte chunk:

chunk[i] = target[i] ^ vm_out

Concatenating all 27 reconstructed chunks yields 108 bytes total.

Rolling state update

The binary does not validate chunks independently. After each chunk, it updates two rolling 32-bit state values:

r15
rbx

The update logic is:

mix = ((helper_out * 0x45d9f3b) ^ rbx ^ chunk)
rbx = rol32(mix, (i % 5) + 7) + r15 - 0x61c88647
r15 = r15 + 0x7f4a7c15

This matters because rbx becomes the VM seed for the next iteration.

When the solver is implemented correctly, the final reconstructed state matches the binary's last check:

ebx == 0x6286239c

That is a very useful confirmation that the script matches the real validator rather than merely printing something flag-shaped.

Why the result is 108 bytes but the input is 106 bytes

The validator processes 27 groups of 4 bytes, so the recovered stream is naturally 108 bytes long.

However, the accepted user input length is strictly checked to be 106 bytes. The reason is that the last recovered dword ends with two trailing NULs:

... d}\x00\x00

So the true human-entered input is simply the first 106 bytes, which decode directly to:

UMDCTF{I_R3ALLY-want-to-pl4y-the-p0werball,+but-my-d4d-said-no-so-im-b3tting-ill-win-on-POLYMARKETinstead}

Solution

The clean solve path was:

Use objdump to map the validator and identify the chunked structure, helper function, VM handlers, and .rodata addresses.
Use pyelftools to extract the five lookup tables and the VM bytecode directly from .rodata.
Use angr to evaluate the helper function at 0x401cd0 concretely for each chunk.
Reimplement the tiny VM in Python.
Recreate the rolling rbx / r15 state update.
Compute each chunk as target[i] ^ vm_out.
Join all 27 chunks, then print only the first 106 bytes.

#!/usr/bin/env python3
import angr
from angr import options as o
from elftools.elf.elffile import ELFFile

MASK32 = 0xffffffff

def u32(x):
    return x & MASK32

def rol32(x, n):
    x &= MASK32
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

p = angr.Project('roulette', auto_load_libs=False)
state = p.factory.blank_state(
    add_options={o.ZERO_FILL_UNCONSTRAINED_MEMORY, o.ZERO_FILL_UNCONSTRAINED_REGISTERS}
)
helper = p.factory.callable(0x401cd0, base_state=state)

with open('roulette', 'rb') as f:
    elf = ELFFile(f)
    ro = elf.get_section_by_name('.rodata')
    data = ro.data()
    base = ro['sh_addr']

    def dwords(addr, n):
        off = addr - base
        return [int.from_bytes(data[off + 4*i:off + 4*(i+1)], 'little') for i in range(n)]

    def bytes_at(addr, n):
        off = addr - base
        return data[off:off+n]

    target = dwords(0x499ce0, 27)
    tblA = dwords(0x499d60, 27)
    tblB = dwords(0x499de0, 27)
    tblC = dwords(0x499e60, 27)
    tblD = dwords(0x499ee0, 27)
    bc = bytes_at(0x499be1, 8 * 40)


def exec_handler(loc, op, inst):
    if op in (0, 1):
        dst = inst[0] & 7
        src = inst[1] & 7
        loc[dst] = loc[src]
    elif op == 2:
        dst = inst[0] & 7
        src = inst[1] & 7
        loc[dst] = u32(loc[dst] ^ loc[src])
    elif op == 3:
        dst = inst[0] & 7
        imm = int.from_bytes(inst[3:7], 'little')
        loc[dst] = u32(loc[dst] ^ imm)
    elif op == 4:
        dst = inst[0] & 7
        src = inst[1] & 7
        loc[dst] = u32(loc[dst] + loc[src])
    elif op == 5:
        dst = inst[0] & 7
        imm = int.from_bytes(inst[3:7], 'little')
        loc[dst] = u32(loc[dst] + imm)
    elif op == 6:
        dst = inst[0] & 7
        imm = int.from_bytes(inst[3:7], 'little')
        loc[dst] = u32(loc[dst] * imm)
    elif op == 7:
        dst = inst[0] & 7
        imm = int.from_bytes(inst[3:7], 'little')
        loc[dst] = rol32(loc[dst], imm)
    elif op == 8:
        dst = inst[0] & 7
        src = inst[1] & 7
        loc[dst] = rol32(loc[dst], loc[src])
    elif op == 9:
        dst = inst[0] & 7
        imm = int.from_bytes(inst[3:7], 'little')
        loc[dst] = u32(loc[dst] >> (imm & 31))
    else:
        raise ValueError(op)


def run_vm(helper_out, ebx, idx):
    loc = [0] * 8
    loc[0] = u32(helper_out)
    loc[1] = u32(ebx)
    loc[2] = u32(idx)
    ip = 0

    # Initial forced copy before normal dispatch.
    exec_handler(loc, 1, bc[ip:ip+8])

    while True:
        op = bc[ip + 7]
        ip += 8
        if op > 9:
            return loc[6] if op == 0xff else 0
        exec_handler(loc, op, bc[ip:ip+8])


rbx = 0x8aa115a0
r15 = 0
out = bytearray()

for i in range(27):
    rem1 = rol32(r15 ^ 0x28d88f59, (i % 11) + 5) % tblC[i]
    arg3 = u32(tblA[i] ^ rem1)

    rem2 = rol32((i * 0x45d9f3b) ^ 0x8aa115a0, (i % 13) + 3) % tblD[i]
    arg1 = u32(tblB[i] ^ rem2)

    helper_out = helper(arg1, tblD[i], arg3, tblC[i]).concrete_value & MASK32
    vm_out = run_vm(helper_out, rbx, i)
    chunk = u32(target[i] ^ vm_out)
    out.extend(chunk.to_bytes(4, 'little'))

    mix = u32((helper_out * 0x45d9f3b) ^ rbx ^ chunk)
    rbx = u32(rol32(mix, (i % 5) + 7) + r15 - 0x61c88647)
    r15 = u32(r15 + 0x7f4a7c15)

assert rbx == 0x6286239c
print(out[:106].decode())

Running the script prints:

UMDCTF{I_R3ALLY-want-to-pl4y-the-p0werball,+but-my-d4d-said-no-so-im-b3tting-ill-win-on-POLYMARKETinstead}

$ cat /etc/motd

Liked this one?

Pro unlocks every writeup, every flag, and API access. $9/mo.

$ cat pricing.md

$ grep --similar

Similar writeups

[reverse][free]FlagCasino— hackthebox
[reverse][Pro]Basic— spbctf
[reverse][free]nuclear_codes— umdctf
[pwn][free]ipv8— umdctf
[pwn][free]velvet-table— umdctf