reversefreemedium

bike

b01lersc

Task: 64-bit ELF builds a Shannon-Fano-like prefix-code tree from a global frequency table, walks 32 bytes of user input bit-by-bit, and compares the decoded leaf sequence to a fixed 35-byte buffer. Solution: faithfully reconstruct the tree using float32 arithmetic (critical), encode the expected bytes via the resulting codebook, and pack the 246-bit stream MSB-first into 31 input bytes.

$ ls tags/ techniques/
reverse_engineeringshannon_fano_tree_reconstructionprefix_code_encodingfloat32_precision_matchingbit_stream_packing

bike — b01lersc

Description

Get your bike down from the tree!!

A 64-bit ELF binary with all modern mitigations enabled (Full RELRO, Stack canary, NX, PIE, SHSTK, IBT). The program asks for input, decodes it bit-by-bit through a tree built from a global frequency table, and compares the decoded output against a fixed 35-byte buffer. The goal is to find the input that produces the expected decoded sequence — which spells out the flag.

Analysis

Program structure

The binary contains three functions:

  • setup — disables stdout buffering.
  • build_tree(arr, lo, hi) — recursive constructor that returns a pointer to a node (leaf or internal).
  • main — reads up to 31 bytes via fgets, walks them through the tree, and compares against the global expected buffer.

The global data

At VA 0x4020 there is a freqs array of 128 entries, each laid out as:

struct entry { uint8_t byte; // +0x00 uint8_t pad[3]; // +0x01..+0x03 float freq; // +0x04 (float32) }; // sizeof = 8

The table covers all 128 ASCII values in a specific (non-monotonic within tied freqs) order. Frequencies are in descending order and take one of the following values:

0.18, 0.16, 0.13, 0.12, 0.11, 0.10, 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02

At VA 0x4420 there is a 35-byte expected buffer:

53 03 64 52 55 3e 4a 6b 04 66 0f 26 4c 42 3c 18 58 4a 43 53 16 63 7f 0d 0a 36 0b 16 4c 44 05 2c 74 45 43

How build_tree works

This is a classic top-down Shannon–Fano-style split on an already sorted-by-frequency array:

Node *build_tree(entry *arr, int lo, int hi) { if (lo == hi) return leaf(arr[lo].byte, arr[lo].freq); float total = 0.0f; for (int i = lo; i < hi; i++) // NOTE: hi is EXCLUSIVE here total += arr[i].freq; float running = 0.0f; int split = lo; for (int i = lo; i <= hi; i++) { // NOTE: hi is INCLUSIVE here if (arr[i].freq + running > total / 2.0f) break; running += arr[i].freq; split = i; } Node *L = build_tree(arr, lo, split); Node *R = build_tree(arr, split + 1, hi); return internal(L, R); }

Two subtleties are worth highlighting — both of which become gotchas during reimplementation:

  1. The total is summed over [lo, hi) (exclusive upper bound) while the split-finding loop iterates over [lo, hi] (inclusive upper bound). This asymmetry is intentional and must be preserved exactly.
  2. All arithmetic is IEEE-754 single precision (float32). Using Python's default float (which is float64) yields a slightly different tree at several boundaries where running + arr[i].freq sits near total/2.0 — the rounding differs between the two widths.

How main decodes the input

Pseudocode:

current = root count = 0 for i in range(strlen(input)): for j in range(8): if current.is_leaf(): output[count] = current.byte count += 1 current = root bit = (input[i] >> (7 - j)) & 1 # MSB first current = current.left if bit == 0 else current.right

Then the binary compares output[0..count-1] against expected[0..count-1]. Printing Congrats on getting your bike down! on match.

Note that the leaf-emit happens at the start of each inner iteration, so the walker can pass through a leaf, emit it, and immediately continue into the next bit — this is the standard prefix-code decoding loop. Inputs that leave the walker mid-path at EOF are simply silently ignored by the final comparison.

Solution

The task is an inverse problem: given the fixed codebook (defined by the tree) and the target byte sequence expected, produce the MSB-first bit stream that encodes it, and pack those bits into bytes.

Steps:

  1. Parse the .data section of the ELF at runtime and extract both freqs (128 entries) and expected (35 bytes).
  2. Rebuild the tree with exactly the same control flow and float32 arithmetic as the C code.
  3. Walk the tree to produce a codebook byte -> "01..." (the left edge is 0, the right edge is 1).
  4. Concatenate codes[b] for each byte b in expected — this yields 246 bits.
  5. Pad to 31 * 8 = 248 bits with 0 bits and pack MSB-first into 31 bytes.

With proper float32 arithmetic the result decodes to:

bctf{now_you_can_bike_around:)|

The last character is | because the final two padding zero-bits landed inside its encoding. Since the final character of any CTF flag is almost always }, and } differs from | only in the low two bits, replacing the last byte gives the intended flag:

bctf{now_you_can_bike_around:)}

The binary accepts this input because the trailing bits of }, after the 246 bits that completely consume expected, land partway through the tree and therefore never produce another leaf — so they are effectively don't-cares, exactly as the problem design allows.

Solver script

#!/usr/bin/env python3 """Solve bike — reverse the Shannon-Fano-like decoder using float32 precision.""" import numpy as np from elftools.elf.elffile import ELFFile with open("bike", "rb") as f: data = f.read() f.seek(0) elf = ELFFile(f) for sec in elf.iter_sections(): if sec.name == ".data": data_vma = sec["sh_addr"] data_off = sec["sh_offset"] break freqs_off = data_off + (0x4020 - data_vma) expected_off = data_off + (0x4420 - data_vma) freqs = [] for i in range(128): entry = data[freqs_off + i * 8 : freqs_off + (i + 1) * 8] b = entry[0] f_val = np.frombuffer(entry[4:8], dtype=np.float32)[0] freqs.append((b, f_val)) expected = data[expected_off : expected_off + 35] class Node: __slots__ = ("byte", "left", "right") def __init__(self, byte=None, left=None, right=None): self.byte, self.left, self.right = byte, left, right def is_leaf(self): return self.left is None and self.right is None f32 = np.float32 def build_tree(arr, lo, hi): if lo == hi: return Node(byte=arr[lo][0]) total = f32(0.0) for i in range(lo, hi): # hi EXCLUSIVE total = f32(total + arr[i][1]) running = f32(0.0) split = lo i = lo while i <= hi: # hi INCLUSIVE candidate = f32(arr[i][1] + running) half = f32(total / f32(2.0)) if candidate > half: break running = f32(running + arr[i][1]) split = i i += 1 return Node(left=build_tree(arr, lo, split), right=build_tree(arr, split + 1, hi)) root = build_tree(freqs, 0, 0x7f) codes = {} def walk(node, path): if node.is_leaf(): codes[node.byte] = path return walk(node.left, path + "0") walk(node.right, path + "1") walk(root, "") bits = "".join(codes[b] for b in expected) # 246 bits bits_padded = bits + "0" * (31 * 8 - len(bits)) out = bytearray(int(bits_padded[i:i+8], 2) for i in range(0, 31 * 8, 8)) # out == b'bctf{now_you_can_bike_around:)|' # Replace trailing '|' with '}' — the same encoded prefix matches, # trailing bits fall off the end of the tree walk and are ignored. out[-1] = ord('}') print(bytes(out).decode())

Running it produces:

bctf{now_you_can_bike_around:)}

And feeding the same bytes to the binary:

$ ./bike < input.bin flag? Congrats on getting your bike down!

$ cat /etc/motd

Liked this one?

Pro unlocks every writeup, every flag, and API access. $9/mo.

$ cat pricing.md

$ grep --similar

Similar writeups