[crypto][free][medium]

AliEnS Challenge Scenario

HackTheBox

Task: attack an AES-ECB oracle with custom string padding and a fresh random key for every request. Solution: exploit the Unicode-versus-UTF-8 length mismatch to shift byte alignment, then recover the appended flag with same-request ECB block equality matching against a dictionary of candidate blocks.

Tags: unicode utf8 aes_ecb chosen_plaintext custom_padding byte_at_a_time block_alignment

Techniques: same_request_ecb_block_matching unicode_utf8_length_mismatch_abuse multibyte_alignment_shift byte_at_a_time_flag_recovery


Description

In a groundbreaking discovery, a research lab uncovered alien technology utilizing AES encryption with custom padding. They engineered a user-friendly interface to interact with this enigmatic advancement. Now, a challenge is presented: Can you crack it?

We are given an encryption oracle that pads the attacker input and the flag with a custom routine, concatenates them, UTF-8 encodes the result, and encrypts it with AES-ECB. The service generates a fresh random AES key for every request, which looks like a defense against classical ECB oracle attacks. The intended solution is to notice that a per-request key does not prevent equality checks between blocks produced inside the same request.

Summary

The break comes from a subtle type mismatch. AAES.pad() computes padding length on a Python str, where len() counts Unicode code points, but the ciphertext is produced only after .encode() converts the string to UTF-8 bytes. By injecting multibyte characters such as é, we can move the flag bytes to arbitrary byte offsets while the padding logic still reasons in character counts.

Once the target flag byte is aligned at the end of a 16-byte block, we place a full dictionary of candidate blocks earlier in the same plaintext. ECB with a random key still deterministically maps equal plaintext blocks to equal ciphertext blocks within that single encryption, so the matching ciphertext block reveals the correct next flag byte.

Analysis

The core code is:

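import os

from Crypto.Cipher import AES
# Assumed: the module-level pad() called in encrypt() is PyCryptodome's PKCS#7 pad.
from Crypto.Util.Padding import pad

class AAES():
    def __init__(self):
        self.padding = "CryptoHackTheBox"

    def pad(self, plaintext):
        # plaintext is a str here, so len() counts code points, not UTF-8 bytes.
        return plaintext + self.padding[:(-len(plaintext) % 16)] + self.padding

    def encrypt(self, plaintext):
        # A fresh random key is generated for every call.
        cipher = AES.new(os.urandom(16), AES.MODE_ECB)
        return cipher.encrypt(pad(plaintext, 16))

# Per request (aaes is an AAES instance, message is the attacker's input):
plaintext = aaes.pad(message) + aaes.pad(FLAG)
print(aaes.encrypt(plaintext.encode()).hex())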

At first this looks safe against classical ECB byte-at-a-time attacks because every query uses a new random key. Comparing ciphertext from one request to another is useless: the same plaintext block encrypts to unrelated ciphertext under different keys.

But ECB still has one property that survives random per-request keys:

inside a single encryption call, under the one random key chosen for that call, equal plaintext blocks always become equal ciphertext blocks.

That is enough. We do not need cross-request stability. We only need to put:

a target block containing one unknown flag byte, and
many attacker-built candidate blocks,

into the same plaintext of the same request. Then we compare ciphertext blocks from that one response. The key can be random every time and the equality test still works.
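The same-request property can be illustrated with a small local sketch. To stay within the standard library it does not use AES: HMAC-SHA256 truncated to 16 bytes stands in for the block cipher, and the name toy_ecb_encrypt is hypothetical. The stand-in is neither invertible nor AES, but it is deterministic per key, which is the only property the equality test relies on.

```python
import hashlib
import hmac
import os

BLOCK = 16


def toy_ecb_encrypt(plaintext):
    """ECB-like stand-in: every 16-byte block is mapped independently under a fresh key.

    HMAC-SHA256 truncated to 16 bytes replaces AES here (an assumption made so
    the sketch runs without third-party packages).
    """
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be a multiple of 16 bytes")
    key = os.urandom(16)  # new key on every call, as in the challenge
    return b"".join(
        hmac.new(key, plaintext[off:off + BLOCK], hashlib.sha256).digest()[:BLOCK]
        for off in range(0, len(plaintext), BLOCK)
    )


def split_blocks(data):
    """Cut a byte string into 16-byte blocks."""
    return [data[off:off + BLOCK] for off in range(0, len(data), BLOCK)]


# Blocks 0 and 2 hold the same plaintext; block 1 differs.
plaintext = b"A" * 16 + b"B" * 16 + b"A" * 16
first = split_blocks(toy_ecb_encrypt(plaintext))
second = split_blocks(toy_ecb_encrypt(plaintext))

same_request_equal = first[0] == first[2]     # equal blocks, same key
cross_request_equal = first[0] == second[0]   # equal blocks, different keys
print(same_request_equal, cross_request_equal)
```

Within one call the repeated block produces a repeated ciphertext block, while the second call, made under a different key, shares nothing with the first.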

Why the Unicode/UTF-8 mismatch matters

AAES.pad() uses len(plaintext) on a Python string. That length is measured in Unicode characters, not encoded bytes. Later, the service does plaintext.encode(), and UTF-8 turns some characters into multiple bytes.

Example:

"A" has string length 1 and byte length 1.
"é" has string length 1 but byte length 2 in UTF-8.

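Each é adds one character to len() but two bytes to the encoded plaintext, and the custom padding compensates only for the character. Every é in the message therefore pushes everything after it one byte further than the padding logic expects. This gives a controllable byte shift between: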

where the program thinks block boundaries should land, and
where AES block boundaries actually land after UTF-8 encoding.

That byte shift lets us move each unknown flag byte into a position where it becomes the last byte of a 16-byte block.
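A short sketch makes the mismatch measurable. The function aaes_pad mirrors AAES.pad() from the challenge source; byte_shift is a hypothetical helper that reports how far the end of the padded message drifts from a 16-byte boundary once it is UTF-8 encoded. The escape "\u00e9" is used so the character is unambiguously the precomposed é.

```python
PAD = "CryptoHackTheBox"
E_ACUTE = "\u00e9"  # precomposed é: 1 code point, 2 UTF-8 bytes


def aaes_pad(plaintext):
    """Mirror of AAES.pad(): the length is measured in characters."""
    return plaintext + PAD[:(-len(plaintext) % 16)] + PAD


def byte_shift(message):
    """Encoded byte length of the padded message, modulo the AES block size."""
    return len(aaes_pad(message).encode()) % 16


print(len("A"), len("A".encode()))          # character count vs byte count
print(len(E_ACUTE), len(E_ACUTE.encode()))
print(byte_shift("AAA"), byte_shift(E_ACUTE * 3))

# One misaligned byte per é, for any count from 0 to 15.
shifts = [byte_shift(E_ACUTE * count) for count in range(16)]
print(shifts)
```

Because the flag is appended directly after the padded message, this shift is exactly the offset of the first flag byte inside its AES block.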

Solution

Step 1: Understand the padded plaintext structure

The service encrypts:

AAES.pad(message) || AAES.pad(FLAG)

with

AAES.pad(x) = x || padding_prefix || "CryptoHackTheBox"

where padding_prefix is the first (-len(x) % 16) characters of "CryptoHackTheBox", so that x || padding_prefix has a character count that is a multiple of 16 before the fixed 16-character suffix is appended.

Because the padding string is known, the bytes immediately before the flag are also known: they are the "CryptoHackTheBox" suffix of the padded message. That means after recovering some prefix of the flag, we can build a 15-byte context for the next unknown byte exactly like a normal ECB dictionary attack.
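To see the structure concretely, the following sketch lays out the encoded plaintext in 16-byte rows for an ASCII message. The flag value is a made-up placeholder chosen to be 16 characters long, plaintext_blocks is a hypothetical helper, and the final padding applied inside encrypt() is left out because it only adds bytes after the flag.

```python
PAD = "CryptoHackTheBox"


def aaes_pad(plaintext):
    """Mirror of AAES.pad() from the challenge source."""
    return plaintext + PAD[:(-len(plaintext) % 16)] + PAD


def plaintext_blocks(message, flag):
    """Encode pad(message) || pad(flag) and cut it into 16-byte rows."""
    data = (aaes_pad(message) + aaes_pad(flag)).encode()
    return [data[off:off + 16] for off in range(0, len(data), 16)]


# "HTB{placeholder}" is an invented stand-in, not the real flag.
blocks = plaintext_blocks("hello", "HTB{placeholder}")
for index, block in enumerate(blocks):
    print(index, block)
```

With ASCII-only input the flag starts on a block boundary and the block in front of it is the fixed padding string, which is why the context for the first flag bytes is known in advance.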

Step 2: Use multibyte characters to shift the real byte alignment

For flag position i, the solver computes:

extra_utf8_bytes = (15 - (i % 16)) % 16

and appends "é" * extra_utf8_bytes to the message. Each é contributes one extra UTF-8 byte beyond what the string-length padding logic accounted for. This moves the start of the flag block by a controlled number of bytes.

The goal is to make the next unknown flag byte land at the end of some AES block.
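The alignment formula can be checked without contacting the service. In this sketch, locate_flag_byte is a hypothetical helper that builds a message made of n_dict_blocks full ASCII blocks followed by the computed number of é characters, then reports where flag byte i lands in the encoded plaintext. It assumes the flag is ASCII, so flag character i is also flag byte i.

```python
PAD = "CryptoHackTheBox"
E_ACUTE = "\u00e9"  # 1 code point, 2 UTF-8 bytes


def aaes_pad(plaintext):
    """Mirror of AAES.pad() from the challenge source."""
    return plaintext + PAD[:(-len(plaintext) % 16)] + PAD


def locate_flag_byte(i, n_dict_blocks):
    """Return (block index, offset in block) of flag byte i after UTF-8 encoding."""
    extra_utf8_bytes = (15 - (i % 16)) % 16
    message = "X" * (16 * n_dict_blocks) + E_ACUTE * extra_utf8_bytes
    flag_start = len(aaes_pad(message).encode())  # bytes that precede the flag
    return divmod(flag_start + i, 16)


positions = [locate_flag_byte(i, n_dict_blocks=3) for i in range(48)]
for i in (0, 14, 15, 16, 31, 47):
    print(i, positions[i])
```

Every position reports offset 15, the last byte of a block. The block index is one lower when no é is needed (i % 16 == 15), because the padded message is then 16 bytes shorter.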

Step 3: Build a same-request dictionary

For the next unrecovered byte, we form a 15-byte prefix from known data:

prefix = PAD[i + 1:] + recovered if i < 15 else recovered[-15:]
dictionary = [prefix + ch for ch in charset]

Each prefix + ch is exactly one candidate 16-byte block, provided every charset character is ASCII and therefore encodes to a single byte. We concatenate all candidate blocks into the attacker-controlled message, so they occupy the first ciphertext blocks of the response.

Then the payload is:

payload = "".join(dictionary) + ("é" * extra_utf8_bytes)

Now one later block in the same plaintext contains the real target block from AAES.pad(FLAG). If one candidate plaintext block equals that target plaintext block, ECB makes their ciphertext blocks identical in that same response.
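Payload construction can be sketched as a standalone function. The name build_payload and the CHARSET value are assumptions for illustration; the prefix and é-count expressions are the ones shown above. The ASCII check guards the requirement that each candidate encodes to exactly 16 bytes.

```python
import string

PAD = "CryptoHackTheBox"
E_ACUTE = "\u00e9"  # 1 code point, 2 UTF-8 bytes
CHARSET = string.ascii_letters + string.digits + "_{}"  # hypothetical charset


def build_payload(i, recovered, charset=CHARSET):
    """Return (dictionary, payload) for flag position i given the recovered prefix."""
    if len(recovered) != i:
        raise ValueError("recovered must hold exactly the first i flag characters")
    if not charset.isascii():
        raise ValueError("candidates must be single-byte so each block is 16 bytes")
    extra_utf8_bytes = (15 - (i % 16)) % 16
    prefix = PAD[i + 1:] + recovered if i < 15 else recovered[-15:]
    dictionary = [prefix + ch for ch in charset]
    return dictionary, "".join(dictionary) + E_ACUTE * extra_utf8_bytes


dictionary, payload = build_payload(0, "")
print(dictionary[0], len(payload))
```

For the first flag byte the context is the last 15 characters of the padding string, and the payload ends with fifteen é characters.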

Step 4: Compare blocks from the same ciphertext

The solver splits the ciphertext into 16-byte blocks and locates:

the dictionary blocks at the beginning,
the target flag block later in the ciphertext.

Then it searches for the unique matching block:

matches = [
    charset[j]
    for j, block in enumerate(blocks[: len(dictionary)])
    if block == target
]

That matching candidate reveals the next flag character. Repeating this recovers the whole flag.
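A possible shape for the block-splitting and matching helpers is sketched below. The solver's own split_blocks is not shown in this writeup, so this version is an assumption, and match_candidate is a hypothetical name. It fails loudly when there is no match or more than one, which makes a wrong charset or a wrong target index easy to spot.

```python
BLOCK = 16


def split_blocks(ciphertext):
    """Cut the ciphertext into 16-byte blocks."""
    if len(ciphertext) % BLOCK:
        raise ValueError("ciphertext length is not a multiple of 16")
    return [ciphertext[off:off + BLOCK] for off in range(0, len(ciphertext), BLOCK)]


def match_candidate(blocks, target_idx, charset):
    """Return the charset character whose dictionary block equals the target block."""
    target = blocks[target_idx]
    matches = [
        charset[j]
        for j, block in enumerate(blocks[:len(charset)])
        if block == target
    ]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one match, got {matches!r}")
    return matches[0]


# Synthetic ciphertext: three dictionary blocks, one unrelated block, one target.
fake = b"\x01" * 16 + b"\x02" * 16 + b"\x03" * 16 + b"\x09" * 16 + b"\x02" * 16
found = match_candidate(split_blocks(fake), target_idx=4, charset="abc")
print(found)
```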

Step 5: Remote parsing fix

The live service prints the ciphertext and then immediately prints the next prompt. The final solver therefore reads only the first line before calling bytes.fromhex(), otherwise prompt text contaminates the hex parser.
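A minimal sketch of that parsing step follows. The function name parse_ciphertext and the prompt text in the example are assumptions; only the "take the first line, then hex-decode" behaviour comes from the description above.

```python
def parse_ciphertext(raw):
    """Hex-decode only the first non-empty line of the service's response text."""
    lines = raw.strip().splitlines()
    if not lines:
        raise ValueError("empty response")
    return bytes.fromhex(lines[0].strip())


# The prompt string here is illustrative; the real prompt may differ.
response = "00ff10a5\nMessage for encryption: "
ciphertext = parse_ciphertext(response)
print(ciphertext.hex())
```

Passing the whole response to bytes.fromhex() would raise ValueError on the prompt text, which is the failure this step avoids.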

Solver

Local solver path:

tasks/hackthebox/AliEnS_Challenge_Scenario/solve.py

The working solver already in the workspace implements the full exploit. Its key recovery logic is:

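def recover_flag(oracle, charset=DEFAULT_CHARSET, max_len=80, stop_suffix="}"):
    # PAD, DEFAULT_CHARSET and split_blocks are expected to be defined elsewhere in solve.py.
    recovered = ""

    for i in range(max_len):
        # Number of "é" characters needed to put flag byte i at the end of a block.
        extra_utf8_bytes = (15 - (i % 16)) % 16
        # The 15 known bytes that precede flag byte i in the real plaintext.
        prefix = PAD[i + 1 :] + recovered if i < 15 else recovered[-15:]
        dictionary = [prefix + ch for ch in charset]
        payload = "".join(dictionary) + ("é" * extra_utf8_bytes)

        blocks = split_blocks(oracle(payload))
        # The flag starts one block after the dictionary when no "é" is used,
        # and partway into the second block after it otherwise.
        target_idx = len(dictionary) + (
            1 + i // 16 if extra_utf8_bytes == 0 else 2 + i // 16
        )
        target = blocks[target_idx]

        for j, block in enumerate(blocks[: len(dictionary)]):
            if block == target:
                recovered += charset[j]
                break
        else:
            # No candidate matched: the flag byte is outside the charset.
            raise ValueError(f"no match for flag position {i}")

        if recovered.startswith("HTB{") and recovered.endswith(stop_suffix):
            return recovered

    # max_len reached without seeing stop_suffix; return the partial result.
    return recovered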

This is the important idea in one loop:

build all candidate blocks,
force target-byte alignment with multibyte UTF-8 input,
compare dictionary blocks and target block inside the same ciphertext,
append the matching character.
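The whole loop can be exercised offline against a local stand-in for the service. Everything service-side in this sketch is an assumption: toy_oracle replaces AES with HMAC-SHA256 truncated to 16 bytes (deterministic per key, which is all the attack needs), the flag is an invented placeholder, CHARSET is hypothetical, and the outer padding is assumed to be PKCS#7.

```python
import hashlib
import hmac
import os
import string

BLOCK = 16
PAD = "CryptoHackTheBox"
E_ACUTE = "\u00e9"  # 1 code point, 2 UTF-8 bytes
CHARSET = string.ascii_letters + string.digits + "_{}"  # hypothetical charset
PLACEHOLDER_FLAG = "HTB{n0t_th3_r34l_fl4g_just_4_pl4c3h0ld3r}"  # invented, not the real flag


def aaes_pad(plaintext):
    """Mirror of AAES.pad() from the challenge source."""
    return plaintext + PAD[:(-len(plaintext) % BLOCK)] + PAD


def toy_oracle(message):
    """Local stand-in for the service; truncated HMAC-SHA256 replaces AES-ECB."""
    data = (aaes_pad(message) + aaes_pad(PLACEHOLDER_FLAG)).encode()
    fill = BLOCK - len(data) % BLOCK
    data += bytes([fill]) * fill  # PKCS#7, assumed to match the service's outer pad()
    key = os.urandom(16)  # fresh key for every request
    return b"".join(
        hmac.new(key, data[off:off + BLOCK], hashlib.sha256).digest()[:BLOCK]
        for off in range(0, len(data), BLOCK)
    )


def split_blocks(data):
    return [data[off:off + BLOCK] for off in range(0, len(data), BLOCK)]


def recover_flag(oracle, charset=CHARSET, max_len=80, stop_suffix="}"):
    recovered = ""
    for i in range(max_len):
        extra = (15 - i % BLOCK) % BLOCK
        prefix = PAD[i + 1:] + recovered if i < 15 else recovered[-15:]
        dictionary = [prefix + ch for ch in charset]
        blocks = split_blocks(oracle("".join(dictionary) + E_ACUTE * extra))
        target = blocks[len(dictionary) + (1 if extra == 0 else 2) + i // BLOCK]
        matches = [
            charset[j]
            for j, block in enumerate(blocks[:len(dictionary)])
            if block == target
        ]
        if len(matches) != 1:
            raise ValueError(f"position {i}: expected one match, got {matches!r}")
        recovered += matches[0]
        if recovered.endswith(stop_suffix):
            return recovered
    return recovered


result = recover_flag(toy_oracle)
print(result)
```

The placeholder is longer than two blocks, so the run covers both target-index cases and the switch from padding-based context to recovered-flag context at position 15.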

