[ml][Pro][hard]
leadgate
dicectf2026
Task: a modified GPT-2 model (safetensors), fine-tuned to suppress generation of a specific string (the flag). Solution: treat the fine-tuned weights as W_orig + ΔW, where ΔW is the difference from the original GPT-2 weights, and negate the perturbation (use W_orig - ΔW) so that suppression becomes promotion. Then greedy-decode from the prefix 'dice{' to recover the formerly forbidden flag.
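The full writeup is not shown here, so the following is only a minimal sketch of the idea in the summary, run on a toy stand-in rather than the real model. It assumes the perturbation can be recovered as ΔW = W_tuned - W_orig per tensor; the function names `negate_perturbation` and `greedy_decode`, the 4-token vocabulary, and the single matrix `W` are all hypothetical. In the actual challenge the two state dicts would presumably come from the provided safetensors file and a stock GPT-2 checkpoint.

```python
import numpy as np

def negate_perturbation(orig, tuned):
    """Return W_orig - dW for every tensor, where dW = W_tuned - W_orig."""
    flipped = {}
    for name, w_tuned in tuned.items():
        w_orig = orig[name]
        delta = w_tuned - w_orig          # perturbation added by fine-tuning
        flipped[name] = w_orig - delta    # same as 2 * W_orig - W_tuned
    return flipped

def greedy_decode(next_logits, prefix, max_new_tokens, stop_token):
    """Repeatedly append the argmax token until stop_token or the length cap."""
    tokens = list(prefix)
    for _ in range(max_new_tokens):
        nxt = int(np.argmax(next_logits(tokens)))
        tokens.append(nxt)
        if nxt == stop_token:
            break
    return tokens

# Toy stand-in for the model: logits for the next token are row W[last_token].
# The hypothetical "secret" sequence is 0 -> 2 -> 3, and token 3 ends decoding.
orig = {"W": np.zeros((4, 4))}
delta = np.zeros((4, 4))
delta[0, 2] = -5.0   # fine-tuning pushes the secret transitions down
delta[2, 3] = -5.0
tuned = {"W": orig["W"] + delta}

flipped = negate_perturbation(orig, tuned)

def next_logits(tokens):
    return flipped["W"][tokens[-1]]

decoded = greedy_decode(next_logits, prefix=[0], max_new_tokens=8, stop_token=3)
print(decoded)  # prints [0, 2, 3]
```

With the tuned weights the secret transitions have the lowest logits, so greedy decoding never picks them. After the sign flip they have the highest logits, which is why plain argmax decoding is enough in this toy case.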
$ ls tags/ techniques/
neural_network gpt2 safetensors fine_tuning instruction_tuning weight_perturbation svd transformers lora suppression
weight_perturbation_negation svd_analysis instruction_tuning_suppression_inversion greedy_decoding model_diff_analysis
$ grep --similar
Similar writeups
- [ml][Pro] ReLuess Your Inhibitions (kalmarcf)
- [reverse][Pro] Cursed Steganography (duckerz)
- [ml][free] dualflow (umdctf)
- [crypto][Pro] worrier (hxp_39c3)
- [crypto][Pro] Chill (volgactf)