An Analytical Approach to Modern Binary Deobfuscation

A curated training that teaches you to build, analyze and defeat obfuscated code


Instructor

Arnau Gàmez i Montolio


Availability

Public offerings

Looking for public offerings? Get notified.


Private training

  • Location: In-person / Remote
  • Length: 4 days (flexible)
  • Schedule it

Abstract

Code obfuscation has become one of the most prevalent mechanisms for complicating software reverse engineering. It plays a major role in a wide range of domains, from malware threats to the protection of intellectual property and digital rights management.

An Analytical Approach to Modern Binary Deobfuscation is a curated training that provides an intensive jump-start into the field of code (de)obfuscation. Over the course of this training, students will receive a comprehensive introduction to the most relevant software obfuscation mechanisms as well as existing deobfuscation techniques to analyze, confront and defeat obfuscated code.


Key learning objectives

  • Obtain a high-level overview of the context and scenarios where code obfuscation is used
  • Gain an in-depth understanding of code obfuscation mechanisms
  • Build obfuscated code, both from scratch and through available tooling
  • Develop an understanding of the main code deobfuscation techniques
  • Learn tooling for analyzing obfuscated code and applying deobfuscation techniques
  • Become familiar with state-of-the-art (de)obfuscation research literature

Contents

  • DAY 1
    Introduction, context and motivation

    Data-flow based obfuscation
    • Constant unfolding
    • Dead code insertion
    • Encodings
    • Pattern-based obfuscation

    Control-flow based obfuscation
    • Function inlining/outlining
    • Opaque predicates
    • Control-flow flattening

    Mixing data-flow and control-flow obfuscation
    • VM-based obfuscation
    • Hardening VM-based obfuscation

    Exercises
    Project - Manually craft a custom obfuscation VM

  • DAY 2
    SMT-based analysis
    • A primer on SMT solvers
    • Translate code conditions into SMT solver constraints
    • Program analysis with SMT solvers

    Mixed Boolean-Arithmetic
    • Preliminary concepts
    • MBA rewriting
    • Insertion of identities
    • Opaque constants

    Exercises
    Project - Apply MBA to obfuscate the semantics of VM-handlers

  • DAY 3
    Symbolic execution
    • Reasoning about code in a symbolic way
    • Working with native code
    • Working with intermediate representations
    • Data-flow analysis and compiler optimizations
    • Extract symbolic formulas
    • Extract path constraints
    • Plugging in an SMT solver
    • Attacking obfuscation schemes

    Guided project - Build your own (toy) symbolic execution engine
    Exercises
    Project - Attack obfuscated VM and explore symbolic execution limits

  • DAY 4
    Program synthesis
    • Code syntax vs. code semantics
    • Specifying program behavior
    • Oracle-based program synthesis
    • Describing semantics through I/O behavior
    • Generating I/O pairs
    • Different synthesis flavors
    • Practical considerations
    • Attacking obfuscation schemes

    Conclusions and research directions
    Guided project - Build your own code semantics synthesizer
    Exercises
    Project - Recover the semantics of MBA-obfuscated VM-handlers
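
To give a flavor of the final project, here is a minimal Python sketch of I/O-based semantics recovery. It is not taken from the course materials: the handler, the candidate list and the names `obfuscated_handler`, `CANDIDATES`, `sample_io` and `synthesize` are all assumptions made for illustration. The handler hides a 32-bit addition behind the well-known MBA identity x + y = (x ^ y) + 2 * (x & y), and the toy synthesizer queries it as a black-box oracle and keeps every candidate expression that agrees on all sampled inputs.

```python
import random

MASK = 0xFFFFFFFF  # model 32-bit wrap-around arithmetic


def obfuscated_handler(x, y):
    """Hypothetical VM handler: x + y rewritten as an MBA expression."""
    return ((x ^ y) + 2 * (x & y)) & MASK


# Hypothetical candidate semantics. A real synthesizer would search a
# grammar of expressions instead of a fixed list.
CANDIDATES = {
    "x + y": lambda x, y: (x + y) & MASK,
    "x - y": lambda x, y: (x - y) & MASK,
    "x ^ y": lambda x, y: x ^ y,
    "x & y": lambda x, y: x & y,
    "x | y": lambda x, y: x | y,
}


def sample_io(oracle, n=64, seed=0):
    """Query the oracle on n random 32-bit input pairs and record the outputs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x, y = rng.getrandbits(32), rng.getrandbits(32)
        pairs.append(((x, y), oracle(x, y)))
    return pairs


def synthesize(oracle):
    """Return the names of all candidates consistent with the sampled I/O pairs."""
    io_pairs = sample_io(oracle)
    return [
        name
        for name, candidate in CANDIDATES.items()
        if all(candidate(*inputs) == output for inputs, output in io_pairs)
    ]


if __name__ == "__main__":
    print(synthesize(obfuscated_handler))
```

Agreement on sampled inputs is only evidence of equivalence, not a proof. A surviving candidate would normally be checked against the original expression with an SMT solver such as Z3.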

Tools used

Disassemblers
  • IDA Free/Home/Pro
  • Ghidra
  • Radare2
Obfuscation
  • Manual obfuscation
  • O-LLVM
  • Tigress
Symbolic execution
  • Miasm
  • Triton
Program synthesis
  • Syntia
  • Msynth
  • Custom tooling
Other tools
  • Z3
  • Other custom tooling

Teaching methodology

Live classes are designed to be dynamic and engaging, so that students get the most out of the training materials and the instructor's expertise. Concepts are presented clearly and accompanied by illustrative examples and demos. Each section includes allocated practice time: students are given several exercises and projects to work on, with the continuous support of the instructor.


Who should attend

This training is aimed at reverse engineers, malware analysts and people working in the anti-cheat and software protection industry. It can also benefit bug hunters, vulnerability researchers, exploit developers and security research enthusiasts in general.


Prerequisites

  • Understanding of basic programming concepts
  • Familiarity with x86 assembly, C and Python
  • Knowledge of reverse engineering fundamentals

System requirements

  • A working desktop/laptop capable of running virtual machines
  • 40 GB free hard disk space

Provided to students

  • Access to a VM with all tools, examples and exercises
  • Access to a private chat with the instructor and other students

Testimonials

  • "The lectures by Arnau on Mixed Boolean-Arithmetic obfuscation and deobfuscation techniques went very deep, while staying accessible for people without a formal math background. The exercise materials and projects were engaging and a natural practical extension of the theory discussed during the lectures. Arnau was also very responsive and happy to discuss ideas in the Discord channel. Overall a superb experience and I highly recommend you attend one of his trainings!"
    – Duncan Ogilvie, author of x64dbg

  • "The trainer not only fits his domain but also is a superb teacher with slides, materials and exercises of outstanding quality. Attention: be prepared for lots of homework after the training each day! :)"
    – Anonymous

  • "Arnau is a very friendly and knowledgeable person and does an excellent job at articulating difficult topics in a much simpler way."
    – Anonymous

  • "The instructor is one of the few experts in this area of research. Thus, his insights are invaluable. I highly recommend the training to people who are interested in understanding the finer implementations of how Symbolic Execution or Mixed Boolean-Arithmetic works."
    – Anonymous