Language: English
A binary written in a new language that looks suspiciously like it was designed to foil your static analysis tools:
a new string format that breaks references and readability, virtual dispatch that masks which function is called where,
and a reference-counting garbage collector, so you can't even tell which object ends up where.
Participants will learn how to leverage Ghidra and P-Code to tackle the challenges that pop up when analyzing compiled high-level languages.
The focus will be on the iterative workflow of assisting the decompiler: understanding why it fails,
helping it with a custom analysis script that uses the P-Code emitted by the decompiler,
and feeding the resulting information back to Ghidra and the decompiler via the right APIs,
so that the decompiler can continue doing the heavy lifting and provide better P-Code to tackle the next challenge.
By the end, participants will have a transferable toolbox for adapting Ghidra's decompiler to unfamiliar language runtimes by identifying runtime
patterns, writing P-Code-driven analysis scripts, and feeding recovered types and dispatch targets back to the decompiler.
Ghidra's decompiler is powerful, but it needs help with unfamiliar language runtimes. Common issues are a wall of unnamed functions, new string formats, opaque indirect calls through some kind of dynamic dispatch, or noise from reference counting. And the decompiler doesn't know what any of it means.
This workshop teaches the basic toolbox for fixing that. Participants work through a series of modules, each targeting a specific failure mode of the decompiler and resolving it with a Ghidra script. The modules cover parsing metadata to set up classes and layouts, installing call fixups to eliminate GC noise, identifying sources of type information and feeding them into the dataflow analysis, and analyzing dataflow through P-Code to resolve virtual dispatch targets. Each module visibly improves the decompiler's output before moving to the next, and each builds on the results of the previous one.
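To give a flavor of the last step, resolving a virtual dispatch target by walking P-Code dataflow backwards, here is a minimal sketch in plain Python. It is a toy model, not Ghidra's API and not the workshop's actual script: the `Varnode` and `Op` classes, the `resolve_callind` helper, the vtable-at-offset-0 object layout, and the `Foo` class with its method names are all assumptions made for illustration. In a real Ghidra script the same walk would run over the decompiler's own P-Code objects (for example by following `Varnode.getDef()` from a `CALLIND` op), and `LOAD` would carry an extra address-space input that this sketch omits.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(eq=False)
class Varnode:
    """Toy stand-in for a P-Code varnode in SSA form (hypothetical model)."""
    name: str
    const: Optional[int] = None          # set only for constant varnodes
    definition: Optional["Op"] = None    # the op that writes this varnode


@dataclass(eq=False)
class Op:
    """Toy stand-in for a P-Code op; LOAD takes just the pointer here."""
    opcode: str
    inputs: Tuple[Varnode, ...]
    output: Optional[Varnode] = None

    def __post_init__(self):
        # Link the output back to its defining op, like an SSA def edge.
        if self.output is not None:
            self.output.definition = self


def strip_copies(vn: Varnode) -> Varnode:
    """Follow COPY ops back to the varnode that actually carries the value."""
    while vn.definition is not None and vn.definition.opcode == "COPY":
        vn = vn.definition.inputs[0]
    return vn


def split_base_offset(vn: Varnode) -> Tuple[Varnode, int]:
    """Split `base + constant` into (base, constant); a bare base is offset 0."""
    vn = strip_copies(vn)
    d = vn.definition
    if d is not None and d.opcode == "INT_ADD":
        a, b = d.inputs
        if b.const is not None:
            return strip_copies(a), b.const
        if a.const is not None:
            return strip_copies(b), a.const
    return vn, 0


def resolve_callind(call: Op, class_of: dict, vtables: dict, ptr_size: int = 8):
    """Match CALLIND(LOAD(LOAD(obj) + slot)) and look the slot up in a vtable.

    class_of maps a varnode name to a class recovered earlier (e.g. from
    metadata); vtables maps a class to its ordered list of method names.
    Returns the method name, or None when the pattern does not match.
    """
    if call.opcode != "CALLIND":
        return None
    load_fn = strip_copies(call.inputs[0]).definition
    if load_fn is None or load_fn.opcode != "LOAD":
        return None
    vtable_ptr, slot_offset = split_base_offset(load_fn.inputs[0])
    load_vt = vtable_ptr.definition
    if load_vt is None or load_vt.opcode != "LOAD":
        return None
    obj, vt_offset = split_base_offset(load_vt.inputs[0])
    if vt_offset != 0:               # assumed layout: vtable pointer first
        return None
    table = vtables.get(class_of.get(obj.name), [])
    index, remainder = divmod(slot_offset, ptr_size)
    if remainder or not 0 <= index < len(table):
        return None
    return table[index]


# Toy P-Code for: obj = new Foo; (*(*(obj) + 0x10))(obj)
obj, vt, slot, fn = (Varnode(n) for n in ("obj", "vt", "slot", "fn"))
Op("CALL", (Varnode("new_Foo"),), obj)
Op("LOAD", (obj,), vt)
Op("INT_ADD", (vt, Varnode("c", const=0x10)), slot)
Op("LOAD", (slot,), fn)
call = Op("CALLIND", (fn, obj))

class_of = {"obj": "Foo"}                                   # recovered type info
vtables = {"Foo": ["Foo::destroy", "Foo::to_string", "Foo::run"]}
print(resolve_callind(call, class_of, vtables))             # prints Foo::run
```

The lookup only succeeds because an earlier step supplied `class_of`, which mirrors how each module depends on information recovered by the previous one.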
The emphasis is not on the specific runtime but on the tools and techniques that transfer: understanding why the decompiler produces bad output, identifying what information it's missing, extracting that information from P-Code, and feeding it back through the right Ghidra APIs so the decompiler can do the heavy lifting.
With LLMs it has become easy to generate Ghidra scripts, but you still need to know what to ask for — which API is the right one, what the decompiler actually needs, and where it expects the information to be.
Prerequisites: Familiarity with Ghidra's UI and basic reverse engineering. No prior P-Code experience is needed. Bring a laptop with Ghidra 11+ and Java 21.
