Recon 2026

8 Years of Reverse-Engineering Interpreters: Techniques, Automation, and One Framework
2026-06-19, Grand Salon Opera
Language: English

Over the past eight years we have systematically reverse-engineered nearly ten interpreter and VM binaries, including Lua, Python, Ruby, PHP, VBScript, JScript, PowerShell, and V8, to extract their internal structures and automate that extraction at scale. This talk presents 11 concrete analysis techniques, organized around 6 foundational binary analysis approaches, for recovering interpreter internals from stripped binaries. The techniques include multiple detection logics for VM component recovery that identify the components' exact locations in memory, and a progressive deduction algorithm for ISA recovery that iteratively eliminates opcode ambiguity across hundreds of test traces. Together they power STAGER, our automated dynamic analysis system built on top of Intel Pin. STAGER completes a full analysis of one interpreter in at most a couple of hours, an order-of-magnitude improvement over manual reverse engineering, which typically takes days to weeks. That speed lets it keep pace with the frequent version updates of real-world interpreter binaries. We will release STAGER as open source at the conference.
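The abstract does not spell out how the progressive deduction works, so the following is only a minimal sketch of one plausible reading: candidate elimination by set intersection, followed by propagation of uniquely resolved opcodes. The function name `deduce_isa`, the operation names, and the handler addresses are all hypothetical and are not taken from STAGER.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# One test trace: (operations the test script is known to use,
#                  handler addresses observed while running it).
# Both the operation names and the addresses are illustrative assumptions.
Trace = Tuple[FrozenSet[str], FrozenSet[int]]


def deduce_isa(traces: List[Trace]) -> Tuple[Dict[str, int], Dict[str, Set[int]]]:
    """Return (resolved op -> handler, remaining candidates per op)."""
    candidates: Dict[str, Set[int]] = {}

    # Step 1: a handler can implement an operation only if it ran in
    # every trace whose script used that operation.
    for ops, handlers in traces:
        for op in ops:
            if op in candidates:
                candidates[op] &= handlers
            else:
                candidates[op] = set(handlers)

    # Step 2: once an operation has exactly one candidate, that handler
    # is taken, so remove it from every other operation and repeat
    # until nothing changes.
    resolved: Dict[str, int] = {}
    changed = True
    while changed:
        changed = False
        for op, cands in candidates.items():
            if op not in resolved and len(cands) == 1:
                resolved[op] = next(iter(cands))
                changed = True
                for other, other_cands in candidates.items():
                    if other != op:
                        other_cands.discard(resolved[op])
    return resolved, candidates


if __name__ == "__main__":
    example = [
        (frozenset({"LOAD", "ADD"}), frozenset({0x10, 0x20})),
        (frozenset({"LOAD", "MUL"}), frozenset({0x10, 0x30})),
    ]
    resolved, _ = deduce_isa(example)
    for op, addr in sorted(resolved.items()):
        print(f"{op} -> {addr:#x}")
```

In this toy input, `LOAD` is the only operation shared by both traces, so intersection pins it to `0x10`; removing `0x10` from the other candidate sets then resolves `ADD` and `MUL`. Operations that always co-occur stay ambiguous, which is presumably why many distinct test traces are needed.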

The security payoff is direct. We use STAGER output to build script-level API tracers, which hook the interpreter's own built-in API functions (e.g., eval), enabling behavioral monitoring across diverse interpreter targets. We further leverage branch VM instruction identification and conditional flag detection to build a multi-path explorer, and use recovered ISA mappings to perform dynamic bytecode instrumentation; together these enable fine-grained analysis of evasive script malware that actively resists conventional debugging. We also combine STAGER output with fuzzing harnesses for vulnerability discovery in interpreter runtimes, and demonstrate bytecode-based process injection techniques for red team operations that bypass diverse security mechanisms. These applications are grounded in real targets and will be shown in a live demo.
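To make the multi-path idea concrete, here is a toy sketch, assuming a made-up bytecode with `CHECK`, `JMP_IF`, `CALL`, and `HALT` instructions. It is not STAGER's explorer and does not instrument a real interpreter; it only shows why knowing which VM instruction is a conditional branch is enough to visit both sides of an evasion check and log the API calls on each path.

```python
from typing import List, Optional, Tuple

# Hypothetical toy instruction: (opcode name, optional argument).
Instr = Tuple[str, Optional[object]]


def explore_paths(program: List[Instr], max_branches: int = 8) -> List[List[str]]:
    """Return the API calls logged on every explored path.

    At each JMP_IF the explorer ignores the real condition: it follows
    the fall-through side immediately and queues the taken side.
    """
    paths: List[List[str]] = []
    # Worklist entries: (program counter, calls so far, branches forced so far).
    worklist: List[Tuple[int, List[str], int]] = [(0, [], 0)]
    while worklist:
        pc, calls, forced = worklist.pop()
        while pc < len(program):
            op, arg = program[pc]
            if op == "CALL":
                calls = calls + [str(arg)]
                pc += 1
            elif op == "JMP_IF":
                if forced < max_branches:
                    # Fork: queue the taken side with a copy of the log.
                    worklist.append((int(arg), list(calls), forced + 1))
                    forced += 1
                pc += 1  # continue on the fall-through side
            elif op == "HALT":
                break
            else:
                # CHECK (and anything unknown) only sets the flag, which
                # the explorer overrides anyway.
                pc += 1
        paths.append(calls)
    return paths


# A script that hides its payload when it believes it is being analyzed.
EVASIVE: List[Instr] = [
    ("CHECK", "is_sandbox"),
    ("JMP_IF", 4),
    ("CALL", "download_payload"),
    ("HALT", None),
    ("CALL", "sleep_forever"),
    ("HALT", None),
]

if __name__ == "__main__":
    for path in explore_paths(EVASIVE):
        print(path)
```

The `max_branches` bound is a design choice of this sketch: it caps the number of forced branches per path so that loops built from backward conditional jumps cannot make exploration run forever.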

Beyond the techniques themselves, we share hard-won lessons from nearly ten real-world targets: how compiler register allocation breaks memory-based variable tracking and how to compensate with register-level static analysis; how to handle interpreters layered atop other interpreters (e.g., PowerShell on .NET CLR), where execution traces interleave two VM layers; and how to suppress or work around JIT compilation interference, including the aggressive JIT behavior seen in V8. Accuracy results across all targets, including honest failure cases where our approach hits fundamental limitations, are presented per technique.

Three concrete takeaways for attendees:
1. A working mental model of interpreter internals as attack and analysis surface, grounded in nearly ten real-world targets.
2. The 11-technique framework, including VM component localization logics and a progressive ISA deduction algorithm, directly applicable to diverse interpreter binaries.
3. STAGER (open-source release) and the methods to adapt it to new interpreter targets.

Toshinori Usui is an associate distinguished researcher and security principal at NTT Social Informatics Laboratories, with 10+ years of experience in binary analysis, malware analysis, and offensive security. Toshinori has presented his research at top-tier hacker and academic conferences such as Black Hat USA Briefings, REcon, RAID, and ACSAC. He is also a CTF enthusiast focused on reversing and pwn, formerly a member of Sutegoma2 and binja and currently of Team Enu. Toshinori received his Ph.D. in 2021 and holds security certifications including GREM and GCFE.