Decompilers use the following process to find functions within a Structured Object File Format binary. The File Reader does not find every function, so a dedicated discovery process is needed.

Finding Called Functions by Code Walking

After running the File Reader, we have the code entry point and, hopefully, a list of symbols from the symbol table. We Code Walk from the code entry point and from every symbol entry point, queueing each call destination on a call list and recording every function found in a function list.

callList = entry_address + every symbol entry point
visited = empty set
while callList not empty
    address = pop from callList
    if address already in functionList
        continue
    functionList += new function starting at address
    workList = empty
    while true
        if address in visited
            if workList empty
                break
            address = pop from workList
            continue
        visited += address
        if instr at address is jump
            workList += jump destination
        if instr at address is call
            callList += call destination
        if instr at address is ret or unconditional jump
            if workList empty
                break
            address = pop from workList
            continue
        address += instr length
    end while
end while
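
The walk above can be sketched in Python. This is a minimal illustration, not a real disassembler: the Instr record, the find_functions name and the demo_program table are all hypothetical, and the program is assumed to be already decoded into a map from address to instruction.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Instr:
    """Hypothetical pre-decoded instruction (an assumption of this sketch)."""
    kind: str                      # "jump", "call", "ret" or "other"
    length: int                    # size of the instruction in bytes
    target: Optional[int] = None   # destination of a direct jump or call
    conditional: bool = False      # True for conditional jumps


def find_functions(program: Dict[int, Instr], entry_points: List[int]) -> List[int]:
    call_list = list(entry_points)   # function starts still to be walked
    functions: List[int] = []        # function starts found so far
    visited: Set[int] = set()        # addresses already decoded
    while call_list:
        start = call_list.pop()
        if start in functions:
            continue
        functions.append(start)
        work_list = [start]          # pending branch targets in this function
        while work_list:
            address = work_list.pop()
            while address in program and address not in visited:
                visited.add(address)
                instr = program[address]
                if instr.kind == "jump" and instr.target is not None:
                    work_list.append(instr.target)
                elif instr.kind == "call" and instr.target is not None:
                    call_list.append(instr.target)
                # A ret or an unconditional jump ends this straight-line path.
                if instr.kind == "ret" or (instr.kind == "jump" and not instr.conditional):
                    break
                address += instr.length
    return sorted(functions)


# Toy program: the entry function calls 0x20 and 0x30; 0x30 tail-jumps to 0x20.
demo_program = {
    0x00: Instr("call", 5, target=0x20),
    0x05: Instr("jump", 2, target=0x0C, conditional=True),
    0x07: Instr("call", 5, target=0x30),
    0x0C: Instr("ret", 1),
    0x20: Instr("other", 2),
    0x22: Instr("ret", 1),
    0x30: Instr("jump", 2, target=0x20),
}

found = find_functions(demo_program, [0x00])
print([hex(a) for a in found])
```

The visited set is what keeps the walk from looping forever on backward jumps. Indirect calls and jumps have no target in this model, so they are simply not followed.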

Finding Function Pointers

The initial step can miss functions that are only reached through function pointers, because static analysis does not see the targets of indirect calls. We can find these functions through:

  • Function addresses stored in memory
  • Virtual Functions accessed through vtable
  • Extracting functions called by Trampoline Code
  • Function pointers called by a register with a value we know
  • Using a library signature engine to identify a series of bytes associated with a specific library's function call
  • Checking for sequences of bytes that compilers typically use to set up a stack frame, e.g.:
push ebp
mov ebp, esp
sub esp, #
  • Assuming that the first instruction after a RET is the entry point of a new function (this is risky)
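
The stack-frame check from the list above can be sketched as a raw byte scan. This is a minimal example that assumes 32-bit x86 code, where push ebp encodes as 55 and mov ebp, esp encodes as either 8B EC or 89 E5. The find_prologues name and the sample bytes are hypothetical.

```python
from typing import List

# push ebp; mov ebp, esp in its two common 32-bit x86 encodings.
PROLOGUES = (b"\x55\x8b\xec", b"\x55\x89\xe5")


def find_prologues(code: bytes, base: int = 0) -> List[int]:
    """Return the addresses where a known prologue byte pattern starts."""
    hits = []
    for pattern in PROLOGUES:
        pos = code.find(pattern)
        while pos != -1:
            hits.append(base + pos)
            pos = code.find(pattern, pos + 1)
    return sorted(hits)


# nop; push ebp; mov ebp, esp; sub esp, 0x10; ret; push ebp; mov ebp, esp; ret
sample = b"\x90\x55\x8b\xec\x83\xec\x10\xc3\x55\x89\xe5\xc3"
candidates = find_prologues(sample, base=0x1000)
print([hex(a) for a in candidates])
```

A match is only a candidate: the same bytes can occur inside data or in the middle of a longer instruction, so each hit should be confirmed, for example by code walking from it.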