The process that decompilers use to find functions within a Structured Object File Format binary. Not all functions will be found by the File Reader, so we need a specific process to find all functions.
Finding Called Functions by Code Walking
After running the File Reader, we get access to the code entry point and hopefully a list of symbols via the symbol table. We Code Walk from the code entry point, and every symbol entry point, saving all found functions into a call list.
callList = entry_address
while callList not empty
   address = pop from callList
   functionList += new function starting at address
   while true
       if instr at address is jump
           workList += jump destination
       if instr at address is call
           callList += call destination
       if instr at address is ret
           if workList empty
               break
       if instr is not conditional
           address = pop from workList
           continue
       address += instr length
   end while
end while
Finding Function Pointers
We might miss function pointers in the initial step as we are analyzing statically. We can find function pointers through:
- Function addresses stored in memory
- Virtual Functions accessed through vtable
- Extracting functions called by Trampoline Code
- Function pointers called by a register with a value we know
- Using a library signature engine to identify series of bytes associated with a specific librarys function call
- Checking sequences of bytes typically used by compilers to setup stack frame i.e:
push ebp
move ebp, esp
sub esp, #- Assume the first instruction after a RETis the entry of a new function (this is risky)