I’ve got decision paralysis. BUT, I think I’ve finally found the course.

https://www.youtube.com/watch?v=DNPjBvZxE3E&list=PLg8UXNUgg11OdSQPUfFFtdPOclD1C2wsI

Comments are written using a semicolon (;). Everything after it on the line is ignored.

Sections

Harking back to May 28, we learned there are 3 sections.

  • Data for declaring constants like strings, magic numbers, terminating strings.

  • BSS section to reserve memory for uninitialized data that will arrive in the future

  • Text for writing the actual code.

Exporting

For 64bit:

  1. nasm -f elf64 -g file.asm -o file.o

  2. ld -m elf_x86_64 -o file file.o

    1. There’s also an alternative way by using the gcc compiler

ld also looks for a _start: label as the entry point when it links the object file into an executable file. Make the label visible with global _start.

For 32bit:

  1. nasm -f elf32 -g file.asm -o file.o

  2. ld -m elf_i386 -o file file.o

Differences vary vastly between 32bit and 64bit. One example is that registers are named differently: eax in 32bit is rax in 64bit. Actually you can still use eax in 64bit, since you know, backwards compatibility. There it refers to the lower 32 bits of rax.

Registers(again)

Hardware implemented variables. There are transistors in the cpu that hold data. Either 32bit or 64bit wide but they are hardcoded and always exist. If you want to perform operations on data, then you must load them inside these registers.

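AX is the lower 16 bits of EAX. AH is the upper 8 bits of AX, AL is the lower 8 bits of AX. An important thing to note: AH, AL, BH, BL, CH… are not separate storage. They are windows into their parent EAX, EBX, ECX, EDX. If you are using AH or AL or… for anything specific, then changing EAX, EBX… will alter them, and writing to AH or AL changes that part of the parent. Modifying AH or AL is not bad practice in itself, it is normal when working with single bytes. You just have to remember that the rest of the parent register keeps its old contents.

AX, BX, CX, DX are the lower 16 bits of EAX, EBX, ECX, EDX and they ARE accessible. It is the upper 16 bits that have no name of their own. To get at those you have to shift or mask the full register.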
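To see the parent/child relationship between EAX, AX, AH and AL in action, here is a minimal sketch. It assumes 32bit Linux and the build steps from the Exporting section; the file name regs.asm is just a placeholder. The exit status is used to show the result, check it with echo $? after running.

```nasm
; regs.asm (hypothetical name)
; build: nasm -f elf32 regs.asm -o regs.o && ld -m elf_i386 -o regs regs.o
global _start

section .text
_start:
    mov eax, 0x11223344   ; eax = 0x11223344
                          ; ax  = 0x3344 (lower 16 bits of eax)
                          ; ah  = 0x33   (upper byte of ax)
                          ; al  = 0x44   (lower byte of ax)
    mov al, 0xFF          ; only the lowest byte changes: eax = 0x112233FF
    mov ah, 0x00          ; only bits 8-15 change:        eax = 0x112200FF

    movzx ebx, al         ; ebx = 255, used as the exit status
    mov eax, 1            ; sys_exit
    int 0x80
```

Running it and then echo $? should print 255, the value written into AL.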

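ECX is called the counter register. We tend to use it as the loop iterator variable. Some instructions use it implicitly: loop decrements ECX and jumps while it is not zero, and the rep prefix repeats a string instruction ECX times. Incrementing it is not faster than incrementing any other register, using it for loops is a convention.

EAX is called the accumulator register. It is conventionally used to keep the result of your arithmetic. mul and div use it implicitly, and return values from functions and system calls come back in it.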

ESI(source index) and EDI(destination index) are generally used to copy large datasets.

ESP(stack pointer) will be moved every time you pop or push

EBP(base pointer) conventionally points to the base of the current function's stack frame.

Instructions/Operands

To view all instructions, go here: http://ref.x86asm.net/coder32.html

Each entry below lists the instruction name, its purpose, an example, and an explanation.

mov (move)
Purpose: copies the value of a source to a destination. Form: mov dest, src
Example: mov eax, 3
Explanation: eax is now 3.

movzx (move with zero extend)
Purpose: similar to mov in replacing the destination value with the source value, but the source is smaller than the destination and the unused upper bits of the destination are filled with zeros. Plain mov wants both sides to be the same size, so movzx is used mainly when moving smaller values into larger registers. The source has to be a register or memory, not a constant.
Example: mov bl, 3 followed by movzx eax, bl
Explanation: eax is now 3 and all unused upper bits are 0. If eax was 0xFFFFFFFF before, it is 0x00000003 after.
Example: movzx eax, byte [ebx]
Explanation: if ebx holds the address of an array, this moves the first byte of the array into eax. eax is 4 bytes in size (32bit), so the other 3 bytes are zeroed out. NASM writes byte [ebx]; byte ptr [ebx] is the MASM spelling.

movsx (move with sign extend)
Purpose: same concept as movzx except the upper bits are filled with copies of the sign bit of the source (its leftmost bit), so a negative value stays negative in the larger register.
Example: mov bl, -3 followed by movsx eax, bl
Explanation: bl is 0xFD. movsx copies the sign bit upward, so eax is 0xFFFFFFFD, which is -3. Using movzx here would give 0x000000FD, which is 253.
Example: mov eax, -3
Explanation: to load a negative constant without going through another register, plain mov is enough because the assembler encodes the full-width value. movsx and movzx do not accept a constant as the source.

cmp (compare)
Purpose: subtracts the second argument from the first, throws the result away, and changes the flags accordingly. Usually paired with a jump instruction to build a conditional structure.
Example: cmp eax, ebx
Explanation: computes eax - ebx.
If the values are unsigned, look at the carry flag: it is set when eax is smaller than ebx.
If the values are signed, look at the sign flag and the overflow flag: eax is smaller than ebx when the two differ.
If eax - ebx = 0 then the zero flag is set, if not then the zero flag is cleared.
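A minimal sketch comparing movzx and movsx on the same byte, followed by a cmp. It assumes 32bit Linux with int 0x80; the labels wrong and done are made up for the example.

```nasm
; build: nasm -f elf32 extend.asm -o extend.o && ld -m elf_i386 -o extend extend.o
global _start

section .text
_start:
    mov bl, -3            ; bl = 0xFD (two's complement of 3 in 8 bits)
    movzx eax, bl         ; zero extend: eax = 0x000000FD = 253
    movsx ecx, bl         ; sign extend: ecx = 0xFFFFFFFD = -3

    cmp ecx, -3           ; ecx - (-3) = 0, so the zero flag is set
    jne wrong             ; not taken
    mov ebx, eax          ; exit status 253, the zero-extended value
    jmp done
wrong:
    mov ebx, 1
done:
    mov eax, 1            ; sys_exit
    int 0x80
```

echo $? after running should print 253. The sign-extended copy in ecx compares equal to -3, which is why the jne is skipped.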
Bitwise operations. These work on the individual bits of the values.

and
Purpose: and dest, src. Each bit of dest is combined with the matching bit of src using AND logic. dest holds the new value.
Example: mov eax, 30
mov ebx, 78
and eax, ebx
Explanation: we combine the binary of eax with the binary of ebx (only the low 8 bits are shown, the rest are 0).

eax = 00011110

ebx = 01001110

Let's work through each pair of bits one by one, left to right, 8 times.

0 and 0. This results in 0

0 and 1. This results in 0

0 and 0. This results in 0

1 and 0. This results in 0

1 and 1. This results in 1

1 and 1. This results in 1

1 and 1. This results in 1

0 and 0. This results in 0

So the final answer is 00001110, which is 14.

The AND logic only returns 1 when both bits are 1.
or
Purpose: combines 2 values bit by bit. 1 if either of the compared bits is 1. 0 if neither of them is 1.
Example: or eax, ebx
Explanation: similar to and, just that it uses OR logic instead.

xor
Purpose: combines 2 values bit by bit. 1 if exactly one of the compared bits is 1. 0 if none or both are 1.
Example: xor eax, ebx
Explanation: similar to and, uses XOR logic instead.
Example: xor eax, eax
Explanation: zeroes out eax. x86 has no dedicated clear instruction, and this encodes shorter than mov eax, 0. This is how it goes:

eax = 11010010

If we xor it with itself we combine

11010010

11010010

1 xor 1 = 0

1 xor 1 = 0

0 xor 0 = 0

1 xor 1 = 0

0 xor 0 = 0

0 xor 0 = 0

1 xor 1 = 0

0 xor 0 = 0

So the result is 00000000
test
Purpose: performs an AND between the two arguments and updates the flags from the result. The zero flag is set if the result is zero. THIS DOES NOT CHANGE THE REGISTER GIVEN, ONLY THE FLAGS.
Example: mov eax, 10
mov ebx, 5
test eax, ebx
Explanation: perform the AND check between eax and ebx.

eax = 1010

ebx = 0101

The result is 0000. This sets the zero flag to 1.
Example: mov eax, 10
mov ebx, 6
test eax, ebx
Explanation:

eax = 1010

ebx = 0110

The result is 0010. This is not zero, this is 2, so the zero flag is cleared to 0.
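The and/or/xor/test entries above can be tried in one small program. This is a sketch assuming 32bit Linux; the values 30, 78, 10 and 5 are the ones from the entries, and the labels are invented for the example.

```nasm
; build: nasm -f elf32 bits.asm -o bits.o && ld -m elf_i386 -o bits bits.o
global _start

section .text
_start:
    mov eax, 30           ; 00011110
    mov ebx, 78           ; 01001110

    mov ecx, eax
    and ecx, ebx          ; ecx = 00001110 = 14
    mov edx, eax
    or  edx, ebx          ; edx = 01011110 = 94
    mov esi, eax
    xor esi, ebx          ; esi = 01010000 = 80
    xor eax, eax          ; eax = 0

    mov eax, 10           ; 1010
    test eax, 5           ; 1010 and 0101 = 0000, zero flag set, eax still 10
    jz  bits_clear        ; taken
    mov ebx, 1
    jmp done
bits_clear:
    mov ebx, ecx          ; exit status 14, the result of the and
done:
    mov eax, 1            ; sys_exit
    int 0x80
```

echo $? should print 14. Stepping through it in gdb with info registers shows the or and xor results in edx and esi.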
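Arithmetic operations

add
Purpose: adds the value of the second argument to the first.
Example: add eax, ebx
Explanation: eax = eax + ebx
Example: add eax, 6
Explanation: eax = eax + 6
Example: add edi, '0'
Explanation: turns a single digit (0 to 9) in edi into the ascii version of itself. '0' is ASCII code 48.

sub (subtract)
Purpose: subtracts the value of the second argument from the first.
Example: sub eax, ebx
Explanation: eax = eax - ebx
Example: sub eax, 7
Explanation: eax = eax - 7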
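mul (multiply, unsigned)
Purpose: multiplies the accumulator (al, ax or eax, picked by the size of the operand) by another register. It always writes the upper half of the product too, so the edx register changes as well, on purpose.
Example: mov ax, 15
mul bx
Explanation: dx:ax = ax * bx

The lower 16 bits of the product go into ax and the upper 16 bits go into dx.

If the product fits within ax's 16 bits, then ax holds the whole answer and dx is 0.

If not, dx holds the part of the product that did not fit.
Example: mov eax, 15
mul ebx
Explanation: edx:eax = eax * ebx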
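div (divide, unsigned)
Purpose: divides edx:eax (or dx:ax) by the register given. The quotient goes into eax and the remainder into edx, so edx changes as well. When the number being divided fits in eax alone, clear edx first, otherwise whatever is left in edx becomes part of the dividend.
Example: mov ax, 15
xor dx, dx
div bx
Explanation: ax = dx:ax / bx, dx = remainder

dx:ax is the full 32bit value being divided, with ax as its lower half.

The quotient of dx:ax / bx will be stored in ax.

The remainder of the division will be stored in dx.
Example: mov rax, 15
xor edx, edx
div rbx
Explanation: rax = rdx:rax / rbx, rdx = remainder. This is the 64bit version.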
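idiv (signed division)
Purpose: divides edx:eax by the argument given, treating the values as signed. Stores the quotient (truncated toward zero) in eax and the remainder in edx. Use cdq first to sign extend eax into edx.
Example: mov eax, 14
cdq
mov ebx, 5
idiv ebx
Explanation: eax = 14/5 = 2.8, truncated = 2

edx = 14 mod 5 = 4

inc (increment)
Purpose: adds 1 to a register.
Example: inc eax
Explanation: eax = eax + 1

dec (decrement)
Purpose: subtracts 1 from a register.
Example: dec eax
Explanation: eax = eax - 1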
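Here is a small sketch of mul, div and idiv one after another, mainly to show what ends up in edx each time. It assumes 32bit Linux; the numbers are arbitrary.

```nasm
; build: nasm -f elf32 arith.asm -o arith.o && ld -m elf_i386 -o arith arith.o
global _start

section .text
_start:
    mov eax, 15
    mov ebx, 4
    mul ebx               ; edx:eax = 15 * 4 = 60, so eax = 60 and edx = 0

    xor edx, edx          ; clear the upper half of the dividend
    mov ebx, 7
    div ebx               ; eax = 60 / 7 = 8, edx = 60 mod 7 = 4

    mov eax, -14
    cdq                   ; sign extend eax into edx:eax
    mov ebx, 5
    idiv ebx              ; eax = -2 (truncated toward zero), edx = -4

    neg edx               ; edx = 4
    mov ebx, edx          ; exit status 4
    mov eax, 1            ; sys_exit
    int 0x80
```

echo $? should print 4. Note that the remainder of a signed division takes the sign of the dividend, which is why it is negated before being used as the exit status.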
Jump operations

jmp (jump)
Purpose: jumps to a specific label. Used to skip lines and create non-linear program flow.
Example: jmp label
Explanation: changes EIP to the memory address of wherever the label is. The next instruction executed is the one at the label.

je (jump if equal), jz (jump if zero)
Purpose: jumps to a specific label if the previous instruction left the zero flag at 1. Traditionally used after a compare. jz is exactly the same as je.
Example: cmp eax, 5
je label
Explanation: if the value in eax is equal to 5, then jump to label.

jne (jump if not equal), jnz (jump if not zero)
Purpose: jumps to a specific label if the previous instruction left the zero flag at 0. Traditionally used after a compare. jnz is exactly the same as jne.
Example: cmp eax, 23
jne label
Explanation: if the value in eax is not equal to 23, then jump to label.
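jc (jump if carry), jb (jump if below), jnae (jump if not above or equal)
Purpose: jumps if the carry flag is set by the previous instruction. After a cmp this means the first argument is smaller than the second. For unsigned arithmetic.
Example: add eax, ebx
jc label
Explanation: pretend eax is 8 bits and after adding ebx the result is a 9bit value. The result is too large for eax to hold, the carry flag is set, and we jump to label.
Example: cmp eax, ebx
jc label
Explanation: if eax and ebx are unsigned integers and eax is smaller than ebx, jump to label.

jnc (jump if no carry), jae (jump if above or equal), jnb (jump if not below)
Purpose: jumps if the carry flag is not set by the previous instruction. After a cmp this means the first argument is larger than or equal to the second. For unsigned arithmetic.
Example: add eax, 11
jnc label
Explanation: as long as eax + 11 still fits in eax, there is no carry and we jump to label.
Example: cmp eax, ebx
jnc label
Explanation: say that eax = 10 and ebx = 5. eax is not below ebx, thus jump to label. It would also jump if they were equal.

ja (jump if above), jnbe (jump if not below or equal)
Purpose: jumps only if the carry flag and the zero flag are both 0, meaning the first argument is strictly larger than the second. These are not the same instruction as jnc, because jnc also jumps on equal. For unsigned arithmetic.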
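jo (jump if overflow)
Purpose: jumps if the overflow flag is set. For signed arithmetic. mov never changes the flags, so the flag has to come from an arithmetic instruction.
Example: mov al, 127
add al, 1
jo label
Explanation: al is 8 bits, so the largest signed value it can hold is 127. Adding 1 wraps around to -128, so the overflow flag is set and we jump to label.

jno (jump if not overflow)
Purpose: jumps if the overflow flag is not set.
Example: mov eax, 89239
add eax, 1
jno label
Explanation: the result still fits in eax as a signed value, so there is no overflow and we jump to label.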
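jg (jump if greater), jnle (jump if not less or equal)
Purpose: checks between 2 values if the first is greater than the second, then jumps. Checks the zero flag, the sign flag and the overflow flag. Signed operation.
Example: cmp eax, ebx
jg label
Explanation: since cmp just subtracts ebx from eax, if the result is greater than 0 then eax is greater than ebx and we jump. So first the zero flag must be 0 to ensure they are not the same, then the sign flag must equal the overflow flag, which means the true result is not negative even if the subtraction overflowed.

jl (jump if less), jnge (jump if not greater or equal)
Purpose: checks between 2 values if the first is smaller than the second, then jumps. Checks the sign flag and the overflow flag (jumps when they differ). Signed operation.
Example: cmp eax, ebx
jl label
Explanation: if eax - ebx < 0, then jump to label.

js (jump if sign)
Purpose: jumps only if the sign flag is 1. This means the last flag-setting operation resulted in a negative value. mov does not set flags, so something like test has to come first.
Example: mov eax, -1
test eax, eax
js label
Explanation: eax = -1. test sets the sign flag because eax is negative, so jump to label.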
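The signed and unsigned jumps look at different flags, so the same cmp can give opposite answers. A minimal sketch, assuming 32bit Linux and invented label names, where 0xFFFFFFFF is compared against 1 both ways:

```nasm
; build: nasm -f elf32 jumps.asm -o jumps.o && ld -m elf_i386 -o jumps jumps.o
global _start

section .text
_start:
    mov eax, -1           ; 0xFFFFFFFF: -1 if signed, 4294967295 if unsigned
    mov ebx, 1
    xor edi, edi          ; edi collects which jumps were taken

    cmp eax, ebx
    jl  signed_less       ; signed view: -1 < 1, so this jump is taken
    jmp check_unsigned
signed_less:
    or  edi, 1
check_unsigned:
    cmp eax, ebx
    ja  unsigned_above    ; unsigned view: 4294967295 > 1, also taken
    jmp done
unsigned_above:
    or  edi, 2
done:
    mov ebx, edi          ; exit status 3 means both jumps were taken
    mov eax, 1            ; sys_exit
    int 0x80
```

echo $? should print 3: eax is "less" for jl and "above" for ja at the same time, which is why picking the right family of jumps matters.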
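Call operations. Jump operations that save a return address. They use stack push and pop.

call
Purpose: saves the address of the instruction after the call (the current eip/rip) to the stack, then jumps to a given label.
Example: call label
Explanation: call behaves like the following pair. eip cannot actually be pushed by name, this is just the idea:

push eip

jmp label

ret (return)
Purpose: loads the address saved on the stack back into eip/rip.
Example: ret
Explanation: simply returns to the next line after the call. Conceptually:

pop eip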
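A tiny call/ret round trip as a sketch. add_two is a hypothetical helper made up for this example; the rest assumes 32bit Linux.

```nasm
; build: nasm -f elf32 call.asm -o call.o && ld -m elf_i386 -o call call.o
global _start

section .text
_start:
    mov eax, 20
    call add_two          ; pushes the address of the next instruction, then jumps
    mov ebx, eax          ; execution resumes here after ret, eax = 22
    mov eax, 1            ; sys_exit
    int 0x80

add_two:                  ; hypothetical helper: adds 2 to eax
    add eax, 2
    ret                   ; pops the saved address back into eip
```

echo $? should print 22.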
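Shift operations. Shifting binary left or right to multiply by 2 or divide by 2.

shr (shift right)
Purpose: moves the bits of a register x places to the right and fills the left with zeros. Divides an unsigned value by 2 for every place.
Example: shr eax, 1
Explanation: if eax was 00001111, then eax would now be 00000111. The 1 that fell off the right end goes into the carry flag.

shl (shift left)
Purpose: moves the bits of a register x places to the left and fills the right with zeros. Multiplies by 2 for every place.
Example: shl eax, 2
Explanation: if eax was 00011000, then eax is now 01100000.
Example: shl al, 2
Explanation: al is 01110000.

If we move left 2 times then it would be a 9bit value

111000000

Only 8 bits fit, so al stores 11000000. The last bit shifted out goes into the carry flag and anything shifted out before it is lost. edx is not touched.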
sar (shift arithmetic right)
Purpose: keeps the sign during a shift to the right. The leftmost bit stays the same and is also copied into every position the shift opens up.
Example: mov al, 0b11110000
sar al, 2
Explanation: al = 11110000

Keep note of the leftmost value. It is one.

Shift right 2 times, filling in from the left with that value instead of zero.

al = 11111100

As a signed value this went from -16 to -4, a division by 4.

sal (shift arithmetic left)
Purpose: exactly the same instruction as shl under another name. It does not protect the sign bit.
Example: mov al, 0b10110000
sal al, 3
Explanation: al = 10110000

Shift left as normal 3 times

10110000000

It is now 11 bits, and only the lower 8 fit.

al = 10000000

The last bit shifted out (a 1) goes into the carry flag. edx is not changed.
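ror (rotate right)
Purpose: rotates all bits right. Around the world. The rightmost bit becomes the leftmost.
Example: mov al, 0b11000001
ror al, 1
Explanation: al = 11000001

Rotating right shifts every bit right by one and loops the bit that falls off back around to the left. An 8bit register is used here because the rotation wraps at the width of the register; with eax the bit would loop around to bit 31.

al = 11100000

rol (rotate left)
Purpose: rotates all bits left. The leftmost bit becomes the rightmost.
Example: mov al, 0b11100100
rol al, 1
Explanation: al = 11100100

Rotating left shifts all bits left by one and loops back the bit that falls off.

al = 11001001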
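The shift and rotate examples can be checked with 8bit registers, so the bit patterns match what is written in the entries. This is a sketch assuming 32bit Linux; the starting patterns are the ones used above.

```nasm
; build: nasm -f elf32 shift.asm -o shift.o && ld -m elf_i386 -o shift shift.o
global _start

section .text
_start:
    mov al, 0b00001111
    shr al, 1             ; al = 00000111, carry flag = 1 (the bit shifted out)

    mov bl, 0b01110000
    shl bl, 2             ; bl = 11000000, carry flag = 1 (last bit shifted out)

    mov cl, 0b11110000
    sar cl, 2             ; cl = 11111100, the sign bit is copied into the gap

    mov dl, 0b11000001
    ror dl, 1             ; dl = 11100000

    mov dh, 0b11100100
    rol dh, 1             ; dh = 11001001 = 201

    movzx ebx, dh         ; exit status 201
    mov eax, 1            ; sys_exit
    int 0x80
```

echo $? should print 201, the rol result. The other registers can be inspected in gdb with info registers.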
System call. Give control to the kernel to perform kernel actions.

int (interrupt)
Purpose: invokes the kernel. The kernel reads the registers for what to do.
Example: mov eax, 1
mov ebx, 0
int 0x80
Explanation: int 80h is the system call interrupt on 32bit Linux.

The 1 in eax tells the kernel that this will be a system exit call.

The 0 in ebx is the exit code. We return 0, exit without a hitch.

Dereferencing

[] square brackets are the tool here. A bare label, or a register holding an address, is just the address. [address] is dereferencing and will give you your oh so delectable data stored at that address.
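A minimal sketch of the difference between an address and the data behind it. The variable name value is made up for the example, and it assumes 32bit Linux.

```nasm
; build: nasm -f elf32 deref.asm -o deref.o && ld -m elf_i386 -o deref deref.o
global _start

section .data
value dd 42               ; hypothetical 4-byte variable

section .text
_start:
    mov eax, value        ; no brackets: eax = the address of value
    mov ebx, [eax]        ; brackets: ebx = the 4 bytes stored there = 42
    add dword [value], 1  ; with no register involved, the size must be stated
    mov ebx, [value]      ; ebx = 43
    mov eax, 1            ; sys_exit
    int 0x80
```

echo $? should print 43.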

Hex to Binary

0x declares hex in this case. 0b declares binary in this case

In binary:

0b1 is the same as 0x1 in hex

0b11 is the same as 0x3 in hex

0b111 is the same as 0x7 in hex

0b1111 is the same as 0xF in hex

So every group of 4 binary digits maps to exactly one hex digit, and the largest group, 1111, is 0xF.

All binary can be converted to hexadecimal this way. There’s no special method, just split the bits into groups of 4 starting from the right.

0xFF is 0b11111111, an 8bit value

0xFFF is 0b111111111111, a 12bit value

0xFFFF is 0b1111111111111111, a 16bit value

Bitmasks

This is a method to mask out certain bits. For example, with the binary:

10101010

Say we only want the 3rd and 4th bits like this:

00110000

We use the AND instruction with that mask to combine them bit by bit. The result would look like:

00100000

Say we have a 32bit register as follows:

eax = 01010101011001100001001010000111

Remember, the lower 16 bits have a name (AX) but the upper 16 bits do not, and maybe we want the upper 16 bits. We would do a bitmask with the same bitsize.

01010101011001100001001010000111 masked with

11111111111111110000000000000000 or 0xFFFF0000

Thus leading to the result being

01010101011001100000000000000000

The upper 16 bits are kept in place and the lower 16 are cleared. If we want them as a plain 16bit number instead, there is a simpler method, and that is just to SHR by 16 bits, which moves them down into AX.
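Both bitmask examples as a sketch, assuming 32bit Linux. 0x55661287 is the 32bit register value from the text written in hex.

```nasm
; build: nasm -f elf32 mask.asm -o mask.o && ld -m elf_i386 -o mask mask.o
global _start

section .text
_start:
    mov eax, 0b10101010
    and eax, 0b00110000   ; keep only the 3rd and 4th bits: eax = 00100000 = 32

    mov ecx, 0x55661287
    mov edx, ecx
    and edx, 0xFFFF0000   ; edx = 0x55660000, upper half kept in place
    shr ecx, 16           ; ecx = 0x00005566, upper half moved down into cx

    mov ebx, eax          ; exit status 32
    mov eax, 1            ; sys_exit
    int 0x80
```

echo $? should print 32. The mask keeps the bits where they are, the shift moves them, so which one is "simpler" depends on what you want to do with the upper half afterwards.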

Interrupt

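An interrupt is raising a flag for the kernel to come help perform a task that is only doable in kernel mode. Every mouse click is an interrupt, every keyboard click is an interrupt. Those come from hardware; the one we trigger ourselves with the int instruction is a software interrupt.

System calls on 32bit Linux are made through the operation INT 80h. The kernel looks at the registers eax, ebx, ecx and edx for the arguments on what exactly it is supposed to do.

eax is the system call number, the order you want carried out.

ebx is the first argument: the exit code for sys_exit, or the file descriptor/destination for sys_read and sys_write.

ecx is the second argument: for sys_write, the address of the characters to print.

edx is the third argument: for sys_write, how many bytes to print.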

https://www.tutorialspoint.com/assembly_programming/assembly_system_calls.htm

That page has the list of system call numbers that can go in eax.

Remember, 80h is the interrupt we are allowed to use for system calls. You will rarely see another interrupt vector other than 80h.

A system call also modifies a register when it returns: the return value (or a negative error code) is placed in eax. So look at it when the call is finished.

Data Type declaration

All of the following data types will be declared in the section .data portion. Note that GNU as (the usual assembler for ARM) writes .section, while NASM uses just section.

all datatypes will follow this order: variablename db value

variablename will hold the memory address for the value

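db (define byte) allocates the memory space and fills it in. For a string it emits one byte per character, so the size follows the value you give. For integers bigger than a byte use dw (2 bytes), dd (4 bytes) or dq (8 bytes).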

Value is whatever value you give it.

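Each entry below lists the datatype, value format, example, explanation, and how to push it to the stack.

string
Value format: "value". You must also declare a length constant as well.
Example: mymessage db "Hello World!"
lenmymessage equ $ - mymessage
Explanation: db (declare byte) will declare a byte for every character in the string.

mymessage is the memory address of the 'H' in "Hello World!". To find the length in bytes of the entire string, we do lenmymessage equ $ - mymessage which means:

length = current address - mymessage address

$ is the address right after the last byte of the string. Example: if 'H' is stored at 0xFF00, then the final '!' is stored at 0xFF0B and $ is 0xFF0C. Then subtract it like 0xFF0C - 0xFF00 = 0xC, which is 12 bytes.
How to push to stack: push variablename (this pushes the address of the string)
Example: hiworld db "Hello World!", 0xA
len equ $ - hiworld
Explanation: 0xA is the ASCII code of the new line character \n. NASM does not process \n inside plain quotes, so it is appended as a separate byte.

int
Value format: value
Example: age db 16
Explanation: the age memory space holds the decimal value 16 in one byte.
How to push to stack: push [variablename] needs a size in NASM, for example push dword [variablename]. That reads 4 bytes, so either declare the value with dd, or load the byte first with movzx eax, byte [variablename] and then push eax.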
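A sketch that declares the string and the integer from the entries above and pushes both, assuming 32bit Linux. The exit status adds the integer and the string length together just so both are visible in one number.

```nasm
; build: nasm -f elf32 data.asm -o data.o && ld -m elf_i386 -o data data.o
global _start

section .data
mymessage    db "Hello World!"
lenmymessage equ $ - mymessage   ; 12, computed by the assembler
age          db 16

section .text
_start:
    push mymessage        ; pushes the address of the 'H'
    movzx eax, byte [age] ; load the single byte, zero extended
    push eax              ; pushes the value 16

    pop ebx               ; ebx = 16
    pop ecx               ; ecx = address of mymessage
    add ebx, lenmymessage ; 16 + 12 = 28
    mov eax, 1            ; sys_exit
    int 0x80
```

echo $? should print 28.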

Boilerplate

section .data

section .bss

global _start

section .text

_start:

Hello world

  • We define a string and its length in the section .data segment.

  • System write interrupt using all 4 registers.

  • eax = 4 is the sys_write code

  • ebx = 1 is standard out.

  • Move address of string into ecx

  • length of string into edx

  • Interrupt 0x80 to system write

  • System exit interrupt aswell
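The steps above can be sketched as the following program. It assumes 32bit Linux and the elf32 build steps from the Exporting section; the names msg and len are placeholders, not necessarily what the course uses.

```nasm
; build: nasm -f elf32 -g helloworld.asm -o helloworld.o
;        ld -m elf_i386 -o helloworld helloworld.o
global _start

section .data
msg db "Hello World!", 0xA    ; the string plus a new line byte
len equ $ - msg               ; length in bytes, computed by the assembler

section .text
_start:
    mov eax, 4            ; sys_write
    mov ebx, 1            ; file descriptor 1 = standard out
    mov ecx, msg          ; address of the string
    mov edx, len          ; number of bytes to write
    int 0x80

    mov eax, 1            ; sys_exit
    mov ebx, 0            ; exit code 0
    int 0x80
```

Running ./helloworld should print Hello World! followed by a new line.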

Let's compile by viewing here: learning bog

Then run it under the debugger with gdb -q helloworld. Then type r to run.

Num to ascii string

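If we want to be able to print out a number, we must turn it into the ascii version of itself first. For a single digit (0 to 9), you add '0' (ASCII code 48) to the register. Example is:

add edi, '0'

To return the character back to a number, just subtract '0'.

sub edi, '0'

Numbers with more than one digit have to be split into single digits first, for example by dividing by 10 repeatedly.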

Loop logic

The comments kind of explain it, but I will a second time.

  • Goal of the program is to do a for loop between 0-5. Print each index each iteration

  • ln 9. Clear out the edi register first. This will be our iterator

  • ln 13. Turn edi register into ascii so we can print it

  • ln 15-19. Print the edi register

  • ln 22. Return edi register to integer so we can increment it

  • ln 24. Increment the edi register

  • ln 27-28. If edi is less than 5, then keep continuing the loop
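Here is a sketch of a loop along those lines, assuming 32bit Linux. It is not the original listing, so the line numbers in the bullets above will not match it. One thing added here as an assumption: sys_write takes an address in ecx, so the character is stored into a small buffer (called digit, a made-up name) before printing.

```nasm
; build: nasm -f elf32 loop.asm -o loop.o && ld -m elf_i386 -o loop loop.o
global _start

section .bss
digit resb 2              ; one character plus a new line

section .text
_start:
    xor edi, edi              ; clear edi, this is our iterator
    mov byte [digit + 1], 0xA ; second byte is always the new line
print_loop:
    add edi, '0'          ; turn edi into ascii so we can print it
    mov eax, edi
    mov [digit], al       ; store the character where sys_write can read it

    mov eax, 4            ; sys_write
    mov ebx, 1            ; standard out
    mov ecx, digit        ; address of the buffer
    mov edx, 2            ; character + new line
    int 0x80

    sub edi, '0'          ; return edi to an integer
    inc edi               ; increment the iterator
    cmp edi, 5
    jl  print_loop        ; if edi is less than 5, keep looping

    mov eax, 1            ; sys_exit
    xor ebx, ebx          ; exit code 0
    int 0x80
```

This prints the digits 0 through 4, one per line. edi survives the int 0x80 because the kernel only overwrites eax with the return value.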

C programming practices

For pushing and popping to stack:

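  • They like to push arguments in reverse order (this is the cdecl convention on 32bit). This means that in a call like printf("%d %d %d", a, b, c) the value for the last %d will be pushed first, and the format string will be pushed last.

  • For branching off into functions, the registers that must survive the call are pushed first. Then when the function is over, they are popped back in. Which ones the caller saves and which ones the function saves depends on the calling convention; pushing all of eax, ebx, ecx, edx is the simple catch-all.

  • This is done so that esp ends up exactly where it was before the call. We don't want to worry about the memory addresses that we don't use. It looks like the stack is empty.

  • Oftentimes when asking for user input, compiled code places the buffer at an offset inside the stack frame, like esp+28.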
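The reverse-order pushing can be sketched in pure assembly without linking against C. sum3 is a hypothetical function written for this example in the cdecl style, where the caller cleans up the stack; it assumes 32bit Linux.

```nasm
; build: nasm -f elf32 cdecl.asm -o cdecl.o && ld -m elf_i386 -o cdecl cdecl.o
global _start

section .text
_start:
    push 3                ; last argument is pushed first
    push 2
    push 10               ; first argument is pushed last
    call sum3
    add esp, 12           ; caller removes the three 4-byte arguments
    mov ebx, eax          ; exit status 15
    mov eax, 1            ; sys_exit
    int 0x80

sum3:                     ; hypothetical function: sum3(a, b, c)
    push ebp              ; save the caller's base pointer
    mov ebp, esp          ; ebp now marks the base of this stack frame
    mov eax, [ebp + 8]    ; a (first argument, just above the return address)
    add eax, [ebp + 12]   ; b
    add eax, [ebp + 16]   ; c
    pop ebp               ; restore the caller's base pointer
    ret
```

echo $? should print 15. Because the first argument was pushed last, it sits closest to the return address, which is why it is found at ebp + 8.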