
Intro To Win32 Assembly, Using NASM 2/3

유니시티황 2014. 2. 2. 02:22

Common Instructions

[Image: common_instructions_cc.jpg]

The Registers

General-Purpose Registers
- EAX - Accumulator Register
- EBX - Base/Address Register
- ECX - Count Register
- EDX - Data Register
- ESI - Source Index
- EDI - Destination Index
- ESP - Stack Pointer
- EBP - Base Pointer

Status/Control Registers

Segment Registers (limited access):
- CS - Code Segment
- DS - Data Segment
- SS - Stack Segment
- ES - Extra Segment
- FS - (Doesn't Really Stand For Anything) Segment
- GS - (Doesn't Really Stand For Anything) Segment

Control Registers (limited access):
- CR0
- CR2
- CR3
- CR4

Debug Registers (limited access):
- DR0
- DR1
- DR2
- DR3
- DR6
- DR7

Other Registers (no direct access):
- EIP - Instruction Pointer
- EFLAGS - Flags Register (more on this later)


Instruction Pointer Register - What Is This EIP?
The Intel processor has several registers; some of them are general-purpose registers, and some are status and/or control registers. EIP is one of the latter. On the Intel 8086 (16-bit) it was called IP; starting with the Intel 386 (32-bit), it is called EIP. EIP is the instruction pointer register; it points to the next instruction to execute.
One thing to know is that you can't read or write this register directly (for example, with MOV). Besides advancing past each executed instruction, it changes only through control-transfer instructions such as JMP, CALL, and RET.
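
To make "not directly" concrete, here is a minimal sketch, assuming 32-bit NASM. The label names (get_eip, .next) are made up for illustration. It relies on the fact that CALL pushes the address of the instruction that follows it, which is the value EIP holds at that point, so popping that address gives an indirect read of EIP.

```nasm
bits 32
section .text

; mov eax, eip          ; does not assemble: EIP is not a valid operand

get_eip:                ; hypothetical helper: returns the address of .next in EAX
    call .next          ; pushes the address of .next, then jumps to .next
.next:
    pop eax             ; EAX = address of .next = the value EIP had after the CALL
    ret                 ; pops the caller's return address into EIP
```

Writing to EIP works the same way: a JMP, CALL, or RET is the write.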

Flags Register - What Are Flags?
Flags are single bits that indicate true (if set) or false (if clear). The EFLAGS register (FLAGS on the 8086) contains flags for different purposes. Many instructions modify flags as a side effect, and conditional instructions test them; this is how tests and comparisons (i.e. "if" statements) are done.
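
As a rough sketch of how flags stand in for an "if" statement, assuming 32-bit NASM: the routine name max_u32 and its register convention are invented for this example. CMP subtracts its operands only to set the flags, and the conditional jump that follows tests them.

```nasm
bits 32
section .text

; hypothetical routine: returns the larger of EAX and EBX (unsigned) in EAX
max_u32:
    cmp eax, ebx        ; computes EAX - EBX, sets the flags, discards the result
    jae .done           ; jump if the carry flag is clear (EAX >= EBX, unsigned)
    mov eax, ebx        ; otherwise EBX was larger
.done:
    ret
```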

Instruction Set Reference
If you need a reference for any instruction, you can search Google for "<instruction_name> intel instruction" (without the quotes), and go to the page that looks most relevant. 

You can also look at this page, for reference to some common instructions. 

For a complete reference of the Intel instruction set, refer to the Intel Architecture Software Developer's Manual volume 2, Instruction Set Reference (Document Download Page). 

Assembly Language - Instruction Usage Format
The format of instruction usage for assembly language is as follows: 
<label>: <mnemonic> <operand1>, <operand2>, <operand3> ; <comment> 

An Intel instruction can have 0 to 3 operands. As you can see, a line has 3 parts: label, instruction (the mnemonic and its operands), and comment. You can have only the label, or only the comment, or only the instruction, or a combination of the three, so long as they are in order (i.e. the label comes before the instruction) and there's only one of each (no more than one label, no more than one instruction, etc.). 
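
The combinations described above might look like this in NASM. This is only an illustrative fragment; the labels and the instructions chosen are arbitrary.

```nasm
bits 32
section .text

start:                      ; label and comment only
    nop                     ; instruction with zero operands
    inc eax                 ; one operand
    mov eax, ebx            ; two operands
    imul eax, ebx, 10       ; three operands: EAX = EBX * 10
done: ret                   ; label, instruction, and comment on one line
; a comment on a line of its own
```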

EAX and AX, EBX and BX, Etc. - Register Parts
EAX is a double-word sized (32-bit) register. AX is the low-order word (16 bits) of EAX. When we look at a register, the low-order part of it is on the right, while the high-order part of it is on the left (this information helps with using the SHR and SHL instructions). 
AL is the low-order byte of AX, and AH is the high-order byte of AX. 
The high-order word of EAX is not that easy to access, though: it has no name of its own, so you have to shift or rotate it into the low-order word first. 
Same goes for EBX, ECX, and EDX. The low-order word of EBX is BX, and so on. 

The above applies only to the four registers EAX, EBX, ECX, and EDX. 
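
A short sketch of the register parts in action, assuming 32-bit NASM. The constants are arbitrary and the comments track the value each instruction leaves behind.

```nasm
bits 32
section .text

    mov eax, 0x12345678 ; EAX = 0x12345678
                        ; AX  = 0x5678, AH = 0x56, AL = 0x78
    mov al, 0xFF        ; only the low byte changes: EAX = 0x123456FF
    mov ebx, eax        ; work on a copy so EAX is left alone
    shr ebx, 16         ; BX = 0x1234, the high-order word of EAX
```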

What about the other four general-purpose registers? 
- SI is the low-order word of ESI. 
- DI is the low-order word of EDI. 
- SP is the low-order word of ESP. 
- BP is the low-order word of EBP. 

Under the Intel 8086 (16-bit), you only have the word-sized registers and their byte parts (i.e. AX, AL, AH); you don't have the double-word registers (i.e. no EAX, no ESP, etc.).

Addressing Under 8086 - Effective Addresses
Under the Intel 8086, only the BX, BP, SI, and DI registers can be used in an effective address. 

The parts of an effective address (for 8086) are: 
base + index + offset 

Where base can be either BX or BP, index can be either SI or DI, and offset is an immediate value. Any of the three parts may be left out, so [si] and [di+2] are valid as well. 

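The following is not allowed: 
mov ax, [cx]        ; CX is neither a base nor an index register
mov ax, [bx+cx]     ; CX cannot serve as the index

The following is allowed: 
mov ax, [bx]        ; base only
mov ax, [bx+si]     ; base + index
mov ax, [bx+di+8]   ; base + index + offset
mov ax, [bp+si-4]   ; base + index + negative offset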

Addressing Under 386 - Effective Addresses
Under Intel 386, you can use any general-purpose register for memory references. 

The format for an effective address is as follows: 
base + (index * scale) + displacement 

Where: 
- base is any of the 8 general-purpose registers. 
- index can be any of the 8 general-purpose registers except ESP. 
- scale can be 1, 2, 4, or 8. 
- displacement is an immediate value. 
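
Here is a sketch of a few 386-style effective addresses, assuming 32-bit NASM. The array name table and its contents are made up for the example.

```nasm
bits 32
section .data
table:  dd 10, 20, 30, 40           ; hypothetical array of four double-words

section .text
    mov ebx, table                  ; EBX = address of the array
    mov esi, 2                      ; ESI = element index
    mov eax, [ebx + esi*4]          ; base + index*scale: loads table[2] (30)
    mov eax, [table + esi*4 + 4]    ; index*scale + displacement: loads table[3] (40)
    mov eax, [esp + 4]              ; ESP is fine as a base
    lea eax, [ecx + ecx*4]          ; LEA only computes the address: EAX = ECX * 5
    ; mov eax, [eax + esp*2]        ; not encodable: ESP cannot be the index
```

The scale factor is what makes array indexing convenient: with 4-byte elements, the element index goes straight into the index register.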

For more information about effective addressing, refer to the Intel Architecture Software Developer's Manual volume 1, Basic Architecture (Document Download Page). 

Register Structure - Where Goes What?
The following is the structure of the EAX register, but same applies for EBX, ECX, and EDX: 
[Image: eax_structure_cc.jpg. EAX spans bits 31-0; AX is bits 15-0; AH is bits 15-8; AL is bits 7-0.]

ESI, EDI, ESP, and EBP are similar, but they don't have named byte parts as the first four have (i.e. nothing like AL or AH). 

Memory Storage Structure - Little-Endian Byte Order
Inside a register (such as EAX), we picture the bytes in their natural order, high-order on the left and low-order on the right. But what about when they're stored in memory? 

Intel uses little-endian byte ordering, which means that the least-significant byte is stored first, at the lowest address (as opposed to big-endian byte ordering, where the most-significant byte is stored first). 

When you save EAX to a memory location, let's say address 32, AL is saved to 32, AH to 33, and the high-order word of EAX to 34 and 35. When you save AX to 32, AL is still saved to 32, and AH is still saved to 33. In a way that is a nice thing: if you want just the low-order word of the integer, you use AX instead of EAX, and the effective address stays the same. 
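
The byte layout can be sketched like this, assuming 32-bit NASM. The variable name value is invented for the example.

```nasm
bits 32
section .bss
value:  resd 1                      ; hypothetical 4-byte variable

section .text
    mov eax, 0x12345678
    mov [value], eax                ; memory at value: 78 56 34 12
    movzx ebx, byte [value]         ; EBX = 0x78   (what AL held)
    movzx ecx, word [value]         ; ECX = 0x5678 (what AX held), same address
    movzx edx, word [value + 2]     ; EDX = 0x1234 (high-order word of EAX)
```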

Intel Architecture - The Stack
The ESP register contains the memory address of the top of the stack, that is, of the item most recently pushed. 

The last thing pushed to the stack is the first thing to be popped off the stack. 

One thing to note, though, is that the stack grows down (toward lower addresses) instead of growing up. So if you push two bytes onto the stack, the stack pointer decreases by two, and if you then pop four bytes off the stack, the stack pointer increases by four. 
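
A small sketch of the last-in, first-out behavior, assuming 32-bit NASM, where a double-word push moves ESP by four. The values pushed are arbitrary.

```nasm
bits 32
section .text

    push dword 1        ; ESP decreases by 4, then [ESP] = 1
    push dword 2        ; ESP decreases by 4, then [ESP] = 2
    pop eax             ; EAX = 2 (last pushed, first popped), ESP increases by 4
    pop ebx             ; EBX = 1, ESP is back where it started
```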

Programming Under Windows - Subsystems
There are two major subsystems for Windows programs. 
If the program's subsystem is "Console", the program gets a console window when it starts: either a new one, or the current command prompt console window if it was started from the command prompt. 
Otherwise, if the program's subsystem is "Windows", no console window will appear. The programs we'll make use the Windows subsystem, so we won't start out with a console window. 
But we can still ask Windows for a console, if we want one, by using the Win32 API AllocConsole() function; we will, however, have to tell Windows when we're done using the console, with the FreeConsole() function. 
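
A minimal sketch of asking for a console and giving it back, assuming 32-bit NASM. The decorated import names (_AllocConsole@0 and so on) and the _start entry label are assumptions that depend on your linker and import library; some linkers expect the plain names instead. All three functions use the stdcall convention, so they clean up their own arguments.

```nasm
bits 32
extern _AllocConsole@0      ; assumed symbol names, adjust for your linker
extern _FreeConsole@0
extern _ExitProcess@4
global _start               ; assumed entry point label

section .text
_start:
    call _AllocConsole@0    ; takes no arguments; EAX is nonzero on success
    ; ... use the console here ...
    call _FreeConsole@0     ; tell Windows we're done with the console
    push dword 0            ; exit code for ExitProcess
    call _ExitProcess@4     ; does not return
```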

An example of a console subsystem program: 
[Image: console_program_cc.PNG]