Assembly Language Fundamentals
Starting with an Example
    TITLE Add and Subtract (AddSub.asm)
    ; Adds and subtracts three 32-bit integers
    ; (10000h + 40000h - 20000h)
    INCLUDE Irvine32.inc
    .code
    main PROC
        mov eax,10000h    ; EAX = 10000h
        add eax,40000h    ; EAX = 50000h
        sub eax,20000h    ; EAX = 30000h
        call DumpRegs     ; display registers
        exit
    main ENDP
    END main

The listing has three parts: the title/header (TITLE and the comment lines), the include file (INCLUDE Irvine32.inc), and the code section (.code through END main).
Meanings of the Code
Assembly code       Machine code    Meaning
MOV EAX, 10000h     B8 00010000     Move 10000h into EAX
ADD EAX, 40000h     05 00040000     Add 40000h to EAX
SUB EAX, 20000h     2D 00020000     Subtract 20000h from EAX
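The machine code column can be reproduced with a short sketch. The Python below is an illustration only and is not part of the original slides. It assumes the one-byte opcodes shown above (B8, 05, 2D) for the forms that combine EAX with a 32-bit immediate, and the helper name `encode_eax_imm32` is hypothetical. The table prints each immediate as one 32-bit value (for example 00010000), whereas the bytes in memory are stored in little-endian order.

```python
import struct

# Opcodes for the "EAX, imm32" forms shown in the table above.
OPCODES = {"mov": 0xB8, "add": 0x05, "sub": 0x2D}

def encode_eax_imm32(mnemonic: str, imm: int) -> bytes:
    """Return the 5 machine-code bytes for `<mnemonic> eax, imm` (hypothetical helper)."""
    opcode = OPCODES[mnemonic.lower()]
    # "<I" packs the immediate as a little-endian unsigned 32-bit integer.
    return bytes([opcode]) + struct.pack("<I", imm)

for mnemonic, imm in [("mov", 0x10000), ("add", 0x40000), ("sub", 0x20000)]:
    code = encode_eax_imm32(mnemonic, imm)
    print(f"{mnemonic} eax,{imm:X}h -> {code.hex(' ').upper()}")
```

Each instruction occupies five bytes: one opcode byte followed by the four bytes of the operand.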
Operand in instruction

[Figure sequence: four diagrams of the CPU (PC, IR, ALU, registers EAX and EBX) and memory (address and data columns). Memory holds MOV EAX, 10000h; ADD EAX, 40000h; and SUB EAX, 20000h.]
• Fetched MOV EAX, 10000h: IR holds B8 00010000
• Execute MOV EAX, 10000h: EAX holds 00010000
• Fetched ADD EAX, 40000h: IR holds 05 00040000; EAX still holds 00010000
• Execute ADD EAX, 40000h: the ALU adds 00040000 to 00010000, giving 00050000
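The fetch and execute steps pictured above can be sketched as a small loop. This Python model is an illustration under stated assumptions, not the slides' own material: it assumes the program starts at address 0, it handles only the three opcodes used in the example, and the function name `run` is hypothetical.

```python
import struct

# Program bytes as they sit in memory (immediates are little-endian).
MEMORY = bytes.fromhex("B800000100" "0500000400" "2D00000200")

def run(memory):
    """Fetch and execute each 5-byte instruction; return (EAX, PC)."""
    eax, pc = 0, 0  # assumption: execution starts at address 0
    while pc < len(memory):
        # Fetch: copy the instruction at PC into the instruction register.
        ir = memory[pc:pc + 5]
        pc += 5  # PC now points at the next instruction
        opcode = ir[0]
        (imm,) = struct.unpack("<I", ir[1:])
        # Execute: update EAX, keeping the result within 32 bits.
        if opcode == 0xB8:      # MOV EAX, imm32
            eax = imm
        elif opcode == 0x05:    # ADD EAX, imm32
            eax = (eax + imm) & 0xFFFFFFFF
        elif opcode == 0x2D:    # SUB EAX, imm32
            eax = (eax - imm) & 0xFFFFFFFF
        else:
            raise ValueError(f"unsupported opcode {opcode:02X}")
        print(f"IR={ir.hex().upper()}  EAX={eax:08X}  PC={pc:08X}")
    return eax, pc

final_eax, final_pc = run(MEMORY)
```

After the third instruction EAX holds 00030000, which matches the comments in the AddSub listing.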
Chapter Overview
• Basic Elements of Assembly Language
  • Integer constants and expressions
  • Character and string constants
  • Reserved words and identifiers
  • Directives and instructions
  • Labels
  • Mnemonics and Operands
  • Comments
• Example: Adding and Subtracting Integers
• Assembling, Linking, and Running Programs
• Defining Data
• Symbolic Constants
• Real-Address Mode Programming
Reserved Words, Directives
• TITLE:
  • Defines the program listing title
  • A reserved word (a directive)
• Reserved words
  • Instruction mnemonics, directives, type attributes, operators, predefined symbols
  • See MASM reference in Appendix A
• Directives:
  • Commands for the assembler
Directive vs Instruction
• Directives: tell the assembler what to do
  • Commands that are recognized and acted upon by the assembler, e.g. declare code and data areas, select the memory model, declare procedures
  • Not part of the Intel instruction set
  • Different assemblers have different directives
• Instructions: tell the CPU what to do
  • Assembled into machine code by the assembler
  • Executed at runtime by the CPU
  • Members of the Intel IA-32 instruction set
  • Format: label (optional), mnemonic, operands, comment
Comments
• Single-line comments
  • begin with a semicolon (;)
• Multi-line comments
  • begin with the COMMENT directive and a programmer-chosen character, and end with the same character, e.g.

    COMMENT !
        Comment line 1
        Comment line 2
    !
Include Files
• INCLUDE directive:
  • Copies necessary definitions and setup information from a text file named Irvine32.inc, located in the assembler's INCLUDE directory (see Chapter 5)
Code Segment
• .code directive:
  • Marks the beginning of the code segment, where all executable statements in a program are located
Procedure Definition
• Procedure defined by:
  • [label] PROC
  • [label] ENDP
• Label:
  • Place marker: marks the address (offset) of code and data
  • Assigned a numeric address by the assembler
  • Follows identifier rules
  • Data label: must be unique, e.g. myArray
  • Code label: target of jump and loop instructions, e.g. L1:
Identifiers
• Identifiers:
  • A programmer-chosen name to identify a variable, a constant, a procedure, or a code label
  • 1-247 characters, including digits
  • not case sensitive
  • first character must be a letter, _, @, ?, or $
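The identifier rules above can be expressed as a pattern check. The following Python sketch is an illustration, not assembler behavior taken from the slides. It assumes that characters after the first may be letters, digits, _, @, ?, or $, and it does not reject reserved words. The function names are hypothetical.

```python
import re

# First character: letter, _, @, ?, or $; later characters may also be digits.
# Total length is 1 to 247 characters.
IDENTIFIER = re.compile(r"[A-Za-z_@?$][A-Za-z0-9_@?$]{0,246}")

def is_valid_identifier(name: str) -> bool:
    """Check a name against the identifier rules listed above."""
    return IDENTIFIER.fullmatch(name) is not None

def same_identifier(a: str, b: str) -> bool:
    """Identifiers are not case sensitive, so compare them case-insensitively."""
    return a.lower() == b.lower()

for candidate in ["myArray", "_main", "$first", "1stValue"]:
    print(candidate, is_valid_identifier(candidate))
```

A name such as 1stValue fails because its first character is a digit.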
Instructions
[label:] mnemonic operand(s) [;comment]
• Instruction mnemonics:
  • help to memorize
  • examples: MOV, ADD, SUB, MUL, INC, DEC
• Operands:
  • constant
  • constant expression
  • register
  • memory (data label, register)
• In the AddSub example (mov eax,10000h; add eax,40000h; sub eax,20000h), eax is the destination operand and the immediate values 10000h, 40000h, and 20000h are the source operands
I/O
• Not easy to program by ourselves
• Will use the library provided by the author
• Two steps:
  • Include the library (Irvine32.inc) in your code
  • Call the subroutines
• call DumpRegs:
  • Calls the procedure that displays the current values of the processor registers
Remaining
• exit:
  • Halts the program
  • Not a MASM keyword, but a command defined in Irvine32.inc
• END main:
  • Marks the last line of the program to be assembled
  • Identifies the name of the program's startup procedure
Example Program Output
• Program output, showing registers and flags

    EAX=00030000  EBX=7FFDF000  ECX=00000101  EDX=FFFFFFFF
    ESI=00000000  EDI=00000000  EBP=0012FFF0  ESP=0012FFC4
    EIP=00401024  EFL=00000206  CF=0  SF=0  ZF=0  OF=0
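The individual flag values in the output are bits of the EFL value. As a hedged illustration (not part of the slides), the Python sketch below extracts them using the standard IA-32 EFLAGS bit positions: CF is bit 0, PF bit 2, ZF bit 6, SF bit 7, OF bit 11. The function name `decode_flags` is hypothetical.

```python
# Bit positions of the status flags within the 32-bit EFLAGS register.
FLAG_BITS = {"CF": 0, "PF": 2, "ZF": 6, "SF": 7, "OF": 11}

def decode_flags(efl: int) -> dict:
    """Return the value (0 or 1) of each status flag in an EFLAGS value."""
    return {name: (efl >> bit) & 1 for name, bit in FLAG_BITS.items()}

flags = decode_flags(0x00000206)  # EFL value from the output above
print(flags)
```

For 00000206h the carry, zero, sign, and overflow flags all come out as 0, which agrees with the CF, SF, ZF, and OF values printed by DumpRegs.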
Alternative Version of AddSub
    TITLE Add and Subtract (AddSubAlt.asm)
    ; adds and subtracts 32-bit integers
    .386
    .MODEL flat,stdcall
    .STACK 4096
    ExitProcess PROTO, dwExitCode:DWORD
    DumpRegs PROTO
    .code
    main PROC
        mov eax,10000h    ; EAX = 10000h
        add eax,40000h    ; EAX = 50000h
        sub eax,20000h    ; EAX = 30000h
        call DumpRegs
        INVOKE ExitProcess,0
    main ENDP
    END main
Explanations
• .386 directive:
  • Minimum processor required for this code
• .MODEL directive:
  • Generates code for a protected mode program
  • stdcall: enables calling of Windows functions
• PROTO directives:
  • Prototypes for procedures
  • ExitProcess: Windows function to halt the process
• INVOKE directive:
  • Calls a procedure or function
  • Here it calls ExitProcess and passes it a return code of zero
Suggested Program Template
    TITLE Program Template (Template.asm)
    ; Program Description:
    ; Author:
    ; Creation Date:
    ; Revisions:
    ; Date:              Modified by:
    INCLUDE Irvine32.inc
    .data
        ; (insert variables here)
    .code
    main PROC
        ; (insert executable instructions here)
        exit
    main ENDP
        ; (insert additional procedures here)
    END main
What's Next
• Basic Elements of Assembly Language
• Example: Adding and Subtracting Integers
• Assembling, Linking, and Running Programs
• Defining Data
• Symbolic Constants
• Real-Address Mode Programming
Assemble-Link-Execute Cycle
• Steps from creating a source program through executing the compiled program
http://kipirvine.com/asm/gettingStarted/index.htm
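[Figure: assemble-link-execute cycle. Step 1: a text editor produces the source file. Step 2: the assembler translates the source file into an object file and a listing file. Step 3: the linker combines the object file with the link library to produce the executable file and a map file. Step 4: the OS loader loads the executable file, which runs and produces output.]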
Listing File
• Use it to see how your program is compiled
• Contains
  • source code
  • addresses
  • object code (machine language)
  • segment names
  • symbols (variables, procedures, and constants)
• Example: addSub.lst
Listing File
    address   memory content         source
    00000000                         .code
    00000000                         main PROC
    00000000  B8 00010000                mov eax,10000h
    00000005  05 00040000                add eax,40000h
    0000000A  2D 00020000                sub eax,20000h
    0000000F  E8 00000000 E              call DumpRegs
                                         exit
    00000014  6A 00               *      push +000000000h
    00000016  E8 00000000 E       *      call ExitProcess
    0000001B                         main ENDP
                                     END main
Data Types
    BYTE      8-bit unsigned integer
    SBYTE     8-bit signed integer
    WORD      16-bit unsigned integer
    SWORD     16-bit signed integer
    DWORD     32-bit unsigned integer
    SDWORD    32-bit signed integer
    FWORD     48-bit integer (far pointer in protected mode)
    QWORD     64-bit integer
    TBYTE     80-bit (10-byte) integer
    REAL4     32-bit (4-byte) short real
    REAL8     64-bit (8-byte) long real
    REAL10    80-bit (10-byte) extended real
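The value range of each integer type follows directly from its width. The Python sketch below is an illustration rather than slide content. It assumes two's-complement encoding for the signed types, and the helper names are hypothetical.

```python
def unsigned_range(bits: int) -> tuple:
    """Smallest and largest values of an unsigned integer of this width."""
    return 0, 2**bits - 1

def signed_range(bits: int) -> tuple:
    """Smallest and largest values of a two's-complement signed integer."""
    return -(2**(bits - 1)), 2**(bits - 1) - 1

for name, bits, signed in [
    ("BYTE", 8, False), ("SBYTE", 8, True),
    ("WORD", 16, False), ("SWORD", 16, True),
    ("DWORD", 32, False), ("SDWORD", 32, True),
]:
    low, high = signed_range(bits) if signed else unsigned_range(bits)
    print(f"{name:7} {bits // 8} byte(s)  {low} to {high}")
```

The printed limits are the same values used as "smallest" and "largest" initializers in the data definition examples.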
Defining BYTE and SBYTE Data
• Each of the following defines a single byte of storage:

    value1 BYTE 'A'      ; character constant
    value2 BYTE 0        ; smallest unsigned byte
    value3 BYTE 255      ; largest unsigned byte
    value4 SBYTE -128    ; smallest signed byte
    value5 SBYTE +127    ; largest signed byte
    value6 BYTE ?        ; uninitialized byte

• MASM does not prevent you from initializing a BYTE with a negative value, but it is considered poor style
• If you declare an SBYTE variable, the Microsoft debugger will automatically display its value in decimal with a leading sign
Defining Byte Arrays
• Examples that use multiple initializers:

    list1 BYTE 10,20,30,40
    list2 BYTE 10,20,30,40
          BYTE 50,60,70,80
          BYTE 81,82,83,84
    list3 BYTE ?,32,41h,00100010b
    list4 BYTE 0Ah,20h,'A',22h
Defining Strings
• An array of characters
• Usually enclosed in quotation marks
• Will often be null-terminated
• To continue a single string across multiple lines, end each line with a comma

    str1 BYTE "Enter your name",0
    str2 BYTE 'Error: halting program',0
    str3 BYTE 'A','E','I','O','U'
    greeting BYTE "Welcome to the Encryption Demo program "
             BYTE "created by Kip Irvine.",0
    menu BYTE "Checking Account",0dh,0ah,0dh,0ah,
              "1. Create a new account",0dh,0ah,
              "2. Open an existing account",0dh,0ah,
              "Choice> ",0

• Is str1 an array?
• End-of-line sequence:
  • 0Dh = carriage return
  • 0Ah = line feed
Using the DUP Operator
• Use DUP to allocate (create space for) an array or string
• Syntax: counter DUP (argument)
• Counter and argument must be constants or constant expressions

    var1 BYTE 20 DUP(0)          ; 20 bytes, all equal to zero
    var2 BYTE 20 DUP(?)          ; 20 bytes, uninitialized
    var3 BYTE 4 DUP("STACK")     ; 20 bytes, "STACKSTACKSTACKSTACK"
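The effect of DUP on initialized data can be mimicked with byte-string repetition. This Python sketch is only an analogy and is not assembler output. The helper name `dup` is hypothetical, and uninitialized storage (DUP(?)) has no direct counterpart here, so it is left out.

```python
def dup(count: int, initializer: bytes) -> bytes:
    """Repeat an initializer `count` times, as `count DUP(initializer)` would."""
    return initializer * count

var1 = dup(20, b"\x00")    # like: var1 BYTE 20 DUP(0)
var3 = dup(4, b"STACK")    # like: var3 BYTE 4 DUP("STACK")

print(len(var1), len(var3))    # both arrays occupy 20 bytes
print(var3.decode("ascii"))
```

Both definitions reserve 20 bytes: 20 copies of one byte in the first case and 4 copies of a 5-byte string in the second.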
Defining WORD and SWORD
• Define storage for 16-bit integers
  • or double characters
  • single value or multiple values

    word1 WORD 65535          ; largest unsigned value
    word2 SWORD -32768        ; smallest signed value
    word3 WORD ?              ; uninitialized, unsigned
    word4 WORD "AB"           ; double characters
    myList WORD 1,2,3,4,5     ; array of words
    array WORD 5 DUP(?)       ; uninitialized array
Defining Other Types of Data
• Storage definitions for 32-bit integers, quadwords, tenbyte values, and real numbers:

    val1 DWORD 12345678h            ; unsigned
    val2 SDWORD -2147483648         ; signed
    val3 DWORD 20 DUP(?)            ; unsigned array
    val4 SDWORD -3,-2,-1,0,1        ; signed array
    quad1 QWORD 1234567812345678h
    val1 TBYTE 1000000000123456789Ah
    rVal1 REAL4 -2.1
    rVal2 REAL8 3.2E-260
    rVal3 REAL10 4.6E+4096
    ShortArray REAL4 20 DUP(0.0)
Adding Variables to AddSub
    TITLE Add and Subtract, Version 2 (AddSub2.asm)
    ; This program adds and subtracts 32-bit unsigned
    ; integers and stores the sum in a variable.
    INCLUDE Irvine32.inc
    .data
    val1 DWORD 10000h
    val2 DWORD 40000h
    val3 DWORD 20000h
    finalVal DWORD ?
    .code
    main PROC
        mov eax,val1        ; start with 10000h
        add eax,val2        ; add 40000h
        sub eax,val3        ; subtract 20000h
        mov finalVal,eax    ; store the result (30000h)
        call DumpRegs       ; display the registers
        exit
    main ENDP
    END main
Listing File

    00000000                       .data
    00000000  00010000             val1 DWORD 10000h
    00000004  00040000             val2 DWORD 40000h
    00000008  00020000             val3 DWORD 20000h
    0000000C  00000000             finalVal DWORD ?

    00000000                       .code
    00000000                       main PROC
    00000000  A1 00000000 R            mov eax,val1        ; start with 10000h
    00000005  03 05 00000004 R         add eax,val2        ; add 40000h
    0000000B  2B 05 00000008 R         sub eax,val3        ; subtract 20000h
    00000011  A3 0000000C R            mov finalVal,eax    ; store result
    00000016  E8 00000000 E            call DumpRegs       ; display registers
                                       exit
    00000022                       main ENDP
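The addresses in the left column of the listing are running totals of the sizes of everything that came before in the same segment. The Python sketch below illustrates that bookkeeping and is not part of the slides. The byte counts are read off the listing's memory-content column, the 7 bytes for exit assume the push plus call ExitProcess expansion shown in the earlier listing, and the function name `assign_offsets` is hypothetical.

```python
def assign_offsets(items):
    """Give each (name, size_in_bytes) item the next free offset in its segment."""
    offsets, location = {}, 0
    for name, size in items:
        offsets[name] = location
        location += size  # the location counter advances by the item's size
    return offsets, location

# .data segment: four DWORD variables, 4 bytes each.
data_offsets, data_size = assign_offsets(
    [("val1", 4), ("val2", 4), ("val3", 4), ("finalVal", 4)]
)

# .code segment: instruction lengths taken from the listing's byte columns.
code_offsets, code_size = assign_offsets([
    ("mov eax,val1", 5),        # A1 + 4-byte address
    ("add eax,val2", 6),        # 03 05 + 4-byte address
    ("sub eax,val3", 6),        # 2B 05 + 4-byte address
    ("mov finalVal,eax", 5),    # A3 + 4-byte address
    ("call DumpRegs", 5),       # E8 + 4-byte displacement
    ("exit", 7),                # assumed: push (2 bytes) + call ExitProcess (5 bytes)
])

for name, offset in code_offsets.items():
    print(f"{offset:08X}  {name}")
print(f"{code_size:08X}  main ENDP")
```

The computed offsets match the listing, including 00000022 for main ENDP.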
C vs Assembly
Assembly:

    .data
    val1 DWORD 10000h
    val2 DWORD 40000h
    val3 DWORD 20000h
    finalVal DWORD ?
    .code
    main PROC
        mov eax,val1
        add eax,val2
        sub eax,val3
        mov finalVal,eax
        call DumpRegs
        exit
    main ENDP

C (hexadecimal constants use the 0x prefix rather than the h suffix):

    int main()
    {
        int val1 = 0x10000;
        int val2 = 0x40000;
        int val3 = 0x20000;
        int finalVal;

        finalVal = val1 + val2 - val3;
    }
What's Next
• Basic Elements of Assembly Language
• Example: Adding and Subtracting Integers
• Assembling, Linking, and Running Programs
• Defining Data
• Symbolic Constants
  • Equal-Sign Directive
  • Calculating the Sizes of Arrays and Strings
  • EQU Directive
  • TEXTEQU Directive
• Real-Address Mode Programming
Equal-Sign Directive
• name = expression
  • expression is a 32-bit integer (expression or constant)
  • may be redefined
  • name is called a symbolic constant
  • Also OK to use EQU
• Using symbols is good programming style

    COUNT = 500
    …
    mov ax,COUNT    ; 500 does not fit in the 8-bit AL, so a 16-bit register is used
Calculating the Size of Arrays
• Current location counter: $
• Size of a byte array
  • Subtract the address of list from $; the difference is the number of bytes

    list BYTE 10,20,30,40
    ListSize = ($ - list)

• Size of a word array
  • Divide the total number of bytes by 2 (the size of a word)

    list WORD 1000h,2000h,3000h,4000h
    ListSize = ($ - list) / 2
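The arithmetic behind ($ - list) can be checked with a small model. This Python sketch is an illustration rather than assembler behavior taken from the slides. It assumes the array starts at offset 0 and that ListSize is defined immediately after the array, so that $ equals the offset just past the last element. The function name `list_size` is hypothetical.

```python
import struct

def list_size(data: bytes, element_size: int = 1) -> int:
    """Mimic ($ - list) / element_size for an array that starts at `list`."""
    start = 0               # assumed offset of the label `list`
    location = len(data)    # value of $ immediately after the array
    return (location - start) // element_size

byte_list = bytes([10, 20, 30, 40])                               # list BYTE 10,20,30,40
word_list = struct.pack("<4H", 0x1000, 0x2000, 0x3000, 0x4000)    # list WORD 1000h,...

print(list_size(byte_list))       # number of bytes in the byte array
print(list_size(word_list, 2))    # number of words in the word array
```

If any other data were declared between the array and the ListSize line, $ would have moved on and the computed size would be too large.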
EQU Directive
• Define a symbol as either an integer or text expression
• Cannot be redefined
• OK to use expressions in EQU:
  • Matrix1 EQU 10 * 10
  • Matrix1 EQU <10 * 10>
• No expression evaluation if within < >
• EQU accepts text too

    PI EQU <3.1416>
    pressKey EQU <"Press any key to continue",0>
    .data
    prompt BYTE pressKey
TEXTEQU Directive
• Define a symbol as either an integer or text expression
• Called a text macro
• Can be redefined

    continueMsg TEXTEQU <"Do you wish to continue (Y/N)?">
    rowSize = 5
    .data
    prompt1 BYTE continueMsg
    count TEXTEQU %(rowSize * 2)      ; evaluates expression
    setupAL TEXTEQU <mov al,count>
    .code
    setupAL                           ; generates: "mov al,10"
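The way setupAL turns into "mov al,10" can be modeled as repeated text substitution. The Python sketch below is a simplified illustration and does not describe how MASM is implemented. It assumes whole-word, case-sensitive replacement, models the % operator by evaluating the expression up front, and uses a hypothetical function name `expand`.

```python
import re

def expand(line: str, macros: dict) -> str:
    """Replace whole-word macro names with their text until nothing changes."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, macros)) + r")\b")
    while True:
        expanded = pattern.sub(lambda m: macros[m.group(1)], line)
        if expanded == line:
            return expanded
        line = expanded

row_size = 5                        # rowSize = 5
macros = {
    "count": str(row_size * 2),     # count TEXTEQU %(rowSize * 2)
    "setupAL": "mov al,count",      # setupAL TEXTEQU <mov al,count>
}

print(expand("setupAL", macros))
```

The first pass replaces setupAL with its text and the second pass replaces count with 10, giving the same statement as the comment in the example.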
Summary
• Integer expression, character constant
• Directive: interpreted by the assembler
• Instruction: executes at runtime
• Code, data, and stack segments
• Source, listing, object, map, executable files
• Data definition directives:
  • BYTE, SBYTE, WORD, SWORD, DWORD, SDWORD, QWORD, TBYTE, REAL4, REAL8, and REAL10
  • DUP operator, location counter ($)
• Symbolic constant
  • EQU and TEXTEQU