Crafting a ‘demo’ program

A ‘walk-through’ of the program development cycle for an example in assembly language

Our purpose

• We want to illustrate the steps that a Linux program needs to take when modifying the normal ‘canonical mode’ terminal behavior

• We want to write it in assembly language

• Our Project #2 involves something similar

• Here we want to ‘Keep It Simple’ (KISS)

• Yet we still want to show the essentials

• We might see new Pentium instructions

Just a tiny change

• Users can normally ‘cancel’ a program

• They can do it by typing <CONTROL>-C

• It’s important for stopping “infinite loops”

• The system sends the program an ‘interrupt’ signal (SIGINT), which terminates it by default

• This avoids the need for a system ‘reboot’

• But we can ‘reprogram’ this tty capability

• We just turn off a bit in the ‘c_lflag’ field

Our ‘nocbreak.s’ demo

• Step 1: get the terminal’s initial settings

• Step 2: save a copy of these settings

• Step 3: clear the ISIG bit in the ‘c_lflag’ field

• Step 4: install the ‘modified’ tty settings

• Step 5: let user do some keyboard input

• Step 6: reinstall original terminal settings

• Step 7: Quit (i.e., return control to Linux)

Step 1: Get ‘tty’ settings

• We can use the ‘tcgetattr()’ function

• It’s part of the system’s runtime library

• Use the ‘man’ command to see how it’s called

• Here’s its function prototype:

int tcgetattr( int fileno, struct termios *tty );

• We can call it using assembly language:

– Push the arguments (in right-to-left order)

– Call the function: call tcgetattr

– Discard the arguments from the stack

Here’s the code

.section .data

ttywrk: .space 60 # for ‘termios’ object

.section .text

pushl $ttywrk # push the address

pushl $0 # push device-ID

call tcgetattr # call runtime library

addl $8, %esp # discard arguments
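For comparison, here is the same call written in C. This is only a sketch: the helper name ‘get_tty_settings’ is our own invention, and we assume the standard POSIX declarations from <termios.h>. On 32-bit x86 Linux a C compiler emits essentially the same push/call/discard sequence for us.

```c
#include <assert.h>
#include <stddef.h>    /* NULL */
#include <termios.h>   /* struct termios, tcgetattr() */
#include <unistd.h>    /* STDIN_FILENO (device-ID 0, the keyboard) */

/* Hypothetical helper: fetch the current settings of device 'fd'
   into the caller's 'termios' object.
   Returns 0 on success, or -1 (e.g., when 'fd' is not a terminal). */
int get_tty_settings(int fd, struct termios *tty)
{
    assert(tty != NULL);
    /* the same two arguments our assembly code pushes:
       the device-ID and the address of the object */
    return tcgetattr(fd, tty);
}
```

A caller would typically write get_tty_settings( STDIN_FILENO, &ttywrk ) and check the return-value before going on to Step 2.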

Step 2: copy the object

• We can set up a loop to perform the copying

• The loop can copy the structure one byte at a time

• Total number of bytes is loop-count (60)

• Put source-address into a cpu register

• Put dest’n-address into a cpu register

• Advance addresses as each byte is copied

• Use ‘loop’ opcode to decrement-and-jump

Here’s the data

.section .data

ttysav: .space 60 # original structure

ttywrk: .space 60 # our working copy

And here’s the code

.section .text

movl $ttywrk, %esi # setup source addr

movl $ttysav, %edi # setup dest’n addr

movl $60, %ecx # setup loop-count

nxmv: # label the loop-body

movb (%esi), %al # copy src byte to AL

movb %al, (%edi) # copy AL to dest’n

incl %esi # advance src-addr

incl %edi # advance dst-addr

loop nxmv # repeat until all bytes are copied
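The same loop can be sketched in C, which may help in checking the logic before writing the assembly. The function name ‘copy_bytes’ is ours (hypothetical). We use a do-while because the ‘loop’ opcode decrements the count first and then tests it, so the body always executes at least once.

```c
#include <assert.h>

/* Hypothetical helper mirroring the 'nxmv' loop: copy 'count' bytes
   from 'src' to 'dst', one byte per iteration. */
void copy_bytes(unsigned char *dst, const unsigned char *src,
                unsigned int count)
{
    /* 'loop' decrements before it tests, so a zero count would not
       stop the loop right away; we rule that case out */
    assert(count > 0);
    do {
        *dst = *src;            /* movb (%esi), %al ; movb %al, (%edi) */
        src++;                  /* incl %esi */
        dst++;                  /* incl %edi */
    } while (--count != 0);     /* loop nxmv */
}
```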

Step 3: modify the flag-bit

• We know where the ‘c_lflag’ field is

• It starts 12 bytes into the ‘termios’ structure

• We got this info from our ‘ttyinfo.cpp’ demo

• Similarly we can find that the ISIG bit is bit #0

• We want to “reset” this bit (i.e., clear it to 0)

• We could use a bitwise AND operation

• But Pentium offers us another way (BTR)
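The field offset and the bit number can be double-checked with a few lines of C. The sketch below is not our ‘ttyinfo.cpp’ demo itself; the two function names are hypothetical, and the values they report come from whatever header files are installed on the system.

```c
#include <assert.h>
#include <stddef.h>    /* offsetof, size_t */
#include <termios.h>   /* struct termios, ISIG */

/* Byte offset of the 'c_lflag' field within 'struct termios'. */
size_t lflag_offset(void)
{
    return offsetof(struct termios, c_lflag);
}

/* Position of the lowest 1-bit in a flag mask such as ISIG. */
int bit_number(unsigned long mask)
{
    int n = 0;
    assert(mask != 0);              /* a zero mask has no 1-bit */
    while ((mask & 1UL) == 0) {     /* shift until bit 0 is set */
        mask >>= 1;
        n++;
    }
    return n;
}
```

Printing lflag_offset() and bit_number( ISIG ) on the machine being used is safer than trusting memory, since these values can differ between operating systems.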

Here’s the code

.equ ISIG, 0 # symbolic constant

.section .data

ttywrk: .space 60 # for termios object

.section .text

movl $12, %edx # offset for ‘c_lflag’

btrl $ISIG, ttywrk(%edx) # clears the ISIG bit
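The bitwise AND alternative mentioned earlier is easiest to see in C. This is a sketch under our own naming (‘without_isig’ is hypothetical): the mask is inverted so that every bit except ISIG is preserved.

```c
#include <termios.h>   /* struct termios, tcflag_t, ISIG */

/* Return a copy of 'orig' in which signal-generation (ISIG) is
   turned off; all the other settings are left unchanged. */
struct termios without_isig(struct termios orig)
{
    orig.c_lflag &= ~(tcflag_t)ISIG;   /* AND with the inverted mask */
    return orig;
}
```

Because the structure is passed and returned by value, the caller’s original settings stay intact, which fits our plan of keeping a saved copy for Step 6.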

Brief digression

• Other Pentium bit-manipulations:

BTS (bit-set)

BTR (bit-reset)

BTC (bit-complement)

BT (bit-test)

• These operations all have this “side effect”:

– the previous bit-value gets transferred to the CF-bit (Carry Flag) within the Pentium’s EFLAGS register

• Why? So you can use JC (or JNC) afterward
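To make the “side effect” concrete, here is a C model of the four operations acting on a 32-bit word. It is only an illustration: the function names are ours, each return-value plays the role of the Carry Flag, and we assume the bit number is below 32.

```c
#include <assert.h>
#include <stdint.h>

/* BT: report bit 'n' of '*word' without changing it. */
int bit_test(const uint32_t *word, unsigned int n)
{
    assert(n < 32);
    return (int)((*word >> n) & 1u);
}

/* BTS: set bit 'n' to 1 and return its previous value. */
int bit_set(uint32_t *word, unsigned int n)
{
    int cf = bit_test(word, n);
    *word |= (UINT32_C(1) << n);
    return cf;
}

/* BTR: clear bit 'n' to 0 and return its previous value. */
int bit_reset(uint32_t *word, unsigned int n)
{
    int cf = bit_test(word, n);
    *word &= ~(UINT32_C(1) << n);
    return cf;
}

/* BTC: flip bit 'n' and return its previous value. */
int bit_complement(uint32_t *word, unsigned int n)
{
    int cf = bit_test(word, n);
    *word ^= (UINT32_C(1) << n);
    return cf;
}
```

Testing the returned value in C corresponds to following the instruction with JC or JNC in assembly.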

Step 4: Install new behavior

• We can use the ‘tcsetattr()’ function

• Use ‘man tcsetattr’ to see how it’s called

• Requires three function arguments:

– Device’s ID-number (i.e., 0 for keyboard)

– A flag-value, to specify buffer-flushing

– The address of the new ‘termios’ object

• As usual, these arguments have to be pushed in reverse (i.e., right-to-left) order

Here’s the function-call

.section .text

pushl $ttywrk # address of the object

pushl $TCSAFLUSH # flag-value

pushl $0 # keyboard’s device-ID

call tcsetattr # call to runtime library

addl $12, %esp # rebalance stack

# NOTE: Similar code is used later in step 6
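In C the same call is a one-liner, shown here wrapped in a helper of our own naming (‘install_tty_settings’ is hypothetical). The arguments appear left-to-right in the source, which is the reverse of the order in which the assembly code pushes them.

```c
#include <assert.h>
#include <stddef.h>    /* NULL */
#include <termios.h>   /* struct termios, tcsetattr(), TCSAFLUSH */

/* Hypothetical helper: install '*tty' as the settings for device 'fd'.
   TCSAFLUSH lets pending output finish and discards unread input
   before the change takes effect.
   Returns 0 on success, or -1 (e.g., when 'fd' is not a terminal). */
int install_tty_settings(int fd, const struct termios *tty)
{
    assert(tty != NULL);
    return tcsetattr(fd, TCSAFLUSH, tty);
}
```

The same helper would serve for Step 6 by passing the address of the saved copy instead of the working copy.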

Step 5: Try new tty behavior

• We want to let the user type some input

• In particular, we want to test <CTRL>-C

• We’ve changed the normal tty handling

• Prove <CTRL>-C won’t stop the program

• Find out what the new response will be

• We need the program to ‘read’ from the keyboard

• Can use ‘read()’ from the runtime library

How ‘read()’ works

• Function’s prototype shows 3 arguments:

– Device ID-number (e.g., 0 for the keyboard)

– Address for an input-buffer (we create buffer)

– Maximum number of bytes that will be read

• In canonical mode, the ‘read()’ call won’t return until the user hits <ENTER>; it then transfers at most the maximum number of bytes into the input-buffer, and any characters left over wait for the next ‘read()’ call
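Here is how the three arguments line up in a C call. The wrapper name ‘read_one_byte’ is ours (hypothetical); it simply passes along what ‘read()’ reports.

```c
#include <unistd.h>   /* read(), ssize_t */

/* Hypothetical helper: read at most one byte from device 'fd'
   into '*ch'.  Returns 1 if a byte arrived, 0 at end-of-input,
   or -1 if the call failed. */
int read_one_byte(int fd, char *ch)
{
    /* arguments: device-ID, buffer address, maximum bytes */
    ssize_t n = read(fd, ch, 1);
    return (int)n;
}
```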

So here’s the ‘read()’ call

.section .data

inchar: .space 1 # room for 1 byte

.section .text

pushl $1 # maximum bytes

pushl $inchar # buffer’s address

pushl $0 # keyboard’s ID

call read # call to C library

addl $12, %esp # discard arguments

Testing for <ESCAPE>-code

• We needed a way to stop the program

• Can’t quit by using <CONTROL>-C now

• Our solution: quit by hitting <ESCAPE>

• So the program needs to test for its ASCII-code

• ASCII-code for ESCAPE-key equals 0x1B

• Our loop includes a compare-and-branch

Testing for the ‘exit’ condition

.section .data

inchar: .space 1 # buffer for user input

.section .text

again:

…

cmpb $0x1B, inchar # user typed ESC?

jne again # no, reenter loop

# otherwise, fall through to next instruction
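The compare-and-branch loop can be sketched in C as follows. The function name and the idea of counting the keypresses are our own additions for illustration; unlike the assembly fragment, this version also gives up if ‘read()’ stops delivering bytes.

```c
#include <unistd.h>   /* read() */

#define ESCAPE_CODE 0x1B   /* ASCII-code for the ESCAPE-key */

/* Hypothetical helper: read bytes from 'fd' one at a time until
   ESCAPE arrives.  Returns the number of bytes seen before the
   ESCAPE, or -1 if the input ended (or failed) first. */
int read_until_escape(int fd)
{
    int count = 0;
    char inchar;

    for (;;) {                          /* again: */
        if (read(fd, &inchar, 1) != 1)
            return -1;                  /* no more input */
        if (inchar == ESCAPE_CODE)      /* cmpb $0x1B, inchar */
            return count;               /* fall through: leave loop */
        count++;                        /* jne again */
    }
}
```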

A ‘tweak’ for esthetics

• When we tested our ‘nocbreak’ demo, we did not like the screen’s appearance

• Our program’s final output was ‘garbled’ by the subsequent command-shell prompt

• We wanted to make the output prettier

• So we added an additional code-fragment

• A ‘newline’ control-code gets printed after each keypress by the user (using ‘write()’)
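A C sketch of such a fragment appears below. The helper name ‘put_newline’ is hypothetical, and we are assuming the newline goes to the standard output device (ID-number 1); the demo’s actual assembly fragment is not reproduced here.

```c
#include <unistd.h>   /* write(), STDOUT_FILENO */

/* Hypothetical helper: send one 'newline' control-code to device
   'fd'.  Returns 0 if the byte was written, or -1 otherwise. */
int put_newline(int fd)
{
    const char newline = '\n';
    /* arguments: device-ID, buffer address, number of bytes */
    return (write(fd, &newline, 1) == 1) ? 0 : -1;
}
```

Calling put_newline( STDOUT_FILENO ) after each keypress would keep the shell’s next prompt from running into our program’s output.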

In-class exercises

• Programmers can choose among several ways of accomplishing a particular task

• Example: there’s more than one way to copy a 60-byte data-structure from one place in memory to another

• We don’t have to do it one-byte-at-a-time

• We don’t have to use both %esi and %edi

• Try doing the copying in some other ways

Using a common array-index

• Here’s an idea for a different copying scheme

.section .text

xorl %esi, %esi # array-index

movl $30, %ecx # word-count

nxwm: movw ttywrk(%esi), %ax # fetch word

movw %ax, ttysav(%esi) # store word

addl $2, %esi # next word index

loop nxwm

Using a ‘scaled’ array-index

# we use a ‘scaled index’ to do array-addressing

.section .text

xorl %esi, %esi # clear to zero

movl $30, %ecx # loop-count

nxwd: movw ttywrk( , %esi, 2), %ax

movw %ax, ttysav( , %esi, 2)

incl %esi # increment index

loop nxwd
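C programmers use a scaled index all the time without noticing: the compiler multiplies an array-index by the element size for us. The sketch below (with a hypothetical function name) copies the same 60 bytes as thirty 16-bit words, using one index for both arrays.

```c
#include <stdint.h>   /* uint16_t */

#define WORD_COUNT 30   /* 60 bytes = 30 words of 2 bytes each */

/* Hypothetical helper: copy thirty 16-bit words from 'src' to 'dst',
   using a single index for both arrays. */
void copy_words(uint16_t *dst, const uint16_t *src)
{
    unsigned int i;                 /* plays the role of %esi */

    for (i = 0; i < WORD_COUNT; i++)
        dst[i] = src[i];            /* address = base + 2 * index */
}
```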

Exercise

• Try to devise the ‘most efficient’ method you can think of for copying the 60 bytes

• But what does ‘most efficient’ mean?

– Using the fewest assembly statements?

– Using the fewest cpu registers?

– Executing the fewest loop-iterations?

• Will your “solution” be the same no matter what you think “most efficient” means?
