

The Technion - Israel Institute of Technology

Electrical Engineering

Computer Networks Laboratory

Project Report

Subject:

Double FTP Client

Author: Jonathan Charbit

Project Supervisor: Ilan Hazan

Summer 5759


Abstract

The project specification, as defined by the Technion’s Network & Communication lab, was to design an FTP client that can retrieve a single file from different servers, or even from the same server using two different paths.

The idea is to fetch different parts of the same file with the FTP command “REST” (restart), as specified in RFC 959.

We implemented a very simple version in C under a UNIX environment: two processes run in parallel, one retrieving the first part of the selected file and the other retrieving the second part.


Index

Abstract

Index

Introduction

Theoretical background

1. The File Transfer Protocol

2. Terminology

User guide

1. First mode of operation

2. Second mode of operation

Program design

A. Routine description

1. Main routines

2. Other routines

B. Data types

Conclusion and suggestions


Introduction

The basic idea is to establish two FTP connections, each retrieving one of the two parts of the same file.

We chose to supply all the information about the second server when the program is launched: before the program runs, the user has to specify where the second part of the file is to be retrieved from, together with the details of the second connection.

At the beginning of the program, a struct variable is set up; it is used to run the second process when the 2give command is issued (see Program design). The second process is independent: it has its own control and data connections.
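The two-process arrangement can be sketched with fork(). This is only an illustration under assumptions: the names part_fn, run_two_parts and demo_part are hypothetical, and the report does not say whether the father waits for the son, so the wait shown here is a choice of this sketch rather than a description of the project code.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A "part" is any routine that transfers one part of the file and
 * returns 0 on success (hypothetical signature). */
typedef int (*part_fn)(void *arg);

/* Runs `second` in a child (son) process and `first` in the calling
 * (father) process. Returns 0 when both parts succeed, -1 otherwise. */
int run_two_parts(part_fn first, void *first_arg,
                  part_fn second, void *second_arg)
{
    int status;
    int first_rc;
    pid_t pid;

    assert(first != NULL && second != NULL);

    pid = fork();
    if (pid < 0)
        return -1;                       /* fork failed */
    if (pid == 0)                        /* son: owns its own connections */
        _exit(second(second_arg) == 0 ? 0 : 1);

    first_rc = first(first_arg);         /* father: first part of the file */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (first_rc != 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/* Placeholder part used only for illustration: returns the int that
 * arg points to instead of transferring anything. */
int demo_part(void *arg)
{
    return *(int *)arg;
}
```

Because the son is a separate process, anything it opens after the fork (sockets, file handles) is private to it, which matches the independence described above.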


Theoretical background

1. The File Transfer Protocol

The File Transfer Protocol (FTP) is the primary Internet standard for file transfer, either from server to client (retrieve) or from client to server (put). FTP was written specifically for computers running TCP/IP, and the protocol uses two TCP connections at the same time:

- a control connection, which carries FTP commands (from client to server) and FTP replies (from server to client)
- a data connection, which carries the data itself (a file, or the list of files in the current directory)

The objectives of FTP are to promote sharing of files (computer programs or data), to encourage indirect or implicit use of remote computers, to shield a user from variations in file storage systems among hosts, and to transfer data reliably and efficiently.

Though usable directly by a user at a terminal, FTP is designed mainly for use by programs. It supports several commands that allow bidirectional transfer of both binary and text files between computers, where the requesting computer acts as the client and the other one as the server.

A user account is required on the remote machine. Some servers allow anonymous connections.

The user protocol interpreter (user-PI) initiates the control connection, which follows the Telnet protocol. The communication channel from the user-PI to the server-PI is a TCP connection from the user to the standard FTP server port. Once the connection is established, the user-PI sends FTP commands to the server process over the control connection, and the server process sends FTP replies back to the user-PI. The user-PI is thus responsible for sending FTP commands and interpreting the server's replies.


When the user wants to retrieve a file, the client has to tell the server which free port of its own the data connection should be opened to. This is done with the FTP command “PORT”, whose argument encodes the client's address and the chosen port number (like every other FTP command, it is sent on the control connection). The client then listens on that port and waits for the server to connect and transfer the data. When the transfer ends, the user side has to close the data connection. If the user wants to close the data connection before the whole file has been sent (as the first process in our program does, because it retrieves only the first part of the file), the user side closes the data connection and sends the FTP command “ABOR” (abort) on the control connection.
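As an illustration, the argument of PORT in RFC 959 is six decimal numbers separated by commas: the four bytes of the IPv4 address, then the high and low bytes of the port. The following is a minimal sketch of building such a command; the function name format_port_command is hypothetical and not taken from the project source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Formats an RFC 959 PORT command into buf.
 * ip and port are given in host byte order.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if buf is too small. */
int format_port_command(char *buf, size_t buflen, uint32_t ip, uint16_t port)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, buflen, "PORT %u,%u,%u,%u,%u,%u\r\n",
                 (unsigned)((ip >> 24) & 0xFF),
                 (unsigned)((ip >> 16) & 0xFF),
                 (unsigned)((ip >> 8) & 0xFF),
                 (unsigned)(ip & 0xFF),
                 (unsigned)(port >> 8),      /* high byte of the port */
                 (unsigned)(port & 0xFF));   /* low byte of the port  */
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}
```

For the address 192.168.1.5 and port 5001, the port is split as 19*256 + 137, so the command line reads PORT 192,168,1,5,19,137.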

Unlike the control connection, the data connection is not permanent: a new data connection has to be opened for each transfer of data (a file or the file list of the current directory).

At the end of the FTP session, it is the user's responsibility to request that the control connection be closed (with the FTP command “QUIT”), while it is the server that actually closes it.


2. Terminology

ASCII
In FTP, ASCII characters are defined to be the lower half of an eight-bit code set (i.e. the most significant bit is zero).

Control Connection
The communication path between the user-PI and server-PI for the exchange of FTP commands and replies. This connection follows the Telnet protocol.

Data Connection
A full-duplex connection over which data is transferred, in a specified mode and type. The data transferred may be part of a file, an entire file or a number of files (the list of files in the current directory is also sent over the data connection). The path may be between a server-DTP and a user-DTP, or between two server-DTPs.

Data Port
The passive data transfer process “listens” on the data port for a connection from the active transfer process in order to open the data connection.

DTP
The data transfer process establishes and manages the data connection. The DTP can be passive or active.

EOF
The end-of-file condition that defines the end of a file being transferred.


FTP commands
A set of commands that make up the control information flowing from the user-FTP process to the server-FTP process.

File
An ordered set of computer data (including programs), of arbitrary length, uniquely identified by a pathname.

Pathname
The character string that a user must input to a file system in order to identify a file. A pathname normally contains device and/or directory names and a file name specification. FTP does not yet specify a standard pathname convention. Each user must follow the file naming conventions of the file systems involved in the transfer.

PI
The protocol interpreter. The user and server sides of the protocol have distinct roles, implemented in a user-PI and a server-PI.

Reply
An acknowledgment (positive or negative) sent from server to user in response to an FTP command. The general form of a reply is a completion code (including error codes) followed by a text string. The codes are for use by programs and the text is usually intended for human users.

Server-DTP
The data transfer process that, in its normal “active” state, establishes the data connection with the “listening” data port. It sets up parameters for transfer and storage, and transfers data on command from its PI. The DTP can be placed in a “passive” state to listen for, rather than initiate, a connection on the data port.


Server-FTP process
A process or set of processes that performs the function of file transfer in cooperation with a user-FTP process and possibly another server. The functions consist of a protocol interpreter (PI) and a data transfer process (DTP).

Server-PI
The server protocol interpreter “listens” on the FTP port for a connection from a user-PI and establishes a control connection. It receives standard FTP commands from the user-PI, sends replies, and governs the server-DTP.

Type
The data representation type used for data transfer and storage. Type implies certain transformations between the time of data storage and data transfer.

User
A person, or a process on behalf of a person, wishing to obtain file transfer service. The human user may interact directly with a server-FTP process, but use of a user-FTP process is preferred since the protocol design is weighted towards automata.

User-DTP
The data transfer process that “listens” on the data port for a connection from a server-FTP process. If two servers are transferring data between them, the user-DTP is inactive.

User-FTP process
A set of functions including a protocol interpreter, a data transfer process and a user interface, which together perform the function of file transfer in cooperation with one or more server-FTP processes. The user interface allows a local language to be used in the command-reply dialogue with the user.

User-PI
The user protocol interpreter initiates the control connection from its FTP port to the server-FTP process, initiates FTP commands, and governs the user-DTP if that process is part of the file transfer.


User guide

Two modes of operation are available:

- simple FTP
- multiple FTP

1. First mode of operation

In the first mode of operation, the program runs as a simple FTP client that implements the following commands:

curdir : prints the remote current directory
dir : prints the list of files in the remote current directory
giveme file_name : retrieves the file file_name from the server
lcd directory : changes the local current directory to directory
ldir : prints the list of files in the local current directory
mkdir new_dir : creates a directory named new_dir in the local file system
quit : stops the program
rhelp : prints the list of FTP commands supported by the server
type data_type : changes the data representation type to data_type

To run in the first mode, just type:

client server_name

All commands are case sensitive and must be typed in lowercase.


2. Second mode of operation

In the second mode, parallel transfer from two servers is available.

To run in the second mode, all the information about the second server has to be given on the command line:

client first_server_name second_server_name username password path file_name file_size

When the “mftp>” prompt appears, type 2give to start the two processes.


Program design

A. Routine description

1. Main routines:

The program starts in the main() function, in the file my_client.c.

If the user specifies a second server, the struct variable file_location (see Data types) is set up.

Then ConnectToServer() is called.

ConnectToServer(server_name):

server_name: string argument, name of the server to connect to

- creates a socket (only the control connection is created)
- connects to the server (and gets a reply)
- calls Login() (with NULL as the login and pass arguments)
- on success, calls GetAndInterpret() in a loop until the user types the quit command
- closes the control connection

GetandInterpret(sock):

sock: struct my_socket * argument, see Data types

This is the principal function of the program, where the command typed by the user is processed.

The command line is split into a command name and its arguments by calling GetCmdandParam().

Here is the list of all the commands; the principal ones are described in detail:

curdir : prints the remote current directory
dir : prints the list of files in the remote current directory
giveme file_name : retrieves the file file_name from the server
lcd directory : changes the local current directory to directory
ldir : prints the list of files in the local current directory
mkdir new_dir : creates a directory named new_dir in the local file system
quit : stops the program
rhelp : prints the list of FTP commands supported by the server
type data_type : changes the data representation type to data_type
2give : retrieves the two parts of the file in two different processes

curdir:

Sends the FTP command “PWD” (Print Working Directory) on the control connection and prints the reply on screen (by calling ReadAndPrint()).

dir:

- opens the data connection by calling OpenDataConnection() with the struct sock
- sends the port number for the data connection on the control connection, by calling SendPort()
- sends the FTP command “LIST” on the control connection
- reads from the data connection and prints the file list, by calling WriteList()

giveme file_name:

Calls GetFile() with the file name taken from the command line as second parameter, 0 as third parameter (the transfer begins at the first byte of the file), and ILLIMITED as last parameter (the transfer ends only at the end of the file).

lcd directory:

Executes the system call chdir() with directory as its parameter.

ldir:

Executes the shell command ls -l.

mkdir new_dir:

Uses system calls to create new_dir in the local working directory.

quit:

- sends the FTP command “QUIT” to the server
- prints the reply
- returns the QUIT value

rhelp:

- sends the FTP command “HELP” to the server
- prints the reply, which contains the command list (the server sends the command list as a reply on the control connection)

type data_type:

- sends the FTP command “TYPE” with the appropriate parameter to the server
- prints the reply

2give:

- splits into two processes
- father process: calls GetFile() with the appropriate parameters. The file name and size are taken from the struct file_location * my_file (see Data types), set up at the beginning of the program. The third parameter is 0, meaning that the transfer begins at the first byte of the file. The fourth parameter is the number of blocks to transfer; it is calculated so that no “hole” is left between the parts retrieved by the two processes (which may sometimes cause some bytes to be copied twice):

Blocks_to_transfer = file_size/(2*FILE_BLOCK_SIZE) + 2

The “+2” prevents “holes”: Blocks_to_transfer is an integer variable, so the division truncates its result.

- son process: calls ConnectToNewServer() to create new sockets for the control and data connections and retrieve the second part of the file.
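The arithmetic behind the split can be checked with a short sketch. FILE_BLOCK_SIZE is the report's constant, but its value is not stated, so 1024 is an assumption here; the function names are hypothetical. The second offset is the third argument that ConnectToNewServer() passes to GetFile(), namely (size+FILE_BLOCK_SIZE)/2.

```c
#include <assert.h>

#define FILE_BLOCK_SIZE 1024L   /* assumed value; not given in the report */

/* Number of blocks the father process transfers (the report's formula). */
long first_part_blocks(long file_size)
{
    assert(file_size >= 0);
    return file_size / (2 * FILE_BLOCK_SIZE) + 2;
}

/* Byte offset at which the son process restarts the transfer. */
long second_part_offset(long file_size)
{
    assert(file_size >= 0);
    return (file_size + FILE_BLOCK_SIZE) / 2;
}

/* Returns 1 when the first part reaches at least the offset of the
 * second part, i.e. when no "hole" is left between the two parts. */
int parts_leave_no_hole(long file_size)
{
    /* one past the last byte the father may transfer */
    long first_end = first_part_blocks(file_size) * FILE_BLOCK_SIZE;
    return first_end >= second_part_offset(file_size);
}
```

With a 10000-byte file and 1024-byte blocks, the father transfers up to 6 blocks (bytes 0 to 6143) and the son restarts at byte 5512, so 632 bytes are copied twice and none are missed.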

Now we continue the main routine description.

GetFile(sock, file, beginning, blocks_to_transfer):

sock: my_socket * argument, see Data types
file: string argument, name of the file to transfer
beginning: number of bytes to restart from (set to zero to transfer from the beginning of the file)
blocks_to_transfer: number of blocks to transfer, each block being FILE_BLOCK_SIZE bytes (set to ILLIMITED to transfer until the end of the file)

- calls OpenDataConnection()
- creates the file on the local file system (or opens it if it already exists)
- sends the FTP command “REST” (restart) with the value of beginning
- moves the file pointer of the local file to the right place
- reads from the data connection and writes to the file in a loop (each iteration transfers one block) until blocks_to_transfer blocks have been transferred or the end of the file is reached
- prints the total number of bytes transferred
- if it stops before EOF, sends the FTP command “ABOR” (abort)
- closes the file pointer and the data connection socket
- returns the number of bytes transferred
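The block-by-block copy at the heart of these steps might look like the sketch below. It is not the project's GetFile(): the name copy_blocks is hypothetical, the value of FILE_BLOCK_SIZE and the encoding of ILLIMITED as -1 are assumptions, and the data connection is represented by a plain file descriptor so that the loop can be tried without a server.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define FILE_BLOCK_SIZE 1024    /* assumed value; not given in the report */
#define ILLIMITED (-1L)         /* assumed encoding of "no block limit"   */

/* Copies up to max_blocks blocks from data_fd into out, writing from
 * byte offset `beginning` of the local file. Stops early at end of
 * data. Returns the number of bytes written, or -1 on error. */
long copy_blocks(int data_fd, FILE *out, long beginning, long max_blocks)
{
    char block[FILE_BLOCK_SIZE];
    long total = 0;
    long blocks = 0;

    assert(out != NULL);
    if (fseek(out, beginning, SEEK_SET) != 0)
        return -1;

    while (max_blocks == ILLIMITED || blocks < max_blocks) {
        size_t filled = 0;

        /* read() may return fewer bytes than asked: fill a whole block */
        while (filled < sizeof block) {
            ssize_t n = read(data_fd, block + filled, sizeof block - filled);
            if (n < 0)
                return -1;
            if (n == 0)
                break;                  /* end of data */
            filled += (size_t)n;
        }
        if (filled == 0)
            break;
        if (fwrite(block, 1, filled, out) != filled)
            return -1;
        total += (long)filled;
        blocks++;
        if (filled < sizeof block)
            break;                      /* end of data reached mid-block */
    }
    return total;
}
```

Seeking to `beginning` before writing is what lets the two processes fill the same local file from different offsets.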

ConnectToNewServer(my_file):

my_file: file_location * argument, see Data types

- creates a socket for the new control connection
- connects to the new server, using the username and password from my_file (set up at the beginning of the program)
- calls GetFile() with the appropriate parameters: the file name and size are taken from the struct file_location * my_file (see Data types), set up at the beginning of the program. The third argument is (size+FILE_BLOCK_SIZE)/2. The fourth argument is ILLIMITED.

2. Other routines:

ReadAndPrint(sd, gen_code):

sd: integer argument, number of the control connection socket
gen_code: integer variable, set to FIRST_ITERATION on the first call; needed for the recursive calls

ReadAndPrint() is a recursive function. Each call gets one reply from the server and prints it. The number of replies the server is going to send is not known at the time the first reply arrives. The FTP protocol specifies that the server puts a “marker” at the beginning of the last reply: in RFC 959 the first line of a multi-line reply begins with the reply code followed by a hyphen, and the last line begins with the same code followed by a space. ReadAndPrint() calls itself until this marker is seen at the beginning of a reply.

To identify the first call, gen_code is set to FIRST_ITERATION.
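The end-of-reply test can be sketched as a per-line check. This is a simplified illustration, not the project's ReadAndPrint(): the function name final_reply_code is hypothetical, and the sketch does not verify that the code on the last line equals the code on the first line, which a stricter reader would also do.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Returns the three-digit reply code if `line` is the final line of a
 * reply (three digits followed by a space or the end of the line),
 * or 0 if more lines are expected or the line is not a reply line. */
int final_reply_code(const char *line)
{
    assert(line != NULL);

    /* The checks short-circuit, so a short line is never read past
     * its terminating NUL. */
    if (!isdigit((unsigned char)line[0]) ||
        !isdigit((unsigned char)line[1]) ||
        !isdigit((unsigned char)line[2]))
        return 0;
    if (line[3] == '-')                 /* "ddd-": more lines follow */
        return 0;
    if (line[3] != ' ' && line[3] != '\0' &&
        line[3] != '\r' && line[3] != '\n')
        return 0;
    return (line[0] - '0') * 100 + (line[1] - '0') * 10 + (line[2] - '0');
}
```

A caller would keep reading lines from the control connection, printing each one, until this function returns a non-zero code; a plain loop does the same job as the recursion described above.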

GetCmdandParam(cmd, param, cmd_line):

cmd: string that contains, when the routine returns, the name of the command typed by the user
param: array of strings that contains, when the routine returns, the parameters of the command line
cmd_line: command line typed by the user

GetCmdandParam() splits cmd_line into words: the first word goes into cmd and the following words into param.
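One way to split a command line like this is with strtok(), as in the sketch below. The name split_command_line, the MAX_PARAMS limit and the choice to return pointers into the original buffer are assumptions of this example; the report does not show how GetCmdandParam() stores its results.

```c
#include <assert.h>
#include <string.h>

#define MAX_PARAMS 8    /* assumed limit on the number of parameters */

/* Splits cmd_line in place on blanks and line endings. Stores the first
 * word in *cmd and the following words in param[]. Returns the number
 * of parameters found, or -1 if the line contains no word at all. */
int split_command_line(char *cmd_line, char **cmd, char *param[MAX_PARAMS])
{
    const char *delims = " \t\r\n";
    int count = 0;
    char *word;

    assert(cmd_line != NULL && cmd != NULL && param != NULL);

    word = strtok(cmd_line, delims);
    if (word == NULL)
        return -1;                      /* empty command line */
    *cmd = word;

    while (count < MAX_PARAMS && (word = strtok(NULL, delims)) != NULL)
        param[count++] = word;
    return count;
}
```

Note that strtok() modifies its input and keeps hidden state, so the line must be a writable buffer and the function is not reentrant.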

Login(sd, login, pass):

sd: integer argument, number of the control connection socket
login: string argument, username for the connection (useful for the second connection)
pass: string argument, password for the connection (useful for the second connection)

- gets the username from the user and sends the USER command on the control connection
- gets the password from the user and sends the PASS command on the control connection
- if login and pass are not NULL, they are sent to the server instead

OpenDataConnection(sock):

sock: my_socket * argument, see Data types

- creates a socket for the data connection
- chooses a port for the data connection at random
- listens on this port
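A minimal sketch of these steps with the BSD socket API follows. It swaps in a different technique for the port choice: instead of picking a random number itself, it binds to port 0 and lets the kernel assign a free port, then reads the port back with getsockname(). The function name open_listening_socket is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Creates a listening TCP socket on a kernel-chosen port.
 * On success returns the socket descriptor and stores the port
 * (host byte order) in *port. Returns -1 on failure. */
int open_listening_socket(uint16_t *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd;

    assert(port != NULL);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);           /* 0 asks the kernel for a free port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 1) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}
```

Letting the kernel choose avoids retrying when a randomly picked port happens to be in use; the returned port is the number that would then be announced to the server.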

SendPort(sock):

sock: my_socket * argument, see Data types

- gets the data connection port number
- sends the FTP command “PORT” with the appropriate number

SendRest(sock, num):

sock: my_socket * argument, see Data types
num: number of bytes to restart from

- sends the FTP command “REST” with the appropriate number
- prints the reply
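Sending the command amounts to writing one text line, terminated by carriage return and line feed, on the control connection. The sketch below shows that step only; the names write_all and send_rest_command are hypothetical, and reading and printing the reply is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Writes all len bytes to fd, retrying after short writes.
 * Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Sends "REST <offset>" followed by CR LF on the control connection.
 * Returns 0 on success, -1 on error. */
int send_rest_command(int ctl_fd, long offset)
{
    char cmd[32];
    int n;

    if (offset < 0)
        return -1;
    n = snprintf(cmd, sizeof cmd, "REST %ld\r\n", offset);
    if (n < 0 || (size_t)n >= sizeof cmd)
        return -1;
    return write_all(ctl_fd, cmd, (size_t)n);
}
```

The same helper shape would serve for the other single-line commands in this report, such as PWD or QUIT.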

WriteList(sock):

sock: my_socket * argument, see Data types

- reads from the data connection and prints to the screen in a loop until the server sends no more data
- prints the reply (sent by the server on the control connection)

B. Data types:

A struct has been defined to hold the socket ids.

struct my_socket {
    int ctl;   /* socket id for the control connection */
    int data;  /* socket id for the data connection */
};

A second struct holds the information about the file to retrieve with the “2give” command (which runs two processes in parallel).

struct file_location {
    char* server_name;  /* server name for the second connection */
    char* file_name;    /* name of the file to retrieve */
    char* login;        /* username for the second connection */
    char* password;     /* password for the second connection */
    char* path;         /* path of the file to retrieve */
    int size;           /* size of the file to retrieve */
};
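As an illustration of how this struct could be filled from the second-mode command line given in the User guide (client first_server_name second_server_name username password path file_name file_size), here is a hedged sketch. The function name fill_file_location and its validation rules are assumptions; the report does not show the project's own parsing code.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

struct file_location {
    char* server_name;  /* server name for the second connection */
    char* file_name;    /* name of the file to retrieve */
    char* login;        /* username for the second connection */
    char* password;     /* password for the second connection */
    char* path;         /* path of the file to retrieve */
    int size;           /* size of the file to retrieve */
};

/* Fills loc from the eight second-mode arguments. The strings are not
 * copied: loc points into args. Returns 0 on success, -1 if the
 * argument count or the file size is invalid. */
int fill_file_location(struct file_location *loc, int count, char *args[])
{
    char *end;
    long size;

    assert(loc != NULL && args != NULL);
    if (count != 8)
        return -1;

    size = strtol(args[7], &end, 10);
    if (end == args[7] || *end != '\0' || size < 0 || size > INT_MAX)
        return -1;

    loc->server_name = args[2];
    loc->login       = args[3];
    loc->password    = args[4];
    loc->path        = args[5];
    loc->file_name   = args[6];
    loc->size        = (int)size;
    return 0;
}
```

Called from main() with argc and argv, this accepts exactly the second-mode form; the two-argument first-mode form would be handled separately.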

Conclusion and Suggestions

This project is only a first step in the implementation of a much wider idea: parallel transfer of different parts of the same file.

Many more advanced implementations could be designed in the future:

- increasing the number of parallel connections
- using threads instead of separate processes
- adding time measurements to compare performance
- instead of assigning a part of the file to each connection before run time, one could design a program in which each connection takes care of one “File block” and, once the whole “File block” has been transferred, goes on to the next “File block”

Example with N connections and a File block size of 1000 Kbytes:

Connection #1: 1-1000
Connection #2: 1001-2000
....
Connection #N: (N-1)*1000+1 - N*1000

The first of the N connections to complete its “File block” transfers the (N+1)th “File block”, N*1000+1 - (N+1)*1000, and so on.

- the File block size could be defined differently and dynamically for each connection, according to the connection's performance
