nest: network storage flexible commodity storage appliances
Post on 17-Jan-2016
John Bent, Venkateshwaran Venkataramani, Miron Livny, Andrea Arpaci-Dusseau, and Remzi Arpaci-Dusseau
University of Wisconsin, Madison
NeST: Network Storage
Flexible Commodity Storage Appliances
www.nestproject.com
NeST: Network Storage
Flexible, commodity based, software-only storage appliances
Building a NeST should be as easy as:
– Finding a networked machine
– “Dropping” some software on it
– Self-configuring to best utilize the machine
NeST resource management
Dynamic user account creation
– Provides storage for migratory grid users
Storage reservations and quota systems
– Intelligent scheduling of data-intensive applications
Matchmaking
– Match storage opportunities & storage consumers
Condor NeSTs
Better, smarter checkpoint servers
– Checkpoints are just another data file
– Data replicates/migrates to different NeSTs
– Condor jobs access data from the closest NeST
– Flexible policy support for managing disk and network resources
New worlds, New problems
Diverse hardware, software platforms
- Netapp, EMC advantage
  • Fewer platforms, control over OS
- Our approach
  • Automate configuration to each host system
    – H/W (Disks): use file system or self-manage
    – S/W (File System): use either read/write or mmap
Implication: Flexibility at a cost?
Key is design of the software
NeST structure
Modules for communication, transfer & storage
Protocol layer
• Pluggable protocols allow diverse protocols to be mapped into common control flows
Transfer layer
• Different concurrency architectures to maximize system throughput on diverse platforms
Storage layer
• Provides abstract interface to disks
NeST Structure
[Diagram: layered architecture, top to bottom]
Protocol layer: GFTP, NeST, WiND, HTTP, NFS
Control logic
Concurrency architecture: nonblocking, multi-process, multi-threaded
Storage layer: raw disk, local FS, RAID, memory
Many Protocols, Single Server
Single point of control
- Storage quotas/guarantees can be supported
- Bandwidth can be controlled & QoS provided
Single administrative interface
- Set policies, manage user accounts
Maintainable S/W
- Shared code base reduces replication, increases maintainability
Protocol layer implementation
Each protocol listens on a well-defined port
Central control accepts connections
Protocol layer reads from the connection and returns a generic request object
Analogous to Linux V-nodes
– Add a new protocol by writing a couple of methods
Virtual protocol class
class nestProtocol {
public:
    virtual nestRequest* receiveRequest( fd_set *ready ) = 0;
    virtual ssize_t Receive( char *buffer, ssize_t n ) = 0;
    virtual ssize_t Send( char *buffer, ssize_t n ) = 0;
    virtual bool sendAck( ) = 0;
    virtual bool sendErrorMessage( int error ) = 0;
    virtual bool sendDirectoryList( NestFileInfo *list );
    virtual bool sendPwd( char *pwd );
    virtual bool sendTransferInfo( class nestRequest *req );
    virtual bool sendFilesize( int retval, long size ) = 0;
    virtual bool sendQuotaInfo( struct UserUsage* usage );
    virtual bool hasSocket() = 0;
    virtual int getSocket() = 0;
};
Virtual protocol class
Plus some static methods
    static int listen( int port );
    static nestProtocol* accept( int socket, sockaddr_in *address );
    static NestReplyStatus initiateThirdParty( nestClient *client, nestProtocol **connection );
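A minimal sketch of the "diverse protocols, common control flow" idea, assuming a much-reduced interface. MiniProtocol, GenericRequest, FtpLikeProtocol, HttpLikeProtocol and handleLine are hypothetical names for illustration, not NeST code, and the sketch parses strings instead of reading from sockets.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical generic request, standing in for nestRequest.
enum class RequestType { FileSize, Unknown };

struct GenericRequest {
    RequestType type;
    std::string path;
};

// Shared helper: accept "<verb> <path>" and map it to a file-size request.
inline GenericRequest parseVerbPath(const std::string& line, const std::string& verb) {
    std::istringstream in(line);
    std::string gotVerb, path;
    in >> gotVerb >> path;
    GenericRequest req;
    req.type = (gotVerb == verb && !path.empty()) ? RequestType::FileSize
                                                  : RequestType::Unknown;
    req.path = path;
    return req;
}

// Hypothetical, reduced stand-in for the nestProtocol interface.
class MiniProtocol {
public:
    virtual ~MiniProtocol() = default;
    // Turn one protocol-specific command line into a generic request.
    virtual GenericRequest receiveRequest(const std::string& line) = 0;
    // Format the protocol-specific reply to a file-size request.
    virtual std::string sendFilesize(bool found, long size) = 0;
};

class FtpLikeProtocol : public MiniProtocol {
public:
    GenericRequest receiveRequest(const std::string& line) override {
        return parseVerbPath(line, "SIZE");
    }
    std::string sendFilesize(bool found, long size) override {
        return found ? "213 " + std::to_string(size) + "\r\n"
                     : std::string("550 File not found\r\n");
    }
};

class HttpLikeProtocol : public MiniProtocol {
public:
    GenericRequest receiveRequest(const std::string& line) override {
        return parseVerbPath(line, "HEAD");
    }
    std::string sendFilesize(bool found, long size) override {
        if (!found) return "HTTP/1.0 404 Not Found\r\n\r\n";
        return "HTTP/1.0 200 OK\r\nContent-Length: " + std::to_string(size) + "\r\n\r\n";
    }
};

// Protocol-independent control logic: it only ever sees generic requests.
// A negative knownSize models "file not found".
inline std::string handleLine(MiniProtocol& proto, const std::string& line, long knownSize) {
    GenericRequest req = proto.receiveRequest(line);
    if (req.type != RequestType::FileSize) return "";
    return proto.sendFilesize(knownSize >= 0, knownSize);
}
```

In this sketch, adding a protocol means writing one parser and one reply formatter; handleLine stays unchanged.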
Grid FTP virtual protocol
Good news?
– Hard work already done
– Already implemented libraries have support for parallel streams, authentication, etc.
Bad news?
– Slight mismatch to integrate callback model into synchronous API
Grid FTP - listen
// static initialization routine
int nestProtocolGridFtp::listen( int port ) {
    globus_result_t result;
    globus_ftp_control_server_t *server;
    globus_module_activate( GLOBUS_FTP_CONTROL_MODULE );
    server = (globus_ftp_control_server_t *)
        Malloc( sizeof( struct globus_ftp_control_server_s ) );
    result = globus_ftp_control_server_handle_init( server );
    if ( result != GLOBUS_SUCCESS ) { goto ERROR; }
    short unsigned Port = (short unsigned) port;
    result = globus_ftp_control_server_listen( server, &Port,
                                               listenCallback, NULL );
    if ( result != GLOBUS_SUCCESS ) { goto ERROR; }
    if ( Spipe( pipes ) < 0 ) { goto ERROR; }
    // the return statement and the ERROR label are not shown on the slide
}
Grid FTP - receiveRequest
nestRequest* nestProtocolGridFtp::receiveRequest( fd_set *ready ) {
    // ignore the fd_set
    char msg[MESSAGESIZE];
    snprintf( msg, MESSAGESIZE, "%s", "Gftp receiveRequest called: " );
    nestRequest *req = this->request;
    if ( req != NULL ) {
        snprintf( &(msg[strlen(msg)]), MESSAGESIZE - strlen(msg), "%s\n",
                  "Actually have a request to give.\n" );
    } else {
        snprintf( &(msg[strlen(msg)]), MESSAGESIZE - strlen(msg), "%s\n",
                  "No request to give.\n" );
    }
    nest_debug( 1, msg );
    return req;
}
Grid FTP - sendFilesize
bool nestProtocolGridFtp::sendFilesize( int retval, long size ) {
    assert( reqType == FILESIZE_REQUEST );
    switch( retval ) {
    case NEST_SUCCESS:
        snprintf( buffer, FILEBUFFER, "213 %ld\r\n", size );
        break;
    case NEST_REMOTE_FILE_NOT_FOUND:
        snprintf( buffer, FILEBUFFER, "550 File not found\r\n" );
        break;
    default:
        assert( 0 );
    }
    return sendMessage( buffer );
}
Grid FTP - sendMessage
bool nestProtocolGridFtp::sendMessage( const void *vptr ) {
    globus_result_t result;
    waitIdle();
    setWait( true );
    result = globus_ftp_control_send_response( handle, (char *)vptr,
                                               commandCallback, this );
    if ( result != GLOBUS_SUCCESS ) {
        fprintf( stderr, "globus_ftp_control_send_response error: %s\n",
                 resultToString( result ) );
        free( handle );
        return false;
    }
    return true;
}
Grid FTP - commandCallback
void nestProtocolGridFtp::commandCallback( void * arg,
        struct globus_ftp_control_handle_s * handle,
        globus_object_t * error ) {
    nest_debug( 1, "command_callback called - set wait false\n" );
    if( error != GLOBUS_SUCCESS ) {
        fprintf( stderr, ">> command_callback: error %s\n",
                 globus_object_printable_to_string( error ) );
        return;
    }
    // set the wait flag to false
    nestProtocolGridFtp *connection = (nestProtocolGridFtp *)arg;
    connection->setWait( false );
}
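The slides do not show how waitIdle() and setWait() are implemented. One plausible way to bridge a callback model into a synchronous API is a flag guarded by a mutex and condition variable, sketched below. WaitFlag and SyncSender are hypothetical names, and a std::thread stands in for the library invoking the completion callback; no Globus calls are used.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wait flag: "true" means an asynchronous send is outstanding.
class WaitFlag {
public:
    // Block until no asynchronous operation is outstanding.
    void waitIdle() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return !waiting_; });
    }
    // Set or clear the flag; clearing wakes any thread blocked in waitIdle().
    void setWait(bool waiting) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            waiting_ = waiting;
        }
        if (!waiting) idle_.notify_all();
    }
private:
    std::mutex mutex_;
    std::condition_variable idle_;
    bool waiting_ = false;
};

// Mirrors the shape of sendMessage()/commandCallback() above: each send first
// waits for the previous completion callback, then marks itself outstanding.
class SyncSender {
public:
    ~SyncSender() {
        for (std::thread& t : callbacks_) t.join();
    }
    void send() {
        flag_.waitIdle();
        flag_.setWait(true);
        // Stand-in for the asynchronous library call: the "callback" runs on
        // another thread and clears the flag when the send has completed.
        callbacks_.emplace_back([this] {
            ++completed_;
            flag_.setWait(false);
        });
    }
    void waitIdle() { flag_.waitIdle(); }
    int completed() const { return completed_.load(); }
private:
    WaitFlag flag_;
    std::atomic<int> completed_{0};
    std::vector<std::thread> callbacks_;
};
```

As in the slide code, a send returns as soon as the request is issued; the blocking happens at the start of the next call, so at most one response is in flight per connection.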
Concurrency architecture
Three difficult goals
– Low latency
– High bandwidth
– Multiple simultaneous clients
No single portable solution
– Multiple models provide solutions on a range of different platforms
  • Multi-threaded, multi-process, single process nonblocking
Flexible Concurrency
Concurrency architecture: nonblocking, multi-process, multi-threaded
Control logic creates transfer object
– Virtual connection from the protocol layer
– Virtual connection from the storage layer
– Transfer object passed to concurrency layer
– Control logic informed when transfer ends
Concurrency Models
Nonblocking model
– Performs only non-blocking I/O
– Selects on file descriptors / sockets from each transfer object
Pre-allocated pool of processes
– Descriptors are passed over a pipe
– Transfer object recreated on other side
– Each process does a blocking transfer
Pre-allocated pool of threads
– Transfer object enqueued by server
– Dequeued by an available thread
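The thread-pool model above can be sketched as a fixed set of workers pulling from a shared queue. This is an illustrative sketch only: TransferPool and Transfer are hypothetical names, and a std::function stands in for a NeST transfer object performing a blocking transfer.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for a transfer object: one unit of blocking work.
using Transfer = std::function<void()>;

class TransferPool {
public:
    // Pre-allocate the worker threads.
    explicit TransferPool(std::size_t threads) {
        for (std::size_t i = 0; i < threads; ++i)
            workers_.emplace_back([this] { run(); });
    }
    // Let workers drain the queue, then join them.
    ~TransferPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_all();
        for (std::thread& w : workers_) w.join();
    }
    // Called by the server: enqueue a transfer for any available thread.
    void enqueue(Transfer transfer) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(transfer));
        }
        ready_.notify_one();
    }
private:
    // Worker loop: dequeue a transfer and run it to completion.
    void run() {
        for (;;) {
            Transfer transfer;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stopping and nothing left to do
                transfer = std::move(queue_.front());
                queue_.pop();
            }
            transfer();  // blocking work happens outside the lock
        }
    }
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Transfer> queue_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

The process-pool model differs mainly in how work reaches the worker: descriptors must be passed across a process boundary and the transfer object rebuilt there, whereas threads can share the queued object directly.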
Concurrency Models and GFTP
Nonblocking model
– Not yet supported.
– Grid FTP libraries do not expose the socket on which to select.
Pre-allocated pool of processes
– Not yet supported.
– Haven’t figured out how to send a GridFTP connection over a pipe.
Pre-allocated pool of threads
– No problem. Fully integrated.
Other models could work . . .
Storage Layer
Three needed areas of flexibility
– File system interfaces
  • Example: read()/write() or mmap()
– Abstract storage models
  • RAID, JBOD, etc.
  • Memory storage model also a possibility
    – Provide file system interface to remote memory
    – Useful for buffering systems like Kangaroo
  • Virtual storage model akin to virtual protocol layer
– User account administration
  • Creation and removal
  • Quotas and guarantees for users and groups
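A minimal sketch of a virtual storage model with a memory backend and per-user quotas, assuming an interface the slides do not show. StorageModel, MemoryStorage and their methods are hypothetical names; groups and guarantees are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical abstract storage interface, akin to the virtual protocol class.
class StorageModel {
public:
    virtual ~StorageModel() = default;
    virtual bool write(const std::string& user, const std::string& path,
                       const std::string& data) = 0;
    virtual bool read(const std::string& path, std::string& out) const = 0;
};

// Memory storage model: files live in a map, usage is charged to the owner.
class MemoryStorage : public StorageModel {
public:
    // Account creation with a byte quota.
    void createUser(const std::string& user, std::size_t quotaBytes) {
        users_[user] = Account{quotaBytes, 0};
    }
    // Reject writes from unknown users, to files owned by someone else,
    // or that would push the owner past quota.
    bool write(const std::string& user, const std::string& path,
               const std::string& data) override {
        auto account = users_.find(user);
        if (account == users_.end()) return false;
        std::size_t oldSize = 0;
        auto existing = files_.find(path);
        if (existing != files_.end()) {
            if (existing->second.owner != user) return false;
            oldSize = existing->second.data.size();
        }
        std::size_t newUsed = account->second.used - oldSize + data.size();
        if (newUsed > account->second.quota) return false;
        account->second.used = newUsed;
        files_[path] = File{user, data};
        return true;
    }
    bool read(const std::string& path, std::string& out) const override {
        auto file = files_.find(path);
        if (file == files_.end()) return false;
        out = file->second.data;
        return true;
    }
    // Bytes currently charged to a user (0 for unknown users).
    std::size_t used(const std::string& user) const {
        auto account = users_.find(user);
        return account == users_.end() ? 0 : account->second.used;
    }
private:
    struct Account { std::size_t quota; std::size_t used; };
    struct File { std::string owner; std::string data; };
    std::map<std::string, Account> users_;
    std::map<std::string, File> files_;
};
```

Other backends (local file system, raw disk, RAID) would implement the same interface, so control logic and quota policy would not need to know where the bytes land.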