Third EELA Tutorial for Managers and Users
www.eu-eela.org
E-infrastructure shared between Europe and Latin America
CE + WN installation and configuration
Vanessa Hamar
Universidad de Los Andes – Mérida, Venezuela
Rio de Janeiro 26-30, 2006
Outline
• What is a Computing Element (CE)?
• What is a Torque Server?
• What is a Worker Node?
• How to install and configure a Computing Element with Torque Server
• How to install and configure a Worker Node with Torque
What is a CE?
• The CE is a service representing a computing resource.
• Its main functionality is job management (job submission, job control, etc.).
• For job submission, the CE can work in:
– push model (where the job is pushed to a CE for its execution).
– pull model (where the CE asks the WMS for jobs).
What is Torque?
• TORQUE (Tera-scale Open-source Resource and QUEue manager) is a resource manager providing control over batch jobs and distributed compute resources.
• The Torque system is composed of:
– pbs_server, which provides the basic batch services, such as receiving/creating a batch job or protecting the job against system crashes.
– job_scheduler, which contains the site's policy used to decide which job must be executed.
– pbs_mom, which places the job into execution. It is also responsible for returning the job's output to the user.
What is a Worker Node?
• The Worker Node (WN) is a set of clients required to run jobs sent by the CE via the Local Resource Management System. It currently includes:
– the gLite I/O Client
– the Logging and Bookkeeping Client
– the R-GMA Client
– the WMS Checkpointing library
Installing CE + Torque Server
WN + Torque
Installing pre-requisites
• Start from a fresh install of SLC 3.0.X
• Installation method:
– Starting from gLite release 3.0, installation via the gLite installer scripts is not supported.
– Use APT instead: http://glite.web.cern.ch/glite/packages/APT.asp
• Check whether apt is already installed:
rpm -qa | grep apt
• Install apt if necessary:
rpm -ivh http://linuxsoft.cern.ch/cern/slc30X/i386/SL/RPMS/apt-0.5.15cnc6-8.SL.cern.i386.rpm
• The APT installation pulls in all dependencies, including:
– other necessary gLite modules
– external dependencies
Installing pre-requisites
• Java is not included in the distribution. Install it separately (version >= 1.4.2_08): http://java.sun.com/j2se/1.4.2/download.html
chmod +x j2sdk-1_4_2_10-linux-i586-rpm.bin
./j2sdk-1_4_2_10-linux-i586-rpm.bin
rpm -ivh j2sdk-1_4_2_10-linux-i586.rpm
Preparing... ########################################### [100%]
1:j2sdk ########################################### [100%]
Installing pre-requisites
• Depending on the package set you selected when installing the operating system, the lam package may be installed on your WN. If it is, remove it:
apt-get remove lam
• There is a known installation conflict between the 'torque-clients' rpm and the 'postfix' mail server (Savannah bug #5509). If you are going to install Torque, uninstall the postfix package:
apt-get remove postfix
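The two removals above only matter when the package is actually present. A minimal sketch of a guarded removal follows; the helper name `remove_if_installed` is hypothetical, and it assumes `rpm -q` and `apt-get remove` behave as on a standard apt-rpm system.

```sh
# Hypothetical helper: remove a package only if rpm reports it as installed.
# `rpm -q <pkg>` exits non-zero when the package is absent.
remove_if_installed() {
    pkg=$1
    if rpm -q "$pkg" >/dev/null 2>&1; then
        echo "removing $pkg"
        apt-get remove "$pkg"
    else
        echo "$pkg is not installed, nothing to do"
    fi
}

# Usage (as root), for the two packages named on the slide:
#   remove_if_installed lam
#   remove_if_installed postfix
```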
Installing pre-requisites
• Check the FQDN hostname
– Ensure that the hostnames of your machines are correctly set. Run the command:
hostname -f
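As a rough illustration of what "correctly set" means here, the sketch below treats a name as fully qualified when it contains at least one dot and no empty labels. The function names are hypothetical and the test is deliberately simple; it does not check that the name resolves in DNS.

```sh
# Hypothetical helper: succeed if the argument looks like host.domain.
is_fqdn() {
    case "$1" in
        ""|.*|*.|*..*) return 1 ;;   # empty, leading/trailing dot, or empty label
        *.*)           return 0 ;;   # at least one dot: looks fully qualified
        *)             return 1 ;;   # short hostname without a domain part
    esac
}

# Check the local node using the command from the slide.
check_local_hostname() {
    name=$(hostname -f 2>/dev/null) || name=""
    if is_fqdn "$name"; then
        echo "OK: $name"
    else
        echo "WARNING: '$name' is not a fully qualified hostname" >&2
        return 1
    fi
}
```

Running `check_local_hostname` on each node before the installation gives an early warning when only a short hostname is configured.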
Installing pre-requisites
• Install the glite-yaim and gilda_ig-yaim packages on your nodes
• Download and install the latest version of glite-yaim-3.0.0-* on all your grid nodes:
http://glitesoft.cern.ch/EGEE/gLite/APT/R3.0/rhel30/RPMS.Release3.0/
rpm -hiv glite-yaim-3.0.0-16.noarch.rpm
Preparing... ########################################### [100%]
1:glite-yaim ########################################### [100%]
• Download and install the latest version of gilda_ig-yaim-3.0.0-* on all your grid nodes:
http://grid018.ct.infn.it/apt/gilda_app-i386/utils
[root@eelatut37 root]# rpm -hiv gilda_ig-yaim-3.0.0-11.noarch.rpm
Preparing... ########################################### [100%]
1:gilda_ig-yaim ########################################### [100%]
Installing pre-requisites
• Request host certificates for the CE from a CA
– https://gilda.ct.infn.it/CA/mgt/restricted/srvreq.php
• Copy the host certificate files (hostcert.pem and hostkey.pem) to /etc/grid-security.
• Change the permissions
– chmod 644 hostcert.pem
– chmod 400 hostkey.pem
• If you plan to use certificates released by CAs not supported by EGEE, be sure that their public keys and CRLs (usually distributed as an rpm) are installed.
– The CRL of the GILDA VO is available from https://gilda.ct.infn.it/RPMS/ca_GILDA-1.0-1.i386.rpm
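The permission step can be wrapped so that it fails loudly when one of the two files is missing. This is a minimal sketch: the function name `secure_host_cert` is hypothetical, and the certificate directory is passed as an argument rather than hard-coded.

```sh
# Hypothetical helper: apply the permissions from the slide to a certificate pair.
# $1 is the directory that holds hostcert.pem and hostkey.pem.
secure_host_cert() {
    dir=$1
    [ -f "$dir/hostcert.pem" ] && [ -f "$dir/hostkey.pem" ] || {
        echo "hostcert.pem or hostkey.pem missing in $dir" >&2
        return 1
    }
    chmod 644 "$dir/hostcert.pem" &&   # certificate: world-readable
    chmod 400 "$dir/hostkey.pem" &&    # private key: readable by owner only
    ls -l "$dir/hostcert.pem" "$dir/hostkey.pem"
}
```

The private key is the file that must stay unreadable to other users; the final `ls -l` prints both modes so the result can be checked by eye.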
Installing pre-requisites
• Synchronization among all gLite nodes is mandatory. Install ntp if it is not already available on your system:
– apt-get install ntp
• Add your time server to /etc/ntp.conf:
– restrict <time_server_IP_address> mask 255.255.255.255 nomodify notrap noquery
– server <time_server_name>
– (you can use ntp-1.infn.it, IP 193.206.144.10)
• Edit /etc/ntp/step-tickers, adding the hostname(s) of your time server(s)
• If you are running a firewall, you will have to allow inbound communication on the NTP port:
– -A INPUT -s <NTP-serverIP-1> -p udp --dport 123 -j ACCEPT
• Activate the ntpd service with the following commands:
ntpdate <your ntp server name>
service ntpd start
chkconfig ntpd on
– You can check ntpd's status with: ntpq -p
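To make the two /etc/ntp.conf entries concrete, the sketch below prints them for one time server. The function name `ntp_conf_lines` is hypothetical. The server name and address in the usage line are the ones suggested on the slide.

```sh
# Hypothetical helper: print the ntp.conf lines for one time server.
# $1 = time server hostname, $2 = time server IP address.
ntp_conf_lines() {
    printf 'restrict %s mask 255.255.255.255 nomodify notrap noquery\n' "$2"
    printf 'server %s\n' "$1"
}

# Usage (as root), appending the lines to the configuration file:
#   ntp_conf_lines ntp-1.infn.it 193.206.144.10 >> /etc/ntp.conf
```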
Installing CE+Torque Server via apt
• Add the gLite apt repository:
– Put the repository line in a file (e.g. glite.list) inside the /etc/apt/sources.list.d directory
– apt-get update
– apt-get upgrade
Installing CE+Torque Server via apt
• All site configuration values must be set in a site configuration file using key-value pairs.
• This file is shared among all the different gLite node types, so edit it once and keep it in a safe place.
• Copy the site-info.def template from /opt/glite/yaim/examples/ (installed with the yaim RPMs) to your reference directory for the installation (e.g. /root):
– cp /opt/glite/yaim/examples/gilda_ig-site-info.def /root/my-site-info.def
• A good syntax test for your site configuration file is to source it manually:
– source my-site-info.def
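The syntax test can be run in a subshell so that a broken file cannot disturb the current login shell. This is a minimal sketch and the function name `check_site_info` is hypothetical. It only detects shell syntax errors, not wrong or missing values.

```sh
# Hypothetical helper: source a key-value configuration file in a subshell
# and report whether the shell could parse it.
check_site_info() {
    file=$1
    # `.` searches PATH for names without a slash, so force a relative path.
    case "$file" in */*) ;; *) file=./$file ;; esac
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    if ( . "$file" ) >/dev/null 2>&1; then
        echo "syntax OK: $file"
    else
        echo "syntax error in $file" >&2
        return 1
    fi
}

# Usage:
#   check_site_info /root/my-site-info.def
```

A typical failure this catches is an unterminated quote in a value, which would otherwise only surface when the configuration scripts run.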
Installing CE+Torque Server via apt
• Edit the Worker Node list file and enter the hostnames of your Worker Nodes:
vi /opt/glite/yaim/examples/gilda_wn-list.conf
Installing CE+Torque Server via apt
• Install the node
/opt/glite/yaim/scripts/gilda_ig_install_node gilda_ig-site-info.def GILDA_ig_CE_torque
• Configure the node
/opt/glite/yaim/scripts/gilda_ig_configure_node gilda_ig-site-info.def GILDA_ig_CE_torque
Installing CE+Torque Server via apt
• If the installation completes successfully, the following components are installed:
– gLite in /opt/glite
– Condor in /opt/condor-x.y.z (where x.y.z is the current Condor version)
– Globus in /opt/globus
– Tomcat in /var/lib/tomcat5
– Torque in /var/spool/pbs
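A quick way to confirm the locations listed above is to test each directory in turn. The sketch below is one way to do that; the function name `check_install_dirs` is hypothetical, and the Condor directory is matched with a glob because its version number is not fixed.

```sh
# Hypothetical helper: report which of the given directories exist.
# Returns non-zero if at least one is missing.
check_install_dirs() {
    missing=0
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "found:   $dir"
        else
            echo "missing: $dir"
            missing=1
        fi
    done
    return "$missing"
}

# Usage on the CE:
#   check_install_dirs /opt/glite /opt/condor-* /opt/globus /var/lib/tomcat5 /var/spool/pbs
```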
Installing CE+Torque Server via apt
• Edit /etc/ssh/sshd_config and add the following lines at the end:
HostbasedAuthentication yes
IgnoreUserKnownHosts yes
IgnoreRhosts yes
• Restart the server with:
/sbin/service sshd restart
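If the edit is scripted rather than done by hand, it helps to make it safe to run more than once. The sketch below appends each of the three lines only when it is not already present. The function names are hypothetical and the configuration file path is passed as an argument.

```sh
# Hypothetical helper: append a line to a file unless that exact line exists.
append_once() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Add the three host-based authentication settings from the slide.
enable_hostbased_auth() {
    conf=$1
    append_once "$conf" "HostbasedAuthentication yes" &&
    append_once "$conf" "IgnoreUserKnownHosts yes" &&
    append_once "$conf" "IgnoreRhosts yes"
}

# Usage (as root), followed by the restart shown on the slide:
#   enable_hostbased_auth /etc/ssh/sshd_config
```

Note that sshd uses the first value it finds for each keyword, so if the file already sets one of these options earlier with a different value, that earlier line needs to be changed by hand.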
Installing CE+Torque Server via apt
• On the CE, generate an updated version of /etc/ssh/ssh_known_hosts by running:
/opt/edg/sbin/edg-pbs-knownhosts
• Copy that file to all the Worker Nodes.
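With more than a few Worker Nodes the copy is easier to drive from a host list. The sketch below only prints the `scp` commands (a dry run) so they can be reviewed first. The function name `copy_to_nodes`, the one-hostname-per-line list format, and the use of the root account are assumptions.

```sh
# Hypothetical helper: print one scp command per host in a list file.
# $1 = file to copy, $2 = list file with one hostname per line.
# Blank lines and lines starting with '#' are skipped.
copy_to_nodes() {
    src=$1
    list=$2
    while IFS= read -r node; do
        case "$node" in
            ""|\#*) continue ;;
        esac
        echo scp "$src" "root@$node:$src"
    done < "$list"
}

# Usage on the CE: review the commands, then pipe them to sh to run them.
#   copy_to_nodes /etc/ssh/ssh_known_hosts /root/wn-list.conf
#   copy_to_nodes /etc/ssh/ssh_known_hosts /root/wn-list.conf | sh
```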
Installing WN via apt
• Install the node
/opt/glite/yaim/scripts/gilda_ig_install_node gilda_ig-site-info.def GILDA_ig_WN_torque
• Configure the node
/opt/glite/yaim/scripts/gilda_ig_configure_node gilda_ig-site-info.def GILDA_ig_WN_torque
References
• https://gilda.ct.infn.it/docs/GILDAsiteinstall-3_0_0.html