Parallel and Distributed Computing
COMPUTER LABORATORY MANUAL
Parallel and Distributed Computing (CS-332)
Spring Semester
DEPARTMENT OF COMPUTER SOFTWARE ENGINEERING
Military College of Signals
National University of Sciences and Technology
www.mcs.nust.edu.pk
PREFACE
This lab manual has been prepared to facilitate the students of software engineering in studying and analysing various components of a compiler. The compiler is software that converts high-level code to low-level machine code, or in other words converts a program file to an executable one. The stages of the compiler are scanning, parsing, semantic analysis and code generation. The lab sessions are designed to improve the abilities of the students by giving hands-on experience. Tools and languages are given with each lab session.
PREPARED BY
This lab manual was prepared by Dr. Hammad Afzal using material from lab manuals prepared by Dr. Faisal Bashir and Lab Engr Fazalullah. The first two labs are reproduced from the lab manuals of Object Oriented Programming (prepared by Dr. Hammad Afzal and Lab Engr Umer Mehmood). The whole manual was created under the general supervision of the Head of Department, Dr. Naveed Iqbal Rao, in 2014.
GENERAL INSTRUCTIONS
a. Students are required to maintain the lab manual with them till the end of the semester.
b. All readings, answers to questions and illustrations must be solved in the space provided. If more space is required then additional sheets may be attached.
c. It is the responsibility of the student to have the manual graded before the deadlines given by the instructor.
d. Loss of the manual will result in resubmission of the complete manual.
e. Students are required to go through the experiment before coming to the lab session. Lab session details will be given in the training schedule.
f. Students must bring the manual to each lab.
g. Keep the manual neat, clean and presentable.
h. Plagiarism is strictly forbidden. No credit will be given if a lab session is plagiarised and no resubmission will be entertained.
i. Marks will be deducted for late submission.
j. Error handling in a program is the responsibility of the student.
VERSION HISTORY
Date | Updated By | Details
Jan, 2014 | Dr. Hammad Afzal | First Version Created
MARKS
Exp # | Date Conducted | Experiment Title | Max. Marks | Marks Obtained | Instructor Sign
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 |
13 |
14 |
Grand Total |
List of Experiments
Lab 1. Socket Programming using TCP
Lab 2. Socket Programming using UDP
Lab 3. Practicing Client Server Applications
Lab 4. Concurrency and Threads in Java
Lab 5. Advanced Threads in Java
Lab 6. Concurrency with Semaphores
Lab 7. Java RMI
Lab 8. Design and Develop a Remote Method Invocation API
Lab 9. XML Document Validation
Lab 10. Web Services
Lab 11. Development of Ontologies using Protégé
Lab 12. Development of Ontologies using Protégé - II
Lab 13. Distributed Databases and Map-Reduce - I
Lab 14. Distributed Databases and Map-Reduce - II
LAB 1: Socket Programming - TCP
Objective
To demonstrate how connection-oriented sockets (TCP) are created and used. The ServerSocket class, its methods, and the implementation of a multithreaded server are also explained.
Tools
Programming Language: Java
Operating System: Ubuntu
Theory.
Designing the solution
// Get socket info of the remote web servers to which your client connects.
import java.net.*;
import java.io.*;

public class getSocketInfo {
    public static void main(String[] args) {
        for (int i = 0; i < args.length; i++) {
            // try-with-resources closes the socket once its details are printed
            try (Socket theSocket = new Socket(args[i], 80)) {
                System.out.println("Connected to " + theSocket.getInetAddress()
                        + " on port " + theSocket.getPort() + " from port "
                        + theSocket.getLocalPort() + " of "
                        + theSocket.getLocalAddress());
            } catch (UnknownHostException e) {
                System.err.println("I can't find " + args[i]);
            } catch (SocketException e) {
                System.err.println("Could not connect to " + args[i]);
            } catch (IOException e) {
                System.err.println(e);
            }
        } // end for
    } // end main
}
Task
1. Run the program without passing any argument to the main method. Which exception is thrown? [2]
2. Modify the above code so that the user is prompted in a GUI-based dialog box to enter the IP address. Write the additions here. [2]
Sample TCP Client
import java.net.*;
import java.io.*;

public class TCPClient {
    public static void main(String args[]) {
        // arguments supply message and hostname of destination
        Socket s = null;
        try {
            int serverPort = 7896;
            s = new Socket(args[1], serverPort);
            DataInputStream in = new DataInputStream(s.getInputStream());
            DataOutputStream out = new DataOutputStream(s.getOutputStream());
            out.writeUTF(args[0]); // UTF is a string encoding
            String data = in.readUTF();
            System.out.println("Received: " + data);
        } catch (UnknownHostException e) {
            System.out.println("Sock:" + e.getMessage());
        } catch (EOFException e) {
            System.out.println("EOF:" + e.getMessage());
        } catch (IOException e) {
            System.out.println("IO:" + e.getMessage());
        } finally {
            if (s != null) {
                try {
                    s.close();
                } catch (IOException e) {
                    System.out.println("close:" + e.getMessage());
                }
            }
        }
    }
}
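The client's comment notes that "UTF is a string encoding". As a minimal sketch of what writeUTF actually puts on the stream, the following round-trips a string through in-memory byte streams instead of a socket. The class name UtfFramingSketch and its methods are hypothetical helpers, not part of the lab code. writeUTF writes a two-byte length followed by the characters in modified UTF-8, which is how readUTF on the other side knows where the string ends.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class for illustration only.
class UtfFramingSketch {
    // Encode a string the way DataOutputStream.writeUTF would write it to a socket.
    static byte[] encode(String message) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(message);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decode one framed string, as DataInputStream.readUTF does on the receiving side.
    static String decode(byte[] wire) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = encode("hello");
        // Two length bytes followed by the character data.
        System.out.println("bytes on the wire: " + wire.length);
        System.out.println("decoded: " + decode(wire));
    }
}
```

Because the length travels with the data, a client and server that both use writeUTF and readUTF do not need a separate delimiter between messages.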
Task
3. Create a sample TCPClient as shown above.
4. How many arguments do you need to give as input to main in order to have it run successfully? What are those arguments? [2]
5. Modify the above client so that it reads a text file and sends it to the server. Write down only those lines of code that you need to change. [4]
Sample TCP Server
import java.net.*;
import java.io.*;

public class TCPServer {
    public static void main(String args[]) {
        try {
            int serverPort = 7896;
            ServerSocket listenSocket = new ServerSocket(serverPort);
            while (true) {
                // accept() blocks until a client connects
                Socket clientSocket = listenSocket.accept();
                Connection c = new Connection(clientSocket);
            }
        } catch (IOException e) {
            System.out.println("Listen :" + e.getMessage());
        }
    }
}

class Connection extends Thread {
    DataInputStream in;
    DataOutputStream out;
    Socket clientSocket;

    public Connection(Socket aClientSocket) {
        try {
            clientSocket = aClientSocket;
            in = new DataInputStream(clientSocket.getInputStream());
            out = new DataOutputStream(clientSocket.getOutputStream());
            this.start();
        } catch (IOException e) {
            System.out.println("Connection:" + e.getMessage());
        }
    }

    public void run() {
        try { // an echo server
            String data = in.readUTF();
            out.writeUTF(data);
        } catch (EOFException e) {
            System.out.println("EOF:" + e.getMessage());
        } catch (IOException e) {
            System.out.println("IO:" + e.getMessage());
        } finally {
            try {
                clientSocket.close();
            } catch (IOException e) {
                /* close failed */
            }
        }
    }
}
Task
6. What is the purpose of creating each new connection in a new thread? What will happen if a new thread is not used for Connection? [3]
7. Create an EchoServer that echoes the client's message back to the client, i.e., the client sends a message to the server and the server simply replies with the same message. You are required to create a multithreaded server using the ServerSocket class. [Implementation] [7]
Web Resources
1. Transmission Control Protocol: http://en.wikipedia.org/wiki/Transmission_Control_Protocol
2. Sockets: http://docs.oracle.com/javase/tutorial/networking/sockets/
3. Echo Server: http://bansky.net/echotool/
LAB 2: Socket Programming - UDP
Objective
Implement a concurrent echo client-server application using UDP sockets.
Tools
Programming Language: Java
Operating System: Ubuntu
Theory.
Designing the solution
UDP Client
import java.io.*;
import java.net.*;

class UDPClient
{
    public static void main(String args[]) throws Exception
    {
        BufferedReader inFromUser = new BufferedReader(
                new InputStreamReader(System.in));
        DatagramSocket clientSocket = new DatagramSocket();
        InetAddress IPAddress = InetAddress.getByName("localhost");
        byte[] sendData = new byte[1024];
        byte[] receiveData = new byte[1024];
        String sentence = inFromUser.readLine();
        sendData = sentence.getBytes();
        DatagramPacket sendPacket = new DatagramPacket(sendData, sendData.length,
                IPAddress, 9876);
        clientSocket.send(sendPacket);
        DatagramPacket receivePacket = new DatagramPacket(receiveData,
                receiveData.length);
        clientSocket.receive(receivePacket);
        // Use only the bytes actually received, not the whole 1024-byte buffer
        String modifiedSentence = new String(receivePacket.getData(), 0,
                receivePacket.getLength());
        System.out.println("FROM SERVER:" + modifiedSentence);
        clientSocket.close();
    }
}
Task
1. What is the difference between DatagramSocket and Socket? [2]
2. Why do we not need to have DatagramSocket fixed with an IP address at the time of creation? [2]
3. Modify the above program so that it gets the system time before sending the packet to the server and adds the timestamp to the message. Write your added functionality here. [4]
UDP Server
import java.io.*;
import java.net.*;

class UDPServer
{
    public static void main(String args[]) throws Exception
    {
        DatagramSocket serverSocket = new DatagramSocket(9876);
        byte[] receiveData = new byte[1024];
        byte[] sendData = new byte[1024];
        while (true)
        {
            DatagramPacket receivePacket = new DatagramPacket(receiveData,
                    receiveData.length);
            serverSocket.receive(receivePacket);
            // Use only the bytes actually received; the buffer is reused between
            // datagrams and may still hold data from an earlier, longer message
            String sentence = new String(receivePacket.getData(), 0,
                    receivePacket.getLength());
            System.out.println("RECEIVED: " + sentence);
            InetAddress IPAddress = receivePacket.getAddress();
            int port = receivePacket.getPort();
            String capitalizedSentence = sentence.toUpperCase();
            sendData = capitalizedSentence.getBytes();
            DatagramPacket sendPacket =
                    new DatagramPacket(sendData, sendData.length, IPAddress, port);
            serverSocket.send(sendPacket);
        }
    }
}
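A receive buffer is usually larger than the datagram that arrives in it, so only the first getLength() bytes of getData() belong to the message. The sketch below sends one datagram between two sockets in the same process over loopback and builds a string both ways. The class name DatagramLengthSketch, the 2-second timeout, and the use of UTF-8 are assumptions made for illustration, not part of the lab code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only.
class DatagramLengthSketch {
    // Returns {text built from the whole buffer, text built from getLength() bytes}.
    static String[] roundTrip(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback); // port 0: any free port
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // do not wait forever if the datagram is lost

            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(payload, payload.length,
                    loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[1024];
            DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
            receiver.receive(packet);

            // Whole buffer: the message followed by unused zero bytes.
            String wholeBuffer = new String(packet.getData(), StandardCharsets.UTF_8);
            // Exact: only the bytes that were actually received.
            String exact = new String(packet.getData(), 0, packet.getLength(),
                    StandardCharsets.UTF_8);
            return new String[] { wholeBuffer, exact };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = roundTrip("hello");
        System.out.println("length from whole buffer: " + result[0].length());
        System.out.println("length from getLength():  " + result[1].length());
    }
}
```

The padded string often looks identical when printed, which is why this kind of mistake tends to show up only when strings are compared or when a buffer is reused for a shorter message.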
Task
4. Write down the above code and execute it.
5. Modify the code so that it reads the timestamp from the message sent by the client, gets the current system time, and finds the time taken by the packet to be transferred. Write down the lines of code you need to modify. [4]
6. Calculate bandwidth (data sent per unit time) using the calculations in Task 5. [2]
7. Create a server that provides a Time Service at any port (e.g., 5099). Clients using the Telnet facility should be able to access the current time from the server. [Implementation] [6]
Web Resources
1. UDP: http://en.wikipedia.org/wiki/User_Datagram_Protocol
2. Networks and Sockets: http://docs.oracle.com/javase/tutorial/networking/sockets/
3. Time Server: http://www.worldtimeserver.com/
LAB 3: Practicing Client Server Applications
Objective
To practice and implement the concepts learnt in the labs about TCP/UDP socket programming.
Tools
Programming Language: Java
Operating System: Ubuntu
Theory.
Task
1. Write a program that scans the ports in a user-given range on "localhost" and reports which ports are open. [5]
2. Create a server that provides the factorial of any number to its client. The client connects to the server and then asks the user for a number. The server calculates the factorial and returns the result to the client. [5]
Task: Chat Application
3. Develop a GUI based chat software with the following features. [3]
4. The user should be able to select an avatar for him/her. [1]
5. The server should maintain list of users that are logged in. [3]
6. The user should be able to send message to other logged in users. [3]
Web Resources
1. Transmission Control Protocol: http://en.wikipedia.org/wiki/Transmission_Control_Protocol
2. Networking and Sockets: http://docs.oracle.com/javase/tutorial/networking/sockets/
3. UDP: http://en.wikipedia.org/wiki/User_Datagram_Protocol
4. Port Scanner: http://en.wikipedia.org/wiki/Port_scanner
LAB 4: Concurrency and Threads in Java
Objective
Implement a concurrent echo client-server application using TCP sockets.
Tools
Java Network Programming.
Operating System: Ubuntu
Theory.
Concurrency
A streaming audio application must simultaneously read the digital audio off the network, decompress it, manage playback, and update its display. Software that can do such things is known as concurrent software. The Java platform is designed from the ground up to support concurrent programming, with basic concurrency support in the Java programming language and the Java class libraries, and high-level concurrency APIs in the java.util.concurrent packages. In concurrent programming, there are two basic units of execution:
• Processes
• Threads
In the Java programming language, concurrent programming is mostly concerned with threads. However, processes are also important.
Time Slicing
• Processing time for a single core is shared among processes and threads
• This sharing of time is performed through an OS feature called time slicing
• Concurrency is possible even on simple systems, without multiple processors or execution cores
IPC
• To facilitate communication between processes, most operating systems support Inter Process Communication (IPC) resources, such as pipes and sockets
• IPC is used not just for communication between processes on the same system, but also between processes on different systems
Threads
• Threads are sometimes called lightweight processes
• Both processes and threads provide an execution environment, but creating a new thread requires fewer resources than creating a new process
• Threads exist within a process; every process has at least one
• Threads share the process's resources, including memory and open files
• Multithreaded execution is an essential feature of the Java platform
• Every application has at least one thread, or several if you count "system" threads that do things like memory management and signal handling
• But from the application programmer's point of view, you start with just one thread, called the main thread
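The point that threads share the memory of their process can be illustrated with a short sketch: several threads update one counter object on the shared heap, and the main thread reads the combined result after waiting for them. The class name SharedMemorySketch and its method are hypothetical. AtomicInteger is used here only so the increments do not interfere with each other.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class for illustration only.
class SharedMemorySketch {
    // Starts `threads` threads that all increment the same counter object.
    static int countWithThreads(int threads, int incrementsPerThread) {
        AtomicInteger shared = new AtomicInteger(); // one object, visible to every thread
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    shared.incrementAndGet();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // the main thread waits until this worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println("total: " + countWithThreads(4, 1000));
    }
}
```

Two separate processes could not share the counter this way; they would need an IPC mechanism such as a pipe or a socket.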
Thread object
• Each thread is associated with an instance of the class Thread
• There are two basic strategies for using Thread objects to create a concurrent application:
  • To directly control thread creation and management, simply instantiate Thread each time the application needs to initiate an asynchronous task
  • To abstract thread management from the rest of your application, pass the application's tasks to an executor
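The executor strategy is not shown elsewhere in this lab, so here is a minimal sketch of it using ExecutorService from java.util.concurrent. The class name ExecutorSketch, the pool size of 2, and the squaring task are illustrative assumptions. The application only describes the tasks; the pool decides which thread runs each one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class for illustration only.
class ExecutorSketch {
    // Squares each input on a small thread pool and returns the results in input order.
    static List<Integer> squareAll(List<Integer> inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (Integer n : inputs) {
                pending.add(pool.submit(() -> n * n)); // submitted as a Callable<Integer>
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : pending) {
                results.add(future.get()); // blocks until that task has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown(); // let the pool threads exit once the queue is empty
        }
    }

    public static void main(String[] args) {
        System.out.println(squareAll(Arrays.asList(1, 2, 3, 4)));
    }
}
```

Compared with instantiating Thread directly, the number of threads here stays fixed no matter how many tasks are submitted.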
Defining and Starting a Thread
• An application that creates an instance of Thread must provide the code that will run in that thread. There are two ways to do this:
Provide a Runnable object
• The Runnable interface defines a single method, run, meant to contain the code executed in the thread
• The Runnable object is passed to the Thread constructor, as in the HelloRunnable example:
public class HelloRunnable implements Runnable
{
    public void run()
    {
        // Runnable has no getName(); ask the thread that is executing run()
        System.out.println("Hello from a thread!" + Thread.currentThread().getName());
    }

    public static void main(String args[])
    {
        (new Thread(new HelloRunnable())).start();
    }
}
Task
1. Write down the above code snippet. What is the output of the program? [2]
2. What happens when you rename the method "run" to "myrun"? Will the program still run? Write down your observation. [2]
Defining and Starting a Thread
Subclass Thread
• The Thread class itself implements Runnable, though its run method does nothing
• An application can subclass Thread, providing its own implementation of run, as in the HelloThread example:
public class HelloThread extends Thread
{
    public void run()
    {
        System.out.println("Hello from a thread!");
    }

    public static void main(String args[]) {
        (new HelloThread()).start();
    }
}
Counter.java : A subclass of Thread that counts up to a limit with random pauses in between each count.
public class Counter extends Thread {
    private static int totalNum = 0;
    private int currentNum, loopLimit;

    public Counter(int loopLimit) {
        this.loopLimit = loopLimit;
        currentNum = totalNum++;
    }

    private void pause(double seconds) {
        try {
            Thread.sleep(Math.round(1000.0 * seconds));
        } catch (InterruptedException ie) {
        }
    }

    /* When run finishes, the thread exits. */
    public void run() {
        for (int i = 0; i < loopLimit; i++) {
            System.out.println("Counter " + currentNum + ": " + i);
            // pause(Math.random()); // Sleep for up to 1 second
        }
    }
}
CounterTest.java : Instantiates Counter class and starts threads.
public class CounterTest {
    public static void main(String[] args) {
        Counter c1 = new Counter(5);
        Counter c2 = new Counter(5);
        Counter c3 = new Counter(5);
        c1.start();
        c2.start();
        c3.start();
    }
}
Task
3. Write and run the above code. What is the output you observe? [3]
Example: Another Example of Threads
public class ExampleThread extends Thread
{
    private String name;
    private String text;
    private final int REPEATS = 5;
    private final int DELAY = 200;

    public ExampleThread( String aName, String aText )
    {
        name = aName;
        text = aText;
    }

    public void run()
    {
        try
        {
            String threadName = Thread.currentThread().getName();
            long threadID = Thread.currentThread().getId();
            int threadPri = Thread.currentThread().getPriority();
            // toString() gives name, priority and thread group
            String ThreadString = Thread.currentThread().toString();
            // Thread.currentThread().setPriority(MAX_PRIORITY);
            for ( int i = 0; i < REPEATS; ++i )
            {
                System.out.println( name + " says \"" + text + "\" Thread Name:"
                        + threadName + " ID: " + threadID + " Priority: " + threadPri
                        + " " + ThreadString );
                Thread.sleep( DELAY );
            }
        }
        catch( InterruptedException exception )
        {
            System.out.println( "An error occurred in " + name );
        }
        finally
        {
            // Clean up, if necessary
            System.out.println( name + " is quitting..." );
        }
    }
}
public class ThreadTest
{
    public static void main( String[] args )
    {
        ExampleThread et1 = new ExampleThread( "Thread #1", "Hello World!" );
        ExampleThread et2 = new ExampleThread( "Thread #2", "Hey Earth!" );
        // Thread t1 = new Thread( et1 );
        // Thread t2 = new Thread( et2 );
        // t1.start();
        // t2.start();
        // et1.setPriority(10);
        et1.start();
        et2.start();
        // et1.interrupt();
        // t1.interrupt();
    }
}
Task
4. Write the output of the ExampleThread example. Explain the output in 3 sentences. [3]
5. Uncomment the line
// Thread.currentThread().setPriority(MAX_PRIORITY);
Observe and comment on the output. [3]
6. Write a Counter class, similar to the one given in the example, but it should implement the interface Runnable rather than extend the Thread class. [Implementation] [3]
7. Uncomment the commented lines in the code of ThreadTest. What are the differences you observe? Comment. [4]
Web Resources
1. Java Class Thread: http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/Thread.html
2. Concurrency: http://download.oracle.com/javase/tutorial/essential/concurrency/
3. http://download.oracle.com/javase/1.4.2/docs/api/java/lang/Thread.html
LAB 5: Advanced Threads in Java
Objective
To practice the advanced concepts of Threads in Java.
Tools
Java Network Programming.
Operating System: Ubuntu
Theory.
Synchronization
• The Java programming language provides two basic synchronization idioms:
  • synchronized methods
  • synchronized statements
• To make a method synchronized, simply add the synchronized keyword to its declaration:
public class SynchronizedCounter {
    private int c = 0;
    public synchronized void increment() { c++; }
    public synchronized void decrement() { c--; }
    public synchronized int value() { return c; }
}
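The second idiom, the synchronized statement, names the lock object explicitly and protects only the enclosed block. A minimal sketch under stated assumptions: the class name SynchronizedStatementSketch, the private lock field, and the runDemo helper are all hypothetical and are not part of the lab code.

```java
// Hypothetical demo class for illustration only.
class SynchronizedStatementSketch {
    private final Object lock = new Object(); // dedicated lock object (an illustrative choice)
    private int c = 0;

    void increment() {
        synchronized (lock) { // only one thread at a time may hold this lock
            c++;
        }
    }

    int value() {
        synchronized (lock) {
            return c;
        }
    }

    // Runs several threads against one counter and returns the final value.
    static int runDemo(int threads, int incrementsPerThread) {
        SynchronizedStatementSketch counter = new SynchronizedStatementSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.value();
    }

    public static void main(String[] args) {
        System.out.println("final value: " + runDemo(4, 10000));
    }
}
```

A synchronized method is equivalent to wrapping the whole method body in synchronized (this); the statement form is useful when only part of a method touches shared state.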
Example 1 shows how synchronized blocks can be used on objects to coordinate access to them by multiple threads.
class Thread4 extends Thread {
    static String[] msg = { "Java", "is", "fast,", "dynamic,", "and", "comprehensive." };

    public Thread4(String id) {
        super(id);
    }

    public static void main(String[] args) {
        Thread4 t1 = new Thread4("t1: ");
        Thread4 t2 = new Thread4("t2: ");
        t1.start();
        t2.start();
        boolean t1IsAlive = true;
        boolean t2IsAlive = true;
        do {
            if (t1IsAlive && !t1.isAlive()) {
                t1IsAlive = false;
                System.out.println("t1 is dead.");
            }
            if (t2IsAlive && !t2.isAlive()) {
                t2IsAlive = false;
                System.out.println("t2 is dead.");
            }
        } while (t1IsAlive || t2IsAlive);
    }

    void randomWait() {
        try {
            Thread.sleep((long) (3000 * Math.random()));
        } catch (InterruptedException e) {
            System.out.println("Interrupted!");
        }
    }

    public void run() {
        synchronized (System.out) {
            for (int i = 0; i < msg.length; i++) {
                randomWait();
                System.out.println(getName() + msg[i]);
            }
        }
    }
}
Example 2 shows how synchronized methods and object locks are used to coordinate access to a common object by multiple threads.
class Thread3 extends Thread {
    static String[] msg = { "Java", "is", "fast,", "dynamic,", "and", "comprehensive." };

    public static void main(String[] args) {
        Thread3 t1 = new Thread3("t1: ");
        Thread3 t2 = new Thread3("t2: ");
        t1.start();
        t2.start();
        boolean t1IsAlive = true;
        boolean t2IsAlive = true;
        do {
            if (t1IsAlive && !t1.isAlive()) {
                t1IsAlive = false;
                System.out.println("t1 is dead.");
            }
            if (t2IsAlive && !t2.isAlive()) {
                t2IsAlive = false;
                System.out.println("t2 is dead.");
            }
        } while (t1IsAlive || t2IsAlive);
    }

    public Thread3(String id) {
        super(id);
    }

    void randomWait() {
        try {
            Thread.sleep((long) (3000 * Math.random()));
        } catch (InterruptedException e) {
            System.out.println("Interrupted!");
        }
    }

    public void run() {
        SynchronizedOutput.displayList(getName(), msg); // thread name t1 or t2
    }
}

class SynchronizedOutput {
    // if the 'synchronized' keyword is removed, the message
    // is displayed in interleaved fashion
    public static synchronized void displayList(String name, String list[]) {
        for (int i = 0; i < list.length; i++) {
            Thread3 t = (Thread3) Thread.currentThread();
            t.randomWait();
            System.out.println(name + list[i]);
        }
    }
}
Tasks
1. What is the effect of removing the keyword synchronized in Example 1? [2]
2. Implement the parallelized version of the Producer Consumer Problem. The buffer size should be fixed (more than 1). [Implementation] [7]
3. Implement the parallelized version of the Producer Consumer Problem. The buffer size should be exactly 1. [Implementation] [5]
4. Implement a Bank Account System which should give functionality to deposit and withdraw. Both functions should be idempotent and synchronized. [Implementation] [6]
Web Resources
1. Java Class Thread: http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/Thread.html
2. Concurrency: http://download.oracle.com/javase/tutorial/essential/concurrency/
3. http://download.oracle.com/javase/1.4.2/docs/api/java/lang/Thread.html
4. http://www.javabeginner.com/learn-java/java-threads-tutorial
5. http://en.wikipedia.org/wiki/Synchronization
6. http://download.oracle.com/javase/tutorial/essential/concurrency/sync.html
LAB 6: Concurrency with Semaphores
Objective
We shall learn Java Semaphores and apply them in various practical problems.
Tools
Java Network Programming.
Operating System: Ubuntu
THEORY
Critical Section is a segment of code that only one thread at a time is allowed access. For example, a
critical section might manipulate a particular data structure or use some resource that supports at most
one client at a time. By placing a lock around this section, you exclude other threads from making
changes that might affect the correctness of your code. Locks (such as Mutex) are used to protect the
critical section.
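As a minimal sketch of a lock protecting a critical section, the following guards a plain ArrayList, which is not safe for concurrent modification on its own, with a ReentrantLock from java.util.concurrent.locks. The class name CriticalSectionSketch and its methods are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class for illustration only.
class CriticalSectionSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final List<Integer> log = new ArrayList<>(); // shared data structure

    void append(int value) {
        lock.lock();          // enter the critical section
        try {
            log.add(value);   // only one thread at a time executes this line
        } finally {
            lock.unlock();    // always leave the critical section, even on an exception
        }
    }

    int size() {
        lock.lock();
        try {
            return log.size();
        } finally {
            lock.unlock();
        }
    }

    // Runs several threads that append to the same list and returns its final size.
    static int runDemo(int threads, int appendsPerThread) {
        CriticalSectionSketch shared = new CriticalSectionSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < appendsPerThread; i++) {
                    shared.append(i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println("entries: " + runDemo(4, 1000));
    }
}
```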
Semaphores:
Semaphores are used for mutual exclusion and thread synchronization. Instead of busy waiting and
wasting CPU cycles, a thread can block on a semaphore (the operating system removes the thread
from the CPU scheduling or "ready" queue) if it must wait to enter its critical section or if the
resource it wants is not available.
Semaphores which allow an arbitrary resource count are called counting semaphores, while
semaphores which are restricted to the values 0 and 1 (or locked/unlocked, unavailable/ available) are
called binary semaphores.
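To make the idea of a resource count concrete, the sketch below lets several worker threads compete for a limited number of permits and records the largest number of threads that were inside the guarded region at the same moment. The class name CountingSemaphoreSketch, the 20 ms simulated work, and the bookkeeping counters are assumptions made for illustration.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class for illustration only.
class CountingSemaphoreSketch {
    // Returns the highest number of threads observed inside the guarded region at once.
    static int maxConcurrent(int permits, int workers) {
        Semaphore slots = new Semaphore(permits);
        AtomicInteger inside = new AtomicInteger(); // threads currently in the region
        AtomicInteger peak = new AtomicInteger();   // largest value of `inside` seen so far
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    slots.acquire(); // blocks while all permits are taken
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return; // no permit was obtained, so nothing to release
                }
                try {
                    int now = inside.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    Thread.sleep(20); // simulate using the resource
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inside.decrementAndGet();
                    slots.release();
                }
            });
            threads[i].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak with 2 permits: " + maxConcurrent(2, 6));
        System.out.println("peak with 1 permit:  " + maxConcurrent(1, 4));
    }
}
```

With one permit the semaphore behaves as a binary semaphore, so the peak can never exceed one.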
Wait and Signals:
One important property of these semaphore variables is that their value cannot be changed except by
using the wait() and signal() functions.
Semaphores are operated by two operations, historically denoted as V (also known as signal()) and P
(or wait()). Operation V increments the semaphore S and operation P decrements it.
A simple way to understand wait() and signal() operations is:
wait(): Decrements the value of semaphore variable by 1. If the value becomes negative, the process
executing wait() is blocked, i.e., added to the semaphore's queue.
signal(): Increments the value of semaphore variable by 1. After the increment, if the pre-increment
value was negative (meaning there are processes waiting for a resource), it transfers a blocked process
from the semaphore's waiting queue to the ready queue.
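The wait() and signal() rules above can be traced with a small bookkeeping model. This is a single-threaded sketch that only records the semaphore value and the queue of blocked process names; it does not block anything and is not a usable semaphore. The class name SemaphoreTrace and the method names waitOp and signalOp are hypothetical (wait is already a method of java.lang.Object, so a different name is used).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical bookkeeping model for illustration only; not thread-safe.
class SemaphoreTrace {
    private int value;
    private final Deque<String> blocked = new ArrayDeque<>();

    SemaphoreTrace(int initial) {
        value = initial;
    }

    // wait(): decrement; if the value becomes negative the caller joins the queue.
    // Returns true if the process may proceed, false if it is blocked.
    boolean waitOp(String process) {
        value--;
        if (value < 0) {
            blocked.addLast(process);
            return false;
        }
        return true;
    }

    // signal(): increment; if the pre-increment value was negative, move the first
    // blocked process to the ready queue. Returns its name, or null if none waited.
    String signalOp() {
        int before = value;
        value++;
        return before < 0 ? blocked.pollFirst() : null;
    }

    int value() {
        return value;
    }

    public static void main(String[] args) {
        SemaphoreTrace s = new SemaphoreTrace(1);
        System.out.println("P1 proceeds: " + s.waitOp("P1") + ", S = " + s.value());
        System.out.println("P2 proceeds: " + s.waitOp("P2") + ", S = " + s.value());
        System.out.println("signal wakes: " + s.signalOp() + ", S = " + s.value());
        System.out.println("signal wakes: " + s.signalOp() + ", S = " + s.value());
    }
}
```

In this model a negative value tells you how many processes are waiting, which is why signal() checks the value before the increment.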
java.lang.Object java.util.concurrent.Semaphore
All Implemented Interfaces: Serializable
public class Semaphore extends Object implements Serializable
Semaphores are available in Java as java.util.concurrent.Semaphore.
Constructor
public Semaphore(int permits)
Creates a Semaphore with the given number of permits and nonfair fairness setting.
Method Details
The methods acquire() and release() are used here instead of P (Wait) and V(Signal).
Each acquire() blocks if necessary until a permit is available, and then takes it. Each release() adds a
permit, potentially releasing a blocking acquirer.
public void acquire() throws InterruptedException
Acquires a permit from this semaphore, blocking until one is available, or the thread is interrupted.
If a permit is available, acquire() takes it and returns immediately, reducing the number of available permits by one.
If no permit is available, the current thread becomes disabled for thread scheduling purposes and lies dormant until one of two things happens:
• Some other thread invokes the release() method for this semaphore and the current thread is next to be assigned a permit; or
• Some other thread interrupts the current thread.
public void release()
Releases a permit, returning it to the semaphore. This increases the number of available permits by one. If any threads are trying to acquire a permit, then one is selected and given the permit that was just released. That thread is (re)enabled for thread scheduling purposes.
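The permit accounting described for acquire() and release() can be observed directly with availablePermits(). The sketch below stays on a single thread, so it uses acquireUninterruptibly() and tryAcquire() rather than the blocking acquire(). The class name PermitAccountingSketch and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.concurrent.Semaphore;

// Hypothetical demo class for illustration only.
class PermitAccountingSketch {
    // Records the number of available permits after each step.
    static int[] permitTrace() {
        Semaphore s = new Semaphore(2);
        int[] seen = new int[4];
        seen[0] = s.availablePermits(); // initial count
        s.acquireUninterruptibly();     // takes one permit
        seen[1] = s.availablePermits();
        s.acquireUninterruptibly();     // takes the last permit
        seen[2] = s.availablePermits();
        s.release();                    // returns one permit
        seen[3] = s.availablePermits();
        return seen;
    }

    // tryAcquire() never blocks: it reports whether a permit could be taken.
    static boolean tryAcquireWhenEmpty() {
        Semaphore s = new Semaphore(1);
        s.acquireUninterruptibly();
        return s.tryAcquire();
    }

    public static void main(String[] args) {
        System.out.println("permits after each step: " + Arrays.toString(permitTrace()));
        System.out.println("tryAcquire with no permits: " + tryAcquireWhenEmpty());
    }
}
```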
Counting Semaphore Example in Java (Binary Semaphore)
A semaphore with one permit is known as a binary semaphore because it has only two states: permit
available or permit unavailable. A binary semaphore can be used to implement mutual exclusion, that is,
a critical section where only one thread is allowed to execute at a time. A thread waits on acquire()
until the thread inside the critical section releases the permit by calling release() on the semaphore.
Here is a simple example in Java that uses a binary semaphore (a counting semaphore with a single
permit) to provide mutually exclusive access to a critical section of code:
import java.util.concurrent.Semaphore;

public class SemaphoreTest {
    Semaphore binary = new Semaphore(1);

    public static void main(String args[]) {
        final SemaphoreTest test = new SemaphoreTest();
        new Thread() {
            @Override
            public void run() {
                test.mutualExclusion();
            }
        }.start();
        new Thread() {
            @Override
            public void run() {
                test.mutualExclusion();
            }
        }.start();
    }

    private void mutualExclusion() {
        try {
            binary.acquire();
            // mutual exclusive region
            System.out.println(Thread.currentThread().getName()
                    + " inside mutual exclusive region");
            Thread.sleep(1000);
        } catch (InterruptedException ie) {
            ie.printStackTrace();
        } finally {
            binary.release();
            System.out.println(Thread.currentThread().getName()
                    + " outside of mutual exclusive region");
        }
    }
}
The Output of the above program is given below:
Thread-0 inside mutual exclusive region
Thread-0 outside of mutual exclusive region
Thread-1 inside mutual exclusive region
Thread-1 outside of mutual exclusive region
Tasks
1. Copy the program given in PROGRAMS (3) into your IDE and run it. Observe the output and write it here. [2]
2. In the program from Exercise 1, remove the calls binary.acquire(); and binary.release(); from the method private void mutualExclusion(). Observe the output. Is the output the same as in Exercise 1? If not, why? [3]
3. Show the trace of a simulation in Exercise 2, highlighting the following cases: [5]
a. Normal execution.
b. Producer is blocked.
c. Consumer is blocked.
Major Task
4. Create your own class MySemaphore that provides the functionality of a binary semaphore. [5]
5. MySemaphore should implement all methods as given in the original Semaphore. [5]
Web Resources
1. http://www.javabeginner.com/learn-java/java-threads-tutorial
2. http://en.wikipedia.org/wiki/Synchronization
3. http://download.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/Semaphore.html
LAB 7: Java RMI
OBJECTIVE
• How Java RMI works
• How RMI programs can be categorized
• How RMI classes are compiled and executed
Theory
Remote Procedure Call
Birrell and Nelson (1984):
“To allow programs to call procedures located on other machines.”
This effectively removes the need for the distributed systems programmer to worry about all the
details of network programming (i.e. no more sockets).
– It abstracts the communication interface to the level of a procedure call.
– Instead of working directly with sockets, the programmer has the illusion of calling a local
procedure, when in fact the arguments of the call are packaged up and shipped off to the
remote target of the call.
– RPC systems encode arguments and return values using an external data representation, such
as XDR.
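The "packaged up and shipped off" step can be sketched without any network at all. Below, a call is marshalled into a byte array on the client side and unpacked and dispatched on the server side. The class name RpcMarshalSketch, the procedure names "add" and "sub", and the message layout (name, then two long arguments) are assumptions made for illustration. DataOutputStream's big-endian encoding stands in for an external data representation here; it is not XDR itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class for illustration only.
class RpcMarshalSketch {
    // Client stub side: pack the procedure name and its arguments into one message.
    static byte[] marshalCall(String procedure, long a, long b) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(procedure);
            out.writeLong(a);
            out.writeLong(b);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Server side: unpack the message, run the requested procedure, return the result.
    static long dispatch(byte[] request) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(request))) {
            String procedure = in.readUTF();
            long a = in.readLong();
            long b = in.readLong();
            switch (procedure) {
                case "add": return a + b;
                case "sub": return a - b;
                default: throw new IllegalArgumentException("unknown procedure: " + procedure);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] request = marshalCall("add", 4, 5);
        System.out.println("request size in bytes: " + request.length);
        System.out.println("result: " + dispatch(request));
    }
}
```

In a real RPC system the byte array would travel over a socket between the two halves, and the result would be marshalled back the same way.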
Remote Method Invocation (RMI) is an extension of local method invocation that allows an object
living in one process to invoke the methods of an object living in another process.
Remote Objects
• Objects that can receive remote method invocations are called remote objects and they
implement a remote interface.
Client Server Model
• Client side: Send a request to server to execute a particular method of an object. A typical
client application gets a remote reference to one or more remote objects in the server and then
invokes methods on them.
• Server side: Remote objects implement an interface that defines the methods available to
clients, so the interface determines whether a method has been called properly. A typical server
application creates a number of remote objects, makes references to those remote objects
accessible, and waits for clients to invoke methods on those remote objects.
Service and Remote Interface
• Service Interface: In client server model, each server provides a certain set of procedures to
the clients.
• Remote Interface: Specifies functions of an object accessible to the outside world. Can pass
objects as arguments & return object as results.
• RMI provides the mechanism by which the server and the client communicate and pass
information back and forth.
• Such an application is sometimes referred to as a distributed object application.
Distributed object applications need to:
• Locate remote objects
• Communicate with remote objects
• Load class byte-codes for objects that are passed as parameters or return values
Task:
We will create a Calculator Service that will provide some basic arithmetic functions to its clients. We therefore need to define an interface, its implementation, the server and the clients.
First Step: Create Calculator Interface
public interface Calculator extends java.rmi.Remote {
    public long add(long a, long b)
        throws java.rmi.RemoteException;
    public long sub(long a, long b)
        throws java.rmi.RemoteException;
    public long mul(long a, long b)
        throws java.rmi.RemoteException;
    public long div(long a, long b)
        throws java.rmi.RemoteException;
}
Tasks:
1. Write the above code in IDE and compile it. [1]
2. What is the purpose of adding extends java.rmi.Remote? [2]
3. Will the code still compile if we remove throws java.rmi.RemoteException? Write your observation. [2]
public class CalculatorImpl extends java.rmi.server.UnicastRemoteObject
        implements Calculator {
    // Implementations must have an explicit constructor
    // in order to declare the RemoteException exception
    public CalculatorImpl() throws java.rmi.RemoteException {
        super();
    }
    public long add(long a, long b)
        throws java.rmi.RemoteException {
        return a + b;
    }
    public long sub(long a, long b)
        throws java.rmi.RemoteException {
        return a - b;
    }
    public long mul(long a, long b)
        throws java.rmi.RemoteException {
        return a * b;
    }
}
Tasks:
4. Write the above code in IDE and compile it. [1]
5. What is the purpose of adding extends java.rmi.server.UnicastRemoteObject? [3]
6. The above code does not compile successfully. What are the errors? What corrections do you need to make? Make the corrections and compile again. [4]
Calculator Client
import java.rmi.Naming;
import java.rmi.RemoteException;
import java.net.MalformedURLException;
import java.rmi.NotBoundException;
public class CalculatorClient {
    public static void main(String[] args) {
        try {
            Calculator c = (Calculator)
                Naming.lookup("rmi://localhost/CalculatorService");
            System.out.println( c.sub(4, 3) );
            System.out.println( c.add(4, 5) );
            System.out.println( c.mul(3, 6) );
            System.out.println( c.div(9, 3) );
        }
        catch (MalformedURLException malFurl) {
            System.out.println();
            System.out.println("Mal Formed URL");
            System.out.println(malFurl);
        }
        catch (RemoteException re) {
            System.out.println();
            System.out.println("RemoteException");
            System.out.println(re);
        }
        catch (NotBoundException nbe) {
            System.out.println();
            System.out.println("NotBoundException");
            System.out.println(nbe);
        }
        catch (java.lang.ArithmeticException ae) {
            System.out.println();
            System.out.println("java.lang.ArithmeticException");
            System.out.println(ae);
        }
    }
}
Tasks:
7. Write the above code in IDE and compile. [1]
8. Modify the code so that client and server processes run on different machines. [Implementation] [2]
import java.rmi.Naming;
public class CalculatorServer
{
    public static void main(String args[]) {
        System.out.println("Calculator Server Running ...");
        try {
            Calculator c = new CalculatorImpl();
            Naming.rebind("rmi://localhost:1099/CalculatorService", c);
        } catch (Exception e) {
            System.out.println("Trouble: " + e);
        }
    }
}
Tasks
9. Along the same lines as the Calculator Service example, create a Power service. It will provide two methods to its remote users: a power method and a square method. The service should be registered with the RMI registry under the name “POWER SERVICE”. [5]
Web Resources
1. http://download.oracle.com/javase/tutorial/rmi/index.html
2. http://www.oracle.com/technetwork/java/javase/tech/index-jsp-136424.html
3. http://www.eg.bucknell.edu/~cs379/DistributedSystems/rmi_tut.html
LAB 8: Major Assignment – Design and Develop a Remote Method Invocation API
Objective
Design and develop a complete Remote Method Invocation API. You should be able to create the different modules of the API with help from Java RMI. (Note: You can submit the assignment during the week, but at least 60% should be completed in
the lab today.)
Theory and Tasks
The RMI software is a layer of software between the application-level objects and the communication
and remote reference modules. It consists of:
– Proxy
– Dispatcher
– Skeleton
• The complete functionality is depicted in the figure below.
Proxy
The role of a proxy is to make RMI transparent to clients by behaving like a local object to the invoker.
Instead of executing an invocation, it forwards it in a message to the remote object.
Implementation of Remote Interface on Client Side
When a client binds to a distributed object, a (virtual) implementation of the object's interface,
called a proxy, is loaded into the client's address space.
▫ There is one proxy for every remote object for which a process holds the remote object reference (ROR).
▫ The proxy implements the methods of the remote interface, but quite differently from the remote
object. Each method of the proxy marshals a reference to the target object, its own method id and its
arguments into a request message and sends it to the target. It then waits for the reply message,
un-marshals it and returns the results to the invoker.
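The behaviour described above can be sketched with java.lang.reflect.Proxy. In this minimal, single-process sketch every call on the proxy is turned into a request object (target name, method id, arguments) and handed to a transport function that stands in for the communication module. The names Adder, Request, createProxy and the use of the method name as method id are assumptions made for this example, not part of Java RMI.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.function.Function;

class ProxySketch {

    // Hypothetical remote interface used only for this example.
    interface Adder {
        long add(long a, long b);
    }

    // Hypothetical request message: which object, which method, which arguments.
    static final class Request {
        final String objectName;
        final String methodId;
        final Object[] arguments;

        Request(String objectName, String methodId, Object[] arguments) {
            this.objectName = objectName;
            this.methodId = methodId;
            this.arguments = arguments;
        }
    }

    // Builds a proxy whose method calls are converted into Request objects and passed to
    // 'transport', which stands in for the communication module and returns the reply value.
    @SuppressWarnings("unchecked")
    static <T> T createProxy(Class<T> remoteInterface, String objectName,
                             Function<Request, Object> transport) {
        InvocationHandler handler = (proxy, method, args) ->
                transport.apply(new Request(objectName, method.getName(), args));
        return (T) Proxy.newProxyInstance(
                remoteInterface.getClassLoader(), new Class<?>[] { remoteInterface }, handler);
    }

    public static void main(String[] args) {
        // Stand-in for the server side: performs the addition named in the request.
        Function<Request, Object> fakeServer = request -> {
            long a = (Long) request.arguments[0];
            long b = (Long) request.arguments[1];
            return a + b;
        };
        Adder adder = createProxy(Adder.class, "CalculatorService", fakeServer);
        System.out.println(adder.add(4, 5)); // the caller sees an ordinary method call
    }
}
```

In a real implementation the transport would marshal the request into bytes, send it over a socket and block until the reply arrives.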
[Figure: RMI components. Client: object A, proxy for B, remote reference module, communication module. Server: communication module, remote reference module, skeleton and dispatcher for B's class, remote object B. A Request message travels from client to server and a Reply message travels back.]
Tasks
1. Create a class that implements the message data structure. It should have the following components: messageType, requestId, objectReference, methodId and arguments. [3]
2. In order to make an object remote, you have to create a Remote Object Reference (ROR) for the object. Develop functionality that creates the ROR for each object that is to be remotely accessed. The format of the ROR used in Java RMI (Internet address, port number, time, object number, interface of remote object) is given below. [3]
Binder
Client programs require a means of obtaining the remote object reference for at least one of the remote
objects hosted by the server. A binder in a distributed system is a separate service that maintains a table
containing mappings from textual names to remote object references. An instance of it runs on every server
that holds remote objects. It is used by servers to register their remote objects by name and by
clients to look up the remote object references. Names take the form:
//computer name:port/object name
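The table of name-to-reference mappings can be pictured as a map with bind, rebind and lookup operations. The sketch below keeps the table in memory and is generic in the reference type R, so it does not assume any particular ROR class. The class name Binder and the exceptions thrown are choices made for this example; a real binder would also have to be reachable over the network.

```java
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory binder; R is whatever type represents a remote object reference.
class Binder<R> {
    private final Map<String, R> table = new ConcurrentHashMap<>();

    // Register a reference under a name; fails if the name is already in use.
    void bind(String name, R reference) {
        if (table.putIfAbsent(name, reference) != null) {
            throw new IllegalStateException("already bound: " + name);
        }
    }

    // Register a reference under a name, replacing any previous binding.
    void rebind(String name, R reference) {
        table.put(name, reference);
    }

    // Return the reference registered under a name; fails if there is none.
    R lookup(String name) {
        R reference = table.get(name);
        if (reference == null) {
            throw new NoSuchElementException("not bound: " + name);
        }
        return reference;
    }

    public static void main(String[] args) {
        Binder<String> binder = new Binder<>();
        // A plain string stands in for a real remote object reference here.
        binder.bind("//localhost:1099/CalculatorService", "reference-to-calculator");
        System.out.println(binder.lookup("//localhost:1099/CalculatorService"));
    }
}
```

A concurrent map is used because servers and clients may call the binder from different threads at the same time.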
Tasks
3. Create a Naming Server (Binder) that can bind the services (server-side objects) with a Service Name. [Implementation] [2]
4. The Naming server should provide the ability for lookup. [Implementation] [2]
5. Server side should be able to create Remote Object. [Implementation] [2]
Message structure (for Task 1):
Field              Type
messageType        int (0 = Request, 1 = Reply)
requestId          int
objectReference    RemoteObjectRef
methodId           int or Method
arguments          array of bytes
Remote object reference format (for Task 2):
Field                         Size
Internet address              32 bits
port number                   32 bits
time                          32 bits
object number                 32 bits
interface of remote object    (no size given)
Skeleton
The class of a remote object has a skeleton, which implements the methods in the remote interface.
Incoming invocation messages are first passed to the skeleton, which un-marshals them into proper remote
method invocations at the object's interface at the server, i.e. it un-marshals the arguments in the
request message and invokes the corresponding method in the servant.
It waits for the invocation to complete, then marshals the result, together with any exceptions, into a reply message
and sends it back to the client's proxy.
The actual object resides on the server machine, where it offers the same interface as it does on the client
machine.
• The server has one dispatcher and skeleton for every remote object.
Dispatcher
Dispatcher receives a request message from the communication module.
It uses the method-id to select the appropriate method in the skeleton passing on the request message.
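One way to picture the selection step is with Java reflection: the dispatcher looks up the method whose identifier matches the one in the request and invokes it on the servant with the un-marshalled arguments. The sketch below is a minimal single-process illustration. It uses the method name as method id and merges the dispatcher and skeleton roles into one dispatch method; both are simplifying assumptions, and CalculatorServant is a hypothetical class.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class DispatcherSketch {

    // Hypothetical servant offering two calculator operations.
    static class CalculatorServant {
        public long add(long a, long b) { return a + b; }
        public long sub(long a, long b) { return a - b; }
    }

    // Select the public servant method whose name equals methodId and whose parameter count
    // matches, then invoke it with the already un-marshalled arguments.
    static Object dispatch(Object servant, String methodId, Object... arguments) {
        for (Method method : servant.getClass().getMethods()) {
            if (method.getName().equals(methodId)
                    && method.getParameterCount() == arguments.length) {
                try {
                    return method.invoke(servant, arguments);
                } catch (IllegalAccessException | InvocationTargetException e) {
                    throw new IllegalStateException("invocation failed: " + methodId, e);
                }
            }
        }
        throw new IllegalArgumentException("no such method: " + methodId);
    }

    public static void main(String[] args) {
        CalculatorServant servant = new CalculatorServant();
        System.out.println(dispatch(servant, "add", 4L, 5L));
        System.out.println(dispatch(servant, "sub", 4L, 3L));
    }
}
```

Matching on name and parameter count alone cannot tell overloaded methods apart; a fuller design would use a numeric method id or the full signature.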
Tasks [Implementation]
6. Design and develop your own Skeleton and Dispatcher. [3]
7. Design and develop the communication modules. [3]
8. You should be able to simulate the functionality of the Calculator Service (from the previous lab) using your own API. [4]
Web Resources
1. http://download.oracle.com/javase/tutorial/rmi/index.html
2. http://www.oracle.com/technetwork/java/javase/tech/index-jsp-136424.html
3. http://www.eg.bucknell.edu/~cs379/DistributedSystems/rmi_tut.html
LAB 9: XML document validation
Objective
To learn how to design XML Schema and XML instance document
Tools
GUI-IDE Tool NetBeans 6.0
Theory
XML validation is the process of checking a document written in XML (eXtensible Markup
Language) to confirm that it is both well-formed and also "valid" in that it follows a defined structure.
A well-formed document follows the basic syntactic rules of XML, which are the same for all XML
documents. A valid document also respects the rules dictated by a particular DTD or XML schema,
which reflect application-specific choices.
An XML schema defines the structure of the elements and attributes in an XML document. For an
XML document to be valid based on an XML schema, the XML document has to be validated against
the XML schema.
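The difference between well-formed and valid can be tried out with a few lines of Java. The sketch below uses the javax.xml.validation package of the JDK rather than the DocumentBuilder approach this lab describes, simply because it is the shortest way to get a yes/no answer. The tiny note schema and the class name ValidityCheck are made up for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class ValidityCheck {

    // Tiny schema for this example only: a <note> element holding exactly one <to> string.
    static final String SCHEMA =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='note'><xs:complexType><xs:sequence>"
        + "<xs:element name='to' type='xs:string'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Returns true only if the document is well-formed AND follows SCHEMA.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(SCHEMA)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, or well-formed but violating the schema
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<note><to>Student</to></note>"));   // well-formed and valid
        System.out.println(isValid("<note><from>Lab</from></note>"));   // well-formed, not valid
        System.out.println(isValid("<note><to>Student</note>"));        // not well-formed
    }
}
```

The second document would be accepted by a non-validating parser, since its tags are properly nested; only the schema check rejects it.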
In this lab, JAXP parsers are used to validate an XML document with an XML schema. In JAXP,
DocumentBuilder classes are used to validate an XML document. XML schema validation is
illustrated with an XML document comprising a catalog.
Preliminary Setup
To validate an XML document with the Xerces2-j parser, the Xerces2-j classes need to be in the classpath.
To validate an XML document with the JAXP parser, its DocumentBuilder classes need to be in the
classpath.
Overview
In this lab, an example XML document named catalog.xml is used.
<?xml version="1.0" encoding="UTF-8"?>
<!--A OnJava Journal Catalog-->
<catalog
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="file://c:/Schemas/catalog.xsd"
    title="OnJava.com" publisher="O'Reilly">
  <journal date="April 2004">
    <article>
      <title>Declarative Programming in Java</title>
      <author>Narayanan Jayaratchagan</author>
    </article>
  </journal>
  <journal date="January 2004">
    <article>
      <title>Data Binding with XMLBeans</title>
      <author>Daniel Steinberg</author>
    </article>
  </journal>
</catalog>
Task
1. Create an XML file as shown above. [1]
2. What does noNamespaceSchemaLocation mean? [2]
The example XML document is validated with an example XML schema file, catalog.xsd. The
elements in this schema document are in the XML schema namespace of
http://www.w3.org/2001/XMLSchema.
<?xml version="1.0" encoding="utf-8"?>
<xs:schema
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="catalog">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="journal" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="title" type="xs:string"/>
      <xs:attribute name="publisher" type="xs:string"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="journal">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="article" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="date" type="xs:string"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="article">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element ref="author" minOccurs="0"
            maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="author" type="xs:string"/>
</xs:schema>
In the following sections, we'll discuss validation of the example XML document, catalog.xml, with
the example schema document, catalog.xsd.
Validation of an XML Document with the JAXP Parser
To begin, import the DocumentBuilderFactory and DocumentBuilder classes.
The DocumentBuilder class is used to obtain an org.w3c.dom.Document document from an XML
document, while the DocumentBuilderFactory class is used to obtain a DocumentBuilder parser.
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
To validate with a DocumentBuilder parser, set
the System property javax.xml.parsers.DocumentBuilderFactory:
System.setProperty("javax.xml.parsers.DocumentBuilderFactory",
"org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");
Next, you need to create a DocumentBuilderFactory.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
The DocumentBuilderFactory implementation is located by a lookup procedure that checks the
javax.xml.parsers.DocumentBuilderFactory system property first and otherwise falls back to a default implementation.
To parse an XML document with a namespace, set the setNamespaceAware() feature to true. By
default, the setNamespaceAware() feature is set to false.
factory.setNamespaceAware(true);
Set the setValidating() feature of the DocumentBuilderFactory to true to make the parser a validating
parser. By default, the setValidating() feature is set to false.
factory.setValidating(true);
Set the schemaLanguage and schemaSource attributes of the DocumentBuilderFactory.
The schemaLanguage attribute specifies the schema language for validation.
The schemaSource attribute specifies the XML schema document to be used for validation.
factory.setAttribute (
"http://java.sun.com/xml/jaxp/properties/schemaLanguage",
"http://www.w3.org/2001/XMLSchema");
factory.setAttribute (
"http://java.sun.com/xml/jaxp/properties/schemaSource",
SchemaUrl);
Create a DocumentBuilder parser.
DocumentBuilder builder = factory.newDocumentBuilder ();
This returns a new DocumentBuilder, with the parameters configured in
the DocumentBuilderFactory. Create and register an ErrorHandler with the parser.
Validator handler=new Validator();
builder.setErrorHandler(handler);
Validator is a class that extends the DefaultHandler class. The DefaultHandler class implements
the ErrorHandler interface. The Validator class is listed in the previous section. Parse the XML
document with the DocumentBuilder parser. The different parse methods are parse(InputStream
is), parse(File f), parse(InputSource is), parse(InputStream is, String systemId), and parse(String uri).
builder.parse (XmlDocumentUrl);
Validator, an ErrorHandler of the type DefaultHandler, registers errors generated by the validation.
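The steps above can be put together in one small program. The sketch below is an assumption-laden illustration, not the original listing. It supplies its own minimal Validator class, since none is listed in this manual, and it uses the parser bundled with the JDK instead of setting the Xerces system property. It also passes the schema and the document as in-memory streams so that it runs without catalog.xml or catalog.xsd on disk. The names SchemaValidationDemo, validate, validationError and saxParseException are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class SchemaValidationDemo {

    // Parses 'xml' with a validating DocumentBuilder configured for the schema text 'xsd'
    // and returns the handler that recorded any validation errors.
    static Validator validate(String xml, String xsd) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            factory.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaLanguage",
                    "http://www.w3.org/2001/XMLSchema");
            factory.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaSource",
                    new ByteArrayInputStream(xsd.getBytes(StandardCharsets.UTF_8)));
            DocumentBuilder builder = factory.newDocumentBuilder();
            Validator handler = new Validator();
            builder.setErrorHandler(handler);
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return handler;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Reached, for example, when the document is not well-formed.
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
                   + "<xs:element name='author' type='xs:string'/></xs:schema>";
        System.out.println(validate("<author>Daniel Steinberg</author>", xsd).validationError);
        System.out.println(validate("<editor>Unknown</editor>", xsd).validationError);
    }
}

// Minimal sketch of the error handler the text calls Validator: it remembers the first error.
class Validator extends DefaultHandler {
    boolean validationError = false;
    SAXParseException saxParseException = null;

    @Override
    public void error(SAXParseException exception) {
        record(exception);
    }

    @Override
    public void fatalError(SAXParseException exception) {
        record(exception);
    }

    @Override
    public void warning(SAXParseException exception) {
        // Warnings are ignored in this sketch.
    }

    private void record(SAXParseException exception) {
        if (!validationError) {
            validationError = true;
            saxParseException = exception;
        }
    }
}
```

After parsing, validationError tells whether the document violated the schema, and saxParseException carries the message and line number of the first violation.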
Tasks
3. Design a schema for a student list. A student has information such as name, semester, roll no, email-ids, phone-nos, etc. [3]
4. Write an XML instance document for the schema designed in the above task. [7]
5. Validate this instance document against the schema. [Implementation] [7]
Web Resources
1. http://www.onjava.com/pub/a/onjava/2004/09/15/schema-validation.html
2. http://www.w3schools.com/Schema/default.asp
3. http://www.w3.org/XML/Schema
LAB 10: Web service, WSDL based, from Java source
Objective
WSDL based: Implement ArithmeticService that implements add, and subtract operations
Tools
GUI-IDE Tool NetBeans 6.0
Theory.
A web service is a method of communication between two electronic devices over the World Wide
Web. It is a software function provided at a network address over the web, with the service always on,
as in the concept of utility computing.
The W3C defines a Web service as:
a software system designed to support interoperable machine-to-machine interaction over a
network. It has an interface described in a machine-processable format (specifically WSDL).
Other systems interact with the Web service in a manner prescribed by its description using
SOAP messages, typically conveyed using HTTP with an XML serialization in conjunction
with other Web-related standards.
The W3C also states:
We can identify two major classes of Web services:
REST-compliant Web services, in which the primary purpose of the service is to manipulate
XML representations of Web resources using a uniform set of stateless operations; and
Arbitrary Web services, in which the service may expose an arbitrary set of operations.
Web Service Architecture
The Web Services Description Language is an XML-based interface description language that is
used for describing the functionality offered by a web service. A WSDL description of a web service
(also referred to as a WSDL file) provides a machine-readable description of how the service can be
called, what parameters it expects, and what data structures it returns. It thus serves a purpose that
corresponds roughly to that of a method signature in a programming language.
WSDL is often used in combination with SOAP and an XML Schema to provide Web services over
the Internet. A client program connecting to a Web service can read the WSDL file to determine what
operations are available on the server. Any special datatypes used are embedded in the WSDL file in
the form of XML Schema. The client can then use SOAP to actually call one of the operations listed
in the WSDL file using for example XML over HTTP.
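To make "XML over HTTP" more concrete, the sketch below builds the kind of SOAP 1.1 envelope a client might send to call an add operation and reads the result out of a reply. It uses only the XML parser that ships with the JDK and sends nothing over the network. The service namespace http://websvc/, the parameter names i and j, and the return element are assumptions about what the generated service would use; check the actual WSDL for the real names.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class SoapEnvelopeSketch {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"; // SOAP 1.1
    static final String SERVICE_NS = "http://websvc/"; // assumed target namespace of the service

    // Build the request envelope for add(i, j); element names are assumptions (see lead-in).
    static String addRequest(int i, int j) {
        return "<soapenv:Envelope xmlns:soapenv=\"" + SOAP_NS + "\" xmlns:ws=\"" + SERVICE_NS + "\">"
             + "<soapenv:Body><ws:add><i>" + i + "</i><j>" + j + "</j></ws:add></soapenv:Body>"
             + "</soapenv:Envelope>";
    }

    // Extract the integer held in the first <return> element of a reply envelope.
    static int readResult(String responseXml) {
        Document document = parse(responseXml);
        String text = document.getElementsByTagName("return").item(0).getTextContent();
        return Integer.parseInt(text.trim());
    }

    // Parse an XML string into a DOM document with namespace support switched on.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addRequest(4, 5));
        // Hand-written reply in the shape such a service might return.
        String reply = "<S:Envelope xmlns:S=\"" + SOAP_NS + "\"><S:Body>"
                     + "<ns2:addResponse xmlns:ns2=\"" + SERVICE_NS + "\"><return>9</return>"
                     + "</ns2:addResponse></S:Body></S:Envelope>";
        System.out.println(readResult(reply));
    }
}
```

In the lab itself the IDE generates client classes from the WSDL, so these envelopes are produced and parsed automatically.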
In this lab, we shall implement a Web Service and its Client.
Tasks [Implementation]
Creation of ArithmeticService Web Service [5]
1. Create a Project of type Web application. Give it name Arithmetic.
i) Right click on Project folder, select New, select a Web service. A dialog box will appear.
Specify Name of Web service (ArithmeticService), package name (websvc), and select
option “Create Web service from scratch”
ii) Java source file can be seen in Source view or Design view. From design view, you will be
able to add operations. While adding operations, you have to specify name of operation, return
type, names and types of input parameters.
2. Go in source view, and provide definition of Web service operations.
Creation of web service Client [Implementation] [5]
3. Create a new project of type Java application. Give it name ArithmeticClient.
i. Right click on project folder and select New Web service client.
ii. A dialog box will appear asking location of WSDL and client.
For WSDL specify
http://localhost:8080/Arithmetic/ArithmeticServiceService?WSDL
and for client specify
websvcclient in package option.
Make sure Style is JAX-WS.
4. Right click in source code of Main class. Select option “Web service client resource”
Call web service operation.
A new dialog box will appear asking for selecting name of operation. Select “add”
operation.
Major Task [Implementation]
5. On similar lines as for Arithmetic Service, design and implement a TrigonometricService that
implements sin, and cos operations. [5]
6. Create a Client Application that calls the service created in Step-5 [5]
Web Resources
1. http://en.wikipedia.org/wiki/Web_service
2. http://www.w3.org/TR/wsdl
3. http://www.w3.org/TR/ws-arch/
LAB 11: Development of Ontologies using Protégé
Objective
1. Introduction to Protégé
2. Development of Ontologies using Protégé
Tools
Stanford Protégé
Theory.
Protégé
• Free, open-source ontology editor and knowledge-base framework
• Based on Java, is extensible, and provides a plug-and-play environment
• Supported by a strong community of developers and academic, government and corporate users
Pure OWL Framework
• Supports both OWL 1.1 and OWL 2.0
• Direct connection with OWL reasoners: Pellet, FaCT++, HermiT
Classes
• Sets that contain individuals
Thing
• Class representing the set containing all individuals
• All classes are subclasses of Thing
Properties
• Binary relations between two individuals (Object Property) or one individual and a datatype (Datatype Property)
Individuals
• Represent objects within the ontology (members of classes)
Subclasses and Superclasses
• A subclass is a subcollection of objects
• For example, the class of laptop computers forms a subcollection of the class containing all (types of) computers. In the same way, the class of all (types of) computers is a superclass of the class of laptop computers.
• OWL does not use the Unique Name Assumption (UNA)
◦ This means that different names may refer to the same individual
◦ E.g. the names “Matt” and “Matthew” may refer to the same individual (or they may not)
• Cardinality restrictions rely on ‘counting’ distinct individuals
◦ Therefore it is important to specify that either “Matt” and “Matthew” are the same individual, or that they are different individuals
• OWL classes are assumed to ‘overlap’
◦ Individuals of a class A can also be individuals of class B
◦ Therefore one cannot assume that an individual is not a member of a particular class simply because it has not been asserted to be a member of that class
• To ‘separate’ a group of classes
◦ One must make them disjoint from one another
◦ If A is disjoint from B, then an individual of class A cannot also be an individual of class B
• Reasoners are programs that interpret the description logic of the ontology and are able to assist in structuring the ontology.
• A class is a category/type/set of individuals within a domain. Example:
◦ Cat1 is an individual of class Animal.
◦ A particular course such as CS101 would be an individual of class Course.
Task:
Using Protégé, add two classes Course and Student. [5]
Constructing Individuals [5]
◦ Creating an individual is a two-step process:
  First, create an individual.
  Second, specify the type/class of the individual.
◦ Create S1, S2 as two individuals of Student class
◦ Create C1, C2 as two individuals of Course class
Using the Reasoner
• In Protégé, from the menu, select Reasoner > Start Reasoner
• From the DL Query tab, enter class expression queries into the query window and write the results in the table. [5]
Class expression       Result
Course
Student
Course and Student
Course or Student
not Course
not(not Course)
Web Resources
1. http://semanticweb.org/wiki/Main_Page
2. http://en.wikipedia.org/wiki/Semantic_Web
3. http://en.wikipedia.org/wiki/Resource_Description_Framework
4. http://protege.stanford.edu/
LAB 12: Using Protégé – Part II
Objective
1. Development of Ontologies using Protege
Tools
Stanford Protégé
Theory.
Specifying Disjointness
• Specify that the classes Student and Course are disjoint
• Specifying disjointness in Protégé:
◦ Proceed to the classes tab and select both the Course and the Student classes.
◦ Now click the Disjoint Classes button to make the Student and the Course classes disjoint.
Consistency checking
◦ Test whether a class could have instances
Classification
◦ A classifier takes a class hierarchy and places a class in the class hierarchy
◦ Task of turning implicit definitions already present in the hierarchy into explicit ones
Selecting…
• Go to the “Reasoner” menu and select “FaCT++” as your reasoner
Running…
• In the same menu, click “Start Reasoner…” or simply type Ctrl-R
Task: Create a university Course class hierarchy [5]
Object Properties
◦ Relationships between two individuals
◦ Correspond to relationships in UML
◦ For example: Person1 hasFriend Person2
Datatype Properties
◦ Relationships between an individual and data values
◦ The term datatype is used to denote the type of a datum.
◦ Correspond to attributes in UML
◦ For example: Person1 hasName Smith
Domain and Range
◦ Properties link individuals from the domain to individuals or datatypes from the range
Characteristics
◦ Specify the meaning of properties
Restrictions
◦ Explained later
Super Properties
◦ Properties can be further refined as sub-properties inheriting the domain, range, characteristics and restrictions
Task: Creating Object Properties in a University Ontology [8]
• Add an object property isTeacherOf that can be used to link a professor to a course
• Similarly, create a property called isTaughtBy that can be used as an inverse link, from a course to the teacher of the course.
Creating Datatype Properties in a University Ontology [7]
Domain
◦ Classes of individuals
Range
◦ XML Schema datatype value (http://www.w3.org/TR/xmlschema-2/)
◦ RDF literal
◦ XML literal
Datatype properties cannot have inverse properties
Web Resources
1. http://semanticweb.org/wiki/Main_Page
2. http://en.wikipedia.org/wiki/Semantic_Web
3. http://protege.stanford.edu/
4. http://en.wikipedia.org/wiki/Resource_Description_Framework
LAB 13: Distributed Databases-I
Objective
To learn and apply the advanced concepts of Distributed Databases
Tools
NetBeans IDE. Operating System: Ubuntu
Theory.
Data
Data, or information, is the currency of the virtual world. Changing trends in technology imply a shift of
interest towards the digital storage and processing of data. Information that was once processed or
managed by dozens of people is now handled by a single computer. Human hours are replaced by
the clock cycles of a processing unit. In this era of digital information it is important to keep pace
with changes in technology; keeping your data safe and available is a goal that is about to
be realised. Generally we define data as:
“Facts and statistics collected together for reference or analysis”
and a database is defined as:
“A structured set that holds data and makes it accessible in various ways”
Software Applications
“Applications are the computer programs that perform a unified task”
For example: Opera (Web Browser), VLC (Media Player), Adobe Photoshop (Graphics
Software)
Database Applications
Applications that store and manage data are database applications. A simple example of a
database application is MySQL.
Web Applications
A web application is a set of web pages hosted by a dedicated machine called a server. Web
applications run in browsers.
A web browser is a desktop application, while a Chrome extension falls in the category of web
applications.
Analytical Applications
Applications that measure the performance of a business are called analytical applications.
These applications are used to produce analytical reports and sometimes to create predictive
analyses, mostly estimating trends and their ripple effects across various business layers.
There are certain properties of analytical applications that differentiate them from
traditional database applications.
1. Analytical queries are less predictable. For database applications the query structure depends on the type of system and the interface that allows user to interact with database, but in case of analytical applications query has to change dynamically as the variable in focus changes.
2. Analytical queries are mostly read oriented, they hardly write anything on the database but what they do write are the analytical reports.
3. Analytical applications mostly focus on attributes instead of entities. Averages, aggregates, maximum, minimum and all other implication relations are the analytical operations. All these functions are called upon a single attribute of all entities.
Analytical Functions in SQL
Count, Min, Max, First, Last, Sum, Variance etc.
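The point that analytical functions work on a single attribute across all entities can be illustrated outside SQL as well. The sketch below computes count, min, max, sum, average and variance over the marks attribute of a small in-memory list using Java streams. The Student class, its fields and the sample values are invented for this example, and the variance shown is the population variance; SQL dialects differ on which variance they return.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;

class AnalyticalFunctionsSketch {

    // Hypothetical entity; only the 'marks' attribute is analysed.
    static final class Student {
        final String name;
        final double marks;

        Student(String name, double marks) {
            this.name = name;
            this.marks = marks;
        }
    }

    // Count, min, max, sum and average of one attribute over all entities.
    static DoubleSummaryStatistics summarizeMarks(List<Student> students) {
        return students.stream().mapToDouble(student -> student.marks).summaryStatistics();
    }

    // Population variance: the mean of the squared deviations from the average.
    static double populationVariance(List<Student> students) {
        double mean = summarizeMarks(students).getAverage();
        return students.stream()
                .mapToDouble(student -> (student.marks - mean) * (student.marks - mean))
                .average()
                .orElse(0.0);
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("S1", 60), new Student("S2", 70), new Student("S3", 80));
        DoubleSummaryStatistics stats = summarizeMarks(students);
        System.out.println("count=" + stats.getCount() + " min=" + stats.getMin()
                + " max=" + stats.getMax() + " sum=" + stats.getSum()
                + " avg=" + stats.getAverage());
        System.out.println("variance=" + populationVariance(students));
    }
}
```

Note that the student names never enter the computation: the functions read one column and ignore the rest of each entity.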
Distributed Databases
“A distributed database is a database in which storage devices are not all attached to a common
processing unit”
Database Management tools
“Software that manages the database and provides an access mechanism for it is called a database management
tool”
Distributed Database Management tools
“Software that manages and provides an access mechanism for a database that is spread over a
network is called a distributed database management system”
Structured Data
“Structured data is the one that can be modeled and classified in the form of a data model or related
tables”
Unstructured Data
“Data that cannot be represented in the form of a model, pre-defined manner or relational table is
called unstructured data”
Tasks
Create an advanced Search Engine-I [10]
1. Create a multi-threaded and distributed Web Crawler. That means,
a. A central module should start crawling the web and distribute the URLs to crawlers running on other machines.
b. The central crawler should be able to manage the client crawler processes and collect the results (list of URLs and downloaded pages).
Create a (Cloud-like) Distributed File Storage [10]
2. You must be familiar with online file storage, sharing and syncing systems like Google Drive or Dropbox. Create your own distributed file storage.
a. The interface should be graphical.
b. The user should be able to log on to the system and see the online files in his profile.
c. You must implement an authentication mechanism so that the user has access to only those files he has rights to.
d. The user should be able to share files with other users.
e. The user should be able to sync a file with its online copy (using simple menu-based commands).
Web Resources
1. http://searchoracle.techtarget.com/definition/distributed-database
2. http://en.wikipedia.org/wiki/Distributed_database
3. http://en.wikipedia.org/wiki/Distributed_file_system
4. http://technet.microsoft.com/en-us/library/cc753479(v=WS.10).aspx
LAB 14: Distributed Databases and Map-Reduce
Objective
1. To learn and apply the advanced concepts of Distributed Databases and map-reduce
Tools
NetBeans IDE. Operating System: Ubuntu
Theory.
Big Data
“Data that grows beyond the process capability of a data management tool is called big data”
Big data limit for 2012 was 2.5 exabytes (1 exabyte = 10^18 bytes)
The famous 3V’s by Gartner Analyst.
Volume means the huge amount of data.
Velocity means the enormous rate of data generation.
Variety means the heterogeneous type of data.
These 3 Vs are the closest a man can get to understanding big data.
Map Reduce
Map-reduce is one of the most widely used programming models for handling big data. It is
responsible for distributing work among the nodes and querying data. It works on distributed data
sets, both structured and unstructured (and is particularly well suited to unstructured data).
It has two key functions, Map and Reduce. The map function processes the input records and emits
intermediate key-value pairs; the reduce function merges the intermediate values that share a key into the final result.
The algorithm works exclusively on key-value pairs. Whatever input is given to the map function, it
treats it as a key-value pair and produces a key-value pair as a result. The data type of the output
key-value pair can be different from that of the input key-value pair.
Figure: Map Reduce
The figure is a visual description of map-reduce. On the far left, the stack of databases is the data
source; the map functions read the required data from the data store and prepare it for processing. The
map results are then combined and fed to the reduce function, which computes the final results and
writes them to the repository. It looks neat, but it has a few flaws too.
A famous explanatory example of map-reduce is Word count. Let’s say we have a file that has 2 lines
in it.
Hello World Bye World
Hello People Goodbye People
Figure Input to Map reduce
Map function reads the document sentence by sentence and breaks it into words. It counts the number
of times that word appeared in a sentence, fills the word-value pair (called the intermediate pair) and
returns it for the use of reduce function.
Now a map function m1 is given line 1 to process and map function m2 is given line 2 to process.
The results from the two map functions are:
Map Function m1
Input: “Hello World Bye World”
Intermediate Pairs:
< ‘Bye’, 1 >
< ‘Hello’, 1 >
< ‘World’, 2 >
Map Function m2
Input: “Hello People Goodbye People”
Intermediate Pairs:
< ‘Goodbye’, 1 >
< ‘Hello’, 1 >
< ‘People’, 2 >
Figure Input/Output Map Function
Reduce function gets intermediate pairs as input from a number of map functions (all returning word
count of different sentences). This function counts the word occurrence in full document and returns
final word-value pair (called output pairs).
Data from m1 and m2 is read by a reduce function R, which computes the final results as shown
below.
Reduce function R
Input: < ‘Bye’, 1 >, < ‘Hello’, 1 >, < ’World’, 2 >, < ‘Goodbye ’, 1 >, < ‘Hello’, 1 >, <
’People’, 2 >
Output Pairs:
< ‘Bye’, 1 >
< ‘Hello’, 2 >
< ’World’, 2 >
< ‘Goodbye ’, 1 >
< ’People’, 2 >
Figure Input/Output Reduce Function
Map and reduce are the core functions of the algorithm. There are various helper functions as
well; for example, a function named ‘combine’ collects results from map functions working on a
similar process and passes them on to the reduce function. Function names vary from implementation to
implementation.
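The word count example above can be reproduced in a few lines of plain Java. The sketch below runs in a single process and does not use Hadoop; it only mirrors the data flow described in this lab, with one map call per line producing intermediate pairs and one reduce call merging them. The class and method names are chosen for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WordCountSketch {

    // Map: one line in, intermediate <word, count> pairs for that line out.
    static Map<String, Integer> map(String line) {
        Map<String, Integer> pairs = new TreeMap<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.merge(word, 1, Integer::sum);
            }
        }
        return pairs;
    }

    // Reduce: merge the intermediate pairs of all map calls, summing the counts per word.
    static Map<String, Integer> reduce(List<Map<String, Integer>> intermediate) {
        Map<String, Integer> output = new TreeMap<>();
        for (Map<String, Integer> pairs : intermediate) {
            pairs.forEach((word, count) -> output.merge(word, count, Integer::sum));
        }
        return output;
    }

    public static void main(String[] args) {
        Map<String, Integer> m1 = map("Hello World Bye World");
        Map<String, Integer> m2 = map("Hello People Goodbye People");
        System.out.println(m1); // {Bye=1, Hello=1, World=2}
        System.out.println(m2); // {Goodbye=1, Hello=1, People=2}
        System.out.println(reduce(Arrays.asList(m1, m2)));
        // {Bye=1, Goodbye=1, Hello=2, People=2, World=2}
    }
}
```

Because each map call touches only its own line, the two calls could run on different nodes; only the reduce step needs to see both results.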
Project-Task
Map-Reduce [10]
1. Write a MapReduce application which processes weather data.
a. List the hottest years from the available data (for Islamic capitals).
b. Use weather data available from the internet, or prepare it with reference to the input
discussed in the lecture.
c. Process it using pseudo-distributed mode on the Hadoop platform.
Create an advanced Search Engine-II [10]
2. Create a multi-threaded and distributed Web Crawler. That means,
a. Index the URLs and the keywords. You MUST implement it using the Map-Reduce algorithm.
b. Create a Query Engine that should be able to get input from the user and return the result.
Web Resources
1. http://hadoop.apache.org/
2. http://en.wikipedia.org/wiki/Hadoop
3. http://searchoracle.techtarget.com/definition/distributed-database
4. http://en.wikipedia.org/wiki/Distributed_database
5. http://en.wikipedia.org/wiki/Distributed_file_system
6. http://technet.microsoft.com/en-us/library/cc753479(v=WS.10).aspx