csea linked list node3


Upload: iit-guwahati

Post on 15-Mar-2016


Magazine of the Computer Science and Engineering Association-Student Body Department of Computer Science and Engineering,Indian Institute of Technology Guwahati,India


Page 1: CSEA Linked List Node3

Inside: Faculty Talk, Blondie24, Cover Story

CONTENTS

03  Editorial
04  Faculty Talk: Dr. Vijay S. Iyengar, Visiting Professor, CSE
06  Digital Mind: Ingredients of ARP Poisoning, by Gunjan Bansal, 2nd year, B. Tech. CSE
09  Natural Selection: Blondie24: Playing at the edge of AI, by Pranav Kumar, 3rd year, B. Tech. CSE
12  Cover Story: Cloud Computing: What does the Future Hold?, by Gautam Sewani, 4th Year, B. Tech. CSE
16, 17  Good Times / Bonjour: Manish Goyal (16), Mukund R (17)
19  Geek Corner: Bash History Tips and Tricks, by Sanmukh, 3rd Year, B. Tech. CSE
21  Digital Mind: Intelligent Drug Discovery, by Om Prasad Patri, 3rd Year, B. Tech. CSE
24  Nostalgia: Krishna Kishore
25  Android
26  The Turing Test
28  Winners of Turing Test (Node 2)


Computer Science and Engineering Association, IIT Guwahati: Linked List

Editorial

The third node of Linked List is here. It strategically follows the official launch of Windows 7, Ubuntu 9.10 and Google Wave. Courtesy of these much-awaited products, I have something to write about in this section of Linked List (which most of you conveniently prefer to skim through). Redmond's newcomer defies Darwin's theory of evolution in the sense that it does not inherit the dilly-dallying of its predecessor. Be it performance, stability, security or user interface, Win7 has improved on it all. So those who have been waiting for something out of the box from Microsoft should try the institute's evaluation copy of Win7, and there will be no going back to Vista!

On the other hand, the new desktop edition of the popular open source distribution is also available now. Ubuntu 9.10 features a redesigned, faster boot and login experience, a revamped audio framework and improved 3G broadband connectivity, all of which contribute to a first-class user experience. Based on Linux 2.6.31, Karmic Koala offers GNOME 2.28 and Ext4 as defaults, and adds "cloud" features and improved installation. Linux in this user-friendly avatar is expected to make you think twice before you empty your pockets to purchase a licensed copy of a paid operating system.

Next on my list is the product from the marketing genius, Google Wave. It evokes the nostalgia of the days when I was seeking a Gmail invite, though this time I got an invite fairly easily. For those who have been missing out till now, Google Wave is an awesome real-time service for sharing docs, sending emails and much more. In fact, it is the most anticipated product of the year and people are already desperate for an invite. One of the strongest premises on which Google Wave has been built is to integrate and aggregate the online user's social media and networking needs. I am sure many of you would be crazy to get your hands high in this WAVE that is Google! Those of you with the most correct entries for the node 3 Turing Test (or those willing to offer me some change from the routine mess cuisine) may get lucky!

Do not be misled into thinking that elegant publicity for certain geeky products is all this node has. This page is the one place where I enjoy the complete freedom to blabber without any sort of intervention from my dear group of editors. Indeed, most of my jabbering may not feature at all in the magazine! What does feature is left for you to explore!

Om Prasad Patri
Publication Secretary, CSEA


Faculty Talk with Dr. Vijay S. Iyengar

Dr. Vijay S. Iyengar worked as a Research Staff Member at IBM's T. J. Watson Research Center, Yorktown Heights, NY, USA for 25 years, where he held various technical leadership and management positions. He is currently at IIT Guwahati as a visiting professor in the CSE Department. He shares with us his experience and interests.

How are you finding your stint at IIT Guwahati? Do you find any major change in how IITs used to be in your under-grad times?

I am teaching a class after a long time and I find it exciting to try and leverage my industrial research experience. The Data Mining course had to be developed from scratch, and this was challenging because of the varied academic backgrounds of my students. My research discussions with faculty members and students are great, but we are constrained by my short stay here. Aside from academics, I am enjoying the beautiful campus with its rich bird life (and the elusive leopard!). Participating in the cross-country race was pretty cool too.

I was at IIT Madras from 1973 to 78, which was a different era! Apart from the obvious technological changes, students today are much more "worldly aware" (notice I did not say "worldly wise"). I am delighted to see that some students are tackling advanced, real world problems even in a B.Tech. project. But I find the level of student enthusiasm and effort in my class to be much lower than I had expected.

What differences did you find in the academic culture of the US and India?

I am used to students having a much higher sense of seriousness and purpose. Maybe this is because so many students in the US work hard to put themselves through college. On the research side, many US universities are successful in tackling projects that address the hard problems the industry is facing. I would love to see an increase in such projects here. I firmly believe that this will not only lead to new and relevant theoretical concepts but also produce postgraduate students who are better problem solvers and more rounded.

When and how did you develop interest in Data Mining?

The IBM Deep Blue team was looking at computationally intensive data analysis applications after their victory over Kasparov. They gave me the opportunity to join in that effort. After 15 years in various areas of CAD, I was eager to try something completely different. Switching areas was really hard. It was a learning experience on the job and the learning for me continues even after ten years.


Which project related to application of Data Mining did you find to be the most exciting?

The project to find fraud and abuse in corporate travel and business entertainment expenses is my favorite for various reasons. I had a great partner from the "product" group with deep domain experience. We had access to huge amounts of data and to the end users (auditors and business controls personnel). We were able to go from problem definition to an offering in the marketplace within a year. To top it off, the results from a real large scale application at a client indicated improvements in precision by a factor of 5 compared to their earlier approaches. The technology used was an adaptation of Spatial Scan Statistics tuned to the characteristics of this problem.

Do you think AI will ever reach a level that computers can replace humans in most activities?

You are asking the wrong guy. I consider myself a problem solver in the engineering sense and not a visionary. I can imagine the combination of learning technologies and computational engines taking on more and more tasks. Embedded intelligence is going to be more pervasive. A good example today is robot assisted surgery, where surgeons are able to improve their performance and the quality of care thanks to the intelligent robotic systems.

Take your pick: Industrial Research Lab or University Academia? Why?

The choice was a no-brainer for me: the Industrial Research Lab, where I spent 25 years of my life. The following factors were important for me.

I am more excited by the real life usage of the technologies I helped develop than by writing papers. The industry gave me access to real life problems, oodles of data, domain experts, application platforms and clients.

I also have the wanderlust in terms of areas of research. This was possible for me to satisfy in an industrial research lab. Ideally industrial labs should drive some of the market disruptions. At the least they must be nimble and responsive to market shifts and so the problems keep changing.

I was also able to easily collaborate and learn from people who were experts in other domains. In my opinion, this is the biggest weakness in Academia. How many university projects do you see that are inter-disciplinary in nature? I believe they could have a huge impact.

A lot of students in IITs are opting for an MBA/managerial career rather than sticking to their core technical areas. Your opinion?

I can't comment on any individual's goals and aspirations. I hope they are passionate about their career choice and strive to be the best in it. But I doubt that IITs can fulfill their purpose if predominantly their graduates pursue MBA/management positions. Clearly, we need top quality engineers in all the disciplines to develop, manufacture and service in the global marketplace. If not in India, the global industry will find the engineers it needs elsewhere. Also, it is a fallacy to assume that personal financial goals cannot be attained in the technical workplace. Top technical people can demand and get salaries that are higher than many in the management ladder. They are also given technical freedom, project and strategic responsibility, and command respect in the organization (all factors for likely job satisfaction). But to succeed in the technical ladder you need to pursue postgraduate studies and really know your stuff. Mediocrity does not get rewarded (for long) in either the technical or the management ladders.

What's next? Future plans?

We will be splitting our time between the US and India. I will be taking on industry/academic positions that fit this constraint. Bioinformatics is a new area of interest for me. I am also eager to pursue some of my other interests like animal welfare, conservation and cartooning. I dream of getting some of my cartoons published.

Interviewed by Abhishek Anand and Gautam Sewani, for CSEA


Digital Mind

Ingredients of ARP Poisoning: Man in the Middle Attacks

By Gunjan Bansal, 2nd Year, B. Tech. CSE

Terminology Used

NIC: Network Interface Card, or simply the Ethernet card in our case.

MAC address: Media Access Control address, the physical address of the NIC. It was meant to be globally unique but is easily spoofed. It is used for communication within a LAN.

IP address: Internet Protocol address (of course all are familiar with this), used when communicating across networks, i.e. when the destination's MAC address cannot be learned.

ARP: Address Resolution Protocol, used for mapping IP->MAC, i.e. finding the MAC address corresponding to a particular IP address by sending ARP request packets on the LAN (Reverse ARP is used for the opposite).

Figure: Node cache showing the IP->MAC (physical address) mapping.

FTP: File Transfer Protocol (the protocol used to transfer files between 2 nodes).

DOS: Denial of Service attack.

MITM: Man In The Middle attack.

LAN: Local Area Network.

VLAN: Virtual Local Area Network. This can be compared with a subnetted network, but the difference is that a layer 2 switch can be made to handle data from different subnets (i.e. different ports can virtually be on different subnetworks). Communication between the different subnetworks is still done by a router. Layer 3 switches provide many more features here.

OSI: Open Systems Interconnection (an ISO standard), a model used to standardize communication.

Sniffing: Reading packets meant for other nodes.

SSL Connection: Short for Secure Sockets Layer, a protocol for transmitting private documents via the Internet. SSL uses a cryptographic system that has two keys to encrypt data: a public key known to everyone and a private or secret key known only to the recipient of the message.

User Agent: The generic term used to describe any device that might access a web page (web browsers, search engines, handheld mobile phones, etc.).


Basic Devices in a Network

Hub: A multi-port device used for communication within a single network. It is very slow and causes a lot of network overhead, as it broadcasts all frames to all connected nodes all the time. In a large network it is usually replaced by a switch.

Switch: A multi-port device, but much more intelligent than a hub. It is used for communication within a network (layer 3 switches are an exception and are beyond the scope of this article). It makes forwarding decisions based on the MAC address of the destination and reduces network overhead to a great extent. Broadcasting is in general replaced by multicasting and unicasting.

Router: A multi-port device that provides communication between different networks (including the Internet) as well as between different VLANs.

What is ARP Poisoning?

ARP Poisoning is one of the few techniques employed for sniffing, man in the middle and DOS attacks. The others include MAC spoofing (cloning another node's MAC address), MAC flooding (flooding the switch with MAC addresses at a rapid rate, which can force some switches into a fail-safe mode in which they broadcast every frame, just like a hub) and attacks by the admins themselves. This article explains only the basics of ARP Poisoning and is not a step-by-step guide.

The technique can be used to sniff data packets once the poisoning has been done, to modify and forward them, to drop them, or to do anything else the attacker wants with them. The soul of the attack lies in a loophole of the Address Resolution Protocol (ARP), a stateless protocol devoid of any kind of authentication. This means that a system neither checks whether a request for an ARP mapping is authentic or faked, nor checks that an ARP reply it receives actually answers a query it made; it accepts all replies, even if it never made a query. (Some operating systems, such as Sun OS, prevent this, and the check may be incorporated into upcoming patches or operating systems, but it can also be circumvented by performing a DOS attack first, which is beyond the scope of this article.)

The principle of ARP spoofing is to send fake, or "spoofed", ARP messages to an Ethernet LAN. Generally, the aim is to associate the attacker's MAC address with the IP address of another node (such as the default gateway or any other node in the network). Any traffic meant for the faked IP (the IP address whose MAC mapping has been changed in the ARP cache of the victim's PC) is then routed, one way, to the attacker. This results in a DOS attack if the attacker does not enable IP forwarding on his machine, since all the data packets will end at the attacker's node.

Figure: The IP->MAC mapping after the ARP Poisoning has been done.

For two-way data capturing, the attacker simply poisons both nodes (changes the IP->MAC mapping in each, as described above). He can then monitor the traffic flowing between them by capturing the packets and getting the data out of them. This is usually employed against Telnet/FTP sessions, which send passwords in clear text, or for session stealing.
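To make the loophole concrete, here is a minimal in-memory sketch in Python. It sends nothing on a real network; the class `ArpCache`, its methods and the addresses are all hypothetical names chosen for illustration. It only shows why a cache that trusts every reply can be overwritten by an unsolicited one.

```python
class ArpCache:
    """Toy ARP cache that, like the stateless protocol, trusts every reply."""

    def __init__(self):
        self.table = {}        # IP address -> MAC address
        self.pending = set()   # IPs we asked about (recorded but never checked)

    def request(self, ip):
        # Remember that we asked; the reply handler below ignores this set.
        self.pending.add(ip)

    def on_reply(self, ip, mac):
        # No authentication and no check against self.pending:
        # the latest reply simply overwrites the stored mapping.
        self.table[ip] = mac

    def resolve(self, ip):
        return self.table.get(ip)


# Illustrative addresses only (assumptions, not taken from the article).
GATEWAY_IP = "192.168.1.1"
GATEWAY_MAC = "aa:aa:aa:aa:aa:aa"
ATTACKER_MAC = "cc:cc:cc:cc:cc:cc"

victim = ArpCache()
victim.request(GATEWAY_IP)
victim.on_reply(GATEWAY_IP, GATEWAY_MAC)    # genuine reply to a real query
before = victim.resolve(GATEWAY_IP)

victim.on_reply(GATEWAY_IP, ATTACKER_MAC)   # forged reply, nobody asked for it
after = victim.resolve(GATEWAY_IP)

print(before, "->", after)
```

A cache that only accepted replies for addresses in `pending` would close part of this gap, which is roughly the behaviour the article attributes to some operating systems.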

How exactly does it happen?

When a switch is first turned on, its MAC table (the table recording which MAC address sits behind which port) is empty, so it sends frames out of every port. It fills the table by recording the source MAC address of every frame it receives, both from the computer that starts a conversation and from the one that replies to it. After that, this table is used to forward frames, and no broadcasting takes place for destinations with a known MAC->port mapping.

PCs and other nodes do something similar. They keep an ARP cache in memory which stores the IP->MAC mapping. Frames (formed at layer 2, the Data Link Layer of the OSI model) carry the MAC addresses of the source and the destination, while IP addresses are added at layer 3 (the Network Layer). Initially a node knows only the IP address of the destination, not its MAC address, so it broadcasts an ARP request and stores the reply in its ARP cache. New frames are then addressed directly to the destination MAC, and the switch simply forwards them to the right port without any broadcasting or multicasting (communication on the LAN is based on MAC addresses, not IP addresses).

This is the security lapse. If we send a fake ARP reply to a node, so that the MAC address it stores for the destination IP becomes ours, its packets will be forwarded to us and we can do whatever we want with them. This is generally done between the gateway and a node, so that all data meant for the Internet is intercepted.
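The switch's side of this can be sketched as a small simulation. The class `LearningSwitch` and the short MAC labels are hypothetical; real switches also age out entries and handle broadcast addresses, which this sketch leaves out.

```python
class LearningSwitch:
    """Toy layer 2 switch: learns source MACs, floods unknown destinations."""

    def __init__(self, num_ports):
        self.ports = list(range(num_ports))
        self.mac_table = {}    # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        # Learn (or refresh) where the sender lives.
        self.mac_table[src_mac] = in_port
        if dst_mac in self.mac_table:
            # Known destination: forward on exactly one port.
            return [self.mac_table[dst_mac]]
        # Unknown destination: flood on every port except the incoming one.
        return [p for p in self.ports if p != in_port]


switch = LearningSwitch(num_ports=4)
first = switch.receive(0, "aa", "bb")   # "bb" not learned yet, so flood
reply = switch.receive(1, "bb", "aa")   # "aa" was learned from the first frame
later = switch.receive(0, "aa", "bb")   # both ends known, single port

print(first, reply, later)
```

Note that the switch never looks at IP addresses. It forwards purely on the destination MAC written in the frame, which is why a poisoned ARP cache on the sending node is enough to redirect traffic.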

SSL and SSH Connections

Now the question might arise: what about SSL connections? The data sniffed above must be in clear text for us to interpret, so what about encrypted data? Well, there is a catch here too. One can generate a fake certificate which, if accepted manually by the victim (which he generally does when browsing websites), allows a good man in the middle attack (sniffing can be done). Admittedly this is often not possible, because the connection is already established and an encrypted key is used to store session IDs/cookies, so session stealing cannot be performed. But it is still of good use against many poorly constructed sites. The same applies to SSH connections. To some extent brute-forcing or decrypting may help (this must be the last resort). There is much more complexity here, which is beyond our topic.

Solutions?

So far there is no foolproof method for stopping this attack on a large LAN, but the attack is easily noticed and the admin can catch hold of the culprit easily. One solution for a small LAN is STATIC ARP entries (in the switch as well as the nodes) for all the connected devices. Another is forming VLANs, which require a bit more knowledge to break into. Yet another is ARP Inspection on switches.

To some extent one can also switch the browser's user agent to evade the popular sniffing software. Most of these tools start searching inside a packet only if a particular user agent is found in it (this way only session hijacking and password sniffing from browser traffic can be done), and by default this setting is mostly set to Mozilla. Setting our user agent to an arbitrary one (which may cause some browser-dependent sites to malfunction) helps us evade the attack to some extent. In reality, however, all packets are still sent to the attacker's PC, and he might manually override the default settings or read the packets himself.

Think twice before proceeding when your browser complains of a fake certificate. Static ARP entries are of great utility, but their use is restricted to a large extent.

In the end, The Lord of the (Token) Ring (The Fellowship of the Packet): "One Ring to link them all, One Ring to ping them, One Ring to bring them all and in the darkness sniff them."
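The claim that the attack "is easily noticed" can be illustrated with a small sketch. It assumes the admin already has a log of (IP, MAC) pairs seen in ARP replies; the function `find_arp_anomalies` and the sample addresses are hypothetical, and this is not how any particular monitoring tool works.

```python
def find_arp_anomalies(observations):
    """Scan (ip, mac) pairs in arrival order and return warning strings."""
    ip_to_mac = {}
    warnings = []
    for ip, mac in observations:
        known = ip_to_mac.get(ip)
        if known is not None and known != mac:
            # The same IP suddenly answers from a different MAC address.
            warnings.append(f"{ip} moved from {known} to {mac}")
        ip_to_mac[ip] = mac

    # One MAC answering for several IPs is typical of a two-way poisoning.
    mac_to_ips = {}
    for ip, mac in ip_to_mac.items():
        mac_to_ips.setdefault(mac, set()).add(ip)
    for mac, ips in mac_to_ips.items():
        if len(ips) > 1:
            warnings.append(f"{mac} claims {sorted(ips)}")
    return warnings


# Illustrative log: the gateway and a node are both taken over by cc:cc.
log = [
    ("192.168.1.1", "aa:aa"),
    ("192.168.1.7", "bb:bb"),
    ("192.168.1.1", "cc:cc"),
    ("192.168.1.7", "cc:cc"),
]
warnings = find_arp_anomalies(log)
for line in warnings:
    print(line)
```

Both checks can raise false alarms (a replaced network card, or a host that legitimately owns several IP addresses), so the warnings are hints for the admin rather than proof of an attack.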


Natural Selection

Blondie24: Playing at the edge of AI

By Pranav Kumar, 3rd year, B.Tech. CSE

"Meet Blondie. A 24-year old graduate student of mathematics at the University of California at San Diego. She skis and surfs and is an ace at math, but her real claim to fame is her ability to play checkers. She's not good enough to defeat a grand master (yet), but she did earn a spot in the top 500 of an international checkers tournament. Not bad when you consider that Blondie taught herself how to play without reading books, taking classes or getting tips from experienced players. Even better when you realize that Blondie is a computer program and the rest of her persona is a product of my imagination." Excerpt from "Blondie24: Playing at the edge of AI" by David B. Fogel

Yes folks, Blondie24 is an artificial intelligence checkers-playing program developed by David B. Fogel and Kumar Chellapilla in 1999. It is not the first AI program for checkers, but it is significantly different from the others. But before I go into describing how smart it is, let me tell you how it was made.

The algorithm the program uses is a very simple one: the minimax algorithm. For the uninitiated, it is a policy widely used in games to maximize your advantage and minimize that of your opponent. At each step, the program looks ahead n moves from the current board position and evaluates the resulting positions using an evaluation function. The move leading to the board position with the highest score is chosen. This process is known as an n-ply search. For example, a program which evaluates 4 moves in advance is said to execute a 4-ply search. Correspondingly, there can be 6-ply searches, 8-ply searches, and so on. Basically, every other checkers program implements this algorithm in one way or another. But this is not the cool part of Blondie24.
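A depth-limited minimax search can be sketched in a few lines of Python. This is a generic illustration on a hypothetical two-level game tree, not the checkers engine itself; the names `minimax`, `best_move`, `TREE` and `SCORES` are made up for the example.

```python
def minimax(state, depth, maximizing, moves, evaluate):
    """Depth-limited minimax.

    moves(state) returns the successor states; evaluate(state) scores a
    position from the point of view of the maximizing player.
    """
    successors = moves(state)
    if depth == 0 or not successors:
        return evaluate(state)
    scores = [minimax(s, depth - 1, not maximizing, moves, evaluate)
              for s in successors]
    return max(scores) if maximizing else min(scores)


def best_move(state, depth, moves, evaluate):
    """Pick the successor whose minimax value is highest for us."""
    return max(moves(state),
               key=lambda s: minimax(s, depth - 1, False, moves, evaluate))


# A tiny hand-made game tree: we move first, then the opponent replies.
TREE = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
SCORES = {"a1": 3, "a2": 5, "b1": 2, "b2": 9}


def moves(state):
    return TREE.get(state, [])


def evaluate(state):
    return SCORES.get(state, 0)


value = minimax("root", 2, True, moves, evaluate)   # a 2-ply search
choice = best_move("root", 2, moves, evaluate)
print(choice, value)
```

Move "b" contains the best single outcome (9), but the opponent would answer with the reply worth 2, so the 2-ply search prefers "a", whose worst case is 3.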

Now, let us come to the evaluation function. Conventional "artificial intelligence" programs (such as the Checkers World Champion "Chinook") rely on features that were chosen using human expertise and weighted by hand tuning. Their "intelligence" is pre-programmed into them. The secret to Chinook's success lies in the high-speed parallel architectures which can calculate billions of possible board positions per second, plus an end-game database that allows it to make perfect moves after eight or fewer pieces remain on the board.

However, Blondie24 does not need such luxurious machines to run on. It uses an artificial neural network for its evaluation function. Neural nets can be trained to perform tasks without programming the required information into them. That is, they have the ability to formulate their own tactics. The neural net receives as input a vector representation of the checkerboard positions and returns a single value which is passed on to the minimax algorithm. Thus, what Blondie24 does is this: it considers each possible move for the next 4 moves (in a 4-ply search), evaluates each board position using the neural net, and finally decides which move to make based on the scores of the different board positions. But this is not the cool part either.
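As a rough sketch of what "a vector in, a single value out" means, here is a very small feed-forward network in plain Python. The layer sizes, the tanh activation, the random weights and the sample board are assumptions made for illustration; the real Blondie24 network is much larger and its exact architecture is not described in this article.

```python
import math
import random


def make_network(n_inputs=32, n_hidden=4, seed=0):
    """Build a tiny network with small random weights (illustrative sizes)."""
    rng = random.Random(seed)

    def small():
        return rng.uniform(-0.2, 0.2)

    # Each hidden unit is a (weights, bias) pair.
    hidden = [([small() for _ in range(n_inputs)], small())
              for _ in range(n_hidden)]
    output = ([small() for _ in range(n_hidden)], small())
    return hidden, output


def evaluate(board, network):
    """Map a 32-component board vector to one score in (-1, 1)."""
    hidden, (out_weights, out_bias) = network
    activations = [
        math.tanh(sum(w * x for w, x in zip(weights, board)) + bias)
        for weights, bias in hidden
    ]
    return math.tanh(
        sum(w * a for w, a in zip(out_weights, activations)) + out_bias)


K = 2.0                              # value given to a king (assumed here)
board = [0.0] * 32                   # 0 marks an empty square
board[0:3] = [1.0, 1.0, K]           # our two checkers and one king
board[29:32] = [-1.0, -1.0, -K]      # the opponent's pieces are negative

score = evaluate(board, make_network())
print(score)
```

In a full program this `evaluate` function would be the one handed to the ply search, so better weights translate directly into better move choices.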

The cool part is the training algorithm used for the neural net. It was trained with an Evolutionary Algorithm. Why is it called that? It is because it is derived from nature's process of evolution. The process which made us humans what we are today. The process of natural selection. The neural nets were evolved through hundreds of generations till Blondie became an expert level checkers player. This is how it was done:

Each checkerboard was represented by a vector of length 32, with each component corresponding to an available position on the board. Components of the vector were elements of {-K, -1, 0, +1, +K}, where 0 corresponded to an empty square, 1 was the value of a regular checker, and K was the number assigned to a king. A common heuristic has been to set this value at 1.5 times the worth of a checker, but such expertise was eschewed in these experiments. Instead, the value of K was evolved by the algorithm. The sign of the value indicated whether the piece belonged to the player (positive) or the opponent (negative).

The evolutionary algorithm began with a randomly created population of 15 artificial neural networks (also described as strategies), P_i, i = 1, …, 15, defined by the weights and biases of each network and the associated value of K. Weights and biases were sampled uniformly over [-0.2, 0.2], simply to provide a small range of initial variability, with the value of K set initially at 2.0. Each strategy had an associated self-adaptive parameter vector σ_i, i = 1, …, 15, where each component corresponded to a weight or bias and served to control the step size of the search for new mutated parameters of the neural network. The self-adaptive parameters were initialized at 0.05 for consistency with the range of the initial weight and bias terms. Each "parent" generated an offspring strategy by varying all of the associated weights and biases, and possibly the K value as well. Specifically, for each parent P_i, i = 1, …, 15, an offspring P_i' was created by:

σ_i'(j) = σ_i(j) exp(τ N_j(0,1)),  j = 1, …, N_w

w_i'(j) = w_i(j) + σ_i'(j) N_j(0,1),  j = 1, …, N_w

where N_w is the total number of weights and bias terms in the neural network (here, 5046), τ = 1/sqrt(2 sqrt(N_w)) = 0.0839, and N_j(0,1) is a standard Gaussian random variable resampled for every j. The offspring king value K_i' was obtained by:

K_i' = K_i + δ

where δ was chosen uniformly at random from {-0.1, 0, 0.1}. For convenience, K_i' was constrained to the range [1.0, 3.0]. These players played a set of games with each other and received points based on winning (+1 point), losing (-2 points) or drawing (0 points). After 150 games, the 15 players with the highest scores were retained as parents for the new generation.
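One generation of this procedure can be sketched as follows. The mutation step follows the formulas above; everything else is an assumption made to keep the example self-contained. In particular, the scores are random placeholders standing in for the results of real checkers games, and the function names `mutate` and `select` are hypothetical.

```python
import math
import random

N_W = 5046                                     # weights and biases per network
TAU = 1.0 / math.sqrt(2.0 * math.sqrt(N_W))    # about 0.0839


def mutate(weights, sigmas, king, rng):
    """Create one offspring from one parent using self-adaptive mutation."""
    # Step sizes are perturbed first, with a fresh Gaussian draw for every j.
    new_sigmas = [s * math.exp(TAU * rng.gauss(0.0, 1.0)) for s in sigmas]
    # Each weight then moves by its own, newly mutated, step size.
    new_weights = [w + s * rng.gauss(0.0, 1.0)
                   for w, s in zip(weights, new_sigmas)]
    # The king value takes a small random step and is clamped to [1.0, 3.0].
    new_king = min(3.0, max(1.0, king + rng.choice([-0.1, 0.0, 0.1])))
    return new_weights, new_sigmas, new_king


def select(population, scores, keep=15):
    """Keep the `keep` highest-scoring players as the next parents."""
    ranked = sorted(range(len(population)),
                    key=lambda i: scores[i], reverse=True)
    return [population[i] for i in ranked[:keep]]


rng = random.Random(1)
parents = [([rng.uniform(-0.2, 0.2) for _ in range(N_W)], [0.05] * N_W, 2.0)
           for _ in range(15)]
offspring = [mutate(w, s, k, rng) for (w, s, k) in parents]
population = parents + offspring

# Placeholder scores: in the real experiment these come from playing games
# (+1 for a win, 0 for a draw, -2 for a loss).
scores = [rng.randint(-10, 5) for _ in population]
next_parents = select(population, scores)

print(round(TAU, 4), len(next_parents))
```

Because the step sizes σ are themselves mutated and inherited, networks whose step sizes happen to suit the current stage of the search tend to produce better offspring, which is what "self-adaptive" refers to.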

So you see, the program was learning to play checkers without any help from anyone. No other strategies were programmed. Just the basic rules, implemented by the ply-search engine: each checker moves diagonally forward one square at a time, it becomes a king on reaching the last row, and so on. OK, so now I can go on with boasting about how awesome Blondie24 was.

After training for 840 generations (which took about 6 months using the computer technology of the 90's), the best player was used to play with human opponents on the website http://www.zone.com. The username Fogel and Kumar used was, yes, you guessed it, Blondie24. They chose the name so they could attract other players easily. After all, hardly anyone wants to play with someone having the username chellapilla24!! And the program gained its popular name from this experiment. The site used the standard of the United States Chess Federation for rating players. New players started with a score of 1600, and the score was adjusted according to the outcome of each game and the rating of the opponent. As Blondie was put to the test, it recorded an impressive win-draw-loss record of 94-32-39 in 165 games. The best win came against a human player rated 2173 (just 27 points short of the master level), who was ranked 98th out of the 80,000 people registered at zone.com. The final rating of Blondie24 according to calculations was 2045.85 with a standard deviation of 0.48. This was an expert level rating and placed Blondie better than 99.61% of the players registered at the site.
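The article does not spell out the rating formula, so the following is only a sketch of a common Elo-style update of the kind chess federations use. The step size of 32 and the function names are assumptions, not values taken from the article.

```python
def expected_score(rating, opponent_rating):
    """Expected result (between 0 and 1) against the given opponent."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


def update_rating(rating, opponent_rating, outcome, k_factor=32.0):
    """outcome is 1 for a win, 0.5 for a draw and 0 for a loss."""
    return rating + k_factor * (outcome - expected_score(rating, opponent_rating))


start = 1600.0                                     # rating of a new player
after_even_win = update_rating(start, 1600.0, 1.0)
after_upset_win = update_rating(start, 2173.0, 1.0)

wins, draws, losses = 94, 32, 39                   # record quoted in the article
total_games = wins + draws + losses

print(after_even_win, total_games)
```

Beating an equally rated opponent is worth half the step size, while beating a much stronger one is worth almost all of it, which is how a long run of games against mixed opposition settles towards a stable rating.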

However, the real achievement was yet to come. The current world-champion checkers program is called Chinook, rated at 2814. Chinook relies on features that were chosen using human expertise and weighted by hand tuning. It also includes a look-up table of transcribed games from previous grandmasters and a complete endgame database for all cases with up to eight pieces on the board (440 billion possible states). Chinook does not use self-learning techniques to improve its play, relying instead on opening books, perfect information in the endgame, and high-speed computation to look ahead as many ply as possible. Blondie24 certainly cannot compete with Chinook at its best, or with players at the lesser-ranked master level. Yet the evolutionary program exhibits a flexibility that cannot be achieved with Chinook or other similar approaches. It can invent new and unorthodox tactics.

The real achievement came when Blondie succeeded in defeating Chinook at the novice setting. The novice setting of Chinook is equivalent to a high-level expert rated player. This was a great feat and marked the rise of true "artificial intelligence" over conventional programming.

And now, the team at Natural Selection Inc. under Dr. Fogel has gone another step further with Blondie25, the chess-playing program. The relevant paper can be viewed at: http://65.44.200.132/Library/2006/CIG2006.pdf

Other useful links:

Fogel and Kumar's paper introducing Blondie24: http://65.44.200.132/Library/2000/Intell-CheckersPaper.pdf

World Champion Chinook's website: http://www.cs.ualberta.ca/~chinook/

Also in the reading list: "Blondie24: Playing at the edge of AI" by David B. Fogel.


Call for Articles

Since its inception, the CSEA has conducted many workshops, lectures and programming contests to increase awareness regarding new technologies and the science behind them.

Linked List is yet another attempt to get closer to fellow IITians.

Linked List requires articles that maintain the originality and the quality of the magazine.

Send in your articles to [email protected] or contact the editorial team for any clarifications.

Looking forward to an overwhelming response.


Cover Story

Cloud Computing: What does the Future Hold?

Gautam Sewani, 4th year B.Tech student from the Department of Computer Science and Engineering, explores what the buzz about Cloud Computing is all about.

Software piracy is the most glorified criminal activity in India. Paying for software is a sure-shot way of making yourself a subject of ridicule amongst your peers. Indeed, you will be hard-pressed to find many individuals who have ever 'bought' software in their lives. (Excluding, of course, the amount we pay to Microsoft every time we buy a computer, but then most of us aren't aware of it, and what we don't know doesn't hurt us :P)

Software as a Service

Now, if you weren't so shameless, and had spent a few bucks buying, say, a Windows 7 license (instead of asking Jal to download it for you), you would have the following questions troubling you:

• When you have paid for the software, why is it that you can use it only on one particular computer? It's totally baffling: when you bought an audio cassette in the grand old days, you could use it on any cassette player without paying anything extra.

• You have a friend who plays Age of Empires II (he's a purist and frowns upon stuff like DOTA) the whole day on his computer. You, on the other hand, are an outdoor person and use your computer for just a couple of hours daily. Why, then, do you have to pay as much for the OS as your friend?

• Why is software installation such a pain? Why do you have to beg the geek next door to come and fix your "thissoftwaresucks.dll not found" error every time you install a piece of software?

Well, it turns out that some wise guy heard your rants and came up with what's known as Software as a Service (SaaS). In a nutshell, it's software which you can use "anytime, anywhere", without installation, paying an amount proportional to the time for which you use the software. Obviously, to achieve all this, the web is the preferred medium of delivery, and most of the products in this paradigm are browser-native.

U� lity Compu� ng

If I said, “To set up a factory, you need to set up a power plant”, you’d call me insane, and rightly so. You can buy electricity from the government, vary your usage and pay according to the amount you use. But in the fi eld of IT, the statement “To set up a half-decent search engine, you need to set up a $100 million data center” used to be a truism. That is, before the advent of U� lity Compu� ng.

Selling computing resources to the public, just like electricity, water, LPG and other public utilities, is called utility computing. It has obvious advantages. Let us return to the example of the search engine you want to set up. You think your algorithm is better than Brin and Page’s, but you are not sure if others will think the same way. You want to try it out anyway. Without utility computing, you would have to run after venture capitalists, raise a significant amount of capital and set up a data center. After all this, if the public rejects your hot-shot algorithms, your investments turn to dust and you are doomed. With utility computing, in contrast, you rent computing resources. You will only be charged for the amount of resources you use (which depends on the kind of traffic your search engine gets). And what’s more, it’s perfectly scalable – if your site traffic increases, more resources will be automatically allocated, so that you do not have to plan ahead, and your site won’t face any outage.

To summarize, utility computing provides the following benefits:

1. The elimination of a huge up-front payment: you can start small and grow as required.
2. Automatic scalability, so that you don’t have to plan ahead with regard to computing resources.
3. Ability to release resources when you are not using them, so for example if your site traffic decreases, your computing resource bills decrease too.
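To put rough numbers on the first and third benefits, here is a minimal Python sketch comparing an owned data center with rented capacity. Every figure in it (the hourly rate, the up-front cost, the monthly usage) is invented purely for illustration and does not come from any real provider; amounts are kept in cents so the arithmetic stays exact.

```python
# All prices and usage figures below are hypothetical, chosen only to
# illustrate the shape of the two cost models.

def owned_cost(upfront_cents, monthly_running_cents, months):
    """Own the hardware: pay up front, then a fixed running cost each month."""
    return upfront_cents + monthly_running_cents * months


def rented_cost(rate_cents_per_server_hour, server_hours_per_month):
    """Rent capacity: pay only for the server-hours actually consumed."""
    return sum(rate_cents_per_server_hour * hours
               for hours in server_hours_per_month)


# Six months of traffic that ramps up and then falls away again.
usage = [200, 500, 1500, 4000, 2500, 800]          # server-hours per month

rented = rented_cost(10, usage)                     # 10 cents per server-hour
owned = owned_cost(2_000_000, 30_000, len(usage))   # $20,000 up front, $300/month

print("rented:", rented / 100, "dollars")
print("owned: ", owned / 100, "dollars")
```

The point of the sketch is only that the rented bill follows the usage list month by month, while the owned bill is dominated by the up-front term whatever the traffic turns out to be.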

Cloud Computing

Cloud computing has become a buzzword, and as with all buzzwords, it’s got a gazillion definitions floating around. Here, we define cloud computing to be the sum of SaaS and Utility Computing. From the perspective of end-users, Cloud Computing is SaaS – Google Docs and Acrobat.com can be cited as examples, where we use statements like “Our documents are on the cloud”. From the perspective of a SaaS provider, Cloud Computing is Utility Computing.


A cloud is an entity containing computing resources. Internally, it can be a grid, a supercomputer or another such system. If these computing resources are rented out, it is known as a public cloud. The organization owning the cloud is called a cloud computing provider. If the organization does not rent out the cloud and uses it for its own internal purposes, it’s called a private cloud.

Why Now, Not Then?

It should be clear by now that Cloud Computing is a godsend for SaaS providers. But is it economically viable from the perspective of a cloud computing provider? Indeed, this lack of economic viability was the missing link which had held back the emergence of computing as a public utility for a long time. The newfound interest in the cloud is a result of significant changes in the landscape of the internet and the Web which have made owning and renting out a cloud an attractive business proposition.

Building a public cloud requires investment to the tune of hundreds of millions of dollars. But due to the tremendous growth of web services, companies like Google, Microsoft and Amazon were already building such systems to satisfy the computing needs of their own services. They also invested in creating software infrastructure like MapReduce and Google File System to use the cloud in a convenient way. This, coupled with the fact that a large part of their resources remained unutilized, made renting them out the obvious next step. Note that the economic viability increases as the size of the cloud increases. This is illustrated by the table (from Internet-scale Service Efficiency by J. Hamilton) which compares the network, storage and administration costs for medium-sized (1000 servers) and large-sized (50000 servers) data centers.

Another compelling reason was the fact that cloud (or SaaS) versions of a lot of enterprise applications are being created. Google, for example, has offered Gmail for enterprises at a fixed monthly cost wherein all the data (mails etc.) and software is on Google’s cloud, freeing the enterprises of storage, maintenance and installation costs. This is a direct attack on Microsoft Exchange, the predominant enterprise communication system. Hence, to defend these franchises, Microsoft is forced to offer cloud-based versions of its enterprise applications. Therefore, it had to create its own cloud infrastructure, which led to the birth of Azure.

What’s under my Control?

When you buy a computer, you get total control over it. Is that also true with the computing resources bought on a cloud? The answer is, it depends.

There is a whole spectrum available when it comes to the level of flexibility offered. On one hand, we have Amazon EC2, which looks pretty much like physical hardware. It offers API calls to request and configure hardware (obviously virtualized). The user has complete control over the entire software stack. However, such a high level of control has its flip sides too. It makes it very difficult for Amazon to offer automatic scalability, because the semantics of scaling depend to a very high degree on the software stack and the applications used.

At the other extreme, we have the Google AppEngine,


Table: Comparison of medium-sized and large-sized data centers


which only supports traditional Web-based applications with a request-reply model. It provides impressive scalability but cannot be used for general-purpose computing.

Microsoft Azure is somewhere between Amazon EC2 and Google AppEngine. Unlike AppEngine, it is not restricted to a specific type of application. It provides users the ability to use any programming language. However, the language is compiled to .NET CIL (Common Intermediate Language) and executed in the CLR (Common Language Runtime). The user thus cannot change the runtime and the OS.

Challenges and Opportunities

Despite the fact that Cloud Computing has a lot going for it, a few key challenges, related to both the technology available and legal policies adopted, need to be overcome for it to realize its full potential. I will discuss a few of them here.

As mentioned earlier, a big attraction of Cloud Computing to SaaS providers is the feature of automatic scalability. The current implementations of this feature leave a lot to be desired. For example, Amazon will charge you by the number of ‘instances’ you occupy, without taking into account the computational cycles being used by those instances. Using Machine Learning to offer automatic scalability is an area of active research, with many research labs working on it.

Virtualization is a key technology for Cloud Computing. While the benefits it provides are significant, it also comes with a performance penalty. Analysis has shown that though VMs (Virtual Machines) are excellent at sharing CPU and main memory, they cause a sharp dip in performance when it comes to I/O. A key challenge then is to make I/O architectures which work well with a large number of VMs.

Cloud Computing throws up some important legal questions. What happens if a country requires its enterprises to keep customer data within its national boundary? And wouldn’t enterprises be nervous about storing data in countries where laws exist to make this data available to the government in the interest of national security (the US Patriot Act, for instance)? How secure is the data stored in clouds and what kind of encryption should one adopt to ensure total secrecy?

A word about open-source

MapReduce is a cloud computing programming framework developed by Google. It can be termed a programming paradigm for cloud computing. It allows programmers to specify programs in terms of two operations: Map and Reduce. The terminology is borrowed from functional programming. For more details, refer to the article on MapReduce in the book Beautiful Code.

Hadoop is an open-source implementation of MapReduce, primarily backed by Yahoo. Eucalyptus is an open-source infrastructure for implementing clouds on clusters (provided with the latest Ubuntu distros).
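To make the Map and Reduce split concrete, here is a single-machine Python sketch of the classic word-count job. It only mimics the shape of the paradigm: the function names (map_phase, shuffle, reduce_phase) are chosen for this illustration and are not the API of Google's MapReduce or of Hadoop, and nothing here is distributed.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group the emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values seen for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different machine; here it is a loop.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values)
                for key, values in shuffle(pairs).items())


print(word_count(["the cloud", "The grid the cloud"]))
```

Because each call to map_phase looks at one document only, and each call to reduce_phase looks at one key only, a real framework is free to spread both phases over as many machines as the cloud has to offer.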

Conclusion

We are lucky to live in a time where all the economic and technological factors have conspired to make cloud computing viable. Based on the changes we have witnessed, speculation on future trends continues unabated. Certain changes are inevitable. Software will have to change to run on clouds instead of stand-alone hardware. Virtualization will witness rapid development to allow cloud providers to offer a single physical machine to as many customers as possible. Just as availability of water as a public utility rendered wells useless, cloud computing may make thick clients like powerful desktops superfluous. Or Cloud Computing might run into a brick wall and be dismissed as a fad of our times. Whatever the case may be, one thing is for sure (as Dylan said): The Times They Are a-Changin’!

[Written by] Gautam Sewani, 4th Year, B.Tech., Dept. of Computer Science & Engineering.


Bonjour

On 27th March, I received an offer from Verimag Research Lab, a leading research centre on theoretical and technical aspects of modeling, developing and formally verifying real-time systems. Model checking and verification being my area of interest, it was like a dream come true. Application, confirmation, excitement, last-minute visa and there I was in the Land of Fashion. The happiness of stepping on foreign land was enhanced manifold during the train journey when a Dutch girl seated beside me started the conversation with a “Bonjour!” I wished the train would run a bit slower! (Unfortunately, it was the TGV.) At times like these, one realizes the importance of Indian Rails!

Next morning, it was wonderful to see myself surrounded by the snow-capped peaks of the Chartreuse, Belledonne and Alps. What a scenic start of a new day in a new country with an awesome climate! My advisor, Dr. Oded Maler (a pioneer in my research field), made me comfortable with the project, lab routine, access policies etc. No one was allowed to be in the lab after midnight. In Dr. Maler’s words - “You are supposed to sleep during night”. (Any chance he might be aware of IITians’ schedules?) It was good to know that people were flexible and open to my ideas too. Food (followed by language) was the major problem as even non-veggies found it difficult to survive. We (Rohith, Goverdhan and I) ended up cooking on our own with me as the lead and the only chef!

Almost settled, it was time for the “Euro Trip” and the first destination was Switzerland. We procured eatables which were supposed to be unaffordable in the Swiss terrains. Train journeys through green valleys, snow-covered mountains, dark tunnels and waterfalls cast a perfect image in our minds. We visited Mount Titlis, Interlaken, Geneva and Zurich. Shockingly, we met Indians at every place, the climax being when we were traveling to Mount Titlis and an Uncle in an approaching train shouted: “Beta, kahan se aa rahe ho?” Kudos to our Population! Night-outs which were spent roaming around added another feather to the Swiss “cap”. We also travelled to Cannes (the city of hotels), Nice (city of “beautiful” beaches) and watched the fabulous fireworks on Bastille Day. After 2.5 months and a collection of over 1500 pics (digicams having revolutionized our world with thoughts like “10 mein se 1 to acchi niklegi”), it was time to come out of the dream. We had reserved 2 days for Paris and visited the Louvre, Eiffel, Arc de Triomphe and Sacré-Cœur.

Till now, I have portrayed only the lighter side. You would wonder if I ever worked! Indeed I worked hard the rest of the time (with a sense of responsibility, too, to keep up the legacy of the IIT brand). I was involved in the implementation of one of their projects using Matlab/Simulink. Who else would be happier than CS guys if they get to code! I realized how actual research was done, in the true sense. Their passion towards their work and research is commendable, with a clear demarcation between their personal and professional lives. They believe in enjoying every moment, be it at the workplace or outside. I thanked Dr. Maler and Alex (my co-guide) after discussing future aspects and research opportunities. Eventually, I waved adieu to France on 24th July. My vote of thanks to Dr. Purandar Bhaduri for his support and encouragement that got me started and my friends Prabhat and Vallabh for playing “first critics” to this piece. Wish everyone a bright future!


Manish Goyal

Manish, M.Tech. 2nd year CSE, recounts his summer internship experiences at Verimag Research Lab, Grenoble, France.


Good Times

After three long interviews, and another short one with the consular officer for my visa, it was confirmed that I would be going to Microsoft for my internship. The apartments in which we would stay - two of us from Guwahati, three from Kgp, and seven from Kanpur - were booked, and so were the flight tickets. Microsoft wanted us to stay for 12 weeks, but our vacation was only 11.5 weeks long, so it was cramped - we were to leave the day after our endsems ended.

Everyone’s third year internship is memorable, and so was ours. Except that our memories started a little earlier - at Guwahati airport, to be precise. I had not slept the night before because stuff needed to be packed, and the hostel room cleared out. What’s more, nobody told me that we wouldn’t really be needing woollens there. So I had a more-than-full suitcase that refused to shut, and I was just waiting to fall asleep. We reached the airport 1.5 hours before the flight was due, and I took my time freshening up, and getting my suitcase to close properly. We approached the check-in counter half-an-hour before the flight was scheduled, and the lady there, very politely and still smiling, refused to let us get on the plane. Lesson learned: “Don’t waste time at the airport.”

We finally reached Redmond on time, and then our home. Whatever you may say, jet-lag does occur - and my earlier experiments with sleep put me out of action for the next 18 hours. The next day was our first day at work - my work was on a Microsoft-internal programming language named Scope, in which many of the queries to Bing’s huge store of data were made. It’s similar to SQL, except that it is optimized for the resources that Microsoft uses in Bing. I had to improve its integration with Visual Studio, so that programmers elsewhere within Bing could have an easier time with the language. It was more at a proof-of-concept level, since my manager himself said that this was actually a 1 year operation, and they wanted to see how possible/easy it was. Work sure was memorable, and so were my manager’s constant words of encouragement. But I’m supposed to be talking about extra-curricular activities here, so let’s move on.

One thing that most Indians will notice there is how polite people are (although our stereotypical American isn’t quite so). The first few days, I spent training myself to respond courteously to greetings of “How do you do?” with “Good, and how about you?”, and wishing others “Have a great day!” The first time someone greeted me like that, I blushed, caught my tongue, and didn’t quite know what to say.

America has a policy called the Uniform Monday Holiday Act - with this, most government holidays are defined as the first or last Monday of some month. You always get long weekends, and never miss a holiday because it falls on a weekend. I argue here that this system should come to India as well; at least we’ll have a much better appreciation of how many holidays we have in a semester. The first such long weekend was Memorial Day, and we decided to go over to Los Angeles. Several memorable “kand”s happened in LA - I’ll only describe the least of them here.

The average Indian has to worry about his finances, and the twelve of us were not that much above average. The area in which we chose to stay in LA was probably a bad choice, and everyone was scared for their lives,


Mukund R

Mukund, 4th Year B. Tech. CSE, tells us about his extra-internship activities during the summer at Microsoft Bing, Redmond, USA.


especially after we noticed that the local Pizza Hut had thicker glass windows than in the visa office, and a prominent sign read, “Cash registers operated by time lock. Management cannot open them, even if demanded to do so.” We had heard that LA has a good public transport system. Again, in the interests of saving money, we opted to use public transport. Later in the night, at about 1 am, we found ourselves about 20 miles from the hotel, and nobody knew how the bus service worked. Finally we had to call for 3 cabs, and get a ride back to the hotel. Lesson learned: “Only New York City has public transport.”

Back to Redmond, we each had 12 days of car rent coupons, and probably the best roads we have ever seen (P.S. Germany, I’ve heard, has much better roads). Much of what we did was possible only because of this, otherwise we were all too lazy to go anywhere. We discovered a small city called Medina, home to one man called Bill Gates, which offers scenic views of the Seattle skyline. On many nights, after work, we used to drive down there and see Seattle’s reflection in Lake Washington.

Most others wanted to return home in one piece, so when we heard of a local skydiving center, only three of us volunteered. The other two were visibly scared; I wasn’t (I’m very modest). I had considerable difficulty in not thinking of the actual moment when I would jump off the plane, but if you’re not thinking of it, then there’s nothing to be scared about. A few minutes of preparation, and some $400 of payment later, we found ourselves in the back of a small aeroplane: I’m wearing this strange harness with a pink cap, one jumpmaster strapped to my back, and one lady sitting next to me filming my reactions. At 13000 ft, I was still the only fearless man, and so approached the open door confidently. I looked outside, and then back inside. I saw the chap from Kanpur looking at me, waiting for me to jump. I didn’t want to appear hesitant, and so when my jumpmaster counted to 3, I blindly released hold of the railing and jumped. We could think during the 6 minute journey down. But for the one minute you’re in free fall, there’s nothing much you can do: wave to the camera, talk to your jumpmaster, look around, think, and look at the ground. The last 2 things you should never do together. I thought, “Ok. So if the parachute opens, you’re probably safe. If it doesn’t, then you’ll definitely die. Did you have to do this?” Well, the chute opened, and I landed safely. But no, I was never scared.

Speaking of Near-Death Experiences, we went to a place called Six Flags in New Jersey. It’s a few hours by bus from New York City, and it’s extremely fun. But if you’ve ever sat on a roller coaster, never sat on one, or think that Indiana Jones in Disneyland is scary, then this place is for you. It’s home to several of the world’s scariest roller coasters - Kingda Ka, El Toro and the Great American Scream Machine. Kingda Ka was closed for the day, but we screamed during the ride on the Great American Scream Machine, and were too scared to even scream while on the El Toro. I wasn’t so petrified during the skydive as I was on the El Toro. That machine is dangerous. Lesson learned: “Never sit on a roller coaster for fun.” Of course, Disneyland roller coasters are still fun. Btw, one chap from Kgp had this to say just before the ride climaxed: “Bhagwan bacha le!”

This was just a small list of the fun things I did during my internship. I’m sure everyone who goes abroad during the summer will have similar stories to tell.

Editor’s Note: For a similar description of “How Good My Intern Is”, the extra-internship activities of Gautam Sewani, 4th Year, B. Tech. CSE, during his summer internship at Microsoft IDC, Hyderabad, India, visit his blog at http://kholublogs.blogspot.com/


Geek Corner

Inspired by spsneo’s blog (http://www.spsneo.com/blog), Sanmukh, a 3rd year B. Tech. student from the Department of Computer Science and Engineering, lists some handy tricks for retrieving previously used commands from the Linux terminal.

To all you lazy Linux command line coders and scripters, bash has a rich feature to pamper your laziness: the Bash History. Almost all of you probably know that pressing the up arrow brings the previous command onto the command prompt, and pressing it some more gives you the less recent ones. But it doesn’t end here; actually, you’ve just started. Let’s explore the rich features the bash history can provide us.

The first thing to note here is that bash stores your command history in a file named .bash_history in your home folder. Just open the file and you can see up to the last 500 commands you typed. Delete a command if you wish it to be deleted, modify it or do whatever you like. (I trust you’d figure out the reasons for doing so.) Then you have the history command in your Linux box which shows you the list of all commands along with their ids. If you need a finer result, you could use the powerful “grep” command which Linux provides.

An illustrative example:

$ history | grep ssh
165 ssh -Y 172.16.25.98
166 ssh -Y 172.16.25.98
167 ssh -Y [email protected]
169 ssh -Y [email protected]
170 ssh -Y 172.16.25.98
309 pintosssh
310 jatingassh
311 labssh
317 ssh [email protected]
319 ssh $uname@$pintos
321 ssh $uname@$pintos

Bash also allows for incremental search on the history list. Use Ctrl+R and type the first few letters of the command; the most recent command from the history list that matches your string will be displayed. Press Ctrl+R again to find a command further back. Now just press Enter to execute the command, or press any of the arrow keys to bring the command onto the prompt to edit and execute it.

(reverse-i-search)`vec’: g++ vectortest.cpp

But by far the most powerful feature is the bash history expansion. History expansions are introduced by the history expansion character ‘!’. The line selected from the history list is called the ‘event’, and the portions of that command that are selected are called ‘words’.

So there are basically three parts to a history expansion, all of which are optional and separated by a colon ‘:’

1. Event Designators:

• !n - It refers to the nth command in the history.
• !-n - It refers to the nth command in the history from the end.
• !! is an alias for !-1.
• !string - It refers to the most recently used command in the history starting with “string”. It is again a useful expansion when you don’t remember the


Bash History: Tips & Tricks

K Sanmukh Rao, 3rd Year, B.Tech. CSE


arguments to a command which you have executed earlier.
• !?string[?] - It refers to the most recent command containing “string”. The trailing ? may be omitted if “string” is immediately followed by a newline.

2. Word Designators:

• n - The nth word, counting from 0. The 0th word normally refers to the command.
Example:
$ sudo cat /etc/resolv.conf
//Instead you want to edit the resolv.conf file
$ sudo vi !!:2
//This is equivalent to sudo vi /etc/resolv.conf
• ^ - This refers to the first word. This is equivalent to :1 as referred to above. The only advantage is that you can omit the : (colon) when you use ^.
Example:
$ cat ~/.bashrc
$ vi !!^
//Equivalent to “vi !!:1”, that is, vi ~/.bashrc
• $ - This refers to the last word.
• x-y - This refers to a range of words; ‘-y’ is equivalent to ‘0-y’.
• * - This refers to all the words except the 0th one. This is helpful when you have to execute a command with all the arguments passed to the last command.
• x* - This is an alias for x-$ .

Note: If a word designator is used without an event specification, the last command in the history is used as the event.
Example:
$ cat ~/.bashrc
$ vi !:1
//This is equivalent to vi !!:1 or vi ~/.bashrc

3. Modifiers:

• h - This removes the trailing file name component, leaving the head.
Example:
$ cat /home/spsneo/.bashrc
$ ls !!:1:h
//This expands to ls /home/spsneo
Explanation: !! refers to the last command, then :1 refers to the 1st word of the last command, and then :h removes the trailing file name component, i.e., .bashrc. Hence the expansion.
• t - This removes all leading file name components, leaving the tail.
• r - This removes the trailing suffix of the form .xxx, leaving the basename.
• p - Print the new command but do not execute it.
• s/old/new - This substitutes the first occurrence of “old” with “new”.
Example:
$ cat ~/.bashrc
$ !!:s/rc/_history
//Expands to cat ~/.bash_history
• g - This is used in conjunction with the ‘:s’ modifier. This causes changes to be applied over the entire event line rather than just the first occurrence.
Example:
$ cat test.cpp test.h
$ !!:gs/test/source/
//This expands to cat source.cpp source.h
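History expansion only happens at an interactive prompt, so it is awkward to experiment with in a script. As a rough mental model, the small Python sketch below mimics how an expansion such as !!:1:h is resolved: pick the event, pick a word, then apply a modifier. It is not how bash implements the feature. The helper names select_word and apply_modifier are invented for this illustration, splitting with shlex drops quotes (bash keeps them), and os.path treats dot-files slightly differently from bash’s :r.

```python
import os.path
import shlex


def select_word(event, designator):
    """Pick one word from a previous command line (the 'event')."""
    words = shlex.split(event)      # simplification: quotes are stripped here
    if designator == "^":           # first argument, same as word 1
        return words[1]
    if designator == "$":           # last word
        return words[-1]
    return words[int(designator)]   # nth word, word 0 being the command


def apply_modifier(word, modifier):
    """Model the :h, :t and :r modifiers on a single word."""
    if modifier == "h":             # head: drop the trailing file name
        return os.path.dirname(word)
    if modifier == "t":             # tail: drop the leading directories
        return os.path.basename(word)
    if modifier == "r":             # root: drop a trailing .xxx suffix
        return os.path.splitext(word)[0]
    raise ValueError("unsupported modifier: " + modifier)


last_command = "cat /home/spsneo/.bashrc"
first_arg = select_word(last_command, "1")
print("ls " + apply_modifier(first_arg, "h"))   # models: ls !!:1:h
```

Reading the expansion left to right as event, then word, then modifier is the whole trick; the :p modifier from the list above remains the safe way to check an expansion in a real shell before running it.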

Adopt these features to save yourself from a lot of repetitive typing and enjoy the terminal :)


Digital Mind

This article will serve as an appetizer for the exciting field of computational methods in bio- and chemo-informatics. The article illustrates the importance of this growing area and the materials and methods involved. It goes on to mention the steps involved in the making of a QSAR model and compares two popular techniques employed in this field, that of artificial neural networks and decision trees.

Ever wondered what you can find in common between images, text, the shape of clouds, activity of chemical compounds and cricket match scores? In one word, the answer could be “Patterns”. For images, this can be the pixel representation of the images; for text, the frequencies of certain letters; for clouds, the fractal patterns in their shapes; for cricket matches, it might be the runs scored by a batsman in the last 10 matches. For chemical compounds and their biological activities, well, let us delve a little deeper.

From a certain point of view, any pattern can be seen as a collection of matrices and vectors; in the case of the input data, we refer to this as the input feature vector. When representing images, the feature values might correspond to the pixels of an image; when representing texts, perhaps to the occurrence frequencies of letters. For a set of chemical compounds (which is our input dataset), the feature vector will correspond to a certain number of molecular descriptors for each of the chemical compounds. These molecular descriptors are various structural features of the compounds like molecular weight, electronegativity, diameter or the number of rotatable bonds or hydrogen atoms in the

molecule. The vector space associated with these vectors is often called the feature space. To enable us to “visualize” or see this dataset, we might have to employ some dimensionality reduction method (like principal component analysis - PCA, or its nonlinear variants) and reduce the original feature space to a 2D or 3D space.

Figure: Structure of HEPT derivatives (some of which exhibit anti-HIV activity). This structure can lead to millions of possible compounds by various combinations of R1, R2 and R3!

This is the generic structure of a set of compounds, 1-[(2-hydroxyethoxy)methyl]-6-(phenylthio)thymine or simply, HEPT derivatives. What makes this class of compounds noteworthy is that some of them have been shown to exhibit anti-HIV activity and inhibit the HIV retrovirus. This makes them exciting objects of research for the drug discovery industry. Now look at the structure closely. Here, R1, R2 and R3 are alkyl groups and X is either oxygen or sulphur. Each of R1, R2 or R3 can be a hydrogen atom, a methyl group, an ethyl group, an isopropyl group, a halo alkyl group and so on. The list goes on expanding as we increase the

Intelligent Drug Discovery: Predicting Structure-Activity Relationships

Om Prasad Patri, 3rd Year, B.Tech. CSE


number of carbons and branching. Let us say we have 1000 groups which can act as R1, R2 or R3. Considering all possible combinations, we have 10^9 (one billion) possible compounds. We don’t want to miss out on any compounds which can be potential HIV inhibitors. Now, it is not feasible to individually manufacture all these 10^9 compounds and test their anti-HIV activity!

However, our task would be much easier if we knew something about the “patterns” or relations between the chemical structure and anti-HIV activity. For example, it could be a set of IF-THEN rules like: IF dipole moment of R1 is high AND molecular weight of R2 is low, THEN anti-HIV activity is high. Such relations between structural features of a chemical molecule and its biological activity are termed quantitative structure-activity relationships (QSARs). QSARs can be seen as a method to encapsulate the chemical and biological information about a compound such that some conclusions can be drawn about the relationships between chemical structure and biological activity. In this article, we will consider two major areas where prediction of QSARs is of prime importance: drug discovery (specifically: designing compounds with high anti-HIV activity) and predictive toxicology (specifically: identifying compounds with carcinogenic potential). The drug discovery application will be treated as a regression exercise (predicting an anti-HIV-activity index of the chemical compounds) and the toxicology application as a classification task (predicting whether a compound has carcinogenic potential or not).
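To see how such a rule cuts down a combinatorial library, here is a small Python sketch. The four substituent groups, their descriptor values and the two cut-offs are all made up for illustration; they are not measured chemistry, and the rule is just the example IF-THEN rule above written as a function.

```python
from itertools import product

# Hypothetical substituent groups with invented descriptor values.
GROUPS = {
    "H":         {"dipole": 0.0, "weight": 1.0},
    "methyl":    {"dipole": 0.4, "weight": 15.0},
    "ethyl":     {"dipole": 0.5, "weight": 29.0},
    "isopropyl": {"dipole": 0.6, "weight": 43.0},
}


def library_size(n_groups, n_positions=3):
    """Number of compounds when each position can take any of n_groups."""
    return n_groups ** n_positions


def rule_predicts_high_activity(r1, r2, dipole_cutoff=0.45, weight_cutoff=20.0):
    """IF dipole of R1 is high AND weight of R2 is low THEN activity is high."""
    return (GROUPS[r1]["dipole"] > dipole_cutoff
            and GROUPS[r2]["weight"] < weight_cutoff)


# Enumerate every (R1, R2, R3) combination and keep those the rule selects.
candidates = [
    (r1, r2, r3)
    for r1, r2, r3 in product(GROUPS, repeat=3)
    if rule_predicts_high_activity(r1, r2)
]

print(library_size(1000))
print(len(candidates), "of", library_size(len(GROUPS)), "compounds pass the rule")
```

With only four groups the whole library can be enumerated in an instant; with a thousand groups the same rule would be applied to descriptors computed in software, which is exactly the appeal of predicting activity before synthesising anything.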

Based on a database of some known compounds and their biological activities, intelligent techniques can be used to determine QSARs for them. The QSARs can then be used to predict properties of a large number of compounds which have not yet been manufactured, a process termed in-silico drug design. Only selected compounds which are expected to have desirable biological activities need then be prepared in the industry, thus saving on time, money and resources.

Computational methods can greatly help in the drug design process by predicting activities of compounds before they are actually manufactured. For the toxicology application also, proper prediction of carcinogenic potentials can ease the tasks of

various drug regulation authorities as well as reduce rampant animal testing in the area. These techniques include approaches based on statistical and machine learning, pattern recognition, clustering, similarity-based methods, as well as biologically motivated approaches, such as neural networks, evolutionary approaches or fuzzy modeling, collectively described as Computational Intelligence. Applications of artificial intelligence (AI) methods involve selection of relevant information, data visualization, classification and regression, optimization and prediction.

A generic step-by-step process for modeling any pattern recognition problem is as given in the figure. Since QSARs are also essentially patterns, designing a QSAR model follows essentially the same steps.

Figure: A step-by-step model for finding ‘patterns’

Our aim will be to predict the values of the final ‘target’ activity by using the data in the input feature vector. This will involve extracting complicated relationships between various variables in the input and the target data. We will have to construct a learning system (for regression/classification), ‘train’ the system with data from previously observed ‘target’ values (referred to as supervised learning), and then use the trained

Page 23: CSEA Linked List Node3

Digital Mind

Computer Science and Engineering Association, IIT GuwahatiLinked List 23

system to predict the values of an� -HIV ac� vity index (regression) or predict a ‘true’ or ‘false’ value for carcinogenic poten� al (classifi ca� on). This system has then to be tested on data which were not in the training set so as achieve suffi cient generaliza� on. Finally, we have to interpret the results in terms of their biochemical signifi cance thus leading to rela� ons between chemical structures and biological ac� vi� es of compounds, or simply, QSARs.

Summarizing, the major steps for the implementation of a QSAR model (as shown in the sketch) would be:

• Dataset Preprocessing : Selection and preprocessing of a dataset with a well-known endpoint, e.g. index of anti-HIV activity or carcinogenicity or toxicity

• Chemical Representation : Identification and calculation of relevant features (descriptors)

• Construct classification and regression models : Detection of relationships (QSAR models) between these features and the concerned endpoint

• Evaluation of the predicted model : The QSAR model and learned relationships have to be validated and their performance evaluated

• Interpretation of the relationships in terms of the defined endpoint
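The first four steps above can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the two descriptors (a molecular weight and a ring count), the hidden rule that decides activity, and the 1-nearest-neighbour learner are all invented here and are not taken from any real QSAR dataset or from the methods discussed in this article.

```python
import random

# Step 1 (dataset): a toy dataset of (descriptor vector, active?) records.
# Both descriptors and the labelling rule are hypothetical.
def make_toy_dataset(n=200, seed=0):
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        weight = rng.uniform(100.0, 600.0)       # hypothetical molecular weight
        rings = rng.randint(0, 5)                # hypothetical ring count
        active = weight < 400.0 and rings >= 2   # hidden "true" relationship
        data.append(((weight, rings), active))
    return data

# Step 2 (representation): rescale each descriptor to [0, 1] so that
# no single feature dominates the distance computation.
def fit_scaler(rows):
    cols = list(zip(*rows))
    return [(min(c), (max(c) - min(c)) or 1.0) for c in cols]

def scale(row, scaler):
    return tuple((v - lo) / span for v, (lo, span) in zip(row, scaler))

# Step 3 (model): 1-nearest-neighbour, about the simplest supervised learner.
def predict(train, x):
    nearest = min(train, key=lambda rec: sum((a - b) ** 2 for a, b in zip(rec[0], x)))
    return nearest[1]

# Step 4 (evaluation): measure accuracy on compounds held out of training.
def run_pipeline():
    data = make_toy_dataset()
    train_raw, test_raw = data[:150], data[150:]
    scaler = fit_scaler([x for x, _ in train_raw])
    train = [(scale(x, scaler), y) for x, y in train_raw]
    correct = sum(predict(train, scale(x, scaler)) == y for x, y in test_raw)
    return correct / len(test_raw)

accuracy = run_pipeline()
print(f"held-out accuracy: {accuracy:.2f}")
```

The last step, interpretation, is left out on purpose: it needs a chemist to relate what the model has learned back to the structure of the compounds.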

Artificial neural networks (ANNs), with a layered architecture, can be used to model such a complicated relationship to provide any desired mapping. The rise in use of ANNs for pattern recognition problems is because of their ability to generalize input-output relationships from a limited set of training data. However exciting ANNs may seem, they alone are not deemed good enough by today’s standards and have given way to more advanced methods. Decision trees are another type of pattern classifiers which arrive at a decision by a sequence or hierarchy of stages, choosing one branch of the tree at each intermediate stage. A decision tree is basically like any other tree data structure but here, the decisions taken while traversing the tree (which branch to choose) are decided by a certain condition or question asked in the parent node.
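To make the traversal concrete, here is a minimal sketch of a hand-built decision tree in Python. The class names, the two features (`aromatic_rings`, `molecular_weight`), the thresholds and the labels are all hypothetical and chosen only for illustration; a real tree would be learned from training data rather than written by hand.

```python
# A node is either a leaf (a final decision) or a question about one feature.
class Leaf:
    def __init__(self, label):
        self.label = label

class Question:
    def __init__(self, feature, threshold, if_true, if_false):
        self.feature = feature      # which descriptor the parent node asks about
        self.threshold = threshold  # the condition is: value >= threshold
        self.if_true = if_true      # branch taken when the condition holds
        self.if_false = if_false    # branch taken otherwise

def classify(node, compound):
    # Walk down from the root, choosing one branch at each intermediate stage.
    while isinstance(node, Question):
        if compound[node.feature] >= node.threshold:
            node = node.if_true
        else:
            node = node.if_false
    return node.label

# Hypothetical tree: compounds with at least 2 aromatic rings and a
# molecular weight below 400 are labelled "active".
tree = Question(
    "aromatic_rings", 2,
    Question("molecular_weight", 400.0, Leaf("inactive"), Leaf("active")),
    Leaf("inactive"),
)

result = classify(tree, {"aromatic_rings": 3, "molecular_weight": 250.0})
print(result)
```

Every internal node here corresponds directly to a structural feature of the compound, which is what makes such a tree easy to read.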

Decision tree classifiers are sequential compared to the massive parallelism of ANNs. ANNs also have better generalization capabilities. However, for our purpose decision trees can be more beneficial as some structural feature of a chemical compound can be directly represented as a node in the decision tree instead of the elusive hidden layers in neural networks. Training or learning using decision trees is faster than neural networks because in a decision tree, all the training examples are considered simultaneously to make every decision. Further, decision trees do not impose any restrictions on the distribution of the input dataset unlike many other methods. If we could make a ‘hybrid’ structure which combines the best features of neural networks and decision trees, we could probably get the best of both worlds.

Page 24: CSEA Linked List Node3

Nostalgia

Krishna Kishore Annapureddy (http://linkedin.com/in/akkishore/) is an alumnus from the batch of 2004-2008. He was the General Secretary, CSEA during his final year at IIT Guwahati and was popularly known as ‘KK’. Unsatisfied with his job as a Software Engineer at Google, he went on to work for ‘Knowlarity Communications’ (http://www.knowlarity.com), a startup formed by a group of IITians, which provides automated communication solutions.

Komal Jalan, Batch Representative of 3rd year B. Tech. CSE, interviews him for CSEA.

Recount some parts of your life at IIT Guwahati.
Now that I am out of IITG, I can say it’s one of the best parts of my life. Friends, parties, labs, quizzes, exams, Alcher, Manthan, Techniche, achievements, failures, girlfriends, breakups, makeups, and finally graduation: it’s a wonderful mix of everything. To the 4th year guys, I should say this is your last shot at it. 8th sem is your best. You will have the least responsibilities. Enjoy to the fullest and become the laziest ;)

Was Google a dream job for you as for most IITians?
It was nothing like a dream job. I was actually confused in my 7th semester on whether to go for an MS or a PhD or a job. I was not able to decide and ended up taking the first job that I got.

Why did you leave Google?
I didn’t like the work that I was doing. There was not much of a challenge. As a fresher, I was mostly doing JS, HTML templates, etc. I am more interested in building systems. Google’s organisation of teams is such that there are separate teams for infrastructure/systems development, and as a fresher you cannot move to those teams.

The main motivation behind the start-up plan?
Just to clarify, I am just an employee in the startup that I am working in currently. As an early engineer in a startup, you get to do the most exciting work: build systems from scratch, and that’s far more fulfilling and rewarding.

What are your further plans?
Nothing specific for now. Given that I own some part of the systems developed at the startup, I am not planning on moving. I will try and make them better. But in the long term I have plans of starting my own venture.

Any advice to the students of IITG?
The community that you have around you, your peers, your professors, and the infrastructure that you have at your disposal are the best things that the institute has to offer you. Make the best use of them. Also, I would like to propel the culture of startups among you guys. I am also learning and I will not give any advice here. But I would like to give you examples of some startups of IITG alumni:

• Drishti Soft (http://www.drishti-soft.com/) by Sachin Batia 1999-01

• Muziboo (http://www.muziboo.com/) by Prateek Dayal 2001-05

• ViVu (http://www.vivu.tv) by Siva Kiran 2002-06

• Cash UR Drive (http://www.cashurdrive.com/) by Raghu Khanna 2004-08

Check them out. If you guys have ideas, talk to your peers, seniors, professors. Make use of EDC at the institute and let it come to life. Be ready to take risks in your life: “if you are not risking, you are risking it all”. All the very best.


Page 25: CSEA Linked List Node3

Bird’s Eye View

ANDROID is a fairly recent mobile operating system that delivers a complete set of software for mobile devices: an operating system, middleware and key mobile applications. It was initially developed by Google, and later the Open Handset Alliance (OHA), a multinational alliance of 48 technology and mobile industry leaders. The Android mobile platform has enabled wireless operators and manufacturers to give their customers better, more personal and more flexible mobile experiences. The entire source code of Android is available under an Apache open source License.

Android runs on top of the Linux kernel, and allows developers to write applications in Java, using a set of Java libraries bundled with the Android platform. Further, it utilizes a custom Java Virtual Machine that was designed to optimize memory and hardware resources in a mobile environment.

Highlights of Android

• Open : Android was built from the ground up to enable developers to create mobile applications that take full advantage of everything a handset has to offer. For example, an application can call upon any of the phone’s core functionalities such as making calls, sending text messages, or using the camera, thus allowing developers to create richer and more cohesive experiences for users.

• All applications are created equal : Android does not differentiate between the phone’s core applications and third-party applications. They can all be built to have equal access to a phone’s capabilities. With devices built on the Android Platform, you are able to fully tailor the phone to your interests. You can swap out the phone’s homescreen, the style of the dialer, or any of the applications! You can even instruct your phone to use your favourite application to view photos.

• Breaking down application boundaries : With Android, a developer can combine information from the web with data on an individual’s mobile phone - such as the user’s contacts, calendar, or geographic location - to provide a more relevant user experience. For instance, you can view the location of your friends and be alerted when they are in the vicinity.

• Fast and easy application development : Android provides access to a wide range of useful libraries and tools that can be used to build rich applications. For example, Android enables developers to obtain the location of the device, and allows devices to communicate with one another, enabling rich peer-to-peer social applications. In addition, Android includes a full set of tools that have been built from the ground up alongside the platform, providing developers with high productivity and deep insight into their applications.

Market Share

According to Q2 2009 market share data from Canalys, the share of various Mobile OSes in the worldwide smartphone market, in order, is Symbian (50.3%), RIM Blackberry (20.9%), Apple iPhone (13.7%), Windows Mobile (9%) and then Android (only 2.8%). However, Gartner Inc has predicted that by 2012, Android will hold 14% share in the global smartphone market, ahead of iPhone, Windows Mobile and Blackberry smartphones. Android will rank 2nd globally, behind the Symbian OS (39% by that time).
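As a quick check on these figures, the short Python sketch below adds up the five quoted Q2 2009 shares, derives what is left for all other platforms, and computes how many times Android's share would have to grow to reach the predicted 14%. The numbers are the ones quoted above; the variable names are only illustrative.

```python
# Q2 2009 worldwide smartphone shares (percent), as quoted from Canalys above.
shares_2009 = {
    "Symbian": 50.3,
    "RIM Blackberry": 20.9,
    "Apple iPhone": 13.7,
    "Windows Mobile": 9.0,
    "Android": 2.8,
}

named_total = round(sum(shares_2009.values()), 1)   # share held by the five named OSes
others = round(100.0 - named_total, 1)              # everything not named above

# Gartner's prediction quoted above: 14% for Android by 2012.
android_2012 = 14.0
growth_factor = round(android_2012 / shares_2009["Android"], 1)

print(named_total, others, growth_factor)
```

In other words, the five platforms named above account for 96.7% of the market, and the prediction implies a fivefold increase in Android's share.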

By providing developers a new level of openness that enables them to work more collaboratively, Android has accelerated the pace at which new and compelling mobile services are made available to consumers.

By Puneet Jindal, 4th year, B. Tech. CSE

Android


Page 26: CSEA Linked List Node3

As always, Google is your friend. But you need some trivial and intelligent manipulations before seeking help. Best of Luck !

Q1. Who was the first HOD of IITG’s CSE Department?

Q2. Connect the pictures below.

Q3. [This question appeared in a competition held at Stanford in 1985.] It is widely known that syntactic and semantic correctness are distinct. A grammatically correct sentence might be meaningless. Noam Chomsky gave a famous example: “Colourless green ideas sleep furiously.” Compose a passage in which this statement becomes meaningful. The shorter your passage, the better.

Q4. Connect the pictures on the right.

The Turing Test

Page 27: CSEA Linked List Node3

The Turing Test

Q5. Find a connection between the pictures above.

Q6. In 1963, Harvard linguist Susumu Kuno asked a computerized parser to process the sentence, “Time flies like an arrow.” The computer gave 7 different interpretations of the same sentence - which you’d find on Wikipedia.

Consider the phrase: “Outside of a dog, a book is man’s best friend.” Give as many interpretations of this as you can possibly come up with.

Q7. An arcsecond is 771.6*10^-9 of a circle. The wavelength of green light is 550*10^-9 m. Light takes about 3.3*10^-9 s to travel 1 meter. The radius of the Hydrogen atom is 2.5*10^-11 m. The charge on a proton is 1.6*10^-19 C. There is no doubt that the “scientific notation” has greatly expanded our capacity to state numbers, both large and small. Yet, we could create smaller numbers still, say 10^(-10^10^10^10^(4.829*10^183230)). You know well that there is no least positive real number. Thus, if you played a game with a friend, in which you took turns to write a smaller number, no clear winner would ever emerge. But now we challenge you - you have only one chance, and only one sheet of A4 paper. The game becomes interesting.

Name a single positive real - using mathematical notation and/or language that any modern mathematician could understand. Your objective is to write a number smaller than that written by everyone else.

Some acceptable answers include 4, pi, 1/pi, “the reciprocal of that number obtained by adding pi to itself 16 times”, numbers in scientific notation - say 6.626*10^-34. You might also use functions - “the reciprocal of the exponential function e^x, evaluated at x=500”, say. You could define new functions like f_500(1000) where f_0(x) = e^-x, and f_i(x) = 0.5*f_(i-1)(x).
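To get a feel for how small the last example is, note that unrolling the recursion gives f_i(x) = 0.5^i * e^-x, so f_500(1000) = 2^-500 * e^-1000. Such a number underflows to zero in ordinary floating point, so the Python sketch below works with its base-10 logarithm instead. The helper name `log10_f` is our own and is used only for illustration.

```python
import math

def log10_f(i, x):
    """Base-10 logarithm of f_i(x), where f_0(x) = e^-x and f_i(x) = 0.5 * f_(i-1)(x).

    Unrolling the recursion gives f_i(x) = 0.5**i * e**-x, hence
    log10 f_i(x) = -(i * log10(2) + x * log10(e)).
    """
    return -(i * math.log10(2) + x * math.log10(math.e))

exponent = log10_f(500, 1000)
print(f"f_500(1000) is about 10^{exponent:.2f}")

# Evaluating the value directly is not possible with plain floats:
direct = 0.5 ** 500 * math.exp(-1000)   # underflows to 0.0
```

So f_500(1000) is roughly 10^-585: tiny, yet still far larger than the tower of exponents quoted in the question.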

Answers which are not acceptable are things like, “half of the smallest number written by the other contestants” - for you cannot assume their existence, and “my age, divided by the age of the universe” - a mathematician could not evaluate such a number. It has to be greater than zero, and we are talking about standard real analysis - modern mathematics has produced such marvelous ideas as Internal Set Theory, which talk about numbers smaller than any known real - these we have neither the background nor the time to evaluate.

Submission Instructions

- Mail your answers to [email protected] with the subject as “Turing Test”
- Only one entry per email id would be accepted
- Deadline for submissions is 6 PM, 9th November (Monday)
- Members of CSEA and all ‘related’ are not entitled to participate
- Names and photographs of lucky winners will be published in the next issue.

Page 28: CSEA Linked List Node3

The Turing Test


Answers of the Turing Test (Node 2)

1. Google Copernicus

2. X - RocketMail Y - YahooMail

3. Rediff

4. Padmasree Warrior

5. “You”. In 2006, Time magazine chose “You” (the people) as its Person of the Year. In 2005 it was Bono, Bill and Melinda Gates; in 2007 it was Putin.

6. Alice and Bob

7. Wikimedia Foundation. The picture signifies the critics of its Wikipedia project

8. Alexa Internet, Inc. (U.S.-based subsidiary company of Amazon.com)

9. Reddit and Digg

10. Longhorn and Blackcomb projects of Microsoft leading to Vista

Winners:

First Prize: P V Ravi Kiran Sastry, Department of CSE.

Second Prize: Sameer Agarwal, Department of CSE.

Page 29: CSEA Linked List Node3

Linked List

Brought to you by:

Computer Science and Engineering Association, Department of Computer Science and Engineering, Indian Institute of Technology Guwahati

Email: [email protected]
Website: http://csea.iitg.ernet.in

Mail in your suggestions to [email protected]. Visit http://csea.iitg.ernet.in for more.

Save Trees. Do not waste paper.

The Editorial Team

• Om Prasad Patri (Editor)

• Abhishek Anand

• Nitin Dua

• Karthik R

• Siddharth Prakash Singh

• Vinay Kumar (Design)
