CSEA Linked List Node4


Upload: iit-guwahati

Post on 23-Mar-2016


DESCRIPTION

Magazine of the Computer Science and Engineering Association-Student Body Department of Computer Science and Engineering,Indian Institute of Technology Guwahati,India

TRANSCRIPT

Page 1: CSEA Linked List Node4

Inside:

Faculty Talk with Prof. S. B. Nair

4th Year Special

Cover Story


CONTENTS

Editorial

Faculty Talk: Dr. S. B. Nair, Associate Professor, CSE

Neminath Hubballi, Research Scholar, CSE

Anurag Nilesh / Ashish Thakur, 4th Year B. Tech. CSE

Memcached: Scaling your Web Application. Siddharth Prakash Singh, 4th Year B. Tech. CSE

COVER STORY

Ajax: Pitfalls and Solutions. Shirish Surti, 2nd Year M. Tech. CSE

GEEK CORNER

CSP for Verification of Security Protocols. Niteesh Kumar, 2nd Year B. Tech. CSE

DIGITAL MIND

Abhishek Gupta

NOSTALGIA

4th Year Special: BTP Mania

Internship: Do’s and Don’ts

Internet 3.0: A New View of Connectivity

DIGITAL MIND

GOOD TIMES

You@YourDomain with Google Mail

4th Year Batch Photograph

Karthik R, 2nd Year M. Tech. CSE


Computer Science and Engineering Association, IIT GuwahatiLinked List 03

While attending a workshop on “Science Journalism and Communication” recently, I realized how grave the problem of communication in science and technology is. It is one thing to understand what is happening (scientifically or technologically), but it is a different ball game altogether to present it to a layman and make him understand. The issue is especially pertinent in India, where the share of media coverage devoted to science is well below recommended international standards. Often, the temptation to sensationalize scientific news (or just plain ignorance on the reporter’s part) leads to miscommunication. Just think of how difficult it is for most of us to understand a scientific paper in any journal or magazine. And we are on a (supposedly) technologically aware campus. One brilliant solution I happened to learn of (at this workshop) is that of ‘Scientoons’ (www.scientoon.com), originated by Pradeep K. Srivastava of the Central Drug Research Institute (CDRI). A scientoon is a science cartoon which not only makes you laugh and smile but also teaches a scientific concept in a novel manner. And so in this issue, instead of the usual comic strips, you will find a select few scientoons here and there. Go, search for them!

Through Linked List, we have constantly endeavored to learn how to bring to you computer science stuff and make you understand whilst retaining your interest in the topic. This fourth node of Linked List is yet another addition to that effort.

For everyone on campus excited about going on their summer internships, we have some special tips. And to preserve the memories of our beloved 4th yearites, the to-be alumni, we chose a few final year students and took a look at their BTPs and the work they have been doing. Without labelling the issue as ‘futuristic’, I would like to say that it covers a few technologies which will start (or have already started) dominating the web as we see it today. Take the instance of memcached, which is used by a great many web applications; among the vast universe of its users, you might have heard of a certain Facebook or Twitter or Digg (or LiveJournal or YouTube). Be it emotional robots, very-large-scale web applications or the new Internet 3.0, this node has something of each of them. So wait no longer, and delve deeper!

With this node comes the end of my tenure as Editor of Linked List. I hope that you have enjoyed reading the previous issues as much as we have enjoyed creating them. I would like to take this opportunity to thank the entire Editorial Team for the brilliant support they have provided. My special regards to V. Krishna Brahmam, 2009 alumnus CSE IITG, for the design templates. We wish the new team all success and hope that the succeeding Nodes of this Linked List will continue the tradition of excellence!

Om Prasad Patri
Publication Secretary, CSEA
Editor, Linked List

Editorial


Faculty Talk with Prof. S. B. Nair

Dr. S. B. Nair, Associate Professor in the Department of Computer Science and Engineering, shares his interests with us.

Please tell us something about your academic background.

I was born, brought up and educated in a (then) small city named Amravati in Maharashtra. I graduated in Science from Nagpur University and then did a Master’s in Applied Electronics followed by a Master’s in Electronics Engineering from Amravati University. It was during this phase that I came across AI, and I then pursued my Ph.D. in an allied area at the same University.

You worked as a Senior Lecturer before completing your Ph.D. When and how did you then decide to pursue your Ph.D.?

Yes, I worked as a faculty member at the Post Graduate Dept. of Applied Electronics at Amravati University for more than 12 years before I joined IITG. I always wanted to pursue higher education. During my initial years at the University, I was selected as a regular candidate for Ph.D. at the CSE Dept. at IITM. But the University refused to sponsor me. My would-have-been Supervisor at IITM and several others advised me not to quit a University job. Instead they asked me to pursue the same at my University, with my would-have-been Supervisor acting as a co-guide. Of course, all this never really worked, thanks to the red tape and the very many rules and regulations. After slogging for about 6 years, while also working as a Senior Lecturer and carrying out most of my research work at the Central Electronics Engineering Research Institute (a CSIR lab.) at Pilani and at the Centre for AI & Robotics (a DRDO lab.) at Bangalore, I obtained my Ph.D. in 1998.

What are your fields of interest?

I am currently into Bio-inspired Robotics. This is all about Bio-inspired AI paradigms acting as controllers for real robots. These paradigms try to cope with the mess that the earlier AI researchers thought was the best. I am also interested in developing software and hardware for the physically challenged and deploying them for free.

How did you develop interest in them?

This is a long story. I’d rather cut it real short. It began with my teaching AI for the first time way back in 1987. I was using the same book which I later co-authored. But I found all the stuff within to be a bit theoretical and wondered whether it could ever be realized. Finally I concluded that the challenge is not in the software alone but in the hardware too. Well, that’s how robots, which form the ideal test-beds for intelligence, came into my purview of interest.


You have co-authored a book on ‘Artificial Intelligence’. Tell us something about this book.

Authoring a book was something I dreamed of always but never got to it. The McGraw Hill guys had been pursuing me right from the days when I was a faculty at the University. It finally happened in 2009. The book is one of the oldest ones in AI and the current version happens to be its 3rd edition. The major revision can be seen in the last few chapters that reflect more of bio-inspired paradigms. The book also has its on-line learning centre pages on the web which among others, hosts code for some of the relevant programs.

You have recently done some work related to ‘Emotional Robots’. Could you explain what ‘Emotional Robots’ means?

Emotions have always been easy to express but hard to actually define and generate. Their causes and mechanisms in biological beings are yet to be uncovered. Emotional robots provide, to some extent, a means to remove the monotony which otherwise exists in the way robots work and present themselves. How would it be if a robot welcomed you with the same gusto as when you meet a long lost friend? Or imagine your car purring when you take it for a long and fast ride on those under-ideal-conditions-like roads and then exclaiming “Ouch!” when it goes over a bump, after which it tends to move cautiously. Well, all this renders a certain amount of human-like feeling, forcing an imaginary bond between man and robot. Of course emotions are not embedded just for the heck of it. Emotional robots can in many ways act as constant companions and soothe the mentally and physically challenged. Such robots have been known to have a profound effect on children who have suffered from psychological problems or trauma. Many researchers have in some way or the other tried their hand at making such emotional robots. At the moment this field is all set to take off in spite of several debates on the manner in which emotions are recognized, comprehended and generated. At the CSE Dept. a few of the students are presently working on recognition as well as generation of such emotions.

Please mention a few interesting projects that you have been associated with in the past and the present.

After I joined IITG, I was involved, along with other faculty members, in a project funded by the Ministry of Information and Communication Technology that aimed at setting up a Resource Centre for Indian Language Technologies. The centre aimed at developing technologies for languages of the North-East. The main objectives were to map these languages and provide for relevant fonts, editors, multi-lingual dictionaries, language corpora, OCRs and speech and language translation systems.

Around the same time Microsoft Academic Alliance also provided some modest funding in dollars for two projects: one involving a network of robots and the other on an intelligent desktop agent. Though these projects have been completed, a small group of students is still working under my guidance to enhance the former.

Later, based on some wonderfully useful work on software aids for the visually challenged carried out by some of the CSE students, we received modest funding from the Government to develop and enhance such technologies. At the moment a group of three B.Tech. students is actively engaged in the design of an autonomous robotic guide for the visually challenged. Christened Nayan, the system, comprising both a hardware and a software component, will guide the person both indoors and outdoors.

In 2005, I had the opportunity to mentor a great team of four B.Tech. students (Gautam Das, Rahul Singh, Suvesh Malhotra and Archit Gupta) who realized a project codenamed SapienNet, a very useful people-to-people network. The students made it to the finals of the Windows Embedded Systems Challenge held that year at the Microsoft Research campus in Redmond, USA.

Recently, the Robotics Lab. at IITG has been enriched with state-of-the-art technology. What are the objectives of this lab?

The term state-of-the-art is a bit of an exaggeration! We have recently received funding to set up labs. for Robotics and Embedded Systems from the Department of Science and Technology, under their FIST programme. This year we have procured the new generation Lego Mindstorms NXT robots and software such as Matlab and LPA Prolog. Just keep “all” your fingers crossed! If things go well we will soon see the lab being equipped with more robots such as e-pucks and Boe-bots complemented by sensor nodes. We will also be procuring speech corpora and related software and robot simulators like Webots. The biggest problem at the moment is space. While moving around in the lab. these robots will thus need to be deft at obstacle avoidance, not to mention acquire an ability to safeguard themselves from being stomped! As for the objectives, we hope all this will attract more students to actually dirty their hands in the lab. and come up with novel and useful ideas. The lab will of course aim at research in the area and provide the much needed practical test-bed for the algorithms being developed.

Do you agree that present students are more inclined to doing innovative work in software than hardware? What are the reasons behind this trend?

We seem to take pride in the fact that many countries view us as a massive human resource of software engineers or programmers. On one side this seems to be good, but on the other we seem to be writing all the goodies for the hardware for cheap while the others manufacture, embed and sell the hardware to us with the software partly or fully written by us. This is analogous to putting in a lot of effort to publish your work in a (foreign) International Journal and then subscribing to the same by paying hefty amounts. They seem to have the machinery (hardware) to make the hardcopies while we supply the intelligent material (software) to be printed (loaded/embedded) free of cost! Both ways they seem to get the lion’s share, which is possibly why they are a developed country! Imagine if we stressed a bit more on hardware and co-developed it with the software: where would the others be? We may then as well swap the term “developing country” with theirs, viz. “developed”. Of course this will take some investment and a paradigm shift from our current mindset of earning a fast buck.

You spent a year at Hanbat National University, South Korea. How do you compare the academic environment there with that at IIT Guwahati?

The academic environment there is great. It’s more of practice than theory. So a proper blend seems missing, which of course is true for us too. I remember seeing a student who required a zinc battery of certain specifications. In a few days he showed me the same, all assembled electrode by electrode, neatly packed and of course working to specifications! The same was the case with robots.

Things are mostly custom-made and assembled and there is usually no concept of buying and using ready-made stuff. This way one learns to be more practical while solving a problem and knows for sure whether what one thinks can actually be implemented.

The University also runs workshops throughout the year educating school children on how to assemble and use robots and other things. This is an important service they do to their prospective students. Masters’ projects are many a time industry-related problems. The student carries out a project that is finally used in some way or the other by the concerned industry. One interesting thing is that most of the Dept. offices and labs. are handled by ex-students. So virtually everyone is a technical hand and appreciates the problems we encounter in an administrative set-up. Everything is kept spick and span: while the Professors sweep and clean their respective rooms, the labs. are cleaned and tidied by the students. Since everyone there has to undergo mandatory military, navy or air force training for 2-3 years, they are very disciplined and controlled in their behaviour.

It being a developed country, procuring things is also fast and efficient, with minimum red tape. The Government funding is liberal but not without results. If you get funds, then both the Professor and his students have to slog to justify the same. It is not just a matter of submitting a report and defending it at the end of the day. They need to show something that actually works and is of use to either the industry or the public. One cannot get away with leaving the application of the technology vague. Funding is given mostly for practical research, which is why the country is known for its hardware and economy. Language posed a big barrier. English is spoken only by a select few, which did make things difficult.


What are the things that one must consider before devoting his/her life to research?

Apart from ethics, one of the most important aspects is that one should have a passion for research. If you are doing research merely to acquire a degree, to ensure someone gets a degree, or to earn more, then such research in my opinion is of no real use to anyone but oneself. Further, one needs to muster courage and build the enthusiasm to venture and tread on untrodden paths. Another dominating aspect is whether the research will have some impact on society at large.

Take the case of Prof. Norman Borlaug, Nobel Laureate and winner of the World Food Prize and creator of a life-saving wheat strain. But for his research, we may have seen the developing world (including India) ravaged by war and famine. Now this is what I consider real research. Wonder why all this reminds me of the Irish proverb – “You’ll never plough the field by turning it over in your mind.”

Any message that you would like to convey to the students through this magazine?

There are several good things that can be practiced with a wee bit of effort, and if they are practiced by all of us comprising this society, we will find one another more amiable. A few such things that come to my mind at the moment are humility, patience, discipline and perseverance. I would also advise you all to read, comprehend and practice the message portrayed in some of the classic short stories written by Leo Tolstoy: How much land does a man need?, Elias, What men live by, and others. I used to send these over as a parting gift to the older batches. Of course, if you have read them, maybe it’s time to read them again!

Let me quote Gandhiji’s talisman which as per him could be used when you are in doubt: “Recall the face of the poorest and the weakest man you have ever seen and ask yourself if the step you contemplate is going to be of any use to him. Will he gain anything by it? Will it restore him to a control over his life and destiny? In other words will it lead to Swaraj of the hungry and the spiritually starving millions? Then you will find your doubts and your self melting away.”

[Interviewed By]

Anurag Kumar Nilesh, B. Tech. 4th Year CSE, for CSEA


Robotics Lab @ IITG

Recently, a Robotics Lab has been officially set up in our department. The mentors for this lab are Dr. S. B. Nair and Dr. P. K. Das. The website of the lab is http://www.iitg.ernet.in/sbnair/RoboticsLab/index.html

The objectives of this lab are to
1) Design robots which can assist humankind
2) Put theoretical concepts into practice and
3) Support innovative ideas

A group of three students participated in the Microsoft Imagine Cup 2010 in the Accessibility for Local Innovation Award category. The project idea, named Nayan, was to design a robot that can guide visually challenged persons in outdoor as well as indoor locations.

People who are interested in robotics, please join the Google group created for the Robotics Lab at IITG: http://groups.google.co.in/group/roboticslab-iitg


Digital Mind

Internet 3.0: A New View of Connectivity

By Neminath Hubballi, Research Scholar, Dept. of CSE

The history of networking dates back to 1969 and ARPANET, a project of the United States Department of Defense. It is from here that the world of computer networking sprang. It was Licklider (the head of the ARPANET team) who found an analogy between human interaction and computer interaction. The success of the first computer-to-computer interaction created enormous interest and drove research in communications for a prolonged period. In the process, a lot of communication architectures were implemented.

Most of these architectures were proprietary: Digital’s DECnet, IBM’s SNA (Systems Network Architecture) and Novell’s NetWare are the prominent ones. Researchers then noticed the need for a common communication standard, so that users would not depend on a single vendor for updated solutions and new features. This initiated the standardization of the overall communication architecture and its associated protocols. After a long debate and design process the industry came up with a standard popularly known as the OSI reference model in the year 1983. Later another standard, the TCP/IP architecture, came into existence. It was the TCP/IP architecture which gained wide acceptance in the industry and is being used even today. This era, which emphasized the notion of connectivity and networking of computers, is considered the first generation, or internet 1.0.

The next generation of communications was centered on the actual creation of devices, such as routers and switches, which work with these standardized protocols and made actual connectivity in the World Wide Web possible. It was in 1993, when the Mosaic web browser was released, that people who were connected to networks could do much more than they were doing before. This era in communication, which emphasized device and software development, is generally known as internet 2.0.

In a sense it is internet 2.0 that represents the real meaning of the internet. Because of the wide acceptance of the TCP/IP architecture and packet-based routing concepts, the industry spent a considerable amount of time fixing problems with the TCP/IP architecture and porting these protocols to new communication media such as wireless. Even today we live in the era of TCP/IP and are still fixing problems in it! Numerous issues such as security, routing, scalability and shortage of address space were important subjects of research. A number of security components like firewalls, Intrusion Detection Systems and Intrusion Prevention Systems were developed to fix the security issues. Open Shortest Path First (OSPF) and the Border Gateway Protocol were designed to address the scalability issues of the internet. Link state and distance vector routing were developed to handle routing issues, and concepts like Network Address Translation (NAT), IPv6 and private addressing were devised to mitigate the global shortfall of IP addresses. In addition, quality of service (QoS) of the internet was also of prime importance. The issue with the current Internet architecture is that none of the above problems is fixed permanently, as the changing demands of users call for additional features in the basic solutions, and these let additional flaws creep into the systems.

As the notion and idea of the current internet is quite an old concept (although the designers of ARPANET never thought of it!), over the years we have learnt a lot about networking and its associated risks. Now the industry understands what TCP/IP and packet-based switching technology can and cannot give. Only very recently has thinking begun on a new design of the Internet architecture which can utilize the best that packet switching technology can give and add to it the things which are essential in the new era. According to Prof. Raj Jain of Washington University (one of the pioneers in the thought process of Internet 3.0), it will be fundamentally different from its predecessors and will be designed as if the entire internet were being designed from zero. It meets the demands of commerce and allows governments and firms to enforce policy decisions and to control and track who is coming into their network and doing what. It permits users to control when, where and how they actually want what.

In a project called Global Environment for Network Innovations (GENI), two questions were raised: “Is this the way we would design the internet if we were starting from scratch?” and “What are the requirements of today’s business?” These questions are perhaps valid, as the first generation of the internet was designed by researchers for their research interests, and it has now grown to an extent where it is part of everyday life. The requirements of the modern-day Internet can be enumerated as below:

1. Energy-efficient communication: Current communication methods require both sender and receiver to be awake for communication to happen. In mobile communications, base stations have a limited capacity to store messages while the receiver is offline. This capability needs to be extended to other communication methods too.

2. Identity management: The point of connectivity and one’s identity become vital in communications. Currently, if a system moves to a different location its IP address changes; hence there is a need to change the identification mechanism and get rid of dynamic IP addresses.

3. Location awareness: Knowing where the receiver or sender is situated becomes vital from the security point of view. Either party can decide where they want to go and what they want to exchange if the location they are in is known. Hence there is a need for location awareness at the communicating end points.

4. Support for explicit communication: With the distributed-service and client-server nature of communication, implicit communication is an unnecessary mess. Instead, clients need to be allowed to identify the nearest server and establish a communication channel with it.

5. Person-to-person communication: The network was designed for desktop-to-desktop communication, but today’s requirement is person-to-person communication. A person may be using any device (a desktop, a cell phone, a palmtop etc.) and should be reachable on it. The network needs to identify the best way of reaching the person rather than the device. This can be achieved if addresses are given to human beings rather than to the devices they are using.

6. Security: This is the biggest concern of today’s Internet. The next generation internet has to be secure, allowing the end parties to enforce rules about what is permitted and who is permitted. Governments need to protect their citizens’ data and exchanges the way they protect the nation with defense forces, enforcing policy decisions and maintaining the integrity of communication.

7. Separation of control and data planes: Currently the Internet uses a single channel for both control and data planes, which is a significant security threat. For example, the TCP/IP connection setup and piggybacking mechanisms give away enough information to do malicious things to the ongoing communication. Hence connection management and data transmission need to be separated, in a similar way to cell phone networks.

8. Asymmetric protocols: The protocols used in the current Internet are designed for systems with identical capabilities. But with person-to-person communication enabled, one system may be significantly resource constrained compared to the other. Allowing the network to adjust the communication when the devices are asymmetric is necessary.

9. QoS guarantees: In the present Internet, IP is a best-effort, unreliable protocol, so ensuring QoS for end-user flows becomes difficult. The next generation Internet should address this fact and should be designed for achieving the desired quality of service.

None of the above requirements is easy to meet, and any solution that pops up needs to be debated and evaluated. Industry, academia and research groups have to collaborate on the design in much the same way as when TCP/IP was standardized years back.


Cover Story

Siddharth Prakash Singh, a 4th year B.Tech student in the Department of Computer Science and Engineering, tells us all about building modern scalable web applications using memcached.

Memcached has become a buzzword these days for deploying scalable web applications. Despite this, a surprisingly large number of people do not know what exactly memcached is. Let’s learn it by dissecting the word memcached into cache and mem. A ‘cache’ is a component that stores data transparently to improve performance by serving frequently requested data faster. Readers will be aware of caches in processors (L1 and L2 cache), browser caches etc. Let’s take the browser cache as an example.

Whenever you visit a new website for the first time, the browser downloads the HTML page, JavaScript files, CSS files, media files etc. and saves them in the cache (on your PC). So the next time you open the same website, it is loaded and displayed much more quickly, as the browser can serve the locally saved resources. Caches are almost always constrained by size. In our example, there is no limit to the number of distinct websites we can visit, so the browser cannot save data for each of them. When the cache is full, a cache replacement algorithm is used to replace some files with new ones. The algorithm chooses what to replace so that the information left in the cache is the information most likely to be needed again.
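One common replacement policy is least recently used (LRU): when the cache is full, the entry that has gone unused the longest is evicted first. The following is only a minimal sketch of that idea, assuming PHP 8; the class LruCache and its methods are hypothetical names made up for illustration and are not the code of any browser or of memcached.

```php
<?php
// Hypothetical illustration of a least-recently-used (LRU) cache.
final class LruCache
{
    /** @var array<string, mixed> Entries ordered from least to most recently used. */
    private array $items = [];

    // $capacity is the maximum number of entries; assumed to be at least 1.
    public function __construct(private int $capacity)
    {
    }

    public function has(string $key): bool
    {
        return array_key_exists($key, $this->items);
    }

    public function get(string $key): mixed
    {
        if (!array_key_exists($key, $this->items)) {
            return null; // cache miss
        }
        // Re-insert the entry so it moves to the "most recently used" end.
        $value = $this->items[$key];
        unset($this->items[$key]);
        $this->items[$key] = $value;
        return $value;
    }

    public function set(string $key, mixed $value): void
    {
        // Remove any old copy first so an overwrite never triggers an eviction.
        unset($this->items[$key]);
        if (count($this->items) >= $this->capacity) {
            // Evict the least recently used entry, which is the first key in the array.
            unset($this->items[array_key_first($this->items)]);
        }
        $this->items[$key] = $value;
    }
}
```

With a capacity of two, storing a.html and b.css, reading a.html and then storing c.js evicts b.css, because a.html was touched more recently.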


Memcached


Now let us come to the next part of the word: “mem”. As you might have guessed, mem refers to memory. Hence, memcached is, precisely, using memory as a cache. But it is not just that. It is a distributed memory caching system. This means that if you have a memcached server, you don’t have to worry about which machine a data object is cached on. You just give the command “Get the object named foo” and memcached knows where to get foo from. The cache can span as many machines as you need.

Memcached is fast. It is a huge in-memory hash table which can be distributed among several machines. The time complexity of fetching a cached result is hence O(1), the order we love the most. It utilizes highly efficient, non-blocking networking libraries to ensure that memcached stays fast even under heavy load. Hence, in circumstances when your database might be failing under heavy load, memcached won’t be. In fact memcached was designed to alleviate database load, which is the bottleneck and the main scalability risk for the majority of high-load web apps.
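As a rough illustration of how a key such as foo can be routed to one of several cache machines in constant time, the sketch below hashes the key and takes the result modulo the number of servers. The function pickServer and the host names are hypothetical; real memcached client libraries use their own, often more elaborate, schemes (consistent hashing, for instance), so treat this only as the basic idea.

```php
<?php
/**
 * Map a cache key to one server from a fixed list (hypothetical helper).
 *
 * @param string   $key     cache key, e.g. "foo"
 * @param string[] $servers non-empty, zero-indexed list of "host:port" strings
 */
function pickServer(string $key, array $servers): string
{
    // crc32() gives a 32-bit checksum; the mask keeps it non-negative on every platform.
    $hash = crc32($key) & 0x7fffffff;
    // The same key always yields the same index, so reads and writes agree on the server.
    return $servers[$hash % count($servers)];
}

// Hypothetical server list, for illustration only.
$servers = ['cache1.example:11211', 'cache2.example:11211', 'cache3.example:11211'];
$serverForFoo = pickServer('foo', $servers);
```

A drawback of plain modulo hashing is that adding or removing a server remaps most keys, which is one reason client libraries often prefer consistent hashing.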

How to use memcached?

Memcached by itself is a distributed memory caching server/daemon. It can be used in a variety of ways. It can be coupled with a database, or it can be used directly as a caching layer between client and database. A couple of use cases are discussed below.

Memcached as a caching layer

Now that we have a basic understanding of what memcached is, let’s dive into its usage. From a layman’s perspective, here is the basic sequence of steps:

a. Every time you need to query the database for a read operation, first check whether the particular data is stored in the cache. If the data is found in memcached, use it instead of querying the database.

b. If the information is not found in memcached, query the database. Once you get the result of the query, send it to the client, and don’t forget to put it in the cache as well. Subsequent calls to fetch this information then don’t need to touch the database at all.

c. If there is an update operation on any data, and the data is found in the cache, delete it from the cache. This keeps the cache consistent.

For all those who couldn’t picture the execution of the above three steps, I have some PHP code.

Suppose the data produced by the query SELECT * FROM hugetable WHERE timestamp > lastweek ORDER BY timestamp ASC LIMIT 50000; is required every time somebody loads the homepage of a web app. For a highly loaded web app, this database query may make the app slow. Let’s put the data in memcached.

Isn’t it simple and easy?

Memcached as a rate limiter Back in January 2009, some high profi le celebrity Twi� er accounts were hacked. The hacker ran a dic� onary a� ack on a Twi� er engineering team

member, discovered his password and then used Twi� er’s privileged admin tool to reset the password of accounts they wanted to hack.

This incident led some developers (not Twi� er developers) to rethink about access rate limi� ng. Rate limi� ng could be implemented by incremen� ng a counter stored in a fi le or a database. Wri� ng to a disk is costly and is diffi cult to scale.

Memcached comes to rescue. There is a command incr which atomically increment an already exis� ng counter simply by specifying its key. add command can be used to create that counter and it fails without giving any

error if the specifi ed key already exists (so beware).

Now lets say we want to limit a user to 10 hits every minute. A naive implementa� on could be to create a counter based on user’s IP (this approach might not work for proxy environment, where large number of users are behind the same proxy server), something like - numbero� its_202.141.81.2_2010-04-10-12:37. Increment this counter for every hit from the IP address 202.141.81.2 and block the request if it exceeds 10. We can set the counter to automa� cally expire a� er one minute while crea� ng the counter. This naive solu� on will work without wri� ng to disk. This is just an

Cover Story

Computer Science and Engineering Association, IIT GuwahatiLinked List 13

// $memcache = new Memcache(); $huge_data_for_front_page = $memcache->get(“huge_data_for_front_page”);if($huge_data_for_front_page === false){ $huge_data_for_front_page = array(); $sql = “SELECT * FROM hugetable WHERE timestamp > lastweek ORDER BY timestamp ASC LIMIT 50000”; $res = mysql_query($sql, $mysql_connection); while($rec = mysql_fetch_assoc($res)){ $huge_data_for_frong_page[] = $rec; } // cache for 10 minutes $memcache->set(“huge_data_for_frong_page”, $huge_data_for_frong_page, 600);}// use $huge_data_for_front_page how you please

Page 14: CSEA Linked List Node4

example, in real world one needs to do more homework on this approach.
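The naive rate limiter described above can be sketched as follows. This is a minimal sketch, assuming an in-memory class (FakeMemcache, a hypothetical stand-in) that mimics only the add and incr behaviour described in the text; a real deployment would call the equivalent commands of an actual memcached client instead. The key format follows the article's example, and allowHit is an illustrative helper name.

```javascript
// Minimal in-memory stand-in for the two memcached commands used here.
class FakeMemcache {
  constructor(now = () => Date.now()) {
    this.now = now;          // injectable clock, in milliseconds
    this.items = new Map();  // key -> { value, expiresAt }
  }
  live(key) {
    const item = this.items.get(key);
    if (item && item.expiresAt <= this.now()) {
      this.items.delete(key); // expired entries behave as if absent
      return undefined;
    }
    return item;
  }
  // add: create the key only if it does not exist yet.
  add(key, value, ttlSeconds) {
    if (this.live(key)) return false;
    this.items.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
    return true;
  }
  // incr: increase an existing counter by one; false if the key is missing.
  incr(key) {
    const item = this.live(key);
    if (!item) return false;
    item.value += 1;
    return item.value;
  }
}

// Returns true while the caller is within the per-minute limit.
function allowHit(cache, ip, limit = 10) {
  const minute = new Date(cache.now()).toISOString().slice(0, 16).replace("T", "-");
  const key = `numberofhits_${ip}_${minute}`;
  cache.add(key, 0, 60);        // no-op if the counter already exists
  const hits = cache.incr(key);
  return hits !== false && hits <= limit;
}

const clock = { t: Date.UTC(2010, 3, 10, 12, 37, 0) };
const limiter = new FakeMemcache(() => clock.t);
const results = [];
for (let i = 0; i < 11; i++) results.push(allowHit(limiter, "202.141.81.2"));
console.log(results.filter(Boolean).length); // 10
```

Because the minute is part of the key, a new counter starts automatically each minute, and the expiry time only serves to clean up old counters.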

Choosing a key

You might have noticed that we retrieved the data from memcached using the key "huge_data_for_front_page" in the above example. The cache can be thought of as a big associative array in which each item is stored as a key-value pair, with the key being an arbitrary string. Therefore, to store and retrieve data in the cache you need to define a key. A key uniquely identifies data stored in the cache, and is used when storing, retrieving and removing data from the cache. Technically speaking, a key can be any string. But you should define a naming pattern for keys, both to avoid conflicts and to ease management. The key naming pattern is also important for security, as mentioned in the section on security concerns below.

Data distribution among memcached hosts

Data distribution among memcached hosts is an important concern. Let's say we have n memcached hosts and we want to distribute the load among all n of them. The most common way to do this is to use the mod operator. For simplicity, let us assume for the moment that we are saving data in memcached using numerical keys. For data d1 the key is an integer k1. We save the data for key k1 on the server with id k1 mod n. This is a nice way to distribute the load among all n servers. But a problem arises when you want to add or remove a memcached host. Say the number of users of your app has dramatically increased and you need to add more caching machines to increase the number of cache hits. Or, on the contrary, say one of the caching machines has crashed and you temporarily need to remove it from the list of memcached hosts. In both cases n changes. What's the big deal when n changes? When n changes to n', almost every key now maps to a different server id, k mod n'. This can be devastating: it is as if the whole cache had suddenly disappeared. Almost 100% cache misses, just from adding or removing a caching machine! How does memcached work then? The solution is consistent hashing!
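To get a feel for how bad the mod-n scheme is when n changes, here is a small sketch that counts how many numerical keys land on a different server when a fifth host is added to four. The functions serverFor and countMoved are hypothetical helpers written for this illustration.

```javascript
// Server id for a numerical key under the simple "k mod n" scheme.
function serverFor(key, n) {
  return key % n;
}

// Count the keys in 0..numKeys-1 whose server changes when n goes from oldN to newN.
function countMoved(numKeys, oldN, newN) {
  let moved = 0;
  for (let k = 0; k < numKeys; k++) {
    if (serverFor(k, oldN) !== serverFor(k, newN)) moved += 1;
  }
  return moved;
}

console.log(countMoved(1000, 4, 5)); // 800
```

Going from 4 to 5 hosts already moves 80% of the keys in this sketch, and the fraction grows as the number of hosts grows.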

The basic idea behind the consistent hashing algorithm is to hash both the key and the server id using the same hash function. The following example illustrates this:


The hash function maps keys and server ids to a number range. Imagine mapping this range onto a circle so that the values wrap around. Here's a picture of the circle with a number of keys (1, 2, 3, 4) and server ids (A, B, C) marked at the points they hash to (based on a diagram from Web caching with consistent hashing by David Karger et al):

To find which server a cache object with key k goes in, we move clockwise round the circle until we find a caching server. So in the diagram above, we see that objects 1 and 4 belong in cache A, object 2 belongs in cache B and object 3 belongs in cache C. Consider what happens if cache C is removed: object 3 now belongs in cache A, and all the other object mappings are unchanged.

If another cache D is then added at the position marked, it will take objects 3 and 4, leaving only object 1 belonging to A. This scheme works pretty well: whenever a caching server is added or removed, very few cache misses occur.
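The clockwise lookup described above can be sketched in JavaScript as follows. This is a simplified illustration, not memcached's or any client library's actual implementation: the hash function (FNV-1a), the HashRing class and the server names are all assumptions made for the example, and each server gets only one point on the circle.

```javascript
// FNV-1a, a simple 32-bit string hash (chosen here only for illustration).
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

class HashRing {
  constructor(servers = []) {
    this.points = []; // sorted list of { hash, server } positions on the circle
    servers.forEach((server) => this.add(server));
  }
  add(server) {
    this.points.push({ hash: fnv1a(server), server });
    this.points.sort((a, b) => a.hash - b.hash);
  }
  remove(server) {
    this.points = this.points.filter((point) => point.server !== server);
  }
  // Walk clockwise from the key's position to the first server; wrap around at the end.
  // Assumes the ring contains at least one server.
  lookup(key) {
    const position = fnv1a(key);
    const hit = this.points.find((point) => point.hash >= position);
    return (hit || this.points[0]).server;
  }
}

const ring = new HashRing(["cache-A", "cache-B", "cache-C"]);
const keys = Array.from({ length: 1000 }, (_, i) => `key${i}`);
const before = keys.map((key) => ring.lookup(key));
ring.remove("cache-C");
const after = keys.map((key) => ring.lookup(key));

const moved = keys.filter((_, i) => before[i] !== after[i]).length;
const wereOnC = before.filter((server) => server === "cache-C").length;
console.log(moved === wereOnC); // true: only keys that lived on cache-C were remapped
```

With a single point per server the load can be quite uneven; practical implementations usually place several points per server on the circle to smooth this out.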

Security concerns

Memcached access is not authenticated by username/password, so there is no native access control for memcached. A few simple things can be done to secure your instance of memcached:

• Prevent external access: Deploy memcached behind a firewall and allow only machines from within a specific network to access the cache.
• Choose obscure keys: There is no way a user can query memcached for the list of keys. Hence, only somebody who knows a key can access its data, so obvious keys are vulnerable. Add obscurity to the key by including some number in it, like "foobar:12321", or, better, by applying a hash function to the key.

Some other issues

• By default, memcached can store values of up to 1 MB only.
• By default, a memcached key can be up to 250 characters long.
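The "hash function on the key" suggestion above could look like the sketch below, which uses Node.js's built-in crypto module. The cacheKey helper and the namespace prefix are assumptions for illustration. Note that a plain hash only obscures a key from someone who does not know the scheme, so it complements the firewall advice rather than replacing it.

```javascript
const crypto = require("crypto");

// Build a cache key from a namespace and an arbitrary identifier.
// The SHA-256 digest has a fixed length, so long identifiers stay within the key size limit.
function cacheKey(namespace, id) {
  const digest = crypto.createHash("sha256").update(String(id)).digest("hex");
  return `${namespace}:${digest}`;
}

console.log(cacheKey("frontpage", "huge_data_for_front_page").length); // 74
```

Keeping a readable namespace prefix preserves the key naming pattern recommended earlier, while the hashed part hides the obvious name.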

Client Libraries

The client/server interface to memcached is simple and lightweight. Client libraries are now available in almost all major programming languages. For a list of available libraries take a look at this page: http://code.google.com/p/memcached/wiki/Clients

One can measure the popularity of memcached by taking a look at its users: Facebook, Wikipedia, Flickr, Twitter, YouTube, Digg, WordPress, LiveJournal, FarmVille, Amazon.com, and the list goes on. This article has looked at only some of the use cases of memcached. For more information on memcached check out: http://www.memcached.org

[Written by]

Siddharth Prakash Singh (SPS), 4th Year, B.Tech., Dept. of Computer Science & Engineering. www.spsneo.com/blog


Call for Articles

Since its inception, the CSEA has conducted many workshops, lectures and programming contests to increase awareness regarding new technologies and the science behind them.

Linked List is yet another attempt to get closer to fellow IITians.

Linked List requires articles that maintain the originality and the quality of the magazine.

Send in your articles to [email protected] or contact the editorial team for any clarifications.

Looking forward to an overwhelming response.


Good Times

Internship: Do's & Don'ts

Anurag Kumar Nilesh and Ashish Thakur, 4th Year B. Tech. CSE

"Internship": where did that word come from, and what does it really mean? Sometimes dissecting a word helps, but this one didn't. Try splitting the word into meaningful parts that could explain its meaning. One possible combination is Intern + Ship, and another is Interns + Hip. Since the second one is creepy enough to get much of this article censored, we had better stick to the first and try to decipher something meaningful from its components, Intern and Ship. An intern is someone who works in a temporary position with an emphasis on on-the-job training rather than merely employment. And a ship, well, everyone knows what a ship is. If we join these two words and try to get something meaningful out of them, we end up confusing ourselves. (The second combination, it must be said, does lead to something meaningful.) Whatever the meaning might be, according to us (your seniors) it is a time when you learn a lot of things, like working on a research topic and teamwork, and yes, it is a time when you have a hell of a lot of fun. These few months are among the most memorable times of your life. To make them memorable, you have got to be aware of some do's and don'ts of internship. So here we go…

What to DO : 8 Tips for Internship (By Anurag Nilesh)

I am sure that by now you have started counting down the days left before your internship begins. Some of you may have already started making travel plans, such as visiting the Eiffel Tower, Disneyland, etc. Some of you may be looking forward to it as an opportunity to get your thrills from adventure sports such as mountain biking, paragliding, bungee jumping, skydiving and so on. I hope that your plans work out and that you all have a memorable time during the internship period. However, this article is not here to suggest tourist spots or adventure sports but to offer a few practical tips.

Some of you may be wondering about the title: why 8 and not 10? Has the author run out of tips to make it a perfect ten? Well, I can give you three reasons for this peculiar title. One, choosing this peculiar title gives me a chance to lengthen this article by trying to explain the title, without brainstorming for two more tips to make it a perfect ten. Two, since I am the author of this article, I hope you will agree with me that I am entitled to create any title.

1. Learn cooking. If you know how to cook, it will stand you in good stead. It will help you keep a check on your food budget and hence leave more money in the travel budget. If you don't know cooking very well, try to learn how to make rice, omelette, aloo-ki-sabji etc. at home, or learn it there itself from internet sites or from your friends. You don't need to be an expert in cooking. Believe me, you will love even your own half-cooked meals.

2. Plan your trips. You should try to plan your trips so that you make maximum use of weekends. Look out for holidays which extend your weekend, and try to use such weekends for long-distance trips, such as trips to neighbouring countries/states.

3. Money matters. I suggest that you carry some extra cash beyond your planned budget while making trips. Carrying an international credit/debit card may come in handy.

4. Experience foreign culture. You will get a chance to interact with people of different origins and cultures, and a chance to visit historical and famous places. Don't miss this opportunity to learn about their work culture and social culture.

5. Maintain a travel journal. I suggest that you maintain either a travel blog or a personal travel journal. Believe me, documenting your travel experiences will not be a waste of time.

6. Remember the purpose of your visit. Don't forget the purpose of your internship. Often, people consider travelling/adventure to be the purpose of their internship, but that's not true. The purpose of your internship is to work on some project. It's a chance to showcase your talent and prove that you truly deserved this internship offer.

7. For those in India. If you are doing your internship in India, don't get disappointed. I suggest that you start writing a blog and document things like how delicious the food was last evening or what a great game of foosball you had last weekend. Such descriptions will make your counterparts, surviving on their half-cooked meals, envious.

8. Play it safe. Finally, a word of advice to my virile tigers: play it safe.

What NOT to do during your Internship! (By Ashish Thakur)

Read 8 Tips for Internship by Anurag and put a (!) operator in front, and if you think you are done, well, I guess you are wrong. So in case you want to know what NOT to do during your internship, read on...

• Never lose the keys of your apartment: If you lose your key in IITG, all you have to do is break the lock and get a new one, and it will cost around 50 INR. But if you lose your key in Europe and your landlord has only one key to that lock, fellas, you are so screwed. It costs around 40-50 euros to break the lock, and that is a hell of a lot of money! So be careful with the keys of your apartment.

• Never travel without tickets: Three guys from the CSE Dept had to pay around 40 euros each to the cops because they were travelling on a bus without a ticket. So whenever you use public transport, make sure to buy a ticket; if you don't, you will end up paying a hefty fine.

• Don't be an Indian: Don't take me wrong here. We Indians litter, spit, talk and laugh very loudly, ignore other necessary etiquette, and ogle whenever we see some hot chick. But remember the saying "When in Rome, do as the Romans do". I am not asking you to be on your best behaviour, but at least try to follow some basic etiquette during your internship.

• Don't get fooled by conmen: You see some hot chick trying to make eye contact with you and smiling at you. Remember the case of Saif Ali Khan in the movie Dil Chahta Hai… you might get trapped by that beautiful smile and end up losing all of your stuff.


Geek Corner

AJAX: Pitfalls & Solutions

Shirish Surti, 2nd Year, M.Tech. CSE

Ajax (Asynchronous JavaScript And XML) is a group of web development techniques used to create interactive web applications. Using Ajax, clients can retrieve data asynchronously, which allows parts of a web page to be refreshed and updated. Ever since Google made successful use of Ajax in Gmail and Google Maps, Ajax has received considerable recognition from web developers. This article explains a few pitfalls which you could face while developing your Ajax-based website and gives solutions to a few of the problems.

A very tempting use of Ajax would be to design a website with navigation menus which use Ajax to load tab contents into an HTML div, as shown in Figure 1. When the user clicks on a link in the navigation menu, the page content for the content div is fetched from the server asynchronously. Often a familiar animated GIF, like a rotating pair of arrows or dots or some other fancy image, is used to indicate to the user that the div contents are being loaded.

Let us study what obstacles such a design could create for a website.

1. Bookmarking

A user visiting your website is often interested in only a particular section, the URL of which he/she bookmarks to revisit in future. Using Ajax as shown in Figure 1 causes the browser URL to remain the same. So if a user clicks on "link1" the browser URL points to http://your-website-link.com, and when the user clicks on "link2" the browser still points to http://your-website-link.com.

2. Javascript Ads

A good source of revenue from the content on your website is content-based contextual ads. Google AdSense dominates the ad market. However, to your surprise, you discover that Google ads don't show up when you add them to the contents of the content div shown above. To understand why the ads don't show up, we need to understand how Google ads, or any other JavaScript ads, work on a webpage.

To put ads on your webpage you sign up with Google AdSense. Google provides you with a JavaScript code snippet which runs when your page loads, as shown in Figure 2. When the tab content is fetched from the server as part of the content div in Figure 1, Ajax sets

contentDiv.innerHTML = fetchedHtml

This statement does change the content of the content div, but the JavaScript snippets in the fetched HTML are not evaluated by the browser. As a result, step 1 in Figure 2 fails. The AdSense server uses keywords gathered from your page to generate ads in step 3 of Figure 2. The Google AdSense bot visits different pages on your website to index them and gather keywords. If Ajax is used, the AdSense bot can gather keywords only from the index.html page at http://your-website-link.com. Thus, even if you do manage to evaluate the JavaScript in the content div's HTML, the ads displayed in the content div will be irrelevant to its context.

3. Search Engine Indexing

An important metric for judging the popularity of a website is its Google page rank. The Google bot visits your web pages and ranks your website based on their content. If your website is developed using Ajax as shown in Figure 1, every time the Google bot visits your website only the page contents of index.html are returned. This prevents the other pages of your website from being indexed, leaving your website with a poor page rank.

4. Non-Javascript support

If a user visits your website with a text browser like lynx, which does not support JavaScript, it is not possible for him/her to visit any page apart from index.html.

In order to avoid problem 1, it is necessary to ensure that your navigation menu is not completely ajaxified. A few parts of the content could be loaded using Ajax, but the entire navigation menu should not depend on Ajax. A few workarounds have been proposed for problem 2, like the one at http://www.jguru.com/forums/view.jsp?EID=1305379, but it violates the Google AdSense program policies. Right now there is no support from Google for websites which use Ajax for navigation like the one in Figure 1.

The good news is that problems 3 and 4 do have a solution. The solution is to provide both href links (for non-JavaScript users) and an onClick handler that makes the Ajax call. Consider the snippet below:

<a href="link1.html" onClick="ajaxCall('link1'); return false;">link1</a>

Now when a user whose browser does not support JavaScript visits the page and clicks on link1, the page at link1.html opens. If JavaScript is supported by the browser, the onClick handler executes, fetching the data using Ajax. The hrefs also direct the Google search engine bots to the linked web pages, causing them to be indexed as well. A very good approach which solves problems 3 and 4 is Hijax, which treats Ajax as an enhancement.

Hijax

The Hijax approach is a very simple idea:

1. First, build an old-fashioned website that uses hyperlinks and forms to pass information to the server. The server returns whole new pages with each request.
2. Now, use JavaScript to intercept those links and form submissions and pass the information via XMLHttpRequest instead. You can then select which parts of the page need to be updated instead of updating the whole page.

Hijax Example,

window.onload = doPopups;

function doPopups() {
  if (document.getElementsByTagName) {
    var links = document.getElementsByTagName("a");
    for (var i = 0; i < links.length; i++) {
      if (links[i].className.match("help")) {
        links[i].onclick = function() {
          window.open(this.getAttribute("href")); // this can be replaced by an Ajax call
          return false;
        };
      }
    }
  }
}

<a href="help.html" class="help">contextual help</a>

For more information on Hijax please visit http://domscripting.com/presentations/xtech2006/
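As a variation on the example above, the interception step can be separated from the way content is loaded. The sketch below is only an illustration: hijackLinks, the loadFragment callback and the "hijax" marker class are hypothetical names, and the loader is passed in as a function so that the same code works whether it uses XMLHttpRequest or anything else.

```javascript
// Attach Ajax behaviour to ordinary links, leaving their href intact as a fallback.
function hijackLinks(links, loadFragment) {
  let hijacked = 0;
  for (const link of Array.from(links)) {
    if (!/\bhijax\b/.test(link.className)) continue; // "hijax" is a hypothetical marker class
    link.onclick = function () {
      loadFragment(this.getAttribute("href")); // e.g. fetch the page via XMLHttpRequest
      return false;                            // cancel the normal full-page navigation
    };
    hijacked += 1;
  }
  return hijacked; // number of links that received an Ajax handler
}
```

In a browser this might be called as hijackLinks(document.getElementsByTagName("a"), loadIntoContentDiv), where loadIntoContentDiv is your own function that requests the URL and updates the content div. Links without the marker class, and all links when JavaScript is unavailable, keep working as plain hyperlinks.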


Digital Mind

CSP for the Verification of Security Protocols

Niteesh Kumar, 2nd Year, B.Tech. CSE

CSP is an abstract language designed specifically for describing the communication patterns of concurrent system components that interact through message passing. It is underpinned by a theory which supports the analysis of systems described in CSP. It is therefore well suited to the description and analysis of network protocols. Protocols can be described within CSP, as can the relevant aspects of the network. Their interactions can be investigated, and certain aspects of their behaviour can be varied, through use of the theory.

Formalisms based on Hoare's Communicating Sequential Processes (CSP) and Milner's Calculus of Communicating Systems (CCS) for verifying protocols are currently being used by the International Standards Organisation (ISO). However, these models need to be extended if protocol performance specification and verification is to be done, as neither of them has timing information (other than sequencing) or a way of specifying controlled loss of information. A CSP description of a protocol has precisely defined semantics, so whether the protocol meets a given property or not is a precise mathematical question.

One of the strengths of CSP is the ease with which specialised theories can be constructed on top of the semantic model. This allows particular specification statements to be defined in terms of the standard semantics, and new proof rules appropriate to these specifications to be provided. This approach is taken where we specify and reason about authentication properties and also about an agent's inability to generate particular messages. Although the standard proof rules would support the verification, since they are sound and complete, it is preferable to develop a specialised theory, since it provides an appropriate level of abstraction for supporting the kind of reasoning we require.

The authentication property we consider states that if some events R in the system are restricted, then other events T should not occur. We establish this by defining a suitable rank function on messages which shows that only messages above a particular rank can circulate in the restricted system, and hence that messages from T are not possible.

A network provides a means for users, such as people or application programs, to communicate by sending and receiving messages. This situation may be modelled at a high level of abstraction in CSP as a process NET which provides each user with two ways of interacting with it: sending messages to other parties and receiving messages from other parties. There are two views from which security properties can be considered. One is the viewpoint of the users of the network, who do not know which other parties are to be trusted. Properties expressed from this viewpoint will generally include assumptions, implicitly or explicitly, that a user's communication partner will not act contrary to the aims of the protocol, for example that any shared secrets should not be disclosed to third parties. The other is a high-level, God's-eye view which identifies those nodes which follow their protocols faithfully and also identifies those which are engaging in more general activity, perhaps in attempting to attack a protocol. If this view is taken, then care should be taken to ensure that this privileged information is not accidentally used in the protocol description. The responses of a node should not be dependent on information which is available only at the high-level view. In some circumstances a node may not have knowledge concerning its communication partner; in other cases, a protocol may be invoked only when communicating with particular known and trusted users. How this knowledge and trust is obtained is outside the scope of this article.

Security protocols are designed to provide properties such as authentication, key exchange, key distribution, non-repudiation, proof of origin, integrity, confidentiality and anonymity for users who wish to exchange messages over a medium over which they have little control. These properties are often difficult to characterize formally, or even informally. The protocols themselves often contain a great deal of combinatorial complexity, making their verification extremely difficult and prone to error.

Process algebra can provide a single framework both for modelling protocols and for capturing security properties, facilitating verification and debugging. Security properties such as confidentiality and authenticity may be considered in terms of the flow of messages within a network, so the use of a process algebra such as Communicating Sequential Processes (CSP) seems appropriate to describe and analyze them. Security properties may be described as CSP specifications, security mechanisms may be captured in CSP, and particular protocols designed to provide these properties may be analyzed within the CSP framework. It has been argued that security properties should be considered as properties concerning the flow of messages within a network; the approach applies to the extent that this characterization is justified.

For analysis purposes, we will consider the system from the God's-eye view. Confidentiality will be captured as a specification requiring that any message output to a user must actually have been sent to that user. We restrict attention to the message set M, consisting of those messages which are intended to remain confidential.

We also assume that these messages cannot be generated by the user, which would be true, for example, of signed messages, though this is a simplifying assumption that is not justified in all circumstances. Other messages, such as encrypted messages or control messages, will in general be available to eavesdroppers, but confidentiality is not concerned with protecting these messages.

Security properties are generally properties requiring that something bad should not occur, though they are not exclusively of this form. These tend to be considered safety properties. But there is a distinction to be drawn between the security requirements implemented by such a protocol and its liveness requirements, which are important for communication but are generally independent of security. It is possible that there are some security properties which can be expressed only as liveness properties; nevertheless, the traces model for CSP will be adequate for our present needs, which are to analyze properties of the safety form.
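As an illustration of such a "nothing bad happens" property stated over traces, the authentication property described earlier (if the events R are restricted, no event from T can occur) could be written roughly as below. The notation is an assumption made for this sketch rather than taken from the article: tr ranges over the traces of the process, tr restricted to T keeps only the events of T, the parallel composition with STOP on R blocks every event in R, rho is the rank function and c is the chosen rank threshold.

```latex
% Restricting the events R in NET means no event from T ever appears in a trace.
\[
  \bigl( NET \mathbin{\parallel_{R}} STOP \bigr) \;\; \mathbf{sat} \;\; tr \upharpoonright T = \langle\rangle
\]

% Rank-function argument: everything that can circulate in the restricted
% system has rank above c, while every message in T has rank at most c.
\[
  \rho(m) > c \ \text{ for every message } m \text{ circulating in } NET \mathbin{\parallel_{R}} STOP,
  \qquad
  \rho(t) \le c \ \text{ for every } t \in T
\]
```

Since the first line constrains every finite trace, it is a safety property in the sense used above, which is why the traces model suffices for it.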


Nostalgia

Abhishek Gupta (http://www.linkedin.com/in/abhishek85gupta) is a CSE alumnus from the batch of 2004-2008. He has recently completed his MS in CS at Stanford with a specialization in Artificial Intelligence (AI). He worked as an intern with Apple Inc in 2009 and is currently working as a Software Engineer with the Search Team at LinkedIn. He shares with us some really bright insights and suggestions for the academic system at IIT Guwahati.

Nitin Dua and Siddharth Prakash Singh interview him for CSEA.

Tell us about your life as a student at IITG? Any specific moments which you will cherish for life?

I am a typical CS student. I used to love the Physics and Math classes in the first year. In the second year we were given a computer in a lab with AC! From that point onwards, I spent almost all my time in the CS lab. I used to really enjoy the programming assignments. The CS lab was a great place not only to work but also to chit-chat with my fellow classmates. I cherish the late-night coding sessions for the assignments, the 4-Bit CPU all-nighters and the beautiful PINTOS. I also used to enjoy my discussions about life and IITG with Singh, Nangia, Aggarwala and Aditya Raj. IITG was a great learning experience for me, not just professionally but personally as well. I cannot thank IITG enough for all that it has given me.

How would you compare student life in India to that in the US? What additions and changes would you like to see in the IITG education system?

This is a great question. I have been thinking about this for a long time now. I feel that student life is pretty much the same. But the amount of learning that you do per unit of time spent is much higher in the US. One of the reasons is that every course devotes a lot of resources to Teaching Assistants' (TAs') office hours, starter code for assignments, and discussion sections led by the class TAs. This greatly smooths the initial learning curve and gives students confidence and encouragement, since there is a team of professionals dedicated to helping them understand the subject better. Finally, having starter code helps you learn the core ideas of the course without having to deal with other, orthogonal issues. All this results in a lower barrier to entry for exploring something new. To put things in perspective, almost all the courses are like the Operating Systems (PINTOS) course at IITG, but with much better TA support!

Another important reason is that over here one has more options in terms of courses. As a result, one ends up doing something that one really likes, and hence has much higher motivation to learn. Furthermore, learning more ideas per unit of time spent encourages students to explore more. This in turn increases their breadth of knowledge and helps them make a more objective decision as to what it is that they do or do not like. Even at places like the IITs, a large fraction of students say that they don't like their courses and that they would have been better off doing something else. There is nothing wrong with this statement per se. But the truth is that a vast majority of these students give up at the very start because of the rough initial learning curve. Had the initial curve been smoother, the students would not have given up and would actually have learned the subject. Having learnt the subject, they might have been in a better position to make an objective and informed assessment of their interests.

The following changes might be helpful:

a. Problem: It's hard for students to figure out the interesting things in CS, especially the interesting things happening in IITG itself. Solution: Every semester there should be a seminar course (OPTIONAL TO ATTEND) where every Professor gives a 30-minute talk on what research he has been up to, what the interesting areas related to his field of study are and why they are exciting, and which courses might be relevant for students who want to work with him. This would help 1st, 2nd and 3rd year students alike in gaining a holistic understanding of the opportunities available at IITG, and would help them plan their degree.

b. Problem: Most of the coding assignments are graded by demoing them to the TA, and there is no plagiarism detection. This unfortunately acts as an incentive for students to copy code from others. It hampers the student's understanding significantly, because in CS a large part of the learning happens when you actually sit down and code. Solution: Coding assignments should be checked by examining the code and not by manually demoing them to TAs. Code plagiarism should be detected using MOSS. Students should be told beforehand of the potential consequences of copying.

c. Problem: IITG lacks role models. This results in lower motivation and lower self-confidence amongst the students. Most students are clueless about what they want to do with their lives. A sneak peek at what their seniors/alumni are up to might be helpful. Solution: Increase the visibility of what alumni/seniors are up to within the current batch of students by linking alumni's web pages from the IITG CSE students' homepage. From time to time, interview a few alumni and 3rd/4th year students about what they have been up to in the past 6 months or so and post it on a Google Group.

d. Problem: It is hard for people to gauge the end-goal of courses. By the time people realise it, it's already the end of the semester. Having more information about why the material covered in the course is useful, what is interesting in the course, what the pain points are, etc. would be really helpful. In summary, there is no easy way to transfer the wisdom of seniors to juniors. Solution: It would be nice to have a webpage for every course where students who took the course last year can post their views on it.

e. Problem: Bad learning curve for students and non-standardized assignments. Solution: Collaborate with universities like Stanford and MIT. Use their assignments and lecture notes, if possible. This would help provide students with a more state-of-the-art learning experience, e.g. PINTOS!

Can you briefly discuss your present work and your future plans?

I started my MS in CS at Stanford University immediately after I graduated from IITG. I recently graduated from Stanford with a specialization in Artificial Intelligence (AI). I am currently working as a Software Engineer with the Search Team at LinkedIn. My goal for this year is to build a Recommendation Engine for LinkedIn to recommend jobs, people and news. Eventually I intend to start something of my own. I am still figuring out the remaining details!

What has been the motivation for your strong academic and research orientation?

I am in love with the idea that in CS, one person can fundamentally change how people live their lives! Furthermore, as I took more CS courses at IITG, my belief in this hypothesis and my liking for CS only grew stronger.

A few words of advice for your junior batches?

The goal of college is to help you figure out what you want from your life and what you like. I would like to quote Steve Jobs here: “Your work is going to fill a large part of your life, and the only way to be truly satisfied is to do what you believe is great work. And the only way to do great work is to love what you do. If you haven’t found it yet, keep looking. Don’t settle. Your time is limited, so don’t waste it living someone else’s life. Don’t let the noise of others’ opinions drown out your own inner voice.” To this effect, college is a great place to go out of your comfort zone and explore. The only way to find what you truly love is to keep an open mind and explore what seems interesting to you with passion and perseverance. Ultimately, this is the only way you can make an informed and objective decision about your life based on what you like, and not based on what others like. Don’t be afraid of failures. As a guiding principle, always think of the day when you will graduate. Do you really want regrets and ‘What if?’s after you graduate? Try to find like-minded colleagues and form discussion groups for things that you like. It is great to have smart people around you, people with whom you can debate various guiding principles of life and design choices in CS. Leverage their grey matter and try to learn together with them. Finally, remember that IITG is just the beginning.


Page 24: CSEA Linked List Node4

TidBits

Ever wondered about checking your own domain’s email through a GMail-like interface? Karthik R, 2nd Year M.Tech. CSE, guides us through a step-by-step tutorial for setting up mail for your domain using Google Mail. This system is already in use in various institutions like IT-BHU and VIT.

Google Mail is software offered as a service to support mail for any domain, so that one can access mail for their own site through the Google Mail interface. This is achieved by directing all mail destined for our domain to Google’s mail exchange server. To do this, one must set the domain’s mail exchange server to that of Google. All mail destined for our domain is then sent to Google, and one can use the Google Mail application to access it, just like accessing any mail from Gmail.com. In fact, Gmail is also implemented using the same model.

Follow the steps given below to set up Google Mail for your domain:

Step 1: Set up the Mail Exchange (MX) records to ASPMX.L.GOOGLE.COM
Step 1.1: Log in to the CPanel of your domain
Step 1.2: Open the MX Records section under ‘Mail’
Step 1.3: Select your domain and set the value of MX to ASPMX.L.GOOGLE.COM

Step 2: Create Google App Engine Account
Step 2.1: Visit http://code.google.com/intl/en/appengine/ and create an App Engine account

Step 3: Add Domain name to Google App Account
Step 3.1: Visit google.com/a/ and select Standard Edition
Step 3.2: Use the ‘Get Started’ section and add your domain name as Administrator
Step 3.3: Specify Account Details for your account
Step 3.4: Create an Administrator account for your domain

Step 4: Confirm ownership of domain
Step 4.1: Select the ‘Upload an HTML file’ mode to confirm ownership of the domain
Step 4.2: Create a file with the given name and contents and upload it to the root directory
Step 4.3: Now use the ‘Confirm ownership’ link and finish the procedure

Step 5: Accessing mail
Step 5.1: Now, use the link mail.google.com/a/domainname to check mail. After ownership is confirmed you can add Chat, Contacts, Calendar, Documents, Sites and Mobile to your domain by selecting them from the Google Apps Dashboard.
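The two values these steps revolve around, the MX target from Step 1 and the mail address from Step 5.1, can be written down in a short Python sketch. The helper names, the priority of 10 and the TTL of 3600 seconds are illustrative assumptions and do not come from the tutorial; only the MX host and the mail.google.com/a/ pattern are taken from the steps above.

```python
GOOGLE_MX_HOST = "ASPMX.L.GOOGLE.COM"  # MX target named in Step 1


def mx_record(domain: str, priority: int = 10, ttl: int = 3600) -> str:
    """Return a zone-file style MX line pointing `domain` at Google's server.

    `priority` and `ttl` are assumed defaults, not values from the tutorial.
    """
    name = domain.strip().rstrip(".").lower()
    if not name or " " in name:
        raise ValueError(f"not a usable domain name: {domain!r}")
    return f"{name}. {ttl} IN MX {priority} {GOOGLE_MX_HOST}."


def webmail_url(domain: str) -> str:
    """Return the Step 5.1 address for checking mail on `domain`."""
    name = domain.strip().rstrip(".").lower()
    return f"mail.google.com/a/{name}"
```

A control panel such as CPanel normally asks only for the destination host and a priority, so the full record line is mainly useful for checking what the panel ends up publishing.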


You@YourDomain with Google Mail

Karthik R, 2nd Year, M.Tech. CSE

Page 25: CSEA Linked List Node4

With the end of the BTP season approaching fast, let us have a look at a few of the BTP topics on which our beloved 4th Yearites have been working!

Nitin Kumar Gupta : PIVO - Improving Web Browser History Tools

Most modern web browsers provide history tools that allow people to select and revisit pages that they have viewed before. However, these tools tend to take ad-hoc approaches that do not appear to take advantage of past research. You will be surprised to know that in a recent study as many as 41% of the participants were unaware of the history list available in their web browser. How many times have you tried to search for something in your browser history and then settled for a Google search? Isn’t it sometimes frustrating when you are looking for something which you visited days before, and now you cannot find it in your browser history (although it is there, you are just unable to recognize it!)?

Our goal in this project is to develop a web browser history tool which will support recurrent behavior and query reformulation, and will provide visual cues to facilitate recognition. We apply association data mining to find the pages in the history related to a given webpage, and present the results in an easily browsable ‘hubs and spokes’ architecture providing annotated trails that users can follow to reach some desired page in the history. We also index web pages from the history and past search queries for major search engines (Google, Yahoo and Bing) to facilitate local searches and better results.
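As a rough illustration of the association mining step, the Python sketch below ranks pages that tend to appear in the same browsing session as a given page. It is a minimal co-occurrence version under stated assumptions, not the PIVO implementation: the session lists, the `related_pages` name and the 0.5 confidence threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def related_pages(sessions, page, min_confidence=0.5):
    """Rank pages that co-occur with `page` across browsing sessions.

    confidence(page -> other) =
        sessions containing both / sessions containing `page`
    """
    page_count = Counter()
    pair_count = Counter()
    for session in sessions:
        visited = set(session)            # ignore repeat visits in a session
        page_count.update(visited)
        for a, b in combinations(sorted(visited), 2):
            pair_count[(a, b)] += 1

    if page_count[page] == 0:
        return []

    results = []
    for (a, b), both in pair_count.items():
        if page not in (a, b):
            continue
        other = b if a == page else a
        confidence = both / page_count[page]
        if confidence >= min_confidence:
            results.append((other, confidence))
    # Highest confidence first; ties broken alphabetically for stable output.
    return sorted(results, key=lambda item: (-item[1], item[0]))
```

Each returned pair could serve as one 'spoke' leading out of the hub page, with the confidence deciding the order in which trails are shown.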

Siddharth Prakash Singh : Byzantine Fault Tolerant, Scalable Database System Architecture

Database management systems are now turning into sophisticated, complex software with millions of lines of code. These software systems are built to reliably implement the ACID (Atomicity, Consistency, Isolation, Durability) semantics while achieving high transactional throughput. With the increasing complexity of the software, bugs become inevitable, in spite of the best efforts of vendors and developers. Bugs in the software can lead to faults which may immediately crash the system. Database systems are designed to recover from these crash faults by using the write-ahead log. Crash faults can lead to downtime during recovery, which has previously been addressed by using replicated systems. However, bugs may also cause another class of faults - Byzantine faults. These are arbitrary faults which can lead to incorrect execution of a query, and hence to returning wrong results to the client or inserting wrong data into the database. In fact, even if a bug eventually leads to a crash, the system might have exhibited Byzantine behavior, producing erroneous results, before crashing. Byzantine faults are hard to detect and hence even harder to prevent. They can be tolerated by replication using the solution to the famous Byzantine Generals’ Agreement Problem, but this solution leads to a very low-performance system. Prevalent database systems are not capable of tolerating Byzantine faults. In this project, I have proposed a middleware-based, replicated, Byzantine fault tolerant architecture for database systems that does not lose much performance. Normally, for any application, the number of read operations is much larger than the number of write operations. Hence, to achieve read scalability, I am trying to integrate a memcached-based caching facility with the middleware.
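One common way for a middleware to mask faulty replicas is to send each query to several replicas and accept an answer only when enough of them agree. The Python sketch below shows that voting step together with a read cache. It is not the architecture proposed in the project: the replica interface (plain callables), the rule of f + 1 matching replies out of at least 2f + 1 replicas, the dictionary standing in for memcached, and the assumption that the middleware itself is trustworthy are all illustrative choices.

```python
from collections import Counter


class VotingMiddleware:
    """Forward queries to replicas; return the answer f + 1 of them agree on."""

    def __init__(self, replicas, max_faulty):
        if len(replicas) < 2 * max_faulty + 1:
            raise ValueError("need at least 2f + 1 replicas to outvote f faulty ones")
        self.replicas = replicas          # callables: query -> hashable result
        self.max_faulty = max_faulty
        self.read_cache = {}              # stand-in for a memcached client

    def _vote(self, query):
        answers = []
        for replica in self.replicas:
            try:
                answers.append(replica(query))
            except Exception:
                continue                  # a crashed replica casts no vote
        if answers:
            result, votes = Counter(answers).most_common(1)[0]
            if votes >= self.max_faulty + 1:
                return result
        raise RuntimeError("no answer reached f + 1 matching replies")

    def read(self, query):
        # Only voted-on answers enter the cache, so repeated reads skip the replicas.
        if query not in self.read_cache:
            self.read_cache[query] = self._vote(query)
        return self.read_cache[query]

    def write(self, query):
        self.read_cache.clear()           # crude invalidation: a write may affect any read
        return self._vote(query)
```

Clearing the whole cache on every write is deliberately crude; a real system would invalidate only the affected keys, which is where most of the design effort would go.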


Page 26: CSEA Linked List Node4

Nipun Sehrawat : Scalable Load Balancing for the Cloud

Cloud computing is a distributed computing paradigm, marked by dynamic provisioning of computing resources, such as processing power and storage, from a “cloud” of such resources. The advent of cloud computing is said to have started a transition in the IT industry from companies having their own data centers to using computing resources from a cloud, analogous to the shift from using private generators for electricity production to depending on power grids for electricity.

Scalability is one of the prominent features offered by cloud computing for services such as web hosting. Typically, a website is hosted on multiple virtual servers in the cloud, depending on the overall amount of traffic the website is experiencing. With such an architecture comes the problem of load balancing among the various servers that collectively host a single service. Most of the current solutions are hardware-based proprietary solutions, which offer limited scalability and fault tolerance. In this work, we implement a software-based distributed load balancing solution, which has better scalability and fault tolerance. This solution works in conjunction with Eucalyptus, an open source cloud computing implementation. In a cloud computing environment, where one has to pay according to the amount of computing resources used, automatic scaling up and scaling down of load balancers becomes an important issue. This is addressed by running load balancers in pre-configured virtual machines, which can be easily deployed, suspended and resumed. The work involved (1) modifying a kernel-based load balancer (KTCPVS), (2) modifying DNS software (Unbound) and (3) Java-RMI based distributed programming.
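To make the load balancing problem concrete, here is a minimal Python sketch of one widely used policy, least connections, together with a simple scale-up check. It does not reproduce the KTCPVS or Unbound modifications described above; the class name, the per-server capacity and the 0.8 threshold are assumptions chosen for illustration.

```python
class LeastConnectionsBalancer:
    """Send each new connection to the server with the fewest open ones."""

    def __init__(self, servers, max_per_server=100):
        if not servers:
            raise ValueError("need at least one server")
        self.active = {name: 0 for name in servers}   # open connections per server
        self.max_per_server = max_per_server          # assumed capacity of one server

    def acquire(self):
        """Pick the least loaded server (ties by name) and record a connection."""
        server = min(self.active, key=lambda name: (self.active[name], name))
        self.active[server] += 1
        return server

    def release(self, server):
        """Record that one connection to `server` has closed."""
        if self.active.get(server, 0) > 0:
            self.active[server] -= 1

    def add_server(self, name):
        """Bring a newly resumed or deployed virtual server into rotation."""
        self.active.setdefault(name, 0)

    def needs_scale_up(self, threshold=0.8):
        """True when the average load exceeds `threshold` of server capacity."""
        average = sum(self.active.values()) / len(self.active)
        return average > threshold * self.max_per_server
```

In a pay-per-use cloud, a check like `needs_scale_up` is what would trigger resuming another pre-configured virtual machine, and the mirror-image check would suspend one when load drops.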

Abhishek Anand : Machine Learning for Efficient Garbage Collection in Flash Filesystems

Unlike disk drives, in which the disks and the head move mechanically to select a particular block for I/O, flash drives have no moving parts. Flash memory consists of blocks of semiconductor devices which are selected electronically. This has many advantages: no seek time (the time required in disk drives for the disk and head to move to the desired location), the ability to endure extreme shock, high altitude, vibration and extremes of temperature, silent operation and often lower power consumption. The downside is that, unlike disks, a block has to be erased before it can be written again (modified). Moreover, to modify even a single bit, you have to erase a whole erase unit (typically of size 512KB). Therefore, when a file is modified in a flash filesystem, the corresponding part/page is written at some other location instead of erasing the previous location. The previous location now contains stale data. In the course of time, a large fraction of the device can contain stale data, and hence a process called Garbage Collection is required. It recovers those stale locations by erasing their erase units after moving the non-stale data in those units to other units. Naturally, we would like the unit we want to erase to contain only stale data, so that no copying of non-stale data is required. This calls for grouping files which are overwritten together into the same units.

In the past, many people have grouped files with similar overwrite frequencies together. However, the overwrite frequencies of newly created files are not available. In my BTP, I’m using machine learning to predict the overwrite frequency of a file when it is created, so that it can be grouped properly. Preliminary results have shown that various attributes of a file, like the path (folder) in which it is created, its owner and the application which created it, can predict its overwrite frequency with great accuracy.

Moreover, once the overwrite frequencies are available, we still have to answer questions such as how many groups to form and what the ranges of overwrite frequencies of those groups should be. To answer these, I formulated a mathematical model which approximates a flash filesystem and used that model to find the optimal grouping. The final step is to do simulations to show that my techniques indeed reduce garbage collection costs.
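The cost argument above can be checked with a small Python sketch. It models each erase unit as a list of pages and uses a greedy policy that reclaims the unit with the fewest non-stale pages. The greedy policy, the four-page units and the function names are assumptions made for illustration; they are not the mathematical model or the filesystem used in the project.

```python
def pick_victim(units):
    """Return the index of the erase unit with the fewest live pages.

    `units` is a list of erase units; each unit is a list of booleans where
    True marks a live (non-stale) page and False a stale one. Units with no
    stale pages are skipped because erasing them reclaims nothing.
    """
    candidates = [i for i, unit in enumerate(units) if not all(unit)]
    if not candidates:
        return None
    return min(candidates, key=lambda i: sum(units[i]))


def collection_cost(units):
    """Live pages that must be copied elsewhere before the victim is erased."""
    victim = pick_victim(units)
    return 0 if victim is None else sum(units[victim])


# The same four stale pages, laid out two ways.
mixed = [[True, False, True, False], [False, True, False, True]]
grouped = [[False, False, False, False], [True, True, True, True]]
```

With the mixed layout, reclaiming either unit means copying two live pages first, so `collection_cost(mixed)` is 2. With the grouped layout, where the pages that went stale together share a unit, `collection_cost(grouped)` is 0, which is the effect that grouping files by overwrite frequency aims for.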


Page 27: CSEA Linked List Node4

Mukund R : On State Reachability in Counter Automata

Imagine you are writing software for a microwave oven. After you complete the software, and the device passes initial tests, your manager asks you a simple question: “Are you sure that the microwave emitter (he would probably use the more technical term, magnetron) will not be on if the door is open?” For a simple program of a few lines, convincing your manager of this fact may be simple. But look at how difficult it is to analyse even a small quicksort implementation - for most programs of any practical significance, it is difficult to be sure. And still, there are thousands of programs with millions of lines of code each - airplane autopilots, operating systems, web servers, software in nuclear power plants, etc. - where we need to be absolutely sure that there are no mistakes. That’s where formal verification comes in. Across its different flavours and variants, the common goal is typically to make a computer automatically verify the correctness of some program, system or model.

Program verification is closely related to the problem of automatic theorem proving. It was once a goal of mathematicians to have a systematic procedure by which they could prove theorems (you see, everybody is, at heart, of a particularly lazy breed). And come to think of it, what we want to do in program verification is just what you did in algorithms class - prove that some algorithm is correct (the only difference being that we want it done automatically). But alas, the results of Turing and Gödel have shown that it is an elusive goal - it just isn’t possible, theoretically, to mechanically show for arbitrary programs that a program obeys some “non-trivial” property.

Ideal program verification may not be possible, but that doesn’t mean that cases of “practical” significance cannot be dealt with. What we are interested in, then, is specific subclasses. My BTP is on one such subclass - the class of counter automata. The term automaton may be familiar to those who have gone through a course on automata theory, but for others (who have attended digital design), it is an idealized version of the finite-state machine you might be familiar with. Now imagine that these have access to a finite number of integer-valued counters - and you have a simple counter automaton.

The problem is now to decide whether, given a counter automaton and an initial configuration, a final configuration is reachable. “Will the airplane ever do a nosedive if the altitude is below 10000 ft?” - this might be a typical question we wish to answer. About 10 years ago, it was shown that if the automaton is “flat” then we can say this (rather, write a program that can say this). The specific result was that if a counter automaton is flat, then its reachability relation is effectively Presburger-expressible (you might want to read the Wikipedia article on Presburger arithmetic). In my BTP, I am looking at what happens if a counter automaton is not flat, and why, although the reachability relation is Presburger-expressible, it cannot be mechanically computed.
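To see what a configuration and the reachability question look like in practice, here is a minimal Python sketch that searches the configurations of a small counter automaton breadth-first. The encoding is an assumption made for illustration (states as strings, transitions that add fixed deltas, counters that may not go below zero), and the search is cut off after a fixed budget because it may otherwise never terminate; it does not implement the Presburger-based result mentioned above.

```python
from collections import deque


def reachable(transitions, start, target, max_steps=1000):
    """Breadth-first search over configurations (state, counter values).

    `transitions` maps a state to a list of (next_state, deltas) pairs, where
    `deltas` is added to the counters; moves that would make a counter
    negative are not allowed. Returns True if `target` is found, False if
    every reachable configuration was explored without finding it, and None
    if the budget of `max_steps` explored configurations ran out (no verdict).
    """
    seen = {start}
    queue = deque([start])
    explored = 0
    while queue:
        if explored >= max_steps:
            return None
        state, counters = queue.popleft()
        explored += 1
        if (state, counters) == target:
            return True
        for next_state, deltas in transitions.get(state, []):
            updated = tuple(c + d for c, d in zip(counters, deltas))
            if any(value < 0 for value in updated):
                continue
            config = (next_state, updated)
            if config not in seen:
                seen.add(config)
                queue.append(config)
    return False


# One counter: state "p" may increment it or move to "q"; "q" subtracts 2.
example = {"p": [("p", (1,)), ("q", (0,))], "q": [("q", (-2,))]}
```

Calling `reachable(example, ("p", (0,)), ("q", (3,)))` finds the target after a handful of steps, while a target that can never occur simply exhausts the budget and yields no verdict. That gap between "not found yet" and "provably unreachable" is exactly what results like the one for flat automata close.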


Page 28: CSEA Linked List Node4

Linked List
Brought to you by:

Computer Science and Engineering Association, Department of Computer Science and Engineering, Indian Institute of Technology Guwahati

Email: [email protected]: http://csea.iitg.ernet.in

Mail in your suggestions to [email protected]. Visit http://csea.iitg.ernet.in for more.

Save Trees. Do not waste paper.

The Editorial Team

• Om Prasad Patri (Editor)

• Abhishek Anand

• Nitin Dua

• Siddharth Prakash Singh

• Vinay Rajput (Design)
