data structure termpaper

29
Of Topic-> Role of Data Structures in Compiler Design Submitted by submitted to Name-> abhishek jaiswal name->miss. Amandeep kaur Roll.no-> rm1802a17 Regg.no->10802796 Trade->b.tech(cse) lit

Upload: kulabhishek

Post on 24-Nov-2014

113 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: data structure termpaper

Of

Topic-> Role of Data Structures in Compiler Design

Submitted by submitted to

Name-> abhishek jaiswal name->miss. Amandeep kaur

Roll.no-> rm1802a17

Regg.no->10802796

Trade->b.tech(cse) lit

AcknowledgmentFirstly I say to thanks of my subject teacher and education management, because they think able

Page 2: data structure termpaper

me of this topic of term paper and give the golden opportunity to prepare of term paper.

My term paper topic is “Role of data Structure in Compiler Design”. This topic is related on my sub Data Structure; this topic help I know how can compile the software’s. This topic is clear my more concept.

I wish to express to my gratitude to all the people involved in the writing of this project, specially “Amandeep Kaur” mam who was very generous in sharing their time and knowledge with me.I thank to my some other teacher and friends who give help and support me.

I am thankful to my elder brother “Rich Rahul” and “LPU” Wi-Fi; they are give fully support me to prepare my project. My sincere thanks go to my teacher “Amandeep Kaur” mam & “LPU Wi-Fi” because they are help to search and give hints to prepare of my term-paper.

Page 3: data structure termpaper

1. IntroductionWe have seen different applications of template meta-programming such as static data structures , algorithms , design patterns , Reflection, expression templates , and number theory . Compile Time Data Structure is, in fact, not a new concept in C++. Further information about it can be found in the References . Here, we are going to study the Linked List as an example of a compile time data structure, and will try to implement it with template meta-programming.

Template meta-programming is usually difficult to understand at first, especially for those who are not familiar with it. Therefore, we will discuss the run-time counterpart at the same time. We use a naming convention “List” for all the runtime programs to distinguish it with the compile-time version. Although we can implement the run-time programs in more than one way, in the compile-time world, we have only recursion to do everything including looping; therefore, we are going to use recursion in the run-time version to

Page 4: data structure termpaper

Compile Time Linked ListIf we are going to make a single linked list, then our structure of linked list would be something like this:

///////////////////////////////////////////////////////////// Node of Runtime Link List///////////////////////////////////////////////////////////struct ListNode{ int value; ListNode* next;};

Here is the compile time version of this structure:

///////////////////////////////////////////////////////////// Termination of Link List///////////////////////////////////////////////////////////struct End{};

///////////////////////////////////////////////////////////// Node of Static Link List///////////////////////////////////////////////////////////template <int iData, typename Type>struct Node{ enum { value = iData }; typedef Type Next;};

Here, we need one more structure to indicate the termination condition of the linked list. You can also call it is an end marker. In the runtime version, we don’t need it because in that case, we simply check the value of the next filed in the node. If its value is NULL, then it means that it is the last node of the linked list.

Page 5: data structure termpaper

However, in the case of template meta-programming, we have to do template specialization (or partial template specialization) to stop the recursion. We can do template specialization for any specific type or for any specific value. Here, we can’t do template specialization on a value, because the second template parameter is a type. Therefore, we have to create a new type to stop the recursive instantiation of the template. The name of the end marker can be anything, and it can store whatever you like it to. In our example, it would be sufficient to make an empty structure to create a new type as an end marker.

Now, let’s try to implement a few auxiliary functions that work on lists. Here is our first function to insert data in the linked list. We explicitly made its name look similar to the member function of the STL list class, because we are going to implement a few more STL algorithms.

Here is the simplest implementation for inserting items in the runtime single linked list.

///////////////////////////////////////////////////////////// Insert item into Runtime Link List///////////////////////////////////////////////////////////void ListPushBack(ListNode** pHead, int value){ if (*pHead == NULL) { *pHead = new ListNode(); (*pHead)->value = value; } else { ListPushBack(&(*pHead)->next, value); }}

The name of the linked list is prefixed with “List” to distinguish it from the compile time version.

Interestingly, the compile time version of the same function doesn’t use recursion, and its implementation is very easy.

Page 6: data structure termpaper

///////////////////////////////////////////////////////////

// Insert item into Static Link List///////////////////////////////////////////////////////////template <int iData, typename Type>struct PushBack{ typedef Node<iData, Type> staticList;};

And, here is the usage of this function at compile time:

typedef PushBack<2, End>::staticList node1;typedef PushBack<4, node1>::staticList node2;typedef PushBack<6, node2>::staticList node3;typedef PushBack<8, node3>::staticList node4;typedef PushBack<10, node4>::staticList node5;typedef PushBack<12, node5>::staticList myList;

Although we can create a static linked list like this:

typedef Node<-15, Node<15, Node<18, Node<13, End> > > > myList;

the above method to create a static linked list has a few advantages, which we will see when we implement a few STL algorithms in the compile-time version.

Now, let’s implement a few more STL list algorithms at compile time. But, first take a look at its runtime version to better understand template meta-programming.

///////////////////////////////////////////////////////////// Structure to calculate the length of Runtime Link List///////////////////////////////////////////////////////////int ListSize(ListNode* pHead){ if (pHead == NULL) return 0; else return 1 + ListSize(pHead->next);}

Page 7: data structure termpaper

This function is quite simple, and uses tail recursion for optimization. The compiler can optimize tail recursion with looping to avoid any runtime stack overhead. Here is the compile-time version of the same function:

///////////////////////////////////////////////////////////// Structure to calculate the length of Static Link List///////////////////////////////////////////////////////////template <typename T>struct Size;

template <int iData, typename Type>struct Size<Node<iData, Type> >{ enum { value = 1 + Size<Type>::value };};

template <>struct Size<End>{ enum { value = 0 };};

Although the STL list class doesn’t have an at() function, because list doesn’t have a random access iterator, we are trying to implement this function for the linked list. Because we can’t access any item of the linked list randomly, it is a linear time function, not a constant time one just like the at() function of the vector.

Here is the simple run-time implementation of the at() function on a single linked list with linear complexity:

///////////////////////////////////////////////////////////// Structure to find item from specific location from Runtime Link List///////////////////////////////////////////////////////////int ListAt(ListNode* pHead, int iPos){ static int iIndex = 0; ++iIndex;

Page 8: data structure termpaper

if (iIndex == iPos) return pHead->value; else if (pHead->next == NULL) return -1; else return ListAt(pHead->next, iPos);}

The code presented here is just a proof of concept, not a production quality code. One major problem with this function is the return code. If the input position is greater than the length of the linked list, like the length of the linked list is 4, but we are trying to access 6th element, then this function returns -1. This code is misleading, because -1 can be a data in the linked list at a given position, so it would be impossible to differentiate whether it is an error code or the actual data at the given position.

This one is a much better implementation than the previous one:

///////////////////////////////////////////////////////////// Structure to find item from specific location from Runtime Link List///////////////////////////////////////////////////////////bool ListAt(ListNode* pHead, int iPos, int* iVal){ static int iIndex = 0; ++iIndex;

if (iIndex == iPos) { *iVal = pHead->value; return true; } else if (pHead->next == NULL) { return false; } else { return ListAt(pHead->next, iPos, iVal); }

Page 9: data structure termpaper

}

This function returns the value at a specific location by parameter. If the user passes a position that is greater than the length of the linked list, then it will return false; otherwise, it stores the value at the parameter and returns true.

Here is the usage of this function:

if (pHead != NULL){ int val; if (ListAt(pHead, 3, &mp;val)) { std::cout << val << std::endl; }}

Here is the compile time version of the same function:

///////////////////////////////////////////////////////////// Structure to find item from specific location from Static Link List///////////////////////////////////////////////////////////template <int iIndex, typename T, int iStart = 0>struct At;

template <int iIndex, int iData, typename Type, int iStart>struct At<iIndex, Node<iData, Type>, iStart>{ enum { value = iIndex == iStart ? iData : At<iIndex, Type, (iStart + 1)>::value };};

template <int iIndex, int iData, int iStart>struct At<iIndex, Node<iData, End>, iStart>{ enum { value = iIndex == iStart ? iData : -1 };};

Page 10: data structure termpaper

This program has the same problem that it returns -1 when the item is not found. In template meta-programming, we can’t return a value by parameter just like its run-time equivalent. The solution is to introduce one more enum variable inside the structure to store whether the item was found or not. Here is the next version of the same program:

///////////////////////////////////////////////////////////// Structure to find item from specific location from Static Link List///////////////////////////////////////////////////////////template <int iIndex, typename T, int iStart = 0>struct At;

template <int iIndex, int iData, typename Type, int iStart>struct At<iIndex, Node<iData, Type>, iStart>{ enum { value = iIndex == iStart ? iData : At<iIndex, Type, (iStart + 1)>::value }; enum { found = iIndex == iStart ? 1 : At<iIndex, Type, (iStart + 1)>::found };};

template <int iIndex, int iData, int iStart>struct At<iIndex, Node<iData, End>, iStart>{ enum { value = iIndex == iStart ? iData : -1 }; enum { found = iIndex == iStart ? 1 : 0 };};

Although the value variable still stores -1 when an item not found in the linked list, if we use the other variable, i.e., the “found” variable, then we can ignore this value. Here is the usage of this algorithm:

if (At<8, myList>::found == 1){ std::cout << At<8, myList>::value << std::endl;}

3. Compile Time Algorithms

Page 11: data structure termpaper

In this section, we are going to study different STL algorithms and their working with the STL list class, and how to implement those at compile time using template meta-programming. From now, instead of using creating our own single linked list and implementing all the related functions ourselves, we are going to use the STL list class.

3.1. Find AlgorithmThe Find algorithm returns the first occurrence of the specified value in the given range. If it couldn’t find the specified value, then it returns the end iterator, i.e., one past the last element given a range. Here is a simple usage of the Find algorithm on an STL list:

std::list<int> lst;

lst.push_back(7);lst.push_back(14);lst.push_back(21);lst.push_back(28);lst.push_back(35);

std::list<int>::iterator iter_ = std::find(lst.begin(), lst.end(), 7);

if (iter_ != lst.end()) std::cout << *iter_ << std::endl;else std::cout << "Not found" << std::endl;

Here is the closest implementation of the Find algorithm in template meta-programming:

///////////////////////////////////////////////////////////// Structure to find the location of specific item in Static Link List///////////////////////////////////////////////////////////template <typename TBegin, typename TEnd, int iFindData, int iStart = 0>struct Find;

Page 12: data structure termpaper

template <int iData, typename TBegin, typename TEnd, int iFindData, int iStart>struct Find<Node<iData, TBegin>, TEnd, iFindData, iStart>{ enum { value = iFindData == iData ? iStart : Find<TBegin, TEnd, iFindData, (iStart + 1)>::value };};

template <int iData, typename TEnd, int iFindData, int iStart>struct Find<Node<iData, TEnd>, TEnd, iFindData, iStart>{ enum { value = iFindData == iData ? iStart : -1 };};

template <typename TEnd, int iFindData>struct Find<TEnd, TEnd, iFindData>{ enum { value = -1 };};

This implementation returns the position of the specified value, if found in the given range of a single linked list; otherwise, it returns -1. Here is the usage of the above implementation:

typedef PushBack<7, End>::staticList node1;typedef PushBack<14, node1>::staticList node2;typedef PushBack<21, node2>::staticList node3;typedef PushBack<28, node3>::staticList node4;typedef PushBack<35, node4>::staticList myList;

std::cout << Find<myList, End, 7>::value << std::endl;

We can even pass the range, just like in the STL algorithms, to find elements in between a given range.

std::cout << Find<node3, node1, 7>::value << std::endl;

Page 13: data structure termpaper

This is one of the reasons to use the static version of the PushBack implementation rather than create a whole static linked list in one statement. By using the compile time version of the PushBack implementation, we are able to get the iterator equivalent in the compile time world.

3.2. Count AlgorithmThis algorithm returns the numbers of items in a given range that has the given value. Here is a sample to demonstrate this:

std::list<int> lst;

lst.push_back(7);lst.push_back(14);lst.push_back(7);lst.push_back(14);lst.push_back(21);lst.push_back(7);

std::cout << std::count(lst.begin(), lst.end(), 7) << std::endl;

The output of this program is 3, because 7 exists three times in the given range. Here is a similar algorithm implemented at compile time:

///////////////////////////////////////////////////////////// Count Algorithm. Returns the number of elements in Static Link List///////////////////////////////////////////////////////////template <typename TBegin, typename TEnd, int iVal>struct Count;

template <int iData, typename TBegin, typename TEnd, int iVal>struct Count<Node<iData, TBegin>, TEnd, iVal>{ enum { value = (iData == iVal ? 1 : 0) +

Page 14: data structure termpaper

Count<TBegin, TEnd, iVal>::value };};

template <int iData, typename TEnd, int iVal>struct Count<Node<iData, TEnd>, TEnd, iVal>{ enum { value = iData == iVal ? 1 : 0 };};

template <typename TEnd, int iVal>struct Count<TEnd, TEnd, iVal>{ enum { value = 0 };};

This implementation is very similar to the “Find” algorithm, and its usage is also similar to that. Here is the closest implementation of the compile-time version of the same code we saw in this section using the STL “count” algorithm.

typedef PushBack<7, End>::staticList node1;typedef PushBack<14, node1>::staticList node2;typedef PushBack<7, node2>::staticList node3;typedef PushBack<14, node3>::staticList node4;typedef PushBack<21, node4>::staticList node5;typedef PushBack<7, node5>::staticList myList;

std::cout << Count<myList, End, 7>::value << std::endl;

Just like the “Find” algorithm, we can also count the number of specified items in a given range rather than searching in the complete static linked list.

3.3. Find If AlgorithmIf we want to find whether a linked list contains an item that is greater than 10 but less than 20, then we couldn’t do it with the Find algorithm. Here is the solution to this problem, the Find If algorithm. This algorithm returns the iterator of the element in the linked list that satisfies the condition in the given predicate. The predicate can be a simple function or a function object, i.e., a class with an overloaded () operator, that

Page 15: data structure termpaper

returns true. In that function (or function object), we can check any condition we want. Let’s take a look at the run-time version to better understand it before going to the compile time version.

std::list<int> lst;

lst.push_back(7);lst.push_back(14);lst.push_back(7);lst.push_back(14);lst.push_back(21);lst.push_back(7);

std::list<int>::iterator iter_ = std::find_if(lst.begin(), lst.end(), MyPredicate());

if (iter_ != lst.end()) std::cout << *iter_ << std::endl;else std::cout << "Not found" << std::endl;

Here, we pass MyPridicate as a function object, and here is the implementation of it:

struct MyPredicate{ bool operator()(int val) { return val > 10 && val < 20 ? true : false; }};

Implementing the compile time version of The Find If algorithm is not very difficult, and its looks very similar to the Find algorithm, except one addition to the predicate type parameter. Here is the compile time version of the

Find If algorithm:

///////////////////////////////////////////////////////////// Find If Algorithm.

Page 16: data structure termpaper

// Locate the specific items in Static Link List that satisfy the predicate///////////////////////////////////////////////////////////template <typename TBegin, typename TEnd, template <int iData> class Predicate, int iStart = 0>struct FindIf;

template <int iData, typename TBegin, typename TEnd, template <int iData> class Predicate, int iStart>struct FindIf<Node<iData, TBegin>, TEnd, Predicate, iStart>{ enum { value = Predicate<iData>::value == 1 ? iStart : FindIf<TBegin, TEnd, Predicate, (iStart+1)>::value };};

template <int iData, typename TEnd, template <int iData> class Predicate, int iStart>struct FindIf<Node<iData, TEnd>, TEnd, Predicate, iStart>{ enum { value = Predicate<iData>::value == 1 ? iStart : -1 };};

template <typename TEnd, template <int iData> class Predicate>struct FindIf<TEnd, TEnd, Predicate>{ enum { value = -1 };};

Here we use template-template parameter to make it easy to call and similar to STL algorithm. Here is the usage of this algorithm.

typedef PushBack<7, End>::staticList node1;typedef PushBack<14, node1>::staticList node2;typedef PushBack<7, node2>::staticList node3;

Page 17: data structure termpaper

typedef PushBack<14, node3>::staticList node4;typedef PushBack<21, node4>::staticList node5;typedef PushBack<7, node5>::staticList myList;

std::cout << FindIf<myList, End, MyPredicate>::value << std::endl;

In compile time world, the predicate is a structure (or class), not a function or function object. Here is our compile time version of the same predicate we used earlier in this section for the run time version:

template <int val>struct MyPredicate{ enum { value = val > 10 && val < 20 ? 1 : 0 };};

3.4. Count If AlgorithmThe “Count if” algorithm is a cousin of the “Find If”; the only difference is, it will return the number of elements in the linked list that satisfies the condition given in the predicate. Let’s take a look at the run time version of this algorithm first.

std::list<int> lst;

lst.push_back(7);lst.push_back(14);lst.push_back(7);lst.push_back(14);lst.push_back(21);lst.push_back(7);

std::cout << std::count_if(lst.begin(), lst.end(), MyPredicate()) << std::endl;

And, we use the same predicate that we made earlier:

struct MyPredicate{ bool operator()(int val) {

Page 18: data structure termpaper

return val > 10 && val < 20 ? true : false; }};

The output of this program is 2, because there are two elements in the list which are greater than 10 and less than 20. Now, let’s take a look at the compile time version of the same algorithm.

///////////////////////////////////////////////////////////// Count If Algorithm. // Returns the number of elements in Static Link List// that satisfy the predicate condition///////////////////////////////////////////////////////////template <typename TBegin, typename TEnd, template <int iData> class Predicate>struct CountIf;

template <int iData, typename TBegin, typename TEnd, template <int iData> class Predicate>struct CountIf<Node<iData, TBegin>, TEnd, Predicate>{ enum { value = (Predicate<iData>::value) + CountIf<TBegin, TEnd, Predicate>::value };};

template <int iData, typename TEnd, template <int iData> class Predicate>struct CountIf<Node<iData, TEnd>, TEnd, Predicate>{ enum { value = Predicate<iData>::value };};

template <typename TEnd, template <int iData> class Prediate>struct CountIf<TEnd, TEnd, Prediate>{ enum { value = 0 };};

Page 19: data structure termpaper

This implementation is very similar to the “Find If” algorithm. The only difference is, here, we are actually count the number of elements that satisfy the condition given in the predicate rather than find it. Here is the usage of this algorithm:

typedef PushBack<7, End>::staticList node1;typedef PushBack<14, node1>::staticList node2;typedef PushBack<7, node2>::staticList node3;typedef PushBack<14, node3>::staticList node4;typedef PushBack<21, node4>::staticList node5;typedef PushBack<7, node5>::staticList myList;

std::cout << CountIf<myList, End, MyPredicate>::value << std::endl;

3.5. Min, Max AlgorithmAlthough the STL version of the min and max algorithms don’t work on a range, but for consistency, we implemented both algorithms in a similar way, i.e., you can find the maximum or minimum value in a static linked list in a given range. Here are the simple implementations of both the algorithms:

///////////////////////////////////////////////////////////// Structure to find the Maximum value in the Link List///////////////////////////////////////////////////////////template <typename TBegin, typename TEnd>struct Max;

template <int iData, typename TBegin, typename TEnd>struct Max<Node<iData, TBegin>, TEnd >{ enum { value = iData > Max<TBegin, TEnd>::value ? iData : Max<TBegin, TEnd>::value };};

template <int iData, typename TEnd>struct Max<Node<iData, TEnd>, TEnd>{ enum { value = iData };};///////////////////////////////////////////////////////////

Page 20: data structure termpaper

// Structure to find the Minimum value in the Link List///////////////////////////////////////////////////////////template <typename TBegin, typename TEnd>struct Min;

template <int iData, typename TBegin, typename TEnd>struct Min<Node<iData, TBegin>, TEnd>{ enum { value = iData < Min<TBegin, TEnd>::value ? iData : Min<TBegin, TEnd>::value };};

template <int iData, typename TEnd>struct Min<Node<iData, TEnd>, TEnd>{ enum { value = iData };};

And, here is a simple usage of these algorithms:

std::cout << Max<myList, End>::value << std::endl;std::cout << Min<myList, End>::value << std::endl;

If we use the same static linked list that we created in the previous section, then we will get 21 and 7 as the output as the maximum and minimum values in the static linked list, respectively. Just like the other algorithms, here, we can also specify the range to get the maximum or minimum value in a given range, not in the complete static linked list.

3.6. For Each AlgorithmTill now, all the algorithms we discussed were just getting information from the linked list. None of the above algorithms changed any values in the linked list. This is because once you have some value, then it is very difficult, if not impossible, to change the value we assigned at compile time. Now, we are going to discuss an algorithm that performs some operation on the values of the linked list, but here, we explicitly restrict ourselves to not change any value in the linked list. Let’s take a look at this simple example of the “For Each” algorithm to print the elements of the linked list:

Page 21: data structure termpaper

std::for_each(lst.begin(), lst.end(),PrintFunctionObject());

Here, “PrintFunctionObject” is our function object to do the actual work. Here is a simple implementation of this predicate:

struct PrintFunctionObject{ void operator()(int val) { std::cout << val << std::endl; }};

Remember, it can also be a simple function instead of a function object. Here is the compile time equivalent version of the same algorithm:

///////////////////////////////////////////////////////////// For each algorithm. Apply some action on all items of the list///////////////////////////////////////////////////////////template <typename TBegin, typename TEnd, template <typename T> class Function>struct ForEach;

template <int iData, typename TBegin, typename TEnd, template <typename T> class Function>struct ForEach<Node<iData, TBegin>, TEnd, Function>{ void operator ()() { Function<TBegin>()(iData); ForEach<TBegin, TEnd, Function>()(); }};

template <int iData, typename TEnd, template <typename T> class Function>struct ForEach<Node<iData, TEnd>, TEnd, Function>{ void operator ()() { Function<TEnd>()(iData); }

Page 22: data structure termpaper

};

template <typename TEnd, template <typename T> class Function>struct ForEach<TEnd, TEnd, Function>{ void operator ()() { }};

If you have taken a close look at this implementation, then you may have already noticed that this implementation is quite different from all of those we discussed earlier. In this implementation, there isn’t any “enum” variable; on the contrary, here we have the overload () operator. This is because when we have to perform some action, we have to call some function to do the actual work. This program is actually a mixture of run-time and compile-time programming. All of the programs we discussed earlier were completely resolved by the compiler at the time of compiling and the actual result stored in the executable.

However, in this case, although we are doing compile time computation during compilation using Recursion, the actual function call will still execute when you run this program. This is the only way to perform some IO operations while doing template meta-programming, i.e., using functions that will execute at runtime. Here, we are not only limited to performing IO operations but can do a few more interesting things, like create a class using the new operator or a set of class hierarchies, as described in “Modern C++ Design”[1]. Here is the function object for this compile time implementation of the “For Each” algorithm.

template <typename T>struct PrintFunctonObject{ void operator ()(int iData) { std::cout << iData << std::endl; }};

Page 23: data structure termpaper

The usage of this algorithm is also slightly different. Here, we actually want to execute some code at run-time; therefore, we have to create an object of the “ForEach” structure. Here is one possible way to execute this algorithm using an anonymous object:

References1. Modern C++ Design Generic Programming.2. C++ Template Metaprogramming: Concepts.

3. Reflection support by means of template Meta-programming, Guiseppe Attardi, Antonio Cisternino

4. Static Data Structure: Reconciling Template Meta-programming and Generic Programming.

5. Loops, Metaloops & C++, Roshan Naik.

6. Expression Templates, Veldhuizen, C++ Report 1995, Generative Programming – Methods, Tools and Applications

7. Template Meta Programming and Number Theory, Zeeshan Amjad

o CodeProject

o CodeGuru

8. Template Meta Programming and Number Theory: Part 2, Zeeshan Amjad

o CodeProject

o CodeGuru

Page 24: data structure termpaper