cpp labs

Upload: kanimozhi-suguna

Post on 14-Apr-2018

215 views

Category:

Documents


0 download

TRANSCRIPT

  • 7/30/2019 Cpp Labs

    1/33

    LUND INSTITUTE OF TECHNOLOGY C++ ProgrammingDepartment of Computer Science 2011/12

    Laboratory Exercises, C++ ProgrammingGeneral information:

    The course has four compulsory laboratory exercises.

    You are to work in groups of two people. You sign up for the labs at sam.cs.lth.se/Labs

    (see the course plan for instructions).

    The labs are mostly homework. Before each lab session, you must have done all the

    assignments (A1, A2, . . . ) in the lab, written and tested the programs, and so on. Reasonable

    attempts at solutions count; the lab assistant is the judge of whats reasonable. Contact a

    teacher if you have problems solving the assignments.

    You will not be allowed to participate in the lab if you havent done all the assignments.

    Smaller problems with the assignments, e.g., details that do not function correctly, can be

    solved with the help of the lab assistant during the lab session.

    Extra labs are organized only for students who cannot attend a lab because of illness. Notify

    Per Holm ([email protected]) if you fall ill, before the lab.

    The labs are about:

    1. Basic C++, compiling, linking.

    2. The GNU make and gdb tools, valgrind.

    3. C++ strings and streams.

    4. Programming with the STL, operator overloading.

    Practical information:

    You will use many half-written program skeletons during the lab. You must download the

    necessary files from the course homepage before you start working on the lab assignments.

    The lab files are in separate directories lab[1-4] and are available in gzipped tar format.

    Download the tar file and unpack it like this:

    tar xzf lab1.tar.gz

    This will create a directory lab1 in the current directory.

    General sources of information about C++ and the GNU tools:

    http://www.cplusplus.com

    http://www.cppreference.com

    Info in Emacs (Ctrl-h i)

  • 7/30/2019 Cpp Labs

    2/33

  • 7/30/2019 Cpp Labs

    3/33

    The GNU Compiler Collection and C++ 3

    1 The GNU Compiler Collection and C++

    Objective: to demonstrate basic GCC usage for C++ compiling and linking.

    Read:

    Book: basic C++, variables and types including pointers, expressions, statements, functions,

    simple classes, namespaces, ifstream, ofstream.

    Manpages for gcc, g++, and ld.

    An Introduction to GCC, by Brian Gough, http://www.network-theory.co.uk/docs/

    gccintro/. Treats the same material as this lab, and more. Also available in book form.

    1 Introduction

    The GNU Compiler Collection, GCC, is a suite of native compilers for C, C++, Objective C,

    Fortran 77, Java and Ada. Front-ends for other languages, such as Fortran 95, Pascal, Modula-2

    and Cobol are in experimental development stages. The list of supported target platforms isimpressive GCC has been ported to all kinds of Unix systems, as well as Microsoft Windows

    and special purpose embedded systems. GCC is free software, released under the GNU GPL.

    Each compiler in GCC contains six programs. Only two or three of the programs are language

    specific. For C++, the programs are:

    Driver: the engine that drives the whole set of compiler tools. For C++ programs you invoke

    the driver with g++ (use gcc for C, g77 for Fortran, gcj for Java, etc.). The driver invokes

    the other programs one by one, passing the output of each program as input to the next.

    Preprocessor: normally called cpp. It takes a C++ source file and handles preprocessor directives

    (#include files, #define macros, conditional compilation with #ifdef, etc.).

    Compiler: the C++ compiler, cc1plus. This is the actual compiler that translates the input file

    into assembly language.

    Optimizer: sometimes a separate module, sometimes integrated in the compiler module. It

    handles optimization on a language-independent code representation.

    Assembler: normally called as. It translates the assembly code into machine code, which is

    stored in object files.

    Linker-Loader: called ld, collects object files into an executable file.

    The assembler and the linker are not actually parts of GCC. They may be proprietary operating

    system tools or free equivalents from the GNU binutils package.A C++ source code file is recognized by its extension. We will use .cc, which is the recom-

    mended extension. Other extensions, such as .cxx, .cpp, .C, are sometimes used.

    In C++ (and in C) declarations are collected in header files with the extension .h . To distinguish

    C++ headers from C headers other extensions are sometimes used, such as .hh , .hxx, .hpp, .H.

    We will use the usual .h extension.

    The C++ standard library is declared in headers without an extension. These are, for example,

    iostream, vector, string. Older C++ systems use pre-standard library headers, named iostream.h ,

    vector.h , string.h . Do not use the old headers in new programs.

    A C++ program may not use any identifier that has not been previously declared in the same

    translation unit. For example, suppose that main uses a class MyClass. The program code can be

    organized in several ways, but the following is always used (apart from toy examples):

  • 7/30/2019 Cpp Labs

    4/33

    4 The GNU Compiler Collection and C++

    Define the class in a file myclass.h:

    #ifndef MYCLASS_H // include guard#define MYCLASS_H// #include necessary headers here

    class MyClass {public:

    MyClass(int x);// ...

    private:// ...

    };#endif

    Define the member functions in a file myclass.cc:

    #include "myclass.h"

    // #include other necessary headers

    MyClass::MyClass(int x) { ... }// ...

    Define the main function in a file test.cc (the file name is arbitrary):

    #include "myclass.h"int main() {

    MyClass m(5);// ...

    }

    The include guard is necessary to prevent multiple definitions of names. Do not write function

    definitions in a header (except for inline functions and template functions).

    The g++ command line looks like this:

    g++ [options] [-o outfile] infile1 [infile2 ...]

    All the files in a program can be compiled and linked with one command (the -o option specifies

    the name of the executable file; if this is omitted the executable is named a.out):

    g++ -o test test.cc myclass.cc

    To execute the program, just enter its name:

    ./test

    However, it is more common that files are compiled separately and then linked:

    g++ -c myclass.ccg++ -c test.ccg++ -o test test.o myclass.o

    The -c option directs the driver to stop before the linking phase and produce an object file, named

    as the source file but with the extension .o instead of .cc.The driver can be interrupted also at other stages, using the -S or -E options. The -S option

    stops the driver after assembly code has been generated and produces an assembly code file

  • 7/30/2019 Cpp Labs

    5/33

    The GNU Compiler Collection and C++ 5

    named file.s . The -E option stops the driver after preprocessing and prints the result from this

    first stage on standard output.

    A1. Write a Hello, world! program in a file hello.cc, compile and test it.

    A2. Generate preprocessed output in hello.ii, assembly code output in hello.s , and object code

    in hello.o. Study the files:

    hello.ii: this is just to give you an impression of the size of the header.

    You will find your program at the end of the file.

    hello.s: your code starts at the label main.

    hello.o: since this is a binary file there is not much to look at. The command nm

    hello.o (study the manpage) prints the files symbol table (entry points in the

    program and the names of the functions that are called by the program and must be

    found by the linker).

    2 Options and messagesThere are many options to the g++ command that we didnt mention in the previous section. In

    the future, we will require that your source files compile correctly using the following command

    line:

    g++ -c -pipe -O2 -Wall -W -ansi -pedantic-errors -Wmissing-braces \-Wparentheses -Wold-style-cast file.cc

    Short explanations (you can read more about these and other options in the gcc and g++

    manpages):

    -c just produce object code, do not link-pipe use pipes instead of temporary files for communication between

    tools (faster)

    -O2 optimize the object code on level 2

    -Wall print most warnings

    -W print extra warnings

    -ansi follow standard C++ syntax rules

    -pedantic-errors treat serious warnings as errors

    -Wmissing-braces warn for missing braces {} in some cases

    -Wparentheses warn for missing parentheses () in some cases

    -Wold-style-cast warn for old-style casts, e.g., (int) instead ofstatic cast

    Do not disregard warning messages. Even though the compiler chooses to only issue warnings

    instead of errors, your program is erroneous or at least questionable. Another option, -Werror,

    turns all warnings into errors, thus forcing you to correct your program and remove all warnings

    before object code is produced. However, this makes it impossible to test a half-written program

    so we will normally not use it.

    Some of the warning messages are produced by the optimizer, and will therefore not be

    output if the -O2 flag is not used. But you must be aware that optimization takes time, and

    on a slow machine you may wish to remove this flag during development to save compilation

    time. Some platforms define higher optimization levels, -O3, -O4, etc. You should not use these

    levels, unless you know very well what their implications are. Some of the optimizations that areperformed at these levels are very aggressive and may result in a faster but much larger program.

  • 7/30/2019 Cpp Labs

    6/33

    6 The GNU Compiler Collection and C++

    It is important that you become used to reading and understanding the GCC error messages.

    The messages are sometimes long and may be difficult to understand, especially when the errors

    involve the standard STL classes (or any other complex template classes).

    3 Introduction to make, and a list example

    You have to type a lot in order to compile and link C++ programs the command lines are long,

    and it is easy to forget an option or two. You also have to remember to recompile all files that

    depend on a file that you have modified.

    There are tools that make it easier to compile and link, build, programs. These may be

    integrated development environments (Eclipse, Visual Studio, . . . ) or separate command line

    tools. In Unix, make is the most important tool. We will explain make in detail in lab 2. For now,

    you only have to know this:

    make reads a Makefile when it is invoked,

    the makefile contains a description of dependencies between files (which files that must be

    recompiled/relinked if a file is updated),

    the makefile also contains a description of how to perform the compilation/linking.

    As an example, we take the program from assignment A3 (see below). There, two files ( list.cc and

    ltest.cc) must be compiled and the program linked. Instead of typing the command lines, you

    just enter the command make. Make reads the makefile and executes the necessary commands.

    The makefile looks like this:

    # Define the compiler optionsCXXFLAGS = -pipe -O2 -Wall -W -ansi -pedantic-errorsCXXFLAGS += -Wmissing-braces -Wparentheses -Wold-style-cast

    # Define what to do when make is executed without arguments.all: ltest

    # The following rule means "if ltest is older than ltest.o or list.o,# then link ltest".ltest: ltest.o list.o

    g++ -o ltest ltest.o list.o

    # Define the rules to create the object files.ltest.o: ltest.cc list.h

    g++ -c $(CXXFLAGS) ltest.cclist.o: list.cc list.h

    g++ -c $(CXXFLAGS) list.cc

    You may add rules for other programs to the makefile. All action lines must start with a tab,

    not eight spaces.

    A3. The class List describes a linked list of integer numbers.1 The numbers are stored in

    nodes. A node has a pointer to the next node (0 in the last node). Before the data nodes

    there is an empty node, a head. An empty list contains only the head node. The only

    purpose of the head node is to make it easier to delete an element at the front of the list.

    1 In practice, you would never write your own list class. There are several alternatives in the standard library.

  • 7/30/2019 Cpp Labs

    7/33

    The GNU Compiler Collection and C++ 7

    namespace cpp_lab1 {/* List is a list of long integers */class List {public:

    /* create an empty list */List();

    /* destroy this list */~List();

    /* insert d into this list as the first element */void insert(long d);

    /* remove the first element less than/equal to/greater than d,depending on the value of df. Do nothing if there is novalue to remove. The public constants may be accessed withList::LESS, List::EQUAL, List::GREATER outside the class */

    enum DeleteFlag { LESS, EQUAL, GREATER };void remove(long d, DeleteFlag df = EQUAL);

    /* returns the size of the list */int size() const;

    /* returns true if the list is empty */bool empty() const;

    /* returns the value of the largest number in the list */long largest() const;

    /* print the contents of the list (for debugging) */void debugPrint() const;

    private:/* a list node */struct Node {

    long value; // the node valueNode* next; // pointer to the next node, 0 in the last nodeNode(long value = 0, Node* next = 0);

    };

    Node* head; // the pointer to the list head, which contains// a pointer to the first list node

    /* forbid copying of lists */List(const List&);List& operator=(const List&);

    };};

    Notes: Node is a struct, i.e., a class where the members are public by default. This is

    not dangerous, since Node is private to the class. The copy constructor and assignment

    operator are private, so you cannot copy lists.

    The file list.h contains the class definition and list.cc contains a skeleton of the class

    implementation. Complete the implementation in accordance with the specification.

    Also implement, in a file ltest.cc, a test program that checks that your List implemen-

    tation is correct. Be careful to check exceptional cases, such as removing the first or the last

    element in the list.

    Type make to build the program. Note that you will get warnings about unused

    parameters when the implementation skeleton is compiled. These warnings will disappear

  • 7/30/2019 Cpp Labs

    8/33

    8 The GNU Compiler Collection and C++

    when you implement the functions.

    Try to write code with testing and debugging in mind. The skeleton file list.cc includes

    . This makes it possible to introduce assertions on state invariants. The file also

    contains a definition of a debug macro, TRACE, which writes messages to the clog output

    stream. Assertions as well as TRACE statements are removed from the code when NDEBUG is

    defined.2

    A4. Implement a class Coding with two static methods:

    /* For any character c, encode(c) is a character different from c */static unsigned char encode(unsigned char c);

    /* For any character c, decode(encode(c)) == c */static unsigned char decode(unsigned char c);

    Use a simple method for the coding and decoding. Then write a program, encode, that

    reads a text file3, encodes it, and writes the encoded text to another file. The command

    line:

    ./encode file

    should run the program, encode file, and write the output to file.enc.

    Write another program, decode, that reads an encoded file, decodes it, and writes the

    decoded text to another file. The command line should be similar to that of the encode

    program. Add rules to the makefile for building the programs.

    Test your programs and check that a file that is first encoded and then decoded is

    identical to the original.

    Note: the programs will work also for files that are UTF8-encoded. In this encoding,

    national Swedish characters are encoded in two bytes, and the encode and decode functionswill be called twice for each such character.

    4 Object Code Libraries

    A lot of software is shipped in the form of libraries, e.g., class packages. In order to use a library, a

    developer does not need the source code, only the object files and the headers. Object file libraries

    may contain thousands of files and cannot reasonably be shipped as separate files. Instead, the

    files are collected into library files that are directly usable by the linker.

    4.1 Static Libraries

    The simplest kind of library is a static library. The linker treats the object files in a static library

    in the same way as other object files, i.e., all code is linked into the executable files, which as a

    result may grow very large. In Unix, a static library is an archive file, libname.a. In addition to

    the object files, an archive contains an index of the symbols that are defined in the object files.

    A collection of object files f1.o, f2.o, f3.o, . . . , are collected into a library using the ar command:

    ar crv libfoo.a f1.o f2.o f3.o ...

    2 There are two ways to define a global macro like NDEBUG. It can either be specified in the source file:

    #define NDEBUG

    or be given on the compiler command line, using the -D option:

    g++ -c -DNDEBUG $(OTHER CXXFLAGS) list.cc

    3 Note that you cannot use while (infile >> ch) to read all characters in infile, since >> skips whitespace. Useinfile.get(ch) instead. Output with outfile

  • 7/30/2019 Cpp Labs

    9/33

    The GNU Compiler Collection and C++ 9

    (Some Unix versions require that you also create the symbol table with ranlib libfoo.a.) In

    order to link a program main.o with the object files obj1.o, obj2.o and with object files from the

    library libfoo.a , you use the following command line:

    g++ -o main main.o obj1.o obj2.o -L. -lfoo

    The linker always searches for libraries in certain system directories. The -L. option makes the

    linker search also in the current directory.4 The library name (without lib and .a) is given after

    -l.

    A5. Collect the object files generated in the other assignments, except those containing main

    functions, in a library liblab1.a . Then link the programs (ltest, encode, decode) again,

    using the library.

    4.2 Shared Libraries

    Since most programs use a large amount of code from libraries, executable files can grow very

    large. Instead of linking library code into each executable that needs it, the code can be loaded atruntime. The object files should then be in shared libraries. When linking programs with shared

    libraries, the files from the library are not actually linked into the executable. Instead a pointer

    is established from the program to the library. The obvious advantage is that common code does

    not need to be reproduced in all programs.

    In Unix shared library files are named libname.so[.x.y.z] (.so for shared objects, .x.y.z is

    an optional version number). The linker uses the environment variable LD LIBRARY PATH as the

    search path for shared libraries. In Microsoft Windows shared libraries are known as DLL files

    (for dynamically loadable libraries).

    A6. (Advanced, optional) Create a library as in the previous exercise, but make it shared

    instead of static. Then link the executables, with different names, using the shared library.Make sure they run correctly. Compare the sizes of the dynamically linked executables

    to the statically linked (there will not be a big size difference, since the size of the library

    code is small).

    Use the command ldd (list dynamic dependencies) to inspect the linkage of your

    programs. Hints: shared libraries are created by the linker, not the ar archiver. Use the gcc

    manpage and the ld manpage (and, if needed, other manpages) to explain the following

    sequence of operations:

    g++ -fPIC -c *.ccg++ -shared -Wl,-soname,liblab1.so.1 -o liblab1.so.1.0 *.o

    ln -s liblab1.so.1.0 liblab1.so.1ln -s liblab1.so.1 liblab1.so

    You then link with -L. -llab1 as usual. The linker merely checks that all referenced

    symbols are in the shared library. Before you execute the program, you must define

    LD LIBRARY PATH so it includes the current directory (export is for zsh, setenv for tcsh):

    export LD_LIBRARY_PATH=.:$LD_LIBRARY_PATHsetenv LD_LIBRARY_PATH .:$LD_LIBRARY_PATH

    4 You may have several -L and -l options on a command line. Example, where the current directory and the directory

    /usr/local/mylibare searched for the libraries libfoo1.a and libfoo2.a:g++ -o main main.o obj1.o obj2.o -L. -L/usr/local/mylib -lfoo1 -lfoo2

  • 7/30/2019 Cpp Labs

    10/33

    10 The GNU Compiler Collection and C++

    5 Final Remarks

    If a source file includes a header that is not in the current directory, the path to the directory

    containing the header can be given to the compiler using the -I option (similar to -L):

    g++ -c -Ipath file.cc

    To make GCC print all it is doing during compilation (what commands it is calling and a lot of

    other information), the option -v (for verbose) can be added to the command line.

  • 7/30/2019 Cpp Labs

    11/33

    Tools for Practical C++ Development 11

    2 Tools for Practical C++ Development

    Objective: to introduce a set of tools which are often used to facilitate C++ development in a Unix

    environment. The make tool (GNU Make) is used for compilation and linking, gdb (the GNU

    debugger) for controlled execution and testing, valgrind (not GNU) for finding memory-related

    errors.

    Read:

    GNU Make, http://www.gnu.org/software/make/manual/

    GDB User Manual, http://www.gnu.org/software/gdb/documentation/

    valgrind Quick Start, http://www.valgrind.org/docs/manual/QuickStart.html

    Manpages for the tools used in the lab.

    The manuals have introductory tutorials that are recommended as starting points. The manuals

    are very well written and you should consult them if you want to learn more than what is treated

    during the lab.

    1 GNU Make

    1.1 Introduction

    As you saw in lab 1, make is a good tool it sees to it that only files that have been modified

    are compiled (and files that depend on modified files). To do this, it compares the modification

    times5 of source files and object files.

    We will use the GNU version of make. There are other make programs, which are almost but

    not wholly compatible to GNU make. In a Linux or Darwin system, the command make usually

    refers to GNU make. You can check what variant you have with the command make --version.It should give output like the following:

    make --versionGNU Make 3.81Copyright (C) 2006 Free Software Foundation, Inc.This is free software; see the source for copying conditions....

    Ifmake turns out to be something else than GNU make, try the command gmake instead ofmake;

    this should refer to GNU make.

    Make uses a description of the source project in order to figure out how to generate executables.

    The description is written in a file, usually called Makefile. When make is executed, it readsthe makefile and executes the compilation and linking (and maybe other) commands that are

    necessary to build the project.

    The most important construct in a makefile is a rule. A rule specifies how a file (the target),

    which is to be generated, depends on other files (the prerequisites). For instance, the following

    rule specifies that the file ltest.o depends on the files ltest.cc and ltest.h:

    5 In a network environment with a distributed file system, there may occur problems with unsynchronized clocksbetween a client and the file server. Make will report errors like:

    make: ***Warning: File "foo" has modification time in the future.

    make: warning: Clock skew detected. Your build may be incomplete.

    Unfortunately, there is not much you can do about this problem, if you dont have root privileges on the computer.Inform the system administrators about the problem and switch to another computer if possible. If no other computer isavailable, temporarily work on a local file system where you have write access, such as /tmp.

  • 7/30/2019 Cpp Labs

    12/33

    12 Tools for Practical C++ Development

    ltest.o: ltest.cc list.h

    On the line following the dependency specification you write a shell command that generates the

    target. This is called an explicit rule. Example (with a short command line without options):

    ltest.o: ltest.cc list.hg++ -c ltest.cc

    Make has a very unusual syntax requirement: the shell command must be preceded by a tab

    character; spaces are not accepted.

    An implicit rule does not give an explicit command to generate the target. Instead, it relies

    on makes large collection of default rules. For instance, make knows how to generate .o-files

    from most kinds of source files, for instance that g++ is used to generate .o-files from .cc-files.

    The actual implicit rule is:

    $(CXX) $(CPPFLAGS) $(CXXFLAGS) -c -o $@ $ .o is used:ltest.o: ltest.cc list.hlist.o: list.cc list.h

    $^ expands to the complete list of prerequisites. We will make improvements to this makefile

    later.

    Suppose that none of the files ltest, ltest.o, list.o exists. Then, the following commands are

    executed when you run make (the long command lines have been wrapped):

    make ltestg++ -pipe -O2 -Wall -W -ansi -pedantic-errors -Wmissing-braces -Wparentheses

    -Wold-style-cast -c -o ltest.o ltest.ccg++ -pipe -O2 -Wall -W -ansi -pedantic-errors -Wmissing-braces -Wparentheses

    -Wold-style-cast -c -o list.o list.ccg++ -o ltest ltest.o list.o

    If an error occurs in one of the commands, make will be aborted. If there, for instance, is an error

    in list.cc, the compilation of that file will be aborted and the program will not be linked. When

    you have corrected the error and run make again, it will discover that ltest.o is up to date and

    only remake list.o and ltest.

    If you run make without a target name as parameter, make builds the first target it finds in the

  • 7/30/2019 Cpp Labs

    13/33

    Tools for Practical C++ Development 13

    makefile. Since ltest is the first target, make ltest and make are equivalent. By convention, the

    first target should be named all. Therefore, the first rules in the makefile should look like this:

    # Default target all, make everythingall: ltest# Linking:

    ltest: ltest.o list.o$(CXX) -o $@ $^

    Makefiles can also contain directives that control the behavior of make. For example, a makefile

    can include other files with the include directive.

    A1. The file Makefile contains the makefile that has been used as an example. Experiment with

    make: copy the necessary source files (list.h , list.cc and ltest.cc) to the lab2 directory and

    run make. Run make again. Delete the executable program and run make again. Change

    one or more of the source files (it is sufficient to touch them) and see what happens. Run

    make ltest.o. Run make notarget. Read the manpage and try other options. Etc., etc.

    1.2 Phony targets

    Makefiles may contain targets that do not actually correspond to files. The all target in the

    previous section is an example. Now, suppose that a file all is created in the directory that

    contains the makefile. If that file is newer than the ltest file, a make invocation will do nothing

    but say make: Nothing to be done for all., which is not the desired behavior. The solution

    is to specify the target all as a phony target, like this:

    .PHONY: all

    Another common phony target is clean. Its purpose is to remove intermediate files, such as object

    files, and it has no prerequisites. It typically looks like this:

    .PHONY: cleanclean:

    $(RM) $(OBJS)

    The variable RM defaults to rm -f, where the option -f tells rm not to warn about non-existent

    files. The return value from the command is always 0, so the clean target will always succeed.

    A cleaning rule that removes also executable files and libraries could be called, e.g., cleaner or

    realclean.

    All Unix source distributions contain one or more makefiles. A makefile should contain

    a phony target install, with all as its prerequisite. The make install command copies theprograms and libraries built by the all rule to a system-wide location where they can be reached

    by other users. This location, which is called the installation prefix, is usually the directory

    /usr/local. Installation uses the GNU command install (or plain cp) to copy programs to

    $(PREFIX)/bin and libraries to $(PREFIX)/lib. Since only root has permission to write in

    /usr/local, you will have to use one of your own directories as prefix.

    A2. Create the bin and lib directories. Add all, clean and install targets to your makefile and

    specify suitable phony targets. Also provide an uninstall target that removes the files that

    were installed by the install target.

  • 7/30/2019 Cpp Labs

    14/33

    14 Tools for Practical C++ Development

    1.3 Rules for Linking

    In our example makefile, we used an explicit rule for linking. Actually, there is an implicit rule

    for linking, which looks (after some variable expansions) like this:

    $(CC) $(LDFLAGS) $^ $(LOADLIBES) $(LDLIBS) -o $@

    LDFLAGS are options to the linker, such as -Ldirectory. LOADLIBES and LDLIBS6 are variables

    intended to contain libraries, such as -llab1. So this is a good rule, except for one thing: it

    uses $(CC) to link, and CC is by default gcc, not g++. But if you change the definition of CC, the

    implicit rule works also for C++:

    # Define the linkerCC = g++

    1.4 Generating Prerequisites Automatically

    While youre working with a project the prerequisites are often changed. New #include directives

    are added and others are removed. In order for make to have correct information of the

    dependencies, the makefile must be modified accordingly. The necessary modifications can be

    performed automatically by make itself.

    A3. Read the make manual, section 4.14 Generating Prerequisites Automatically, and imple-

    ment such functionality in your makefile. Hint: Copy the %.d rule (it is not necessary that

    you understand everything in it) to your makefile and make suitable modifications. Do not

    forget to include the *.d files.

    The first time you run make you will get a warning about .d files that dont exist. This

    is normal and not an error. Look at the .d files that are created and see that they contain

    the correct dependencies.

    One useful make facility that we havent mentioned earlier is wildcards: as an example,

    you can define the variable SRC as a list of all .cc files with SRC = $(wildcard *.cc). And

    from this you can get all .o files with OBJ = $(SRC:.cc=.o).

    From now on, you must update the makefile when you add new programs, so that you

    can always issue the make command in the build directory to build everything. When you

    add a new program, you should only have to add a rule that defines the .o-files that are

    necessary to link the program, and maybe add the program name to a list of executables.

    This is important it will save you many hours of command typing.

    (Optional) Collect all object files not containing main functions into a library, and link

    executables against that library. If you know how to create shared libraries, use a sharedlibrary libcxx.so, otherwise create a static archive libcxx.a .

    1.5 Writing a Good Makefile

    A makefile should always be written to run in sh (Bourne Shell), not in csh. Do not use special

    features from, e.g., bash, zsh, or ksh. The tools used in explicit rules should be accessed through

    make variables, so that the user can substitute alternatives to the defaults.

    A makefile should also be written so that it doesnt have to be located in the same directory

    as the source code. You can achieve this by defining a SRCDIR variable and make all rules refer

    to the source code from that prefix. Even better is to use the variable VPATH, which specifies a

    search path for source files. Vpath is described in section 4.5 of the make manual.

    6 There doesnt seem to be any difference between LOADLIBES and LDLIBS they always appear together and areconcatenated. Use LDLIBS.

  • 7/30/2019 Cpp Labs

    15/33

    Tools for Practical C++ Development 15

    A4. (Optional, VPATH is difficult) Create a new directory, src, for C++ lab source code and copy

    the source files from lab 1 into it. In a parallel directory build, install the Makefile, and

    modify it so that you can build the programs (and possibly libraries), standing in the build

    directory. The all target, which should be default, should build your ltest, encode, and

    decode programs. No command should write to the src directory.

    2 Debugging

    Consider the following program (reciprocal.cc):

    #include #include // for atoi, see below

    double reciprocal(int i) {return 1.0 / i;

    }

    int main(int argc, char** argv) {

    int i = std::atoi(argv[1]); // atoi: "ascii-to-integer"double r = reciprocal(i);std::cout

  • 7/30/2019 Cpp Labs

    16/33

    16 Tools for Practical C++ Development

    (gdb) is a command prompt.

    The first step is to run the program inside the debugger. Enter the command run and the program

    arguments. First run the program without arguments, like this:

    (gdb) runStarting program: /h/dd/c/.../reciprocal

    Program received signal SIGSEGV, Segmentation fault.*__GI_____strtol_l_internal (nptr=0x0, endptr=0x0, base=10, group=0, loc=0xb7f1c380)

    at strtol_l.c:298...

    The error occurred in the strtol l internal function, which is an internal function that atoi

    calls. Inspect the call stack with the where command:

    (gdb) where#0 *__GI_____strtol_l_internal (nptr=0x0, endptr=0x0, base=10, group=0, loc=0xb7f1c380)

    at strtol_l.c:298#1 0xb7e07fc6 in *__GI_strtol (nptr=0x0, endptr=0x0, base=10) at strtol.c:110#2 0xb7e051b1 in atoi (nptr=0x0) at atoi.c:28#3 0x080487b0 in main (argc=1, argv=0xbfa1be44) at reciprocal.cc:9

    You see that the atoi function was called from main. Use the up command (three times) to go to

    the main function in the call stack:

    ...(gdb) up#3 0x080487b0 in main (argc=1, argv=0xbfa1be44) at reciprocal.cc:99 int i = std::atoi(argv[1]); // atoi: "ascii-to-integer"

    Note that GDB finds the source code for main, and it shows the line where the function call

    occurred.

    You can examine the value of a variable using the print command:

    (gdb) print argv[1]$1 = 0x0

    This confirms a suspicion that the problem is indeed a 0 pointer passed to atoi. Set a breakpoint

    on the first line of main with the break command:

    (gdb) break mainBreakpoint 1 at 0x80487a0: file reciprocal.cc, line 9.

    Now run the program with an argument:

    (gdb) run 5The program being debugged has been started already.Start it from the beginning? (y or n) y

    Starting program: /h/dd/c/.../reciprocal 5

    Breakpoint 1, main (argc=2, argv=0xbff7aae4) at reciprocal.cc:99 int i = std::atoi(argv[1]); // atoi: "ascii-to-integer"

    The debugger has stopped at the breakpoint. Step over the call to atoi using the next command:

  • 7/30/2019 Cpp Labs

    17/33

    Tools for Practical C++ Development 17

    (gdb) next10 double r = reciprocal(i);

    If you want to see what is going on inside reciprocal, step into the function with step:

    (gdb) step

    reciprocal (i=5) at reciprocal.cc:55 return 1.0 / i;

    The next statement to be executed is the return statement. Check the value of i and then issue

    the continue command to execute to the next breakpoint (which doesnt exist, so the program

    terminates):

    (gdb) print i$ 2 = 5(gdb) continueContinuing.argv[1] = 5, 1/argv[1] = 0.2

    Program exited normally.

    A5. Try the debugger by working through the example. Experiment with the commands (the

    table below gives short explanations of some useful commands). Commands may be

    abbreviated and you may use tab for command completion. Ctrl-a takes you to the

    beginning of the command line, Ctrl-e to the end, and so on. Use the help facility and the

    manual if you want to learn more.

    help [command] Get help about gdb commands

    run [args...] Run the program with arguments specified.

    continue Continue execution.

    next Step to the next line over called functions.

    step Step to the next line into called functions.

    where Print the call stack.

    up Go up to caller.

    down Go down to callee.

    list [nbr] List 10 lines around the current line or around line nbr (the

    following lines if repeated).

    break func Set a breakpoint on the first line of a function func.

    break nbr Set a breakpoint at line nbr in the current file.

    catch [arg] Break on events, for example exceptions that are thrown or caught

    (catch throw, catch catch).

    info [arg] Generic command for showing things about the program, for

    example information about breakpoints (info break).

    delete [nbr] Delete all breakpoints or breakpoint nbr.

    print expr Print the value of the expression expr.

    display var Print the value of variable var every time the program stops.

    undisplay [nbr] Delete all displays or display nbr.

    set var = expr Assign the value of expr to var.

    kill Terminate the debugger process.

    watch var Set a watchpoint, i.e., watch all modifications of a variable. This

    can be very slow but can be the best solution to find bugs whenrandom pointers are used to write in the wrong memory loca-

    tion.

  • 7/30/2019 Cpp Labs

    18/33

    18 Tools for Practical C++ Development

    whatis var|func Print the type of a variable or function.

    source file Read and execute the commands in file.

    make Run the make command.

    The debugging information that is written to the executable file can be removed using the

    strip program from binutils. See the manpage for more information.

    You may wish to disable optimization (remove the -O2 option) while you are debugging

    a program. GCC allows you to use a debugger together with optimization but the

    optimization may produce surprising results: some variables that you have defined may

    not exist in the object code, flow of control may move where you did not expect it, some

    statements may not be executed because they compute constant results, etc.

    To make it easier to switch between generating debugging versions and production

    versions of your programs, you can add something like this to your makefile:

    # insert this line somewhere at the top of the fileDEBUG = true // or false

    # insert the following lines after the definitions of CXXFLAGS and LDFLAGSifeq ($(DEBUG), true)CXXFLAGS += -ggdbLDFLAGS += -ggdbelseCXXFLAGS += -O2endif

    You can specify the value of DEBUG on the command line (the -e options means that the

    command line value will override any definition in the makefile):

    make -e DEBUG=true

    3 Finding Memory-Related Errors

    C++ is less programmer-friendly than Java. In Java, many common errors are caught by the

    compiler (use of uninitialized variables) or by the runtime system (addressing outside array

    bounds, dereferencing null pointers, etc.). In C++, errors of this kind are not caught; instead

    they result in erroneous results or traps during program execution. Furthermore, you get no

    information whatsoever about where in the program the error occurred. Since allocation of

    dynamic memory in C++ is manual, you also have a whole new class of errors (double delete,

    memory leaks).Valgrind (http://www.valgrind.org) is a tool (available only under Linux and Mac OS X)

    that helps you to find memory-related errors at the precise locations at which they occur. It does

    this by emulating an x86 processor and supplying each data bit with information about the usage

    of the bit. This results in slower program execution, but this is more than compensated for by the

    reduced time spent in hunting bugs.

    It is very easy to use valgrind. To find the error in the reciprocal program from the previous

    section, you just do this (compile and link with -ggdb and without optimization):

    valgrind --leak-check=yes ./reciprocal==2938== Memcheck, a memory error detector==2938== Copyright (C) 2002-2010, and GNU GPLd, by Julian Seward et al.==2938== ...==2938== Invalid read of size 1==2938== at 0x41A602F: ____strtol_l_internal (strtol_l.c:298)

  • 7/30/2019 Cpp Labs

    19/33

    Tools for Practical C++ Development 19

    ==2938== by 0x41A5DE6: strtol (strtol.c:110)==2938== by 0x41A3160: atoi (atoi.c:28)==2938== by 0x80486B9: main (reciprocal.cc:9)==2938== Address 0x0 is not stackd, mallocd or (recently) freed==2938== ...==2938==

    ==2938== HEAP SUMMARY:==2938== in use at exit: 0 bytes in 0 blocks==2938== total heap usage: 0 allocs, 0 frees, 0 bytes allocated==2938====2938== All heap blocks were freed -- no leaks are possible==2938====2938== For counts of detected and suppressed errors, rerun with: -v==2938== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 17 from 6)[1] 2938 segmentation fault valgrind --leak-check=yes ./reciprocal

    When the error occurs, you get an error message and a stack trace (and a lot of other information).

    In this case, it is almost the same information as when you used gdb, but this is only because the

    error resulted in a segmentation violation. But most errors dont result in immediate traps:

    void zero(int* v, size_t n) {for (size_t i = 0; i < n; ++i)

    v[i] = 0;}int main() {

    int* v = new int[5];zero(v, 10); // oops, should have been 5// ... this error may result in a trap now or much later, or wrong results

    }

    A6. The file vgtest.cc contains a test program with two functions with common programmingerrors. Write more functions to check other common errors. Note that valgrind cannot find

    addressing errors in stack-allocated arrays (like int v[10]; v[10] = 0). Run the program

    under valgrind, try to interpret the output.

    Advice: always use valgrind to check your programs! Note that valgrind may give

    false error reports on C++ programs that use STL or C++ strings. See the valgrind FAQ,

    http://valgrind.org/docs/manual/faq.html#faq.reports

    4 Makefiles and Topological Sorting

    4.1 Introduction

    Targets and prerequisites of a makefile form a graph where the vertices are files and an arcbetween two vertices corresponds to a dependency. When make is invoked, it computes an order

    in which the files are to be compiled so the dependencies are satisfied. This corresponds to a

    topological sort of the graph.

    Consider a simplified makefile containing only implicit rules, i.e., lines reading target:

    prerequisites. You are to implement a C++ program topsort that reads such a file, whose name

    is given on the command line, and outputs the targets to standard output in an order where

    the dependencies are satisfied. If the makefile contains syntax errors or cyclic dependencies, the

    program should exit with an error message.

    This is not a trivial task. The solution will be developed stepwise in the following sections.

  • 7/30/2019 Cpp Labs

    20/33

    20 Tools for Practical C++ Development

    4.2 Graph Representation

    The first task is to choose a representation of the dependency graph. We will use a representation

    where each vertex has a list of its neighbors. This list is called an adjacency list.

    Consider the following makefile:

    D: AE : B DB : A C

    The dependencies in the makefile form the following graph (the graph to the left, the vertices

    with their adjacency lists to the right):

    A D

    B

    C E

    A

    D

    B

    C

    E

    A graph has a list ofVertex objects (A, B, C, D, E, in the figure). A vertex is represented by an object

    of class Vertex, which has the attributes name (the label on the vertex) and adj (the adjacency

    list). In the figure, the adjacency list of vertex A contains B and D.

    The class definitions follow. First, class Vertex (the attribute color is used when traversing

    the graph, see section 4.3):

    class Vertex {friend class VertexList; // give VertexList access to private members

    private:/* create a vertex with name nm */Vertex(const std::string& nm);

    std::string name; // name of the vertexVertexList adj; // list of adjacent verticesenum Color { WHITE, GRAY, BLACK };Color color; // used in the traversal algorithm

    };

    In class VertexList, the adjacency list is implemented as a vector of pointers to other vertices

    (vptrs). The functions top sort and dfs visit are described in section 4.3.

    struct cyclic {}; // exception type, the graph is cyclic

    class VertexList {public:

    /* create an empty vertex list, destroy the list */VertexList();~VertexList();

    /* insert a vertex with the label name in the list */void add_vertex(const std::string& name);

  • 7/30/2019 Cpp Labs

    21/33

    Tools for Practical C++ Development 21

    /* insert an arc from the vertex from to the vertex to */void add_arc(const std::string& from, const std::string& to);

    /* return the vertex names in reverse topological order */std::stack top_sort() throw(cyclic);

    /* print the vertex list (for debugging) */void debugPrint() const;

    private:void insert(Vertex* v); // insert the vertex pointer v in vptrs, if

    // its not already thereVertex* find(const std::string& name) const; // return a pointer to the

    // vertex named name, 0 if not foundvoid dfs_visit(Vertex* v, std::stack& result) throw(cyclic);

    // visit v in the traversal algorithm

    std::vector vptrs; // vector of pointers to vertices

    VertexList(const VertexList&); // forbid copyingVertexList& operator=(const VertexList&);};

    Finally, the class Graph is merely a typedef for a VertexList:

    typedef VertexList Graph;

    Notes:

    add vertex in VertexList should create a new Vertex object and add it to the vertex list.

    It should do nothing if a vertex with that name already is present in the list.

    add arc should insert to in froms adjacency list. It should start with inserting from andto in the vertex list in that way arcs and vertices may be added in arbitrary order.

    find is used in add vertex and in insert.

    The VertexList destructor is quite difficult to write correctly, since you have to be careful

    so no Vertex object is deleted more than once. It is easiest to start by removing pointers

    from the adjacency lists so the graph isnt circular, before deleting the objects. You may

    treat the destructor as optional, if you wish.

    Advice: draw your own picture of a graph structure, with all objects, vptrs and pointers.

    A7. The classes are in the files vertex.h , vertex.cc, vertexlist.h , vertexlist.cc, graph.h . Imple-

    ment the classes, except top sort and dfs visit, and test using the program graph test.cc.

    Comment out the lines in the function testGraph that check the top sort algorithm.

    4.3 Topological Sort

    A topological ordering of the vertices of a graph is essentially obtained by a depth-first search

    from all vertices. (There are other algorithms for topological sorting. If you prefer another

    algorithm you are free to use that one instead.) The topological ordering is not necessarily unique.

    For instance, the vertices of the graph in the preceding section may be ordered like this:

    A C B D E or

    C A D B E or . . .

    The algorithm is described below, in pseudo-code. The algorithm produces the vertices in reversetopological order, which is why we use a stack for the result.

  • 7/30/2019 Cpp Labs

    22/33

    22 Tools for Practical C++ Development

    The algorithm should detect cycles in the graph. If that happens, your function should throw

    a cyclic exception. The cycle detection is not described in the algorithm; you must add that

    yourself.

    Topological-Sort(Graph G) {/* 1. initialize the graph and the result stack */for each vertex v in G

    v.color = WHITE;Stack result;/* 2. search from all unvisited vertices */for each vertex v in G

    if (v.color == WHITE)DFS-visit(v, result);

    }

    DFS-visit(Vertex v, Stack result) {/* 1. white vertex v is just discovered, color it gray */v.color = GRAY;/* 2. recursively visit all unvisited adjacent nodes */for each vertex w in v.adj

    if (w.color == WHITE)DFS-visit(w);

    /* 3. finish vertex v (color it black and output it) */v.color = BLACK;result.push(v.name);

    }

    A8. Implement top sort and dfs visit and test the program.

    4.4 Parsing the Makefile

    A9. Finally, you are ready to solve the last part of the assignment: to read a makefile, build a

    graph, sort it topologically and print the results. The input to the program is a makefile

    containing lines of the type:

    target: prereq1 prereq2 ...

    The names of the target and the prerequisites are strings. The list of prerequisites may

    be empty. The program should perform reasonable syntax checks (the most important is

    probably that a line must contain a target).

    The file topsort.cc contains a test program. Implement the function buildGraph and

    test. The files test[1-5].txt contain test cases (the same that were used in the graph test

    program in assignment A8), but you are encouraged to also test other cases.

  • 7/30/2019 Cpp Labs

    23/33

    Strings and Streams 23

    3 Strings and Streams

    Objective: to practice using the string class and the stream classes.

    Read:

    Book: strings, streams, function templates.

    1 Class string

    1.1 Introduction

    In C, a string is a null-terminated array of characters. This representation is the cause of many

    errors: overwriting array bounds, trying to access arrays through uninitialized or incorrect

    pointers, and leaving dangling pointers after an array has been deallocated. The

    library contains some useful operations on C-strings, such as copying and comparing strings.

    C++ strings hide the physical representation of the sequence of characters. A C++ string

    object knows its starting location in memory, its contents (the characters), its length, and thelength to which it can grow before it must resize its internal data buffer (i.e., its capacity).

    The exact implementation of the string class is not defined by the C++ standard. The

    specification in the standard is intended to be flexible enough to allow different implementations,

    yet guarantee predictable behavior.

    The string identifier is not actually a class, but a typedef for a specialized template:

    typedef std::basic_string string;

    So string is a string containing characters of the type char. There is another standard string

    specialization, wstring, which describes strings of characters of the type wchar t. wchar t is the

    type representing wide characters. The standard does not specify the size of these characters;usually its 16 or 32 bits (32 in gcc). In this lab we will ignore all internationalization problems

    and assume that all characters fit in one byte. Strings in C++ are not limited to the classes string

    and wstring; you may specify strings of characters of any type, such as int. However, this is

    discouraged: there are other sequence containers that are better suited for non-char types.

    Since were only interested in the string specialization of basic string, you may wonder why

    we mention basic string at all? There are two reasons: First, std::basic string is all the

    compiler knows about in all but the first few translation passes. Hence, diagnostic messages only

    mention std::basic string. Secondly, it is important for the general understanding of the

    ideas behind the string method implementations to have a grasp of the basics of a C++ concept

    called character traits, i.e., character characteristics. For now, we will just show parts of the

    definition ofbasic string:

    /* Standard header */namespace std {

    templateclass basic_string {

    // types:typedef traits_t traits_type;typedef typename traits_t::char_type char_type;typedef unsigned long size_type;static const size_type npos = -1;// much more

    };}

  • 7/30/2019 Cpp Labs

    24/33

    24 Strings and Streams

    char type is the type of the characters in the string. size type is a type used for indexing in the

    string. npos (no position) indicates a position beyond the end of the string; it may be returned

    by functions that search for characters in a string. Since size type is unsigned, npos represents

    positive infinity.

    1.2 Operations on Strings

    We now show some of the most important operations on strings (actually basic strings). The

    operations are both member functions and global helper functions. We start with the member

    functions (for formatting reasons, some parameters of type size type have been specified as

    int):

    class string {public:

    /*** construction ***/string(); // create an empty stringstring(const string& s); // create a copy of sstring(const char* cs); // create a string with the characters from csstring(const string& s, int start, int n); // create a string with n

    // characters from s, starting at position startstring(int n, char ch); // create a string with n copies of ch

    /*** information ***/int size(); // number of characters in the stringint capacity(); // the string can hold this many characters

    // before resizing is necessaryvoid reserve(int n); // set the capacity to n, resize if necessary

    /*** character access ***/const char& operator[](size_type pos) const;char& operator[](size_type pos);

    /*** substrings */string substr(size_type start, int n); // the substring starting at

    // position start, containing n characters

    /*** finding things ***/// see below

    /*** inserting, replacing, and removing ***/void insert(size_type pos, const string& s); // insert s at position posvoid append(const string& s); // append s at the endvoid replace(size_type start, int n, const string& s); // replace n

    // characters starting at pos with s

    void erase(size_type start = 0, int n = npos); // remove n characters// starting at pos

    /*** assignment and concatenation ***/string& operator=(const string& s);string& operator=(const char* cs);string& operator=(char ch);string& operator+=(const string& s); // also for const char* and char

    /*** access to C-string representation ***/const char* c_str();

    }

    Note that there is no constructor string(char). Use string(1, char) instead.

  • 7/30/2019 Cpp Labs

    25/33

    Strings and Streams 25

    The subscript functions operator[] do not check for a valid index. There are similar at()

    functions that do check, and that throw out of range if the index is not valid.

    The substr() member function takes a starting position as its first argument and the number

    of characters as the second argument. This is different from the substring() method in

    java.lang.String, where the second argument is the end position of the substring.

    There are overloads of most of the functions. You can use C-strings or characters as

    parameters instead of strings.

    There is a bewildering variety of member functions for finding strings, C-strings or charac-

    ters. They all return npos if the search fails. The functions have the following signature (the

    string parameter may also be a C-string or a character):

    size_type FIND_VARIANT(const string& s, size_type pos = 0) const;

    s is the string to search for, pos is the starting position. (The default value for pos is npos,

    not 0, in the functions that search backwards).

    The find variants are find (find a string, forwards), rfind (find a string, backwards),find first of and find last of (find one of the characters in a string, forwards or back-

    wards), find first not of and find last not of (find a character that is not one of the

    characters in a string, forwards or backwards).

    Example:

    void f() {string s = "accdcde";int i1 = s.find("cd"); // i1 = 2 (s[2]==c && s[3]==d)int i2 = s.rfind("cd"); // i2 = 4 (s[4]==c && s[5]==d)int i3 = s.find_first_of("cd"); // i3 = 1 (s[1]==c)int i4 = s.find_last_of("cd"); // i4 = 5 (s[5]==d)

    int i5 = s.find_first_not_of("cd"); // i5 = 0 (s[0]!=c && s[0]!=d)int i6 = s.find_last_not_of("cd"); // i6 = 6 (s[6]!=c && s[6]!=d)

    }

    The global overloaded operator functions are for concatenation (operator+) and for comparison

    (operator==, operator

  • 7/30/2019 Cpp Labs

    26/33

    26 Strings and Streams

    A2. The Sieve of Eratosthenes is an ancient method for finding all prime numbers less than

    some fixed number M. It starts by enumerating all numbers in the interval [0, M] and

    assuming they are all primes. The first two numbers, 0 and 1 are marked, as they are not

    primes. The algorithm then starts with the number 2, marks all subsequent multiples of

    2 as composites, and repeats the process for the next prime candidate. When the initial

    sequence is exhausted, the numbers not marked as composites are the primes in [0, M].In this assignment you shall use a string for the enumeration. Initialize a string with

    appropriate length to PPPPP...PPP. The characters at positions that are not prime numbers

    should be changed to C. Write a test program that prints the prime numbers between 1

    and 200 and also the largest prime that is less than 100,000.

    Example with the numbers 027:

    1 20123456789012345678901234567

    Initial: CCPPPPPPPPPPPPPPPPPPPPPPPPPPFind 2, mark 4,6,8,...: CCPPCPCPCPCPCPCPCPCPCPCPCPCPFind 3, mark 6,9,12,...: CCPPCPCPCCCPCPCCCPCPCCCPCPCC

    Find 5, mark 10,15,20,25: CCPPCPCPCCCPCPCCCPCPCCCPCCCCFind 7, mark 14,21: CCPPCPCPCCCPCPCCCPCPCCCPCCCC...

    2 The iostream Library

    2.1 Input/Output of User-Defined Objects

    We have already used most of the stream classes: istream, ifstream, and istringstream for

    reading, and ostream, ofstream, and ostringstream for writing. There are also iostreams that

    allow both reading and writing. The stream classes are organized in the following (simplified)

    generalization hierarchy:

    ios

    istream ostream

    ios_base

    iostream

    ifstream istringstream fstream stringstream ofstream ostringstream

    The classes ios base and ios contain, among other things, information about the stream state.

    There are, for example, functions bool good() (the state is ok) and bool eof() (end-of-file has

    been reached). There is also a conversion operator void*() that returns nonzero if the state is

    good, and a bool operator!() that returns nonzero if the state is not good. We have used these

    operators with input files, writing for example while (infile >> ch) or if (! infile).

    Now, we are interested in reading objects of a class A from an istream using operator>> and

    in writing objects to an ostream using operator(std::istream& is, A& aobj);std::ostream& operator

  • 7/30/2019 Cpp Labs

    27/33

    Strings and Streams 27

    Note:

    Since the parameters are of type istream or ostream, the operator functions work for both

    file streams and string streams.

    The functions are global helper functions.7

    Both operators return the stream object that is the first argument to the operator, enablingus to write, e.g., cout c;if (c == () {

    is >> re >> c;

    if (c == ,) {is >> im >> c;if (c == )) {

    cplx = complex(re, im);}else {

    // format error, set the stream state to failis.setstate(ios_base::failbit);

    }}else if (c == )) {

    cplx = complex(re, num_type(0));}

    else {// format error, set the stream state to failis.setstate(ios_base::failbit);

    }}else {

    is.putback(c);is >> re;cplx = complex(re, num_type(0));

    }return is;

    }}

    7 The alternative implementation, to write operator>> and operator

  • 7/30/2019 Cpp Labs

    28/33

    28 Strings and Streams

    A3. The files date.h , date.cc, and date test.cc describe a simple date class. Implement the

    class and add operators for input and output of dates. Dates should be output in the

    form 2012-01-10. The input operator should accept dates in the same format. (You may

    consider dates written like 2012-1-10 and 2012 -001 - 10 as legal, if you wish.)

    2.2 String Streams

    The string stream classes (istringstream and ostringstream) function as their file counter-

    parts (ifstream and ofstream). The only difference is that characters are read from/written to a

    string instead of a file.

    A4. In Java, the class Object defines a method toString() that is supposed to produce a

    readable representation of an object. This method can be overridden in subclasses.

    Write a C++ function toString for the same purpose. Examples of usage:

    double d = 1.234;Date today;

    std::string sd = toString(d);std::string st = toString(today);

    You may assume that the argument object can be output with

  • 7/30/2019 Cpp Labs

    29/33

    Programming with the STL 29

    4 Programming with the STL

    Objective: to practice using the STL container classes and algoritms, with emphasis on efficiency.

    You will also learn more about operator overloading and STL iterators.

    Read:

    Book: STL containers and STL algorithms. Operator overloading, iterators.

    1 Domain Name Servers and the STL Container Classes

    On the web, computers are identified by IP addresses (32-bit numbers). Humans identify

    computers by symbolic names. A Domain Name Server (DNS) is a component that translates

    a symbolic name to the corresponding IP address. The DNS system is a very large distributed

    database that contains billions (or at least many millions) of IP addresses and that receives billions

    of lookup requests every day. Furthermore, the database is continuously updated. If you wish

    to know more about name servers, read the introduction at http://computer.howstuffworks.

    com/dns.htm.

    In this lab, you will implement a local DNS in C++. With local, we mean that the DNS does

    not communicate with other name servers; it can only perform translations using its own database.

    The goal is to develop a time-efficient name server, and you will study different implementation

    alternatives. You shall implement three versions (classes) of the name server, using different STL

    container classes. All three classes should implement the interface NameServerInterface:

    typedef std::string HostName;typedef unsigned int IPAddress;const IPAddress NON_EXISTING_ADDRESS = 0;

    class NameServerInterface {public:

    virtual ~NameServerInterface() {}virtual void insert(const HostName&, const IPAddress&) = 0;virtual bool remove(const HostName&) = 0;virtual IPAddress lookup(const HostName&) const = 0;

    };

    insert() inserts a name/address pair into the database, without checking if the name already

    exists. remove() removes a name/address pair and returns true if the name exists; it does

    nothing and returns false if the name doesnt exist. lookup() returns the IP address for a

    specified name, or NON EXISTING ADDRESS if the name doesnt exist.

    Since this is an STL lab, you shall use STL containers and algorithms as much as possible.This means, for example, that you are not allowed to use any for or while statements in your

    solutions. (There is one exception: you may use a for or while statement in the hash function,

    see assignment A1c.)

    A1. The definition of the class NameServerInterface is in the file nameserverinterface.h .

    a) Implement a class VectorNameServer that uses an unsorted vector to store the name/

    address pairs. Use linear search to search for an element. The intent is to demonstrate

    that such an implementation is inefficient.

    Use the STL find if algorithm to search for a host name. Consider carefully the third

    parameter to the algorithm: it should be a function or a functor that compares an

    element in the vector (a name/address pair) with a name. If you use a functor, the

    call may look like this:

  • 7/30/2019 Cpp Labs

    30/33

    30 Programming with the STL

    find_if(v.begin(), v.end(), bind2nd(first_equal(), name))

    first equal is a functor (a class) that takes two parameters, a pair and a host name,

    and compares the first component of the pair with the name.

    b) Implement a class MapNameServer that uses a map to store the name/address pairs.

    The average search time in this implementation will be considerably better than thatfor the vector implementation.

    c) It is possible that the map implementation of the name server is sufficiently efficient,

    but you shall also try a third implementation. Implement a class HashNameServer

    that uses a hash table to store the name/address pairs. Use a vector ofvectors to

    store the name/address pairs.8

    The hash table implementation is open for experimentation: you must select an

    appropriate size for the hash table and a suitable hash function.9 It can also be

    worthwhile to exploit the fact that most (in fact all, in the test program) computer

    names start with www.. Without any greater difficulty, you should be able to

    improve the search times that you obtained in the map implementation by about 50%.

    Use the program nstest.cc to check that the insert/remove/lookup functions work correctly.

    Then, use the program nstime.cc to measure and print the search times for the three

    different implementations, using the file nameserverdata.txt as input (the file contains

    25,143 name/address pairs10).

    A2. Examples of average search times in milliseconds for a name server with 25,143 names are

    in the following table.

    25,143 100,000

    vector 0.185

    map 0.00066

    hash 0.00029

    Search the Internet for information about search algorithms for different data structures, or

    use your knowledge from the algorithms and data structures course, and fill in the blanks

    in the table. Write a similar table for your own implementation.

    2 Bitsets, Subscripting, and Iterators

    2.1 Bitsets

    To manipulate individual bits in a word, C++ provides the bitwise operators & (and), | (or),

    ^ (exclusive or), and the shift operators > (shift right). The STL class bitsetgeneralizes this notion and provides operations on a set of N bits indexed from 0 through N-1.

    N may be arbitrary large, indicating that the bitset may occupy many words.

    For historical reasons, bitset doesnt provide any iterators. We will develop a simplified

    version of the bitset class where all the bits fit in one word, and extend the class with iterators

    so it becomes possible to use the standard STL algorithms with the class. Our goal is to provide

    enough functionality to make the following program work correctly:

    8 GNU STL contains a class hash map (in newer versions unordered map), a map implementation that uses a hash table.You should not use any of these classes in this assignment.9 Note that a good hash function should take all (or at least many) of the characters of a string into account and that

    "abc" and "cba" should have different hash codes. For instance, a hash function that merely adds the first and lastcharacters of a string is not acceptable.10 It was difficult to find a file with real name/address pairs, so the computer names are common words from a dictionaryfile, with www. first and .se or similar last. The IP addresses are running numbers.

  • 7/30/2019 Cpp Labs

    31/33

    Programming with the STL 31

    int main() {using namespace std;using namespace cpp_lab4;

    // Define an empty bitset, set some bits, print the bitsetBitset bs;

    bs[3] = true;bs[0] = bs[3];copy(bs.begin(), bs.end(), ostream_iterator(cout));cout

  • 7/30/2019 Cpp Labs

    32/33

    32 Programming with the STL

    An outline of the class (BitsetStorage is the type of the word that contains the bits):

    class BitReference {public:

    BitReference(Bitset::BitStorage* pb, size_t p): p_bits(pb), pos(p) {}

    // Operations will be added later

    protected:Bitset::BitStorage* p_bits; // pointer to the word containing bitssize_t pos; // position of the bit in the word

    };

    Now, operator[] in class Bitset may be defined as follows:

    BitReference operator[](size_t pos) {return BitReference(&bits, pos);

    }

    The actual bit fiddling is performed in the BitReference class. In order to see what we need to

    implement in this class, we must study the results of some expressions involving operator[]:

    bs[3] = true; // bs.operator[](3) = true; =>// BitReference(&bs.bits,3) = true; =>// BitReference(&bs.bits,3).operator=(true);

    From this follows that we must implement the following operator function in BitReference:

    BitReference& operator=(bool x); // for bs[i] = x

    This function should set the bit referenced by the BitReference object to the value ofx (just likethe set function in the original Bitset class). When we investigate (do this on your own) other

    possibilities to use operator[] we find that we also need the following functions:

    BitReference& operator=(const BitReference& bsr); // for bs[i] = bs[j]operator bool() const; // for x = bs[i]

    A4. Use the files bitset.h , bitset.cc, bitreference.h , bitreference.cc, and bitsettest1.cc. Imple-

    ment the functions in bitreference.cc and test. It is a good idea to execute the program line

    by line under control of gdb, even if your implementation is correct, so you can follow

    the execution closely and see what happens (turn off the -O2 switch on the compilation

    command line if you do this).

    2.3 Iterators

    From one of the OH slides: An iterator points to a value. All iterators are DefaultConstruct-

    ible and Assignable and support ++it and it++. A ForwardIterator should additionally be

    EqualityComparable and support *it, both for reading and writing via the iterator.

    The most important requirement is that an iterator should point to a value. A BitsetIterator

    should point to a Boolean value, and we already have something that does this: the class

    BitReference! The additional requirements (++, equality test, and *) can be implemented in a

    derived class. It will look like this:11

    11 This class only contains the constructs that are necessary for the bitset test program. For example, we have notimplemented postfix ++ and comparison with ==.

  • 7/30/2019 Cpp Labs

    33/33

    Programming with the STL 33

    class BitsetIterator : public BitReference {public:

    // ... typedefs, see below

    BitsetIterator(Bitset::BitsetStorage* pb, size_t p): BitReference(pb, p) {}

    BitsetIterator& operator=(const BitsetIterator& bsi);bool operator!=(const BitsetIterator& bsi) const;BitsetIterator& operator++();BitReference operator*();

    };

    The assignment operator must be redefined so it copies the member variables (as opposed to the

    assignment operator in the base class BitReference, which sets a bit in the bitset). The typedefs

    are necessary to make the iterator STL-compliant, i.e., to make it possible to use the iterator with

    the standard STL algorithms. The only typedef that isnt obvious is:

    typedef std::forward_iterator_tag iterator_category;

    This is an example of an iterator tag, and it informs the compiler that the iterator is a forward

    iterator.

    A5. Implement the member functions in bitsetiterator.cc. Uncomment the lines in bitset.h and

    bitset.cc that have to do with iterators, implement the begin() and end() functions. Use

    the program bitsettest2.cc to test your classes.