Data Structures and Algorithms
Instructor:
Dr. Pasko Galina
Spring 2009, EUL
Instructor
Education
1983 M. Sc., Moscow Engineering Physics Institute (MEPhI), Russia, Moscow
2005 Dr. Eng., Kanazawa Institute of Technology, Tokyo, Japan
Other interests:
Japanese tea ceremony (sado)
Japanese flower arrangement (ikebana)
Japanese poetry (haiku)
pasko.org/gip
Research/ Education
* Function-based shape modeling and multidimensional modeling
* Visualization
* Computer animation
* Digital fabrication
* Preservation of cultural heritage
Research Description
Solving long-standing problems in the following computer graphics areas:
(1) Bounded blending operations
(2) Space-time blending and metamorphosis
(3) Trimmed implicit surfaces
Course Evaluation
Assignments – 10%
Quizzes – 10%
Attendance – 10%
Midterm tests – 30%
Final test – 40%
Course information is at
pasko.org/gip/DSA
Info
• The course can be given in Fortran, Pascal, C,
C++, Turbo C, Java, Perl, Lisp, Ada, pseudocode
and many other languages.
• This course is not about any particular
programming language; the lectures are 90%
language independent.
• The assignments will be done in the C
programming language.
Course topics
• Software development
• Linear structures (lists, stacks and queues)
• Nonlinear structures (trees and graphs)
• Elementary sorting and searching methods
• Basics of algorithm analysis
Introduction to Data Structures and Algorithms
SequoiaView
Contents
• ACM Computing Classification System
• Basic terminology
• Data and Data Structures
• Algorithms and Turing machine
• History of computing
The ACM Computing Classification System (1998)
• A. General Literature
• B. Hardware
• C. Computer Systems Organization
• D. Software
• E. Data
• F. Theory of Computation
• G. Mathematics of Computing
• H. Information Systems
• I. Computing Methodologies
• J. Computer Applications
• K. Computing Milieux
http://www.acm.org/class/1998/TOP.html
Notion of Data
Etymology
The word data is the
plural of Latin datum,
"something given".
A work of Euclid (300 BC)
was titled Dedomena (in Latin, Data).
Euclid (325 BC-265 BC)
• Representation of facts, concepts, or
instructions in a formalized manner
suitable for communication,
interpretation, or processing by humans
or by automatic means.
• Any representations such as
characters or analog quantities to which
meaning is or might be assigned.
Notion of Data
Usage
Notion of Data
In discussions of problems in mathematics
and engineering, the terms givens and
data are used interchangeably.
Such usage is the origin of data as a
concept in science: data are numbers,
words, images, etc., accepted as they
stand.
Usage in science
Numerical or other information represented in a form suitable for processing by computer: numbers, characters, images or other outputs from devices that convert physical quantities into symbols.
Usage in computing
Notion of Data
Such data are typically further used by a human or input into a computer, stored and processed there, or transmitted (output) to another human or computer.
Input data → Computer → Output data
Usage in computing
Notion of Data
Notion of Structure
• Structure is a fundamental and sometimes intangible notion covering the recognition, observation, nature, and stability of patterns and relationships of entities.
• The structure of a thing is how the parts of it relate to each other, how it is "put together".
• Both reality and language have structure. One of the goals of science is to create and use language the structure of which accurately parallels the structure of reality.
Notion of Data Structure
Data Structure is a systematic way of
organizing and accessing data
Real world: Filing cabinet
Cyber world: Hierarchical file system
Data structure is also a way of
storing data in a computer so that it
can be used efficiently
Data structure is an actual
implementation of a particular
abstract data type
Types of Data and Data Type
• Categorical
– Nominal
– Ordinal
– Quantitative
• Measurement
– Continuous data
– Discrete data
– Topology/structure data
Data type is an attribute of a datum which tells what kind of data it is.
It involves setting constraints on the datum: what values it can take and what operations may be performed upon it.
Types of Data
• The objects being studied are grouped into categories based on some qualitative trait.
Examples:
– Car color: silver, black, red, blue, etc.
– Opinion of staff about internet connection: good, neutral, bad
– Swimming status: swimmer, non-swimmer
• Commonly summarized using “percentages” or “proportions”:
– 11% of students have a car
Categorical
Categorical Data
• Nominal data - a type of categorical data in
which objects fall into unordered categories, members of some class:
[Tokyo, Moscow, Dubai, Lefke] or [orange, apple, grapefruit]
• Ordinal data - a type of categorical data in
which order is important:
[low, medium, high] or [tiny, small, large, huge]
• Quantitative data - precise numerical values: [1; 2.5; 0.3E-11]
Nominal, Ordinal, Quantitative
graphics.stanford.edu/papers/webviz/
Binary data
• A type of categorical data in which there are only two categories.
• Can either be nominal or ordinal.
• Examples:
Attendance: present or absent
Numbers: 0 or 1
Answers: yes or no
Measurement data
• The objects being studied are “measured” based on some quantitative trait.
• The resulting data are a set of numbers.
• Examples:
– Sugar level
– Weight
– Age
– Time to complete a project
– Number of students late for class
• Measurement data are typically summarized using “averages” or “means”:
– The average number of spring-semester students is 200.
Types of Data
• Theoretically, any value within an interval is
possible with a fine enough measuring device.
• Functions: λ_i = f_i(X), where
X = (x_1, x_2, …, x_n), i = 1, …, m
x_j are independent variables
λ_i are dependent variables (“parameters”,
attributes)
n > 3: multidimensional, multivariate data
m > 1: multiparameter data
Measurement Data
Continuous Data
• Only certain values are possible (there are gaps between the possible values).
• Scalar (single integer or real numerical value)
• Scalar array: indexed set of scalar values
– 1D (linear) array: samples of y = f(x)
– 2D array: image or samples of z = f(x,y)
– 3D array: volume data (3D image) or samples of λ = f(x,y,z)
– 4D array: time-dependent volume data or samples of λ = f(x,y,z,t)
– nD array: hypervolume data
Measurement Data
Discrete Data
Discrete data - Gaps between possible values
[Figure: number line with ticks 0 to 7]
Continuous data - Theoretically, no gaps between possible values
[Figure: number line from 0 to 1000]
Measurement Data
Examples
• Discrete:
– Number of crimes reported to police
– Number of times the word is used in the text
– Generally, discrete data are counts.
• Continuous:
– Sugar level
– Age
– Time to complete a project
– Generally, continuous data come from measurements.
Measurement Data
Topology/structure data
• Sequential (text)
• Network (hypertext, molecules, Web)
• Hierarchical (catalogs)
• Relational (databases)
Measurement Data
infocult.typepad.com/infocult/weblogs/
• Computers are machines that can be programmed to accept data and transform it into useful information
– Computers can do simple operations incredibly fast if you tell them every step to do
Repetition is the mother of education
Nature of a Computer System
• Basic parts of a Computer System are
simple in concept:
– A way (or ways) to get information IN
– A way of storing that information, even temporarily
– A way (or ways) of manipulating that information
– A way (or ways) to get resultant information OUT
! This is the classic definition of a “System”
A computer consists of only 2 main things:
• Storage for data (memory)
• Processor(s) for manipulating data
Types of storage:
• Main memory
• External memory: disks, tapes, diskettes, etc.
“Peripheral” input-output devices:
monitor, mouse, keyboard, speakers,
scanners, printers, other drives, etc.
Nature of a Computer System
Difference between hardware and software:
Hardware: The physical elements of a
computing system (circuits, wires, disks)
Software is divided into two general
categories: data and programs.
Nature of a Computer System
Communication
Application
Operating System
System Programming
Hardware
Data/ Information
Layers of a Computing System
Nature of Software
• What is a program?
A set of instructions and data, expressed using
a programming language, designed to accomplish
a specific task.
• A program is constructed similarly to a Computer System. It needs:
– A way (or ways) to get information IN
– A way of storing that information, even temporarily
– A way (or ways) of manipulating that information
– A way (or ways) to get resultant information OUT
Nature of a Programming Language
• What is a programming language?
- A set of symbols and a grammar for using those
symbols in order to specify the instructions and data.
- Symbols are written using an alphabet (mainly in English)
- The evolution of programming languages is the inverse of that of human
languages, but they have the same purpose.
• What is that purpose?
• Basic features of a programming language:
– Ways to represent information - data types
– Ways to act on information - operations
Elements of a C Program
Data Types in C
PRIMARY
1. INT
2. FLOAT
3. CHAR
SECONDARY
1. ARRAY
2. STRUCTURE
3. UNION
4. POINTERS
USER DEFINED
1. ENUM
2. TYPEDEF
A C program is essentially a group of functions which act on data.
Elements of a C Program
• Data are stored in named data structures.
• Variables represent data.
Variables are named data structures:
– int age;
– double salary;
– char letter;
• We can build more complicated data structures, but
every data structure has a type and a value.
• A named data structure also has a name (also called
an identifier).
Declaration of Variables
Every variable used in the program should be declared to the compiler. The declaration does two things:
1. Tells the compiler the variable's name.
2. Specifies what type of data the variable will hold.
The general format of any declaration
datatype v1, v2, v3, ..., vn;
where v1, v2, v3 are variable names. Variables are separated by commas. A declaration statement must end with a semicolon.
Example:
int sum;
int number, salary;
double average, mean;
Elements of a C Program
Example:
struct employee {
char *name;
char *address;
float salary;
int idNumber;
};
struct employee john;
john.salary = 112.59;
A structure in C is an object that consists of named members, possibly of different types.
The members have public access.
Data structures and algorithms
• Usually developed hand-in-hand
Example: Pushing and popping from a stack
• The behavior of an algorithm depends on how
the data is structured
Example: Searching a disc vs. searching a tape
Tape: fast-forward, rewind vs. Disc: select a track
Notion of algorithm
• The word algorithm comes from the name of the 9th-century
Persian mathematician Abu Abdullah
Muhammad bin Musa al-Khwarizmi.
• The word “algorism” originally referred only to the
rules of performing arithmetic using Hindu-Arabic
numerals but evolved, via the European Latin
translation of al-Khwarizmi's name, into algorithm
by the 18th century.
• The word evolved to include all definite procedures
for solving problems or performing tasks.
• In mathematics and computer science, an algorithm is a procedure (a finite set of well-defined instructions) for accomplishing some task which, given an initial state, will terminate in a defined end-state.
• Algorithms have steps that repeat (iterate) or require decisions (such as logic or comparison).
• The concept of an algorithm originated as a means of recording procedures for solving mathematical problems such as finding the greatest common divisor of two numbers or multiplying two numbers.
Notion of algorithm
• The concept of an algorithm was formalized in 1936 through Alan Turing's Logical Computing Machine and Alonzo Church’s lambda calculus, which in turn formed the foundation of computer science.
• Most algorithms can be implemented by computer programs.
Algorithm
Turing Digital Archive
http://www.turingarchive.org/
Turing, Alan Matheson
(1912-1954)
Turing Machine
• A Turing machine
consists of a
control unit with a
read/write head
that can read and
write symbols on
an infinite tape
Turing Machine
Turing machine consists of:
• Infinite tape divided into cells, one next to the
other. Each cell contains a symbol from some
finite alphabet.
• Head that can read and write symbols on the
tape and move left and right only one step at a
time.
• A state register that stores the state of the
Turing machine. The number of different states
is always finite and there is one special start
state.
• An action table that tells the machine what
symbol to write, how to move the head and what
its new state will be, given the symbol it has just
read on the tape and the state it is currently in.
A full-scale Turing machine simulator is supplied by Stanford University's Turing's World software.
Turing Machine
The following Turing machine has an alphabet {'0', '1'} with 0 being the blank symbol. It expects a series of 1s on the tape, with the head initially
on the leftmost 1, and doubles the 1s with a 0 in between, i.e., "111"
becomes "1110111". The set of states is {s1, s2, s3, s4, s5} and the start
state is s1. The action table is as follows.
Turing Machine
Action table
Turing Machine
• Why is such a simple machine (model) of any importance?
– It is widely accepted that
anything that is intuitively
computable can be computed
by a Turing machine
– If we can find a problem for
which a Turing-machine solution
can be proven not to exist, then
the problem must be unsolvable
Abacus: An early device to record numeric values
Blaise Pascal (1645): The first digital calculator (mechanical device to add, subtract, divide and multiply)
Joseph Jacquard (1801): Jacquard’s Loom, the punched card
Charles Babbage (1823): Analytical Engine
Early History of Computing
Ada Lovelace (1843): First Programmer, the loop
Alan Turing (1930): Turing Machine, Artificial Intelligence Testing
Harvard Mark I (1944), ENIAC (1946), UNIVAC I (1950), EDVAC (1952)
Mark I was the first large-scale automatic digital computer in the USA.
Early History of Computing
Vacuum Tubes: Large, not very reliable, generated a lot of heat
Magnetic Drum: Memory device that rotated under a read/write head
Card Readers → Magnetic Tape Drives: Development of these sequential auxiliary storage devices
First Generation Hardware (1951-1959)
funsan.biomed.mcgill.ca
Transistor: Replaced vacuum tube, fast, small, durable, cheap
Magnetic Cores: Replaced magnetic drums, information available instantly
Magnetic Disks: Replaced magnetic tape, data can be accessed directly
Second Generation Hardware (1959-1965)
www.ti.com/corp/docs/press/library/index.shtml
Integrated Circuits: Replaced circuit boards, smaller, cheaper, faster, more reliable.
Transistors: Now used for memory construction
Terminal: An input/output device with a keyboard and screen
Third Generation Hardware (1965-1971)
www.ieicorp.com/micro/micro4.htm
Large-scale Integration: Great advances in chip technology
PCs, the Commercial Market, Workstations: Personal Computers were developed as new companies like Apple and Atari came into being. Workstations emerged.
Fourth Generation Hardware (1971-)
Parallel Computing: Computers rely on interconnected central processing units that increase processing speed.
Networking: With the Ethernet small computers could be connected and share resources. A file server connected PCs in the late 1980s.
ARPANET and LANs → Internet
Parallel Computing and Networking
Machine Language: Computer programs were written in binary code
Assembly Languages and translators: Programs were written in artificial programming languages and were then translated into machine language
Programmer Changes: Programmers divide into application programmers and systems programmers
First Generation Software (1951-1959)
High-Level Languages: Used English-like statements and made programming easier: Fortran, COBOL, Lisp.
High-Level Languages
Assembly Language
Machine Language
Second Generation Software (1959-1965)
Third Generation Software (1965-1971)
• Systems Software
– utility programs,
– language translators,
– and the operating system, which decides which programs to run and when.
• Separation between Users and Hardware
Computer programmers now created programs
to be used by people who did not know how to
program
Programmer / User
Applications Programmer
(uses tools)
User with No
Computer Background
Systems Programmer
(builds tools)
Domain-Specific Programs
Computing as a Tool
Application Package
System Software
High-Level Languages
Assembly Language
Machine Language
Third Generation Software (1965-1971)
Structured Programming: Pascal, C, C++
New Application Software for Users: Spreadsheets, word processors, database management systems
Fourth Generation Software (1971-1989)
Windows and Unix: The Windows and Unix operating systems dominate the computing world.
Object-Oriented Design: Based on a hierarchy of data objects (C++, Java)
World Wide Web: Allows easy global communication through the Internet
New Users: Today’s user needs no computer knowledge
Fifth Generation Software (1990- present)
graphics.stanford.edu/papers/webviz/
References
• Yashavant P. Kanetkar, Data Structures through C, BPB Pub., 1st ed., reprinted 2007.
• Amit Gupta, Data Structures through C, Galgotia Booksource P. Ltd., 2001.
• Nell Dale, C++ Plus Data Structures, 4th ed., Jones and Bartlett Pub., Sudbury, Massachusetts, 2006.
• The ACM Computing Classification System [1998 Version], valid in 2006, http://www.acm.org/class/1998/
• Turing Machines, http://plato.stanford.edu/entries/turing-machine/
• On-line encyclopedia Wikipedia, www.wikipedia.org
• NIST Dictionary of Algorithms and Data Structures, http://www.nist.gov/dads/