Quantum

WHAT IS QUANTUM AND WHAT DOES IT DO?
Stages in a Quantum run
Basic Elements In Quantum
Different Number types that can be used in Quantum
Whole numbers
Real numbers
Variables and arrays
Data variables
Integer variables
Real variables
Subscription
Expressions
Arithmetic expressions
Combining arithmetic expressions
Counting the number of codes in a column
Generating a random number
Logical expressions
Comparing data variables and data constants
Fields of data variables
Checking the arithmetic value of a field of columns
Combining logical expressions
Comparing variables and arithmetic expressions to a list
Naming lists
Speeding up large programs
How Quantum reads data
Types of record
Ordinary records
Multicard records
Multicard records with Trailer Cards
Reading data into the C array
Ordinary records
Multicard records
Ignoring card types
Processing the data
Changing the contents of a variable
Trailer Cards
Allread
firstread and lastread
Reserved variables
Describing the data structure for Multicard records
Record type
Ordinary Records
Multicard Records
Record length
Serial number location
Card type location
Required card types
Repeated card types
Highest card type number
Dealing with alphanumeric card types
Merging Data using Quantum
Merge sequence for Trailer Cards
Merging data files
Merging complete cards
Merging a field of data from an external file
Writing out data
Print files
Printing out individual records
Writing Out Parts of Records
Data files
Creating new cards
Some General Instances for forcecoding cleaning etc.
Writing to a report file
Assignment statements
Copying codes
Assignment with and, or and xor
Adding codes into a column
Deleting codes from a column
Forcing single-coded answers
Setting a random code in a column
Reading numeric codes into an array
Clearing variables
Flow control
Statements of condition
Examining records
Holecounts
Frequency distributions
require
Column and code validation
Comments with require
Checking codes in columns
Exclusive codes
Automatic error correction
Validating logical expressions
Testing the equivalence of logical expressions
Actions when a require statement fails
Data correction
Forced editing (forced cleaning)
Introduction to the tabulation
The hierarchy of the tabulation section
Components of a tabulation program
Run control statements
Defining run conditions
Table control statements
Creating a table
Commonly used options in tab section
Axis control statements
factors
Miscellaneous ‘n’ statements
More commands to generate counts
The col statement
The val statement
The fld statement
Weighting in Quantum
Weighting methods
Types of weighting
Descriptive statistics
Quanvert
Structure of Quantum Spec

WHAT IS QUANTUM AND WHAT DOES IT DO?

Quantum is a highly sophisticated and very flexible computer language designed to simplify the

process of obtaining useful information from a set of questionnaires. Through programming, it converts the

technical information collected by questionnaires into managerial information.

Quantum performs a variety of tasks. It can:

► check and validate the data

► edit and correct the data

► produce different types of lists and reports of data

► produce new data files

► recode data and produce new variables

► generate tables

► perform statistical calculations

Stages in a Quantum run:

A. First, the data is read onto a disk. Data on disk can come from a number of different

sources, for example:

o It may be entered directly via a terminal by a telephone interviewer using

Quancept CATI.

o It may be collected over the World Wide Web using software such as Quancept

Web.

o It may be entered directly into a computer by an interviewer conducting a

personal interview using Quancept CAPI.

o It may be entered by a data entry clerk using a data entry package.

B. Next, the tasks to be performed are defined using the Quantum language.

C. Then, Quantum translates these tasks into instructions that the computer can

understand.

D. Finally, the computer itself uses this program to run your job.

Quantum comprises two sections – an edit section and a tabulation section. The edit section checks and

validates the data, generates lists and reports, corrects data, produces new data files, and

recodes data and creates new variables. The tabulation section produces tables and performs

statistical calculations.

Quantum reads the records in the data file one at a time and passes them through the various

parts of the Quantum program. As long as there are records remaining in the data file, the loop of ‘read a record - edit - tabulate’ is repeated; once the last record has been processed, the tables are ready for printing.
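The read, edit, tabulate cycle can be pictured with a short Python sketch. This is only a model of the control flow, not Quantum code: the functions edit, tabulate and run, and the sample records, are hypothetical stand-ins.

```python
from collections import Counter

def edit(record: str) -> str:
    """Hypothetical edit step: here it only pads the record to 5 columns."""
    return record.ljust(5)

def tabulate(record: str, table: Counter) -> None:
    """Hypothetical tabulation step: count the code found in column 1."""
    table[record[0]] += 1

def run(records):
    """Pass each record through edit and then tabulate, one at a time."""
    table = Counter()
    for record in records:      # the loop repeats while records remain
        cleaned = edit(record)
        tabulate(cleaned, table)
    return table                # the table is complete after the last record

table = run(["1", "3", "3"])
```

The table is only returned once every record has been through both steps, which mirrors the point that tables are ready for printing after the last record has been processed.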

Basic Elements In Quantum

There are three basic elements in Quantum:

o Data constants

o Integer numbers

o Real numbers

These are stored in variables:

o Data variables store data constants

o Integer variables store whole numbers

o Real variables store real numbers

Individual constants

An individual constant is one or more of the codes 1234567890–& or blank. The – is sometimes

referred to as the 11 or X punch, and & is sometimes called the 12, V or Y punch. Each code

represents one answer to a question. For example, let’s take the question ‘What is your favorite

color?’ which has the response list:

Red 1

Yellow 2

Blue 3

Green 4

Black 5

White 6

These codes are coded into one column. If my favorite color is green, this will appear in the data

file as a 4 in the appropriate column, just as if your favorite color is red, there will be a 1 in that

column. To refer to these answers inside your Quantum program (maybe we only want our table

to include those respondents whose favorite color is blue), type in the code enclosed in single

quotes: ’3’. You will also have to tell Quantum which column to look in. Several codes may be

combined in the same column; these are called multicodes. Multicoding means having two or

more codes in the same column. Suppose the next question asks me to choose three colors from

the same list; I pick yellow, black and white. If these answers were all coded in the same column

(a multicoded column), they would be referred to as:

’256’ or ’526’ or ’652’

or any other variation of those three codes. Quantum does not care what order the codes are entered in.

If you have a series of consecutive codes in the order &–01234567890–& you may either type

each code separately or you may enter the first and last codes separated by a slash (/) meaning

‘through’, as shown below:

’1/7’ means ’1234567’

’&/4’ means ’&–01234’

’&/9’ means ’&–0123456789’ (all 12 codes)

’1/&’ means ’1234567890–&’ (all 12 codes)

As you can see, the last two examples mean exactly the same thing. However, the notations ’0/&’

and ’0–&’ are not the same: ’0/&’ means ’01234567890–&’ whereas ’0–&’ is ’0’, ’–’ and ’&’ only.
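The slash notation can be modeled in a few lines of Python. This is a sketch of the rule as described in the text, not Quantum code: the function name expand_codes is hypothetical, an ASCII hyphen stands in for the minus code, and only a single ‘first/last’ pair is handled.

```python
# Order in which the codes are listed in the text
# (an ASCII '-' stands in for the minus code).
ORDER = "&-01234567890-&"

def expand_codes(spec: str) -> str:
    """Expand 'first/last' into the run of codes from first through last.

    A spec without a slash is returned unchanged.
    """
    if "/" not in spec:
        return spec
    first, last = spec.split("/")
    start = ORDER.index(first)         # earliest position of the first code
    end = ORDER.index(last, start)     # next position of the last code
    return ORDER[start:end + 1]

print(expand_codes("1/7"))   # 1234567
print(expand_codes("&/4"))   # &-01234
```

Searching for the last code only from the position of the first code onward is what makes ’1/&’ run forward to the final & rather than stopping at the leading one.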

Some combinations of codes represent ASCII characters; that is, they represent characters which

you can type on your screen:

’&1’ is the equivalent of ’A’

’&2’ is the equivalent of ’B’

The only time you would use letters rather than codes (i.e., ’A’ rather than ’&1’) is when the

questionnaire tells you that a column should contain a letter.

Sometimes we may need to write a notation for ‘no codes’ – for instance, if my favorite color does

not appear in the list of choices. To do this, we write ’ ’ (i.e., a blank enclosed in single quotes).

Strings of data constants

To refer to a string of codes in a field of columns, enclose the string between two “$” signs, e.g.:

$codes$

When data constants are single-coded or the multicodes correspond to ASCII characters (e.g. ’A’,

’B’) they may be strung together. Strings of data constants are sometimes called literals or

column fields. Strings are enclosed in dollar signs, with the component single codes losing their

single quotes. For example:

$12345$ $ABC$ $916 7&$

The first string is five columns long with 1 in the first column, 2 in the second, 3 in the third, and

so on. The third string is six columns wide with the fourth column being blank.

Instances when strings might be used are:

• When we want to refer to a questionnaire serial number

• When the answers to a question are represented by codes of more than 1 digit. For example, in

a car ownership survey the car make and model owned may be represented by a 3-digit code. To

pick up respondents owning a particular type of car you would need to check whether the relevant

columns contained the code for that car. For instance, to look for owners of Ford Escorts you

might ask Quantum to search for the string $132$ in a particular field of columns.
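Checking a field of columns against a string amounts to a simple comparison, as the Python sketch below suggests. It is a model only: the record layout (serial number in columns 1 to 4, car code in columns 5 to 7) and the function names are assumptions made for illustration.

```python
def field(record: str, start: int, end: int) -> str:
    """Return columns start..end of a record (columns are numbered from 1)."""
    return record[start - 1:end]

def field_equals(record: str, start: int, end: int, literal: str) -> bool:
    """True when the field holds exactly the given string of codes."""
    return field(record, start, end) == literal

# Hypothetical record: serial number in columns 1-4, car code in columns 5-7.
record = "0042132"
owns_escort = field_equals(record, 5, 7, "132")
```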

Different Number types that can be used in Quantum:

Quantum can deal with whole numbers (integers) in the range -2,147,483,647 to

2,147,483,647.

Real numbers are numbers containing decimal points. To be valid, they must have at least one

digit on either side of the decimal point:

0.1 and 1.0 are correct

.1 and 1. are not

Quantum deals with real numbers of any size with accuracy up to six significant figures. Numbers

with more than six significant figures have the sixth figure rounded up or down depending on the

value of the remaining figures.

96.82529 is rounded to 96.8253

189462.1 is rounded to 189462.0
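The two rules for real numbers can be tried out in Python. This is a sketch under stated assumptions, not a description of how Quantum stores numbers internally: the function names are hypothetical, and signed constants are not covered.

```python
import re

def is_valid_real(text: str) -> bool:
    """A real constant needs at least one digit on each side of the point."""
    return re.fullmatch(r"\d+\.\d+", text) is not None

def to_six_figures(value: float) -> float:
    """Round a value to six significant figures."""
    return float(f"{value:.6g}")

print(is_valid_real("0.1"), is_valid_real(".1"))   # True False
print(to_six_figures(96.82529))                    # 96.8253
print(to_six_figures(189462.1))                    # 189462.0
```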

Variables and arrays

There are three types of variables – data, integer and real – each used for storing different types

of information. You may create your own variables with names representing the type of

information stored (e.g., the variable called meals might contain a count of the number of meals

eaten during the day) or you may use the ones offered automatically by Quantum. Sometimes it

is useful for a series of variables to have the same name. Each variable may then be addressed

by its position in the group. This arrangement is known as an array.

Data variables

To define a data variable, type:

data var_name size[s]

At the start of every job, Quantum provides you with an array of 1,000 data cells called C. This

array is sometimes referred to as the C matrix. The individual cells are called C-variables. Each

C-variable stores one ‘column’ of data. Quantum reads data from your data file into this array.

Let’s say we have a very small questionnaire which uses 43 columns to store the data. Quantum

will read the data for each respondent into cells 1 to 43 of the C array, one respondent at a time.

The codes from column 1 of the data are copied into cell 1 of the C array, the codes from column

2 of the data are copied into cell 2, and so on. When Quantum has finished with that respondent’s

data it clears out the cells in the C matrix and reads the data for the next respondent, placing it in

cells 1 to 43 of the array. We can access this data by defining the columns whose contents we

wish to inspect or change.

Let’s take the questions about color that we mentioned earlier. The printed questionnaire tells us

that the respondent’s favorite color will be coded into column 15. To look at this column we would

write:

c15 or c(15)
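To picture how the C array is filled and cleared, here is a small Python model. It is a sketch rather than Quantum code: the class CArray and the two sample records are hypothetical, and each cell holds a single character, so multicodes are not modeled.

```python
BLANK = " "

class CArray:
    """Model of the C array: 1,000 one-column cells, numbered from 1."""

    def __init__(self, size: int = 1000):
        self.cells = [BLANK] * size

    def read(self, record: str) -> None:
        """Clear every cell, then copy column n of the record into cell n."""
        self.cells = [BLANK] * len(self.cells)
        for position, code in enumerate(record):
            self.cells[position] = code

    def c(self, column: int) -> str:
        """Return the contents of one column, like c15 or c(15)."""
        return self.cells[column - 1]

carray = CArray()
carray.read("0" * 14 + "4")   # hypothetical respondent with a 4 in column 15
first = carray.c(15)
carray.read("0" * 10)         # the next respondent has only 10 columns
second = carray.c(15)
```

Because read clears the array first, column 15 is blank for the second respondent instead of still holding the first respondent’s 4.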

C-variables are reset to blank before a new respondent’s data is read. Thus, you can be certain that Quantum never muddles the contents of c10 for the first respondent with those of c10 for the second respondent.

As we mentioned above, you may create your own data variables to store specific pieces of data. For instance, in a shopping survey we may want to store data about visits to Sainsburys in an array called ‘sains’ and data about visits to Safeways in an array called ‘safe’. Before we can use these arrays, we must create them. If each array is to contain 100 cells or columns of data, we would write:

data sains 100s

data safe 100s

where the s at the end of each statement causes Quantum to recognize that, for example, safe1 is the same as safe(1), just as it knows that c15 and c(15) refer to the same column of data. If you created the arrays without the s, then Quantum would not recognize safe1 as being the same as safe(1).

Integer variables

To define an integer variable, type:

int var_name size[s]

To refer to an integer variable, type:

name[cell_number]

Integer variables store whole numbers. Strings of integer variables are called integer arrays, and

each cell in the array may store any whole number from -2,147,483,647 to 2,147,483,647. At

the start of each run, Quantum provides an array of 200 integer variables called T. The first cell in

this array is the integer variable t1 which may store any value within the given range; the second

cell in the array is the integer variable called t2 which may also store any value within the given

range. To illustrate the difference between a data variable and an integer variable, let’s suppose

that our data contains the value of the respondent’s car to the nearest whole pound. If the value is

£6,000, this will take up 4 columns in the data (assuming that we are only concerned with the

digits) – that is, four data variables, the first of which will contain the 6, and the other three of

which will all contain zeroes. If we placed this same value in an integer variable, we would only

need one variable to store the whole value because each variable can store values in the range

from -2,147,483,647 to 2,147,483,647.

We have already mentioned that Quantum provides an integer array of 200 integer variables. You

may create your own arrays using statements similar to those shown above for data variables.

Suppose you have a household survey in which you have collected the value of each car that the

family owns. You want to set up an integer array in which to store each value, so you write:

int carval 10s

This creates an array called carval which contains ten separate integer variables

called carval1 to carval10. Notice that we have followed the array size with the letter s so that we

can omit the parentheses from the individual variable names. We can then copy the value of the

first car into carval1, the value of the second car into carval2, and so on. If a particular household

owns three cars valued at £6,000, £2,500 and £500, then carval1 would have a value of 6,000,

carval2 would be 2,500 and carval3 would be 500. If you create your own integer variables, it is

recommended that you name them with names that reflect their purpose in the run.

Real variables

To define a real variable, type:

real var_name size[s]

To refer to a real variable, type:

name[cell_number]

You may define real variables and arrays to store real numbers with accuracy up to six significant

figures. Values with more than six significant figures have the sixth figure rounded up or down

according to the value of the extra figures. As with integer variables, the names of real variables

should give some clue to the type of information they contain. Real arrays are created by

statements of the form:

real liters 5s

This example creates a real array called liters which has five

real variables named liters1 to liters5. It can store five real values, the first in liters1 and the fifth in

liters5. Quantum also provides a set of 100 real variables named X which you may use. As an

example, let’s say that the data contains information on how long, on average, each person in the

household spent watching television during a given week. We want to manipulate these figures

so we create an array of real variables in which to store the average viewing figures:

real tvwatch 8s

This provides room for up to eight people’s figures. If our household contains four people with

viewing averages of 20.8 hours, 15.75 hours, 9.75 hours and 10.0 hours, then tvwatch1 will have

a value of 20.8, tvwatch2 will have a value of 15.75, tvwatch3 will be 9.75 and tvwatch4 will be

10.0. The rest of the variables in the array have values of 0.0.

Reading real numbers from columns

To read real values from the C array, type:

cx(start_col, end_col)

Data from the questionnaire is read into columns for use during the run. When the data contains

real numbers you will have to tell Quantum that the dot is to be treated as a decimal point rather

than as a multicode representing a number of different answers. The way to do this is to refer to

the field as cx:

cx(15,20)

cx(131,135)

Here we have two fields containing real numbers: the first is

six columns wide including the decimal point, which means that the number itself contains five

digits, whereas the second is only five columns wide with four digits. Notice that there is no need

to tell Quantum where the decimal point is.

Subscription

As we have shown above, you may refer to specific variables in integer and real arrays and cells

or columns in data arrays by naming their position in the array.

For example:

c1 is the first column of the C array

t5 is the fifth variable in the T array

time3 is the third variable in the array called time

seg(2) is variable 2 of the seg array

Variables within an array may also be referred to using any arithmetic expression. In this case,

parentheses must be used. For example:

c(t1) the column number depends on the value of t1. If t1 has a value of 10, then the

variable is c10; if t1 is 67, the variable is c67.

c(t4,t5) the field delimiters depend on the values of t4 and t5. If t4 has a value of 12 and

t5 has a value of 19, the column field referred to is c(12,19).

t(c4) the variable number depends on the value in c4. If c4 contains a single code in

the range 1 to 9, the integer variable will be one of t1 to t9 depending on the

exact value in c4. If c4 is multicoded, then the result is nonsense.

time(c4*23) the variable number is the result of multiplying the value in c4 by 23. As in the

previous example, c4 must be single-coded in the range 1 to 9 for this example

to make sense. Thus, if c4 contains just a 4, the value of the expression is 92 so

the variable referred to is time92.

When variables are referenced in this way, the value of the expression must be positive. The

expression c(t1-5) is acceptable as long as t1 is greater than 5. If the expression has a zero or

negative value Quantum will issue an array dimension error when it comes to read the data

during the datapass. Also, if the variable refers to columns, the value of the subscript must not

exceed 32,767. Variables referenced in this way are called subscripted variables, and they greatly increase the flexibility with which you can write your edit.
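Subscripting by an expression can be modeled in Python as a 1-based lookup that rejects zero or negative subscripts. This is a sketch only: the function cell and the two sample arrays are hypothetical, and the IndexError merely stands in for the array dimension error mentioned above.

```python
def cell(array: list, subscript: int):
    """Return the variable at a 1-based subscript, as in c(t1) or t(c4).

    A zero or negative subscript is rejected, mirroring the array
    dimension error described in the text.
    """
    if subscript <= 0:
        raise IndexError(f"array dimension error: subscript {subscript}")
    return array[subscript - 1]

# Hypothetical arrays: each cell simply holds its own name.
c = [f"c{n}" for n in range(1, 101)]
time = [f"time{n}" for n in range(1, 101)]

t1 = 10
c4 = 4

print(cell(c, t1))          # c10
print(cell(time, c4 * 23))  # time92
```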

Expressions

Quantum recognizes two types of expression – arithmetic and logical. Arithmetic expressions are used to produce numeric values and logical expressions, when evaluated, produce a value of true or false.

Arithmetic expressions

The simplest form of arithmetic expression is a single positive or negative number such as 10 or

26.5 or an integer or real variable. Although the C Array is data, columns may also be used in

arithmetic when the response coded into those columns is a numeric response, such as a

respondent’s age or the number of different shops he visited. For example, if columns 243 to 247

contain the codes 4,7,2,6 and 0 respectively the value in c(243,247) could be read as 47,260.

Similarly, if columns 45 to 48 contain 7, 8, a dot and 2 respectively, the value in cx(45,48) would

be 78.2. Blank columns in a field are ignored when the codes in those columns are evaluated.

Thus, if columns 20 to 21 contain the codes 6 and 7 respectively, and column 22 is blank, the

codes in c(20,22) will be evaluated as 67. A similar result is produced if the blank column appears

anywhere else in the field. All the examples of c(20,22) below produce an arithmetic value of 67:

Column:  20  21  22

          6   7

          6       7

              6   7

The same applies to multicoded columns. If you use a multicoded column as part of an arithmetic

expression, the multicoded column will be ignored. The exception to this is a multicode of a digit

and a minus sign which creates a negative number: a minus sign anywhere in a numeric field

negates the value in the field as a whole, not just the number it is multicoded with.

For example:

2---+----3----+----4

  12-4                 is -1234

    3

(the third column of this field is multicoded with 3 and –)

4---+----5----+----6

  83-                  is -83
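The rules for reading a field of columns as a number can be collected into one small Python function. This is a model of the behavior described above, not Quantum code: the name field_value is hypothetical, each column is given as a string of the codes it contains, and returning 0 for a field with no usable digits is an assumption.

```python
def field_value(columns):
    """Arithmetic value of a field given as one string of codes per column.

    Blank columns are skipped, a minus sign anywhere negates the whole
    value, and any other multicoded column is ignored.
    """
    digits = []
    negative = False
    for codes in columns:
        codes = codes.strip()
        if "-" in codes:
            negative = True
            codes = codes.replace("-", "")
        if len(codes) == 1 and codes.isdigit():
            digits.append(codes)
    value = int("".join(digits)) if digits else 0   # 0 is an assumption
    return -value if negative else value

print(field_value(["6", " ", "7"]))         # 67
print(field_value(["1", "2", "3-", "4"]))   # -1234
print(field_value(["8", "3", "-"]))         # -83
```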

Combining arithmetic expressions

To combine arithmetic expressions, type:

variable operator variable [operator variable ... ]

where variable is a numeric value or the name of a variable containing a numeric value,

and operator is one of the arithmetic operators + (add), - (subtract), * (multiply) or / (divide).

More often than not you will want to combine numeric expressions to form a larger expression, for

instance to count the number of records read with a given code in a named column. Arithmetic

expressions are linked with any of the arithmetic operators listed above. Expressions may contain

more than one of these operators, for instance:

t5 + c(134,136) / tot

c(150,152) * 10 + 2.5

Quantum evaluates such expressions in the following order:

1. Expressions in parentheses.

2. Multiplication and division

3. Addition and subtraction

If you wish to change this order you should enclose the expressions which go together in

parentheses. The first expression in the example above will be evaluated by dividing the value in

columns 134 to 136 by tot and adding the result to t5. If you change the expression to:

(t5 + c(134,136)) / tot

this adds the values of t5 and c(134,136) first and then divides that by tot. Let’s substitute

numbers and compare the results. If t5=10, tot=5 and the value in c(134,136) is 125 the two

versions of the expression would read as follows:

10 + 125 / 5 = 35 and (10 + 125) / 5 = 27

Where two integer expressions are combined, the result is integer (any decimal places are

ignored), but if an expression contains a real then the result will be real. Therefore, if t1=5 and

t2=3, then:

t1 + 4 = 9

t1 + 4.0 = 9.0

t1 * t2 = 15

t1 / t2 = 1

t1 * 1.0 = 5.0

t1 * 1.0 / t2 = 1.66667

If you use parentheses in expressions which contain both integer and real variables, you need to

take extra care to ensure that your expression is producing the correct results. Let’s look at an

example to illustrate how an expression can look correct but can still produce unexpected results.

If we assume that t40=2 and t41=70, the expression:

t40 * 100.0 / t41

yields a result of approximately 2.857 (i.e., 200.0/70). The final value will be 2.857 if the result is saved in a real

variable, or 2 if it is saved in an integer variable. If we use parentheses:

(t40 / t41) * 100.0

the result is 0.0 (or 0 if saved in an integer variable). The reason for this is as follows. Because

Quantum evaluates expressions in parentheses before it deals with the rest of the expression, it

treats that expression as integer arithmetic. The rules for integer arithmetic dictate that real

results are truncated at the decimal point, so the true result of about 0.029 becomes 0. Any multiplication

involving zero is always zero, so the final result is zero. If you find that a run gives unexpected

zero results, try looking for expressions of this type and checking whether the parenthesized part

of the expression has been truncated because the integer division results in a decimal number.
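The difference between integer and real division is easy to reproduce in Python. The sketch below is a model, not Quantum code: the helper q_div is hypothetical and simply applies the rule that two integer operands give a truncated integer result while any real operand gives a real result.

```python
def q_div(a, b):
    """Divide as the text describes: integer operands give a truncated
    integer result, and any real operand gives a real result."""
    if isinstance(a, int) and isinstance(b, int):
        return int(a / b)      # drop the decimal places
    return a / b

t1, t2 = 5, 3
t40, t41 = 2, 70

safe = q_div(t40 * 100.0, t41)    # real arithmetic throughout
risky = q_div(t40, t41) * 100.0   # the integer division happens first

print(q_div(t1, t2))   # 1
print(risky)           # 0.0
```

Multiplying by 100.0 before dividing keeps the whole calculation in real arithmetic, which is why the unparenthesized form gives the expected answer.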

Counting the number of codes in a column

To count the number of codes in a column or list of columns, type:

numb(cn1[’codes’], cn2[’codes’], ... )

If any columns are followed by a code reference, only those codes will be counted for those

columns.

The function numb is an arithmetic expression which counts the number of codes in a column or

list of columns. Its format is:

numb(cn1,cn2, ... cnn)

where cn1 to cnn are the columns whose codes are to be counted. So, if we wanted to count the

number of codes in columns 132 to 135 we would type:

numb(c132,c133,c134,c135)

Notice that even though the columns are consecutive, each one is entered separately, with each

column number preceded by a ‘c’. It is incorrect to define only the start and end columns of a field

when using numb. Therefore it is wrong to write numb(c(132,135)) and, if

you write a statement such as this, Quantum will flag it as an error. Sometimes you will only be

interested in certain codes, for instance you may want to know how many 1, 2 or 3 codes there

are in a group of columns. In this case the function is entered as:

numb(cn1’p1’,cn2’p2’, ... cnn’pn’)

where p1 to pn are the codes to be counted. Only the named codes are counted – any others

appearing in the columns are ignored. Let’s say our data on card 1 is as follows:

Column 115   Column 121   Column 157

    1            2            1

    6            /            /

    8            6            7

    8

and we want to count the number of codes in column 115 and also the number of codes in the

range ’5/8’ in columns 121 and 157. The expression would be entered as:

numb(c115,c121’5/8’,c157’5/8’)

When Quantum checks these columns and codes, it will tell us that there are 9 codes in these columns which are within the given ranges. These codes are all four codes in column 115 (we did not specify which codes to count in that column), codes 5 and 6 in column 121 (codes 2 to 4 are outside the given range), and codes 5 to 7 in column 157 (codes 1 to 4 are outside the given range).
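The count of 9 can be checked with a short Python model of numb. This is a sketch, not Quantum code: the Python function takes each column as a string of the codes present, the range ’5/8’ is written out as 5678, and the four codes placed in column 115 are assumed for illustration, since only their number matters.

```python
def numb(*columns):
    """Count codes over several columns.

    Each argument is either a string of the codes present in a column, or
    a (codes_present, codes_to_count) pair that restricts the count.
    """
    total = 0
    for column in columns:
        if isinstance(column, tuple):
            present, wanted = column
            total += sum(1 for code in present if code in wanted)
        else:
            total += len(column)
    return total

c115 = "1689"       # four codes; the exact codes are assumed for illustration
c121 = "23456"      # codes 2 through 6
c157 = "1234567"    # codes 1 through 7
count = numb(c115, (c121, "5678"), (c157, "5678"))
print(count)        # 9
```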

Generating a random number

To generate a random number in the range 1 to n, type:

random(n)

Quantum can generate random numbers automatically with the random function:

random(n)

where n is the maximum value the random number may take. So, to generate a random

number in the range 1 to 100, the expression would read:

random(100)

The number produced may be saved for later use in an integer variable or column, thus:

rnum=random(32)

c(110,112)=random(156)

When using random with columns, always make sure that the number of columns allocated to the number is sufficient to store the highest possible number that can be generated. In our example, we need three columns in order to store numbers up to 156.
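The column-width check is simple arithmetic, as the Python sketch below shows. It is a model only: q_random uses Python’s own generator, which says nothing about how Quantum produces its numbers, and both function names are hypothetical.

```python
import random

def q_random(n: int) -> int:
    """Return a whole number from 1 to n inclusive."""
    return random.randint(1, n)

def columns_needed(n: int) -> int:
    """Columns required to hold the largest value that random(n) can give."""
    return len(str(n))

rnum = q_random(32)
width = columns_needed(156)
print(width)   # 3
```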

Logical expressions

Logical expressions are used for comparing values, codes and variables.

Comparing values

To compare the values of two arithmetic expressions, type:

<<arith_exp>> log_operator <<arith_exp>>

where log_operator is one of the operators .eq., .gt., .ge., .lt., .le. or .ne.

Values are compared when you need to check whether an expression has a given value – for

example, did the respondent buy more than 10 pints of milk?

Values are compared by placing arithmetic expressions on either side of one of the following operators:

Operator   Meaning

.eq.       equal to
.gt.       greater than
.ge.       greater than or equal to
.lt.       less than
.le.       less than or equal to
.ne.       not equal to

If the number of pints of milk that the respondent bought is stored in columns 114 and 115, the

expression to check whether he bought more than ten pints would be:

c(114,115) .gt. 10

If the number in these columns is greater than ten the expression is true, otherwise it is false.

Earlier we said that integer variables may take numeric values or the logical values true and

false depending upon whether or not the value is zero. To check whether the respondent bought

any packets of frozen vegetables, we can either write:

fveg .gt. 0

to check the numeric value of the variable fveg, or we can simply say:

fveg

to check whether the logical value of fveg is true.

To check whether fveg is false (i.e. zero), we would write:

.not. fveg

Comparing data variables and data constants

In virtually every Quantum run you will want to check which codes occur in which columns. This is

easily done using logical expressions. There are several forms of expression depending on

whether you are checking a column or a field of columns.

Data variables

To test whether a data variable contains at least one of a list of codes, type:

var_name’codes’

To test whether a data variable contains none of the listed codes, type:

var_namen’codes’

To test whether a data variable contains exactly the given codes and nothing else, type:

var_name = ’codes’

To test whether two data variables contain identical codes, type:

var_name1 = var_name2

To test whether a data variable contains codes other than those listed, type:

var_nameu’codes’

To test whether two data variables do not contain identical codes, type:

var_name1uvar_name2

To check whether a column or data variable contains certain codes, place the codes, enclosed in

single quotes, immediately after the name of the column or data variable:

e.g. c1’1’ c156’23’ brand’5’

The expression cn’p’ checks whether a column (n) contains a certain code or codes (p). The
expression is true as long as column n contains at least one of the given codes. It does not matter
if there are other codes present since these are ignored. For example, to check whether column
6 contains any of the codes 1 through 4 we would type:

c6’1/4’

The expression is true if c6 contains any of the codes 1, 2, 3 or 4 or any combination of those
codes, regardless of what other codes may also be present. For instance:

----+----1   ----+----1   ----+----1
     1            1            1
     6            2            3
     8            3            0
     -            4
                  &

are true, but:

----+----1
     5
     7
     9
     -

is false.

In our original example we chose the codes 1 through 4. You can, of course, use any codes you

like and they may be entered in any order.

The opposite of cn’p’ is:

cnn’p’

which checks that a column does not contain the given code or codes. The expression is true as

long as the column does not contain any of the listed codes. For example: c478n’5/7&’ is true as

long as column 478 does not contain a 5, 6, 7 or & or any combination of them.

A multicode of ’189’ returns the logical value true, because it does not contain any of the codes

’5/7&’ whereas a multicode of ’1589’ makes the expression false because it contains a ’5’.

The ’=’ operator is used to check that the contents of a column are identical to the given codes.

The expression: c312=’1/46’ is true as long as c312 contains all of the codes 1 through 4 and 6,

and nothing else. The expression: c142=’ ’ checks that column 142 is blank. The equals sign is

optional when checking for blanks, so we could simply write:

c142’ ’

to check whether column 142 is blank. The ’=’ operator may also be used to compare the

contents of two data variables. For example: c56=c79 checks whether c56 contains exactly the

same codes as c79. If so, the expression is true, otherwise it is false. If we have

+----6----+ ... +----8----

1 1

5 5

the expression is true, but:

+----6----+ ... +----8----

1 1

5 5

9

yields the value false because column 79 contains a ’9’ when column 56 does not. If you have

defined your own data variables, you could write a statement of the form: brand1=c79 to check

whether the data variable called brand1 contains the same codes as c79.

The opposite of ’=’ is ’u’ (unequal):

cnu’p’

This checks whether column n contains something other than just the code ’p’. Suppose we have

two sets of data:

----+-----5 ----+-----5

1 1

4 5

7 9

and we write:

c44u’7’

The expression is true for both sets of data. In the first example, the ’7’ is multicoded with a ’1’

and a ’4’, while in the second example, column 44 does not contain a ’7’ at all. The only time this

expression is false is when column 44 contains a ’7’ and nothing else.

Fields of data variables

To test whether a field contains a given list of codes, type:

var_name(start, end) = $codes$

To test whether two fields contain identical strings, type:

var_name1(start1, end1) = var_name2(start2, end2)

To test whether the codes in one field differ from a given string, type:

var_name(start, end)u$codes$

To test whether the codes in one field differ from those in another, type:

var_name1(start1, end1)uvar_name2(start2, end2)

The contents of data fields must be enclosed in dollar signs with each code in the string referring

to a separate column in the field. For instance, to check whether columns 47 to 50 contain the

codes -, 6, 4 and 9 respectively we would type:

c(47,50)=$-649$

The only data for which this expression is true is:

+----5-----+

-649

However, if our data read:

+----5-----+

-529

164&

the expression would be false because all columns are multicoded. All our examples have used

columns, but the same rules apply to data variables that you define yourself.

For example:

rating(1,4)=$1234$

checks whether the field rating1 to rating4 contains the codes 1, 2, 3 and 4 in that order. That is, it
checks whether rating1 contains a 1, whether rating2 contains a 2, and so on. When checking the
contents of fields in this way, make sure that you enter as many columns as there are codes in
the string (i.e. five codes require five columns). The exception to this rule occurs when you are
checking for blanks, when the expression may be shortened to:

c(50,80)=$ $

This type of statement may also be used to compare two fields, to check whether the second field

contains exactly the same codes as the first field. When you compare one field with another,

Quantum takes each column in the first field in turn and looks to see whether the corresponding

column in the second field contains exactly the same codes. For example, if the first column of

the first field contains a code 1 and a code 2 and nothing else, then Quantum will check whether

the first column of the second field also contains a code 1 and a code 2 and nothing else. If all

columns of the second field are identical to their counterparts in the first field, then the expression

is true; otherwise it is false. Here is an example:

c(129,132)=c(356,359)

For this expression to be true, column 129 must contain exactly the same codes as column 356,

column 130 must be exactly the same as column 357, and so on. Once again, the two expressions
on either side of the equals sign must be the same length.

Comparisons of one data variable against another are concerned with columns and codes: they are
not concerned with the arithmetic values of the codes in the fields as a whole.

If we have:

----+----3----+----
   02         2

the expression:

c(24,25)=c(34,35)

is false because the string $02$ is not the same as the string $ 2$. If you want to compare fields
arithmetically (i.e., is 02 the same as 2) then you will need to use the .eq. operator:

c(24,25).eq.c(34,35)

to test whether the value in c(34,35) is equal to the value in c(24,25). The .eq. operator is
described in the section entitled "Comparing values".
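The difference between comparing codes and comparing values can be seen in a few lines. The following Python sketch is an illustration only, not Quantum code; the function names codes_equal and values_equal are hypothetical, and each field is modelled as a plain string with one character per single-coded column.

```python
def codes_equal(field_a, field_b):
    """Column-by-column comparison, as with the = operator."""
    return field_a == field_b

def values_equal(field_a, field_b):
    """Arithmetic comparison, as with the .eq. operator."""
    return int(field_a) == int(field_b)

left = "02"    # contents of c(24,25)
right = " 2"   # contents of c(34,35): a blank followed by a 2

same_codes = codes_equal(left, right)     # the strings differ
same_values = values_equal(left, right)   # both fields have the value 2
```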

To check whether the codes in one field match a given string or the codes in another field, we can
use the = (equals) operator:

c(m,n)=$codes$
cm=cn
c(m,n)=c(m1,n1)

If the codes in the field c(m,n) match the given string or the codes in c(m1,n1) then the expression
is true. If the two fields are not identical, then the expression is false.

Let’s look at an example of the unequals operator. The statement:

c(67,69)u$123$

is true at all times unless columns 67, 68 and 69 contain the codes 1, 2 and 3 respectively and nothing else.

The expression:

c(67,69)uc(77,79)

is true as long as columns 67 to 69 differ by at least one code from columns 77 to 79. If our data

is:

+----7----+----8

123 256

the expression is true because each of columns 77 to 79 differs from columns 67 to 69. Also, if we

have:

+----7----+----8

123 123

5

the expression is true because column 77 is multicoded ’15’. The only time the expression is false

is when columns 67 to 69 are identical to columns 77 to 79.

Checking the arithmetic value of a field of columns

To test whether a value in a field is within a specified range, type:

range(start, end, minimum, maximum)

Blanks at the start of the field cause this statement to give a false result. To ignore leading

blanks, type:

rangeb(start, end, minimum, maximum)

The logical expression range checks whether the number in a field of columns is within a given

range. If so, the expression is true, otherwise it is false. The format of this statement is:

range(start,end,min,max)

where start and end are column numbers and min and max are the range delimiters.

For example, the statement:

range(137,139,100,150)

will return the value true if the number in columns 37 to 39 of card 1 is in the range 100 to 150.

A variation of range is rangeb which allows columns to the left of the field to be blank if the

number is right-justified in the field. In all other respects it is exactly the same as range. If our

data is:

----+----2
   123 6

the expression: rangeb(17,18,1,10) will be true because the string $ 6$ will be read as 6. With

range the value would be false.

However, the expression:

rangeb(15,18,2000,3000)

returns false because of the blank in c17.
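The behaviour of range and rangeb described above can be sketched as follows. This is a simplified Python model, not Quantum code: the names field_text, range_check and rangeb_check are hypothetical, the record is modelled as a dictionary of single-coded columns, and the model assumes that any character other than a digit (apart from leading blanks under rangeb) makes the test false.

```python
def field_text(record, start, end):
    """Join the characters in columns start..end; missing columns are blank."""
    return "".join(record.get(col, " ") for col in range(start, end + 1))

def is_number(text):
    """True if text is a non-empty string of the digits 0 to 9."""
    return text != "" and all(ch in "0123456789" for ch in text)

def range_check(record, start, end, minimum, maximum):
    """Model of range: every column in the field must hold a digit."""
    text = field_text(record, start, end)
    return is_number(text) and minimum <= int(text) <= maximum

def rangeb_check(record, start, end, minimum, maximum):
    """Model of rangeb: blanks to the left of the number are ignored."""
    text = field_text(record, start, end).lstrip(" ")
    return is_number(text) and minimum <= int(text) <= maximum

# The data from the example: 123 in columns 14 to 16, blank in 17, 6 in 18.
record = {14: "1", 15: "2", 16: "3", 18: "6"}
```

In this model rangeb_check(record, 17, 18, 1, 10) is true and range_check(record, 17, 18, 1, 10) is false, while rangeb_check(record, 15, 18, 2000, 3000) is false because the blank in column 17 is not a leading blank.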

Combining logical expressions

To combine logical expressions, type:

expression operator expression

where operator is one of .or., .and., or .xor.

Two or more logical expressions may be combined into a single expression using the operators:

.and.   both/all true
.or.    one or the other or both/all true
.not.   negates (reverses) an expression

Any number of subexpressions may be combined to form a larger expression, but whether the

result is true or false depends upon the values of the subexpressions and also upon the operators

used to combine them.

The .and. operator requires that all the expressions preceding and following the .and. be true for

the whole expression to be true. Thus, the statement:

int1.eq.9 .and. c116’1’

is true if the integer variable int1 has a value of 9 and column 116 contains a 1. If either

subexpression is false, the whole expression is false too.

By comparison, the .or. operator

requires that one expression or the other, or both, be true in order for the whole expression to be

true.

c(249,251)=$159$ .or. numb(c132,c135) .gt. 4

For this expression to be true, columns 249 to 251 must contain nothing but a ’1’, ’5’ and ’9’

respectively or the number of codes in columns 132 and 135 must be greater than 4. It is also true

if both expressions are true. However, if both are false, the overall result is false.

Expressions are reversed (negated) simply by preceding them with the keyword .not. Although it

is not wrong to use it with a single variable, it is more generally used to reverse an expression

containing the keywords .and. and .or.. Thus, it is not wrong to write .not.c15’1/5’ but it is much

simpler to write this as c15n’1/5’.

Take care when using .not. with the .eq. operator. Statements of the form:

.not. c(1,3) .eq. 100

are incorrect and will not work. They should be written as either:

(.not.(c(1,3).eq.100))

with the expression to be reversed enclosed in parentheses, or:

(c(1,3).ne.100)

Any of the operators .and., .or. and .not. may appear in a statement more than once, as long as

you use parentheses to define the order of evaluation. For example:

(c15’1/47’ .or. c16’3579’) .and. c22’&’

causes Quantum to check whether the .or. condition is true before dealing with the .and. condition. Suppose

our data is:

----+----2----+
    13     &
    79

The first expression (c15’1/47’) is true because column 15 contains a 1 and a 7 and the second

expression (c16’3579’) is also true since the codes it contains are amongst those listed as

acceptable. Thus, the .or. condition is true. Column 22 contains an ampersand so the last

expression is also true, therefore the expression as a whole is true. If both expressions
in the parentheses were false, the whole expression would be false.

.not. with .and. and .or.

When you use .not. with expressions in parentheses, be very careful that what you write is what

you mean. Let’s take the conditions male and married and forget about columns and codes for

the minute. The condition:

(Male .and. Married)

refers only to married men. The opposite of this is:

.not. (Male .and. Married)

which refers to unmarried men and all women. This can also be written as:

.not. Male .or. .not. Married

The first .not. collects all the women, the second collects everyone who is not married (e.g.

single, widowed etc), and together they collect everyone who is female or unmarried, or both. We

use .or. instead of .and. here because the latter will gather unmarried women but will ignore the

unmarried men and married women.

Reversing .or. expressions works in exactly the same way. The expression:

(Male .or. Married)

means anyone who is Male, or anyone who is Married, or anyone who is Male and Married. The

opposite of this is:

.not. (Male .or. Married)

which means anyone who is neither Male nor Married; that is, anyone who

is a woman and is unmarried. This can be written as:

.not. Male .and. .not. Married

Thus, we can summarize, as follows:

Positive       Negative              Is the same as

(A .and. B)    .not. (A .and. B)     .not. A .or. .not. B
(A .or. B)     .not. (A .or. B)      .not. A .and. .not. B
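The two equivalences in the summary can be confirmed by trying every combination of true and false for A and B. The following Python sketch is an illustration only, not Quantum code; Python's and, or and not stand in for .and., .or. and .not., and the function name check_equivalences is hypothetical.

```python
from itertools import product

def check_equivalences():
    """Return True if both rows of the summary hold for all values of A and B."""
    for a, b in product([False, True], repeat=2):
        # .not. (A .and. B)  is the same as  .not. A .or. .not. B
        if (not (a and b)) != ((not a) or (not b)):
            return False
        # .not. (A .or. B)  is the same as  .not. A .and. .not. B
        if (not (a or b)) != ((not a) and (not b)):
            return False
    return True

all_cases_agree = check_equivalences()
```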

Here is an example using columns and codes:

.not. (c(135,137)=$519$ .or. c160’6/0’)

If our data is:

3----+----4----+----5----+----6----+
     519                      1
     9                        &

the expression is true because c(135,137) do not contain just the codes 5, 1 and 9 (c135 is

multicoded), and c160 does not contain any of the codes 6 through 0. The expression will only be
false if:

a) column 135 contains a 5 only, column 136 contains a 1 only and column 137 contains a
9 only, or

b) column 160 contains any of the codes 6 through 0, either singly or as a multicode.

We could therefore write the expression as:

.not. c(135,137)=$519$ .and. .not. c160’6/0’

Comparing variables and arithmetic expressions to a list

To compare the value of a variable or an arithmetic expression to a list of numbers, type:

item .in. (value1, value2, ... )

Ranges of numbers may be entered in the list as start:end. If the item is a reference to a field

containing blanks, enter the values as strings of codes enclosed in dollar signs.

Example:

c(3,5).in.($123$,$765$,$ 26$)

c(120,122).in.(100,110,200:250)

From time to time you may need to check whether a variable or arithmetic expression has one of

a given list of values. For example, if the questionnaire codes brands of frozen vegetables as
3-digit codes into columns 145 to 147 we might want to check that only valid codes appeared in this

field. This is achieved using the logical expression .in. as follows:

variable-name .in. (list) or

arithmetic-exp .in. (list)

where variable-name is that of the variable to be checked and list is a list of permissible values.

The arithmetic expression is an expression consisting of data or integer variables, arithmetic

operators and integer values as described earlier in this chapter. If the variable or arithmetic

expression has one of the listed values, the expression is true, if not, it is false. The left-hand side

of the expression may contain integer variables, columns or data variables containing whole

numbers, or expressions using these types of variables. If it is a data variable, then the list may

contain codes enclosed in dollar signs. Quantum will then compare the codes in the data variable

with the codes inside the dollar signs. We could therefore check that the frozen vegetables have

been coded correctly by keying in a statement which says:

c(145,147) .in. ($205$,$206$,$207$,$210$,$215$,$220$)

Quantum will flag as incorrect any record in which c(145,147) does not contain exactly 205, 206,
207, 210, 215 or 220 (i.e. three single-coded columns).

If the data variable contains a valid positive or negative whole number, then the list may also

contain such values. Ranges of values may be entered in the form min:max, where min is the

lowest acceptable value and max is the highest. Since the frozen vegetables have numeric

codes, we could write the expression as:

c(145,147) .in. (205:207,210,215,220)

Any columns in the field which contain non-numeric data (e.g. multicodes) will be flagged as

incorrect, as will any which contain values which do not match the specification.

Sometimes, though, the codes and numbers will not be interchangeable. If you have 2-digit codes in a
3-column field, the statement:

c(206,208) .in. ($ 10$,$ 11$,$ 12$,$ 13$)

is not the same as:

c(206,208) .in. (10:13)

unless column 206 is always blank. If the 2-digit codes have been padded on the left with zeroes

instead of blanks (i.e., 010, 011) or if they all start in column 206 (i.e., $10 $, $11 $), then the first

expression will be false, even though the second one will still be true.

If the left-hand side of the expression is an integer variable or an arithmetic expression, the list
may contain positive or negative whole numbers:

total .in. (100,200,500:1000)

Lists may contain up to 247 values or codes, which may be entered in any order. In our examples, we
have always entered them in ascending order, but this is not a requirement of Quantum. The exception
is numeric ranges, which must be entered in the form lowest:highest.
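The matching of a value against a list that mixes single numbers and ranges can be sketched as follows. This is a Python illustration only, not Quantum code: the function name in_list is hypothetical, and a range lowest:highest is modelled as a (lowest, highest) tuple.

```python
def in_list(value, items):
    """True if value equals a listed number or lies inside a listed range."""
    for item in items:
        if isinstance(item, tuple):
            lowest, highest = item
            if lowest <= value <= highest:
                return True
        elif value == item:
            return True
    return False

# Model of the list (205:207,210,215,220) used for the frozen vegetable codes.
FVEG = [(205, 207), 210, 215, 220]

brand_ok = in_list(206, FVEG)    # inside the range 205:207
brand_bad = in_list(208, FVEG)   # not listed
```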

Naming lists

To assign a name to a list of values, type:

definelist name=(list)

where list is a comma-separated list of numbers, ranges or code strings enclosed in dollar signs.

If you have a list that is used more than once you may give it a name and refer to it by that name

instead of typing in the complete list each time. To name a list, write:

definelist name=(list)

For example:

definelist fveg=(205:207,210,215,220)

To use a defined list, simply replace the list with the name:

c(145,147) .in. fveg

Speeding up large programs

To speed up your Quantum program by converting expressions of the form c(1,4)=$1234$ into C

in a more efficient way, type:

inline n

where n is the maximum field width to be converted in this manner. This statement must appear

at the start of the edit.

If you have a large edit, you can speed up the time it takes to run by including the inline

statement in your edit. This instructs the Quantum compiler to convert expressions of the form

c(1,4)=$1234$ into statements in the C programming language in a different way to the way it

normally does. You need not worry about these different methods of conversion, apart from

deciding whether or not to use them.

If you want to speed your program up, place a statement of the form:

inline n

at the beginning of the edit section, where n is the maximum field width to be converted in the

special way. For example:

inline 6

Here we are saying that fields of six columns or less should be converted in the special way

rather than in the normal way.

How Quantum reads data

In order for the answered questionnaire to be processed, the information contained on the questionnaire must be read into the computer into a location where Quantum can access it. This is done by reading the data into the data variable array called C which is supplied automatically with every Quantum run. You may then access this data by addressing this array. Different types of records are read into the C Array in different ways.

Types of record

Quantum deals with three types of record: ordinary, multicard and multicard with trailer cards.

Ordinary records

These are strings of codes and numbers, one per respondent, up to a maximum of 32,767 characters per respondent.

Multicard records

When data originates from punched cards and each questionnaire requires more than 80

columns, the data is spread over several cards. So that all cards belonging to a particular

respondent may be easily identified, each questionnaire is assigned a serial number which is

entered as part of the data for each card. Within this, each card has a unique card type or card

number to distinguish it from others in the group. It is important that both the serial number and

card type be in the same relative positions on all cards in the file, since this is the only way that

Quantum can tell which data belongs to which respondent. If the questionnaire serial number is in

columns 1 to 4 of each card and the card type is in column 5, and we are looking at questionnaire

1005, we will see that it has two cards whose first five columns are 10051 and 10052

respectively. Quantum can deal with records that contain up to 327 cards per respondent.

Occasionally you may have multicard records in which each ‘card’ is greater than 80 columns.

The notes that follow refer to multicard records of up to 100 columns per card.

Multicard records with Trailer Cards

Sometimes a record contains very repetitive data which is tabulated over and over again in the

same way. For instance, a shopping survey may ask the respondent a series of identical

questions for each store he visited. In this case, there may be a separate card for each store.

Processing this type of data is often easier if we treat all cards containing the same questions as

if they were, in fact, one card with one card number. These cards are called Trailer Cards. Thus, if the respondent visited five stores, and the questions about these stores are coded on a card 2, the record for that respondent would contain five cards of type 2. If demographic details were stored on a card 1, the whole record would be 6 cards in all. In Quantum, the demographic data would be described as the higher level and the stores as the lower level.

Reading data into the C array

Data is read into the C Array automatically, one record at a time. The way data is read depends upon the record structure.

Ordinary records

Ordinary records are read into cell 1 onwards of the array. Therefore, for example, the 50th column is referenced as c50 and the 200th cell as c200.

Multicard records

Records are read into c101 to c200 for card 1, c201 to c300 for card 2, and so on. For example,

80-column cards are read into c101 to c180 for card 1 and c201 to c280 for card 2. Columns 181-

200, 281-300, etc remain blank. In this case, the C Array may be pictured as ten rows of 100 cells

each. Column 50 of card 1 is then accessed by referring to it as c150, and column 67 of card 8 is

referred to as c867.
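The addressing rule for multicard records is a simple calculation: the cell number is the card type multiplied by 100, plus the column number. The following Python sketch is an illustration only, not Quantum code; the function name cell_number is hypothetical and it assumes the 100 cells per card described above.

```python
def cell_number(card, column, cells_per_card=100):
    """Cell of the C array that holds the given column of the given card."""
    if not 1 <= column <= cells_per_card:
        raise ValueError("column must be between 1 and cells_per_card")
    return card * cells_per_card + column

first = cell_number(1, 50)    # column 50 of card 1
second = cell_number(8, 67)   # column 67 of card 8
```

These give 150 and 867, the cells referred to in the text as c150 and c867.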

Ignoring card types

It is also possible to read cards into the array sequentially regardless of card type: the first card

goes in c(101,200), the second in c(201,300), the third in c(301,400), and so on.

Processing the data

Each time an ordinary record or set of cards comprising a multicard record is read in, that data is

processed first by the edit section and then by the tabulation section of your program. The

complete record is edited and tabulated in one go. The exception to this is the trailer card record

where processing can take place a number of times within each record for each lower level.

To ensure that only the part of the edit section applying to a particular level is used, the edit

section is defined separately for each level. Similarly, the table instructions specify the level at

which the table should be incremented.

Changing the contents of a variable

This section describes how to assign values to variables and the statements emit, delete and

priority, all of which may be used to alter the contents of a variable. Emit, delete and priority are used only with columns whereas assignment statements can deal with character, integer and real variables. When we say that these statements change the contents of a column we mean that they change the contents of that column as it exists during the run: at no time do they change the corresponding column in the data file.

Trailer Cards

By using the Levels facility, the user need not know how Quantum deals with trailer card data

internally. However, there are occasions when it may be necessary to edit or tabulate the data

without using levels. To do this, it is necessary to know more about how trailer cards are

processed.

Quantum deals with trailer cards in a number of ‘reads’. Cards are read into the appropriate rows

of the C Array until:

a) a card is located with a card type matching that of the previous card (e.g., two

consecutive card 2’s), or

b) a card is read with a type lower than its predecessor and matching one of the card types

already read in during the current ‘read’ (e.g., a card 2, a card 3, and then another card

2).
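The two stopping rules can be sketched as a function that splits a respondent's sequence of card types into reads. This is a simplified Python model, not Quantum code: the function name split_reads is hypothetical, and the model only shows where each read ends; it does not model which cards from earlier reads remain in the C array.

```python
def split_reads(card_types):
    """Split a sequence of card types into reads using rules (a) and (b)."""
    reads = []
    current = []
    for card in card_types:
        # Rule (a): the card type matches that of the previous card.
        repeats_previous = bool(current) and card == current[-1]
        # Rule (b): the type is lower than its predecessor and has already
        # been read in during the current read.
        restarts_group = bool(current) and card < current[-1] and card in current
        if repeats_previous or restarts_group:
            reads.append(current)
            current = []
        current.append(card)
    if current:
        reads.append(current)
    return reads

store_reads = split_reads([1, 2, 2, 2])     # one card 1, three trailer cards 2
mixed_reads = split_reads([1, 2, 3, 2, 3])  # a card 2, a card 3, then another 2
```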

In order to produce useful tables, you will need to know which cards are currently in the

C Array.

Quantum has four reserved variables – thisread, allread, firstread and lastread – which it uses to

keep track of which cards it has read for each respondent.

thisread

The array called thisread is used to check which cards have been read in during the current

read. thisread1 will be true (or 1) if a card type 1 has just been read in; thisread2 will be true if a

card 2 has just been read, and so on.

There are nine such variables (thisread1 to thisread9) available unless extra card types have

been specified using the max= option. In this case, these variables will be numbered 1 to max; if

there are 13 cards, we will have thisread1 to thisread13.

allread

allread notes which cards have been read in so far for this questionnaire. If cards 1, 2 and 3

have been read so far, allread1, allread2 and allread3 will all be true. Additionally, each cell of

allread will contain the number of cards of the given type read in – for instance, if two cards of

type 3 have been read, allread3 will be true and it will contain the number 2.

As with thisread, there are nine allread variables available unless extra card types have been

specified with max=.

firstread and lastread

The variables firstread and lastread become true when the first and last cards in a record have

been read in.

Reserved variables

Other reserved variables associated with reading in data:

lastrec       set to true when the last record in the file has been read or, in the case of trailer
              cards, the last read of the last record has occurred.

rec_count     stores the number of records read in so far.

card_count    counts the number of cards read so far.

Describing the data structure for Multicard records

To describe the structure of the data, type:

struct; options

All programs dealing with multicard records must contain a struct statement unless the data

contains trailer cards which will be read and tabulated using the levels facility. In this case you

may choose between using a struct statement or using a levels file. If the run has no struct

statement and no levels file, Quantum assumes that the data contains ordinary records to be read

into c1 onwards of the C array.

The struct statement is used to define the type of records, the location of the serial number and

card type in the record and the number of the highest card type if greater than 9. Its format is:

struct;options

Record type

To define the record type, type:

struct; read=n

where n is 0 for ordinary records, 2 to read multicard records in sections according to the card

type, or 3 to read multicard records all in one go.

Quantum recognizes two types of record: single card and multicard. The type of record is defined

by the keyword read= on the struct statement:

Ordinary Records

Ordinary records are defined using read=0. Each record is read into c1 onwards of the array.

Since it is the default, you need only use it when other options are required; for example, when

the records contain serial numbers and you wish to have the serial number printed out as part of

the record, or when you are working with long records of more than 100 columns.

Multicard Records

Multicard records are identified by the keyword read=2. Each card in the record is read into the

row corresponding to the card type of that card – that is, card 1 in c(101,200), card 2 in

c(201,300), and so on. We mentioned briefly that it is possible to read all cards in a multicard

record in at once and ignore the card type. The first card goes in c(101,200), the second in

c(201,300), and so on. This is achieved with read=3.

Record length

To define the record length of records greater than 100 columns, type:

struct; reclen=n

The keyword reclen=n defines the maximum number of characters to be read into the C array, the

number of cells to be reset to blanks and the number of cells to be written out

by the write statement.

With ordinary records reclen may take any value, but with multicard records the maximum is

reclen=1000. In both cases, the default is reclen=100. When data is being read into the matrix,

any record which is longer than reclen characters is truncated to that length and a warning

message is printed.

When ordinary records are written out with write or split, cells c1 to c(reclen) are copied, with any

trailing blanks being ignored. For instance, if we have:

struct;read=0;reclen=200

and the current record is only 157 characters long, the record written out will be 157 characters

long. This length can be overridden by an option on a filedef statement. When multicard records

are written out, columns c101 to c(100+reclen), c201 to c(200+reclen), and so on will be output.

Thus, if we write:

struct;read=2;reclen=70

and we have 2 cards per record, Quantum will write out c(101,170) and c(201,270). Finally, with

ordinary records cells c1 to c(reclen) are reset to blanks between records, but with multicard

records cells c101 to c(100+reclen), c201 to c(200+reclen), and so on are reset.
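The cell ranges that reclen controls can be worked out with the formulas given in the text. The following Python sketch is a model for checking your own arithmetic, not part of Quantum; the function names are hypothetical.

```python
def ordinary_range(reclen: int) -> tuple[int, int]:
    """Cells written out and reset for an ordinary record: c1 to c(reclen)."""
    return (1, reclen)


def multicard_ranges(cards: int, reclen: int) -> list[tuple[int, int]]:
    """Cells written out and reset for each card of a multicard record.

    Follows the formula in the text: c101 to c(100+reclen) for card 1,
    c201 to c(200+reclen) for card 2, and so on.
    """
    return [(card * 100 + 1, card * 100 + reclen) for card in range(1, cards + 1)]


if __name__ == "__main__":
    # struct;read=2;reclen=70 with 2 cards per record
    print(multicard_ranges(2, 70))
```

For reclen=70 and two cards this gives the fields c(101,170) and c(201,270), as in the example.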

Serial number location

To define the location of the serial number in each record, type:

struct; ser=c(m,n)

The keyword ser=c(m,n) defines the field of columns containing the respondent serial number.

For example, if the serial number is in columns 1 to 5 of an ordinary record we would write:

struct;read=0;ser=c(1,5)

Similarly, if it is in columns 1 to 5 of a multicard record the statement would be:

struct;read=2;ser=c(1,5)

Notice that even with multicard records we only give the actual column numbers containing the

serial number, rather than card type and column number as is usually the case when identifying

columns in such records. This is because the column numbers refer to all cards in the data set

rather than to a single card in the file.

Card type location

To define the location of the card type in the record, type:

struct; crd=cn

Defining the card type location is much the same as defining the position of the serial number in

the record. The keyword is crd=cn for a single digit card type or crd=c(m,n) for a card type of

more than one digit. Once again, m and n are column numbers only, not card type and column

number.

For example:

struct;read=2;ser=c(1,4);crd=c5

tells us that we have a multicard record with serial numbers in columns 1 to 4 and the card type in

column 5 of each card. Each card will be read into the row corresponding to its card number.

Required card types

To define cards which must be present in each record, type:

struct; req=card_numbers

where card_numbers is either a comma-separated list of card numbers, or a range of sequential

card numbers in the form start:end or start/end.

Sometimes some cards will be optional and others mandatory. You may define those cards which

must appear in every record by using the keyword req= followed by the numbers of the cards

that each respondent must have.

For example:

req=1,2

tells us that cards 1 and 2 must be present in each record for that record to be accepted. Any

other cards are optional. If a record is read without one of these cards, the error message ‘Card

Missing in Set’ and a note of the record’s position in the file are printed and the record is ignored.

If you have ranges for required card types, you may type the numbers of the lowest and highest

cards separated by a slash (/) or a colon (:) rather than listing each card type separately. For

example, if cards 1 to 4 are all required, you may type:

req=1,2,3,4 or req=1/4 or req=1:4
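To make the three equivalent notations concrete, the Python sketch below expands a card list and reports which required cards a record lacks. It is an illustration under stated assumptions: the function names are hypothetical, and it handles either a comma-separated list or a single range, as described in the text, but not a mixture of the two.

```python
def parse_card_numbers(spec: str) -> list[int]:
    """Expand '1,2,3,4', '1/4' or '1:4' into a list of card numbers."""
    spec = spec.strip()
    for separator in ("/", ":"):
        if separator in spec:
            start, end = (int(part) for part in spec.split(separator))
            return list(range(start, end + 1))
    return [int(part) for part in spec.split(",")]


def missing_cards(required: list[int], present: list[int]) -> list[int]:
    """Return the required card numbers that do not occur in the record."""
    return sorted(set(required) - set(present))


if __name__ == "__main__":
    required = parse_card_numbers("1/4")
    print(required)
    print(missing_cards(required, [1, 2, 4, 6]))
```

A record for which missing_cards returns a non-empty list corresponds to the case in which Quantum reports a missing card and ignores the record.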

Repeated card types

To define cards which may appear more than once in a record, type:

struct; rep=card_numbers

where card_numbers is either a comma-separated list of card numbers, or a range of sequential

card numbers in the form start:end or start/end. If the data contains trailer cards and the Levels

facility is not used, you must list their card types with the keyword rep=. For instance, if card 2 is

a trailer card we would write

rep=2. Where there is more than one trailer card, each card type is listed separated by a

comma. If cards 2, 3 and 4 are all trailer cards we could write:

rep=2,3,4

If the repeated card types form a range, you may type the numbers of the lowest and

highest cards separated by a slash (/) or a colon (:) rather than listing each card type

separately.

For example, if cards 2 to 4 are all trailer cards, you may type:

rep=2,3,4 or rep=2/4 or rep=2:4

If rep= is not used and a record is read with two or more cards of the same type, the last

card of that type will be accepted and the message ‘Identical duplicate’ or ‘Non-identical

duplicate’ and a note of the record’s position in the file will be printed. For example:

Record structure error: serial 026, card 234 in run, card 234 in dfile

card type 2 – non-identical duplicate

Because rep= refers to trailer cards only, it will be ignored if read=2 and crd= are not

both present on the struct statement.

Highest card type number

To define the highest card type in the record, if there are more than nine cards per record,

type:

struct; max=n

The only time you need to inform Quantum of the highest card type is when you have

records with more than nine cards. This is so that Quantum can allocate sufficient cells

in the C array to store the extra cards. The highest card type is defined with max=n, where

n is the number of the highest card type. Cells 1 to max*reclen are then cleared between

respondents. For example, to read a data set with 11 cards per respondent we might write:

struct;read=2;ser=c(1,4);crd=c5;req=1,2,3,4;max=11

If you forget max=, and a record is read with more than nine cards, the message ‘Too

many cards per record’ is printed and the record is rejected. On the other hand, if a card

is read with a card type higher than that defined with max=, the record is rejected with

the message ‘Card number out of range’.

Dealing with alphanumeric card types

To define the location in the C array of cards with alphanumeric card types, type:

struct; order=card_types

where card_types is a list of card type numbers and letters in the order they are to appear

in the C array.

From time to time you may need to read in records with alphabetic as well as numeric

card types. This generally happens in a multicard data set containing more than nine cards

per record where only one column has been allocated to the card type.

Quantum can deal with this data but first you will have to say where in the C array the

alphabetic card types should go. This is done with the keyword:

order=n

where n is one or more of the codes ’1234567890-&’ or the letters A to Z (in upper or

lower case) not separated by spaces.

The card type bearing the first number in the list is read into c(101,200), the card bearing

the second code in the list is read into c(201,300) etc. For example, suppose each record

has ten cards – 1 to 9 and A – our struct statement might say:

struct;read=2;ser=c(1,4);crd=c5;max=10;order=123456789A

Data from card A would be read into cells 1001 to 1100 of the C array.
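The effect of order= can be pictured as a lookup from card type to a row of the C array. The Python sketch below is a model only; the function name order_rows is hypothetical and it assumes 100-cell rows as in the examples in this section.

```python
def order_rows(order: str) -> dict[str, tuple[int, int]]:
    """Map each card type in an order= list to its field in the C array.

    The first card type in the list goes to c(101,200), the second to
    c(201,300), and so on (100-cell rows assumed for illustration).
    """
    rows = {}
    for position, card_type in enumerate(order, start=1):
        rows[card_type] = (position * 100 + 1, position * 100 + 100)
    return rows


if __name__ == "__main__":
    rows = order_rows("123456789A")
    print(rows["1"])
    print(rows["A"])
```

With order=123456789A, card A is tenth in the list, so the lookup places it in cells 1001 to 1100.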

Merging Data using Quantum

Merge sequence for Trailer Cards

To define the location of the merge sequence number in trailer cards, type:

struct;seq=cn

When trailer card data is merged during a run with the merge facility, you may wish

trailer cards to be merged in a specific order, according to a sequence number entered as

part of the data. The location of this sequence number can be defined with the keyword

seq=cn for a single column code or seq=c(m,n) for a multicolumn code. For more

information on merging data see the next section.

Merging data files

When we say that Quantum allows you to merge data files, we do not mean that Quantum

takes data from a number of files and merges it to create a new file. Rather, we mean that

data can be read from a series of files during a Quantum run. Of course, the merged data

can then be written out to a new file for future use.

Quantum provides two methods for merging data. The first is designed for studies where

you have different card types in different files; for example, cards 1 and 2 in the file data1

and card 3 in the file data2. In this case, merging is by serial number and, optionally, card

type and trailer card sequence number.

The second method is designed for situations where you want to merge a field of data

from an external file into records from the main data file. For example, you may have a

file of manufacturers’ codes which refer to a number of products. If each record in the

main data file contains the product the respondent preferred, you may wish to merge the

appropriate manufacturer’s code from the external file into the main data in the C array.

In this case, merging is based on finding matching keys in the main record and the records

in the external file.

Both options are described in detail below.

Merging complete cards

Data for a study may be spread across a number of files. This is particularly useful with

large surveys because it means that you can put each card type in a different file and

simply merge in the cards required for the current batch of tables. For example, if we

require tables from cards 4 and 5, we need not even read in cards 1, 2, 3 and 6.

Data from up to 16 files may be merged; that is, the main data file and 15 others. It may

be merged on serial number and, within that, on card type. With trailer card data, you also

have the option of merging trailer cards according to a sequence number entered as part

of the data.

In order for the merge to be successful, all files must be sorted in ascending order with

the serial number, card type and sequence number in the same position. Quantum reads

the locations from the keywords ser=, crd= and seq= on the struct statement.

To merge data files you must create a file called merges telling Quantum which items to

merge on, and which files to merge. The type of merge is represented by a number:

1 merge on serial number. Cards are read in from each data file according to their

serial number only – the card type and sequence number, if any, are ignored. You

might use this option when you have two files, dat01 containing cards of type 1 and

dat02 containing cards of type 2, and you want the files to be merged so that card

type 1 is read into the C-Array, followed by card type 2.

3 merge on serial number and card type (default). With this option, cards with the

same serial number read from different data files are merged to form a single record

by comparing the serial number and card type. Cards within a record are then sorted

sequentially from 1 so that each card is read into the appropriate cells of the

C-Array. For example, if dat01 contains cards 1 and 3, and dat02 contains cards of

type 2, the merge will produce records containing cards 1, 2 and 3 in that order.

5 merge on serial number, card type and sequence number. This is similar to merge

type 3, except that trailer cards are merged according to their sequence number. For

example, if dat01 contains cards 1 and 2, where card 2 is a trailer card with a

sequence number of 2, and dat02 contains cards 2 and 3, where card 2 is a trailer

card with a sequence number of 1, the merged record will contain cards 1, 2/1, 2/2,

and 3, in that order.

This is the first item in the merges file, and is followed by the names of the files to be

merged with the main data file named in the Quantum command line. Items may be

entered on separate lines or all on the same line separated by semicolons. For example,

if we want to merge data in files dat02 and dat03 with data in the main file, dat01, by

serial number, card type and sequence number, the merges file would look like this:

5; dat02; dat03

Notice that we have not mentioned dat01 in the merges file because it will be named on

the Quantum command line instead.
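The difference between the three merge types lies in which items are compared. The Python sketch below models only the resulting card order; it is not how Quantum reads the files, and the Card type, the merge_cards function and the use of 0 as the sequence number of non-trailer cards are all assumptions made for illustration.

```python
from typing import NamedTuple


class Card(NamedTuple):
    serial: int
    card_type: int
    seq: int  # assumption: 0 for cards which have no sequence number


def merge_cards(files: list[list[Card]], merge_type: int) -> list[Card]:
    """Return the cards from all files in the order implied by the merge type."""
    keys = {
        1: lambda card: (card.serial,),
        3: lambda card: (card.serial, card.card_type),
        5: lambda card: (card.serial, card.card_type, card.seq),
    }
    if merge_type not in keys:
        raise ValueError("merge type must be 1, 3 or 5")
    all_cards = [card for cards in files for card in cards]
    # sorted() is stable, so cards with equal keys stay in file order.
    return sorted(all_cards, key=keys[merge_type])


if __name__ == "__main__":
    dat01 = [Card(1, 1, 0), Card(1, 2, 2)]
    dat02 = [Card(1, 2, 1), Card(1, 3, 0)]
    for card in merge_cards([dat01, dat02], 5):
        print(card)
```

With merge type 5 the two trailer cards come out in sequence number order, giving cards 1, 2/1, 2/2 and 3; with merge type 1 the cards from dat01 would simply be followed by those from dat02.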

Merging a field of data from an external file

To merge extra data from an external data file into the data currently in the C array, type:

int_variable=mergedata($ex_file$, key_field, key_start, copy_to, data_start)

where

ex_file is the name of the file containing the extra data.

key_field is the location of the key in the main data file, entered using the standard

Quantum notation for columns and fields

key_start is the start column of the key in the external data file.

copy_to is the field in the main data record in which to place the external data. The

field is defined using the standard Quantum notation for columns and fields.

data_start is the start column of the data to be copied.

This statement returns in int_variable a 1 if a match was found or 0 if not.

The mergedata statement merges a field of data from an external file with the main data

at the datapass stage of the Quantum run. Merging is by means of a data key present in

both the main records and the records in the external file. If a record in the external file

has a key which matches that of a record in the main data file, the external data will be

merged into a user-defined field of the main record when it is read into the C array.

In order for data to be merged correctly, both the main data file and the external file must

be sorted in ascending order by key value. If the key is the record serial number then the

data file will already be sorted in the correct order (assuming, of course, that the data is

sorted by serial number). If you are using a key that is not the record serial number you

must sort the data file so that it is ordered by key rather than by serial number.

The syntax for mergedata is:

int_variable=mergedata($ex_file$, key_field, key_start, copy_to, data_start)

where

int_variable

is the name of an integer variable in which the function can place its return

value.

ex_file is the name of the file containing the extra data. It must be enclosed in dollar

signs.

key_field is the location of the key in the main data file, entered using the standard

Quantum notation for columns and fields.

key_start is the start column of the key in the external data file, for example, 1 if the

key starts in column 1. The length of the key is taken from the length of

key_field.

copy_to is the field in the main data record in which to place the external data. The

field is defined using the standard Quantum notation for columns and fields.

data_start is the start column of the data to be copied. Quantum copies as many

columns as are defined by copy_to.

For example:

t1 = mergedata($manuf_codes$,c(178,180),15,c(168,175),1)

tells Quantum to compare the key in columns 178 to 180 of the main record with the key

which starts in column 15 of the external records in the file manuf_codes.

Because the key field in the main record is 3 columns long, Quantum reads columns 15

to 17 of each external record to obtain its key. If the keys match, Quantum copies the data

from the external record into columns 168 to 175 of the main record in the C array. The

external data to be copied starts in column 1 and, since the destination field is 8 columns

long, Quantum copies 8 columns starting at that column.

This statement returns a value of 1 if a match was found (i.e., merging took place), or 0

if not.

There is no limit on the number of mergedata statements in a specification, but you may

only merge data from up to nine different files per record.
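The way the five parameters work together may be easier to follow as a small model. The Python sketch below imitates the matching and copying described in the text; it is not Quantum's implementation. It assumes the main record is held as a list of single characters with column 1 at index 0, and it searches the external records one by one instead of relying on both files being sorted.

```python
def mergedata(record: list[str], external: list[str],
              key_field: tuple[int, int], key_start: int,
              copy_to: tuple[int, int], data_start: int) -> int:
    """Copy a field from the first external record whose key matches.

    Returns 1 if a match was found and the data was copied, otherwise 0.
    Column numbers are 1-based, as in Quantum.
    """
    key_len = key_field[1] - key_field[0] + 1   # key length comes from key_field
    data_len = copy_to[1] - copy_to[0] + 1      # data length comes from copy_to
    key = "".join(record[key_field[0] - 1:key_field[1]])
    for line in external:
        if line[key_start - 1:key_start - 1 + key_len] == key:
            data = line[data_start - 1:data_start - 1 + data_len].ljust(data_len)
            record[copy_to[0] - 1:copy_to[1]] = list(data)
            return 1
    return 0


if __name__ == "__main__":
    record = list(" " * 200)
    record[177:180] = list("042")                 # key in columns 178 to 180
    manuf_codes = ["ACME0001" + " " * 6 + "042"]  # data in 1-8, key in 15-17
    t1 = mergedata(record, manuf_codes, (178, 180), 15, (168, 175), 1)
    print(t1, "".join(record[167:175]))
```

The call mirrors the example statement: a 3-column key is compared with columns 15 to 17 of each external record, and 8 columns starting at column 1 are copied into columns 168 to 175.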

Writing out data

There are three ways of writing out your data once it has been read into the C-Array. You

may:

a) create a new data file

b) copy records to a print file

c) write information to a report file

Data and print files are both accessed by the write statement, but the exact format of the

statement varies according to the type of file and the information being written. Report

files are written to with the report statement.

Print files

Print files are printouts of records or parts of records with headings, descriptive texts and

page numbers. They cannot be used as data for subsequent Quantum runs.

Printing out individual records

To write a record or part of a record to a print file, type:

write [file_name] [field] [$text$]

The word write by itself prints out a whole record in the form it is when the write

statement is executed, together with a ruler showing which codes fall in which columns,

the line number of the record in the data file and the message ‘write’ indicating that the

record was generated by a write statement. Any multicodes in the record are shown as

asterisks, but you may change this with an option on the filedef statement.

If the record contains more than one card, each card is listed separately beneath the ruler.

For example, the statement:

write

by itself might give us:

Quantum edit report

1 in file

----+----1----+----2-- ... --9----+----0

column 1 - 100 are |12345

write

2 in file

----+----1----+----2-- ... --9----+----0

column 1 - 100 are |23456

write

Each write statement will produce a line in the default print file, out2, telling you how

many records were written out, as follows:

2 (1%) write

The example above was very simple; more often than not your program will contain

several write statements and you will want some way of identifying which records were

printed by which statement and why. If the write is dependent upon some other statement

– for instance, it is part of an if statement – the whole statement is printed underneath each

record, thus:

67 in file

----+----1----+----2-- ... --9----+----0

column 1 - 100 are |0015263-16*735 *837361 ... 79&

if (c14n’1/4’) write

Here, as you can see, we are checking that column 14 contains a 1/4. This record has been

printed out because it contains a ’5’ instead.

Sometimes it is more helpful to have an explanatory text printed instead of the statement

itself. In this case all that is necessary is to follow the word write with the text to be

printed enclosed in dollar signs:

if (c308n’1/5’) write $C308 incorrect$

if (numb(c117,c118,c119).gt.3) write $too many choices$

might give us:

Quantum edit report

Record 17 51 in file

----+----1----+----2-- ... --9----+----0

column 101 - 200 are |00170116548986131*46*1 ...

column 201 - 300 are |0017026464515 875 ** ...

column 301 - 400 are |0017031929-5897231 ...

C308 incorrect

too many choices

Record 32 94 in file

----+----1----+----2-- ... --9----+----0

column 101 - 200 are |003201837021 **53798 ...

column 201 - 300 are |0032021353452 763736 ...

column 301 - 400 are |003203212 & ...

too many choices

Our first statement writes out all records in which column 308 does not contain any of

the codes 1/5, and the second picks up all records having more than 3 codes in columns

117 to 119.

Normally all output from write goes to the default print file, and whenever the current

record is written to this file, the variable printed_ becomes true. You may change the

output file by following the word write with the name of the file to write to. For example:

write pfile $First Print$

writes to the file ‘pfile’, whereas:

write errors $Second Print$

writes to a file called ‘errors’.

All files named on write statements must be defined on a filedef statement before they are

used.

If two or more write statements apply to a single record, the record is printed out once in

the state it was in when the first applicable write was executed, with all relevant write statements

or texts listed below it. If a record satisfies two or more write statements which write to

different files, Quantum will write the record out once for each statement, in the state it

is in when each write is executed.

Writing Out Parts of Records

Often you will not want to write out the whole record, especially if it contains several

cards. Therefore Quantum allows you to include a field specification in a write statement

to print only selected portions of an incorrect record. For example:

if (c110’2’.and.c119’2’) write c(110,120) $Married woman$

checks that columns 110 and 119 both contain a 2, and if so prints out columns 110 to

120 in the print file, followed by the text Married woman. If you are writing out fewer than

ten columns, Quantum does not print a ruler above the codes.

If you are dealing with multi-card records, you may prefer to use this form of write to

have only the card containing the error printed, rather than all cards in the record. If we

take our previous example where we were checking the contents of column 308:

if (c308n’1/5’) write $c308 incorrect$

prints all three cards in the record, whereas:

if (c308n’1/5’) write c(301,380) $C308 incorrect$

prints only card 3.

To write selected parts of a record to a particular file the notation is:

write filename c(m,n) [$text$]

Data files

To write records or fields to a data file, type:

write file_name [c(start_col, end_col)]

write may also be used to copy records to a data file. This is useful if you want to separate

a particular card type from the rest of the data, or if you want to correct errors and save

the corrected data in a new file for later tabulation.

To write records to a data file the command is:

write filename

to write the whole record to the named file, or

write filename c(m,n)

to write columns m to n only.

Creating new cards

New cards can be created by copying information into spare columns of the C-Array. To save

these as part of a new data file you will have to give each new card the same respondent serial

number as the rest of the data in the array and a card type which may or may not be unique. In

the example below, we are moving some information from card 1 of a 2-card data set into a new

card 3. The comments explain what each statement is doing.

/* Copy the data into the new card

c(310,341)=c(148,179)

/* Delete it from its original place

c(148,179)=$ $

/* Give it a serial number and card type

c(301,304)=c(101,104); c380=’3’

/* Set thisread true for card 3

thisread3=1

/* Define pfil as a data file

filedef pfil data

/* Copy cards 1, 2 and 3 to pfil

write pfil

Some General Instances for forcecoding cleaning etc.

Writing to a report file

To write information to a report file, type:

report[n] file_name variable_names

where variable_names is a comma-separated list of the variables and texts to print.

Use reportn rather than just report to start a new line each time the statement is executed.

A report file is a special type of print file in which you can print out records, fields or

variables in the format of your choice. To write information in a report file, use the report

statement, as follows:

report filename parameters

where filename is the name of the file to be written to, and parameters define exactly what

is to be written.

Lines in a report may be up to 1024 characters long. Report does not start a new line

automatically at the end of each write, but you may tell it to do so by following the

keyword report with the letter n:

reportn filename parameters

In both cases, the named file must be identified as a report file using a filedef statement,

as mentioned below.

The parameter list defines what is to be printed in the report file. It may contain variables,

texts, and special characters representing tabs and spaces.

Assignment statements

Assignment statements are used:

• to copy codes from one column into another.

• to replace certain codes in one column with those from a second column.

• to assign the value of an arithmetic expression to a variable.

• to copy codes from groups of columns into another column using the logical

operators and, or and xor.

In spite of the diversity of these functions the basic format of any assignment statement

is:

variable=item

where item defines what is to be copied into the variable.

Remember that comments can be identified by a capital C in column 1. If the first

variable in your statement starts with a C, make sure that you type it in lower case

otherwise the whole line will be read as a comment and ignored. For example:

col 1

c(15,16)=$12$ is correct, but

C(15,16)=$12$ will be read as a comment even though the syntax is correct

Alternatively, you may precede assignment statements with the word set, thus:

set c(15,16)=$12$

Copying codes

To copy codes into a single data variable, overwriting the variable’s original contents,

type:

variable=’codes’

To copy a string of codes into a field, type:

var_name(start,end)=$codes$

To copy the contents of one variable or field into another, type:

variable1 = variable2

Assignment statements are most commonly used to copy codes into a column or to copy

the contents of one variable into another. For instance:

c121=’159’

c121=c134

You can also copy strings of characters into fields of columns. Let’s say we want to copy

the code 59642 into columns 76 to 80 of card 3; we would write:

c(376,380)=$59642$

Partial column replacement

To replace a code or set of codes in one data variable with a code or set of codes in a

second data variable, type:

variable1’codes1’=variable2’codes2’

codes1 and codes2 must contain the same number of codes, and the codes must be in

superimposable order.

Storing arithmetic values

To store the value of an arithmetic expression in a variable, type:

variable = expression

To copy a real value into a data variable, type:

var_name(start,end) :dp = expression

where dp is the number of decimal places required.

For example, if x5=10.22, the statement:

cx(15,19):2=x5

results in:

10.22

Assignment with and, or and xor

To copy codes which are present in at least one of a list of columns, type:

data_var_name=or(cnum1[’codes1’], cnum2[’codes2’], ...)

To copy codes which are present in all of a list of columns, type:

data_var_name=and(cnum1[’codes1’], cnum2[’codes2’], ...)

To copy codes which are present in only one of a list of columns, type:

data_var_name=xor(cnum1[’codes1’], cnum2[’codes2’], ...)

The final type of assignment is copying codes from a set of columns. The codes copied

depend upon the type of operator used:

and Copy codes present in all columns

or Copy codes present in one or more columns

xor Copy codes present in one column only

The format of the statement is:

column = operator(ca,cb,cc, ...)

where ca, cb, and cc are the columns whose codes are to be compared. Note that even if

you are comparing codes in consecutive columns, each column must be identified

separately.

For example, the statement:

c181=and(c137,c138,c139)

copies into c181 only those codes which are present in all three of columns 137, 138 and 139, whereas:

c182=or(c137,c138,c139)

copies into c182 every code which is present in at least one of the named columns.
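If you think of a column as a set of codes, the three operators correspond to familiar set operations. The Python sketch below is a model for reasoning about results, not Quantum code; the function names are hypothetical and each column is represented as a set of single-character codes.

```python
from collections import Counter


def and_codes(*columns: set[str]) -> set[str]:
    """Codes present in all of the columns."""
    return set.intersection(*columns)


def or_codes(*columns: set[str]) -> set[str]:
    """Codes present in at least one of the columns."""
    return set.union(*columns)


def xor_codes(*columns: set[str]) -> set[str]:
    """Codes present in one column only."""
    counts = Counter(code for column in columns for code in column)
    return {code for code, count in counts.items() if count == 1}


if __name__ == "__main__":
    c137, c138, c139 = {"1", "2", "5"}, {"1", "5", "7"}, {"1", "9"}
    print(sorted(and_codes(c137, c138, c139)))
    print(sorted(or_codes(c137, c138, c139)))
    print(sorted(xor_codes(c137, c138, c139)))
```

Here code 1 is the only code common to all three columns, code 5 appears in two of them and so is dropped by xor, and codes 2, 7 and 9 each appear in one column only.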

Adding codes into a column

To add codes into a column in addition to those that are already there, type:

emit cn1’codes1’ [, cn2’codes2’ ... ]

Emit inserts codes into a column leaving the original contents intact. Its format is:

emit cn’p’

More than one column may be entered on each line, provided that each one is separated

by a comma.

emit c567’7’, c110’2’

emit can only be used with single columns; string variables are not valid: emit

c(100,110)$99$ does not work.

Deleting codes from a column

To delete selected codes from a column, type:

delete cn1’codes1’ [, cn2’codes2’ ... ]

The delete statement is the opposite of emit in that it deletes codes from a column leaving

the remainder intact. Its format is:

delete cn’p’

More than one deletion may be effected with the same delete statement as long as each

column is separated by a comma.

delete c110’5’, c179’56’
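Because emit and delete are opposites, it can help to see them side by side. In the Python sketch below, which is an illustration and not Quantum code, a column is modelled as a set of single-character codes and the function names are hypothetical.

```python
def emit(column: set[str], codes: str) -> set[str]:
    """Add codes to a column, leaving the codes already there intact."""
    return column | set(codes)


def delete(column: set[str], codes: str) -> set[str]:
    """Remove codes from a column, leaving the remaining codes intact."""
    return column - set(codes)


if __name__ == "__main__":
    c110 = {"1", "5"}
    c110 = emit(c110, "2")     # models: emit c110'2'
    c110 = delete(c110, "5")   # models: delete c110'5'
    print(sorted(c110))
```

Emitting a code that is already present, or deleting one that is absent, leaves the column unchanged in this model.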

Forcing single-coded answers

To force single-coding of a multicoded column, type:

priority cn’code1’, ’code2’ ,’code3’,[cn2’code1a’, ’code2a’ ,’code3a’, ... ]

where a code at the start of the list should be accepted in preference to any later in the list.

The statement used for this is:

priority cn’code1’, ’code2’ ,’code3’,[cn2’code1a’, ’code2a’ ,’code3a’, ... ]

where cn is the column whose codes are to be checked and ’code1’, ’code2’ and ’code3’ are the codes

to check, entered in order of priority, the most important first.

priority checks only the listed codes; if any other codes are present they are

ignored.

For example, the statement:

priority c249’5’, ’4’, ’3’, ’2’, ’1’

causes Quantum to scan column 249 to see first whether it contains a ’5’ and, if so,

to delete all subsequent codes in the list. If c249 contains a ’5’ and nothing else, obviously

there will be no extra codes to delete; this does not matter. If there is no ’5’ in c249,

Quantum then checks whether it contains a ’4’; if so, any other codes in the range ’1/3’

are deleted, otherwise the program skips to the next code in the list and checks for that.

If none of the listed codes are found, the column remains unchanged.
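The scanning rule just described can be written out as a short loop. The Python sketch below is a model of that rule, not Quantum code; the function name is hypothetical, the column is a set of single-character codes, and each entry in the priority list is assumed to be a single code.

```python
def priority(column: set[str], codes: list[str]) -> set[str]:
    """Keep the highest-priority listed code and delete the later listed codes.

    Codes that are not in the priority list are left untouched.
    """
    for index, code in enumerate(codes):
        if code in column:
            # Found the most important code: delete all later codes in the list.
            return column - set(codes[index + 1:])
    # None of the listed codes is present: the column is unchanged.
    return set(column)


if __name__ == "__main__":
    order = ["5", "4", "3", "2", "1"]   # models: priority c249'5','4','3','2','1'
    print(sorted(priority({"5", "3", "1"}, order)))
    print(sorted(priority({"4", "2", "7"}, order)))
```

In the second call the column has no 5, so 4 is kept, 2 is deleted, and 7 survives because it is not in the list.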

Setting a random code in a column

To choose a random code from a list of codes, type:

data_var_name=rpunch(’codes’)

To choose a random code from the codes present in a column, type:

data_var_name=rpunch(col_number)

For example:

c115 = rpunch(’1/5’)

will place one of the codes 1 through 5 in column 115.

Alternatively, you may use rpunch with another C-variable, thus:

c115 = rpunch(c120)

Once this statement has been executed, column 115 will contain one of the codes present

in column 120.
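The two forms of rpunch differ only in where the candidate codes come from. The Python sketch below models this with the standard random module; it is an illustration, not Quantum code. The function name is borrowed from the statement for readability, the range ’1/5’ is written out as the codes 12345, and raising an error for an empty column is an assumption, since the text does not say what Quantum does in that case.

```python
import random


def rpunch(codes) -> str:
    """Return one code chosen at random from a string or set of codes."""
    candidates = sorted(codes)
    if not candidates:
        # Assumption: the behaviour for a blank column is not described.
        raise ValueError("no codes to choose from")
    return random.choice(candidates)


if __name__ == "__main__":
    c120 = {"2", "7", "9"}
    print(rpunch("12345"))   # models: c115 = rpunch('1/5')
    print(rpunch(c120))      # models: c115 = rpunch(c120)
```

Each call returns a single code, so the receiving column is always single-coded in this model.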

Reading numeric codes into an array

To set up an array based on numeric codes in the data, type:

field array_name=column_spec [,code=cell_number, ...]

column_specs are references to the fields containing the numeric codes. code is a

non-numeric code present in those fields and cell_number is the cell of the array which

should be incremented whenever that code is encountered.

Cells in the array are reset to zero at the start of each new record. To prevent this

happening, enter the statement name as fieldadd rather than field. The rest of the

statement is as shown.

The format of the field statement is:

field output_array = column_specs [,special_specs]

output_array is the name of the array in which you wish to store the counts of responses.

You can use spare columns in the C array, but you may find your program is easier to

read if you define an integer array of your own with a name which reflects the type of

information it contains. For example, if you want an integer array called films, you might

write:

int films 5s

ed

field films = .....

When you define the integer array, make sure that you request as many cells as there are

codes in the data. In this example there are five films so you define the array as having

five cells. Quantum automatically creates an extra cell (cell 0) which it uses to count

responses for which there is no cell allocated. If there were six films, for example,

Quantum would increment cell 0 each time it found code 06 in the films columns. You

might like to check the value of this cell as a means of reporting on invalid codes:

if (films0 .gt. 0) write c(1,20) $Bad film code$

Negative and zero values also cause cell zero to be incremented. Codes which are shorter

than the field width are accepted as long as they are padded with blanks or zeroes.

The column_specs part of the statement defines the columns to read. You have a number of

choices here. First, you may list each column or field reference one after the other,

separated by commas. The list must be enclosed in parentheses. In our example this

would be:

field films = (c(12,13), c(14,15), c(16,17))

Second, if you have sequential fields as you do here, you can type the start columns of

each field followed by the field length. The list of start columns is separated by commas

and enclosed in parentheses, and the field length comes after the closing parenthesis and

starts with a colon. If you use this notation for the film example you would write:

field films = (c12, c14, c16) :2

If you wish, you can abbreviate this further by typing just the start columns of the first

and last fields, followed by the field length.

field films = c12, c16 :2

Third, if the fields are not sequential, you list the start columns and field width of each

group of columns (as shown above) and separate each group with a slash. For example,

to read data from columns 12 to 17 and 52 to 57, with each field being two columns wide,

you would type:

field films = c12, c16 / c52, c56 :2

This reads c(12,13), c(14,15), c(16,17), c(52,53), c(54,55) and c(56,57).

You can also use this notation for single non-sequential fields. For example:

field films = c23 / c36 / c71 :2

means c(23,24), c(36,37) and c(71,72).

The special_specs part of the statement is optional. You use it when a field contains

non-numeric codes such as $&&$ for None of these films. If you want to count codings

of this type, you must remember to allocate cells in the array for each code or group of

codes you wish to count. You then include the notation:

code = cell_number

to count those codes. For example:

int films 6s

ed

field films = (c12, c14, c16) :2, $&&$=6

If you want to count more than one non-numeric code, list each one individually,

separated by commas.

Quantum normally resets the cells of the integer array to zero at the start of each record.

If you want counts to continue from one record to another, use a fieldadd statement

instead of field. For example:

fieldadd films = (c12, c14, c16) :2
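
The counting behaviour of field and fieldadd can be modelled roughly in Python as follows. This is a sketch of the semantics described above, not Quantum code: count_field is a hypothetical name, the record is a plain string with one character per column, and two points are assumptions rather than documented behaviour (wholly blank fields are skipped, and non-numeric codes with no special cell go to cell 0).

```python
def count_field(record, fields, n_cells, special=None, counts=None):
    """Count codes found in the given (start, end) fields.

    Passing the previous counts back in models fieldadd; omitting them
    models field, which starts again from zero for each record.
    """
    special = special or {}
    if counts is None:
        counts = [0] * (n_cells + 1)      # cell 0 collects unallocated codes
    for start, end in fields:
        raw = record[start - 1:end]
        if raw.strip() == "":
            continue                      # assumption: blank fields are skipped
        if raw in special:
            counts[special[raw]] += 1     # e.g. $&&$=6
            continue
        try:
            value = int(raw.strip())      # blank or zero padding is accepted
        except ValueError:
            value = 0                     # assumption: unknown codes go to cell 0
        cell = value if 1 <= value <= n_cells else 0
        counts[cell] += 1
    return counts


fields = [(12, 13), (14, 15), (16, 17)]
record = " " * 11 + "0109&&"
once = count_field(record, fields, 6, {"&&": 6})
twice = count_field(record, fields, 6, {"&&": 6}, counts=list(once))
print(once, twice)
```

Here code 01 lands in cell 1, code 09 has no cell and so increments cell 0, and the special code && is counted in cell 6.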

Clearing variables

To remove values from variables, type:

clear var_name1, var_name2, var_name3


Variables of any type may be cleared using a clear statement:

clear var1, var2, .... varn

where var1 to varn are any valid Quantum variable or range of variables. For example:

clear c(109,180), t(1,200), myarray(29,33), myint, myreal

Data variables are reset to blank, integer variables are reset to 0 and real variables are reset to

0.0.

Variables can also be cleared using assignment statements (e.g., t1=0), but there are advantages

to using clear instead. Firstly, clear is much easier to write. Secondly, with clear the compiler

checks that the subscripts are in the correct range (e.g., 1 to 33 if ‘myarray’ has only 33 cells);

this is not possible when a range is cleared by assignment inside a loop, because the subscript is then a variable. However, if you use variables as subscripts with clear (e.g., clear c(t1,t1+5)), subscript checking once again cannot be done.

Flow control

Statements in the edit section are usually dealt with in the order in which they occur in the program. Quantum provides statements which may be used to alter this normal order of execution, for example, by missing out a statement or repeating a group of statements a number of times.

Statements of condition

1) Ed - Defines the start of the edit section of a Quantum run. This statement is essential if the run contains an edit section.

2) End - Defines the end of the edit section. This statement is also required if the run contains an edit section.

1) If - Defines statements to be executed if a certain condition is true. For example:

if (numb(c10,c11,c12).gt.3) emit c20’9’

2) Else - Defines statements to be executed if the condition is not true. For example:

if (c115’1’); else; emit c140’2’

3) Go to - Routes records around statements which apply to certain respondents only, by jumping to a labeled statement. For example, the statement:

if (c121n’1’) go to 50

causes Quantum to go immediately to the statement labeled 50 if column 121 does not contain a ’1’. Any statements between this if statement and statement 50 are ignored whenever a record is read where c121n’1’ is true. The statement labeled 50 may be any Quantum statement, but many people just write:

50 continue

4) Continue - This statement is a dummy statement whose sole purpose is to join various bits of a

program together. It is often used with a statement label as a destination for routing with go to, or

to identify the end of a loop.

5) Loops - Used to define repetitive statements. Loops are extremely important structures

because they enable the same set of basic statements to be executed over and over again on a

changing series of numbers, columns or codes. Their use can reduce the work involved in

checking data. The statement which introduces a loop is do which is formatted as follows:

1. The word do.

2. A label number identifying the last statement in the loop.

3. An integer variable (for numbers or columns) or a letter (for codes) whose value is to be

used by the statements in the loop.

4. An equals sign.

5. A list of whole numbers, integer variables or codes which are the values the integer

variable or letter is to take. These may be entered in two ways

Loops should be terminated by any statement other than go to, stop, return, another do or an if

containing any of these words. The main purpose of the terminating statement is to identify the

end of the loop and send the program back to the start of the loop. Go to and return send the

record elsewhere, stop terminates the run and another do indicates the start of another loop. The

statement most often used to terminate a loop is the dummy statement continue. Any statement

that terminates a loop must be preceded by a label number.

Thus, the usual format of a loop is:

do label.number int.var = value list

- - statements to be executed - -

label.number statement

For example:

do 20 t5 = 125,145,5

if (c(t5,t5+4).gt.3000) c(t5,t5+4)=$ $

20 continue

6) Reject - To exclude a record from the tables. Normally all records are passed straight from the edit to the tabulation section regardless of whether or not they contain errors. Reject tells Quantum to continue editing the record but not to include it in the tables.

For instance, we might write:

if (c73’8’) reject

if (c80’1’) t5=t5+1

end

to reject records in which column 73 contains an ’8’ from the tabulations but not from the rest of

the edit. Therefore, even if c73’8’, the record is still checked for a ’1’ in column 80 and if one is

found, t5 is incremented.

7) Return - To send the record to the tabulation section. The word return in Quantum bears no

relation to the same word in English. It does not mean go back to the start of the edit or anything

like that, rather it means ‘terminate the edit immediately and jump to the tabulation section’. Once

the record is tabulated Quantum reads in another record as usual. If there is no tabulation

section, the next record is read in straight away.

Return is very often used with reject to reject a record without finishing the edit. For example:

if (c73’8’) reject; return

if (c80’1’) t5=t5+1

end

Here any records in which c73’8’ are rejected from the tables, but, because reject is

followed by return which sends records to the tabulation section, editing is terminated

immediately. Thus, only records in which c73n’8’ will be tested for a ’1’ in column 80.

8) Stop - To stop editing records and start tabulating the records read so far. Stop tells Quantum to stop the run and print tables once editing has been completed on the current record.

For example, we may want test tables for the first 100 people, so we set up a counter and terminate the run when it reaches 100. The statement:

if (rec_count.eq.100) stop

will stop editing records and start tabulating the records read so far.

9) Process - To send a record temporarily to the tabulation section. Process is an edit statement which is

similar to return but must not be confused with it. When return is executed, the record is sent on

to the tabulation section; after the tables are completed for that record, the program returns to the

start of the edit section and the next record is read in.

When process is executed, the record is also sent immediately to the tabulation section where it

is used in table creation. However, after the record has been tabulated, control is passed back to

the edit section to the statement immediately following the word process.

The record continues through the edit and any statements after process applicable to the record

are executed. At the end of the edit the record is passed through the tabulation section again.

10) Split - To write correct records out to a clean data file and incorrect records out to a dirty data file.

Clean and dirty data files are the terms used to refer to files of correct and incorrect or rejected

records created automatically by the edit statement split.
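
Two of the statements above are easier to follow with a small model. The Python sketch below (not Quantum code) lists the values the loop variable takes in do 20 t5 = 125,145,5, and contrasts reject on its own with reject; return. The function names are hypothetical, and treating the end value of the loop as inclusive is an assumption based on the example, which tests fields of five columns starting at each value.

```python
def loop_columns(start, end, step):
    """Values taken by the loop variable in 'do label var = start,end,step'."""
    return list(range(start, end + 1, step))  # assumption: end is inclusive


def edit_record(c73, c80, use_return):
    """Model the two reject examples; returns (include_in_tables, t5_increment)."""
    include_in_tables = True
    t5_increment = 0
    if c73 == "8":
        include_in_tables = False             # reject: editing carries on
        if use_return:
            return include_in_tables, t5_increment  # return: leave the edit now
    if c80 == "1":
        t5_increment = 1                      # t5=t5+1
    return include_in_tables, t5_increment


print(loop_columns(125, 145, 5))
print(edit_record("8", "1", use_return=False))
print(edit_record("8", "1", use_return=True))
```

With reject alone the rejected record still increments t5; with reject; return it does not, because the edit ends before column 80 is tested.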

Examining records

Holecounts

Holecounts are used to obtain an overall picture of the data before you write your edit program.

For each column they show:

o a distribution of the codes – e.g., how many respondents have a 2 in column 56

o the density of coding – i.e., how many respondents have 1, 2, or 3 or more codes in each column

o the total number of codes for the whole data file.

Creating a holecount

To create a holecount, type:

count c(start_col, end_col) [$text$]

where text is the holecount title.

To create a holecount you will use the count statement:

count c(start_col,end_col) [$text$]

where text is the heading to be printed at the top of each page. This is optional; if it is omitted the

holecount will simply be headed ‘Holecount’. Our example was created by the statement:

count c(1,16) $Demonstration Holecount$
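
As a rough illustration of what a holecount reports, the Python sketch below (not Quantum code) computes the three figures listed above for each column. The data layout is an assumption made for the sketch: each record is a dictionary mapping a column number to a string holding every code punched in that column, and holecount is a hypothetical function name.

```python
from collections import Counter


def holecount(records, columns):
    """Per column: code distribution, coding density and total codes."""
    report = {}
    for col in columns:
        code_counts = Counter()
        density = Counter()
        for rec in records:
            codes = rec.get(col, "")
            code_counts.update(codes)                 # one count per code
            if codes:
                density[min(len(codes), 3)] += 1      # 1, 2, or 3 or more codes
        report[col] = {
            "codes": dict(code_counts),
            "density": dict(density),
            "total": sum(code_counts.values()),
        }
    return report


records = [{56: "2"}, {56: "12"}, {56: "2"}, {}]
print(holecount(records, [56]))
```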

Frequency distributions

A frequency distribution enables you to inspect the contents of a field of columns containing

alphabetic or numeric data. For example, in a shopping survey the price the respondent paid for a

bottle of mineral water may be stored in columns 112 to 114. A frequency distribution will tell you

how many respondents bought mineral water at each particular price. This is very useful for

determining how the values in these fields should be grouped for tabulation, as well as for rough

estimates of medians.

To create a frequency distribution sorted in alphabetic and rank orders, type:

list c(start_col, end_col) [$text$]

where text is the heading to be printed.

To produce a frequency distribution sorted in alphabetic order only, type lista instead of list. For a

distribution sorted in rank order only, type listr instead of list.

Here are some examples:

listr c(107,108) $Contents of cols 7 and 8$

lista c(100,104) $First Set of Car Brands$

The first example produces a frequency distribution of the contents of c(107,108) sorted in rank order; the second example generates a list of car brands which will be sorted in

alphabetic order.
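
The difference between the alphabetic and rank orderings can be sketched in Python as follows. This is an illustration only, not Quantum code: frequency_distribution is a hypothetical name, records are plain strings with one character per column, and breaking ties in the rank ordering alphabetically is an assumption.

```python
from collections import Counter


def frequency_distribution(records, start_col, end_col):
    """Return the field's values in alphabetic order and in rank order."""
    counts = Counter(rec[start_col - 1:end_col] for rec in records)
    alphabetic = sorted(counts.items())
    # Rank order: most frequent first (ties broken alphabetically, by assumption).
    by_rank = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return alphabetic, by_rank


records = ["Ford ", "Audi ", "Ford ", "Fiat "]
alphabetic, by_rank = frequency_distribution(records, 1, 5)
print(alphabetic)
print(by_rank)
```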

Data validation

In an earlier section we discussed ways of examining the data for a set of records (with count) or for

an individual record (with write). In general, however, we want to check the validity of the data for

individual records by putting in the edit a set of testing sentences which will tell us not only

whether a record contains an error but also what that error is. There are two types of checking

sentence.

The first involves checking whether a column contains the correct type of coding (single-coded or multicoded) and whether the codes in that column are valid. Take the question on a respondent’s sex which may be Male, coded c106’1’, or Female, coded c106’2’. c106 must be single-coded since no person can have two sexes, and the only codes which may appear in that column are 1 and 2. Any record in which c106 is not single-coded with a 1 or a 2 will be flagged as incorrect.

The second type of checking involves making sure that columns whose contents depend on the contents of other columns contain the correct codes. For instance, suppose the questionnaire asks whether the respondent has ever used a particular brand of washing up liquid. The answer is coded into c125 as ’1’ for Yes and a ’2’ for No. If the answer is Yes, the next questions concerning price and quality are asked. If c125’2’ indicating that the respondent has not used that brand of washing up liquid, the following columns must be blank. Conversely, if c125’1’, the following columns must be coded according to the codes on the questionnaire.

require

Both tasks listed above can be carried out using if but sometimes they can become very

complicated and repetitive. Therefore, Quantum has an additional testing statement, require,

specifically designed to increase the efficiency of this checking process.

Require is used in three types of sentence:

Column Validation

Tests columns against a given set of characteristics and deals with records not meeting the

requirements according to a specified action code.

Testing the Validity of a Logical Expression

Tests a logical expression and, if it is true, continues with the next statement. If the expression is

false, the record is dealt with according to the given action code.

Testing the Equivalence of Logical Expressions

Compares the logical value of a group of logical expressions. If all are true or all are false, the run

continues with the next statement, otherwise if the expressions yield a mixture of values the

specified error action is carried out.

The require statement has three forms, depending upon the function it performs, and these are

described in the subsequent sections. Each one must start with the word require, which may be abbreviated to r.

Column and code validation

To validate columns and codes, type:

require[/code/] condition col1 [,col2 ...]

where code is the error action code, condition is the type of coding required, and col1 and col2

are the columns or fields to be tested.

This form of the require statement has four basic parts:

1. The word require or the letter r followed by a space.

2. An optional error action code enclosed in slashes.

3. A code defining the type of coding required.

4. The column or columns to be checked, separated by commas.

Checking type of coding

Checking with require can be as simple or complex as you like. In this section, we will start with

the simplest checks and deal with each extra feature in turn. We will assume, unless otherwise

stated, that the error action code is the default Print and Reject (code 3) and will omit it from most

of the examples accordingly.

The most basic form of the require statement simply checks whether the column or field of

columns contains the correct type of code; it does not check the individual codes themselves.

Code types may be:

b Blank

nb Not blank (i.e., single-coded or multicoded)

sp Single-coded (literally, single-punched)

spb Single-coded or blank

One of these types must follow the word require since it tells Quantum what to check for. All that

remains is to say which columns are to be inspected; just list each column or field of columns at

the end of the statement. If more than one column or field is defined, each one must be separated

by a comma.

Here are some examples in which the record to be checked is:

----+----1----+----2----+----3----+----4----+

002411123481231&- *1927235537*&& 1 1 1

The statement:

require nb c10, c(25,35)

checks that columns 10, and 25 to 35 inclusive are not blank – they may contain any number of

codes. This record satisfies both conditions so it passes on to the next statement in the edit.

The statement:

r sp c11, c15, c23, c41

looks to see whether columns 11, 15, 23 and 41 are single-coded. In our record they are, but if

this were not the case (say c11’123’) the record would be printed out and rejected from any tables

that may be produced. Additionally, Quantum would tell us ‘Column 11 is 123’.
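
The four code types can be summarised with a short Python sketch (not Quantum code). It assumes a column is represented as a string holding every code punched in it, so an empty string or a space means blank; meets_condition is a hypothetical name.

```python
def meets_condition(codes, condition):
    """Check one column's codes against a require code type."""
    n_codes = len(set(codes.replace(" ", "")))
    if condition == "b":       # blank
        return n_codes == 0
    if condition == "nb":      # not blank: single-coded or multicoded
        return n_codes >= 1
    if condition == "sp":      # single-coded
        return n_codes == 1
    if condition == "spb":     # single-coded or blank
        return n_codes <= 1
    raise ValueError("unknown code type: " + condition)


print(meets_condition("123", "sp"))   # the c11'123' case from the text
print(meets_condition("123", "nb"))
```

The first call models the failing case mentioned above, where column 11 holds the three codes 123 but is required to be single-coded.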

Comments with require

To define a message to be printed when a record fails a test, type:

r [/err_code/ ] condition columns $message$

When incorrect records are printed out, require automatically prints a short text describing the

error. Normally, it tells you what codes were found in the column which is wrong, but if this is not

what you want, you may define your own error text by entering it enclosed in dollar signs at the

end of the statement. This text will then be printed in place of the default text when errors are

found. For example, if c329 is multicoded when it should be single-coded, the statement:

r sp c329

will print the whole record and tell us which codes were found in that multicode:

Column 329 is 13

Instead of being told which codes the column contains, you may prefer to see a message linking

the error to a question on the questionnaire. In this case you will need to add your own error text

as follows:

r sp c329 $q21a not sp$

These texts may be as long or short as you like.

Checking codes in columns

To check for specific codes in a column, type:

r [/err_code/] condition col1’codes1’ [, col2’codes2’ ... ]

where codes1 are the codes to be tested for in column or field col1, and codes2 are the codes to

be tested for in column or field col2.

Any codes which are present in col1 but are not listed in codes1 are ignored. The same applies

to any other column and code pairs listed.

Sometimes it is not sufficient to check just the type of coding, and you will want to know whether

the codes found are valid for that column. To do this, we use the information given in the previous

section as a base, and add on our first ‘optional extra’. To check whether a column or field of

columns contains specific codes, follow the column specification with the codes to be checked,

enclosed in single quotes. For example:

r /5/ sp c223’1/5’

tells us that column 223 should be single-coded within the range of codes 1 through 5. Any other

codes in this column are ignored. Thus, a record in which c223’14’ is incorrect because it contains two of the listed codes, whereas a record in which c223’27’ is correct because it contains only a 2 from the range ’1/5’. Of course, any record which does not contain a 1, 2, 3, 4 or 5 at all is also incorrect, regardless of whether or not it is single-coded: c223’9’ is just as wrong as c223’789&’.

Exclusive codes

To check that a column or field contains no codes other than those listed, type:

r [/err_code/] condition col1’codes1’o

If col1 contains any codes other than those given in codes1, the test is false.

Now that you know how to check codes, the next thing to discuss is how to check that all other

code positions are blank.

We have said that statements of the form:

r sp ca’p’

accept all records containing only one of the codes ’p’ in column a, regardless of what other

codes are also present. To check that a column contains only the listed codes and nothing else,

follow the code specification with the letter O (for only) in upper or lower case. For example, to

indicate that c356 must be single-coded in the range ’1/5’ and that all other positions (’6/&’) must

be blank, you should type:

r sp c356’1/5’o

which is the same as

if (c356’6/&’.or.numb(c356).ne.1) write; reject

Any of the following would cause the record to be printed and rejected:

c356’34’ c356’59’ c356’8’ c356’ ’

Require may define conditions for more than one column. Just follow each column with the code

positions to be checked and separate each set with a comma:

r sp c164’12-’, c165’1/70’, c166’1/3’, c167’1/9-’, c168’1/5’

Here the columns to be checked are consecutive but have been listed separately because they

each have different sets of valid codes. If all columns could be single-coded in the range 1 to 7

we might abbreviate this to:

r sp c(164,168)’1/7’ $q10a/e$

since this notation means that each column in the field must be single-coded within the given

range rather than that the field as a whole may contain only one of those codes.
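
The difference between checking listed codes and checking them exclusively can be sketched in Python (this is a model, not Quantum code). The names expand_codes and passes_sp are hypothetical, a column is again represented as a string of all its codes, and the code order 1 to 9, 0, -, & is inferred from the statement above that the positions other than ’1/5’ are ’6/&’.

```python
CODE_ORDER = "1234567890-&"   # assumed order of code positions in a column


def expand_codes(spec):
    """Expand a code list such as '1/5', '12-' or '1/70' into a set of codes."""
    codes = set()
    i = 0
    while i < len(spec):
        if i + 2 < len(spec) and spec[i + 1] == "/":
            low = CODE_ORDER.index(spec[i])
            high = CODE_ORDER.index(spec[i + 2])
            codes.update(CODE_ORDER[low:high + 1])
            i += 3
        else:
            codes.add(spec[i])
            i += 1
    return codes


def passes_sp(column_codes, spec, only=False):
    """Model 'r sp col'spec'' and, with only=True, 'r sp col'spec'o'."""
    present = set(column_codes) - {" "}
    listed = expand_codes(spec)
    if len(present & listed) != 1:
        return False                      # not exactly one of the listed codes
    if only and present - listed:
        return False                      # some other code is also present
    return True


print(passes_sp("27", "1/5"))             # other codes are ignored
print(passes_sp("27", "1/5", only=True))  # the 7 now makes the test fail
```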

Automatic error correction

To define a correction code to be used as a replacement for codes which fail the required

condition, type:

r [/err_code/] condition col1’codes1’ :’new_code’

new_code is the code or codes to be inserted in col1 if it fails the test condition. Any codes

already in that column are overwritten.

As you know, records found to have errors are printed, coded and/or rejected according to the

error action code. When the run is finished you will look at these records and, if possible, correct

the errors by using the on-line edit or correction file facilities.

Occasionally you will know in advance what to do with certain types of error; say, for instance, the

respondent’s sex has been miscoded. You may decide or be told to recode this person as a ’3’ in

the appropriate column indicating that the sex was not known. The way to do all this in one go is

to write the normal require statement that checks columns and codes, and to follow the code

specification with a colon (:) and the replacement code (in this case ’3’) enclosed in single quotes,

thus:

r /2/ sp c106’12’ :’3’

Any record in which c106 is not single-coded with either a ’1’ or a ’2’ will have the contents of

c106 overwritten with a ’3’.

The equivalent using if and an assignment statement would be written:

if (numb(c106’12’).ne.1) c106’3’;

+write $c106 incorrect$

Once again, the require is shorter and quicker.

When working with fields, it is not possible to define replacement strings for the field as a whole.

You should, however, note that if a single replacement code is given for a field of columns, any incorrect columns in that field will be overwritten with the replacement code, while the correct columns remain untouched. If we have:

+----4----+

1927

and we write c(237,240)’1/5’ :’&’ we will have:

+----4----+

1&2&
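
The column-by-column replacement in this example can be mimicked with a few lines of Python (a sketch, not Quantum code). The function name correct_field is hypothetical, and the field is a plain string with one code per column, so every column is single-coded or blank by construction.

```python
def correct_field(field, valid_codes, replacement):
    """Overwrite each column that holds no valid code; leave the rest alone."""
    return "".join(ch if ch in valid_codes else replacement for ch in field)


print(correct_field("1927", "12345", "&"))
```

Columns holding 9 and 7 fall outside the range 1 to 5 and are overwritten, while the columns holding 1 and 2 are left as they were.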

Validating logical expressions

This type of require also has four parts, two of which are optional:

1. The word require or the letter r followed by a space.

2. An optional action code enclosed in slashes.

3. A logical expression enclosed in parentheses.

4. An optional error text enclosed in dollar signs.

For example:

r /3/ (c133’4’ .and. c140n’5’) $Cols 33/40 incorrect$

says that c133 must contain a ’4’ and c140 must not contain a ’5’. If one or other or both expressions are false, Quantum prints the record out with the message ’Cols 33/40 incorrect’ and rejects it from the tables.

Testing the equivalence of logical expressions

To test whether a group of logical expressions all have the same logical value, type:

r = (expression1) (expression2) ...

There must be a space between r and the = sign.

Require can evaluate groups of expressions and perform given tasks depending on whether all

expressions are true or all are false. When all the expressions have the same value (i.e., all true

or all false) Quantum continues with the next statement in the program, whereas if some are true

and some are false, the record being tested will be dealt with according to the given (or default)

error action code.

This statement has five parts:

1. The word require or the letter r.

2. An equals sign which must be preceded by a space.

3. An optional action code.

4. The expressions to be evaluated, each one enclosed in parentheses.

5. Optional error text enclosed in dollar signs.

This type of statement is generally used to check routing patterns. For example: if a ’2’ in c125

means that the respondent did not try Brand A washing powder, we would expect columns 126 to

145 which record his opinion of it to be blank. On the other hand, if he tried the washing powder,

we would expect to find his opinions about it coded in columns 126 to 145. This can be written:

r = (c125’2’) (c(126,145)=$ $)

which says that to be accepted, a record must either have a ’2’ in column 125 and blanks in columns 126 to 145, or something other than a ’2’ in c125 with at least one code somewhere in c(126,145).
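
The equivalence test amounts to asking whether all the expressions are true or all are false. A minimal Python sketch of the routing check above follows; it is not Quantum code, the function names are hypothetical, and c125 is simplified to a single character.

```python
def passes_equivalence(*expressions):
    """True when every expression has the same logical value."""
    return all(expressions) or not any(expressions)


def check_routing(c125, c126_145):
    """Model: r = (c125'2') (c(126,145)=$ $)"""
    return passes_equivalence(c125 == "2", c126_145.strip() == "")


blank = " " * 20
coded = "3" + " " * 19
print(check_routing("2", blank))   # not tried, no opinions: accepted
print(check_routing("2", coded))   # not tried, but opinions coded: error
```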

Actions when a require statement fails

When Quantum executes a require statement, it sets the variable failed_ to True if the data fails

the require statement or to False if the record passed the requirement. You can then test

whether failed_ is True and take whatever actions you wish. For example, if you are checking

that the respondent’s sex is coded as a ’1’ or a ’2’ only, you may wish to blank out the column if it

contains any other code or codes. You could write this as:

r sp c123’12’

if (failed_) set c123’ ’

The test for failure is made on the last require statement executed for the current record.

This may not always be the most recent require statement in the program, and it may not be the

require statement you intend Quantum to execute. If you write:

r sp c112’1/5’

if (c115’1’) r b c116


the test for failure could apply to either of the previous statements. If column 115 does not contain

a ’1’, the second require statement will not be executed and failed_ will be True if column 112 is

not single-coded in the range ’1/5’. If column 115 contains a ’1’, then failed_ will be True if

column 116 is not blank.

You can get around this potential problem by setting failed_ to zero (the equivalent of False) just

before the require statement you wish to test. For instance:

r sp c112’1/5’

failed_ = 0

if (c115’1’) r b c116
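
The effect of resetting failed_ can be traced with a small Python model (a sketch of the behaviour described above, not Quantum code). The function run_edit and its boolean arguments are hypothetical stand-ins for the two require statements and the condition on column 115.

```python
def run_edit(c112_ok, c115_is_1, c116_blank, reset=False):
    """Return the value failed_ would hold after the statements above."""
    failed = not c112_ok          # r sp c112'1/5'
    if reset:
        failed = False            # failed_ = 0
    if c115_is_1:
        failed = not c116_blank   # r b c116 runs only when c115'1'
    return failed


# c112 is wrong and c115 is not '1', so the second require never runs.
print(run_edit(False, False, True))              # stale result from c112
print(run_edit(False, False, True, reset=True))  # reset hides the stale result
```

Without the reset, a later test of failed_ would report the failure on column 112 even though it was meant to report on column 116.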


Data correction

There are four ways to correct data:

o Correct the data in the original data file.

o Correct the data in the C array interactively.

o Replace the incorrect codes with specific codes using edit forcing statements.

o Write a file of corrections to be merged with the original data when it is read in by a

Quantum program.

Forced editing (forced cleaning)

This section does not introduce any new keywords; instead it tells you how to combine the

statements that you already know in order to clean your data.

A record which generates too many error messages, or which is clearly incorrect can be

removed, as noted. Suppose its serial number is 2004. Then we have:

if (c(101,104)=$2004$) reject; return

This rejects the record from the rest of the edit and the tabulation section as well. This statement

should be at the beginning of the edit to avoid unnecessary editing of a useless record.

Columns within a record can be removed by blanking them out or setting them to a common

reject code, often a minus or ampersand. For example:

if(c125n’12’) c125’&’; c(126,145)=$ $

All records in which c125 contains neither a 1 nor a 2 will have the contents of that column

replaced with an ampersand, and whatever is in c(126,145) blanked out. As a real-life example,

suppose a 1 in c125 means that the respondent visited the market, and a 2 in that column means

he did not. Information about purchases made at the market is stored in c(126,145). If column 125 contains neither a 1 nor a 2, we cannot clearly establish whether or not the respondent visited

the market so we set c125 to a special code and blank out any information about purchases.
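
As an illustration, the forcing statement for column 125 could be modelled in Python like this. It is a sketch, not Quantum code: force_clean is a hypothetical name and the record is a plain string with one character per column, so column 125 is the character at index 124.

```python
def force_clean(record):
    """Model: if(c125n'12') c125'&'; c(126,145)=$ $"""
    cols = list(record)
    if cols[124] not in "12":        # column 125 holds neither a 1 nor a 2
        cols[124] = "&"              # set the reject code
        cols[125:145] = " " * 20     # blank columns 126 to 145
    return "".join(cols)


dirty = " " * 124 + "7" + "X" * 20
cleaned = force_clean(dirty)
print(repr(cleaned[124:145]))
```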

Inserting correct data is generally more difficult than removing invalid data, because you very

often don’t know what the correct data is. However, if you do know, you can correct the data

record by record, or make the same correction for any record which is incorrect. For instance:

if(c(101,104)=$2222$) c112’2’; c(113,114)=$ $

corrects the record whose serial number is 2222 by setting a 2 into c112 and blanking out

c(113,114).

If you do not know what the correct data is, you may decide to replace the incorrect code or

codes with a valid code chosen at random. For example:

if (c(101,104)=$3625$) c145=rpunch(’1/5’)

replaces whatever was in column 145 with one of the codes 1 through 5 for the record whose serial number is 3625.

Introduction to the tabulation

When a record has passed through the edit without being rejected, it is passed to the tabulation section, if one exists. At this point, data, integer and real variables are available to create tables. The program deals with one complete record at a time. The tabulation section consists of a series of statements which determine the contents of the tables. Each table may be thought of as a matrix of cells. Each cell of this table is defined by two conditions, one from the row and one from the column.

The hierarchy of the tabulation section

The tabulation process is hierarchical in that characteristics defined at one level apply to that level and all lower levels.

Components of a tabulation program

A tabulation run consists of three sets of control statements:

Run control statements

Run control statements determine the overall characteristics of the run, and contain the text which is constant for all tables. Filters may be defined, applicable either to all tables in the run or to all tables defined before another general filter statement is read. Titles are entered in various ways depending upon their position in the table.

Defining run conditions

To define global and default conditions for the run, type:

a;opt1[; opt2; ... ]

at the start of the tabulation section.

Global run conditions, if any, are defined on the a statement. If used, it must be the first statement

in the tabulation section. Its format is:

a;options

where options are keywords defining the global characteristics of the run. You may list as many

keywords as you like, provided that they are separated by semicolons (;), for example:

a;dsp;op=12;date;dec=1

Some commonly used options and their functions are:

colwid=n Defines the width of columns in the printed tables where no p statements exist in the

column

csort Sort tables column-wise (i.e., horizontal sorting rather than vertical row-wise sorting).

date By default, tables are printed without a date. Use of the keyword date causes the current

date to be printed in the top right-hand corner of each table. The date is in the format dd mm yy.

dec=n This determines the number of decimal places for absolute figures. If

dec= is not used, the default of no decimal places is assumed.

decp=n This sets the number of decimal places required for percentages. The default is decp=1

meaning one decimal place. This applies when op=0, 2, 7 or & (see below). Any number of

decimal places are allowed, as long as you make each column wide enough to accommodate

them.

dsp This leaves one blank line between each row of data in a table. Without this, one line follows

directly underneath another.

flt=name Invokes the filter conditions and titles named on the flt= statement. If the filter defines

conditions, the rules governing data options apply.

flush Causes rows containing percentages to be printed with the percentages directly below the

absolutes rather than one column to the right.

indent=n Where a row text is longer than the space allocated to the row text in the table, Quantum breaks the line between words and continues the text on the next line. To have these continuation lines indented from the left margin, specify the amount of indentation required with indent=.

Texts may be indented by between 0 and 15 spaces: the default is indent=0.

op=n This keyword governs the type of output in the tables. Output types are:

& Total percentages. The value in the cell is percentaged against the number in the upper left-hand corner of the table (normally the base) rather than on the totals in the relevant column or row. If the table contains more than one base element, percentages are calculated using the leftmost figure in the most recent base element.

- Row rank figures are printed below each cell. Figures are ranked within rows, using 1 for the largest figure. Where two or more numbers have the same rank, they are all assigned the lowest rank possible. Thus, if the previous rank was 2 and the next value to be ranked occurs in the row three times, those numbers will all be ranked 5.

0 Row percentages.

1 Absolute figures (default).

2 Column percentages.

3 Column rank figures are printed below each cell. Figures are ranked within columns, using 1 for the largest figure. Where two or more numbers have the same rank, they are all assigned the lowest rank possible. Thus, if the previous rank was 2 and the next value to be ranked occurs in the column three times, those numbers will all be ranked 5.

5 Prints the text 100% on each cell of the base row.

6 Used with op=2 to produce two percentages for each cell.

7 Cumulative percentages.

Indices. The index for a cell is generated by dividing the row percentage in the cell by the row percentage in the base row.

8 Prints absolutes and percentages side by side.

page This option invokes automatic page numbering. Since this is the default – pages are

numbered from 1 automatically – this option is generally used in its negative form of nopage

which suppresses automatic page numbering.

paglen=n This determines the number of lines printed on each page. The default is paglen=60

lines but any value between 10 and 10,000 is valid.

pagwid=n Normally tables can be up to 132 characters wide. pagwid= enables you to decrease

the page width or to extend it to a maximum of 10,000 characters.

pc This prints percent signs after percentage figures. This is the default, so this option is usually

used negatively – nopc – to print percentage figures without percent signs.

sort: Creates sorted or ranked tables.

wm=n This keyword names the weighting matrix to be used.

Table control statements

Table control statements name the questions to be cross-tabulated against each other to create

tables. In Quantum, these questions are called axes. The most important table control statement

is the tab statement which lists the axes to be used to create an individual table. These statements may also specify the text and overall characteristics of each table.

Creating a table

To create a table, type:

tab [axis1] [axis2] [axis3] [axis4] row_axis column_axis [;options]

In order to create a table, Quantum needs to know which is the column axis and which is the row

axis. If the table has more than two dimensions you will need to say which axes should be used

for the extra dimensions. Each table must be created separately using a tab statement, as

follows:

tab row_axis column_axis

Tab statements must precede the axes definitions in your program file.

Multidimensional tables

Multidimensional tables are ones created from more than two axes. They occur when a series of

tables has the same rows and columns, but each table in the group has additional characteristics

which are themselves the conditions of other axes. This sounds complicated, so let’s take an

example.

Our basic table is of age by sex created by the tab statement: tab age sex

We have been asked to produce a separate table of age by sex for each region of the country.

Whereas before each cell had two conditions (age and sex) it now has three (region, age and

sex).

There are two ways of writing this specification. You may either:

a) write as many tab statements as there are regions, and filter each table of age by sex to

include only those respondents resident in a given region, or

b) write a single tab statement to create a three-dimensional table.

Both methods produce the same results – the main advantage of (b) over (a) is that (b) involves

you in a lot less work.

The tab statement to create the multidimensional table is:

tab region age sex

Commonly used options in tab section

sid place this table to the right of the previous one

und place this table underneath the previous one

add add this table to the previous one

div divide the previous table by this one

To place tables side by side, type a tab statement for the first table and follow it with:

sid row_axis column_axis [;options]

Options are any of anlev=, c=, celllev=, inc=, maxim, means, median, minim and wm=. For example, the statements

tab region sex;c=c250’1’

sid region age;c=c254’1’

will place two tables side by side.

To place tables one underneath the other, type a tab statement for the first table and follow it

with:

und row_axis column_axis [;options]

Options are any of anlev=, c=, inc=, maxim, means, median, minim and wm=. For example, the statements

tab region sex;inc=c(25,28)

und region age;inc=c(35,38)

will place the second table underneath the first one.

To add tables, type a tab statement for the first table and follow it with:

add[col_offset[,row_offset] ] axis_names

where axis_names is the same number of axis names as appears on the tab statement. For example:

tab ax01 bk01

add ax02 bk02

Here we are creating the table ax02 by bk02 and adding it to the table ax01 by bk01.

To divide one table by another, define the top table on a tab statement followed by:

div axis_names [;options]

where axis_names is a list of as many axis names as there are on the tab statement, and

options is any of the keywords anlev=, c=, inc=, maxim, means, median, minim or wm=. The

statements:

tab ax06 brk1

div ax07 brk2

produce a table by dividing the table specified on the tab statement (ax06 by brk1) by the table specified on the div line (ax07 by brk2). The div statement therefore defines the denominator.
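As a rough illustration of what adding or dividing tables means, the Python sketch below combines two tables of the same shape. It assumes the operation is applied cell by cell, and it assumes a zero denominator yields 0.0; neither detail is confirmed by the text, and the names combine_tables and safe_divide are hypothetical.

```python
def combine_tables(first, second, op):
    """Combine two equally sized tables (lists of rows) cell by cell."""
    return [[op(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(first, second)]


def safe_divide(a, b):
    # Assumption: a zero denominator gives 0.0 in this sketch.
    return a / b if b else 0.0


# Hypothetical cell counts for the "tab" table and the add/div table.
top = [[10, 20], [30, 40]]
bottom = [[2, 4], [5, 0]]

added = combine_tables(top, bottom, lambda a, b: a + b)
divided = combine_tables(top, bottom, safe_divide)
print(added)
print(divided)
```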

Axis control statements

Broadly speaking, an axis is Quantum’s way of defining questions from the questionnaire. Each

axis consists of a set of statements which establish the conditions and text for the rows and

columns of a table.

The axis is an integral part of your tabulation program: without it there can be no tables. At its

simplest level an axis represents a question on the questionnaire, and contains statements which

define the responses to that question and the codes by which Quantum can identify them.

Each axis may be used to create one or more of the following:

o the rows of a table

o the columns of a table

o a page in a set of tables

o a set of pages in a group of tables

Types of elements within axes

There are four types of element in an axis:

o Text and condition elements

o Text elements

o Arithmetic elements

o Statistical elements

Text and condition elements

These elements contain text and conditions which define the characteristics a respondent must

have to be included in the element. In a simple axis each element will refer to one response to a

question and will produce a row, column or table of figures telling you how many people gave that

response.

The general format of a condition is:

c=logical expression

c=cn’p’ is true if column n contains the code ’p’ and false if it does not.

Most commonly used count-creating elements for tabulation are:

Count-creating elements are the basis of any table since they tell you how many respondents

gave which responses. There are several statements which will create numeric elements; which

you use will depend upon the type of data to be read and the complexity of the condition defining

eligibility for inclusion in the element. Statements are:

n01 used for simple or complex conditions

n15 same as n01 except that the element is not printed

n10 creates a base for percentaging

n11 same as n10 except that the element is not printed

col used for simple conditions

val used for numeric data

fld used for numeric codes

bit a variant of fld

Text elements

These elements create nothing but text; no cells containing counts or values are created from

these elements.

There are three statements which are used within an axis to create text-only elements. These

are:

n03 create a text-only element

n23 create a subheading

n33 continue long element texts

If you would like subheadings to be underlined, place one of the options unl1, unl2 or unl3 on

the n23. The hdlev= keyword allows you to define various levels of subheading, starting at level

1 for the top subheading down to level 9 for the lowest level. If you would prefer the text to be left

justified above the columns to which it refers, add the option hdpos=l to the n23. If you would

prefer the text to be right justified, use hdpos=r instead. (hdpos=c is also available for centered

text but since this is the default you are unlikely to need it).

Arithmetic elements

These are elements which contain arithmetic values rather than counts. For example, one

element may tell you the number of times a product was bought rather than the number of people

who bought it.

Statistical elements

Part of Quantum’s power lies in the fact that it offers you the ability to create various types of

statistical output without having to know the formulae necessary to calculate them. These

elements contain totals, subtotals or statistical functions such as means and standard deviations.

Statements which perform statistical calculations are:

n07 average

n12 mean

n13 sum of factors

n17 standard deviation

n19 standard error of the mean

n20 error variance of the mean

n30 medians

n04 total

n05 subtotal

To define incremental values for means, standard deviations, standard errors and error

variances, type:

n25[element_text]; inc=arith_expr [;c=log_expr] [; row] [; col]

The n25 does not normally print anything in the table. Use row and/or col to print these values as the rows and/or columns of the table.

Factors

fac= defines factors when the numbers in the data are not to be used (e.g., the data may be multicoded), whereas inc=, also mentioned in the Data Options section, reads the data from the column and uses that as the factor for each row. What to use when is best illustrated by examples, although in general you should try to use fac= whenever possible since, in processing terms, it is more efficient than inc=.

The respondent has been asked to say how much he agrees or disagrees with a particular

statement. If he agrees very much, he has a code ’1’ in, say, C210. If he agrees somewhat, he

has a ’2’; if he neither agrees nor disagrees he is coded as ’3’; disagrees somewhat, a ’4’ and

disagrees very much, a ’5’. People who refuse to answer are coded as C210’&’. We wish to

obtain a numerical mean value of these opinions using factors of +2 for agrees very much down

to –2 for disagrees very much. These are not the same as the codes representing these

responses in the data, so we enter them with fac=. People who refused to answer will appear in

the table but will not be included in the mean.

So the axis will look like this:

l vers1

n01Agrees Very Much;c=c210’1’;fac=2

n01Agrees Somewhat;c=c210’2’;fac=1

n01Neither Agrees Nor Disagrees;c=c210’3’;fac=0

n01Disagrees Somewhat;c=c210’4’;fac=-1

n01Disagrees Very Much;c=c210’5’;fac=-2

n01Refused;c=c210’&’

n12Mean;dec=2
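The arithmetic behind the mean row in this axis can be sketched in Python. The counts are invented for illustration, and the function name factor_mean is hypothetical; the sketch assumes the mean is the factor-weighted sum divided by the number of respondents in elements that carry a factor, so that Refused is left out as the text describes.

```python
def factor_mean(elements):
    """elements: list of (count, factor) pairs; factor is None for
    elements that have no fac= (such as Refused)."""
    scored = [(count, fac) for count, fac in elements if fac is not None]
    base = sum(count for count, _ in scored)
    if base == 0:
        return None  # no respondents carry a factor
    return sum(count * fac for count, fac in scored) / base


# Hypothetical counts for codes 1 to 5 and '&' (Refused) in c210.
counts = [(40, 2), (30, 1), (10, 0), (15, -1), (5, -2), (8, None)]
mean = factor_mean(counts)
print(round(mean, 2))
```

Here the eight refusals still exist as a row of counts but contribute nothing to either the numerator or the base of the mean.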

Miscellaneous ‘n’ statements

To define a condition that applies to a group of consecutive elements, type:

n00;c=logical_expression

An n00 defines a condition applicable to all subsequent rows until another n00 is read or

until the end of the axis, whichever is the sooner. Its format is:

n00[;c=condition]

where the condition is any valid logical expression.

To override the automatic page turnover within an axis, insert the statement:

n09[Text]

at the point at which a new page is required. ‘Text’ is an optional text which will be printed beneath the table headings at the top of the next page.

More commands to generate counts

The col statement

To define a list of elements with codes all in the same column, type:

col number;[base;] elm_txt1[=’codes1’] [; elm_txt2[=’codes2’] ... ]

If several consecutive statements in an axis have conditions defined by a code or codes in the

same column, you can save yourself a lot of time and effort by replacing the individual n01

statements with a single col statement.

One of the simplest col statements you can write is:

col n;[base];Rtext1[=’p1’];Rtext2[=’p2’]

where n is the column containing the codes for this question, base creates a base element, and

Rtext1=’p1’, Rtext2=’p2’ and so on define the texts and conditions for the individual elements.

To explain more clearly how the col statement works, let’s take the axis mstat that we wrote

earlier and rewrite it using a col statement. Originally it consisted of five

statements:

n10Base

n01Single;c=c109’1’

n01Married;c=c109’2’

n01Divorced;c=c109’3’

n01Widowed;c=c109’4’

We can replace these with the line:

col 109;Base;Single;Married;Divorced;Widowed
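To make the equivalence concrete, the Python sketch below expands the col line back into the individual statements it replaces. It is a hypothetical helper, not part of Quantum. It assumes that texts without an explicit code take codes 1, 2, 3 and so on in order (as in the mstat example), handles at most nine texts, and writes plain ASCII apostrophes around the codes.

```python
def expand_col(column, texts, base_text=None):
    """Return the n10/n01 statements equivalent to a simple col line."""
    if len(texts) > 9:
        raise ValueError("this sketch only handles codes 1 to 9")
    lines = []
    if base_text is not None:
        lines.append(f"n10{base_text}")
    for code, text in enumerate(texts, start=1):
        lines.append(f"n01{text};c=c{column}'{code}'")
    return lines


for line in expand_col(109, ["Single", "Married", "Divorced", "Widowed"],
                       base_text="Base"):
    print(line)
```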

The val statement

Val is used when the conditions defining eligibility for inclusion in an element are positive

numbers or ranges of positive numbers rather than codes; that is, where the question in the

questionnaire requires a numeric response rather than a single or multicoded answer; for

example, the number of people in the household, or the number of telephone calls made.

To define elements whose condition is that a variable contains a specific value, type:

val variable; = ;number1 [element_txt1];number2 [element_txt2] ...

If the elements contain text as well as a number, the number may appear anywhere in the text. If

the value is not part of the text, type:

val variable; = ;element_text = number; ...

The base, hd=, tx= and =rej options described for col statements are also valid on val

statements of this type.

Val can be used to test whether the value of a variable is equal to a given value. If it is equal, the

cell count is incremented by 1. The format is:

val variable;[Base];[hd=Text];=;[tx=Text];n1 [Text1]; ... ;nn [Textn]

where variable is the data, integer or real variable whose value is to be tested, n1 to nn are the

values against which the variable is to be compared, and Text1 to Textn are the row

descriptions to be printed in the table.

The equals sign indicates that the test is for arithmetic equality rather than ranges. Base, hd=

and tx= are optional and create the base, sub-heading and text-only rows of the table as

described for col statements.

Let’s work through an example to illustrate this. Suppose c(110,111) contains data on the number

of people in the household, and we wish to set up a table showing how many respondents live in

households containing 1, 2, 3, 4, 5 or 6 people, so we write:

val c(110,111);Base;Hd=Number in Household;=;1 Person;2 People;

+3 People;4 People;5 People;6 People

The fld statement

To define elements whose condition is that a field contains a specific numeric code, type:

fld column_specs;element_txt1[=code[,code ...] ]; ...

The base, hd=, tx= and =rej options described for col statements are also valid on fld

statements.

The column specs on a fld statement define the columns to be read. There are three ways of

entering them. First, you may list each column or field reference one after the other, separated by

commas. The list must be enclosed in parentheses. In our example this would be:

fld (c(12,13), c(14,15), c(16,17))

Second, if you have sequential fields as you do here, you can type the start columns of each field

followed by the field length. The list of start columns is separated by commas and enclosed in

parentheses, and the field length comes after the closing parenthesis and starts with a colon. If

you use this notation for the film example you would write:

fld (c12, c14, c16) :2

If you wish, you can abbreviate this further by typing just the start columns of the first and last

fields, followed by the field length. This time you do not use parentheses:

fld c12, c16 :2

Third, if the fields are not sequential, you may list the start columns and field width of each

group of columns (as shown above) and separate each group with a slash. For example, to read

data from columns 12 to 17 and 52 to 57, with each field being two columns wide, you would

type:

fld c12, c16 / c52, c56 :2

This reads c(12,13), c(14,15), c(16,17), c(52,53), c(54,55) and c(56,57).

You can also use this notation for single non-sequential fields. For example:

fld c23 / c36 / c71 :2

means c(23,24), c(36,37) and c(71,72).
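The abbreviated column notation can be checked with a small Python sketch that expands it into explicit fields. The function expand_fld_spec is hypothetical and covers only the unparenthesised forms shown above (start columns, optional last start column, groups separated by slashes, then a colon and the field width).

```python
def expand_fld_spec(spec):
    """Expand e.g. 'c12, c16 / c52, c56 :2' into (start, end) column pairs."""
    groups_part, width_part = spec.rsplit(":", 1)
    width = int(width_part)
    fields = []
    for group in groups_part.split("/"):
        starts = [int(token.strip().lstrip("cC")) for token in group.split(",")]
        first, last = starts[0], starts[-1]
        # Step through every field start between the first and last start column.
        for start in range(first, last + 1, width):
            fields.append((start, start + width - 1))
    return fields


print(expand_fld_spec("c12, c16 / c52, c56 :2"))
print(expand_fld_spec("c23 / c36 / c71 :2"))
```

The first call lists the six fields from c(12,13) to c(56,57); the second lists the three single fields.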

The element specs part of the statement defines the element texts and the codes which represent

those responses. If you enter element texts by themselves, Quantum assumes that the first text is

code 1, the second text is code 2, and so on. The codes apply to all fields named in the column

specs part of the statement. Therefore, to define elements which will count the number of people

who saw each film, you would write:

fld c12,c16:2;Columbus;Aliens 3;Pretty Woman;

+Green Card;Batman 2

Weighting in Quantum

Sometimes in surveys we treat the respondents as representatives of the total population of

which they are a sample. Normally, tables reflect the attitudes of the people interviewed, but we

may want the tables to reflect the attitudes of the total population instead, so that it seems as if

we had interviewed everyone rather than just a sample of the population. This, of course,

assumes that the people interviewed are a truly representative sample.

If we take a sample of 380 from a population of 10,000 middle-aged housewives, and discover

that 57 members of this sample buy cheddar cheese, we may want the number of middle-aged

housewives who buy cheddar cheese to read 1,500 in our tables, not 57. Moving from 57 to 1,500

is the fine art of weighting. In this case, each middle-aged housewife has a weight of 10,000/380.

Since 57 of them buy cheddar cheese, the number in the cell will be:

10000 / 380 * 57 = 1,500
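The same calculation, written out in Python purely as a check of the arithmetic (the variable names are illustrative):

```python
population = 10_000      # middle-aged housewives in the population
sample_size = 380        # respondents interviewed
buyers_in_sample = 57    # respondents who buy cheddar cheese

weight = population / sample_size           # weight per respondent
weighted_buyers = weight * buyers_in_sample

print(round(weight, 2), round(weighted_buyers))
```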

Weighting is also used to correct biases that build up during a survey. For example, when conducting interviews by telephone you may find that 60% of the respondents were women. You may then want to correct this ratio of men to women to make the two groups more evenly balanced.

Weighting methods

Quantum is sufficiently flexible to allow more than one set of weights for a given set of

respondents. Which set is applied is determined by options on the a, sectbeg, flt or tab

statement or on the statements which create the individual rows or columns of a table. Each set

of weights, however, will apply one weight for each respondent. There are two ways of calculating

weights:

a) The weight for each respondent may be part of the data for that respondent, or it may be

calculated in the edit and passed to the tabulation section as a variable.

b) The more common method of weighting is to define a set of characteristics and apply specific weights to respondents satisfying those characteristics.

Types of weighting

Quantum offers factor, target and rim weighting, preweights, postweights, weighting using

proportions and weighting to a given total.

Factor weighting

With factor weighting, every record which satisfies a given set of conditions is assigned a specific

weight. You would generally use it when the weights are calculated outside of Quantum – for

instance, you may be told that all unemployed people in London require a weight of 10.5,

whereas unemployed people in the rest of the country need a weight of 7.3.

Target weighting

Target weights may be used when you know the exact number of respondents you want to

appear in each cell of the weighted table. For example, in a table of age by sex, you may know

the exact number of men under 21, women under 21, and so on, to appear in the table once it

has been weighted. The weights that you define in your matrix are therefore the values to appear

in the weighted table rather than the weights to be applied to each respondent of a given age and

sex.
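A minimal Python sketch of the idea, assuming the weight for each cell is simply the target divided by the unweighted count in that cell (preweights and other refinements are ignored). The cell labels, counts and targets are invented, and target_weights is a hypothetical name.

```python
def target_weights(sample_counts, targets):
    """Weight per respondent in each cell = target / unweighted count."""
    return {cell: targets[cell] / count
            for cell, count in sample_counts.items() if count > 0}


# Hypothetical unweighted counts and targets for two cells of age by sex.
sample_counts = {("under 21", "men"): 20, ("under 21", "women"): 25}
targets = {("under 21", "men"): 300, ("under 21", "women"): 250}

weights = target_weights(sample_counts, targets)
print(weights)
```

Each of the 20 men under 21 then counts as 15 respondents, so the weighted cell shows the target of 300.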

Rim weighting

Rim weighting is used when:

a) you want to weight according to various characteristics, but do not know the relationship of the

intersection of those characteristics, or

b) you do not have enough respondents to fill all the possible cells of the table if you were to

weight the data using the multidimensional technique described above.

For example, you may want to weight by age, sex and marital status and may know the weights

for each category of those characteristics (e.g. people aged 25 to 30; men; single people).

However, you may not know the weights for, say, single men aged between 25 and 30, married

women aged between 31 and 40, and so on.
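One common way to weight to marginal totals only is raking (iterative proportional fitting), sketched below in Python. This is a generic technique shown for illustration; it is not claimed to be the algorithm Quantum uses for rim weighting, and the sample, the targets and the names rake and weighted_totals are all hypothetical.

```python
def rake(respondents, targets, iterations=50):
    """Repeatedly scale weights so that, for each characteristic in turn,
    the weighted totals match that characteristic's targets."""
    weights = [1.0] * len(respondents)
    for _ in range(iterations):
        for dim, dim_targets in targets.items():
            totals = weighted_totals(respondents, weights, dim)
            weights = [w * dim_targets[person[dim]] / totals[person[dim]]
                       for person, w in zip(respondents, weights)]
    return weights


def weighted_totals(respondents, weights, dim):
    """Sum of weights for each category of one characteristic."""
    totals = {}
    for person, w in zip(respondents, weights):
        totals[person[dim]] = totals.get(person[dim], 0.0) + w
    return totals


# Hypothetical sample of five respondents and targets for each rim.
sample = [
    {"sex": "m", "age": "young"},
    {"sex": "m", "age": "old"},
    {"sex": "f", "age": "young"},
    {"sex": "f", "age": "old"},
    {"sex": "f", "age": "old"},
]
targets = {"sex": {"m": 50, "f": 50}, "age": {"young": 40, "old": 60}}

weights = rake(sample, targets)
print(weighted_totals(sample, weights, "sex"))
print(weighted_totals(sample, weights, "age"))
```

Only the totals for each characteristic are specified; the weights for combinations such as young men fall out of the iteration rather than being supplied.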

Entering weights as proportions (input weighting)

When we were talking about target weighting, we said that sometimes you might not know the

actual counts of respondents in a group, even though you may know that the group is a certain

percentage or proportion of the total population. For instance, you may know that 60% of the

population is women, but you may not know how many women that represents.

When this happens, you can enter the percentages or proportions as the weights for each group,

and use the keyword input to indicate that these figures should be used as targets. For example,

in a table of age by sex you would enter the proportion or percentage that each combination of

age and sex is of the total population, and Quantum would calculate what weight to assign to

each respondent in each category.

Weighting to a given total

When you define targets which add up to more than the number of respondents in your sample,

Quantum will calculate the weights for each respondent such that the total for the weighted table

equals the total of the figures in the weighting matrix. You may define your own total figure

(usually the number of respondents in your sample) using the keyword total=n, where n is the

required weighted total. Quantum will then calculate the weights according to the values in the

weighting matrix and will then adjust them to match the total you have defined.
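The final adjustment can be pictured as a proportional rescaling, sketched here in Python. This assumes every weight is multiplied by the same factor so that the weights sum to the requested total; the text does not spell out the exact method, and scale_to_total and the sample weights are hypothetical.

```python
def scale_to_total(weights, total):
    """Scale all weights by one factor so that they sum to `total`."""
    current = sum(weights)
    if current == 0:
        raise ValueError("weights sum to zero")
    factor = total / current
    return [w * factor for w in weights]


# Hypothetical respondent weights summing to 5, rescaled to a total of 4.
raw_weights = [1.5, 0.5, 2.0, 1.0]
adjusted = scale_to_total(raw_weights, total=4)
print(adjusted, sum(adjusted))
```

Because every weight is scaled by the same factor, the relative sizes of the weights, and therefore the weighted percentages, are unchanged.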

Preweights

Preweights, stored as part of each respondent’s data or created during the edit, are applied to

individual records before target or factor weighting is applied. When the characteristic weights are

targets, the preweights are used in the calculation of the weight for each respondent.

Postweights

The opposite of preweights are postweights, which are applied after all other weights have been applied, and therefore have no effect on the way in which targets are reached. They are generally used to make a final adjustment to a specific item.

Descriptive statistics

Quantum provides facilities for calculation of a set of basic statistics from the figures produced in

Quantum tabulations. They include the statistics most commonly used for testing hypotheses

about the values of proportions (percentages) and the locations (average values) of variables,

and about differences in these between two or more subsets of the data. There are also chi-

squared statistics for testing hypotheses about a single distribution or about differences between

two or more distributions.

The statistical tests available are:

o One-dimensional, two-dimensional and single classification chi-squared tests

o Four tests of differences between proportions (Z-tests)

o Two tests of differences between means (T-tests)

o Friedman’s test of differences in location between a set of related samples (sometimes

known as ‘Friedman’s two-way analysis of variance’)

o Kolmogorov-Smirnov test of differences between two samples

o McNemar’s test of the significance of changes

o F Test for testing differences between a set of means (one-way analysis of variance

(ANOVA))

o Newman Keuls test of differences between means

For each statistic, Quantum also calculates and prints an associated significance level so that you can readily see the results of the tests you have performed.

Quanvert

Quanvert is the windowed version of the Quantum database. In other words, it is the GUI for

Quantum. Quanvert can process surveys of any type, size or complexity. Whether it's a survey

with hundreds of questions, or millions of respondents, or one that's been conducted on a regular

basis for years - Quanvert can handle it fast.

Quanvert has been specifically designed for the market researcher. You don't have to be a data

processing or computer expert, or a statistician; you just have to be interested in your survey

results! And you can investigate your data from your desktop, without having to search through

volumes of printed reports. There is no need to predict what analyses you will require before you

receive your data.

Any table can be created based on any variable or question. You can test out any hypothesis,

and dig as deep into the data as you wish. For instance, you may want to examine the age group

of people who responded positively to an advertisement. You can then take this a stage further

and produce a series of tables filtered on those females interviewed. Quanvert is especially

powerful for analyzing individual responses to verbatim or "open" questions.

How is the database produced?

Quanvert databases are specified and created using Quantum, SPSS MR's leading package for

editing, weighting and tabulating survey data. Quantum is already renowned in its own right as

the most powerful tabulation system available today. You can create the database yourself using

Quantum.

Preparing Quanvert database using Quantum

Before you can convert a Quantum spec and data file into a Quanvert database there are several

tasks you may need to carry out first. These include checking the Quantum program to ensure

that it will create the required information in the appropriate places, and setting up subdirectories

if variables are not to be stored in the main project directory. If you have a large database from

which you require only a few variables, you may use the raw Quantum data rather than creating a

full Quanvert database.

To create a Quanvert database, give the following command at the command prompt:

quantum –v [–pd dir_1] [–td dir_2] [prog_file] [data_file]

The –v parameter tells Quantum not to produce tables but, when it reaches the output

stage, to run the flip program instead.

The –pd and –td parameters allow you to read files from and create temporary files in directories

other than the directory in which you are running Quantum.

All Quanvert projects originate from Quantum. Although Quanvert produces tables identical to

those generated by Quantum, it does not normally use the raw data and Quantum program files.

Instead, it uses a series of compressed data and axis files, one pair per axis, derived from the

Quantum files. These individual databases are referred to as inverted or transposed databases,

and the process which creates them is called flipping. In databases with simple axes it is

possible to run Quanvert almost immediately on the raw Quantum data.

Files created by flip

Flip creates a number of files. The ones which are important to Quanvert are:

Filename Contents

*.ax axes text files

*.fli inverted data files

*.inc numeric variables (inc) files

*.mul values for numeric variables in axes

*.bit bit files for named filters

*.btx text for named filters

*.alp text (alphanumeric) variables files

axes.inf names of axes present in the database

incs.inf names of numeric variables present in the database

alpha.inf names of text variables present in the database

bits.inf names of named filters present in the database

qvinfo levels and weighting information

qvlvmn levels cross-reference files defining the relationship between the

higher level m and the lower level n

seg1.qv default run conditions and titles from the a statement

wmvalsn.q weights for weight matrix n

The sex axis, for instance, will have two files: sex.ax containing the element texts and sex.fli containing the inverted data for that axis.

To tidy a directory once the database has been created, type:

flipclean [–a] under Unix or: flipclea [–a] under DOS.

This deletes any temporary files created during the flip process but leaves intact any files which

are needed for Quanvert.

Example structure of a Quantum spec

A typical program might look like this:

Struct;read=2;ser=c(5,8);crd=c(9,10);max=32      Structure of the record

*include vars                                    External variables and arrays are declared in a
                                                 file called vars, included before the edit section

ed
*include edit                                    Edit section: calculations of counts and column
end                                              settings for counts which are not straightforward

a;dsp;spechar=–*;decp=1;flush;wm=0;axttr;        Global commands which control the overall
+dec=0;rinc;acr100;dp;nsw;nopage;notype;         characteristics of a run
+paglen=64;pagwid=145;

wm1 wax1 wax2;rim;input;                         Weighting of the data in the output (if required)
+20;30;50;
+50;50;
+33;33;33

*include tabs                                    Details of what is to be tabulated with what
                                                 in order to get a table

*include axes                                    Definitions of all variables used as rows

*include breaks                                  Definitions of all variables used as columns
