sas fundamentals 1.1

Upload: bommineni-pramodh-reddy

Post on 05-Apr-2018

  • 8/2/2019 SAS Fundamentals 1.1

    SAS Fundamentals

    Prepared By: G. Narasimhan, TCS, Sholinganallur, 07/07/2003

    SAS Overview

    SAS stands for Statistical Analysis System

    Developed by SAS Institute

    High Level Language

    Fourth Generation Language

    SAS Software

    There is a variety of SAS software products. A few of the most widely used are:

    BASE SAS

    SAS/ACCESS

    SAS/AF

    SAS/CONNECT

    SAS/FSP

    SAS/GRAPH

    SAS/NVISION

    SAS/SHARE

    SAS Software

    SAS Software    Applications
    BASE SAS        Report generation, mathematical and data analysis
    SAS/ACCESS      Interface to external databases such as DB2
    SAS/AF          User-friendly windowing applications
    SAS/CONNECT     Communication with remote SAS sessions
    SAS/FSP         Interactive data entry and retrieval facilities
    SAS/GRAPH       Visual representation of data analysis
    SAS/NVISION     Creation of 3D objects, animation and prototyping
    SAS/SHARE       Concurrent update access to SAS files

    Applications

    Statistical & Mathematical analysis

    Report Generation

    Graphics

    Business Forecasting

    Animation

    Modelling

    Data Analysis

    Operations Research

    Features of SAS

    Portability

    Free format language

    Statements are not case sensitive

    Provides a macro facility to simplify code development

    Keywords can be used as variable names

    Declaration of variables not required

    Statements can span more than one line

    More than one statement can be placed on a single line
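
    The free-format features listed above can be seen in a short sketch. The dataset name demo and its variables are hypothetical and chosen only for illustration.

```sas
/* Statements are free-format and not case sensitive. */
data demo; x = 1; y = 2;      /* several statements on one line        */
  total = x
        + y;                  /* one statement spanning two lines      */
run;

PROC PRINT DATA=demo; RUN;    /* upper case works the same as lower    */
```

    No variable is declared before use: x, y and total are created the first time they appear in the DATA step.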

    Understanding SAS Dates

    In SAS, every date is stored as a unique number: the number of days between January 1, 1960 and that date

    January 1, 1960 itself is day 0

    Dates before January 1, 1960 are stored as negative numbers

    Dates after January 1, 1960 are stored as positive numbers

    Understanding SAS Dates

    Date            SAS date value
    Jan. 1, 1959    -365
    Jan. 1, 1960       0
    Jan. 1, 1961     366
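
    The date values in this table can be checked with a small DATA step. This is only a sketch: the variable names d1 to d3 are hypothetical, and the dates are written as SAS date constants ('ddmmmyyyy'd).

```sas
data _null_;
  d1 = '01JAN1959'd;
  d2 = '01JAN1960'd;
  d3 = '01JAN1961'd;
  put d1= d2= d3=;        /* writes d1=-365 d2=0 d3=366 to the log   */
  put d3 date9.;          /* same value shown as a date: 01JAN1961   */
run;
```

    The value for January 1, 1961 is 366 rather than 365 because 1960 was a leap year.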

    Rules of SAS Language

    Every statement must end with a semicolon

    Variable names must begin with a letter or an underscore and should not exceed eight characters (SAS Version 7 and later allow up to 32 characters)

    Variable names should not have embedded blanks

    Variable names should not duplicate the names of SAS automatic variables (such as _N_ and _ERROR_)

    Comments can be delimited by /* and */, or written as a statement that begins with * and ends with a semicolon

    SAS Variables

    SAS variables can be classified as follows:

    Numeric Variables

    Character Variables

    Macro Variables

    SAS Variables

    Numeric Variables

    Store a maximum of 8 bytes

    Character Variables

    Store a maximum of 200 bytes (SAS Version 7 and later allow up to 32,767 bytes)

    Macro Variables

    Referenced with an & prefix

    Attributes of a SAS Variable

    Every SAS variable can have a maximum of six attributes assigned to it. They are:

    NAME - Variable name

    TYPE - Variable type

    LENGTH - Number of bytes the variable can store

    FORMAT - Instructions that tell the SAS system how to write data values

    INFORMAT - Instructions that tell the SAS system how to read data values into variables

    LABEL - Label assigned to a variable
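
    As a minimal sketch, the six attributes can be set explicitly with the LENGTH, INFORMAT, FORMAT and LABEL statements and then inspected with PROC CONTENTS. The dataset staff, its variables, the sample rows and the macro variable cutoff are all hypothetical.

```sas
%let cutoff = 30;                       /* macro variable, referenced as &cutoff */

data staff;
  length name $ 15 age 8;               /* NAME, TYPE and LENGTH                 */
  informat joined date9.;               /* INFORMAT used when reading joined     */
  format joined date9. salary comma10.; /* FORMAT used when writing values       */
  label name   = 'Employee Name'
        joined = 'Date of Joining';     /* LABEL                                 */
  input name $ age joined salary;
  senior = (age > &cutoff);             /* 1 when age exceeds the cutoff, else 0 */
  datalines;
Anil 22 01JAN2001 34500
Roopa 41 15MAR1998 26000
;
run;

proc contents data=staff;               /* lists the attributes of each variable */
run;
```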

    Selected INFORMATS

    Informats are special instructions used to read data values into a variable.

    Some of the informats are:

    Informat    Description
    $CHARw.     Reads character data, preserving leading blanks
    COMMAw.d    Reads numeric data, removing embedded characters such as commas
    DATEw.      Reads dates in the form ddmmmyy (for example, 15MAR03)
    $w.         Reads standard character data

    Selected FORMATS

    Formats are special instructions the SAS System uses to write (display) the data values stored in variables.

    E.g.:

    Format       Description
    $CHARw.      Writes standard character data
    BESTw.       Default format for writing numeric values
    COMMAw.d     Writes numeric values with commas separating every three digits
    DATEw.       Writes date values in the form ddmmmyy
    DOLLARw.d    Writes numeric values with dollar signs, commas and decimal points
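
    A short sketch of informats and formats working together follows. The single data line and the variable names are hypothetical; the informats read the raw text and the formats control how the stored values are written to the log.

```sas
data _null_;
  /* Informats: name from columns 1-10, amount from 12-21, hired from 23-31 */
  input @1 name $char10. @12 amount comma10. @23 hired date9.;

  /* Formats: the same stored values written in different ways */
  put name= $char10.;
  put amount= dollar14.2;
  put amount= comma12.;
  put hired= date9.;
  datalines;
Ramesh     1,234,567  15MAR2003
;
run;
```

    The COMMAw.d informat strips the commas on input, so amount is stored as an ordinary number and can be displayed with any numeric format.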

    SAS Automatic Variables

    Two automatic variables (numeric) are created for each DATA step. They are:

    _N_

    _ERROR_

    _N_ : Initially set to 1. Increments by 1 each time the DATA step iterates.

    _ERROR_ : Default value is 0. Set to 1 whenever an error is encountered.
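
    Automatic variables are not written to the output dataset, so the sketch below copies them into ordinary variables to make them visible. The dataset check and the variables obs_no and err are hypothetical; the second data line is deliberately invalid for a numeric variable.

```sas
data check;
  input age;
  obs_no = _n_;        /* iteration counter: 1, 2, 3                   */
  err    = _error_;    /* 1 on the iteration where the input was bad   */
  datalines;
25
abc
40
;
run;

proc print data=check;
run;
```

    On the second iteration SAS cannot read abc as a number, so age is set to missing and _ERROR_ becomes 1 for that observation only.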

    SAS Operators

    SAS operators can be classified as:

    Arithmetic Operators

    Comparison Operators

    Logical Operators

    Arithmetic Operators

    Operation         Symbol    E.g.       Meaning
    Addition          +         X=Y+Z;     Adds Y and Z
    Subtraction       -         X=Y-Z;     Subtracts Z from Y
    Multiplication    *         X=Y*Z;     Multiplies Y by Z
    Division          /         X=Y/Z;     Divides Y by Z
    Exponentiation    **        X=Y**Z;    Raises Y to the power Z

    Comparison Operators

    Symbol      Mnemonic Operator    Meaning
    =           EQ                   Equal to
    ~= or ^=    NE                   Not equal to
    >           GT                   Greater than
    <           LT                   Less than
    >=          GE                   Greater than or equal to
    <=          LE                   Less than or equal to

    Logical Operators

    Symbol    Mnemonic Operator
    &         AND
    |         OR
    ~ or ^    NOT

    Other Operators

    Symbol    Meaning
    ||        Concatenation
    ><        Minimum, equivalent to MIN( )
    <>        Maximum, equivalent to MAX( )
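
    The operator groups above can be tried together in one DATA step. This is a sketch with hypothetical variable names; comparison and logical expressions evaluate to 1 (true) or 0 (false).

```sas
data _null_;
  y = 7;
  z = 2;

  sum  = y + z;                     /* arithmetic                     */
  diff = y - z;
  prod = y * z;
  quot = y / z;
  pow  = y ** z;

  bigger = (y gt z);                /* comparison, mnemonic form      */
  both   = (y > 0 & z > 0);         /* logical AND                    */

  full = 'SAS' || ' ' || 'Base';    /* concatenation                  */
  lo   = y >< z;                    /* minimum of y and z             */
  hi   = y <> z;                    /* maximum of y and z             */

  put sum= diff= prod= quot= pow=;  /* sum=9 diff=5 prod=14 quot=3.5 pow=49 */
  put bigger= both= full= lo= hi=;  /* bigger=1 both=1 full=SAS Base lo=2 hi=7 */
run;
```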

    SAS Program

    A SAS program is a collection of SAS steps. A SAS step can be either a

    DATA step (or)

    PROC step

    A DATA step starts with the keyword DATA. It is used for creating SAS datasets

    A PROC step starts with the keyword PROC. It is used for processing SAS datasets

    SAS Step Boundaries

    A SAS step boundary can be identified in two ways:

    RUN statement

    Beginning of the next SAS step

    A RUN statement marks the end of a SAS step. If no RUN statement is present, the beginning of the next SAS step marks the end of the previous SAS step.

    SAS Step Boundaries

    SAS step boundary terminated by a RUN statement

    E.g.:

    /* SAS step1 */
    DATA sample;
       INFILE ext;                    /* ext is a fileref for the external file */
       INPUT @001 name $ @030 age;    /* $ marks name as a character variable   */
    RUN;

    /* SAS step2 */
    PROC PRINT DATA = sample;
    RUN;

    SAS Step Boundaries

    SAS step boundary terminated without a RUN statement

    E.g.:

    /* SAS step1 */
    DATA sample;
       INFILE ext;
       INPUT @001 name $ @030 age;    /* $ marks name as a character variable */

    /* SAS step2: the PROC statement ends step1 */
    PROC PRINT DATA = sample;
    RUN;

    Running SAS Programs

    A SAS program can be run in either Batch Mode or Interactive Mode.

    Running SAS Programs

    Batch Mode

    In this mode, SAS programs are submitted through JCL

    Interactive Mode

    This mode requires a SAS session called the Display Manager Session (DMS), which is a user-friendly environment. The DMS screen is split into two halves, namely the SAS Log and the Program Editor.

    The upper half contains the SAS Log and the lower half contains the Program Editor. SAS programs can be keyed in and executed from the Program Editor.

    SAS Output

    When you execute a SAS program, the output generated by SAS is divided into two major parts, namely

    (i) SAS Log and (ii) Output

    SAS Log

    Contains information about the processing of the SAS program, including any warning and error messages

    Output

    Contains reports generated by SAS procedures and DATA steps

    In batch SAS, output is routed to SASLIST by default. The output can also be routed to an external file with the help of the FILE statement.
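
    The difference between the two destinations can be sketched with PUT statements. By default PUT writes to the SAS Log; after a FILE PRINT statement it writes to the standard output destination instead. The message text is hypothetical.

```sas
data _null_;
  put 'This line is written to the SAS Log';

  file print;          /* switch PUT output to the standard output destination */
  put 'This line is written to the Output';
run;
```

    Replacing PRINT with a fileref in the FILE statement would route the second line to an external file instead.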

    SAS Data Library: General Classification

    SAS Data Library
       SAS Datasets:  members of type DATA, members of type VIEW
       SAS Catalogs:  members of type CATALOG
       Other Files:   members of type ACCESS, members of type PROGRAM

    SAS Data Libraries

    A SAS Data Library is a collection of SAS files.

    Each SAS library can be classified as a

    Work (or) Temporary Library

    Permanent Library

    A work library is deleted at the end of a SAS session and is not available in later sessions. It is referenced by a one-level name or with WORK as the first qualifier

    A permanent library is not deleted at the end of a SAS session and remains available. It is referenced by a two-level name. The first-level name should not be WORK

    Concept of Datasets

    A SAS Dataset can be viewed as a rectangular structure of rows and columns that holds a number of records.

    Each row is referred to as an Observation (logically one record) and each column is a Variable whose name is the field name.

    A SAS Dataset is different from a Mainframe Dataset

    SAS Datasets are recognised only by the SAS System.

    SAS Datasets cannot be accessed directly from programming languages other than SAS.

    The SAS System can, however, fetch data from Mainframe Datasets.

    SAS Data Set

    (Diagram: a SAS data set consists of descriptor information followed by the data values, in which each row is an observation and each column is a variable.)

    Descriptor Portion

    The descriptor portion of a SAS data set contains:

    General information about the SAS data set (data set name, number of observations, and so on)

    Variable attributes (variable name, type, length, position, informat, format, label)

    SAS Dataset Types

    A SAS Dataset is one of two types:

    SAS Data File (contains data values)

    SAS Data View (does not contain data values)

    Parts of a Dataset

    Every SAS data set name has three elements, namely

    Libref

    Data set name

    Member type

    The general form of a SAS data set name is:

    libref.data-set-name.member-type

    where libref is the logical name of a SAS data library, data-set-name is the dataset name, and member-type is DATA for SAS data files and VIEW for SAS data views (the member type is assigned by the SAS system).

    Referencing a SAS Data set

    A SAS data set can be referenced by a two-level or a one-level name.

    Two-Level Name

    SAS data sets referenced this way are stored permanently in a SAS library

    General form: libref.data-set-name

    One-Level Name

    SAS data sets referenced this way are deleted at the end of the current SAS session

    General form: data-set-name

    These data sets are assigned to a scratch library called the WORK library.
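
    A minimal sketch of one-level and two-level names follows. The libref mylib, the directory path and the dataset names are hypothetical; the directory is assumed to exist already, and on a mainframe the libref would more typically be assigned through a DD statement in the JCL.

```sas
libname mylib '/tmp/saslib';   /* hypothetical path for a permanent library      */

data scratch;                  /* one-level name: stored as WORK.SCRATCH         */
  x = 1;
run;

data work.scratch2;            /* WORK as the first qualifier: also temporary    */
  x = 2;
run;

data mylib.saved;              /* two-level name: kept after the session ends    */
  x = 3;
run;

proc contents data=mylib._all_ nods;   /* list the members of the permanent library */
run;
```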

    Getting Information from Raw Data

    Raw Data -> DATA Step -> SAS Datasets -> PROC Steps -> Information

    Where do the data come from?

    Input to a SAS Dataset can be any external file: a TSO file, a PS file, a member of a PDS, a VSAM file, another SAS Dataset or even an Excel file.

    Data retrieved from external sources SHOULD be converted to a SAS Dataset, because the SAS System recognises only SAS Datasets.

    DATA Step

    SAS stores information in the form of SAS Datasets.

    A SAS Dataset is created by using the DATA statement.

    E.g.: DATA details;

    The above statement creates a SAS Dataset called details.

    Data can be supplied to a SAS Dataset through the INFILE, CARDS or CARDS4 statement
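
    As a sketch of the CARDS route, the step below builds the details dataset from data lines placed inside the program itself. The variables empno and name and the sample rows are hypothetical.

```sas
data details;
  input empno name $;    /* $ marks name as a character variable      */
  cards;
101 Anil
102 Roopa
103 Kiran
;
run;

proc print data=details;
run;
```

    CARDS4 works the same way but ends the data with a line of four semicolons, which allows the data lines themselves to contain semicolons.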

    NULL DATA Step

    _NULL_

    A DATA step that names _NULL_ executes without creating a dataset, so no observations are stored

    Used for printing purposes

    Used for writing data onto external files

    E.g.: The following step writes the contents of a SAS dataset called details into an external file called outfile. The physical name of outfile is specified in the JCL.

    DATA _NULL_;
       SET details;
       FILE outfile;
       PUT @001 EMPNO
           @005 NAME;
    RUN;

    DATAn Naming Convention

    When no dataset name is specified on a DATA statement, SAS automatically names the datasets it creates DATA1, DATA2, ..., DATAn.

    This is called the DATAn naming convention.
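
    The convention can be observed with two unnamed DATA steps. This sketch assumes a fresh SAS session in which no DATAn datasets exist yet.

```sas
data;                        /* no name given: created as WORK.DATA1 */
  x = 1;
run;

data;                        /* next unnamed step: WORK.DATA2        */
  x = 2;
run;

proc print data=data2;       /* the generated name can be used like any other */
run;
```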

    Referencing the external file

    The INFILE statement is used to reference the external file where the raw data is available. Using the INPUT statement, the data can be retrieved from the external file.

    E.g.:

    DATA detail;
       INFILE 'uhgxn0.extrnl.file';
       INPUT @001 Name $      /* $ marks Name as a character variable */
             @032 Age
       ;
    RUN;

    These statements create a SAS Dataset called detail containing two fields, Name and Age. The values for Name are taken starting at column 1 of the external file (uhgxn0.extrnl.file) and the values for Age are taken starting at column 32.

    Reading a SAS Data Set

    DATA ia.aircraftcap;
       SET ia.aircraftdata;
    RUN;

    The DATA step reads IA.AIRCRAFTDATA and writes the same observations to IA.AIRCRAFTCAP.

    IA.AIRCRAFTDATA

    Model     AircraftID    InService
    MF4000    010012        10890
    LF5200    030006        10300
    LF5200    030008        11389

    IA.AIRCRAFTCAP

    Model     AircraftID    InService
    MF4000    010012        10890
    LF5200    030006        10300
    LF5200    030008        11389

    Combining Datasets

    Datasets can be combined in the following ways:

    Concatenation

    Merging

    Interleaving

    Updating

    Combining Datasets

    Concatenation

    Combines two or more datasets, one after the other, into a single dataset. This is accomplished using the SET statement.

    Interleaving

    Combines individually sorted datasets into one sorted dataset. This is accomplished using the SET statement with a BY statement.

    Merging

    Combines observations from two or more datasets into a single observation in a new dataset. This is accomplished using the MERGE statement, with or without a BY statement.

    Updating

    Replaces the values of variables in one dataset (Master Dataset) with non-missing values from another dataset (Transaction Dataset). This is accomplished using the UPDATE statement with a BY statement.

    Concatenation

    DATA list3;
       SET list1 list2;
    RUN;

    LIST1              LIST2
    Name      Age      Name      Age
    Arijit    20       Rohit     18
    Mohan     22       Raj       33
    Kumar     54       Sekar     17

    LIST3
    Name      Age
    Arijit    20
    Mohan     22
    Kumar     54
    Rohit     18
    Raj       33
    Sekar     17

    Merging

    Merging can be classified as:

    One-to-One Merging

    Match Merging

    One-to-One Merging

    In one-to-one merging, no BY statement is used. The SAS system combines the first observation from all datasets named in the MERGE statement into the first observation in the new dataset, the second observation from all datasets into the second observation in the new dataset, and so on.

    Match Merging

    In match merging, datasets are merged according to the variables mentioned in the BY statement. Each input dataset must already be sorted (or indexed) by those variables.

    One-to-One Merging

    DATA newpay;
       MERGE payroll increase;
    RUN;

    PAYROLL                  INCREASE
    Name     Age    Sex      Name     Salary
    Anil     22     M        Anil     34500
    Roopa    21     F        Roopa    26000
                             Kiran    22000

    NEWPAY
    Name     Age    Sex    Salary
    Anil     22     M      34500
    Roopa    21     F      26000
    Kiran    .             22000

    Match Merging

    DATA work.three;
       MERGE work.one work.two;
       BY X;
    RUN;

    WORK.ONE      WORK.TWO
    X    Y        X    Z
    1    A        1    A1
    2    B        1    A2
    3    C        2    B1
                  3    C1
                  3    C2

    WORK.THREE
    X    Y    Z
    1    A    A1
    1    A    A2
    2    B    B1
    3    C    C1
    3    C    C2

    Interleaving

    DATA list3;
       SET list1 list2;
       BY name;
    RUN;

    LIST1            LIST2
    Name     Age     Name       Age
    Anil     33      Karthik    17
    Sunil    12      Prakash    43

    LIST3
    Name       Age
    Anil       33
    Karthik    17
    Prakash    43
    Sunil      12

    Updating

    DATA newpay;
       UPDATE payroll increase;
       BY Name;
    RUN;

    PAYROLL (master)      INCREASE (transaction)
    Name     Salary       Name     Salary
    Anil     24500        Anil     34500
    Hari     32000        Hari     .
    Kiran    16000        Kiran    26000

    NEWPAY
    Name     Salary
    Anil     34500
    Hari     32000
    Kiran    26000

    PROCEDURES

    Procedures are used to perform operations on SAS Datasets

    SAS provides a large number of procedures

    Procedures are invoked with the keyword PROC

    A few of the most widely used procedures, namely CONTENTS, SORT and PRINT, are discussed below.

    CONTENTS Procedure

    Used to browse the descriptor portion

    Gives information about the dataset and the variables present in the dataset

    General form of the CONTENTS procedure:

    PROC CONTENTS DATA=SAS-data-set;
    RUN;

    PROC CONTENTS

    E.g.:

    DATA ONE;
       /* NAME is read from columns 1-15 and AGE from columns 20-21 of each data line */
       INPUT @001 NAME $15. @020 AGE 2.;
       CARDS;
    RAMESH             12
    GOPAL              34
    RAJU               07
    ;
    RUN;

    PROC CONTENTS DATA = ONE;
    RUN;

    SORT Procedure

    Rearranges the observations in a SAS data set

    Can create a new SAS data set containing the rearranged observations (with the OUT= option); otherwise the input data set is replaced

    Can sort on multiple variables

    Sorts in ascending (default) or descending order

    Does not generate printed output

    Treats missing values as the smallest possible value
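
    A short sketch of these points: a descending sort written to a new dataset, with one missing value to show where it lands. The dataset names staff and staff_by_age and the sample rows are hypothetical.

```sas
data staff;
  input name $ age;
  datalines;
Ramesh 12
Gopal 34
Raju .
;
run;

/* OUT= keeps STAFF unchanged and writes the sorted copy to STAFF_BY_AGE */
proc sort data=staff out=staff_by_age;
  by descending age;
run;

/* PROC SORT prints nothing itself, so a PROC PRINT step shows the result */
proc print data=staff_by_age;
run;
```

    Because a missing value is treated as the smallest value, Raju appears last in the descending order (Gopal, Ramesh, Raju) and would appear first in an ascending sort.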

    Sorting a SAS Data Set

    General form of the PROC SORT step:

    PROC SORT DATA=input-SAS-data-set
              OUT=output-SAS-data-set
              options;
       BY by-variable(s);
    RUN;

    PROC SORT

    E.g.:

    DATA ONE;
       /* NAME is read from columns 1-15 and AGE from columns 20-21 of each data line */
       INPUT @001 NAME $15. @020 AGE 2.;
       CARDS;
    RAMESH             12
    GOPAL              34
    RAJU               07
    RAJU               07
    ;
    RUN;

    PROC SORT DATA = ONE NODUPLICATES;
       BY NAME;
    RUN;

    PROC SORT

    The data set WORK.ONE will contain only three records. The duplicate RAJU record is deleted.

    Name      Age
    GOPAL     34
    RAJU       7
    RAMESH    12

    PRINT Procedure

    Used to display the contents of a SAS dataset

    PROC PRINT <option-list>;
       VAR variable-list;
       ID variable-list;
       BY variable-list;
       PAGEBY BY-variable;
       SUMBY BY-variable;
       SUM variable-list;
    RUN;
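
    A few of these statements can be combined as in the sketch below. The dataset payroll, its variables and the sample rows are hypothetical.

```sas
data payroll;
  input name $ dept $ salary;
  datalines;
Anil HR 34500
Roopa IT 26000
Kiran IT 22000
;
run;

proc print data=payroll;
  id name;                  /* identify each row by name instead of Obs   */
  var dept salary;          /* columns to print, in this order            */
  sum salary;               /* add a total line for salary                */
  format salary comma10.;   /* display salary with thousands separators   */
run;
```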

    Thank You !