sas procedure export

56
The EXORT Procedure Presented by: Sri Nikhil Gupta Gourisetti

Upload: srinikhil-gupta

Post on 16-Apr-2015

34 views

Category:

Documents


0 download

DESCRIPTION

highly used

TRANSCRIPT

Page 1: SAS Procedure Export

The EXORT ProcedurePresented by: Sri Nikhil Gupta Gourisetti

Page 2: SAS Procedure Export

PROC EXPORT-Overview

It reads data from a SAS data set and writes it to an external data source.

External data sources can include such files as Microsoft Access Databases, Microsoft Excel Workbooks, Lotus spreadsheets, and delimited files.

In delimited files, a delimiter such as a blank, comma, or tab separates columns of data values.

The EXPORT procedure uses one of these methods to export data: generated DATA step code generated SAS/ACCESS code translation engines

Page 3: SAS Procedure Export

Inside the software

Options can be used to control the results.

Information about a particular export can be found in SAS log.

Export wizard can also be used: File > Export Data

Page 4: SAS Procedure Export

PROC EXPORT-Syntax

This is available only for Windows, Unix, open VMS for Integrity Services.

PROC EXPORT DATA=<libref.>SAS data-set <(SAS data-set-options)>

OUTFILE="filename" | OUTTABLE="tablename"

<DBMS=identifier> <REPLACE><LABEL>;

<data-source-statement(s);>

Page 5: SAS Procedure Export

Required Arguments: DATA=<libref.>SAS data-set

DATA=<libref.>SAS data-set: Identifies input SAS dataset. If one-level, the procedure either uses the user or work library. The data target should support the SAS dataset format. The amount of data must be with in the data target limitations. SAS dataset should be identified else the recently created dataset stored in

_LAST_ will be exported. (SAS data-set-options): This specifies SAS data set options.

If the exporting dataset has password, use: ALTER, PW, READ or WRITE options. To export data subset meeting specified conditions, use: WHERE option.

Page 6: SAS Procedure Export

Dataset Option Table

ALTER BUFNO BUFSIZE

CNTLLEV COMPRESS DLDMGACTION

DROP ENCRYPT FILECLOSE

FIRSTOBS GENMAX GENNUM

IDXNAME IDXWHERE IN

INDEX KEEP LABEL

OBS OBSBUF OUTREP

POINTOBS PW PWREQ

READ RENAME REPEMPTY

REPLACE REUSE SORTEDBY

SPILL TOBSNO TYPE

WHERE WHEREUP WRITEApp

Page 7: SAS Procedure Export

Required Arguments: OUTFILE= ’filename’

OUTFILE= ’filename’: Specifies the complete path, filename or a fileref for the output PC file,

spreadsheet, or delimited external file. ‘ ’ can be omitted if there are no special/lowercase characters, spaces. fileref is associated with physical location of file. The EXPORT procedure does not support device types or access methods

for the FILENAME statement except for DISK. For example, the EXPORT procedure does not support the TEMP device type,

which creates a temporary external file.

Page 8: SAS Procedure Export

Required Arguments: OUTTABLE=’tablename’

OUTTABLE=’tablename’: Specifies the table name of the output DBMS table. ‘ ’ can be omitted if there are no special/lowercase characters, spaces. DBMS table might be case sensitive Options must be specified when a DBMS table is exported. Output external source data availability depends upon:

Operating environment License for SAS/ACCESS interface to PC Files.

Page 9: SAS Procedure Export

Options: DBMS= identifier; LABEL; REPLACE

DBMS=identifier: Specifies the type of data to export. Valid DBMS identifier must be specified to export DBMS table. CSV, DLM, TAB are valid identifiers for delimited data files. In case of DBMS=DLM: Space is the default delimiter character. DELIMITER=‘char’ can be used.

LABEL: Specifies a variable label name. SAS writes these to the exported table as column names. If the label names do not already exist, SAS writes them to the exported table.

REPLACE: Overwrites an existing file. If REPLACE is not specified, the EXPORT proc does not overwrite an existing file.

Page 10: SAS Procedure Export

DBMS=option:

Identifier Output Data Source Extension Host Availability

CSVdelimited file

(comma separated values)

.csv Open VMS, Unix, MS Windows

DLMDelimited file

(default delimiter is a blank)

.*Open VMS, Unix, MS

Windows

TAB delimited file (tab-delimited values) .txt

Open VMS, Unix, MS Windows

Page 11: SAS Procedure Export

Data Source Statements

DELIMITER=’char’ | ’nn’x: Specifies the delimiter to separate columns of data in the output file. Delimiter can be specified as a single or hexadecimal character.

DELIMITER= ‘&’ : To separate data columns by ampersand. No option is specified, EXPORT proc assumes the delimiter as blank. Do not specify delimiter option if:

DBMS=CSV DBMS=TAB Output filename has an extension of

.CSV .TXT

Page 12: SAS Procedure Export

Example ProblemsBrief Demonstration of the Material

Page 13: SAS Procedure Export

Example-1: Exporting a Delimited External File

This example exports the SAS data set SASHELP.CLASS to a delimited external file.

Obs Name Sex Age Height Weight

1 Alfred M 14 69 112.5

2 Alice F 13 56.5 84

3 Barbara F 13 65.3 98

4 Carol F 14 62.8 102.5

5 Henry M 14 63.5 102.5

6 James M 12 57.3 83

7 Jane F 12 59.8 84.5

8 Janet F 15 62.5 112.5

9 Jeffrey M 13 62.5 84

10 John M 12 59 99.5

11 Joyce F 11 51.3 50.5

12 Judy F 14 64.3 90

13 Louise F 12 56.3 77

14 Mary F 15 66.5 112

15 Philip M 16 72 150

16 Robert M 12 64.8 128

17 Ronald M 15 67 133

18 Thomas M 11 57.5 85

19 William M 15 66.5 112

Page 14: SAS Procedure Export

Ex-1: Procedure Features

The EXPORT procedure statement arguments:

DATA=

DBMS=

OUTFILE=

Data source statement:

DELIMITER=

Page 15: SAS Procedure Export

Ex-1: Program

proc export data=sashelp.class

outfile=’c:\myfiles\class’

dbms=dlm;

delimiter=’&’;

run;

SAS Log displays information about successful export + SAS DATA step.

Page 16: SAS Procedure Export

Ex-1: External File

Name&Sex&Age&Height&Weight

Alfred&M&14&69&112.5

Alice&F&13&56.5&84

Barbara&F&13&65.3&98

Carol&F&14&62.8&102.5

Henry&M&14&63.5&102.5

James&M&12&57.3&83

Jane&F&12&59.8&84.5

Janet&F&15&62.5&112.5

Jeffrey&M&13&62.5&84

John&M&12&59&99.5

Joyce&F&11&51.3&50.5

Judy&F&14&64.3&90

Louise&F&12&56.3&77

Mary&F&15&66.5&112

Philip&M&16&72&150

Robert&M&12&64.8&128

Ronald&M&15&67&133

Thomas&M&11&57.5&85

William&M&15&66.5&112

Page 17: SAS Procedure Export

Example-2: Exporting a Subset of Observations to a CSV File

Procedure features:

The EXPORT procedure statement arguments:

DATA=

DBMS=

OUTFILE=

REPLACE

Program:

proc export data=sashelp.class (where=(sex=’F’))

outfile=’c:\myfiles\Femalelist.csv’

dbms=csv

replace;

run;

Page 18: SAS Procedure Export

APPENDIXDataset Option Description

REFRETURN

Page 19: SAS Procedure Export

ALTER=alter-password

alter-password must be a valid SAS name

The ALTER= option applies to all types of SAS files except catalogs. You can use this option to assign a password to a SAS file or to access a read-protected, write-protected, or alter-protected SAS file.

When replacing a SAS data set that is protected with an ALTER password, the new data set inherits the ALTER password. To change the ALTER password for the new data set, use the MODIFY statement in the DATASETS procedure.

Note: A SAS password does not control access to a SAS file beyond the SAS system. You should use the operating system-supplied utilities and file-system security controls in order to control access to SAS files outside of SAS.

RETURN

Page 20: SAS Procedure Export

BUFNO= n | nK | hexX | MIN | MAX

n | nK specifies the number of buffers in multiples of 1 (bytes); 1,024 (kilobytes). For example, a value of 8 specifies 8 buffers, and a value of 1k specifies 1024 buffers.

hexX specifies the number of buffers as a hexadecimal value. You must specify the value beginning with a number (0-9), followed by an X. For example, the value 2dx sets the number of buffers to 45 buffers.

MIN sets the minimum number of buffers to 0, which causes SAS to use the minimum optimal value for the operating environment. This is the default.

MAX sets the number of buffers to the maximum possible number in your operating environment, up to the largest four-byte, signed integer, which is 231-1, or approximately 2 billion.

RETURN

Page 21: SAS Procedure Export

BUFSIZE= n | nK | nM | nG | hexX | MAX

n | nK | nM | nG specifies the page size in multiples of 1 (bytes); 1,024 (kilobytes); 1,048,576 (megabytes); or 1,073,741,824 (gigabytes). For example, a value of 8 specifies a page size of 8 bytes, and a value of 4k specifies a page size of 4096 bytes.

The default is 0, which causes SAS to use the minimum optimal page size for the operating environment.

hexX specifies the page size as a hexadecimal value. You must specify the value beginning with a number (0-9), followed by an X. For example, the value 2dx sets the page size to 45 bytes.

MAX sets the page size to the maximum possible number in your operating environment, up to the largest four-byte, signed integer, which is 231-1, or approximately 2 billion bytes.

RETURN

Page 22: SAS Procedure Export

CNTLLEV=LIB | MEM | REC

LIB specifies that concurrent access is controlled at the library level. Library-level control restricts concurrent access to only one update process to the library.

MEM specifies that concurrent access is controlled at the SAS data set (member) level. Member-level control restricts concurrent access to only one update or output process to the SAS data set. If the data set is open for an update or output process, then no other operation can access the data set. If the data set is open for an input process, then other concurrent input processes are allowed but no update or output process is allowed.

REC specifies that concurrent access is controlled at the observation (record) level. Record-level control allows more than one update access to the same SAS data set, but it denies concurrent update of the same observation.

RETURN

Page 23: SAS Procedure Export

COMPRESS=NO | YES | CHAR | BINARY

NOspecifies that the observations in a newly created SAS data set are uncompressed (fixed-length records).

YES | CHAR specifies that the observations in a newly created SAS data set are compressed (variable-length records) by SAS using RLE (Run Length Encoding). RLE compresses observations by reducing repeated consecutive characters (including blanks) to two-byte or three-byte representations.

BINARY specifies that the observations in a newly created SAS data set are compressed (variable-length records) by SAS using RDC (Ross Data Compression). RDC combines run-length encoding and sliding-window compression to compress the file.

RETURN

Page 24: SAS Procedure Export

DLDMGACTION=FAIL | ABORT | REPAIR | NOINDEX | PROMPT

FAIL stops the step, issues an error message to the log immediately. This is the default for batch mode.

ABORT terminates the step, issues an error message to the log, and terminates the SAS session.

REPAIR automatically repairs and rebuilds indexes and integrity constraints, unless the data file is truncated. You use the REPAIR statement in PROC DATASETS to restore a truncated data set. It issues a warning message to the log. This is the default for interactive mode.

NOINDEX automatically repairs the data file without the indexes and integrity constraints, deletes the index file, updates the data file to reflect the disabled indexes and integrity constraints, and limits the data file to be opened only in INPUT mode. A warning is written to the SAS log instructing you to execute the PROC DATASETS REBUILD statement to correct or delete the disabled indexes and integrity constraints.

PROMPT displays a dialog box that asks you to select the FAIL, ABORT, REPAIR, or NOINDEX action.

RETURN

Page 25: SAS Procedure Export

DROP=variable-1 <...variable-n>

variable-1 <...variable-n>lists one or more variable names. You can list the variables in any form that SAS allows.

RETURN

Page 26: SAS Procedure Export

ENCRYPT=YES | NO

YES encrypts the file. This encryption uses passwords that are stored in the data set. At a minimum, you must specify the READ= or the PW= data set option at the same time that you specify ENCRYPT=YES. Because the encryption method uses passwords, you cannot change any password on an encrypted data set without re-creating the data set.

CAUTION: Record all passwords when using ENCRYPT= YES. If you forget the passwords, you cannot reset it without assistance from SAS. The process is time-consuming and resource-intensive.

NO does not encrypt the file.

RETURN

Page 27: SAS Procedure Export

FILECLOSE=DISP | LEAVE | REREAD | REWIND

DISP positions the tape volume according to the disposition specified in the operating environment's control language.

LEAVE positions the tape at the end of the file that was just processed. Use FILECLOSE=LEAVE if you are not repeatedly accessing the same files in a SAS program but you are accessing one or more subsequent SAS files on the same tape.

REREAD positions the tape volume at the beginning of the file that was just processed. Use FILECLOSE=REREAD if you are accessing the same SAS data set on tape several times in a SAS program.

REWIND rewinds the tape volume to the beginning. Use FILECLOSE=REWIND if you are accessing one or more previous SAS files on the same tape, but you are not repeatedly accessing the same files in a SAS program.

Operating Environment Information: These values are not recognized by all operating environments. Additional values are available on some operating environments. See the appropriate sections of the SAS documentation for your operating environment for more information about using SAS libraries that are stored on tape.

RETURN

Page 28: SAS Procedure Export

FIRSTOBS= n| nK | nM | nG | hexX | MIN | MAX

n | nK | nM | nG specifies the number of the first observation to process in multiples of 1 (bytes); 1,024 (kilobytes); 1,048,576 (megabytes); or 1,073,741,824 (gigabytes). For example, a value of 8specifies the 8th observation, and a value of 3k specifies 3,072.

hexX specifies the number of the first observation to process as a hexadecimal value. You must specify the value beginning with a number (0-9), followed by an X. For example, the value 2dx sets the 45th observation as the first observation to process.

MIN sets the number of the first observation to process to 1. This is the default.

MAX sets the number of the first observation to process to the maximum number of observations in the data set, up to the largest eight-byte, signed integer, which is 263-1, or approximately 9.2 quintillion observations.

RETURN

Page 29: SAS Procedure Export

GENMAX=number-of-generations

number-of-generations requests generations for a data set and specifies the maximum number of versions to maintain. The value can be from 0 to 1000. The default is GENMAX=0, which means that no generation data sets are requested.

RETURN

Page 30: SAS Procedure Export

GENNUM=integer

Integer is a number that references a specific version from a generation group. Specifying a positive number is an absolute reference to a specific generation number that is appended to a data set's name. Specifying a negative number is a relative reference to a historical version in relation to the base version, from the youngest to the oldest. Typically, a value of 0 refers to the current (base) version.

Note: The DATASETS procedure provides a variety of statements for which specifying GENNUM= has additional functionality:

For the DATASETS and DELETE statements, GENNUM= supports the additional values ALL, HIST, and REVERT.

For the CHANGE statement, GENNUM= supports the additional value ALL.

For the CHANGE statement, specifying GENNUM=0 refers to all versions rather than just the base version.

RETURN

Page 31: SAS Procedure Export

IDXNAME=index-name

index-name specifies the name (up to 32 characters) of a simple or composite index for the SAS data set. SAS does not attempt to determine whether the specified index is the best one or whether a sequential search might be more resource efficient.

RETURN

Page 32: SAS Procedure Export

IDXWHERE=YES | NO

YES tells SAS to choose the best index to optimize a WHERE expression, and to disregard the possibility that a sequential search of the data set might be more resource-efficient.

NO tells SAS to ignore all indexes and satisfy the conditions of a WHERE expression with a sequential search of the data set.

Note: You cannot use IDXWHERE= to override the use of an index to process a BY statement.

RETURN

Page 33: SAS Procedure Export

IN=variable

Variable names the new variable whose value indicates whether that input data set contributed data to the current observation. Within the DATA step, the value of the variable is 1 if the data set contributed to the current observation, and 0 otherwise.

RETURN

Page 34: SAS Procedure Export

INDEX=(index-specification-1 ...<index-specification-n>)

index-specification names and describes a simple or a composite index to be built. Index-specification has this form: index= (variable(s) </UNIQUE> </NOMISS>)

Index is the name of a variable that forms the index or the name you choose for a composite index.

variable or variables is a list of variables to use in making a composite index.

UNIQUE specifies that the values of the key variables must be unique. If you specify UNIQUE for a new data set and multiple observations have the same values for the index variables, the index is not created. A slash (/) must precede the UNIQUE option.

NOMISS excludes all observations with missing values from the index. Observations with missing values are still read from the data set but not through the index. A slash (/) must precede the NOMISS option.

RETURN

Page 35: SAS Procedure Export

KEEP=variable-1 <...variable-n>

variable-1 <...variable-n> lists one or more variable names. You can list the variables in any form that SAS allows.

RETURN

Page 36: SAS Procedure Export

LABEL='label'

'label‘ specifies a text string of up to 256 characters. If the label text contains single quotation marks, use double quotation marks around the label, or use two single quotation marks in the label text and surround the string with single quotation marks. To remove a label from a data set, assign a label that is equal to a blank that is enclosed in quotation marks.

RETURN

Page 37: SAS Procedure Export

OBS= n | nK | nM | nG | nT | hexX | MIN | MAX

n | nK | nM | nG | nT specifies a number to indicate when to stop processing observations, with n being an integer. Using one of the letter notations results in multiplying the integer by a specific value. That is, specifying K (kilo) multiplies the integer by 1,024, M (mega) multiplies by 1,048,576, G (giga) multiplies by 1,073,741,824, or T (tera) multiplies by 1,099,511,627,776. For example, a value of 20 specifies 20 observations, while a value of 3m specifies 3,145,728 observations.

hexX specifies a number to indicate when to stop processing as a hexadecimal value. You must specify the value beginning with a number (0-9), followed by an X. For example, the hexadecimal value F8 must be specified as 0F8x in order to specify the decimal equivalent of 248. The value 2dx specifies the decimal equivalent of 45.

MIN sets the number to indicate when to stop processing to 0. Use OBS=0 in order to create an empty data set that has the structure, but not the observations, of another data set.

MAX sets the number to indicate when to stop processing to the maximum number of observations in the data set, up to the largest 8-byte, signed integer, which is 263-1, or approximately 9.2 quintillion. This is the default.

RETURN

Page 38: SAS Procedure Export

OBSBUF=n

n specifies the number of observations that are read into the view buffer at a time.

CAUTION: The maximum value for the OBSBUF= option depends on the amount of available memory. If you specify a value so large that the memory allocation of the view buffer fails, an out-of- memory error results. If you specify a value that is larger than the amount of available real memory and your operating environment allows SAS to perform the allocation using virtual memory, the result can be a decrease in performance due to excessive paging.

RETURN

Page 39: SAS Procedure Export

OUTREP=format

formatspecifies the data representation, which is the form in which data is stored in a particular operating environment. Different operating environments use different standards or conventions for storing floating-point numbers (for example, IEEE or IBM Mainframe); for character encoding (ASCII or EBCDIC); for the ordering of bytes in memory (big Endian or little Endian); for word alignment (4-byte boundaries or 8-byte boundaries); for integer data-type length (16-bit, 32-bit, or 64-bit); and for doubles (byte-swapped or not).

Native data representation refers to an environment in which the data representation is comparable to the CPU that is accessing the file. For example, a file that is in Windows data representation is native to the Windows operating environment.

By default, SAS creates a new SAS data set by using the native data representation of the CPU that is running SAS. Specifying the OUTREP= option enables you to create a SAS data set within the native environment that uses a foreign environment data representation. For example, in a UNIX environment, you can create a SAS data set that uses Windows data representation.

RETURN

Page 40: SAS Procedure Export

POINTOBS=YES | NO

YES causes SAS software to produce a compressed data set that might be randomly accessed by observation number. This is the default.

Examples of accessing data directly by observation number are:

the POINT= option of the MODIFY and SET statements in the DATA step

going directly to a specific observation number with PROC FSEDIT.

NO suppresses the ability to randomly access observations in a compressed data set by observation number.

RETURN

Page 41: SAS Procedure Export

PW=password

password must be a valid SAS name.

The PW= option applies to all types of SAS files except catalogs. You can use this option to assign a password to a SAS file or to access a password-protected SAS file.

When replacing a SAS data set that is protected by an ALTER password, the new data set inherits the ALTER password. To change the ALTER password for the new data set, use the MODIFY statement in the DATASETS procedure.

Note: A SAS password does not control access to a SAS file beyond the SAS system. You should use the operating system-supplied utilities and file-system security controls in order to control access to SAS files outside of SAS.

RETURN

Page 42: SAS Procedure Export

PWREQ=YES | NO

YES specifies to display a dialog box.

NO prevents a dialog box from displaying. If a missing or invalid password is entered, the data set is not opened and an error message is written to the SAS log.

RETURN

Page 43: SAS Procedure Export

READ=read-password

read-password must be a valid SAS name.

The READ= option applies to all types of SAS files except catalogs. You can use this option to assign a password to a SAS file or to access a read-protected SAS file.

Note: A SAS password does not control access to a SAS file beyond the SAS system. You should use the operating system-supplied utilities and file-system security controls in order to control access to SAS files outside of SAS.

RETURN

Page 44: SAS Procedure Export

RENAME=(old-name-1=new-name-1 <...old-name-n=new-name-n>)

old-name the variable you want to rename.

new-name the new name of the variable. It must be a valid SAS name.

RETURN

Page 45: SAS Procedure Export

REPEMPTY=YES | NO

YES specifies that a new empty data set with a given name replaces an existing data set with the same name. This is the default.

NO specifies that a new empty data set with a given name does not replace an existing data set with the same name.

RETURN

Page 46: SAS Procedure Export

REPLACE=NO | YES

NO specifies that a new data set with a given name does not replace an existing data set with the same name.

YES specifies that a new data set with a given name replaces an existing data set with the same name.

RETURN

Page 47: SAS Procedure Export

REUSE=NO | YES

NO does not track and reuse space in compressed data sets. New observations are appended to the existing data set. Specifying the NO argument results in less efficient data storage if you delete or update many observations in the SAS data set.

YES tracks and reuses space in compressed SAS data sets. New observations are inserted in the space that is freed when other observations are updated or deleted.

If you plan to use procedures that add observations to the end of SAS data sets (for example, the APPEND and FSEDIT procedures) with compressed data sets, use the REUSE=NO argument. REUSE=YES causes new observations to be added wherever there is space in the file, not necessarily at the end of the file.

RETURN

Page 48: SAS Procedure Export

SORTEDBY=by-clause </ collate-name> | _NULL_

by-clause < / collate-name>indicates how the data is currently sorted.

by-clause names the variables and options that you use in a BY statement in a PROC SORT step.

collate-name names the collating sequence that is used for the sort. By default, the collating sequence is that of your operating environment. A slash (/) must precede the collating sequence.

_NULL_ removes any existing sort indicator.

RETURN

Page 49: SAS Procedure Export

SPILL=YES | NO

YES creates a spill file for non-sequential processing of a DATA step view. This is the default.

NO does not create a spill file or reduces the size of a spill file.

RETURN

Page 50: SAS Procedure Export

TOBSNO=n

n specifies the number of observations to be transmitted.

RETURN

Page 51: SAS Procedure Export

TYPE=data-set-type

data-set-type specifies the special type of the data set.

RETURN

Page 52: SAS Procedure Export

WHERE=(where-expression-1<logical-operator where-expression-n>)

where-expression is an arithmetic or logical expression that consists of a sequence of operators, operands, and SAS functions. An operand is a variable, a SAS function, or a constant. An operator is a symbol that requests a comparison, logical operation, or arithmetic calculation. The expression must be enclosed in parentheses.

logical-operator can be AND, AND NOT, OR, or OR NOT.

RETURN

Page 53: SAS Procedure Export

WHEREUP=NO | YES

NO does not evaluate added observations and modified observations against a WHERE expression.

YES evaluates added observations and modified observations against a WHERE expression.

RETURN

Page 54: SAS Procedure Export

WRITE=write-password

write-password must be a valid SAS name.

The WRITE= option applies to all types of SAS files except catalogs. You can use this option to assign a password to a SAS file or to access a write-protected SAS file.

Note: A SAS password does not control access to a SAS file beyond the SAS system. You should use the operating system-supplied utilities and file-system security controls in order to control access to SAS files outside of SAS.

RETURN

Page 55: SAS Procedure Export

References

[1] Base SAS® 9.2 Procedures Guide.

[2] http://support.sas.com/documentation/

Page 56: SAS Procedure Export

THANK YOU