sas procedure export
DESCRIPTION
highly usedTRANSCRIPT
The EXORT ProcedurePresented by: Sri Nikhil Gupta Gourisetti
PROC EXPORT-Overview
It reads data from a SAS data set and writes it to an external data source.
External data sources can include such files as Microsoft Access Databases, Microsoft Excel Workbooks, Lotus spreadsheets, and delimited files.
In delimited files, a delimiter such as a blank, comma, or tab separates columns of data values.
The EXPORT procedure uses one of these methods to export data: generated DATA step code generated SAS/ACCESS code translation engines
Inside the software
Options can be used to control the results.
Information about a particular export can be found in SAS log.
Export wizard can also be used: File > Export Data
PROC EXPORT-Syntax
This is available only for Windows, Unix, open VMS for Integrity Services.
PROC EXPORT DATA=<libref.>SAS data-set <(SAS data-set-options)>
OUTFILE="filename" | OUTTABLE="tablename"
<DBMS=identifier> <REPLACE><LABEL>;
<data-source-statement(s);>
Required Arguments: DATA=<libref.>SAS data-set
DATA=<libref.>SAS data-set: Identifies input SAS dataset. If one-level, the procedure either uses the user or work library. The data target should support the SAS dataset format. The amount of data must be with in the data target limitations. SAS dataset should be identified else the recently created dataset stored in
_LAST_ will be exported. (SAS data-set-options): This specifies SAS data set options.
If the exporting dataset has password, use: ALTER, PW, READ or WRITE options. To export data subset meeting specified conditions, use: WHERE option.
Dataset Option Table
ALTER BUFNO BUFSIZE
CNTLLEV COMPRESS DLDMGACTION
DROP ENCRYPT FILECLOSE
FIRSTOBS GENMAX GENNUM
IDXNAME IDXWHERE IN
INDEX KEEP LABEL
OBS OBSBUF OUTREP
POINTOBS PW PWREQ
READ RENAME REPEMPTY
REPLACE REUSE SORTEDBY
SPILL TOBSNO TYPE
WHERE WHEREUP WRITEApp
Required Arguments: OUTFILE= ’filename’
OUTFILE= ’filename’: Specifies the complete path, filename or a fileref for the output PC file,
spreadsheet, or delimited external file. ‘ ’ can be omitted if there are no special/lowercase characters, spaces. fileref is associated with physical location of file. The EXPORT procedure does not support device types or access methods
for the FILENAME statement except for DISK. For example, the EXPORT procedure does not support the TEMP device type,
which creates a temporary external file.
Required Arguments: OUTTABLE=’tablename’
OUTTABLE=’tablename’: Specifies the table name of the output DBMS table. ‘ ’ can be omitted if there are no special/lowercase characters, spaces. DBMS table might be case sensitive Options must be specified when a DBMS table is exported. Output external source data availability depends upon:
Operating environment License for SAS/ACCESS interface to PC Files.
Options: DBMS= identifier; LABEL; REPLACE
DBMS=identifier: Specifies the type of data to export. Valid DBMS identifier must be specified to export DBMS table. CSV, DLM, TAB are valid identifiers for delimited data files. In case of DBMS=DLM: Space is the default delimiter character. DELIMITER=‘char’ can be used.
LABEL: Specifies a variable label name. SAS writes these to the exported table as column names. If the label names do not already exist, SAS writes them to the exported table.
REPLACE: Overwrites an existing file. If REPLACE is not specified, the EXPORT proc does not overwrite an existing file.
DBMS=option:
Identifier Output Data Source Extension Host Availability
CSVdelimited file
(comma separated values)
.csv Open VMS, Unix, MS Windows
DLMDelimited file
(default delimiter is a blank)
.*Open VMS, Unix, MS
Windows
TAB delimited file (tab-delimited values) .txt
Open VMS, Unix, MS Windows
Data Source Statements
DELIMITER=’char’ | ’nn’x: Specifies the delimiter to separate columns of data in the output file. Delimiter can be specified as a single or hexadecimal character.
DELIMITER= ‘&’ : To separate data columns by ampersand. No option is specified, EXPORT proc assumes the delimiter as blank. Do not specify delimiter option if:
DBMS=CSV DBMS=TAB Output filename has an extension of
.CSV .TXT
Example ProblemsBrief Demonstration of the Material
Example-1: Exporting a Delimited External File
This example exports the SAS data set SASHELP.CLASS to a delimited external file.
Obs Name Sex Age Height Weight
1 Alfred M 14 69 112.5
2 Alice F 13 56.5 84
3 Barbara F 13 65.3 98
4 Carol F 14 62.8 102.5
5 Henry M 14 63.5 102.5
6 James M 12 57.3 83
7 Jane F 12 59.8 84.5
8 Janet F 15 62.5 112.5
9 Jeffrey M 13 62.5 84
10 John M 12 59 99.5
11 Joyce F 11 51.3 50.5
12 Judy F 14 64.3 90
13 Louise F 12 56.3 77
14 Mary F 15 66.5 112
15 Philip M 16 72 150
16 Robert M 12 64.8 128
17 Ronald M 15 67 133
18 Thomas M 11 57.5 85
19 William M 15 66.5 112
Ex-1: Procedure Features
The EXPORT procedure statement arguments:
DATA=
DBMS=
OUTFILE=
Data source statement:
DELIMITER=
Ex-1: Program
proc export data=sashelp.class
outfile=’c:\myfiles\class’
dbms=dlm;
delimiter=’&’;
run;
SAS Log displays information about successful export + SAS DATA step.
Ex-1: External File
Name&Sex&Age&Height&Weight
Alfred&M&14&69&112.5
Alice&F&13&56.5&84
Barbara&F&13&65.3&98
Carol&F&14&62.8&102.5
Henry&M&14&63.5&102.5
James&M&12&57.3&83
Jane&F&12&59.8&84.5
Janet&F&15&62.5&112.5
Jeffrey&M&13&62.5&84
John&M&12&59&99.5
Joyce&F&11&51.3&50.5
Judy&F&14&64.3&90
Louise&F&12&56.3&77
Mary&F&15&66.5&112
Philip&M&16&72&150
Robert&M&12&64.8&128
Ronald&M&15&67&133
Thomas&M&11&57.5&85
William&M&15&66.5&112
Example-2: Exporting a Subset of Observations to a CSV File
Procedure features:
The EXPORT procedure statement arguments:
DATA=
DBMS=
OUTFILE=
REPLACE
Program:
proc export data=sashelp.class (where=(sex=’F’))
outfile=’c:\myfiles\Femalelist.csv’
dbms=csv
replace;
run;
APPENDIXDataset Option Description
REFRETURN
ALTER=alter-password
alter-password must be a valid SAS name
The ALTER= option applies to all types of SAS files except catalogs. You can use this option to assign a password to a SAS file or to access a read-protected, write-protected, or alter-protected SAS file.
When replacing a SAS data set that is protected with an ALTER password, the new data set inherits the ALTER password. To change the ALTER password for the new data set, use the MODIFY statement in the DATASETS procedure.
Note: A SAS password does not control access to a SAS file beyond the SAS system. You should use the operating system-supplied utilities and file-system security controls in order to control access to SAS files outside of SAS.
RETURN
BUFNO= n | nK | hexX | MIN | MAX
n | nK specifies the number of buffers in multiples of 1 (bytes); 1,024 (kilobytes). For example, a value of 8 specifies 8 buffers, and a value of 1k specifies 1024 buffers.
hexX specifies the number of buffers as a hexadecimal value. You must specify the value beginning with a number (0-9), followed by an X. For example, the value 2dx sets the number of buffers to 45 buffers.
MIN sets the minimum number of buffers to 0, which causes SAS to use the minimum optimal value for the operating environment. This is the default.
MAX sets the number of buffers to the maximum possible number in your operating environment, up to the largest four-byte, signed integer, which is 231-1, or approximately 2 billion.
RETURN
BUFSIZE= n | nK | nM | nG | hexX | MAX
n | nK | nM | nG specifies the page size in multiples of 1 (bytes); 1,024 (kilobytes); 1,048,576 (megabytes); or 1,073,741,824 (gigabytes). For example, a value of 8 specifies a page size of 8 bytes, and a value of 4k specifies a page size of 4096 bytes.
The default is 0, which causes SAS to use the minimum optimal page size for the operating environment.
hexX specifies the page size as a hexadecimal value. You must specify the value beginning with a number (0-9), followed by an X. For example, the value 2dx sets the page size to 45 bytes.
MAX sets the page size to the maximum possible number in your operating environment, up to the largest four-byte, signed integer, which is 231-1, or approximately 2 billion bytes.
RETURN
CNTLLEV=LIB | MEM | REC
LIB specifies that concurrent access is controlled at the library level. Library-level control restricts concurrent access to only one update process to the library.
MEM specifies that concurrent access is controlled at the SAS data set (member) level. Member-level control restricts concurrent access to only one update or output process to the SAS data set. If the data set is open for an update or output process, then no other operation can access the data set. If the data set is open for an input process, then other concurrent input processes are allowed but no update or output process is allowed.
REC specifies that concurrent access is controlled at the observation (record) level. Record-level control allows more than one update access to the same SAS data set, but it denies concurrent update of the same observation.
RETURN
COMPRESS=NO | YES | CHAR | BINARY
NOspecifies that the observations in a newly created SAS data set are uncompressed (fixed-length records).
YES | CHAR specifies that the observations in a newly created SAS data set are compressed (variable-length records) by SAS using RLE (Run Length Encoding). RLE compresses observations by reducing repeated consecutive characters (including blanks) to two-byte or three-byte representations.
BINARY specifies that the observations in a newly created SAS data set are compressed (variable-length records) by SAS using RDC (Ross Data Compression). RDC combines run-length encoding and sliding-window compression to compress the file.
RETURN
DLDMGACTION=FAIL | ABORT | REPAIR | NOINDEX | PROMPT
FAIL stops the step, issues an error message to the log immediately. This is the default for batch mode.
ABORT terminates the step, issues an error message to the log, and terminates the SAS session.
REPAIR automatically repairs and rebuilds indexes and integrity constraints, unless the data file is truncated. You use the REPAIR statement in PROC DATASETS to restore a truncated data set. It issues a warning message to the log. This is the default for interactive mode.
NOINDEX automatically repairs the data file without the indexes and integrity constraints, deletes the index file, updates the data file to reflect the disabled indexes and integrity constraints, and limits the data file to be opened only in INPUT mode. A warning is written to the SAS log instructing you to execute the PROC DATASETS REBUILD statement to correct or delete the disabled indexes and integrity constraints.
PROMPT displays a dialog box that asks you to select the FAIL, ABORT, REPAIR, or NOINDEX action.
RETURN
DROP=variable-1 <...variable-n>
variable-1 <...variable-n>lists one or more variable names. You can list the variables in any form that SAS allows.
RETURN
ENCRYPT=YES | NO
YES encrypts the file. This encryption uses passwords that are stored in the data set. At a minimum, you must specify the READ= or the PW= data set option at the same time that you specify ENCRYPT=YES. Because the encryption method uses passwords, you cannot change any password on an encrypted data set without re-creating the data set.
CAUTION: Record all passwords when using ENCRYPT= YES. If you forget the passwords, you cannot reset it without assistance from SAS. The process is time-consuming and resource-intensive.
NO does not encrypt the file.
RETURN
FILECLOSE=DISP | LEAVE | REREAD | REWIND
DISP positions the tape volume according to the disposition specified in the operating environment's control language.
LEAVE positions the tape at the end of the file that was just processed. Use FILECLOSE=LEAVE if you are not repeatedly accessing the same files in a SAS program but you are accessing one or more subsequent SAS files on the same tape.
REREAD positions the tape volume at the beginning of the file that was just processed. Use FILECLOSE=REREAD if you are accessing the same SAS data set on tape several times in a SAS program.
REWIND rewinds the tape volume to the beginning. Use FILECLOSE=REWIND if you are accessing one or more previous SAS files on the same tape, but you are not repeatedly accessing the same files in a SAS program.
Operating Environment Information: These values are not recognized by all operating environments. Additional values are available on some operating environments. See the appropriate sections of the SAS documentation for your operating environment for more information about using SAS libraries that are stored on tape.
RETURN
FIRSTOBS= n| nK | nM | nG | hexX | MIN | MAX
n | nK | nM | nG specifies the number of the first observation to process in multiples of 1 (bytes); 1,024 (kilobytes); 1,048,576 (megabytes); or 1,073,741,824 (gigabytes). For example, a value of 8specifies the 8th observation, and a value of 3k specifies 3,072.
hexX specifies the number of the first observation to process as a hexadecimal value. You must specify the value beginning with a number (0-9), followed by an X. For example, the value 2dx sets the 45th observation as the first observation to process.
MIN sets the number of the first observation to process to 1. This is the default.
MAX sets the number of the first observation to process to the maximum number of observations in the data set, up to the largest eight-byte, signed integer, which is 263-1, or approximately 9.2 quintillion observations.
RETURN
GENMAX=number-of-generations
number-of-generations requests generations for a data set and specifies the maximum number of versions to maintain. The value can be from 0 to 1000. The default is GENMAX=0, which means that no generation data sets are requested.
RETURN
GENNUM=integer
Integer is a number that references a specific version from a generation group. Specifying a positive number is an absolute reference to a specific generation number that is appended to a data set's name. Specifying a negative number is a relative reference to a historical version in relation to the base version, from the youngest to the oldest. Typically, a value of 0 refers to the current (base) version.
Note: The DATASETS procedure provides a variety of statements for which specifying GENNUM= has additional functionality:
For the DATASETS and DELETE statements, GENNUM= supports the additional values ALL, HIST, and REVERT.
For the CHANGE statement, GENNUM= supports the additional value ALL.
For the CHANGE statement, specifying GENNUM=0 refers to all versions rather than just the base version.
RETURN
IDXNAME=index-name
index-name specifies the name (up to 32 characters) of a simple or composite index for the SAS data set. SAS does not attempt to determine whether the specified index is the best one or whether a sequential search might be more resource efficient.
RETURN
IDXWHERE=YES | NO
YES tells SAS to choose the best index to optimize a WHERE expression, and to disregard the possibility that a sequential search of the data set might be more resource-efficient.
NO tells SAS to ignore all indexes and satisfy the conditions of a WHERE expression with a sequential search of the data set.
Note: You cannot use IDXWHERE= to override the use of an index to process a BY statement.
RETURN
IN=variable
Variable names the new variable whose value indicates whether that input data set contributed data to the current observation. Within the DATA step, the value of the variable is 1 if the data set contributed to the current observation, and 0 otherwise.
RETURN
INDEX=(index-specification-1 ...<index-specification-n>)
index-specification names and describes a simple or a composite index to be built. Index-specification has this form: index= (variable(s) </UNIQUE> </NOMISS>)
Index is the name of a variable that forms the index or the name you choose for a composite index.
variable or variables is a list of variables to use in making a composite index.
UNIQUE specifies that the values of the key variables must be unique. If you specify UNIQUE for a new data set and multiple observations have the same values for the index variables, the index is not created. A slash (/) must precede the UNIQUE option.
NOMISS excludes all observations with missing values from the index. Observations with missing values are still read from the data set but not through the index. A slash (/) must precede the NOMISS option.
RETURN
KEEP=variable-1 <...variable-n>
variable-1 <...variable-n> lists one or more variable names. You can list the variables in any form that SAS allows.
RETURN
LABEL='label'
'label‘ specifies a text string of up to 256 characters. If the label text contains single quotation marks, use double quotation marks around the label, or use two single quotation marks in the label text and surround the string with single quotation marks. To remove a label from a data set, assign a label that is equal to a blank that is enclosed in quotation marks.
RETURN
OBS= n | nK | nM | nG | nT | hexX | MIN | MAX
n | nK | nM | nG | nT specifies a number to indicate when to stop processing observations, with n being an integer. Using one of the letter notations results in multiplying the integer by a specific value. That is, specifying K (kilo) multiplies the integer by 1,024, M (mega) multiplies by 1,048,576, G (giga) multiplies by 1,073,741,824, or T (tera) multiplies by 1,099,511,627,776. For example, a value of 20 specifies 20 observations, while a value of 3m specifies 3,145,728 observations.
hexX specifies a number to indicate when to stop processing as a hexadecimal value. You must specify the value beginning with a number (0-9), followed by an X. For example, the hexadecimal value F8 must be specified as 0F8x in order to specify the decimal equivalent of 248. The value 2dx specifies the decimal equivalent of 45.
MIN sets the number to indicate when to stop processing to 0. Use OBS=0 in order to create an empty data set that has the structure, but not the observations, of another data set.
MAX sets the number to indicate when to stop processing to the maximum number of observations in the data set, up to the largest 8-byte, signed integer, which is 263-1, or approximately 9.2 quintillion. This is the default.
RETURN
OBSBUF=n
n specifies the number of observations that are read into the view buffer at a time.
CAUTION: The maximum value for the OBSBUF= option depends on the amount of available memory. If you specify a value so large that the memory allocation of the view buffer fails, an out-of- memory error results. If you specify a value that is larger than the amount of available real memory and your operating environment allows SAS to perform the allocation using virtual memory, the result can be a decrease in performance due to excessive paging.
RETURN
OUTREP=format
formatspecifies the data representation, which is the form in which data is stored in a particular operating environment. Different operating environments use different standards or conventions for storing floating-point numbers (for example, IEEE or IBM Mainframe); for character encoding (ASCII or EBCDIC); for the ordering of bytes in memory (big Endian or little Endian); for word alignment (4-byte boundaries or 8-byte boundaries); for integer data-type length (16-bit, 32-bit, or 64-bit); and for doubles (byte-swapped or not).
Native data representation refers to an environment in which the data representation is comparable to the CPU that is accessing the file. For example, a file that is in Windows data representation is native to the Windows operating environment.
By default, SAS creates a new SAS data set by using the native data representation of the CPU that is running SAS. Specifying the OUTREP= option enables you to create a SAS data set within the native environment that uses a foreign environment data representation. For example, in a UNIX environment, you can create a SAS data set that uses Windows data representation.
RETURN
POINTOBS=YES | NO
YES causes SAS software to produce a compressed data set that might be randomly accessed by observation number. This is the default.
Examples of accessing data directly by observation number are:
the POINT= option of the MODIFY and SET statements in the DATA step
going directly to a specific observation number with PROC FSEDIT.
NO suppresses the ability to randomly access observations in a compressed data set by observation number.
RETURN
PW=password
password must be a valid SAS name.
The PW= option applies to all types of SAS files except catalogs. You can use this option to assign a password to a SAS file or to access a password-protected SAS file.
When replacing a SAS data set that is protected by an ALTER password, the new data set inherits the ALTER password. To change the ALTER password for the new data set, use the MODIFY statement in the DATASETS procedure.
Note: A SAS password does not control access to a SAS file beyond the SAS system. You should use the operating system-supplied utilities and file-system security controls in order to control access to SAS files outside of SAS.
RETURN
PWREQ=YES | NO
YES specifies to display a dialog box.
NO prevents a dialog box from displaying. If a missing or invalid password is entered, the data set is not opened and an error message is written to the SAS log.
RETURN
READ=read-password
read-password must be a valid SAS name.
The READ= option applies to all types of SAS files except catalogs. You can use this option to assign a password to a SAS file or to access a read-protected SAS file.
Note: A SAS password does not control access to a SAS file beyond the SAS system. You should use the operating system-supplied utilities and file-system security controls in order to control access to SAS files outside of SAS.
RETURN
RENAME=(old-name-1=new-name-1 <...old-name-n=new-name-n>)
old-name the variable you want to rename.
new-name the new name of the variable. It must be a valid SAS name.
RETURN
REPEMPTY=YES | NO
YES specifies that a new empty data set with a given name replaces an existing data set with the same name. This is the default.
NO specifies that a new empty data set with a given name does not replace an existing data set with the same name.
RETURN
REPLACE=NO | YES
NO specifies that a new data set with a given name does not replace an existing data set with the same name.
YES specifies that a new data set with a given name replaces an existing data set with the same name.
RETURN
REUSE=NO | YES
NO does not track and reuse space in compressed data sets. New observations are appended to the existing data set. Specifying the NO argument results in less efficient data storage if you delete or update many observations in the SAS data set.
YES tracks and reuses space in compressed SAS data sets. New observations are inserted in the space that is freed when other observations are updated or deleted.
If you plan to use procedures that add observations to the end of SAS data sets (for example, the APPEND and FSEDIT procedures) with compressed data sets, use the REUSE=NO argument. REUSE=YES causes new observations to be added wherever there is space in the file, not necessarily at the end of the file.
RETURN
SORTEDBY=by-clause </ collate-name> | _NULL_
by-clause < / collate-name>indicates how the data is currently sorted.
by-clause names the variables and options that you use in a BY statement in a PROC SORT step.
collate-name names the collating sequence that is used for the sort. By default, the collating sequence is that of your operating environment. A slash (/) must precede the collating sequence.
_NULL_ removes any existing sort indicator.
RETURN
SPILL=YES | NO
YES creates a spill file for non-sequential processing of a DATA step view. This is the default.
NO does not create a spill file or reduces the size of a spill file.
RETURN
TOBSNO=n
n specifies the number of observations to be transmitted.
RETURN
TYPE=data-set-type
data-set-type specifies the special type of the data set.
RETURN
WHERE=(where-expression-1<logical-operator where-expression-n>)
where-expression is an arithmetic or logical expression that consists of a sequence of operators, operands, and SAS functions. An operand is a variable, a SAS function, or a constant. An operator is a symbol that requests a comparison, logical operation, or arithmetic calculation. The expression must be enclosed in parentheses.
logical-operator can be AND, AND NOT, OR, or OR NOT.
RETURN
WHEREUP=NO | YES
NO does not evaluate added observations and modified observations against a WHERE expression.
YES evaluates added observations and modified observations against a WHERE expression.
RETURN
WRITE=write-password
write-password must be a valid SAS name.
The WRITE= option applies to all types of SAS files except catalogs. You can use this option to assign a password to a SAS file or to access a write-protected SAS file.
Note: A SAS password does not control access to a SAS file beyond the SAS system. You should use the operating system-supplied utilities and file-system security controls in order to control access to SAS files outside of SAS.
RETURN
References
[1] Base SAS® 9.2 Procedures Guide.
[2] http://support.sas.com/documentation/
THANK YOU