BiX |XLSX2SAS Macro Usage Notes 1/12

XLSX2SAS Macro Usage Notes

A SAS Base Macro that transforms Excel 2007/2010 spreadsheets (XLSX file format) into SAS V9 data sets.

Developed by: BiX Corp. www.bixforsas.com

Preface

The XLSX2SAS macro enables SAS users to import Excel 2007/2010 spreadsheets (also known as XLSX file format) using a simple macro call within their SAS programs. The only required SAS product is SAS/BASE.

NO DDE, NO CSV, NO PC-FILE-SERVER, NO SAS/Access to PC file formats.

Supported SAS versions: 9.1, 9.2, 9.3 and 9.4 (both 32-bit and 64-bit versions).

Supported platforms: Unix (Solaris, AIX, HP-UX), Linux and Windows workstations and servers.

Running environments: SAS Batch, SAS EG, SAS DI and SAS Interactive DMS.

The Interactive GUI is supported under SAS 9.2 interactive DMS and above on all platforms. The Unix/Linux GUI requires an X server package on your PC (e.g. MobaXterm, Exceed or any X11-compatible server) and a special X font setting (please see the installation guide).

Example:

%xlsx2sas(
    infilename   = /home/alex45/test/demo.xlsx ,
    outdata      = work.demo ,
    sheetnumber  = 1 ,
    xlsxrows     = ALL ,
    xlsxcols     = ALL ,
    xlsxlabels   = DATA ,
    xlsxmix      = ALL ,
    xlsxdebug    = YES ,
    xlsxdropvars = NO ,
    xlsxdropobs  = YES ,
    xlsxmaxlen   = 1000
);

Contents:

1. Macro Usage Notes
2. Interactive GUI

1. Macro Usage Notes

This section explains each parameter of the macro in detail and gives examples of using them in different scenarios. Some parameters have default values and are optional. The only two required parameters are “infilename” and “outdata”. Default settings for each of the optional parameters can be set inside the macro code. Parameter values are case-insensitive.

The minimal call to the macro will take the following form:

%xlsx2sas(
    infilename = /home/alex45/test/demo.xlsx ,
    outdata    = work.demo
);

Note for SAS Enterprise Guide / SAS DI Studio users: You can use the macro within the “SAS code” node in your project. The interactive GUI is only applicable to the native SAS DMS environment.

Macro Parameters

infilename (mandatory)
The complete path and name of the XLSX file.
Note for Unix/Linux users: on these operating systems file names are case sensitive. Also, depending on your login shell, you may use Unix/Linux shortcuts such as ~ to designate your home directory.
Note for Windows users: if you have successfully installed 7zip, specify the location and name of your XLSX file here. If 7zip is not installed, you must manually unzip the XLSX file prior to using the macro. In that scenario, use the “xlsxunzipdir” parameter (see below) instead, and leave “infilename” blank.

outdata (mandatory)
The name of the SAS data set to be created. Use the standard SAS convention to name your data set (examples: work.demo, sasuser.budget).
Note: If the data set already exists, it is overwritten without any warning.
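Because an existing output data set is replaced silently, some users may prefer to guard the call. The following is only a sketch: the wrapper name import_if_new and its parameters file= and out= are hypothetical and not part of the XLSX2SAS package, and it assumes the macro is already installed and reachable through SASAUTOS.

```sas
/* Hypothetical wrapper (not part of the XLSX2SAS package):          */
/* skip the import when the target data set already exists, because  */
/* %xlsx2sas overwrites it without a warning.                        */
%macro import_if_new(file=, out=);
    %if %sysfunc(exist(&out)) %then
        %put NOTE: &out already exists, import skipped.;
    %else %do;
        %xlsx2sas(
            infilename = &file ,
            outdata    = &out
        );
    %end;
%mend import_if_new;

%import_if_new(file=/home/alex45/test/demo.xlsx, out=work.demo);
```

The EXIST function returns 1 when the named data set is present, so the macro call is only reached for a new data set name.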

The following parameters are all optional, but you may need to set their values for specific tuning:

sheetnumber (optional)
A single integer between 1 and 999 (Default=1). Selects a worksheet by its position inside the Excel workbook. If your Excel workbook contains more than one worksheet, you can read each one of them in a separate call to the macro.

xlsxrows (optional)
Can have one of two alternative forms:

1) ALL: All rows of the spreadsheet. (Default).
2) First_row-Last_row: two numbers (separated by a hyphen) giving the first and the last row of the range to convert. Examples: 1-90, 12-247, 3-99999.
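As an illustration of the two parameters above, the sketch below reads two worksheets of the same workbook in separate calls, each with its own row range. The output data set names and the choice of sheets and ranges are assumptions made for the example.

```sas
/* First worksheet: convert rows 1 to 90 only. */
%xlsx2sas(
    infilename  = /home/alex45/test/demo.xlsx ,
    outdata     = work.sheet1 ,
    sheetnumber = 1 ,
    xlsxrows    = 1-90
);

/* Second worksheet of the same workbook: convert rows 12 to 247. */
%xlsx2sas(
    infilename  = /home/alex45/test/demo.xlsx ,
    outdata     = work.sheet2 ,
    sheetnumber = 2 ,
    xlsxrows    = 12-247
);
```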

Note that Excel 2007/2010 can have a maximum of 1,048,576 rows. If your Excel spreadsheet contains a row with labels or names of columns, include this row number in the range of rows and do not forget to set the “xlsxlabels” parameter to the appropriate value.

Please note that the default “Demo” version of the macro is limited to the first 100 rows.
To request a 30-day “Trial” version (no row limit), go to: http://bixforsas.com/?page_id=5286
To purchase the “Pro” version (unlimited rows), go to: http://bixforsas.com/?page_id=5442

xlsxcols (optional)
Can have one of two alternative forms:

1) ALL: All columns of the spreadsheet. (Default).
2) List of columns: a list of Excel column names separated by spaces. Examples: A B C D, A B E F, Q R T AA AB. Note that you can skip columns.
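A call that selects a subset of columns might look like the sketch below. The output data set name is illustrative, and PROC CONTENTS is added only as a convenient way to check which variables were created.

```sas
/* Import only columns A, B, E and F (columns C and D are skipped). */
%xlsx2sas(
    infilename = /home/alex45/test/demo.xlsx ,
    outdata    = work.demo_cols ,
    xlsxcols   = A B E F
);

/* List the variables of the resulting data set. */
proc contents data=work.demo_cols;
run;
```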

Note that an Excel spreadsheet may contain columns that are not visible in Excel but will appear in SAS. This can happen if you clear the contents of cells in the spreadsheet instead of deleting them. (See also the “xlsxdropvars” parameter.)

xlsxlabels (optional)
Can have one of three alternative forms:

1) DATA: The first row of the selected row range contains data values. (Default).
2) NAME: The first row of the selected row range is used as names for the SAS variables. If a value does not comply with SAS naming conventions, its invalid characters are replaced with underscores (_).
3) LABEL: The first row of the selected row range is used as labels for the SAS variables. The SAS variable names will correspond to the Excel column names (A B …).
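To compare the NAME and LABEL forms, one could import the same sheet twice and inspect the variable attributes, as in this sketch. It assumes the first row of the sheet holds column headings; the output data set names are illustrative.

```sas
/* First row of the range supplies the SAS variable names. */
%xlsx2sas(
    infilename = /home/alex45/test/demo.xlsx ,
    outdata    = work.demo_named ,
    xlsxrows   = ALL ,
    xlsxlabels = NAME
);

/* First row supplies the variable labels; the variables themselves */
/* are named after the Excel columns (A B ...).                     */
%xlsx2sas(
    infilename = /home/alex45/test/demo.xlsx ,
    outdata    = work.demo_labeled ,
    xlsxrows   = ALL ,
    xlsxlabels = LABEL
);

/* Compare variable names and labels of the two results. */
proc contents data=work.demo_named;
run;
proc contents data=work.demo_labeled;
run;
```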

xlsxmix (optional)
Unlike a SAS data set, an Excel spreadsheet may contain a mixture of data types and formats within the same column. In many cases this mixture is not apparent when looking at the spreadsheet and is only revealed when you try to import it into a SAS data set. This difference between the two data storage models causes problems when importing Excel data into a SAS data set. Four possible values can be used for the “xlsxmix” parameter:

1) ALL (Default) Analyzes the distribution of Excel formats in each column and sets the informat to the one with the highest frequency. For example, if a column has many numeric values and sporadic character values, then the column will have a numeric informat and thus the character values will be set to missing values.

2) CHAR In every case of mixed formats in a column, the informat is set to $char. for that column.

3) FIRST The informat for converting the data will be based on the Excel cell format assigned to the first cell of each column.

4) List of columns Same as ALL, but forces the specified columns to be informatted as character values.
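The sketch below shows two of these settings side by side. The source does not spell out the syntax of the column list, so the second call assumes it takes the same space-separated form as “xlsxcols”; the column letters and output data set names are illustrative.

```sas
/* Treat every column with mixed formats as character. */
%xlsx2sas(
    infilename = /home/alex45/test/demo.xlsx ,
    outdata    = work.demo_char ,
    xlsxmix    = CHAR
);

/* Behave like ALL, but force columns B and D to character.        */
/* Assumption: the list is space separated, as it is for xlsxcols. */
%xlsx2sas(
    infilename = /home/alex45/test/demo.xlsx ,
    outdata    = work.demo_mix ,
    xlsxmix    = B D
);
```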

xlsxdebug (optional) (Supported under SAS 9.2 and above)
Can have one of two alternative forms:

1) YES: Prints errors in the SAS log whenever a value has been converted from numeric to character. This can help in detecting miscoded values inside the Excel spreadsheet. (Default). Note: Look in the error log for a SAS variable named “value1” to understand the error.

2) NO: No errors will be printed in the SAS log.

xlsxdropvars (optional)
Can have one of two alternative forms:

1) YES: Examine the spreadsheet and delete columns that have all values missing. Note: May increase completion time.

2) NO: No deletion of columns. (Default). Note: Empty columns may appear in an Excel spreadsheet if, for example, a column is “cleared” instead of being “deleted”.

xlsxdropobs (optional)
Can have one of two alternative forms:

1) YES: Examine the spreadsheet and delete rows that have all values missing. Note: May increase completion time.

2) NO: No deletion of rows. (Default).
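For a spreadsheet that was cleaned up by clearing cells rather than deleting them, a call along the following lines may be useful. It is a sketch that combines the three switches described above; the output data set name is illustrative.

```sas
/* Drop all-missing columns and rows, and keep conversion errors */
/* out of the SAS log.                                           */
%xlsx2sas(
    infilename   = /home/alex45/test/demo.xlsx ,
    outdata      = work.demo_clean ,
    xlsxdebug    = NO ,
    xlsxdropvars = YES ,
    xlsxdropobs  = YES
);
```

As the notes above warn, both drop options examine the spreadsheet and may lengthen the run.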

xlsxmaxlen (optional)
A single number between 1 and 32767 (Default=1000). Sets the maximum length of all character values in the resulting SAS data set. Note: Usually the default value (1000) is adequate, but if your Excel spreadsheet contains strings longer than 1000 characters you will need to increase it. The actual length of each character variable in the resulting SAS data set is based on the maximum length found in that variable.

xlsxunzipdir (optional)
Note: This parameter is applicable to Windows users only. If you can’t install and use the recommended 7zip utility on your system, you can instead designate a folder on your system and manually unzip your Excel XLSX files into that folder prior to invoking the macro. Note: If you use this method, leave the “infilename” parameter blank and verify that the site parameter “zip7dir” is also blank.
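A call for the manual unzip method might look like the sketch below. The folder C:\temp\xlsx_transit is a hypothetical location that must already contain the extracted files, and the raised “xlsxmaxlen” value is included only to show the two parameters together.

```sas
/* Windows, 7zip not installed: the XLSX file was unzipped by hand */
/* into a transit folder (hypothetical path).                      */
%xlsx2sas(
    infilename   = ,                      /* left blank on purpose       */
    outdata      = work.demo ,
    xlsxunzipdir = C:\temp\xlsx_transit , /* folder with unzipped files  */
    xlsxmaxlen   = 4000                   /* allow strings up to 4000    */
);
```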

How to unzip an Excel XLSX file?

1. Change the suffix of the file from *.xlsx to *.zip
2. Right click on the file and select the “Extract All…” option from the pop-up menu.

In the “xlsxunzipdir” parameter described above, specify the directory where the unzipped files were written.

xlsxguesscells (optional)
In order to best set the length of each character SAS variable, the macro scans cells to “guess” the right value. (In an Excel file, each value in a column may have a different length.) For example, if your spreadsheet contains 20 columns, then the default value of 5000 cells means that 5000/20=250 rows are scanned for the guessing process. You can increase the value to an unlimited number if you suspect that your spreadsheet contains longer text values further down the rows. Increasing the value will result in a longer time to complete the transformation.
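Following the arithmetic above (cells scanned = columns × rows scanned), the value can be derived from the number of rows you want inspected. In this sketch the macro variables ncols, rows2scan and guesscells are illustrative helpers, not part of the package.

```sas
%let ncols      = 20;    /* columns in the spreadsheet (figure from the text) */
%let rows2scan  = 2000;  /* rows to inspect when guessing lengths (example)   */
%let guesscells = %eval(&ncols * &rows2scan);   /* 20 * 2000 = 40000 cells    */

%put NOTE: xlsxguesscells will be set to &guesscells;

%xlsx2sas(
    infilename     = /home/alex45/test/demo.xlsx ,
    outdata        = work.demo ,
    xlsxguesscells = &guesscells
);
```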

2. Interactive GUI

The XLSX2SAS package contains an interactive GUI (Graphical User Interface) which allows you to invoke the macro and set its parameters through an interactive window. You can use the GUI to create the macro call for you and then cut and paste the source code into your SAS program. The interactive GUI can also be used to save a named “profile” for each of your commonly used Excel files and reload that profile for repeated use.

Important Note: Prior to using the GUI, the macro has to be installed in the ‘SASAUTOS’ library with all site parameters correctly set. The BiX libname must also be set (autoexec.sas) prior to using the GUI.

To invoke the GUI, type the following SAS command in any command line of the SAS DMS environment, or assign it to a function key:

afa c=bix.xlsx.xlsx_unix.frame (For Unix/Linux sites)
or
afa c=bix.xlsx.xlsx_win.frame (For Windows sites)

The following figure shows an example of linking the “afa” command to a function key:

Upon launching the above “afa” command, you will be presented with the following window:

The “xlsx2sas wizard” window allows you to specify the mandatory parameters as well as the optional ones. These parameters are grouped into two separate tabs. You can also use features of the window to manage an archive of all your commonly used Excel “profiles” (see below). Once the mandatory parameters are filled in, click the “OK” push button and the macro will start running.

Note for Windows users: Select the method of extracting the zipped XLSX file:

7zip: if you have successfully installed “7zip”, the macro unzips the XLSX file for you. Fill in the complete path and name of your Excel XLSX file, or use the “…” push-button to open the standard Windows file explorer and select the XLSX file there.

Manual: Specify the directory of the unzipped XLSX files. You can manually unzip the XLSX file after changing its suffix from “xlsx” to “zip” and extracting its contents to a specific folder. Since the inner file names in every XLSX zip file are the same, you can always use the same folder (as a “transit” area) to receive your extracted files.

Note for Unix/Linux users: The unzip utility is built into the system, so you don’t need to extract the XLSX file manually. Fill in the complete path and name of your XLSX file (it must be moved or transferred by FTP to the Unix/Linux file system prior to running the macro).

Viewing the result data set

If you check the “view generated data set” option, you can choose between two tools to view the resulting data set immediately upon completion of the conversion.

1) %Visual: If you have licensed the %Visual macro from BiX, you can immediately start to visually analyze your Excel data, create reports and charts, and query the data in many ways. (To download your free trial copy of %Visual: http://bixforsas.com/?page_id=5).

2) SAS Viewtable – the standard SAS tool to view data.

The “optional Parameters” Tab:

When the wizard is invoked, all parameters show their factory default values. In many cases these values must be changed to read your specific spreadsheet correctly.

Full explanation of these parameters can be found in the first section of this document.

Selecting values in some parameters may open new data entry fields that must be filled in like in the following figure:

For example, if you choose to select specific columns from an Excel spreadsheet, then pressing the “…” push-button opens the following wizard for selecting the required columns:

Using the “profile” archiving system

After filling in the mandatory and optional parameters, you can save this combination of settings under a named “profile”, as shown in the following figure:

The profile can be reused later to save time when the same XLSX import has to be repeated.

Each profile is saved per user, in that user's own “SASUSER” library.

Viewing SAS source code

You can view the SAS source code generated by the wizard and save it into your SAS batch job by cut and paste.

For technical support please send email to [email protected] or visit www.bixforsas.com for FAQ. Document number: xlsx2sas v2.01. December 2013