Restriction: PROC IMPORT is available for the following operating environments:
OpenVMS Alpha
UNIX
Microsoft Windows.
PROC IMPORT
DATAFILE=" filename " TABLE=" tablename "
OUT=< libref. > SAS-data-set <( SAS-data-set-options )>
<DBMS= identifier ><REPLACE> ;
< data-source-statement(s) ;>
Featured in: All examples
PROC IMPORT
DATAFILE=" filename " TABLE=" tablename "
OUT=< libref. > SAS-data-set <( SAS-data-set-options )>
<DBMS= identifier ><REPLACE> ;
DATAFILE=" filename "
specifies the complete path and filename or a fileref for the input PC file, spreadsheet, or delimited external file. If you specify a fileref or if the complete path and filename does not include special characters (such as the backslash in a path ), lowercase characters , or spaces, you can omit the quotation marks. A fileref is a SAS name that is associated with the physical location of the output file. To assign a fileref, use the FILENAME statement. For more information about PC file formats, see SAS/ACCESS for PC Files: Reference .
Featured in: Example 1 on page 514, Example 2 on page 517, and Example 3 on page 518
Restriction: PROC IMPORT does not support device types or access methods for the FILENAME statement except for DISK. For example, PROC IMPORT does not support the TEMP device type, which creates a temporary external file.
Restriction: For client/server applications: When running SAS/ACCESS software on UNIX to access data that is stored on a PC server, you must specify the full path and filename of the file that you want to import. The use of a fileref is not supported.
Interaction: For some input data sources like a Microsoft Excel spreadsheet, in order to determine the data type (numeric or character) for a column, the first eight rows of data are scanned and the most prevalent type of data is used. If most of the data in the first eight rows is missing, SAS defaults to the character data type; any subsequent numeric data for that column becomes missing as well. Mixed data can also create missing values. For example, if the first eight rows contain mostly character data, SAS assigns the column as a character data type; any subsequent numeric data for that column becomes missing.
Restriction: PROC IMPORT can import data only if the data type is supported by SAS. SAS supports numeric and character types of data but not, for example, binary objects. If the data that you want to import is a type not supported by SAS, PROC IMPORT may not be able to import it correctly. In many cases, the procedure attempts to convert the data to the best of its ability; however, for some types, this is not possible.
Tip: For information about how SAS converts data types, see the specific information for the data source that you are importing in SAS/ACCESS for PC Files: Reference . For example, see the chapter Understanding XLS Essentials for a table that lists XLS data types and the resulting SAS variable data type and formats.
Tip: For a DBF file, if the file was created by Microsoft Visual FoxPro, the file must be exported by Visual FoxPro into an appropriate dBASE format in order to import the file to SAS.
TABLE=" tablename "
specifies the table name of the input DBMS table. If the name does not include special characters (such as question marks), lowercase characters, or spaces, you can omit the quotation marks. Note that the DBMS table name may be case sensitive.
Requirement: When you import a DBMS table, you must specify the DBMS= option.
Featured in: Example 4 on page 519
OUT=< libref. > SAS-data-set
identifies the output SAS data set with either a one- or two-level SAS name (library and member name). If the specified SAS data set does not exist, PROC IMPORT creates it. If you specify a one-level name, by default PROC IMPORT uses either the USER library (if assigned) or the WORK library (if USER not assigned).
Featured in: All examples
( SAS-data-set-options )
specifies SAS data set options. For example, to assign a password to the resulting SAS data set, you can use the ALTER=, PW=, READ=, or WRITE= data set option, or to import only data that meets a specified condition, you can use the WHERE= data set option. For information about all SAS data set options, see Data Set Options in SAS Language Reference: Dictionary .
Restriction: You cannot specify data set options when importing delimited, comma-separated, or tab-delimited external files.
Featured in: Example 3 on page 518
DBMS= identifier
specifies the type of data to import. To import a DBMS table, you must specify DBMS= using a valid database identifier. For example, DBMS=ACCESS specifies to import a Microsoft Access 2000 or 2002 table. To import PC files, spreadsheets, and delimited external files, you do not have to specify DBMS= if the filename that is
specified by DATAFILE= contains a valid extension so that PROC IMPORT can recognize the type of data. For example, PROC IMPORT recognizes the filename ACCOUNTS.WK1 as a Lotus 1-2-3 Release 2 spreadsheet and the filename MYDATA.CSV as a delimited external file that contains comma-separated data values; therefore, a DBMS= specification is not necessary.
The following values are valid for the DBMS= option:
Identifier | Input Data Source | Extension | Host Availability |
---|---|---|---|
ACCESS | Microsoft Access 2000 or 2002 table | .mdb | Microsoft Windows [*] |
ACCESS97 | Microsoft Access 97 table | .mdb | Microsoft Windows [*] |
ACCESS2000 | Microsoft Access 2000 table | .mdb | Microsoft Windows * |
ACCESS2002 | Microsoft Access 2002 table | .mdb | Microsoft Windows * |
ACCESSCS | Microsoft Access table | .mdb | UNIX |
CSV | delimited file (comma-separated values) | .csv | OpenVMS Alpha, UNIX, Microsoft Windows |
DBF | dBASE 5.0, IV, III+, and III files | .dbf | UNIX, Microsoft Windows |
DLM | delimited file (default delimiter is a blank) | .* | OpenVMS Alpha, UNIX, Microsoft Windows |
EXCEL | Excel 2000 or 2002 spreadsheet | .xls | Microsoft Windows * |
EXCEL4 | Excel 4.0 spreadsheet | .xls | Microsoft Windows |
EXCEL5 | Excel 5.0 or 7.0 (95) spreadsheet | .xls | Microsoft Windows |
EXCEL97 | Excel 97 or 7.0 (95) spreadsheet | .xls | Microsoft Windows * |
EXCEL2000 | Excel 2000 spreadsheet | .xls | Microsoft Windows * |
EXCELCS | Excel spreadsheet | .xls | UNIX |
JMP | JMP table | .jmp | UNIX, Microsoft Windows |
PCFS | Files on PC server | .* | UNIX |
TAB | delimited file (tab-delimited values) | .txt | OpenVMS Alpha, UNIX, Microsoft Windows |
WK1 | Lotus 1-2-3 Release 2 spreadsheet | .wk1 | Microsoft Windows |
WK3 | Lotus 1-2-3 Release 3 spreadsheet | .wk3 | Microsoft Windows |
WK4 | Lotus 1-2-3 Release 4 or 5 spreadsheet | .wk4 | Microsoft Windows |
[*] Not available for Microsoft Windows 64-Bit Edition. |
Restriction: The availability of an input data source depends on
the operating environment, and in some cases the platform, as specified in the previous table.
whether your site has a license to the SAS/ACCESS software for PC file formats. If you do not have a license, only delimited files are supported.
Featured in: Example 1 on page 514 and Example 4 on page 519
When you specify a value for DBMS=, consider the following:
To import a Microsoft Access table, PROC IMPORT can distinguish whether the table is in Access 97, 2000, or 2002 format regardless of your specification. For example, if you specify DBMS=ACCESS and the table is an Access 97 table, PROC IMPORT will import the file.
To import a Microsoft Excel spreadsheet, PROC IMPORT can distinguish some versions regardless of your specification. For example, if you specify DBMS=EXCEL and the spreadsheet is an Excel 97 spreadsheet, PROC IMPORT can import the file. However, if you specify DBMS=EXCEL4 and the spreadsheet is an Excel 2000 spreadsheet, PROC IMPORT cannot import the file. The following table lists the spreadsheets and whether PROC IMPORT can distinguish them based on the DBMS= specification:
Specification | Excel 2002 | Excel 2000 | Excel 97 | Excel 5.0 | Excel 4.0 |
---|---|---|---|---|---|
EXCEL | yes | yes | yes | yes | yes |
EXCEL2002 | yes | yes | yes | yes | yes |
EXCEL2000 | yes | yes | yes | yes | yes |
EXCEL97 | yes | yes | yes | yes | yes |
EXCEL5 | no | no | no | yes | yes |
EXCEL4 | no | no | no | yes | yes |
Note: Although Excel 4.0 and Excel 5.0 spreadsheets are often interchangeable, it is recommended that you specify the exact version.
REPLACE
overwrites an existing SAS data set. If you do not specify REPLACE, PROC IMPORT does not overwrite an existing data set.
Featured in: Example 1 on page 514
Featured in: All examples
PROC IMPORT provides a variety of statements that are specific to the input data source.
The following table lists the statements that are available to import PC files, spreadsheets, and delimited external files, and it denotes which statements are valid for a specific data source. For example, Excel spreadsheets have optional statements to indicate whether column names are in the first row of data or which sheet and range of data to import, while a dBASE file (DBF) does not. For more information about PC file formats, see SAS/ACCESS for PC Files: Reference .
Data Source | Supported Syntax | Valid Values | Default Value |
---|---|---|---|
CSV/TAB | GETNAMES= DATAROW= GUESSING ROWS= | YES NO 1 to 32767 1 to 32767 | YES 2 none |
DLM | GETNAMES= DATAROW= GUESSINGROWS= | YES NO 1 to 32767 1 to 32767 | YES 2 none |
JMP | |||
DBF | GETDELETED= | YES NO | NO |
WK1 / WK3 / WK4 | GETNAMES= RANGE= SHEET= | YES NO Range Name or Absolute Range Value, such as A1...C4 Sheet Name | YES |
EXCEL4 / EXCEL5 | GETNAMES= RANGE= SHEET= | YES NO Range Name or Absolute Range Value, such as A1...C4 Sheet Name | YES |
EXCEL EXCEL97 EXCEL2000 EXCEL2002 | GETNAMES= RANGE= SHEET= MIXED= SCANTEXT= SCANTIME= USEDATE= TEXTSIZE= DBSASLABEL= | YES NO Range Name or Absolute Range Value, such as A1...C4 Sheet Name YES NO YES NO YES NO YES NO 1 to 32767 COMPAT NONE | YES NO YES YES YES 1024 COMPAT |
EXCELCS | VERSION= SERVER= SERVICE= PORT= RANGE= SHEET= SCANTEXT= SCANTIME= USEDATE= TEXTSIZE= DBSASLABEL= | 5 95 97 2000 2002 Server Name Service Name 1 to 32767 Range Name or Absolute Range Value, such as A1...C4 Sheet Name YES NO YES NO YES NO 1 to 32767 COMPAT NONE | 97 YES YES YES 1024 COMPAT |
DATAROW= n ;
starts reading data from row number n in the external file.
Default:
1 | when GETNAMES=NO |
2 | when GETNAMES=YES (default for GETNAMES=) |
Interaction: When GETNAMES=YES, DATAROW= must be equal to or greater than 2. When GETNAMES=NO, DATAROW must be equal to or greater than 1.
DBSASLABEL=COMPAT NONE;
When DBSASLABEL=COMPAT, the data source s column names are saved as the corresponding SAS label names. This is the default value.
WHEN DBSASLABEL=NONE, the data source s column names are not saved as SAS label names. SAS label names are left as nulls.
Featured in: Example 1 on page 514
DELIMITER= char nn x ;
for a delimited external file, specifies the delimiter that separates columns of data in the input file. You can specify the delimiter as a single character or as a hexadecimal value. For example, if columns of data are separated by an ampersand, specify DELIMITER= & . If you do not specify DELIMITER=, PROC IMPORT assumes that the delimiter is the blank. You can replace the equal sign with a blank.
GETDELETED=YES NO;
for a dBASE file (DBF), indicates whether to write records to the SAS data set that are marked for deletion but have not been purged. You can replace the equal sign with a blank.
GETNAMES=YES NO;
for spreadsheets and delimited external files, determines whether to generate SAS variable names from the column names in the input file s first row of data. You can replace the equal sign with a blank.
If you specify GETNAMES=NO or if the column names are not valid SAS names, PROC IMPORT uses default variable names. For example, for a delimited file, PROC IMPORT uses VAR1, VAR2, VAR3, and so on.
Note that if a column name contains special characters that are not valid in a SAS name, such as a blank, SAS converts the character to an underscore . For example, the column name Occupancy Code would become the variable name Occupancy_Code .
GUESSING ROWS=1 to 3276;
scans data for its data type from row 1 to the row number that is specified.
Note: This number should be greater than the value that is specified for DATAROW=.
MIXED=YES NO;
converts numeric data values into character data values for a column that contains mixed data types. This option is valid only while importing data from Excel. The default is NO, which means that numeric data will be imported as missing values in a character column. If MIXED=YES, then the engine will assign a SAS character type for the column and convert all numeric data values to character data values. This option is valid only while reading (importing) data into SAS.
PORT=1 to 3276;
scans data for its data type from row 1 to the row number that is specified.
Note: This number should be greater than the value that is specified for DATAROW=.
TEXTSIZE=1 to 32767
specifies the field length that is allowed for importing Microsoft Excel 97, 2000, or 2002 Memo fields.
RANGE=" range-name absolute-range ";
subsets a spreadsheet by identifying the rectangular set of cells to import from the specified spreadsheet. The syntax for range-name and absolute-range is native to the file being read. You can replace the equal sign with a blank.
range-name | is a name that has been assigned to represent a range, such as a range of cells within the spreadsheet. Limitation: SAS supports range names up to 32 characters. If a range name exceeds 32 characters, SAS will notify you that the name is invalid. Tip: For Microsoft Excel, range names do not contain special characters such as spaces or hyphens. |
absolute-range | identifies the top left cell that begins the range and the bottom right cell that ends the range. For Excel 4.0, 5.0, and 7.0 (95), the beginning and ending cells are separated by two periods; that is, C9..F12 specifies a cell range that begins at cell C9, ends at cell F12, and includes all the cells in between. For Excel 97, 2000, and 2002, the beginning and ending cells are separated by a colon “ that is, C9:F12 . Tip: For Excel 97, 2000, and 2002, you can include the spreadsheet name with an absolute range, such as range="North B$a1:d3" . If you do not include the spreadsheet name, PROC IMPORT uses the first sheet in the workbook or the spreadsheet name specified with SHEET=. |
Default: The entire spreadsheet is selected.
Interaction: For Excel 97, 2000, and 2002 spreadsheets, when RANGE= is specified, a spreadsheet name specified with SHEET= is ignored when the conflict occurs.
SCANTEXT=YES NO;
scans the length of text data for a data source column and uses the length of the longest string data that it finds as the SAS column width. However, if the maximum length that it finds is greater than what is specified in the TEXTSIZE= option, then the smaller value that is specified in TEXTSIZE= will be applied as the SAS variable width.
SCANTIME=YES NO;
scans all row values for a DATETIME data type field and automatically determines the TIME data type if only time values (that is, no date or datetime values) exist in the column.
SERVER=" PC-server-name ";
specifies the name of the PC server. You must bring up the listener on the PC server before you can establish a connection to it. You can configure the service name, port number, maximum number of connections allowed, and use of data encryption on your PC server. This is a required statement. Refer to your PC server administrator for the information that is needed. Alias: SERVER_NAME=.
SERVICE=" service-name ";
specifies the service name that is defined on your service file for your client and server machines. This statement and the PORT= statement should not be used in the same procedure. Note that this service name must be defined on both your UNIX machine and your PC server. Alias: SERVER_NAME=, SERVICE_NAME=.
SHEET= spreadsheet-name ;
identifies a particular spreadsheet in a group of spreadsheets. Use this statement with spreadsheets that support multiple spreadsheets within a single file. The naming convention for the spreadsheet name is native to the file being read.
Featured in: Example 2 on page 517
Default: The default depends on the type of spreadsheet. For Excel 4.0 and 5.0, PROC IMPORT reads the first spreadsheet in the file. For Excel 97 and later, PROC IMPORT reads the first spreadsheet from an ascending sort of the spreadsheet names. To be certain that PROC IMPORT reads the desired spreadsheet, you should identify the spreadsheet by specifying SHEET=.
Limitation: SAS supports spreadsheet names up to 31 characters. With the $ appended, the maximum length of a spreadsheet name is 32 characters.
USEDATE=YES NO;
If USEDATE=YES, then DATE. format is used for date/time columns in the data source table while importing data from Excel workbook. If USEDATE=NO, then DATETIME. format is used for date/time.
VERSION=" file-version ";
specifies the version of file that you want to create with if the file does not exist on your PC server yet. The default version is data-source specific. For Microsoft Excel workbook, the valid values are 2002 , 2000 , 97 , 95 and 5 , and its default value is 97 .
Note: Always quote the version value.
Note: If the file already exists in the PC server, then this value can be ignored.
The following data source statements are available to establish a connection to the DBMS when you import a DBMS table.
Data Source | Supported Syntax | Valid Values | Default Value |
ACCESS ACCESS97 ACCESS2000 ACCESS2002 | DATABASE= DBPWD= UID= PWD= WGDB= SCANMEMO= SCANTIME= USEDATE= MEMOSIZE= DBSASLABEL= | The complete path and filename for the MS ACCESS database file. Database password User ID User password The complete path and filename for the Workgroup Administration file. YES NO YES NO YES NO 1 to 32767 COMPAT NONE | YES YES NO 1024 COMPAT |
ACCESSCS | VERSION= SERVER= SERVICE= PORT= DATABASE= DBPWD= UID= PWD= WGDB= SCANMEMO= SCANTIME= USEDATE= MEMOSIZE= DBSASLABEL= | 97 2000 2002 Server Name Service Name 1 to 32767 The complete path and filename for the MS ACCESS database file. Database password User ID User password The complete path and file name for the Workgroup Administration file. YES NO YES NO YES NO 1 to 32767 COMPAT NONE | 2000 YES YES YES 1024 COMPAT |
DATABASE=" database ";
specifies the complete path and filename of the database that contains the specified DBMS table. If the database name does not contain lowercase characters, special characters, or national characters ($, #, or @), you can omit the quotation marks. You can replace the equal sign with a blank.
Note: A default may be configured in the DBMS client software; however, SAS does not generate a default value.
DBPWD=" database password ";
specifies a password that allows access to a database. You can replace the equal sign with a blank.
DBSASLABEL=COMPAT NONE;
When DBSASLABEL=COMPAT, the data source s column names are saved as the corresponding SAS label names. This is the default value.
WHEN DBSASLABEL=NONE, the data source s column names are not saved as SAS label names. SAS label names are left as nulls.
Featured in: Example 1 on page 514
MEMOSIZE=" field-length ";
specifies the field length for importing Microsoft Access Memo fields.
Range:
Default:
Tip:
1 to 32,767
1024
To prevent Memo fields from being imported, you can specify MEMOSIZE=0
Range: 1 - 32,767 Default: 1024 Tip: To prevent Memo fields from being imported, you can specify MEMOSIZE=0.
PORT=1 to 3276;
scans data for its data type from row 1 to the row number that is specified.
Note: This number should be greater than the value that is specified for DATAROW=.
PWD=" password ";
specifies the user password used by the DBMS to validate a specific userid . If the password does not contain lowercase characters, special characters, or national characters, you can omit the quotation marks. You can replace the equal sign with a blank.
Note: The DBMS client software may default to the userid and password that were used to log in to the operating environment; SAS does not generate a default value.
SCANMEMO=YES NO;
scans the length of data for memo fields and uses the length of the longest string data that it finds as the SAS column width. However, if the maximum length that it finds is greater than what is specified in the MEMOSIZE= option, then the smaller value that is specified in MEMOSIZE= will be applied as the SAS variable width.
SCANTIME=YES NO;
scans all row values for a DATETIME data type field and automatically determines the TIME data type if only time values (that is, no date or datetime values) exist in the column.
SERVER=" PC-server-name ";
specifies the name of the PC server. You must bring up the listener on the PC server before you can establish a connection to it. You can configure the service name, port number, maximum number of connections allowed, and use of data encryption on your PC server. This is a required statement. Refer to your PC server administrator for the information that is needed. Alias: SERVER_NAME=.
SERVICE=" service-name ";
specifies the service name that is defined on your service file for your client and server machines. This statement and the PORT= statement should not be used in the same procedure. Note that this service name must be defined on both your UNIX machine and your PC server. Alias: SERVER_NAME=.
UID= " user-id ";
identifies the user to the DBMS. If the userid does not contain lowercase characters, special characters, or national characters, you can omit the quotation marks. You can replace the equal sign with a blank.
Note: The DBMS client software may default to the userid and password that were used to log in to the operating environment; SAS does not generate a default value.
WGDB= " workgroup-database-name ";
specifies the workgroup (security) database name that contains the USERID and PWD data for the DBMS. If the workgroup database name does not contain lowercase characters, special characters, or national characters, you can omit the quotation marks. You can replace the equal sign with a blank.
Note: A default workgroup database may be used by the DBMS; SAS does not generate a default value.
USEDATE=YES NO;
If USEDATE=YES, then DATE. format is used for date/time columns in the data source table while importing data from Excel workbook. If USEDATE=NO, then DATETIME. format is used for date/time.
VERSION=" file-version ";
specifies the version of file that you want to create with if the file does not exist on your PC server yet. The default version is data-source specific. For Microsoft Excel workbook, the valid values are 2002 , 2000 , 97 , 95 and 5 , and its default value is 97 .
Note: Always quote the version value.
Note: If the file already exists in the PC Server, this value can be ignored.
Microsoft Access tables have the following levels of security, for which specific combinations of security statements must be used:
None
Do not specify DBPWD=, PWD=, UID=, or WGDB=.
Password
Specify only DBPWD=.
User-level
Specify only PWD=, UID=, and WGDB=.
Full
Specify DBPWD=, PWD=, UID=, and WGDB=.
Each statement has a default value; however, you may find it necessary to provide a value for each statement explicitly.