Syntax: IMPORT Procedure


Restriction: PROC IMPORT is available for the following operating environments:

  • OpenVMS Alpha

  • UNIX

  • Microsoft Windows.

PROC IMPORT

  • DATAFILE=" filename " TABLE=" tablename "

    OUT=< libref. > SAS-data-set <( SAS-data-set-options )>

    <DBMS= identifier ><REPLACE> ;

  • < data-source-statement(s) ;>

PROC IMPORT Statement

Featured in: All examples

PROC IMPORT

  • DATAFILE=" filename " TABLE=" tablename "

    OUT=< libref. > SAS-data-set <( SAS-data-set-options )>

    <DBMS= identifier ><REPLACE> ;

Required Arguments

DATAFILE=" filename "

  • specifies the complete path and filename or a fileref for the input PC file, spreadsheet, or delimited external file. If you specify a fileref or if the complete path and filename does not include special characters (such as the backslash in a path ), lowercase characters , or spaces, you can omit the quotation marks. A fileref is a SAS name that is associated with the physical location of the output file. To assign a fileref, use the FILENAME statement. For more information about PC file formats, see SAS/ACCESS for PC Files: Reference .

  • Featured in: Example 1 on page 514, Example 2 on page 517, and Example 3 on page 518

  • Restriction: PROC IMPORT does not support device types or access methods for the FILENAME statement except for DISK. For example, PROC IMPORT does not support the TEMP device type, which creates a temporary external file.

  • Restriction: For client/server applications: When running SAS/ACCESS software on UNIX to access data that is stored on a PC server, you must specify the full path and filename of the file that you want to import. The use of a fileref is not supported.

  • Interaction: For some input data sources like a Microsoft Excel spreadsheet, in order to determine the data type (numeric or character) for a column, the first eight rows of data are scanned and the most prevalent type of data is used. If most of the data in the first eight rows is missing, SAS defaults to the character data type; any subsequent numeric data for that column becomes missing as well. Mixed data can also create missing values. For example, if the first eight rows contain mostly character data, SAS assigns the column as a character data type; any subsequent numeric data for that column becomes missing.

  • Restriction: PROC IMPORT can import data only if the data type is supported by SAS. SAS supports numeric and character types of data but not, for example, binary objects. If the data that you want to import is a type not supported by SAS, PROC IMPORT may not be able to import it correctly. In many cases, the procedure attempts to convert the data to the best of its ability; however, for some types, this is not possible.

  • Tip: For information about how SAS converts data types, see the specific information for the data source that you are importing in SAS/ACCESS for PC Files: Reference . For example, see the chapter Understanding XLS Essentials for a table that lists XLS data types and the resulting SAS variable data type and formats.

  • Tip: For a DBF file, if the file was created by Microsoft Visual FoxPro, the file must be exported by Visual FoxPro into an appropriate dBASE format in order to import the file to SAS.

TABLE=" tablename "

  • specifies the table name of the input DBMS table. If the name does not include special characters (such as question marks), lowercase characters, or spaces, you can omit the quotation marks. Note that the DBMS table name may be case sensitive.

    Requirement: When you import a DBMS table, you must specify the DBMS= option.

    Featured in: Example 4 on page 519

OUT=< libref. > SAS-data-set

  • identifies the output SAS data set with either a one- or two-level SAS name (library and member name). If the specified SAS data set does not exist, PROC IMPORT creates it. If you specify a one-level name, by default PROC IMPORT uses either the USER library (if assigned) or the WORK library (if USER not assigned).

    Featured in: All examples

( SAS-data-set-options )

  • specifies SAS data set options. For example, to assign a password to the resulting SAS data set, you can use the ALTER=, PW=, READ=, or WRITE= data set option, or to import only data that meets a specified condition, you can use the WHERE= data set option. For information about all SAS data set options, see Data Set Options in SAS Language Reference: Dictionary .

    Restriction: You cannot specify data set options when importing delimited, comma-separated, or tab-delimited external files.

    Featured in: Example 3 on page 518

Options

DBMS= identifier

  • specifies the type of data to import. To import a DBMS table, you must specify DBMS= using a valid database identifier. For example, DBMS=ACCESS specifies to import a Microsoft Access 2000 or 2002 table. To import PC files, spreadsheets, and delimited external files, you do not have to specify DBMS= if the filename that is

    specified by DATAFILE= contains a valid extension so that PROC IMPORT can recognize the type of data. For example, PROC IMPORT recognizes the filename ACCOUNTS.WK1 as a Lotus 1-2-3 Release 2 spreadsheet and the filename MYDATA.CSV as a delimited external file that contains comma-separated data values; therefore, a DBMS= specification is not necessary.

    The following values are valid for the DBMS= option:

    Identifier

    Input Data Source

    Extension

    Host Availability

    ACCESS

    Microsoft Access 2000 or 2002 table

    .mdb

    Microsoft Windows [*]

    ACCESS97

    Microsoft Access 97 table

    .mdb

    Microsoft Windows [*]

    ACCESS2000

    Microsoft Access 2000 table

    .mdb

    Microsoft Windows *

    ACCESS2002

    Microsoft Access 2002 table

    .mdb

    Microsoft Windows *

    ACCESSCS

    Microsoft Access table

    .mdb

    UNIX

    CSV

    delimited file (comma-separated values)

    .csv

    OpenVMS Alpha, UNIX, Microsoft Windows

    DBF

    dBASE 5.0, IV, III+, and III files

    .dbf

    UNIX, Microsoft Windows

    DLM

    delimited file (default delimiter is a blank)

    .*

    OpenVMS Alpha, UNIX, Microsoft Windows

    EXCEL

    Excel 2000 or 2002 spreadsheet

    .xls

    Microsoft Windows *

    EXCEL4

    Excel 4.0 spreadsheet

    .xls

    Microsoft Windows

    EXCEL5

    Excel 5.0 or 7.0 (95) spreadsheet

    .xls

    Microsoft Windows

    EXCEL97

    Excel 97 or 7.0 (95) spreadsheet

    .xls

    Microsoft Windows *

    EXCEL2000

    Excel 2000 spreadsheet

    .xls

    Microsoft Windows *

    EXCELCS

    Excel spreadsheet

    .xls

    UNIX

    JMP

    JMP table

    .jmp

    UNIX, Microsoft Windows

    PCFS

    Files on PC server

    .*

    UNIX

    TAB

    delimited file (tab-delimited values)

    .txt

    OpenVMS Alpha, UNIX, Microsoft Windows

    WK1

    Lotus 1-2-3 Release 2 spreadsheet

    .wk1

    Microsoft Windows

    WK3

    Lotus 1-2-3 Release 3 spreadsheet

    .wk3

    Microsoft Windows

    WK4

    Lotus 1-2-3 Release 4 or 5 spreadsheet

    .wk4

    Microsoft Windows

    [*] Not available for Microsoft Windows 64-Bit Edition.

    Restriction: The availability of an input data source depends on

    • the operating environment, and in some cases the platform, as specified in the previous table.

    • whether your site has a license to the SAS/ACCESS software for PC file formats. If you do not have a license, only delimited files are supported.

  • Featured in: Example 1 on page 514 and Example 4 on page 519

    • When you specify a value for DBMS=, consider the following:

      • To import a Microsoft Access table, PROC IMPORT can distinguish whether the table is in Access 97, 2000, or 2002 format regardless of your specification. For example, if you specify DBMS=ACCESS and the table is an Access 97 table, PROC IMPORT will import the file.

      • To import a Microsoft Excel spreadsheet, PROC IMPORT can distinguish some versions regardless of your specification. For example, if you specify DBMS=EXCEL and the spreadsheet is an Excel 97 spreadsheet, PROC IMPORT can import the file. However, if you specify DBMS=EXCEL4 and the spreadsheet is an Excel 2000 spreadsheet, PROC IMPORT cannot import the file. The following table lists the spreadsheets and whether PROC IMPORT can distinguish them based on the DBMS= specification:

        Specification

        Excel 2002

        Excel 2000

        Excel 97

        Excel 5.0

        Excel 4.0

        EXCEL

        yes

        yes

        yes

        yes

        yes

        EXCEL2002

        yes

        yes

        yes

        yes

        yes

        EXCEL2000

        yes

        yes

        yes

        yes

        yes

        EXCEL97

        yes

        yes

        yes

        yes

        yes

        EXCEL5

        no

        no

        no

        yes

        yes

        EXCEL4

        no

        no

        no

        yes

        yes

        Note: Although Excel 4.0 and Excel 5.0 spreadsheets are often interchangeable, it is recommended that you specify the exact version.

REPLACE

  • overwrites an existing SAS data set. If you do not specify REPLACE, PROC IMPORT does not overwrite an existing data set.

    Featured in: Example 1 on page 514

Data Source Statements

Featured in: All examples

PROC IMPORT provides a variety of statements that are specific to the input data source.

Statements for PC Files, Spreadsheets, or Delimited External Files

The following table lists the statements that are available to import PC files, spreadsheets, and delimited external files, and it denotes which statements are valid for a specific data source. For example, Excel spreadsheets have optional statements to indicate whether column names are in the first row of data or which sheet and range of data to import, while a dBASE file (DBF) does not. For more information about PC file formats, see SAS/ACCESS for PC Files: Reference .

Data Source

Supported Syntax

Valid Values

Default Value

CSV/TAB

GETNAMES=

DATAROW=

GUESSING ROWS=

YES NO

1 to 32767

1 to 32767

YES

2

none

DLM

GETNAMES=

DATAROW=

GUESSINGROWS=

YES NO

1 to 32767

1 to 32767

YES

2

none

JMP

DBF

GETDELETED=

YES NO

NO

WK1 / WK3 / WK4

GETNAMES=

RANGE=

SHEET=

YES NO

Range Name or Absolute Range Value, such as A1...C4

Sheet Name

YES

EXCEL4 / EXCEL5

GETNAMES=

RANGE=

SHEET=

YES NO

Range Name or Absolute Range Value, such as A1...C4

Sheet Name

YES

EXCEL

EXCEL97

EXCEL2000

EXCEL2002

GETNAMES=

RANGE=

SHEET=

MIXED=

SCANTEXT=

SCANTIME=

USEDATE=

TEXTSIZE=

DBSASLABEL=

YES NO

Range Name or

Absolute Range Value,

such as A1...C4

Sheet Name

YES NO

YES NO

YES NO

YES NO

1 to 32767

COMPAT NONE

YES

NO

YES

YES

YES

1024

COMPAT

EXCELCS

VERSION=

SERVER=

SERVICE=

PORT=

RANGE=

SHEET=

SCANTEXT=

SCANTIME=

USEDATE=

TEXTSIZE=

DBSASLABEL=

5 95 97 2000

2002

Server Name

Service Name

1 to 32767

Range Name or Absolute Range Value, such as A1...C4

Sheet Name

YES NO

YES NO

YES NO

1 to 32767

COMPAT NONE

97

YES

YES

YES

1024

COMPAT

DATAROW= n ;

  • starts reading data from row number n in the external file.

    Default:

    1

    when GETNAMES=NO

    2

    when GETNAMES=YES (default for GETNAMES=)

  • Interaction: When GETNAMES=YES, DATAROW= must be equal to or greater than 2. When GETNAMES=NO, DATAROW must be equal to or greater than 1.

DBSASLABEL=COMPAT NONE;

  • When DBSASLABEL=COMPAT, the data source s column names are saved as the corresponding SAS label names. This is the default value.

    WHEN DBSASLABEL=NONE, the data source s column names are not saved as SAS label names. SAS label names are left as nulls.

    Featured in: Example 1 on page 514

DELIMITER= char nn x ;

  • for a delimited external file, specifies the delimiter that separates columns of data in the input file. You can specify the delimiter as a single character or as a hexadecimal value. For example, if columns of data are separated by an ampersand, specify DELIMITER= & . If you do not specify DELIMITER=, PROC IMPORT assumes that the delimiter is the blank. You can replace the equal sign with a blank.

GETDELETED=YES NO;

  • for a dBASE file (DBF), indicates whether to write records to the SAS data set that are marked for deletion but have not been purged. You can replace the equal sign with a blank.

GETNAMES=YES NO;

  • for spreadsheets and delimited external files, determines whether to generate SAS variable names from the column names in the input file s first row of data. You can replace the equal sign with a blank.

    If you specify GETNAMES=NO or if the column names are not valid SAS names, PROC IMPORT uses default variable names. For example, for a delimited file, PROC IMPORT uses VAR1, VAR2, VAR3, and so on.

    Note that if a column name contains special characters that are not valid in a SAS name, such as a blank, SAS converts the character to an underscore . For example, the column name Occupancy Code would become the variable name Occupancy_Code .

GUESSING ROWS=1 to 3276;

  • scans data for its data type from row 1 to the row number that is specified.

    Note: This number should be greater than the value that is specified for DATAROW=.

MIXED=YES NO;

  • converts numeric data values into character data values for a column that contains mixed data types. This option is valid only while importing data from Excel. The default is NO, which means that numeric data will be imported as missing values in a character column. If MIXED=YES, then the engine will assign a SAS character type for the column and convert all numeric data values to character data values. This option is valid only while reading (importing) data into SAS.

PORT=1 to 3276;

  • scans data for its data type from row 1 to the row number that is specified.

    Note: This number should be greater than the value that is specified for DATAROW=.

TEXTSIZE=1 to 32767

  • specifies the field length that is allowed for importing Microsoft Excel 97, 2000, or 2002 Memo fields.

RANGE=" range-name absolute-range ";

  • subsets a spreadsheet by identifying the rectangular set of cells to import from the specified spreadsheet. The syntax for range-name and absolute-range is native to the file being read. You can replace the equal sign with a blank.

    range-name

    is a name that has been assigned to represent a range, such as a range of cells within the spreadsheet.

    Limitation: SAS supports range names up to 32 characters. If a range name exceeds 32 characters, SAS will notify you that the name is invalid.

    Tip: For Microsoft Excel, range names do not contain special characters such as spaces or hyphens.

    absolute-range

    identifies the top left cell that begins the range and the bottom right cell that ends the range. For Excel 4.0, 5.0, and 7.0 (95), the beginning and ending cells are separated by two periods; that is, C9..F12 specifies a cell range that begins at cell C9, ends at cell F12, and includes all the cells in between. For Excel 97, 2000, and 2002, the beginning and ending cells are separated by a colon “ that is, C9:F12 .

    Tip: For Excel 97, 2000, and 2002, you can include the spreadsheet name with an absolute range, such as range="North B$a1:d3" . If you do not include the spreadsheet name, PROC IMPORT uses the first sheet in the workbook or the spreadsheet name specified with SHEET=.

    Default: The entire spreadsheet is selected.

    Interaction: For Excel 97, 2000, and 2002 spreadsheets, when RANGE= is specified, a spreadsheet name specified with SHEET= is ignored when the conflict occurs.

SCANTEXT=YES NO;

  • scans the length of text data for a data source column and uses the length of the longest string data that it finds as the SAS column width. However, if the maximum length that it finds is greater than what is specified in the TEXTSIZE= option, then the smaller value that is specified in TEXTSIZE= will be applied as the SAS variable width.

SCANTIME=YES NO;

  • scans all row values for a DATETIME data type field and automatically determines the TIME data type if only time values (that is, no date or datetime values) exist in the column.

SERVER=" PC-server-name ";

  • specifies the name of the PC server. You must bring up the listener on the PC server before you can establish a connection to it. You can configure the service name, port number, maximum number of connections allowed, and use of data encryption on your PC server. This is a required statement. Refer to your PC server administrator for the information that is needed. Alias: SERVER_NAME=.

SERVICE=" service-name ";

  • specifies the service name that is defined on your service file for your client and server machines. This statement and the PORT= statement should not be used in the same procedure. Note that this service name must be defined on both your UNIX machine and your PC server. Alias: SERVER_NAME=, SERVICE_NAME=.

SHEET= spreadsheet-name ;

  • identifies a particular spreadsheet in a group of spreadsheets. Use this statement with spreadsheets that support multiple spreadsheets within a single file. The naming convention for the spreadsheet name is native to the file being read.

    Featured in: Example 2 on page 517

    Default: The default depends on the type of spreadsheet. For Excel 4.0 and 5.0, PROC IMPORT reads the first spreadsheet in the file. For Excel 97 and later, PROC IMPORT reads the first spreadsheet from an ascending sort of the spreadsheet names. To be certain that PROC IMPORT reads the desired spreadsheet, you should identify the spreadsheet by specifying SHEET=.

    Limitation: SAS supports spreadsheet names up to 31 characters. With the $ appended, the maximum length of a spreadsheet name is 32 characters.

USEDATE=YES NO;

  • If USEDATE=YES, then DATE. format is used for date/time columns in the data source table while importing data from Excel workbook. If USEDATE=NO, then DATETIME. format is used for date/time.

VERSION=" file-version ";

  • specifies the version of file that you want to create with if the file does not exist on your PC server yet. The default version is data-source specific. For Microsoft Excel workbook, the valid values are 2002 , 2000 , 97 , 95 and 5 , and its default value is 97 .

    Note: Always quote the version value.

    Note: If the file already exists in the PC server, then this value can be ignored.

Statements for DBMS Tables

The following data source statements are available to establish a connection to the DBMS when you import a DBMS table.

Data Source

Supported Syntax

Valid Values

Default Value

ACCESS

ACCESS97

ACCESS2000

ACCESS2002

DATABASE=

DBPWD=

UID=

PWD=

WGDB=

SCANMEMO=

SCANTIME=

USEDATE=

MEMOSIZE=

DBSASLABEL=

The complete path and filename for the MS ACCESS database file.

Database password

User ID

User password

The complete path and filename for the Workgroup Administration file.

YES NO

YES NO

YES NO

1 to 32767

COMPAT NONE

YES

YES

NO

1024

COMPAT

ACCESSCS

VERSION=

SERVER=

SERVICE=

PORT=

DATABASE=

DBPWD=

UID=

PWD=

WGDB=

SCANMEMO=

SCANTIME=

USEDATE=

MEMOSIZE=

DBSASLABEL=

97 2000 2002

Server Name

Service Name

1 to 32767

The complete path and filename for the MS ACCESS database file.

Database password

User ID

User password

The complete path and file name for the Workgroup Administration file.

YES NO

YES NO

YES NO

1 to 32767

COMPAT NONE

2000

YES

YES

YES

1024

COMPAT

DATABASE=" database ";

  • specifies the complete path and filename of the database that contains the specified DBMS table. If the database name does not contain lowercase characters, special characters, or national characters ($, #, or @), you can omit the quotation marks. You can replace the equal sign with a blank.

    Note: A default may be configured in the DBMS client software; however, SAS does not generate a default value.

DBPWD=" database password ";

  • specifies a password that allows access to a database. You can replace the equal sign with a blank.

DBSASLABEL=COMPAT NONE;

  • When DBSASLABEL=COMPAT, the data source s column names are saved as the corresponding SAS label names. This is the default value.

    WHEN DBSASLABEL=NONE, the data source s column names are not saved as SAS label names. SAS label names are left as nulls.

    Featured in: Example 1 on page 514

MEMOSIZE=" field-length ";

  • specifies the field length for importing Microsoft Access Memo fields.

  • Range:

  • Default:

  • Tip:

  • 1 to 32,767

  • 1024

  • To prevent Memo fields from being imported, you can specify MEMOSIZE=0

    Range: 1 - 32,767 Default: 1024 Tip: To prevent Memo fields from being imported, you can specify MEMOSIZE=0.

PORT=1 to 3276;

  • scans data for its data type from row 1 to the row number that is specified.

    Note: This number should be greater than the value that is specified for DATAROW=.

PWD=" password ";

  • specifies the user password used by the DBMS to validate a specific userid . If the password does not contain lowercase characters, special characters, or national characters, you can omit the quotation marks. You can replace the equal sign with a blank.

    Note: The DBMS client software may default to the userid and password that were used to log in to the operating environment; SAS does not generate a default value.

SCANMEMO=YES NO;

  • scans the length of data for memo fields and uses the length of the longest string data that it finds as the SAS column width. However, if the maximum length that it finds is greater than what is specified in the MEMOSIZE= option, then the smaller value that is specified in MEMOSIZE= will be applied as the SAS variable width.

SCANTIME=YES NO;

  • scans all row values for a DATETIME data type field and automatically determines the TIME data type if only time values (that is, no date or datetime values) exist in the column.

SERVER=" PC-server-name ";

  • specifies the name of the PC server. You must bring up the listener on the PC server before you can establish a connection to it. You can configure the service name, port number, maximum number of connections allowed, and use of data encryption on your PC server. This is a required statement. Refer to your PC server administrator for the information that is needed. Alias: SERVER_NAME=.

SERVICE=" service-name ";

  • specifies the service name that is defined on your service file for your client and server machines. This statement and the PORT= statement should not be used in the same procedure. Note that this service name must be defined on both your UNIX machine and your PC server. Alias: SERVER_NAME=.

UID= " user-id ";

  • identifies the user to the DBMS. If the userid does not contain lowercase characters, special characters, or national characters, you can omit the quotation marks. You can replace the equal sign with a blank.

    Note: The DBMS client software may default to the userid and password that were used to log in to the operating environment; SAS does not generate a default value.

WGDB= " workgroup-database-name ";

  • specifies the workgroup (security) database name that contains the USERID and PWD data for the DBMS. If the workgroup database name does not contain lowercase characters, special characters, or national characters, you can omit the quotation marks. You can replace the equal sign with a blank.

    Note: A default workgroup database may be used by the DBMS; SAS does not generate a default value.

USEDATE=YES NO;

  • If USEDATE=YES, then DATE. format is used for date/time columns in the data source table while importing data from Excel workbook. If USEDATE=NO, then DATETIME. format is used for date/time.

VERSION=" file-version ";

  • specifies the version of file that you want to create with if the file does not exist on your PC server yet. The default version is data-source specific. For Microsoft Excel workbook, the valid values are 2002 , 2000 , 97 , 95 and 5 , and its default value is 97 .

    Note: Always quote the version value.

    Note: If the file already exists in the PC Server, this value can be ignored.

Security Levels for Microsoft Access Tables

Microsoft Access tables have the following levels of security, for which specific combinations of security statements must be used:

  • None

    • Do not specify DBPWD=, PWD=, UID=, or WGDB=.

  • Password

    • Specify only DBPWD=.

  • User-level

    • Specify only PWD=, UID=, and WGDB=.

  • Full

    • Specify DBPWD=, PWD=, UID=, and WGDB=.

Each statement has a default value; however, you may find it necessary to provide a value for each statement explicitly.




Base SAS 9.1.3 Procedures Guide (Vol. 1)
Base SAS 9.1 Procedures Guide, Volumes 1, 2, 3 and 4
ISBN: 1590472047
EAN: 2147483647
Year: 2004
Pages: 260

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net