Lesson 1: Importing and Exporting Data


Importing data from an external source into an instance of SQL Server is likely to be the first step that you will perform once you set up your database. After data has been imported into your database, you can start working with that data by using Transact-SQL statements and other tools to view and modify it. You might also find that you need to export that data out of the database. Importing data is the process of retrieving data from sources external to SQL Server (for example, an ASCII text file) and inserting it into SQL Server tables. Exporting data is the process of extracting data from an instance of SQL Server into some user-specified format (for example, copying the contents of a SQL Server table to a Microsoft Access database). In this lesson, you will learn how to use the bcp command prompt utility to copy data to and from a SQL Server database and how to use the BULK INSERT statement to import data from a data file into a database. You will also learn how Data Transformation Services (DTS) enables you to extract, transform, and consolidate data from disparate sources.


After this lesson, you will be able to:

  • Use the bcp command prompt utility to import and export data.
  • Use the BULK INSERT statement to import data.
  • Describe how DTS is used to extract, transform, and consolidate data.

Estimated lesson time: 30 minutes


Using the bcp Utility and the BULK INSERT Statement

The bcp command prompt utility copies SQL Server data to or from a data file. You will use this utility most frequently to transfer large volumes of data into a SQL Server table from another program, often from another database management system (DBMS). When the bcp utility is used, the data is first exported from the source program to a data file and is then imported from the data file into a SQL Server table. Alternatively, bcp can be used to transfer data from a SQL Server table to a data file for use in other programs, such as Microsoft Excel.

Data can also be transferred into a SQL Server table from a data file by using the BULK INSERT statement. The BULK INSERT statement cannot bulk copy data from an instance of SQL Server to a data file, however. With the BULK INSERT statement, you can bulk copy data to an instance of SQL Server by using the functionality of the bcp utility in a Transact-SQL statement (rather than from the command prompt).

In order for the bcp utility and the BULK INSERT statement to insert data, the data file must be in row and column format. SQL Server can accept data in any ASCII or binary format as long as the terminators (characters used to separate columns and rows) can be described. The structure of the data file does not need to be identical to the structure of the SQL Server table, because bcp and BULK INSERT enable columns to be skipped or reordered during the bulk copy process.

Data that is bulk copied into an instance of SQL Server is appended to any existing contents in a table. Data that is bulk copied from an instance of SQL Server to a data file overwrites the previous contents of the data file.

You should keep in mind the following guidelines when you bulk copy data:

  • If you are importing data, the destination table must already exist. If you are exporting to a file, bcp will create the file. The number of fields in the data file does not have to match the number of columns in the table or be in the same order.
  • The data in the data file must be in character format or in a format that the bcp utility generated previously, such as native format. Each column in the table must be compatible with the field in the data file being copied. For example, it is not possible to copy an int field to a datetime column using native format bcp.
  • Relevant permissions to bulk copy data are required for source and destination files and tables. To bulk copy data from a data file into a table, you must have INSERT and SELECT permissions for the table. To bulk copy a table or view to a data file, you must have SELECT permission for the table or view being bulk copied.
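As a quick illustration of the permission requirements, the following statement grants a database user both permissions on the Publishers table in the Pubs database (used in the examples that follow). This is a sketch only; the user name BulkUser is a placeholder for an actual database user:

 USE Pubs
 -- BulkUser is a hypothetical database user; substitute an actual user name
 GRANT SELECT, INSERT ON publishers TO BulkUser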

The following bcp command copies data from the Publishers table in the Pubs database and into the Publishers.txt file:

 bcp pubs..publishers out publishers.txt -c -T 

The command first identifies the database (Pubs) and the table (Publishers) from which the data will be extracted. The out keyword specifies that the data will be exported from the table and into the Publishers.txt file; the -c switch specifies that a character (char) data format should be used; and the -T switch specifies that bcp should use a trusted connection to connect to SQL Server. The contents of the Publishers.txt file will include all of the data from the Publishers table. You can use a text editor, such as Notepad, to view the data.
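The bcp utility also accepts switches for naming the server and overriding the default terminators. The following command is a sketch only; the server name Sales1 and the pipe character used as a field terminator are assumptions, not values from this lesson:

 bcp pubs..publishers out publishers.txt -c -t "|" -S Sales1 -T

The -t switch sets the field terminator, the -S switch names the server to connect to, and the -T switch still requests a trusted connection.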

You can also use the bcp command prompt utility to bulk copy data from the Publishers.txt file into the Publishers2 table in the Pubs database:

 bcp pubs..publishers2 in publishers.txt -c -T 

Notice that this command uses the in keyword rather than the out keyword. By running this command, you insert all data from the Publishers.txt file into the Publishers2 table.
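If the data file includes a header line or might contain rows that fail to load, you can add the -F switch (first row to copy) and the -e switch (error file). The following command is a sketch only; it assumes a file whose first line is a header row, and the error file name is chosen for this example:

 bcp pubs..publishers2 in publishers.txt -c -T -F 2 -e bcperrors.txt

With -F 2, bcp starts copying at the second row of the data file; rows that cannot be inserted are written to the bcperrors.txt file.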

You can also use the BULK INSERT statement from a query tool, such as Query Analyzer, to bulk copy the data into the Publishers2 table:

 USE Pubs
 BULK INSERT Publishers2
 FROM 'c:\publishers.txt'
 WITH (DATAFILETYPE = 'CHAR')

NOTE


The Publishers2 table must already exist before you can bulk copy data into that table.
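If the data file does not use the default tab and new-line terminators, the BULK INSERT statement can describe the terminators explicitly. The following statement is a sketch only; the comma-delimited file name publishers.csv and the header row it assumes are not part of this lesson's examples:

 USE Pubs
 BULK INSERT Publishers2
 FROM 'c:\publishers.csv'
 WITH (
    DATAFILETYPE = 'CHAR',
    FIELDTERMINATOR = ',',   -- fields are separated by commas
    ROWTERMINATOR = '\n',    -- each row ends with a new-line character
    FIRSTROW = 2             -- skip the header row in the file
 )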

Using Data Formats

The bcp utility can create or read data files in the default data formats by specifying a switch at the command prompt. The following table includes a description of the four default data formats:

Data Format bcp Switch BULK INSERT Clause Description
Native -n DATAFILETYPE = 'native' Uses native (database) data types. Storing information in native format is useful when information must be copied from one instance of SQL Server to another. Using native format saves time and space, preventing unnecessary conversion of data types to and from character format. A data file in native format cannot be read by any program other than bcp, however.
Character -c DATAFILETYPE = 'char' Uses the character (char) data format for all columns, providing tabs between fields and a new-line character at the end of each row as default terminators. Storing information in character format is useful when the data is used with another program, such as a spreadsheet, or when the data needs to be copied into an instance of SQL Server from another database. Character format tends to be used when copying data from other programs that have the functionality to export and import data in plain-text format.
Unicode character -w DATAFILETYPE = 'widechar' Uses the Unicode character data format for all columns, providing (as default terminators) tabs between fields and a new-line character at the end of each row. This format allows data to be copied from a server that uses a code page different from the code page used by the client running bcp to another server that uses the same or a different code page. This format prevents the loss of any character data if the source and destination are Unicode data types. In addition, only a minimum number of extended characters are lost if the source and destination are not Unicode data types.
Unicode native -N DATAFILETYPE = 'widenative' Uses native (database) data types for all non-character data and uses Unicode character data format for all character (char, nchar, varchar, nvarchar, text, and ntext) data.

By default, the bcp utility operates in interactive mode and queries SQL Server and the user for information required to specify the data format. When using the -n, -c, -w, or -N switches, however, bcp does not query for information about the SQL Server table on a column-by-column basis. Instead, SQL Server reads or writes the data by using the default format specified.

By default, the BULK INSERT statement operates in character mode (char). Interactive mode does not apply.

When using interactive mode to bulk copy data, the bcp utility prompts you for information regarding the storage type, prefix length, field length, and field and row terminators. The file used to store the format information for each field in the data file is called the format file. This format file provides the default information that is used either to bulk copy the data in the data file back into an instance of SQL Server or to bulk copy data from the table another time (without needing to respecify the format).
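Once the format information has been saved, both tools can reuse it on later runs. The following commands are sketches only; the format file name publishers.fmt is an assumption:

 bcp pubs..publishers2 in publishers.txt -f publishers.fmt -T

 BULK INSERT Publishers2
 FROM 'c:\publishers.txt'
 WITH (FORMATFILE = 'c:\publishers.fmt')

The -f switch tells bcp to read the field descriptions from the format file instead of prompting for them, and the FORMATFILE clause does the same for the BULK INSERT statement.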

Using DTS

SQL Server DTS is a set of graphical tools and programmable objects that enable you to extract, transform, and consolidate data from disparate sources into single or multiple destinations. By using DTS tools, you can create DTS packages that enable you to create custom data-movement solutions tailored to the specialized business needs of your organization. The capability to import and export data is among the tasks that you can perform using DTS tools and packages.

DTS Tools

Data Transformation Services includes a set of tools that enable you to create, schedule, and execute DTS packages. The following table describes DTS tools:

Tool Description
DTS Import/Export Wizard Wizard used to copy data to and from an instance of SQL Server and to map transformations on the data. Of all the DTS tools, the DTS Import/Export Wizard provides the simplest method of copying data between OLE DB data sources.
DTS Designer Graphical tool used to build complex packages with workflows and event-driven logic. DTS Designer can also be used to edit and customize packages created with the DTS Import/Export Wizard.
DTS and Enterprise Manager Options available for manipulating packages and accessing package information from Enterprise Manager.
DTS Package Execution Utilities Package execution utilities include the following:
  • The dtswiz command prompt utility enables you to start the DTS Import/Export Wizard from a command prompt. The dtswiz command includes a set of command switches.
  • The dtsrun command prompt utility enables you to execute an existing package from a command prompt. The dtsrun command includes a set of command switches (see the example following this table).
  • The DTS Run utility provides a set of dialog boxes that are used to execute an existing package. You can run the DTS Run utility by executing dtsrunui from a command prompt without any command switches.
DTS Query Designer A graphical tool used to build queries in DTS.
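As a rough illustration of the dtsrun utility mentioned in the table above, the following command would execute a package that has been saved to SQL Server. The server name MyServer and the package name LoadBookShopData are placeholders, not objects created in this chapter:

 dtsrun /S MyServer /E /N LoadBookShopData

The /S switch names the server storing the package, /E requests a trusted connection, and /N identifies the package by name.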

DTS Packages

A DTS package is an organized collection of connections, DTS tasks, DTS transformations, and workflow constraints that are assembled either with a DTS tool or programmatically, and then saved to SQL Server, SQL Server 2000 Meta Data Services, a structured storage file, or a Microsoft Visual Basic file.

Each package contains one or more steps that are executed sequentially or in parallel when the package is run. When executed, the package connects to the correct data sources, copies data and database objects, transforms data, and notifies other users or processes of events. Packages can be edited, password protected, scheduled for execution, and retrieved by version.

DTS Tasks

A DTS task is a discrete set of functionalities executed as a single step in a package. Each task defines a work item to be performed as part of the data movement and data transformation process or as a job to be executed.

Data Transformation Services supplies a number of tasks that are part of the DTS object model and that can be accessed graphically through DTS Designer or accessed programmatically. These tasks, which can be configured individually, cover a wide variety of data copying, data transformation, and notification situations. For example, the following types of tasks represent some actions that you can perform by using DTS:

  • Importing and exporting data.  DTS can import data from a text file or from an OLE DB data source (for example, a Microsoft Access 2000 database) into SQL Server. Alternatively, data can be exported from SQL Server to an OLE DB data destination (for example, a Microsoft Excel 2000 spreadsheet). DTS also enables high-speed data loading from text files into SQL Server tables.
  • Transforming data.  DTS Designer includes a Transform Data task that enables you to select data from a data source connection, map the columns of data to a set of transformations, and send the transformed data to a destination connection. DTS Designer also includes a Data-Driven Query task that enables you to map data to parameterized queries.
  • Copying database objects.  With DTS, you can transfer indexes, views, logins, stored procedures, triggers, rules, defaults, constraints, and user-defined data types in addition to the data. In addition, you can generate the scripts to copy the database objects.
  • Sending and receiving messages to and from other users and packages.  DTS includes a Send Mail task that enables you to send an e-mail if a package step succeeds or fails. DTS also includes an Execute Package task that enables one package to run another as a package step and includes a Message Queue task that enables you to use Message Queuing to send and receive messages between packages.
  • Executing a set of Transact-SQL statements or ActiveX scripts against a data source.  The Execute SQL and ActiveX Script tasks enable you to write your own SQL statements and scripting code and execute them as a step in a package workflow.

Because DTS is based on an extensible Component Object Model (COM), you can create your own custom tasks. You can integrate custom tasks into the user interface of DTS Designer and save them as part of the DTS object model.

DTS Transformations

A DTS transformation is one or more functions or operations applied against a piece of data before the data arrives at the destination. The source data is not changed. For example, you can extract a substring from a column of source data and copy it to a destination table. The particular substring function is the transformation mapped onto the source column. You also can search for rows that have certain characteristics (for example, specific data values in columns) and apply functions against only the data in those rows. Transformations make it easy to implement complex data validation, data scrubbing, and conversions during the import and export process.

Data transformations enable you to perform the following operations against column data:

  • Manipulating column data.  For example, you can change the type, size, scale, precision, or nullability of a column.
  • Applying functions written as ActiveX scripts.  These functions can apply specialized transformations or include conditional logic. For example, you can write a function in a scripting language that examines the data in a column for values greater than 1000. Whenever such a value is found, a value of -1 is substituted in the destination table. For rows that have column values less than 1000, the value is copied to the destination table.
  • Choosing from among a number of transformations supplied with DTS.  Examples would include a function that reformats input data by using string and date formatting; various string conversion functions; and a function that copies to a destination column the contents of a file specified by a source column.
  • Writing your own transformations as COM objects and applying those transformations against column data.

DTS Connections

Data Transformation Services is based on an OLE DB architecture that enables you to copy and transform data from a variety of data sources:

  • SQL Server and Oracle directly (using native OLE DB providers)
  • ODBC sources (using the Microsoft OLE DB Provider for ODBC)
  • Access 2000, Excel 2000, Visual FoxPro, dBase, Paradox, HTML, and additional file data sources
  • Text files (by using the built-in DTS flat file OLE DB provider)
  • Microsoft Exchange Server, Microsoft Active Directory, and other non-relational data sources
  • Other data sources provided by third-party vendors

DTS functionality might be limited by the capabilities of specific databases, ODBC drivers, or OLE DB providers.

DTS Package Workflow

Data Transformation Services steps and precedence constraints order work items in a DTS package. You can design the DTS package workflow graphically through DTS Designer or design the workflow programmatically. You also can use an ActiveX script to customize step execution.

Steps control the order in which tasks are executed in a DTS package. Steps represent the execution units in the DTS object model, and they define which tasks execute in what sequence when the package is run.

You can define the sequence of step execution in a package with precedence constraints that link two tasks based on whether the first task executes, whether it executes successfully, or whether it executes unsuccessfully. You can use precedence constraints to build conditional branches in a workflow. Steps without constraints are executed immediately, and several steps can execute in parallel. You can also use ActiveX scripts to modify workflow.

Exercise 1:  Importing and Exporting Data

In this exercise, you will import data from text files into tables in the BookShopDB database. You will first import data into lookup tables, such as BookCondition and Position, and you will then populate other tables in the database. When you have completed importing data, you will export data from the BookShopDB database into text files that are created on your hard drive. To complete this exercise, you must copy the text files in the \Chapter07\Exercise1 folder of the Training Kit Supplemental CD-ROM to the root directory of the C: drive on your Windows 2000 computer. In addition, you should be logged into your Windows 2000 Server computer as Administrator.

NOTE


You are not limited to the C: drive or to the root folder. You can select another drive or folder in which to store the files as long as you can run the bcp command from that folder. However, the root folder on the C: drive is used throughout this exercise.

To use the bcp command prompt utility to import data into the BookCondition table

  1. Click Start, then click Run.

The Run dialog box appears.

  2. In the Open text box, type cmd.
  3. Click OK.

A command prompt window appears.

  4. If the command prompt is not located at the root directory, type cd c:\ and then press Enter.
  5. At the C: command prompt, type bcp bookshopdb..bookcondition in bookcondition.txt -c -T, and then press Enter.

The bcp command identifies the database (BookShopDB) and table (BookCondition) that will receive the imported data. The data source (BookCondition.txt) is also identified. The in keyword is used when importing data into a table. The -c switch specifies that a character (char) data type should be used, and the -T switch specifies that a trusted connection should be used.

After you press Enter, the bcp utility copies the data from the BookCondition.txt file into the BookCondition table. Once the data has been copied, a message appears and provides the number of rows copied, the network packet size, and the clock time.

To view the contents of the BookCondition table

  1. Open Query Analyzer and connect to your local server.
  2. In the Editor pane of the Query window, enter the following Transact-SQL code:
 USE BookShopDB
 SELECT * FROM BookCondition

In this statement, you are using a SELECT statement to retrieve data from the BookCondition table in the BookShopDB database.

  3. Execute the Transact-SQL statement.

The result set appears in the Grids tab of the Results pane. Notice that each row in the Description column contains a value of N/A. This value can be changed at any time to include a more detailed description of what each book condition means.

To use a BULK INSERT statement to import data into the Positions table

  1. In the Editor pane of the Query window, enter the following Transact-SQL code:
 USE BookShopDB
 BULK INSERT Positions
 FROM 'c:\positions.txt'
 WITH (DATAFILETYPE = 'CHAR')

In this BULK INSERT statement, you are identifying the table that will receive the data (Positions) and the data source (Positions.txt). The statement uses a CHAR data type.

  2. Execute the Transact-SQL statement.

A message appears in the Messages tab of the Results pane, providing the number of rows affected.

  3. Use a SELECT statement to view the data in the Positions table. Notice that a PositionID value has been provided for each type of position.

To use BULK INSERT statements to import data into the OrderStatus and FormOfPayment tables

  1. Use BULK INSERT statements to insert data from the FormOfPayment.txt file to the FormOfPayment table and from the OrderStatus.txt file to the OrderStatus table.

What statements should you use?
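One possible pair of statements, following the pattern used for the Positions table and assuming the text files were copied to the root folder of drive C, is shown here:

 USE BookShopDB
 BULK INSERT FormOfPayment
 FROM 'c:\formofpayment.txt'
 WITH (DATAFILETYPE = 'CHAR')

 BULK INSERT OrderStatus
 FROM 'c:\orderstatus.txt'
 WITH (DATAFILETYPE = 'CHAR')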

  2. Use SELECT statements to view the data within the OrderStatus table and the FormOfPayment table.

Both tables should now be populated with the appropriate data.

To use BULK INSERT statements to import data into the Authors table, Books table, Customers table, and Employees table

  1. Use BULK INSERT statements to insert data from the Authors.txt file into the Authors table, from the Books.txt file into the Books table, from the Customers.txt file into the Customers table, and from the Employees.txt file into the Employees table.

What statements should you use?
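One possible set of statements, again following the pattern used earlier in this exercise and assuming the text files are in the root folder of drive C, is shown here:

 USE BookShopDB
 BULK INSERT Authors FROM 'c:\authors.txt' WITH (DATAFILETYPE = 'CHAR')
 BULK INSERT Books FROM 'c:\books.txt' WITH (DATAFILETYPE = 'CHAR')
 BULK INSERT Customers FROM 'c:\customers.txt' WITH (DATAFILETYPE = 'CHAR')
 BULK INSERT Employees FROM 'c:\employees.txt' WITH (DATAFILETYPE = 'CHAR')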

  2. Use SELECT statements to view the data within the four tables.

NOTE


The data used to populate these tables is for demonstration purposes only. The data is by no means factual, although real authors and real books have been used. In addition, the amount of data is less than would be expected for the database. You might also notice that information about customers is similar to information about employees. At this point, you should be concerned only with the functionality of the database and how relational data is being stored in the tables, not with the accuracy of the data itself (other than the fact that it adheres to constraints defined for the system).

To use the bcp command prompt utility to import data into the BookAuthors table

  1. Use the bcp utility to copy data from the BookAuthors.txt file into the BookAuthors table.

What bcp command should you use?
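One possible command, following the pattern used for the BookCondition table and assuming BookAuthors.txt is in the root folder of drive C, is shown here:

 bcp bookshopdb..bookauthors in bookauthors.txt -c -T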

  2. Use a SELECT statement in Query Analyzer to view the contents of the BookAuthors table.

Notice that author 102 appears in the table twice. If more than one author wrote a book, the TitleID value for that book would appear in more than one row.

  3. Close Query Analyzer.

Every table in the BookShopDB database (except for the Orders table and the BookOrders table) should now be populated with data. Data is not added to the Orders table or to the BookOrders table until new orders are generated.

To use the bcp command prompt utility to export data into a text file

  1. At the C: command prompt, type bcp bookshopdb..books out Books2.txt -c -T, and then press Enter.

The bcp command identifies the database (BookShopDB) and table (Books) from which data will be exported. A text file (Books2.txt) will be created, and the data from the Books table will be copied into that file. The out keyword is used when exporting data from a table.

After you press Enter, the bcp utility copies the data from the Books table into the Books2.txt file. Once the data has been copied, a message appears providing the number of rows copied, the network packet size, and the clock time.

  2. Use a text editor such as Notepad to view the contents of the Books2.txt file.

Notice that each row in the table is on a separate line and that a tab separates each column value within a row.

  3. Close the Books2.txt file and the command prompt window.

Lesson Summary

Importing data is the process of retrieving data from sources external to SQL Server and inserting the data into SQL Server tables. Exporting data is the process of extracting data from an instance of SQL Server into some user-specified format. The bcp command prompt utility copies SQL Server data to or from a data file. Data can also be transferred into a SQL Server table from a data file by using the BULK INSERT statement. The BULK INSERT statement cannot bulk copy data from an instance of SQL Server to a data file, however. Data that is bulk copied into an instance of SQL Server is appended to any existing contents in a table. Data that is bulk copied from an instance of SQL Server to a data file overwrites the previous contents of the data file. DTS is a set of graphical tools and programmable objects that enable you to extract, transform, and consolidate data from disparate sources into single or multiple destinations. By using DTS tools, you can create DTS packages with custom data-movement solutions tailored to your organization's business needs. A DTS package is an organized collection of connections, DTS tasks, DTS transformations, and workflow constraints that are assembled either with a DTS tool or programmatically and saved to SQL Server, SQL Server 2000 Meta Data Services, a structured storage file, or a Microsoft Visual Basic file.


