Method 1: SQL*LoaderSQL*Loader is a nifty little utility from Oracle that you use to load external data, such as spreadsheets and flat files, into your tables. As with all such powerful utilities, you have to do a little work. Fortunately, there are only two steps and only one command for using SQL*Loader. The two steps are:
Then, at the command prompt (DOS prompt), type in "SQLLDR" followed by parameters . (Yes, in the Windows [NT and W2K] world, you run SQL*Loader from the DOS prompt! All you DOS types, this is your chance to cheer .) Oracle also throws in two other files automatically: a log and a file containing any bad records. You can also get a discard file if you specify it. ( Bad records are ones that could not be loaded, discard records are those that were blank or did not meet the WHEN clause condition.):
Note Just a little note before you get too excited: SQL*Loader cannot create tables; they must already be present in the database. This section is going to give you a look at two of the most common SQL*Loader activities: (1) loading a comma-delimited file, such as from a spreadsheet, into the AUTHORS table; and (2) doing the same thing with a fixed-format flat file, in which the columns are a fixed length. After we discuss these processes, I'll give you some examples of more complicated scenarios. You will soon be so good at using plain old SQL*Loader that you'll be looking for more complex challenges. HousekeepingBefore you start handling any data or writing any SQL*Loader commands, do a little housekeeping. Although SQL*Loader doesn't care what you name your files, there are some conventions that I strongly recommend you follow. Believe me, you will be lost a couple of months from now if you start using your literary talents to come up with creative, though oblique , names . The first recommendation is to use the suffix .ctl for control files, .dat for data, .dsc for the discard file, .bad for the bad records, and .log for all log files. Second, create a folder called SQLLOAD , with subfolders named BAD , CTL , DATA , DISCARD , and LOG :
In the subfolders you can create additional folders for each table or group of tables. For your first couple of attempts, I suggest keeping the files in folders named after the tables you're loading. That is, in the LOG folder have another folder for AUTHORS_LOG, and move the log file into it after you have loaded the AUTHORS table. These little folders will eventually become part of your personal bag of tricks because they will contain examples of successful file loads, as well as errors and problems. (Yes, errors. We learn much from our mistakes!) The Data FileNow you're ready to get your data file. For our purposes, let's say it's a spreadsheet with all the author data in it that the training area has been keeping for several years . By now, hundreds of authors are defined, and it would be a serious task to have to rekey the data. The data entry unit manager has given you his "Don't even ask" look, so you know you had better find an easier way. It dawns on you that if the data entry folks won't key in the data for a small table such as AUTHORS, you better not approach them for the larger tables! So you agree to automate the transfer, and I'll show you how. Just remember that the exact same steps will be used for any data that you want to load into your Oracle database tables. Maybe, if you're very good and very lucky, once or twice in your life you will get a data file that has everything you need. Unfortunately , usually the raw data that you get will be missing some information, or additional calculations on existing fields may be needed. So let me warn you that in real life you will most likely have to write a program to reformat the data before you can run SQL*Loader. (Of course, if the data is coming from another system, and you have friends on that project, a little groveling might be advisable to get them to send you the file in the format you need. Just remember that they may come back to you at some point for a favor.) The fields in your data file do not have to be in the same sequence as the columns in your table if our input file is a certain type. In the second example that I will present shortly, I'll show you how to take fields from any place in the data file and match them to a column. That said, for the first example I'm going to give you one of those rareties: a file with fields that are perfect. Just take this as one of your lucky days. Here's the author data as it appears in a spreadsheet:
Now the question is how to put this information into a file format that Oracle can use. In this case, here's what we'll do:
That's it for this file. It is now ready to become your input file for the AUTHORS table. Of course, if the columns were in a different sequence, you would have to do some moving or copying. The Control FileThis is going to be a bit complicated, but if you follow along through the examples, you will get the main points of SQL*Loader. Let's start with a few syntax rules :
There are many refinements that you will eventually learn to use with SQL*Loader, but for now let's just look at a real example. Here's the control file for the .csv data file: ----------------------------------------------------------- -- C O M M E N T S A R E A -- -- Authors_CSV.CTL -- -- -- -- Guerrilla Oracle -- -- -- -- data file is D:\SQLLOAD\DATA\AUTHORS_CSV.DAT -- -- log is D:\SQLLOAD\LOG.AUTHORS.LOG -- -- bad is D:\SQLLOAD\BAD\AUTHORS.BAD -- -- discard is D:\SQLLOAD\DSC\AUTHORS.DSC -- ----------------------------------------------------------- -- First command -- LOAD DATA -- Use APPEND APPEND INTO TABLE AUTHORS FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY " " TRAILING NULLCOLS -- Now list the columns in the table --- ( AUTHOR_ID, AUTHOR_NAME ) We start with LOAD DATA , followed by INTO TABLE table name . Then we tell SQL*Loader that the fields are separated by a comma or a double quotation mark, and we add TRAILING NULLCOLS as a safety measure. Then we start defining the columns to be loaded. Now save your file in the appropriate directory, something like D:\SQLLOAD\CTL\AUTHORS_CSV.CTL That's it! You have completed your control file for loading the AUTHORS table with a comma-delimited file. Running SQL*LoaderYou now have a data file, D:\SQLLOAD\DATA\AUTHORS_CSV.DAT , and your control file, D:\SQLLOAD\CTL\AUTHORS_CSV.CTL , so you're ready to run SQL*Loader. Just go to the DOS prompt D:> , for exampleand type the following: SQLLDR USERID=SYSTEM/MANAGER@CWE1P CONTROL=D:\SQLLOAD\CONTROL\AUTHORS_CSV.CTL DATA=D:\SQLLOAD\DATA\AUTHORS_CSV.DAT LOG=D:\SQLLOAD\LOG\AUTHORS.LOG BAD=D:\SQLLOAD\BAD\AUTHORS.BAD Now go to your LOG , DISCARD , and BAD folders and check your work. If the load was successful, you will see this in the LOG file: SQL*Loader: Release 8.1.7.0.0 - Production on Sat Sep 22 16:51:16 2001 (c) Copyright 2000 Oracle Corporation. All rights reserved. Control File: D:\SQLLOAD\CTL\AUTHORS_CSV.CTL Data File: D:\SQLLOAD\DATA\AUTHORS_COMMA.DAT Bad File: D:\SQLLOAD\BAD\AUTHORS.BAD Discard File: none specified (Allow all discards) Number to load: ALL Number to skip: 0 Errors allowed: 50 Bind array: 64 rows, maximum of 65536 bytes Continuation: none specified Path used: Conventional Table AUTHORS, loaded from every logical record. Insert option in effect for this table: APPEND TRAILING NULLCOLS option in effect Column Name Position Len Term Encl Datatype --------------------------- ---------- ----- ----- ----- ---------- AUTHOR_ID FIRST * , O(") CHARACTER AUTHOR NEXT * , O(") CHARACTER Table AUTHORS: 15 Rows successfully loaded. 0 Rows not loaded due to data errors. 0 Rows not loaded because all WHEN clauses were failed. 0 Rows not loaded because all fields were null. Space allocated for bind array: 33024 bytes(64 rows) Space allocated for memory besides bind array: 0 bytes Total logical records skipped: 0 Total logical records read: 15 Total logical records rejected: 0 Total logical records discarded: 0 Run began on Sat Sep 22 16:51:16 2001 Run ended on Sat Sep 22 16:51:22 2001 Elapsed time was: 00:00:05.16 CPU time was: 00:00:00.03
A little earlier I told you about SQL*Loader's three verbs: APPEND , INSERT , and REPLACE . Let's look at some samples of logs from these commands. Here's the log from the INSERT command on a file that already has records: SQL*Loader: Release 8.1.7.0.0 - Production on Sat Sep 22 16:52:17 2001 (c) Copyright 2000 Oracle Corporation. All rights reserved. Control File: D:\SQLLOAD\CTL\AUTHORS_CSV.CTL Data File: D:\SQLLOAD\DATA\AUTHORS_COMMA.DAT Bad File: D:\SQLLOAD\BAD\AUTHORS.BAD Discard File: none specified (Allow all discards) Number to load: ALL Number to skip: 0 Errors allowed: 50 Bind array: 64 rows, maximum of 65536 bytes Continuation: none specified Path used: Conventional Table AUTHORS, loaded from every logical record. Insert option in effect for this table: INSERT TRAILING NULLCOLS option in effect Column Name Position Len Term Encl Datatype --------------------------- ---------- ---- ---- ---- --------- AUTHOR_ID FIRST * , O(") CHARACTER AUTHOR NEXT * , O(") CHARACTER ERROR!! SQL*Loader-601: For INSERT option, table must be empty. Error on table AUTHORS If you have a constraint built in, Oracle will reject any records that violate that constraint. The .bad file that follows this paragraph shows what happens if we run the file twice, using APPEND . In this case the table has a UNIQUE constraint on the key field, so when INSERT tries to add a duplicate key, it is rejected. See the value of using constraints and keys? SQL*Loader: Release 8.1.7.0.0 - Production on Sat Sep 22 16:48:25 2001 (c) Copyright 2000 Oracle Corporation. All rights reserved. Control File: D:\SQLLOAD\CTL\AUTHORS_CSV.CTL Data File: D:\SQLLOAD\DATA\AUTHORS_COMMA.DAT Bad File: D:\SQLLOAD\BAD\AUTHORS.BAD Discard File: none specified (Allow all discards) Number to load: ALL Number to skip: 0 Errors allowed: 50 Bind array: 64 rows, maximum of 65536 bytes Continuation: none specified Path used: Conventional Table AUTHORS, loaded from every logical record. Insert option in effect for this table: APPEND TRAILING NULLCOLS option in effect Column Name Position Len Term Encl Datatype ---------------------------- ---------- ---- ---- ----- --------- AUTHOR_ID FIRST * , O(") CHARACTER AUTHOR NEXT * , O(") CHARACTER Record 1: Rejected - Error on table AUTHORS. ORA-00001: unique constraint (SYSTEM.SYS_C001376) violated Record 2: Rejected - Error on table AUTHORS. ORA-00001: unique constraint (SYSTEM.SYS_C001376) violated Record 3: Rejected - Error on table AUTHORS. ORA-00001: unique constraint (SYSTEM.SYS_C001376) violated Record 4: Rejected - Error on table AUTHORS. ORA-00001: unique constraint (SYSTEM.SYS_C001376) violated Record 5: Rejected - Error on table AUTHORS. ORA-00001: unique constraint (SYSTEM.SYS_C001376) violated . . . Record 14: Rejected - Error on table AUTHORS. ORA-00001: unique constraint (SYSTEM.SYS_C001376) violated Record 15: Rejected - Error on table AUTHORS. ORA-00001: unique constraint (SYSTEM.SYS_C001376) violated Table AUTHORS: 0 Rows successfully loaded. 15 Rows not loaded due to data errors. 0 Rows not loaded because all WHEN clauses were failed. 0 Rows not loaded because all fields were null. Space allocated for bind array: 33024 bytes(64 rows) Space allocated for memory besides bind array: 0 bytes Total logical records skipped: 0 Total logical records read: 15 Total logical records rejected: 15 Total logical records discarded: 0 Run began on Sat Sep 22 16:48:25 2001 Here's the log from a REPLACE command. Notice that it is the same as the INSERT log: SQL*Loader: Release 8.1.7.0.0 - Production on Sat Sep 22 16:53:14 2001 (c) Copyright 2000 Oracle Corporation. All rights reserved. Control File: D:\SQLLOAD\CTL\AUTHORS_CSV.CTL Data File: D:\SQLLOAD\DATA\AUTHORS_COMMA.DAT Bad File: D:\SQLLOAD\BAD\AUTHORS.BAD Discard File: none specified (Allow all discards) Number to load: ALL Number to skip: 0 Errors allowed: 50 Bind array: 64 rows, maximum of 65536 bytes Continuation: none specified Path used: Conventional REPLACE Command Table AUTHORS, loaded from every logical record. Insert option in effect for this table: REPLACE TRAILING NULLCOLS option in effect Column Name Position Len Term Encl Datatype ---------------------------- ---------- ----- ----- ---- --------- AUTHOR_ID FIRST * , O(") CHARACTER AUTHOR NEXT * , O(") CHARACTER Table AUTHORS: 15 Rows successfully loaded. 0 Rows not loaded due to data errors. 0 Rows not loaded because all WHEN clauses were failed. 0 Rows not loaded because all fields were null. Space allocated for bind array: 33024 bytes(64 rows) Space allocated for memory besides bind array: 0 bytes Total logical records skipped: 0 Total logical records read: 15 Total logical records rejected: 0 Total logical records discarded: 0 Run began on Sat Sep 22 16:53:14 2001 Run ended on Sat Sep 22 16:53:19 2001 Elapsed time was: 00:00:05.11 CPU time was: 00:00:00.04
What have you learned from these examples? Well, you have now successfully used SQL*Loader to load an Oracle table in your new database with 15 author records, using the APPEND command. You know the logs and files involved and have developed a logical structure to keep track of all the files as you load your tables. And you have taken one of the most common types of input data, the comma-delimited file, and loaded it. Now for our second example, let's take a look at another file type that you will also come across quite often, the fixed-format record (that is, a record with fixed-length fields). Here's a sample: [View full width]
Notice that this is another type of flat file, in which the fields are fixed lengths. The first field is Name, the second Phone Extension, the third ID, the fourth Department, the fifth Date, and the sixth Comments. For our purposes, we care about only two fields: Author and Author ID. The numbers at the top of the record are not part of the file. I included them only to make it easy to map the columns. First map out the record layout:
Second, to get only the two fields we want, just change the control file: --------------------------------------------------------------- -- -- C O M M E N T S A R E A -- -- -- -- Authors_Flat.CTL -- -- -- -- GUERRILLA ORACLE -- -- -- -- data file is D:\SQLLOAD\DATA\AUTHORS_FLAT.DAT -- -- log is D:\SQLLOAD\LOG.AUTHORS.LOG -- bad is D:\SQLLOAD\BAD\AUTHORS.BAD -- discard is D:\SQLLOAD\DSC\AUTHORS.DSC -- ----------------------------------------------------------- Same commands: LOAD DATA APPEND INTO TABLE AUTHORS Now list the columns in the table and the corresponding positions in the input file: ( AUTHOR_ID POSITION(30:38) NUMERIC EXTERNAL, AUTHOR POSITION(1:23) CHAR ) Note Although SQL*Loader can load data in either binary or character format, we normally use
When we use the POSITION statement, we tell SQL*Loader the starting and ending position of the data that will go into each column. (It makes no difference where the data is positioned in the flat file.) For the AUTHORS table, then, we must specify where the Author_ID and Author fields start and stop. Notice that in the preceding example, Author ID is the first column for the AUTHORS table, but it is the third field in the record. We also tell SQL*Loader that the fields are normal, CHAR type. SQL*Loader can handle many data formats. In fact, 14 data types can be defined when you use the POSITION statement, each telling SQL*Loader the kind of data that is being pulled, as well as the size of each data type. You cannot override these sizes. We run SQL*Loader on the flat file just as we did on the comma-delimited file; all we have to do is change the data file name. We will also get the same log, discard, and bad files. Here's what the DOS command looks like:
The Last StepThe last step is always to check your tables. Go to SQL*Plus and take a look. You should see all the data that the logs said was loaded. Note If you use something like Notepad or WordPad or another such utility, it is very important that you save the file using double quotes around the name . For example, if someone creates a list file that you want to call Authors.Dat , save it as "AUTHORS.DAT" , and the file type should be "all files". If you don't use the double quotes, sometimes editing characters are automatically added to the file, and SQL*Loader will not be able to open the file. (You will get a SQL*Loader 500 error.) This is usually not a problem with a comma-delimited file created from a spreadsheet utility. I have looked for deep, technical problems with input files that SQL*Loader just would not take, only to discover that the problem was simply that the file needed to be saved with quotes around the name! I'm just trying to save you hours of futile troubleshooting. Some Advanced SQL*LoaderDuring this discussion I imagine you've had a few things on your mind: (1) How do we skip fields in a file that is comma-delimited? (2) How do we add fields such as date, user ID, and so forth? And (3) can we use conditional logic? I'm glad you asked, because these are the features that make SQL*Loader such a helpful tool. And now is the time for the answers. In answer to the first questionskipping fields in a comma-delimited file unfortunately the only two ways are to rebuild the input file or to load all the data into a temporary table and then write a script to pull just the data you want. You would then either use more commas or create a file with fixed-length columns. Note Here's the rule: You must specify something for every column in the table that you're loading. For example, let's take the following fictitious table and input file:
The input file has a bit more information than we want. For example, it has the size (14, 12, 13.5 hands) and the disposition (gentle, wild). To load this information into the table, either we would first have to rewrite the file so that it is in the right sequence, and then drop some information:
or we could write the information to a fixed-format file:
Once we have the data in a usable format, we can easily load it using the techniques already shown. Note Another option is to load the entire file into a holding or temporary table. After loading the temporary table using SQL*Loader, write a script to pull just the data you want. This is actually one of the most common ways to handle input files that do not match your table layout. Now for the second questionadding dates, user ID, and so forth. Here's the rule: Again, you must have an entry for every column in your table. The data can come from the input file, or you can use constants, system fields, and the like. Let's take a look at our WILD_HORSES table. Suppose that you know that MMF is the only one who has ever caught a wild horse, and that all the horses ever caught were brown and white mustangs. Your control file would then look like this: -- -- -- WILD HORSE .CTL FILE -- -- -- LOAD DATA APPEND INTO TABLE WILD_HORSES FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' TRAILING NULLCOLS (SEQ NUM, TYPE_OF_HORSE, AGE, UNIQUE CHAR CONSTANT 'BROWN&WHITE', DATE CAUGHT, CAUGHT_BY CONSTANT 'MMK', ... )
With regard to the third questionconditional logicfortunately SQL*Loader does support the WHEN statement if you're using a fixed-format file. Why? You have to be able to specify the field you want to test: WILD HORSE .CTL FILE LOAD DATA APPEND INTO TABLE WILD_HORSES WHEN (27:2) = 10 (SEQ NUM, TYPE_OF_HORSE, AGE, UNIQUE CHAR CONSTANT 'BROWN&WHITE', DATE CAUGHT, CAUGHT_BY CONSTANT 'MMF', ... )
In this example you could change the control file to say to test whatever is in positions 27 and 28 and see if it equals "10". (The idea is to test for the age of the horse, and to load records only for horses that are 10 years old.) There are many other powerful SQL*Loader features, and I encourage you to investigate them. For example, you can load sequence numbers, pull data from reports , and so forth. However, the examples presented here will give you plenty to work with for the time being. Note There are two Oracle methods for loading data using SQL*Loader: direct path and conventional path . I have not mentioned direct path because there are some restrictions. For mass loads of thousands of records, however, direct path is faster because it bypasses most of the RDBMS processing during the loading of records. I recommend that you investigate direct load further if you have occasion for managing the loading of massive amounts of data. Now I need to cover one more aspect of SQL*Loader: date handling. Remember we said that the normal Oracle date format is DD-MON-YYYY? Often your data will be input in the format DD-MM-YY. Can you load this data? Yes, because Oracle allows you to tell it what the incoming data format is, and then Oracle reformats it automatically into its preferred format of DD-MON-YYYY. Suppose that the date field, IN_DATE , in your input file has the format DD-MM-YY. All you do in your SQL*Loader control file is tell it the format: IN_DATE DATE(8) "MM-DD-YY" . That's it! Here's a simple example in which the data is in the .CTL file: -- Test script to load date in nonstandard format-- LOAD DATA INFILE * APPEND INTO TABLE TEST_DATE FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' (NAME1, DATE1 DATE(8) "MM-DD-YY") BEGINDATA "Michael", 01-25-98 "Mark", 02-25-99 "Philip", 03-25-99 And here's the output:
Method 2: Using a ScriptInstead of using SQL*Loader, you can create a SQL*Plus script to load the records. As you will shortly see, this will be a long series of INSERT commands. This method is useful when you're testing a database and the tables are periodically destroyed , or when you're creating databases for training purposes. Here's an example of a SQL*Plus load script: /* GUERRILLA ORACLE */ INSERT INTO AUTHORS VALUES (118765443, 'John Smith'); INSERT INTO AUTHORS VALUES (227653089, 'Harry Wellins'); INSERT INTO AUTHORS VALUES (337640276, 'Eleanor Satkowski'); INSERT INTO AUTHORS VALUES (442876009, 'George Drum');
and so forth. You can create your script using a text editor, making sure you follow the syntax shown here. For larger tables, of course you would have more entries in the VALUES clause. To run the scripts, go the SQL> prompt and type the following: @D:\SCRIPTS\TABLE_LOAD\LOAD_AUTHORS_SCRIPT.SQL Warning Just as we told you with the data and control files, when you create a SQL script in a text editor, remember to save it with double quotes. For example you would save the file name shown here as "LOAD_AUTHORS_SCRIPT.SQL" . Variation on Using a ScriptIn some cases, when you're retrieving data from another system you can load that data into a spreadsheet, where you can manipulate it and add the LOAD commands. That is, you can rearrange your columns in the spreadsheet so that the items are in the right sequence, and then create a couple other columns with the commands, commas, and so forth. Here's a fictitious spreadsheet with data that should go into your system:
In this example we will start with five columns of data and end up with ten after adding the correct syntax to change the spreadsheet into a SQL script. First, we delete the headers:
Second, we add columns containing the syntax necessary for an INSERT statement:
Third, we copy the new columns:
Finally, we save the data as a .sql file (note the quotes around the name), and we have a script that we can run at the SQL> prompt:
Sometimes using a script to load data can be a quick and dirty way to move relatively small files. Other times, getting the syntax correct can be so time-consuming that it is much better to use SQL*Loader. But notice that you can manipulate the spreadsheet data to make your life easier. As with SQL*Loader, you're probably asking the question, What if the dates are in nonstandard format? The answer is the same: Use Oracle's date features to define the format mask, and then let Oracle do the rest. (We will talk a lot more about dates in Chapter 14.) Here's an example of what you would have to add to your spreadsheet to specify the input format: insert into TEST_DATE values ('jon2', TO_DATE('01-11-90', 'DD-MM-YY'))
And here's what TEST_TABLE now looks like:
Method 3: Using a GUI FormAnother fairly good way to add data to your tables, if you have good data entry staff available, is to use Forms 6 i and build yourself a simple GUI form. Here are the steps to quickly crank out such a form. (We call this the quick and dirty method of form generationthat is, doing just enough to get the job done.) First open Forms 6 i Form Builder, and, using the Data Block Wizard, select the type of data block you would like to create.
Now select the table (in this case AUTHORS) and the two columns: Author_ID and Author. Note that it is very important to check the Enforce data integrity box:
Use the Layout Wizard:
Forms 6 i will then automatically give you the following form. Notice that this form has only the two columns we selected from the AUTHORS table: Author_ID and Author. Right-click on the data entry block that says "AUTHOR_ID" and then click on Property Palette :
Now scroll down to the Data section and change Maximum Length to 9, also making sure that Data Type says "Number" and Fixed Length says "Yes":
Click the X in the upper-right corner to exit from the Author_ID Property Palette. Now right-click on the data entry block that says Author and then click on Property Palette. In the Data section, check to make sure that Maximum Length is 75:
Again, click on the upper right X to exit. Now you can run the form. Click on the green traffic light:
Almost by magic, you now have a running, functioning, simple data entry form. Notice that at the bottom is says "Record: 1/1". The reason is that at this time the AUTHORS table is empty. Also notice the blue arrows, the green plus sign, and the red X at the top, in the toolbar. As you might guess, the green plus sign is used to add a record, the red X to delete a record, and the arrows to navigate through the table. Pretty straightforward, no?
To add a record, do the following:
Repeat these steps until all records have been entered and saved. Note that you will get errors if you try to save a record without enough data, or you enter a duplicate key:
As you add records, notice that the counter at the bottom increments :
Don't forget that you can scroll back and forth through the table by using the blue arrows:
You can change the form to hold more than one record:
The same rules apply: Click on the green + icon to add the records, then the diskette icon to save. The only remaining trick is to select Action Clear All when you have saved all the records. Doing this refreshes the entire screen so that you can enter more records. That's it for using forms. Don't forget, this is a very quick and dirty way to build a simple data entry screen. To review, in this chapter I have shown you how to load data into your tables in three different ways: using SQL*Loader, using a script, and making a simple GUI form using Forms 6 i . Most programmers prefer SQL*Loader as the simplest and fastest way to move large amounts of information into a table or series of tables. Users, however, normally do not have rights to run SQL, so a GUI interface is more appropriate if your users will be loading the data. |