Uploading Data with LOAD DATA INFILE


The LOAD DATA INFILE command allows you to bulk insert data from a text file into a single table without having to write INSERT statements. For example, we could have loaded the data in the department table using this technique as follows . Listing 5.2 shows the contents of a data file containing department information.

Listing 5.2 department_infile.txt
 42     Finance 128    Research and Development NULL   Human Resources NULL   Marketing 

This file is in the default LOAD DATA INFILE format, with each row listed on a separate line with tabs between column values. (This is configurable and we will see how in a minute.)

We can load this information into the department table with the following LOAD DATA INFILE statement:

 
 load data local infile 'department_infile.txt' into table department; 

This facility is particularly useful for converting data from another database format, spreadsheet, or CSV (comma-separated values) file.

The LOAD DATA INFILE statement requires the FILE privilege ”see Chapter 11, "Managing User Privileges," for further information, especially if you are having trouble running this command. The privilege to perform this is often restricted for good security reasons ”to stop people from loading in /etc/passwd, for example.

The general form of the LOAD DATA INFILE statement is as follows:

 
 LOAD DATA [LOW_PRIORITY  CONCURRENT] [LOCAL] INFILE '  fileName.txt  '     [REPLACE  IGNORE]     INTO TABLE  tbl_name  [FIELDS         [TERMINATED BY '\t']         [[OPTIONALLY] ENCLOSED BY '']         [ESCAPED BY '\' ]     ]     [LINES TERMINATED BY '\n']     [IGNORE  number  LINES]     [(  col_name  ,...)] 

The optional clauses are as listed here:

  • The LOW PRIORITY clause works the same way it does in the INSERT statement by waiting for other clients to stop reading from the table. CONCURRENT , on the other hand, allows other clients to read from the table while the bulk insert is going on.

  • In our example, we specified the optional keyword LOCAL , meaning that the data file is on the client machine. If this is not specified, MySQL will look for the infile on the server.

  • If you have key clashes while inserting data, REPLACE and IGNORE provide two methods for dealing with this. Specifying REPLACE tells MySQL to replace the old row with the new row, while IGNORE tells MySQL to keep the old row.

  • The FIELDS and LINES clauses specify how the data in the infile is laid out. The values in the general form are the defaults ”each row on a new line, column values separated by tabs. We can also enclose column values in quotes and use the backslash character to escape any special characters (like single quotes) that might confuse MySQL.

  • The IGNORE number LINES clause tells MySQL to ignore the first number lines in the infile.

  • The final clause allows you to specify that you only want to read data into some of the table's columns .

A common format to receive data in is CSV or comma-separated values. Many programs can read and write files of this type, but one notable example is Microsoft Excel. Listing 5.3 shows a small CSV file saved from Excel.

Listing 5.3 new_programmers.csv
 Name,Job,DepartmentID Julia Lenin,Programmer,128 Douglas Smith,Programmer,128 Tim O'Leary,Programmer,128 

We can load this data into the employee table with the following query:

 
 load data infile 'e:\new_programmers.csv' into table employee fields terminated by ',' lines terminated by '\n' ignore 2 lines (name, job, departmentID); 

You can see that we have used more options to load this data than we did when the data was in the default format. A few points are worth noting:

  • Because we have used a Windows/DOS-style path that includes a backslash, we need to escape the backslash. Our path therefore became 'e:\\new_programmers.csv' .

  • It possibly goes without saying that the fields in a CSV file are terminated by commas, but we need to specify it.

  • We do not need to specify that lines are terminated by a newline character, but we have chosen to.

  • This file has a header, so the first two lines do not contain data and should be ignored.

  • The data in this file does not contain employeeIDs, so to allocate the three columns of data into the four columns in the database, we need to specify what columns (in order) the data will be mapped to. In this case, we have specified (name, job, departmentID) .



MySQL Tutorial
MySQL Tutorial
ISBN: 0672325845
EAN: 2147483647
Year: 2003
Pages: 261

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net