Manipulating Data

Next we'll look at how records can be added to a database and demonstrate different ways of using the INSERT command, and examine the syntax of the SQL UPDATE and DELETE commands.

Transactions

Any change to a SQLite database must take place within a transactiona block of one or more statements that alter the database in some way. Transactions are the way in which a robust database system ensures that either all or none of the requests to alter the database is carried out; it can never be just partially completed. This property of a database is called atomicity.

Whenever an INSERT, UPDATE, or DELETE command is issued, SQLite will begin a new transaction unless one has already been started. An implicit transaction lasts only for the duration of the one statement but ensures that, for instance, an UPDATE affecting many rows of a large table will always carry out the action on every row orin the unlikely event of a system failure while processing this commandnone of them. The database will not reflect a change to any row until every row has been updated and the transaction closed.

A transaction can be started from SQL if you want to make a series of changes to the database as one atomic unit. This is the syntax of the BEGIN TRANSACTION statement:

 BEGIN [TRANSACTION [name]] [ON CONFLICT conflict-algorithm]

The transaction name is optional and, currently, is ignored by SQLite. The facility to provide a transaction name is included for future use if the ability to nest transactions is added. Currently only one transaction can be open at a time. In fact the keyword trANSACTION is also optional, but is included for readability.

An ON CONFLICT clause can be specified to override the default conflict resolution algorithm specified at the table level, but can be superseded itself by the OR clause of an INSERT, UPDATE, or DELETE statement.

To end a transaction and save changes to the database, use COMMIT TRANSACTION. The optional transaction name may be specified. To abort a transaction without any of the changes being stored, use ROLLBACK TRANSACTION.

Inserting Data

There are two versions of the syntax for the INSERT statement, depending on where the data to be inserted is coming from.

The first syntax is the one we have already used in Chapter 2, to insert a single row from values provided in the statement itself. The second version is used to insert a dataset returned as the result of a SELECT statement.

INSERT Using VALUES

The syntax for a single-row insert using the VALUES keyword and a list of values provided as part of the statement is as follows:

 INSERT [OR conflict-algorithm] INTO [database-name .] table-name [(column-list)] VALUES (value-list)

Although all our examples so far have included a column-list, it is actually optional. Where no column-list is provided, the value-list is assumed to contain one value for each column in the table, in the order they appear in the schema.

This can be a useful shortcut when you are adding data; for instance because we know the column-list is a name and then a color, a record can be inserted into the vegetables table simply using this format:

 sqlite> INSERT INTO vegetables VALUES ('mushroom', 'white');

However if the schema of the table is not what you are expecting, the INSERT will fail with an error. SQLite would not make any assumption as to which columns you are referring to.

Because SQLite does not have an ALTER TABLE command, it is much harder to change a schema after a table has been created than it is with other database engines that include this command. We'll see a workaround for ALTER TABLE in the following example, and if for any reason the vegetables table had been expanded to include three columns, the same INSERT statement would produce this error:

 sqlite> INSERT INTO vegetables VALUES ('mushroom', 'white'); SQL error: table vegetables has 3 columns but 2 values were supplied

Therefore it is good practice to always include the column-list in an INSERT statementalso known as performing a full insert.

The OR keyword is used to specify a conflict resolution algorithm in the same way we saw for the CREATE TABLE statement. The list of algorithms and their behavior is identical, but the keyword OR is used instead of ON CONFLICT to give a more natural-sounding syntax.

The conflict algorithm in the OR clause of an INSERT statement has the highest precedence possible, and will override any other setting present at the table or transaction level.

INSERT Using SELECT

The syntax to insert the result of a SELECT query into another table is as follows:

 INSERT [OR conflict-algorithm] INTO [database-name .] table-name [(column-list)] select-statement

The select-statement should return a dataset with the same number and order as the columns specified in the column-list (or every column in the destination table if no column-list is supplied). The full syntax of the SELECT statement is available, and any number of rows can be returned.

As with INSERT ... VALUES, the column-list is optional but including it is highly advisable.

Updating Data

The syntax of the UPDATE statement in SQLite is as follows:

 UPDATE [OR conflict-algorithms] [database-name .] table-name SET assignment [, assignment]* [WHERE expr]

One or more assignments can be performed within the same statement upon the same subset of data, defined by the optional WHERE clause. An assignment is defined as

 column-name = expr

Although the WHERE clause is not required, it is usually desirable. The following example would assign the value of color to green for every row in the table, when in fact we probably only meant to update one or a few records.

 sqlite> UPDATE vegetables    ...> SET color = 'green';

The WHERE clause can be as simple or complex as necessary, and all the conditional elements that can be used in the WHERE clause of a SELECT statement can be used here.

It is not logical to join two or more tables when performing an UPDATE; however, subselects can be used in the WHERE clause, as in the following example:

 sqlite> UPDATE mytable    ...> SET myfield = 'somevalue'    ...> WHERE mykey IN (    ...>   SELECT keyfield    ...>   FROM anothertable    ...> );

The OR keyword is used in an UPDATE statement to specify a conflict resolution algorithm with the highest precedence possible, in the same way as with an INSERT.

Deleting Data

The DELETE statement is used to remove rows from a database. Its syntax is

 DELETE FROM [database-name .] table-name [WHERE expr]

As with the UPDATE statement, the WHERE clause is optional but is usually desiredperforming a DELETE on a table with no WHERE clause will empty the table. No column-list is required for a DELETE because the operation affects the entire row.

The WHERE clause can use the AND and OR operators to combine conditions and can use subselects to perform a DELETE operation conditional on the results of another query.

The following example modifies a query from Chapter 2 to remove records from the timesheets table where the project_code field does not correspond to a key in the projects table.

 sqlite> DELETE FROM timesheets    ...> WHERE project_code NOT IN (    ...>   SELECT code    ...>   FROM projects    ...> );

Altering a Table Schema

There is no ALTER TABLE statement in SQLite; instead a table must be dropped and re-created with a new field added, with any data that you want to preserve extracted before the table is dropped and reloaded into the new structure.

A temporary table is the ideal place to hold such data, and the CREATE TABLE ... AS syntax gives us a very easy way to create a copy of an existing table. The syntax is simply

 CREATE [TEMP | TEMPORARY TABLE] table-name AS select-statement

Let's suppose we want to add a new descriptive column to our vegetables table but without losing the data we have already created. The first step is to take a copy of the existing vegetables table to a new temporary table.

 sqlite> CREATE TEMPORARY TABLE veg_temp    ...> AS SELECT * FROM vegetables;

However, only the field specification has been copied when a table is created this way. The schema of the new table does not include any data type names or column constraints. It is not possible to give a set of column definitions when using CREATE TABLE ... AS in SQLite.

 sqlite> .schema veg_temp CREATE TEMP TABLE veg_temp(name,color);

Compare this to the schema of the original vegetables table, which we'll need for re-creating the table with the new field:

 sqlite> .schema vegetables CREATE TABLE vegetables ( name CHAR NOT NULL, color CHAR NOT NULL DEFAULT 'green' );

So now we can safely drop the old vegetables table and re-create it with our new field:

 sqlite> DROP TABLE vegetables; sqlite> CREATE TABLE vegetables (    ...> name CHAR NOT NULL,    ...> color CHAR NOT NULL DEFAULT 'green',    ...> description CHAR    ...> );

Finally, reinstate the copied data from the temporary table using the INSERT ... SELECT syntax:

 sqlite> INSERT INTO vegetables (name, color)    ...> SELECT name, color FROM veg_temp;

Loading Data from a File

The COPY command in SQLite was based on a similar command found in PostgreSQL and as a result is designed to read the output of the pg_dump command to facilitate data transfer between the two systems.

However, COPY can also be used to load data from most delimited text file formats into SQLite. It has the following syntax:

 COPY [OR conflict-algorithm] [database-name .] table-name FROM filename [USING DELIMITERS delim]

The destination table table-name must existthough it need not be emptybefore the COPY operation is attempted, and either filename must be in the current directory or a full path given.

Each line in the input file will become a record in the table, with each column separated by a tab character, unless a different delimiter character is specified in the USING DELIMITERS clause.

If a tabor the specified delimiterappears within a data column, it must be escaped with a backslash character. The backslash itself can appear in the data if it is escaped itself; in other words it will appear as two consecutive backslash characters.

The special character sequence \N in the data file can be used to represent a NULL value.

Tab is used to separate columns in the output of pg_dump, so it is the default delimiter for the COPY command. Another popular format is comma-separated values (CSV). Listing 3.1 shows a comma-separated data file that can be loaded into the three-column vegetables table.

Listing 3.1. `vegetables.csv`

 cucumber,green,Long green salad vegetable pumpkin,orange,Great for Halloween avocado,green,Can't make guacamole without it

The COPY command to load this data file into SQLite is

 sqlite> COPY vegetables FROM 'vegetables.csv'    ...> USING DELIMITERS ',';

You can instruct COPY to read data from the standard input stream instead of a file by using the keyword STDIN instead of a filename. A blank line, or a backslash followed by a period, is used to indicate the end of the input.

COPY permits an overriding conflict resolution algorithm to be specified after the OR keyword, as with INSERT and UPDATE.