Section 2.2. Database File Structure

Database File Format > Database File Structure

2.2. Database File Structure

Except in-memory databases, SQLite stores an entire (main or temp) database in a single database file.

2.2.1. Page

For ease in space management and reading/writing of data from databases, SQLite divides each database (including an in-memory database) into fixed-size regions called database pages, or simply pages. The page size is a power of 2, and can be between 512 and 32,768 (both inclusive); the default value is 1,024. (The upper bound is a limit imposed by the necessity of storing page size in 2-byte signed integer variables in various places in code and external storage.) The database is an (expandable and contractible) array of pages. An index to the page array is called a page number. Page numbers start at 1, and can go sequentially up to 2,147,483,647 (231 1). (The upper bound may be restricted further due to the maximum file size limit enforced by the native filesystem.) Page number 0 is treated as the NULL page or "not a page" the page does not exist physically. Page 1 and onward are stored one after another into the database file starting at file offset 0.

NOTE

Once a database file is created, SQLite uses the compile time default page size, but the size can be changed by a pragma command before creating the first table in the database. SQLite stores the size value as a part of metadata. It will use this page size value instead of the default one. (As mentioned previously, pragma commands are used to modify the behavior of the SQLite library. See the SQLite homepage for details.)

2.2.2. Page type

There are four types of pages: leaf, internal, overflow, and free. Free pages are inactive (currently unused) pages; the others are active pages. B+-tree internal pages contain navigational information for searches (B-tree internal pages have search information and data). Leaf pages store actual data (e.g., table rows) in B+-trees. If a row's data is too large to fit in a single page, part of the data is stored in the tree page, and the remaining part in overflow pages.

2.2.3. File header

SQLite can use any database page for any page type, except Page 1, which is always a B+-tree internal page. The page also contains a 100 byte file header record that is stored starting at file offset 0. The header information characterizes the structure of the database file. SQLite initializes the header when it creates the file. The format of the file header is given in the following table. The first two columns in Table 2-1 are in bytes.

Table 2-1. Structure of database file header
Offset Size Description
0 16 Header string
16 2 Page size in bytes
18 1 File format write version
19 1 File format read version
20 1 Bytes reserved at the end of each page
21 1 Max embedded payload fraction
22 1 Min embedded payload fraction
23 1 Min leaf payload fraction
24 4 File change counter
28 4 Reserved for future use
32 4 First freelist page
36 4 Number of freelist pages
40 60 15 4-byte meta values


Here is a description of each header element:


Header string

This is the 16 byte string: "SQLite format 3."


Page size

This is the size of each page in this database.


File format

The two bytes at offsets 18 and 19 are used to indicate the file format version. They both have to be 1 in the current version of SQLite, or an error is returned. If future file format changes occur, these numbers will increase to indicate the new file format version number.


Reserved space

SQLite may reserve a small fixed amount of space (<= 255 bytes) at the end of each page for its own purpose, and this value is stored at offset 20; the default value is 0. It is nonzero when a database uses SQLite's built-in encryption technology. The first part of a page (page size minus reserved size) is the usable space where database content proper is stored.


Embedded payload

The max embedded payload fraction value (at offset 21) is the amount of the total usable space in a page that can be consumed by a single entry (called a cell or record) of a standard B/B+-tree internal node. A value of 255 means 100 percent. The default max embedded payload fraction value is 64 (i.e., 25 percent): the value is to limit the maximum cell size so that at least 4 cells fit on one node. If the payload for a cell is larger than the max value, then extra payload is spilled into overflow pages. Once SQLite allocates an overflow page, it moves as many bytes as possible into the overflow page without letting the cell size to drop below the min embedded payload fraction value (at offset 22). The default value is 32, i.e., 12.5 percent.

The min leaf payload fraction value (at offset 23) is like the min embedded payload fraction, except that it is for B+-tree leaf pages. The default value is 32, i.e., 12.5 percent. The max payload fraction value for a leaf node is always 100 percent (or 255), and is not specified in the header. (There are no special-purpose leaf nodes in B-trees.)


File change counter

The file change counter (at offset 24) is used by transactions. That value is incremented by every transaction. This value is intended to indicate when the database has changed so that the pager can avoid having to flush its cache, though that feature has not been implemented as of this writing. The pager is responsible for incrementing this value.


Freelist

The freelist of unused pages originates in the file header at offset 32. The total number of free pages is stored at offset 36. The freelist is organized in a rooted trunk (see Figure 2-1). Freelist pages come in two subtypes: trunk pages and leaf pages. The file header points to the first one on the linked list of trunk pages. Each trunk page points to multiple leaf pages. (A leaf page content is unspecified.)

Figure 2-1. Structure of a freelist


A trunk page is formatted like the following, starting at the base of the page:

NOTE

You can purge the freelist by executing the vacuum command on the database. The command makes a copy of the database into a temporary file (the copy is made using INSERT INTO ... SELECT * FROM... commands). Then, it overwrites the original database with the temporary copy, under the protection of a transaction.


Meta variables

At offset 40, there are fifteen 4-byte integer values that are reserved for the B+-tree and the VM (virtual machine) modules. They represent values of many meta variables, including the database schema cookie number at offset 40; this value is incremented at each schema change. Other meta variables include the file formatting information of the schema layer at offset 44, the page cache size at 48, the autovacuum flag at 52, the text encoding (1:UTF-8, 3:UTF-16 LE, 4:UTF-16 BE) at 56, and the user version number at 60. You can find more information about these variables in the SQLite source files, notably btree.c.

NOTE

The database file format for SQLite is backward compatible back to version 3.0.0. This means that any version of SQLite can read and write a database file that was originally created by version 3.0.0. This is mostly true in the other direction version 3.0.0 of SQLite can usually read and write any SQLite database created by later versions of the library. However, there are some new features introduced by later versions of SQLite that version 3.0.0 does not understand, and if the database contains these optional new features, older versions of the library will not be able to read and understand it.

The file header is followed by a B+-tree internal node on Page 1. The node is the root of the master catalog table, named sqlite_master or sqlite_temp_master for a regular (main or attached) or temp database, respectively.

NOTE

All multibyte integer values are stored in the big-endian (most significant byte first) order. This lets you safely move your database files from one platform to another.



Inside SQLite
Inside Symbian SQL: A Mobile Developers Guide to SQLite
ISBN: 0470744022
EAN: 2147483647
Year: 2004
Pages: 29
Authors: Ivan Litovski, Richard Maynard
BUY ON AMAZON

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net