You can use an integers table to generate sequential data, or provide missing data in an OUTER JOIN.
What is an integers table? Sometimes called a numbers or sequence table, an integers table is simply a table containing some integers. It's not a system table; it's a user table.
|
An integers table allows you to write queries which involve a range of values. Queries can use an integers table to do things such as:
10.6.1. Create and Populate an integers Table
Your integers table should have a single column called i, which is defined as the table's primary key:
CREATE TABLE integers ( i INTEGER NOT NULL PRIMARY KEY )
Declaring i as the primary key guarantees you cannot accidentally populate the table twice. It also defines an index, which helps for optimization.
You can start by inserting the digits 0 through 9:
INSERT INTO integers (i) VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
If your database system doesn't support that syntax, use this instead:
INSERT INTO integers (i) VALUES (0); INSERT INTO integers (i) VALUES (1); INSERT INTO integers (i) VALUES (2); INSERT INTO integers (i) VALUES (3); INSERT INTO integers (i) VALUES (4); INSERT INTO integers (i) VALUES (5); INSERT INTO integers (i) VALUES (6); INSERT INTO integers (i) VALUES (7); INSERT INTO integers (i) VALUES (8); INSERT INTO integers (i) VALUES (9);
10.6.2. Generate Sequential Data
You may be wondering how useful the integers 0 through 9 can be. Watch carefully; this may seem like magic, but it isn't.
10.6.2.1. Numbers 099
Because the integers table has only the numbers 0 through 9, if you want more numbers, you need to do something more. So simply CROSS JOIN the integers table to itself, as in this query:
SELECT 10*t.i+u.i AS number FROM integers AS u CROSS JOIN integers AS t ORDER BY number ;
In a CROSS JOIN, every row of one table is joined to every row of the other. If your database system doesn't support CROSS JOIN syntax, use this instead:
SELECT 10*t.i+u.i AS number FROM integers AS u , integers AS t ORDER BY number;
|
Because the integers table is used as both tables, this is a self-join, which requires using table aliases. The alias names t and u stand for tens and units. The expression 10 * t.i + u.i involves a calculation using the value of i from each table, and we give this expression the column alias name number. Finally, the result set is sorted by number:
number +------+ | 0 | | 1 | | 2 | ... | | 99 | +------+ 100 rows in set
10.6.2.2. Letters AZ
You can also use the integers table to generate the alphabet:
SELECT SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ' FROM 10*t.i+u.i FOR 1) AS letter FROM integers u CROSS JOIN integers t WHERE 10*t.i+u.i BETWEEN 1 AND 26 ORDER BY letter;
The query uses the expression 10 * t.i + u.i as the value of one of the parameters in the SUBSTRING function. Instead of 0 through 99, as in the preceding query, the integers are restricted to 1 through 26 by a condition in the WHERE clause:
letter A B C ... Z 26 rows in set
10.6.2.3. SQL Server
In SQL Server, you separate the parameters of the SUBSTRING function with commas:
SELECT SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ' , 10*t.i+u.i , 1) AS letter FROM integers u CROSS JOIN integers t WHERE 10*t.i+u.i BETWEEN 1 AND 26 ORDER BY letter
10.6.2.4. Oracle
You need to use SUBSTR in Oracle:
SELECT SUBSTR('ABCDEFGHIJKLMNOPQRSTUVWXYZ' , 10*t.i+u.i , 1) AS letter FROM integers u CROSS JOIN integers t WHERE 10*t.i+u.i BETWEEN 1 AND 26 ORDER BY letter
10.6.2.5. Date ranges
You can even use the integers table to generate a sequence of dates:
SELECT current_date + INTERVAL 10*t.i+u.i day AS next_30_dates FROM integers u CROSS JOIN integers t WHERE 10*t.i+u.i BETWEEN 0 AND 29 ORDER BY next_30_dates;
This query generates dates for the next 30 days, including today:
next_30_dates 2008-02-27 2008-02-28 2008-02-29 ... 2008-03-27 30 rows in set
10.6.2.6. Oracle and PostgreSQL
You can use the same technique with Oracle and PostgreSQL:
SELECT CURRENT_DATE + 10*t.i+u.i AS next_30_dates FROM integers u CROSS JOIN integers t WHERE 10*t.i+u.i BETWEEN 0 AND 29 ORDER BY next_30_dates
10.6.2.7. SQL Server
SQL Server needs the GEtdATE( ) function:
SELECT GETDATE( ) + 10*t.i+u.i AS next_30_dates FROM integers u CROSS JOIN integers t WHERE 10*t.i+u.i BETWEEN 0 AND 29 ORDER BY next_30_dates
10.6.3. Provide Missing Data in an OUTER JOIN
Now you can take your generated data and put it to work as the left table in a LEFT OUTER JOIN. The benefit of doing this is that data points which are missing from the data table (the right table) will be included in the result set.
10.6.3.1. Counts for each letter
Suppose you have a table of articles. Using a LEFT OUTER JOIN with your INTEGERS table to generate the letters A through Z, you can count the number of article titles for each letter, including a count for letters which have no corresponding article titles:
SELECT letters.letter , COUNT(articles.title) AS titles FROM ( SELECT SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ' FROM 10*t.i+u.i FOR 1) AS letter FROM integers AS u CROSS JOIN integers AS t WHERE 10*t.i+u.i BETWEEN 1 AND 26 ) AS letters LEFT OUTER JOIN articles ON letters.letter = SUBSTRING(articles.title FROM 1 FOR 1) GROUP BY letters.letter
Because it's a LEFT OUTER JOIN, all 26 letters will be included in the result set with a count of 0 if no article titles start with a particular letter:
letter titles A 1 B 9 C 37 ... Z 0 26 rows in set
10.6.3.2. Data for consecutive dates
When generating summary data for graphing purposes, it's important that every date in the sample range be included, even if there are no sample values for a given date. This query uses a table, SAMPLES, which contains S_DATE (the date when the sample was taken) and S_VALUE (the value of the sample):
SELECT dates.X_axis_date , SUM(samples.s_value) AS Y_axis_value FROM ( SELECT ( SELECT MIN(s_date) FROM samples ) + INTERVAL 10*t.i+u.i day AS X_axis_date FROM integers AS u CROSS JOIN integers AS t WHERE ( SELECT min(s_date) FROM samples ) + INTERVAL 10*t.i+u.i day <= ( SELECT MAX(s_date) FROM samples ) ) AS dates LEFT OUTER JOIN samples ON dates.X_axis_date = samples.s_date GROUP BY dates.X_axis_date
Notice how instead of specifying fixed numbers in the WHERE clause, you let the range of dates between MIN and MAX in the sample data determine which dates to generate with the integers table. It might produce:
X_axis_date Y_axis_value 2006-07-05 10 2006-07-06 10 2006-07-07 20 2006-07-08 NULL 2006-07-09 NULL 2006-07-10 50 6 rows
|
10.6.4. Hacking the Hack
You can simplify queries which use self-joins on the integers table by declaring views that incorporate the self-join. Here's a view for the first 100 integers:
CREATE VIEW hundred (i) AS SELECT 10*t.i+u.i FROM integers u CROSS JOIN integers t
Here's a view for the first 1,000 integers:
CREATE VIEW thousand (i) AS SELECT 100*h.i+10*t.i+u.i FROM integers u CROSS JOIN integers t CROSS JOIN integers h
Now if you need a range of dates to cover a period of days up to a year, you can use the thousand view:
SELECT current_date + INTERVAL i day AS next_year_dates FROM thousand WHERE current_date + INTERVAL i day < current_date + INTERVAL 1 year ORDER BY next_year_dates
Instead of specifying a fixed number in the WHERE clause, this query allows the data to determine the range of dates. The result set will be either 365 dates or 366, depending on whether the date range includes a leap day:
next_year_dates 2007-01-01 2007-01-02 2007-01-03 ... 2007-12-31 365 rows next_year_dates 2007-03-01 2007-03-02 2007-03-03 ... 2008-02-29 366 rows
10.6.4.1. Oracle
In Oracle, you can use the INTERVAL notation with a quoted number:
SELECT current_date + i AS next_year_dates FROM thousand WHERE current_date + i < current_date + INTERVAL '1' year ORDER BY next_year_dates
10.6.4.2. PostgreSQL
In PostgreSQL, you quote the number and the word year:
SELECT current_date + i AS next_year_dates FROM thousand WHERE current_date + i < current_date + INTERVAL '1 year' ORDER BY next_year_dates
10.6.4.3. SQL Server
In SQL Server, you can use the DateAdd function:
SELECT GetDate( ) + i AS next_year_dates FROM thousand WHERE GetDate( ) + i < DateAdd(yy,1,GetDate( )) ORDER BY next_year_dates
Rudy Limeback
SQL Fundamentals
Joins, Unions, and Views
Text Handling
Date Handling
Number Crunching
Online Applications
Organizing Data
Storing Small Amounts of Data
Locking and Performance
Reporting
Users and Administration
Wider Access
Index