Taking Advantage of Unions

As I described in Chapter 4, a union is an SQL construct that allows you to knit together the results of several SQL queries and treat those results as if they had been returned by just one query. I find them invaluable when writing queries. One of the more creative uses I've discovered involves using unions to produce reports that need to show data grouped by categories and that may need to show the same records in more than one of those categories.

7.6.1 A Typical Union Example

A good example of this type of report would be one that fulfills the following request:

Produce an employee turnover report that lists everyone employed at the beginning of the year, everyone hired during the year, everyone terminated during the year, and everyone employed at the end of the year. The report should be divided into four sections, one for each of those categories.

This is a common kind of request, for me at least. The interesting thing about this request, though, is that every employee will need to be listed in two categories. That means you would need to write a query that returned each employee record twice in the correct categories.

When you are faced with this type of query, it can be helpful to simplify the problem by thinking in terms of separate queries, one for each category. It's fairly easy to conceive of a query to bring back a list of employees that were on board at the beginning of the year. You need to make sure the first of the year is between the hire date and the termination date, and account for the termination date possibly being null. Here's the query to return a list of people employed as of January 1, 2004:

SELECT employee_id,

 employee_name,

 employee_hire_date,

 employee_termination_date

 FROM employee

 WHERE employee_hire_date < TO_DATE('1-Jan-2004','dd-mon-yyyy')

 AND (employee_termination_date IS NULL

 OR employee_termination_date >= TO_DATE('1-Jan-2004','dd-mon-yyyy'))

This gives you the first section of the report, which is those employed at the beginning of the year. Retrieving the data for the remaining sections is a matter of using a different WHERE clause for each section. Table 7-2 shows the selection criteria for each section of the report.

Table 7-2. Union query selection criteria

Report section

WHERE clause

Employed at beginning of year

WHERE employee_hire_date < TO_DATE('1-Jan-2004','dd-mon-yyyy')

 AND (employee_termination_date IS NULL

 OR employee_termination_date >= TO_DATE('1-Jan-2004',

 'dd-mon-yyyy'))

Hired during year

WHERE employee_hire_date >= TO_DATE('1-Jan-2004','dd-mon-yyyy')

 AND (employee_hire_date < TO_DATE('1-Jan-2005','dd-mon-yyyy'))

Terminated during year

WHERE employee_termination_date >= TO_DATE('1-Jan-2004','dd-mon-yyyy')

 AND (employee_termination_date < TO_DATE('1-Jan-2005',

 'dd-mon-yyyy'))

Employed at end of year

WHERE employee_hire_date < TO_DATE('1-Jan-2005','dd-mon-yyyy')

 AND (employee_termination_date IS NULL

 OR employee_termination_date >= TO_DATE('1-Jan-2005',

 'dd-mon-yyyy'))

 

7.6.2 The UNION Query

After separately developing the four queries, one for each section of the report, you can use SQL's UNION operator to link those four queries together into one large query. There are four things to consider when doing this:

  • You need to return all the records retrieved by all four queries.
  • You need to be able to group the retrieved records by category.
  • You need to be able to control which category prints first.
  • You need to identify each category on the printed report so the person reading the report knows what's what.

To be certain of getting all the records back from the query, use the UNION ALL operator to tie the queries together. Using UNION by itself causes SQL to filter out any duplicate rows in the result set. That unwanted filtering is done (usually) by a sorting operation, which consumes disk and CPU resources. Use UNION ALL to ensure that you get back all rows in each result set and without any unwanted sorts.

In order to properly group the records, you can add a numeric constant to each of the four queries. For example, the query to return the list of those employed at the beginning of the year could return an arbitrary value of 1:

SELECT 1 sort_column,

 employee_id,

 employee_name,

 . . .

 

The other queries would return values of 2, 3, and 4 in the sort column. Sorting the query results on these arbitrary numeric values serves two purposes. First, the records for each section of the report will be grouped together because they will all have the same constant. Second, the value of the sort column controls the order in which the sections print. Use a value of 1 for the section to be printed first, a value of 2 for the second section, and so on.

The final thing to worry about is identifying the results to the reader of the report. The values used in the sort column won't mean anything to the reader, so you need to add a column with some descriptive text. Here's how the final query for people employed at the beginning of the year looks with that text added:

SELECT 1 sort_column,

 'Employed at Beginning of Year' employee_status_text,

 employee_id,

 employee_name,

 employee_hire_date,

 employee_termination_date

 FROM employee

 WHERE employee_hire_date < TO_DATE('1-Jan-2004','dd-mon-yyyy')

 AND (employee_termination_date IS NULL

 OR employee_termination_date >= TO_DATE('1-Jan-2004','dd-mon-yyyy'))

 

The first column returned by this query is used to sort these records to the top of the report, and the second column serves to identify those records for the reader. Example 7-7 shows the full-blown UNION query to produce all four sections of the report.

Example 7-7. A UNION query to generate a four-section report

SELECT 1 sort_column,

 'Employed at Beginning of Year' employee_status_text,

 employee_id,

 employee_name,

 employee_hire_date,

 employee_termination_date

 FROM employee

 WHERE employee_hire_date < TO_DATE('1-Jan-2004','dd-mon-yyyy')

 AND (employee_termination_date IS NULL

 OR employee_termination_date >= TO_DATE('1-Jan-2004','dd-mon-yyyy'))

UNION ALL

SELECT 2 as sort_column,

 'Hired During Year' as employee_status_text,

 employee_id,

 employee_name,

 employee_hire_date,

 employee_termination_date

 FROM employee

 WHERE employee_hire_date >= TO_DATE('1-Jan-2004','dd-mon-yyyy')

 AND (employee_hire_date < TO_DATE('1-Jan-2005','dd-mon-yyyy'))

UNION ALL

SELECT 3 as sort_column,

 'Terminated During Year' as employee_status_text,

 employee_id,

 employee_name,

 employee_hire_date,

 employee_termination_date

 FROM employee

 WHERE employee_termination_date >= TO_DATE('1-Jan-2004','dd-mon-yyyy')

 AND (employee_termination_date < TO_DATE('1-Jan-2005','dd-mon-yyyy'))

UNION ALL

SELECT 4 as sort_column,

 'Employed at End of Year' as employee_status_text,

 employee_id,

 employee_name,

 employee_hire_date,

 employee_termination_date

 FROM employee

 WHERE employee_hire_date < TO_DATE('1-Jan-2005','dd-mon-yyyy')

 AND (employee_termination_date IS NULL

 OR employee_termination_date >= TO_DATE('1-Jan-2005','dd-mon-yyyy'))

ORDER BY sort_column, employee_id, employee_hire_date;

 

The four queries have been unioned together in the same order in which the report is to be printed. That's done for readability though. It's the ORDER BY clause at the bottom that ensures that the records are returned in the proper order.

7.6.3 The Final Report

Now that you've worked out the query, you only have to follow the remaining steps in the report development methodology to format and print the report. Example 7-8 shows a script to produce a reasonably formatted report. I have elided the SELECT statement to save space, but it's the same as in Example 7-7.

Example 7-8. A UNION report with four sections

--Set up pagesize parameters

SET NEWPAGE 1

SET PAGESIZE 55

 

--Set the linesize, which must match the number of equal signs used

--for the ruling lines in the headers and footers.

SET LINESIZE 75

 

TTITLE CENTER 'The Fictional Company' SKIP 2 -

 CENTER 'Employee Turnover Report' SKIP 1 -

 LEFT '===================================' -

 '=======================================' -

 SKIP 3

 

--Format the columns

CLEAR COLUMNS

COLUMN sort_column NOPRINT

COLUMN employee_id NOPRINT

COLUMN employee_status_text HEADING 'Status' FORMAT A29

COLUMN employee_name HEADING 'Employee Name' FORMAT A20

COLUMN employee_hire_date HEADING 'Hire Date' FORMAT A11

COLUMN employee_termination_date HEADING 'Term Date' FORMAT A11

 

--Breaks and computations

BREAK ON employee_status_text SKIP 2 NODUPLICATES

CLEAR COMPUTES

COMPUTE NUMBER LABEL 'Total Count' OF employee_name ON employee_status_text

 

--Set the date format to use

ALTER SESSION SET NLS_DATE_FORMAT = 'dd-Mon-yyyy';



SELECT 1 sort_column,

 'Employed at Beginning of Year' employee_status_text,

 employee_id,

...

EXIT

 

The output from Example 7-8 will look like this:

The Fictional Company



 Employee Turnover Report

==========================================================================





Status Employee Name Hire Date Term Date

----------------------------- -------------------- ----------- -----------

Employed at Beginning of Year Marusia Churai 15-Nov-1961

 Mykhailo Hrushevsky 16-Sep-1964 05-May-2004

 Pavlo Virsky 29-Dec-1987 01-Apr-2004

 Pavlo Chubynsky 01-Mar-1994 15-Nov-2004

 Taras Shevchenko 23-Aug-1976

 Igor Sikorsky 15-Nov-1961 04-Apr-2004

***************************** --------------------

Total Count 6





Hired During Year Mykola Leontovych 15-Jun-2004

 Lesia Ukrainka 02-Jan-2004

 Ivan Mazepa 04-Apr-2004 30-Sep-2004

 Mykhailo Verbytsky 03-Mar-2004 31-Oct-2004

 Roxolana Lisovsky 03-Jun-2004

***************************** --------------------

Total Count 5





Terminated During Year Mykhailo Hrushevsky 16-Sep-1964 05-May-2004

 Pavlo Virsky 29-Dec-1987 01-Apr-2004

 Pavlo Chubynsky 01-Mar-1994 15-Nov-2004

 Ivan Mazepa 04-Apr-2004 30-Sep-2004

 Igor Sikorsky 15-Nov-1961 04-Apr-2004

 Mykhailo Verbytsky 03-Mar-2004 31-Oct-2004

***************************** --------------------

Total Count 6





Employed at End of Year Marusia Churai 15-Nov-1961

 Mykola Leontovych 15-Jun-2004

 Lesia Ukrainka 02-Jan-2004

 Taras Shevchenko 23-Aug-1976

 Roxolana Lisovsky 03-Jun-2004

***************************** --------------------

Total Count 5

 

That's all there is to it. It wouldn't be a big leap to turn this report into a master/detail report with each section starting on a new page. Using this technique, you can develop similar reports with any number of sections you need.

     



Oracle SQL Plus The Definitive Guide, 2nd Edition
Oracle SQL*Plus: The Definitive Guide (Definitive Guides)
ISBN: 0596007469
EAN: 2147483647
Year: N/A
Pages: 151
Simiral book on Amazon

Flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net