3.3 Types of Joins

   

There are several types of joins to be aware of:


Cross joins

Cross joins are joins without a join condition. Each row of one table is combined with each row of another table. The result is referred to as a Cartesian product.


Inner joins

Inner joins are the regular joins. An inner join returns the rows that satisfy the join condition. Each row returned by an inner join contains data from all the tables involved in the join.


Outer joins

Outer joins are an extension to inner joins. An outer join returns the rows that satisfy the join condition and also the rows from one table for which no corresponding rows (i.e., that satisfy the join condition) exist in the other table.


Equi- and non-equi-joins

An equi-join is a join where the join condition uses the equal to (=) operator to relate the rows of two tables. When a join condition uses any other operator to relate the rows of two tables, the join is called a non-equi-join.


Self joins

A self join is a join of a table to itself.


Partition outer joins

A new type of join introduced in Oracle Database 10g that is slated to be part of the next ANSI/ISO SQL standard after SQL:2003. A partition outer join divides your result set into groups, or partitions, and repeats the same outer join for each of these groups. Such joins are extremely handy for generating missing rows.

The following sections discuss each of these joins in detail, and with examples.

3.3.1 Cross Joins/Cartesian Products

If you don't specify a join condition when joining two tables, Oracle combines each row from the first table with each row from the second table. This type of result set is called a cross join or a Cartesian product; either term is acceptable. The number of rows in a cross join is the product of the number of rows in each table. Here's an example of a cross join:

SELECT e.lname, d.name FROM employee e CROSS JOIN department d; LNAME      NAME ---------- -------------- SMITH      ACCOUNTING ALLEN      ACCOUNTING WARD       ACCOUNTING JONES      ACCOUNTING MARTIN     ACCOUNTING BLAKE      ACCOUNTING  . . .   . . .   . . .  SCOTT      OPERATIONS KING       OPERATIONS TURNER     OPERATIONS ADAMS      OPERATIONS JAMES      OPERATIONS FORD       OPERATIONS MILLER     OPERATIONS 56 rows selected.

Since the query didn't specify a join condition, each row from the employee table is combined with each row from the department table. Needless to say, this result set is of little use. More often than not, a cross join produces a result set containing misleading rows. Therefore, unless you are sure that you want a Cartesian product, don't use a cross join.

Notice the use of the keyword CROSS before the JOIN keyword in the previous example. If you omit the CROSS keyword, and don't specify a join condition, Oracle will throw an error, because it thinks that you are attempting a regular join and have inadvertently omitted the join condition. For example:

SELECT e.lname, d.name FROM employee e JOIN department d; FROM employee e JOIN department d                                 * ERROR at line 2: ORA-00905: missing keyword

What happens when you specify the CROSS keyword as well as a join condition through an ON or USING clause? Oracle rejects your query with an error, and rightly so, because cross joins are joins without join conditions. For example:

SELECT e.lname, d.name FROM employee e CROSS JOIN department d ON e.dept_id = d.dept_id; ON e.dept_id = d.dept_id * ERROR at line 3: ORA-00933: SQL command not properly ended

Be aware that it's easily possible to inadvertently specify a cross join when using the old join syntax described in the Appendix A. Using that syntax, a cross join occurs when you list two tables in the FROM clause separated by commas, and you forget to write a join condition into the query's WHERE clause.

3.3.2 Inner Joins

Inner joins are the most commonly used joins. When people refer simply to a "join," they most likely mean an "inner join." An inner join relates the rows from the source tables based on the join condition, and returns the rows that satisfy it. For example, to list the name and department for each employee, you would use the following SQL statement:

SELECT e.lname, d.name FROM employee e JOIN department d ON e.dept_id = d.dept_id; LNAME                NAME -------------------- -------------------- CLARK                ACCOUNTING KING                 ACCOUNTING MILLER               ACCOUNTING SMITH                RESEARCH ADAMS                RESEARCH FORD                 RESEARCH SCOTT                RESEARCH JONES                RESEARCH ALLEN                SALES BLAKE                SALES MARTIN               SALES JAMES                SALES TURNER               SALES WARD                 SALES 14 rows selected.

In this example, each row of the employee table is combined with each row of the department table, and if the combination satisfies the join condition (dept_id in the employee table matches the dept_id in the department table), then it is included in the result set.

The JOIN keyword, unless prefixed with another keyword, means an inner join. Optionally, you can use the INNER keyword before the JOIN keyword to explicitly indicate an inner join, as in the following example:

SELECT e.lname, d.name FROM employee e INNER JOIN department d ON e.dept_id = d.dept_id;

Let's look at another example to go a bit deeper in the concept behind an inner join:

SELECT * FROM department;    DEPT_ID NAME                 LOCATION_ID ---------- -------------------- -----------         10 ACCOUNTING                   122         20 RESEARCH                     124         30 SALES         40 OPERATIONS                   167 SELECT * FROM  location; LOCATION_ID REGIONAL_GROUP ----------- --------------------         122 NEW YORK         124 DALLAS         123 CHICAGO         167 BOSTON         144 SAN FRANCISCO SELECT d.name, l.regional_group FROM department d JOIN location l ON d.location_id = l.location_id; NAME                 REGIONAL_GROUP -------------------- -------------------- ACCOUNTING           NEW YORK RESEARCH             DALLAS OPERATIONS           BOSTON

Our department table has four rows, and our location table has five rows. However, the inner join returns only three rows. The inner join returns only those rows from the two tables that satisfy the join condition. What this means, with respect to this example, is that only those departments that have a corresponding location, and only those locations that have a corresponding department, are returned by the inner join query. The "SALES" department doesn't have a location_id, and therefore has no corresponding entry in the location table, so it is not included in the result set of the inner join. Similarly, the locations "CHICAGO" and "SAN FRANCISCO" don't have corresponding entries in the department table, and are not included in the result set of the inner join.

The concept of an inner join is easier to understand in terms of the Cartesian product (or cross join). While performing a join of the department and location tables, a Cartesian product is first formed (conceptually, Oracle doesn't physically materialize this Cartesian product), and then the join conditions in the ON (or USING) clause restrict the results to only those rows for which the location_id values match.


The most important concept to understand about joins, and especially about inner joins, is that a join is all about matching rows from one table with corresponding rows in another table.

3.3.3 Outer Joins

Sometimes, while performing a join between two tables, you need to return all the rows from one table even when there are no corresponding rows in the other table. For example, you may want to see all the departments even if they are not related to any particular location. Oracle provides a special type of join to include rows from one table that don't have matching rows from the other table. This type of join is known as an outer join.

The syntax of an outer join is:

FROM table1 { LEFT | RIGHT | FULL } [OUTER] JOIN table2

The syntax elements are:


table1, table2

Specifies the tables between which you are performing the outer join.


LEFT

Specifies that the results be generated using all rows from table1. For those rows in table1 that don't have corresponding rows in table2, NULLs are returned in the result set for the table2 columns.


RIGHT

Specifies that the results be generated using all rows from table2. For those rows in table2 that don't have corresponding rows in table1, NULLs are returned in the result set for the table1 columns.


FULL

Specifies that the results be generated using all rows from table1 and table2. For those rows in table1 that don't have corresponding rows in table2, NULLs are returned in the result set for the table2 columns. Additionally, for those rows in table2 that don't have corresponding rows in table1, NULLs are returned in the result set for the table1 columns.


OUTER

Specifies that you are performing an OUTER join. This keyword is optional. If you use LEFT, RIGHT, or FULL, Oracle automatically assumes an outer join. The OUTER keyword is for completeness' sake, and complements the INNER keyword.

3.3.3.1 Left outer joins

To list all departments even if they are not related to any particular location, you can perform a LEFT OUTER JOIN between the department and the location tables. For example:

SELECT d.dept_id, d.name, l.regional_group FROM department d LEFT OUTER JOIN location l ON d.location_id = l.location_id;    DEPT_ID NAME                 REGIONAL_GROUP ---------- -------------------- --------------         10 ACCOUNTING           NEW YORK         20 RESEARCH             DALLAS         30 SALES         40 OPERATIONS           BOSTON

This query lists all the rows from the department table together with their corresponding locations from the location table. For the rows from department with no corresponding rows in location, NULLs are returned for the l.regional_group column in the result set.

3.3.3.2 Right outer joins

Likewise, to list all the locations even if they are not related to any particular department, you can perform a RIGHT OUTER JOIN between the location and the department tables. For example:

SELECT d.dept_id, d.name, l.regional_group FROM department d RIGHT OUTER JOIN location l ON d.location_id = l.location_id;    DEPT_ID NAME                 REGIONAL_GROUP ---------- -------------------- ---------------         10 ACCOUNTING           NEW YORK                                 CHICAGO         20 RESEARCH             DALLAS                                 SAN FRANCISCO         40 OPERATIONS           BOSTON

This query lists all the rows from the location table, and their corresponding departments from the department table. For the rows from location that don't have corresponding rows in department, NULLs are returned for the d.dept_id and d.name columns in the result set.

The LEFT and RIGHT keywords in an outer join query are relative to the position of the tables in the FROM clause. The same result can be achieved using either a LEFT OUTER JOIN or a RIGHT OUTER JOIN, by switching the position of the tables. For example, the following two queries are equivalent:

SELECT d.dept_id, d.name, l.regional_group FROM department d LEFT OUTER JOIN location l ON d.location_id = l.location_id; SELECT d.dept_id, d.name, l.regional_group FROM location l RIGHT OUTER JOIN department d ON d.location_id = l.location_id;

In each case, the directional word, either LEFT or RIGHT, points toward the anchor table, the table that is required. The other table is then the optional table in the join.

3.3.3.3 Full outer joins

Ocassionally, you may need the effect of an outer join in both directions, which you can think of as a combination of LEFT and RIGHT outer joins. For example, you may need to list all the departments (with or without a location), as well as all the locations (with or without a department). Use a FULL OUTER JOIN to generate such a result set:

SELECT d.dept_id, d.name, l.regional_group FROM department d FULL OUTER JOIN location l ON d.location_id = l.location_id;    DEPT_ID NAME                 REGIONAL_GROUP ---------- -------------------- ----------------         10 ACCOUNTING           NEW YORK         20 RESEARCH             DALLAS         30 SALES         40 OPERATIONS           BOSTON                                 CHICAGO                                 SAN FRANCISCO 6 rows selected.

This query performs a FULL OUTER JOIN between the two tables, and lists:

  • All the rows that satisfy the join condition

  • The rows in the department table that don't have a corresponding location

  • The rows in the location table that don't have a corresponding department

A full outer join is bidirectional in the sense that the result is the same irrespective of the position of the tables in the FROM clause. In mathematical terms, you would consider the FULL OUTER JOIN operator to be "commutative."

3.3.4 Equi-Joins Versus Non-Equi-Joins

The join condition determines whether a join is an equi-join or a non-equi-join. When a join condition relates two tables by equating the columns from the tables, it is an equi-join. When a join condition relates two tables by an operator other than equality, it is a non-equi-join. A query may contain equi-joins as well as non-equi-joins.

Equi-joins are the most common join type. For example, if you want to list all the parts supplied by all the suppliers, you can join the supplier table with the part table by equating the supplier_id from one table to that of the other:

SELECT s.name supplier_name, p.name part_name FROM supplier s JOIN part p ON s.supplier_id = p.supplier_id; SUPPLIER_NAME                  PART_NAME ------------------------------ ----------------------- Acme Industries                Acme Part AI5-4557 Tilton Enterprises             Tilton Part TZ50828 Eastern Importers              Eastern Part EI-T5-001

However, there are situations in which you need non-equi-joins to get the required information. For example, if you want to list the inventory_class of each part, and the inventory_class is based on a range of unit costs, you need to execute the following query:

SELECT p.name part_name, c.inv_class inv_class FROM part p JOIN inventory_class c ON p.unit_cost BETWEEN c.low_cost AND c.high_cost; PART_NAME                      INV ------------------------------ --- Acme Part AI5-4557             A Tilton Part TZ50828            B Eastern Part EI-T5-001         B

The use of the BETWEEN operator to relate the unit_cost column from the part table to the low_cost and high_cost columns of the inventory_class table makes this query a non-equi-join. (You can achieve the same result by using the combination of >= and <= operators instead of BETWEEN. Try this as an exercise.)

3.3.5 Self Joins

There are situations in which one row of a table is related to another row of the same table. The employee table is a good example. The manager of one employee is also an employee. The rows for both are in the same employee table. This relationship is indicated in the manager_emp_id column:

CREATE TABLE employee ( emp_id          NUMBER (5) NOT NULL PRIMARY KEY, fname           VARCHAR2 (20),  lname           VARCHAR2 (20),  dept_id         NUMBER (5), manager_emp_id  NUMBER (5) REFERENCES employee(emp_id), salary          NUMBER (5), hire_date       DATE,  job_id          NUMBER (3));

To get information about an employee and his manager, you have to join the employee table with itself. You can do that by specifying the employee table twice in the FROM clause and using two different table aliases, thereby treating employee as if it were two separate tables. The following example lists the name of each employee and his manager:

SELECT e.lname employee, m.lname manager FROM employee e JOIN employee m ON e.manager_emp_id = m.emp_id; EMPLOYEE             MANAGER -------------------- -------------------- SCOTT                JONES FORD                 JONES ALLEN                BLAKE WARD                 BLAKE JAMES                BLAKE TURNER               BLAKE MARTIN               BLAKE MILLER               CLARK ADAMS                SCOTT JONES                KING CLARK                KING BLAKE                KING SMITH                FORD 13 rows selected.

Note that the employee table is used twice in the FROM clause with two different aliases. Also note the join condition that reads as: "Where the employee's manager_emp_id is the same as his manager's emp_id."

3.3.5.1 Self outer joins

Even though the employee table has 14 rows, the previous query returned only 13 rows. This is because there is an employee without a manager_emp_id. Oracle excludes that employee's row from the result set while performing the self inner join. To include employees without manager_emp_id values, in other words, without managers, you need an outer join:

SELECT e.lname employee, m.lname manager FROM employee e LEFT OUTER JOIN employee m ON e.manager_emp_id = m.emp_id; EMPLOYEE             MANAGER -------------------- -------------------- FORD                 JONES SCOTT                JONES JAMES                BLAKE TURNER               BLAKE MARTIN               BLAKE WARD                 BLAKE ALLEN                BLAKE MILLER               CLARK ADAMS                SCOTT CLARK                KING BLAKE                KING JONES                KING SMITH                FORD KING 14 rows selected.

Be careful when using a LEFT or RIGHT outer join to join a table to itself. If you choose the wrong direction, you may get an absurd result set that makes no sense. In this case, we want to list all the employees irrespective of whether they have a manager or not. Therefore, the employee table we need to make optional is the one from which we are drawing manager names.

3.3.5.2 Self non-equi-joins

The previous example showed self equi-joins. However, there are situations when you need to perform self non-equi-joins. We will illustrate this by an example. Let's assume that you are in charge of organizing interdepartmental basketball competition within your organization. It is your responsibility to draw the teams and schedule the competition. You query the department table and get the following result:

SELECT name FROM department; NAME -------------- ACCOUNTING RESEARCH SALES OPERATIONS

You find that there are four departments, and to make a fair competition, you decide that each department plays against the other three departments once, and at the end, the department with the maximum wins is declared the winner. You have been to an Oracle SQL training class recently, and decide to apply the concept of self join you learned there. You execute the following query:

SELECT d1.name team1, d2.name team2 FROM department d1 CROSS JOIN department d2; TEAM1                TEAM2 -------------------- -------------------- ACCOUNTING           ACCOUNTING ACCOUNTING           RESEARCH ACCOUNTING           SALES ACCOUNTING           OPERATIONS RESEARCH             ACCOUNTING RESEARCH             RESEARCH RESEARCH             SALES RESEARCH             OPERATIONS SALES                ACCOUNTING SALES                RESEARCH SALES                SALES SALES                OPERATIONS OPERATIONS           ACCOUNTING OPERATIONS           RESEARCH OPERATIONS           SALES OPERATIONS           OPERATIONS 16 rows selected.

Disappointing results. From your knowledge of high school mathematics, you know that four teams each playing once with the other three makes six combinations. However, your SQL query returned 16 rows. Now you realize that since you used a cross join (or didn't specify any join condition), you got a Cartesian product from your query. You put in a join condition, and your query and results now look as follows:

SELECT d1.name team1, d2.name team2 FROM department d1 JOIN department d2 ON d1.dept_id = d2.dept_id; TEAM1          TEAM2 -------------- -------------- ACCOUNTING     ACCOUNTING RESEARCH       RESEARCH SALES          SALES OPERATIONS     OPERATIONS

Oops! The equi-join returned a very unwanted result. A team can't play against itself. You realize your mistake, and this sparks the idea that you can use non-equi-joins in this situation. You rewrite the query as a non-equi-join. You don't want a team to play against itself, and therefore replace the = operator in the join condition with !=. Let's look at the results:

SELECT d1.name team1, d2.name team2 FROM department d1 JOIN department d2 ON d1.dept_id != d2.dept_id; TEAM1          TEAM2 -------------- -------------- RESEARCH       ACCOUNTING SALES          ACCOUNTING OPERATIONS     ACCOUNTING ACCOUNTING     RESEARCH SALES          RESEARCH OPERATIONS     RESEARCH ACCOUNTING     SALES RESEARCH       SALES OPERATIONS     SALES ACCOUNTING     OPERATIONS RESEARCH       OPERATIONS SALES          OPERATIONS 12 rows selected.

Still not done. In this result set, you have permutations such as (RESEARCH, ACCOUNTING) and (ACCOUNTING, RESEARCH), and so on. Therefore, each team plays against the others twice. You need to remove these permutations, which you rightly consider to be duplicates. You think about using DISTINCT. DISTINCT will not help here, because the row (RESEARCH, ACCOUNTING) is different from the row (ACCOUNTING, RESEARCH) from the viewpoint of DISTINCT; but not from the viewpoint of your requirement. After some thought, you want to try out an inequality operator other than !=. You decide to go with the less-than (<) operator. Here are the results you get:

SELECT d1.name team1, d2.name team2 FROM department d1 JOIN department d2 ON D1.DEPT_ID < D2.DEPT_ID; TEAM1                TEAM2 -------------------- ----------- ACCOUNTING           RESEARCH ACCOUNTING           SALES ACCOUNTING           OPERATIONS RESEARCH             SALES RESEARCH             OPERATIONS SALES                OPERATIONS 6 rows selected.

That's it! Now you have six combinations: each team plays against the other three just once. Let's examine why this version of the query works. Conceptually, when Oracle executes this query, a Cartesian product is first formed with 16 rows. Then the less-than (<) operator in the join condition restricts the result set to those rows in which the dept_id of Team 1 is less than the dept_id of Team 2. The less-than (<) operator eliminates the duplicates, because for any given permutation of two departments this condition is satisfied for only one. Using greater-than (>) instead of less-than (<) will also give you the required result, but the team1 and team2 values will be reversed:

SELECT d1.name team1, d2.name team2 FROM department d1 JOIN department d2 ON d1.dept_id > d2.dept_id; TEAM1                TEAM2 -------------------- ----------- OPERATIONS           SALES OPERATIONS           RESEARCH OPERATIONS           ACCOUNTING SALES                RESEARCH SALES                ACCOUNTING RESEARCH             ACCOUNTING 6 rows selected.

Don't be disheartened by the painful process you had to go through to get this result. Sometimes you have to go through an agonizing experience to get simple results such as these. That's life. Now that you have the team combinations right, go a bit further and assign a date for each match. Use "tomorrow" as the starting date:

SELECT d1.name team1, d2.name team2, SYSDATE + ROWNUM match_date FROM department d1 JOIN department d2 ON d1.dept_id < d2.dept_id; TEAM1                TEAM2                MATCH_DAT -------------------- -------------------- --------- ACCOUNTING           RESEARCH             10-NOV-03 ACCOUNTING           SALES                11-NOV-03 ACCOUNTING           OPERATIONS           12-NOV-03 RESEARCH             SALES                13-NOV-03 RESEARCH             OPERATIONS           14-NOV-03 SALES                OPERATIONS           15-NOV-03 6 rows selected.

Now publish these results on the corporate intranet along with the rules and regulations for the competition, and you are done.

3.3.6 Partition Outer Joins

Partition outer joins are an interesting new feature in Oracle Database 10g. They're useful for data densification, which is a fancy way of saying that they are useful for filling in rows that do not exist. This is a common requirement for data warehousing queries.

Part of our example database is an employee_expense table with summary information from employee expense reports. The data in that table looks as follows:

SELECT * FROM employee_expense;     EMP_ID       YEAR      MONTH EXPENSE_CLAIM APPROVED_AMT PAID_DATE ---------- ---------- ---------- ------------- ------------ ---------       7369       2002          2       3072.43      3072.43 03-MAR-02       7369       2002          4            30           30 01-JUN-02       7369       2002          5        235.03        35.03 01-JUN-02       7369       2002          9       5095.98      5095.08 31-OCT-02       7369       2002         12       1001.01      1001.01 01-FEB-03       7782       2002          1        111.09       111.09 01-FEB-02       7782       2002          3          9.85         9.85 01-APR-02       7782       2002          7       3987.32      3987.32 01-AUG-02       7782       2002          9          1200         1200 01-OCT-02

Management wants to review expenses, and you've been asked to generate a data extract of employee expense claims for the year 2002. Furthermore, to facilitate some analysis that management wishes to do using Excel, you've been asked to ensure that your extract contains one row per month per employee, but only for those employees who actively submit expense reports. Most do not.

You think about these requirements for a while, and realize that you have a months table containing one row for each month in 2002:

SELECT *  FROM months WHERE year = 2002;       YEAR      MONTH ---------- ----------       2002          1       2002          2       2002          3       2002          4       2002          5       2002          6       2002          7       2002          8       2002          9       2002         10       2002         11       2002         12

You can use this months table in an outer join to generate 12 rows for each employee. For example, to generate one row per month for employee 7782, you can write the following query:

SELECT NVL(ee.emp_id, 7782), m.year, m.month, NVL(ee.expense_claim,0) FROM (SELECT * FROM months WHERE year = 2002) m      LEFT OUTER JOIN (SELECT *                       FROM employee_expense                       WHERE emp_id = 7782) ee      ON m.year = ee.year AND m.month = ee.month ORDER BY m.month; NVL(EE.EMP_ID,7782)       YEAR      MONTH NVL(EE.EXPENSE_CLAIM,0) ------------------- ---------- ---------- -----------------------                7782       2002          1                  111.09                7782       2002          2                       0                7782       2002          3                    9.85                7782       2002          4                       0                7782       2002          5                       0                7782       2002          6                       0                7782       2002          7                 3987.32                7782       2002          8                       0                7782       2002          9                    1200                7782       2002         10                       0                7782       2002         11                       0 NVL(EE.EMP_ID,7782)       YEAR      MONTH NVL(EE.EXPENSE_CLAIM,0) ------------------- ---------- ---------- -----------------------                7782       2002         12                       0

The query in this example is a bit intense. It performs a join of two nested SELECT statements, called subqueries. The first subquery returns the 12 rows from months for the year 2002. Those 12 rows are the mandatory rows in the outer join. The second subquery returns the actual expense rows for employee #7782. The outer join ensures that 12 rows are returned, and the two NVL functions ensure that each of those 12 rows has a value for the potentially NULL employee_expense fields. Expense claims for months in which no report was filed are simply set to zero.

Subqueries are discussed in detail in Chapter 5. We hated to have to bring them up now, but we felt the preceding example was necessary to help you understand what a partition outer join is all about.


The previous query is all well and good, but to generate your report you'd need to execute the preceding query many times, once for each employee who has submitted at least one expense report in the year 2002. This is where partition outer joins come into play. They make it trivial to do the equivalent of executing the preceding query once per employee. Here's how:

SELECT ee.emp_id, m.year, m.month, NVL(ee.expense_claim,0) FROM (SELECT * FROM months WHERE year = 2002) m      LEFT OUTER JOIN employee_expense ee         PARTITION BY (ee.emp_id)         ON m.year = ee.year AND m.month = ee.month ORDER BY ee.emp_id, m.month;     EMP_ID       YEAR      MONTH NVL(EE.EXPENSE_CLAIM,0) ---------- ---------- ---------- -----------------------       7369       2002          1                       0       7369       2002          2                 3072.43       7369       2002          3                       0       7369       2002          4                      30       7369       2002          5                  235.03       7369       2002          6                       0       7369       2002          7                       0       7369       2002          8                       0       7369       2002          9                 5095.98       7369       2002         10                       0       7369       2002         11                       0       7369       2002         12                 1001.01       7782       2002          1                  111.09       7782       2002          2                       0       7782       2002          3                    9.85       7782       2002          4                       0       7782       2002          5                       0       7782       2002          6                       0       7782       2002          7                 3987.32       7782       2002          8                       0       7782       2002          9                    1200       7782       2002         10                       0       7782       2002         11                       0       7782       2002         12                       0

Notice the PARTITION BY clause in this query. That clause is new in Oracle Database 10g, and in this example it causes the database engine to conceptually perform the following steps:

  1. Divide the rows from employee_expense into groups based on their emp_id values, one group per value.

  2. Outer join each group to the months table as a separate operation.

The key here is that rather than one outer join, you are getting the equivalent of many outer joins, but with a much simpler syntax, and from one query. The preceding query is logically equivalent to the following UNION ALL query:

SELECT NVL(ee.emp_id, 7369), m.year, m.month, NVL(ee.expense_claim,0) FROM (SELECT * FROM months WHERE year = 2002) m      LEFT OUTER JOIN (SELECT *                       FROM employee_expense                       WHERE emp_id = 7369) ee      ON m.year = ee.year AND m.month = ee.month ORDER BY m.month UNION ALL SELECT NVL(ee.emp_id, 7782), m.year, m.month, NVL(ee.expense_claim,0) FROM (SELECT * FROM months WHERE year = 2002) m      LEFT OUTER JOIN (SELECT *                       FROM employee_expense                       WHERE emp_id = 7782) ee      ON m.year = ee.year AND m.month = ee.month ORDER BY m.month;

You'll learn more about UNION queries in Chapter 7, so don't worry if you don't fully understand this example now. Our point here is to illustrate that, given the two employees represented in our example employee_expense table, our partition outer join query performs the equivalent of two outer joins, one for each employee.

Unlike the case with our UNION ALL code, you do not need to apply NVL to the partition columns when doing a partition outer join. The correct emp_id values were filled in automatically, for all new rows generated in each partition.


Because they make it easy to fill in gaps in your data, partition outer joins are particularly helpful when writing lag and lead queries, which are a type of query particularly sensitive to gaps in data. You'll learn more about lag and lead queries in Chapter 14.



Mastering Oracle SQL
Mastering Oracle SQL, 2nd Edition
ISBN: 0596006322
EAN: 2147483647
Year: 2003
Pages: 154

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net