Recipe 9.13. Identifying Overlapping Date RangesProblemYou want to find all instances of an employee starting a new project before ending an existing project. Consider table EMP_PROJECT: select * from emp_project EMPNO ENAME PROJ_ID PROJ_START PROJ_END ----- ---------- ------- ----------- ----------- 7782 CLARK 1 16-JUN-2005 18-JUN-2005 7782 CLARK 4 19-JUN-2005 24-JUN-2005 7782 CLARK 7 22-JUN-2005 25-JUN-2005 7782 CLARK 10 25-JUN-2005 28-JUN-2005 7782 CLARK 13 28-JUN-2005 02-JUL-2005 7839 KING 2 17-JUN-2005 21-JUN-2005 7839 KING 8 23-JUN-2005 25-JUN-2005 7839 KING 14 29-JUN-2005 30-JUN-2005 7839 KING 11 26-JUN-2005 27-JUN-2005 7839 KING 5 20-JUN-2005 24-JUN-2005 7934 MILLER 3 18-JUN-2005 22-JUN-2005 7934 MILLER 12 27-JUN-2005 28-JUN-2005 7934 MILLER 15 30-JUN-2005 03-JUL-2005 7934 MILLER 9 24-JUN-2005 27-JUN-2005 7934 MILLER 6 21-JUN-2005 23-JUN-2005 Looking at the results for employee KING, you see that KING began PROJ_ID 8 before finishing PROJ_ID 5 and began PROJ_ID 5 before finishing PROJ_ID 2. You want to return the following result set: EMPNO ENAME MSG ----- ---------- -------------------------------- 7782 CLARK project 7 overlaps project 4 7782 CLARK project 10 overlaps project 7 7782 CLARK project 13 overlaps project 10 7839 KING project 8 overlaps project 5 7839 KING project 5 overlaps project 2 7934 MILLER project 12 overlaps project 9 7934 MILLER project 6 overlaps project 3 SolutionThe key here is to find rows where PROJ_START (the date the new project starts) occurs on or after another project's PROJ_START date and on or before that other project's PROJ_END date. To begin, you need to be able to compare each project with each other project (for the same employee). By self joining EMP_PROJECT on employee, you generate every possible combination of two projects for each employee. To find the overlaps, simply find the rows where PROJ_START for any PROJ_ID falls between PROJ_START and PROJ_END for another PROJ_ID by the same employee. DB2, PostgreSQL, and OracleSelf join EMP_PROJECT. Then use the concatenation operator "||" to construct the message that explains which projects overlap: 1 select a.empno,a.ename, 2 'project '||b.proj_id|| 3 ' overlaps project '||a.proj_id as msg 4 from emp_project a, 5 emp_project b 6 where a.empno = b.empno 7 and b.proj_start >= a.proj_start 8 and b.proj_start <= a.proj_end 9 and a.proj_id != b.proj_id MySQLSelf join EMP_PROJECT. Then use the CONCAT function to construct the message that explains which projects overlap: 1 select a.empno,a.ename, 2 concat('project ',b.proj_id, 3 ' overlaps project ',a.proj_id) as msg 4 from emp_project a, 5 emp_project b 6 where a.empno = b.empno 7 and b.proj_start >= a.proj_start 8 and b.proj_start <= a.proj_end 9 and a.proj_id != b.proj_id SQL ServerSelf join EMP_PROJECT. Then use the concatenation operator "+" to construct the message that explains which projects overlap: 1 select a.empno,a.ename, 2 'project '+b.proj_id+ 3 ' overlaps project '+a.proj_id as msg 4 from emp_project a, 5 emp_project b 6 where a.empno = b.empno 7 and b.proj_start >= a.proj_start 8 and b.proj_start <= a.proj_end 9 and a.proj_id != b.proj_id DiscussionThe only difference between the solutions lies in the string concatenation, so one discussion using the DB2 syntax will cover all three solutions. The first step is a self join of EMP_PROJECT so that the PROJ_START dates can be compared amongst the different projects. The output of the self join for employee KING is shown below. You can observe how each project can "see" the other projects: select a.ename, a.proj_id as a_id, a.proj_start as a_start, a.proj_end as a_end, b.proj_id as b_id, b.proj_start as b_start from emp_project a, emp_project b where a.ename = 'KING' and a.empno = b.empno and a.proj_id != b.proj_id order by 2 ENAME A_ID A_START A_END B_ID B_START ------ ----- ----------- ----------- ----- ----------- KING 2 17-JUN-2005 21-JUN-2005 8 23-JUN-2005 KING 2 17-JUN-2005 21-JUN-2005 14 29-JUN-2005 KING 2 17-JUN-2005 21-JUN-2005 11 26-JUN-2005 KING 2 17-JUN-2005 21-JUN-2005 5 20-JUN-2005 KING 5 20-JUN-2005 24-JUN-2005 2 17-JUN-2005 KING 5 20-JUN-2005 24-JUN-2005 8 23-JUN-2005 KING 5 20-JUN-2005 24-JUN-2005 11 26-JUN-2005 KING 5 20-JUN-2005 24-JUN-2005 14 29-JUN-2005 KING 8 23-JUN-2005 25-JUN-2005 2 17-JUN-2005 KING 8 23-JUN-2005 25-JUN-2005 14 29-JUN-2005 KING 8 23-JUN-2005 25-JUN-2005 5 20-JUN-2005 KING 8 23-JUN-2005 25-JUN-2005 11 26-JUN-2005 KING 11 26-JUN-2005 27-JUN-2005 2 17-JUN-2005 KING 11 26-JUN-2005 27-JUN-2005 8 23-JUN-2005 KING 11 26-JUN-2005 27-JUN-2005 14 29-JUN-2005 KING 11 26-JUN-2005 27-JUN-2005 5 20-JUN-2005 KING 14 29-JUN-2005 30-JUN-2005 2 17-JUN-2005 KING 14 29-JUN-2005 30-JUN-2005 8 23-JUN-2005 KING 14 29-JUN-2005 30-JUN-2005 5 20-JUN-2005 KING 14 29-JUN-2005 30-JUN-2005 11 26-JUN-2005 As you can see from the result set above, the self join makes finding overlapping dates easy; simply return each row where B_START occurs between A_START and A_END. If you look at the WHERE clause on lines 7 and 8 of the solution: and b.proj_start >= a.proj_start and b.proj_start <= a.proj_end it is doing just that. Once you have the required rows, constructing the messages is just a matter of concatenating the return values. Oracle users can use the window function LEAD OVER to avoid the self join, if the maximum number of projects per employee is fixed. This can come in handy if the self join is expensive for your particular results (if the self join requires more resources than the sorts needed for LEAD OVER). For example, consider the alternative for employee KING using LEAD OVER: select empno, ename, proj_id, proj_start, proj_end, case when lead(proj_start,1)over(order by proj_start) between proj_start and proj_end then lead(proj_id)over(order by proj_start) when lead(proj_start,2)over(order by proj_start) between proj_start and proj_end then lead(proj_id)over(order by proj_start) when lead(proj_start,3)over(order by proj_start) between proj_start and proj_end then lead(proj_id)over(order by proj_start) when lead(proj_start,4)over(order by proj_start) between proj_start and proj_end then lead(proj_id)over(order by proj_start) end is_overlap from emp_project where ename = 'KING' EMPNO ENAME PROJ_ID PROJ_START PROJ_END IS_OVERLAP ----- ------ ------- ----------- ----------- ---------- 7839 KING 2 17-JUN-2005 21-JUN-2005 5 7839 KING 5 20-JUN-2005 24-JUN-2005 8 7839 KING 8 23-JUN-2005 25-JUN-2005 7839 KING 11 26-JUN-2005 27-JUN-2005 7839 KING 14 29-JUN-2005 30-JUN-2005 Because the number of projects is fixed at five for employee KING, you can use LEAD OVER to move examine the dates of all the projects without a self join. From here, producing the final result set is easy. Simply keep the rows where IS_OVERLAP is not NULL: select empno,ename, 'project '||is_overlap|| ' overlaps project '||proj_id msg from ( select empno, ename, proj_id, proj_start, proj_end, case when lead(proj_start,1)over(order by proj_start) between proj_start and proj_end then lead(proj_id)over(order by proj_start) when lead(proj_start,2)over(order by proj_start) between proj_start and proj_end then lead(proj_id)over(order by proj_start) when lead(proj_start,3)over(order by proj_start) between proj_start and proj_end then lead(proj_id)over(order by proj_start) when lead(proj_start,4)over(order by proj_start) between proj_start and proj_end then lead(proj_id)over(order by proj_start) end is_overlap from emp_project where ename = 'KING' ) where is_overlap is not null EMPNO ENAME MSG ----- ------ -------------------------------- 7839 KING project 5 overlaps project 2 7839 KING project 8 overlaps project 5 To allow the solution to work for all employees (not just KING), partition by ENAME in the LEAD OVER function: select empno,ename, 'project '||is_overlap|| ' overlaps project '||proj_id msg from ( select empno, ename, proj_id, proj_start, proj_end, case when lead(proj_start,1)over(partition by ename order by proj_start) between proj_start and proj_end then lead(proj_id)over(partition by ename order by proj_start) when lead(proj_start,2)over(partition by ename order by proj_start) between proj_start and proj_end then lead(proj_id)over(partition by ename order by proj_start) when lead(proj_start,3)over(partition by ename order by proj_start) between proj_start and proj_end then lead(proj_id)over(partition by ename order by proj_start) when lead(proj_start,4)over(partition by ename order by proj_start) between proj_start and proj_end then lead(proj_id)over(partition by ename order by proj_start) end is_overlap from emp_project ) where is_overlap is not null EMPNO ENAME MSG ----- ------ ------------------------------- 7782 CLARK project 7 overlaps project 4 7782 CLARK project 10 overlaps project 7 7782 CLARK project 13 overlaps project 10 7839 KING project 5 overlaps project 2 7839 KING project 8 overlaps project 5 7934 MILLER project 6 overlaps project 3 7934 MILLER project 12 overlaps project 9 |