Section 11.5. Selecting Rows That Match Several Items in a List


11.5. Selecting Rows That Match Several Items in a List

An interesting problem is that of how to write queries based on some criteria referring to a varying list of values. This case is best illustrated by looking for employees who have certain skills, using the three tables shown in Figure 11-6. The skillset table links employees to skills, associating a 1 to 3 skill_level value to distinguish between honest competency, strong experience, and outright wizardry.

Figure 11-6. Tables used for querying employee skills


Finding employees that have a level 2 or 3 SQL skill is easy enough:

 select e.employee_name from employees e where e.employee_id in      (select ss.employee_id       from skillset ss,            skills s       where s.skill_id = ss.skill_id         and s.skill_name = 'SQL'         and ss.skill_level >= 2) order by e.employee_name 

(We can also write the preceding query with a simple join.) If we want to retrieve the employees who are competent with Oracle or DB2, all we need to do is write:

 select e.employee_name, s.skill_name, ss.skill_level from employees e,      skillset ss,      skills s where e.employee_id = ss.employee_id   and s.skill_id = ss.skill_id   and s.skill_name in ('ORACLE', 'DB2') order by e.employee_name 

No need to test for the skill level, since we will accept any level. However, we do need to display the skill name; otherwise, we won't be able to tell why a particular employee was returned by the query. We also encounter a first difficulty, namely that people who are competent in both Oracle and DB2 will appear twice. What we can try to do is to aggregate skills by employee. Unfortunately, not all SQL dialects provide an aggregate function for concatenating strings (you can sometimes write it as a user-defined aggregate function, though). We can nevertheless perform a skill aggregate by using the simple stratagem of a double conversion . First we convert our value from string to number, then from number back to string once we have aggregated numbers.

Skill levels are in the 1 through 3 range. We can therefore confidently represent any combination of Oracle and DB2 skills by a two-digit number, assigning for instance the first digit to DB2 and the second one to Oracle. This is easily done as follows:

 select e.employee_name,        (case s.skill_name           when 'DB2' then 10           else 1         end) * ss.skill_level as computed_skill_level from employees e,      skillset ss,      skills s where e.employee_id = ss.employee_id   and s.skill_id = ss.skill_id   and s.skill_name in ('ORACLE', 'DB2') 

computed_skill_level will result in 10, 20, or 30 for DB2 skill levels, while Oracle skill levels will remain 1, 2, and 3. We then can very easily aggregate our skill levels, and convert them back to a more friendly description:

 select employee_name,        --  Decode the numerically encoded skill + skill level combination        --  Tens are DB2 skill levels, and units Oracle skill levels       case         when aggr_skill_level >= 10           then 'DB2:' + str(round(aggr_skill_level/10,0)) + ' '       end       + case           when aggr_skill_level % 10 > 0            then 'Oracle:' + str(aggr_skill_level % 10)        end as skills from (select e.employee_name,              -- Numerically encode skill + skill level              -- so that we can aggregate them              sum((case s.skill_name                     when 'DB2' then 10                     else 1                   end) * ss.skill_level)  as aggr_skill_level       from employees e,            skillset ss,            skills s       where e.employee_id = ss.employee_id         and s.skill_id = ss.skill_id         and s.skill_name in ('ORACLE', 'DB2')      group by e.employee_name) as encoded_skills order by employee_name 

But now let's try to answer a more difficult question. Suppose that the project we want to staff happens to be a migration from one DBMS to another one. Instead of finding people who know Oracle or DB2, we want people who know both Oracle and DB2.

We have several ways to answer such a question. If the SQL dialect we are using supports it, the intersect operator is one solution: we find people who are skilled on Oracle on one hand, people who are skilled on DB2 on the other hand, and keep the happy few that belong to both sets. We certainly can also write the very same query with an in( ):

 select e.employee_name from employees e,      skillset ss,      skills s where s.skill_name = 'ORACLE'   and s.skill_id = ss.skill_id   and ss.employee_id = e.employee_id   and e.employee_id in (select ss2.employee_id                         from skillset ss2,                              skills s2                         where s2.skill_name = 'DB2'                           and s2.skill_id = ss2.skill_id) 

We can also use the double conversion solution and filter on the numerical aggregate by using the same expressions as we have been using for decoding the encoded_skills computed column. The double conversion stratagem has other advantages:

  • It hits tables only once.

  • It makes it easier to handle more complicated questions such as "people who know Oracle and Java, or MySQL and PHP."

  • As we are only using a list of skills, we can use a pivot table and bind the list, thus improving performance of oft-repeated queries. The row_num pivot table column can help us encode since, if the list is reasonably short, we can multiply the skill_level value by 10 raised to the (row_num -1)th power. If we don't care about the exact value of the skill level, and our DBMS implements bit-wise aggregate functions, we can even try to dynamically build a bit-map.




The Art of SQL
The Art of SQL
ISBN: 0596008945
EAN: 2147483647
Year: N/A
Pages: 143

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net