Yes, it is possible to write a JOIN on a column containing a comma-delimited list of values.
You may have heard that it's not a good idea to design a table with a comma-delimited list of key values in a single column. But what if you inherit such a table and you need to write a query using it?
10.8.1. Advantages of Comma-Delimited Lists
You commonly encounter comma-delimited lists in web page HTML forms containing a SELECT, or drop-down list, with the MULTIPLE option. For example, suppose you have a web site of articles, each of which can belong to one or more categories. To add a new article, you will likely have a form which includes a drop down to select to which categories the article belongs:
NewsFeatured ArticlesOpinionsScienceTechnologyComputers
When the web page form is submitted, all of the option values selected in the drop down will be submitted as a comma-delimited list of values in the categories form field. So if the user selects Opinions, Science, and Computers, the value of the categories form field will be '3,4,6'.
The advantages of simply storing the list of user-selected values into a VARCHAR column are obvious: the code to store the list is simpler than the code to store the individual values of the list as multiple rows in a separate relationship table.
10.8.2. Disadvantages of Comma-Delimited Lists
One disadvantage becomes apparent the moment you want to use the comma-delimited list in a join. Suppose you need to retrieve the article title and body, as well as the category names for each category to which the article belongs. You will probably try this approach first:
SELECT articles.title , articles.body , categories.name FROM articles INNER JOIN categories ON categories.id IN ( articles.categories )
The problem is, you cannot use IN like that (you may even get a syntax error), because the VARCHAR column isn't an actual list of values as far as SQL is concernedit's a single string value, which just happens to contain commas. You cannot use an equality test either, because none of the categories.id values will equal the comma-delimited list (unless there's only one value in the list). What to do?
10.8.3. Joining on a Comma-Delimited List
The difficulty of joining on a comma-delimited list is that one of the columns has a key value which must be found "inside" the other column value. This sounds like a job for LIKE:
SELECT articles.title , articles.body , categories.name FROM articles INNER JOIN categories ON articles.categories LIKE '%' || categories.id || '%'
|
This approach takes every value of categories.id and determines whether it's located anywhere within the articles.categories comma-delimited string values. Simple, right?
Wrong. If the comma-delimited list of values is '3,9,37,53,82' and the single key value is 7, the LIKE expression becomes '%7%', and you will get a correct match even though article 7 is not in the list.
The way around this predicament is to realize that you also need to delimit the LIKE value. Here is the final query:
select articles.title , articles.body , categories.name from articles inner join categories on ',' || articles.categories || ',' like '%,' || categories.id || ',%'
To use the preceding example data, this query now searches the string ',3,9,37,53,82,' for a LIKE expression of '%,7,%' which will not match. Appending the commas to the front and back of both the searched and the search strings ensures that the first and last items in the list will be found.
|
10.8.4. See Also
"Search for a String Across Columns" [Hack #16]
Rudy Limeback
SQL Fundamentals
Joins, Unions, and Views
Text Handling
Date Handling
Number Crunching
Online Applications
Organizing Data
Storing Small Amounts of Data
Locking and Performance
Reporting
Users and Administration
Wider Access
Index