Recipe 7.12. Sorting Hostnames in Domain Order


Problem

You want to sort hostnames in domain order, with the rightmost parts of the names more significant than the leftmost parts.

Solution

Break apart the names, and sort the pieces from right to left.

Discussion

Hostnames are strings and therefore their natural sort order is lexical. However, it's often desirable to sort hostnames in domain order, where the rightmost segments of the hostname values are more significant than the leftmost segments. Suppose that you have a table hostname that contains the following names:

mysql> SELECT name FROM hostname ORDER BY name; +--------------------+ | name               | +--------------------+ | cvs.php.net        | | dbi.perl.org       | | jakarta.apache.org | | lists.mysql.com    | | mysql.com          | | www.kitebird.com   | +--------------------+ 

The preceding query demonstrates the natural lexical sort order of the name values. That differs from domain order, as shown by the following table.

Lexical orderDomain order
cvs.php.net www.kitebird.com
dbi.perl.org mysql.com
jakarta.apache.org lists.mysql.com
lists.mysql.com cvs.php.net
mysql.com jakarta.apache.org
www.kitebird.com dbi.perl.org


Producing domain-ordered output is a substring-sorting problem, where it's necessary to extract each segment of the names so they can be sorted in right-to-left fashion. There is also an additional complication if your values contain different numbers of segments, as our example hostnames do. (Most of them have three segments, but mysql.com has only two.)

To extract the pieces of the hostnames, begin by using SUBSTRING_INDEX⁠(⁠ ⁠ ⁠) in a manner similar to that described previously in Section 7.11. The hostname values have a maximum of three segments, from which the pieces can be extracted left to right like this:

SUBSTRING_INDEX(SUBSTRING_INDEX(name,'.',-3),'.',1) SUBSTRING_INDEX(SUBSTRING_INDEX(name,'.',-2),'.',1) SUBSTRING_INDEX(name,'.',-1) 

These expressions work properly as long as all the hostnames have three components. But if a name has fewer than three, you don't get the correct result, as the following query demonstrates:

mysql> SELECT name,     -> SUBSTRING_INDEX(SUBSTRING_INDEX(name,'.',-3),'.',1) AS leftmost,     -> SUBSTRING_INDEX(SUBSTRING_INDEX(name,'.',-2),'.',1) AS middle,     -> SUBSTRING_INDEX(name,'.',-1) AS rightmost     -> FROM hostname; +--------------------+----------+----------+-----------+ | name               | leftmost | middle   | rightmost | +--------------------+----------+----------+-----------+ | cvs.php.net        | cvs      | php      | net       | | dbi.perl.org       | dbi      | perl     | org       | | lists.mysql.com    | lists    | mysql    | com       | | mysql.com          | mysql    | mysql    | com       | | jakarta.apache.org | jakarta  | apache   | org       | | www.kitebird.com   | www      | kitebird | com       | +--------------------+----------+----------+-----------+ 

Notice the output for the mysql.com row; it has mysql for the value of the leftmost column, where it should have an empty string. The segment-extraction expressions work by pulling off the rightmost n segments, and then returning the leftmost segment of the result. The source of the problem for mysql.com is that if there aren't n segments, the expression simply returns the leftmost segment of however many there are. To fix this problem, add a sufficient number of periods at the beginning of the hostname values to guarantee that they have the requisite number of segments:

mysql> SELECT name,     -> SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT('..',name),'.',-3),'.',1)     -> AS leftmost,     -> SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT('.',name),'.',-2),'.',1)     -> AS middle,     -> SUBSTRING_INDEX(name,'.',-1) AS rightmost     -> FROM hostname; +--------------------+----------+----------+-----------+ | name               | leftmost | middle   | rightmost | +--------------------+----------+----------+-----------+ | cvs.php.net        | cvs      | php      | net       | | dbi.perl.org       | dbi      | perl     | org       | | lists.mysql.com    | lists    | mysql    | com       | | mysql.com          |          | mysql    | com       | | jakarta.apache.org | jakarta  | apache   | org       | | www.kitebird.com   | www      | kitebird | com       | +--------------------+----------+----------+-----------+ 

That's pretty ugly. But these expressions do serve to extract the substrings that are needed for sorting hostname values correctly in right-to-left fashion:

mysql> SELECT name FROM hostname     -> ORDER BY     -> SUBSTRING_INDEX(name,'.',-1),     -> SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT('.',name),'.',-2),'.',1),     -> SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT('..',name),'.',-3),'.',1); +--------------------+ | name               | +--------------------+ | www.kitebird.com   | | mysql.com          | | lists.mysql.com    | | cvs.php.net        | | jakarta.apache.org | | dbi.perl.org       | +--------------------+ 

If you had hostnames with a maximum of four segments rather than three, you'd need to add to the ORDER BY clause another SUBSTRING_INDEX⁠(⁠ ⁠ ⁠) expression that adds three dots at the beginning of the hostname values.




MySQL Cookbook
MySQL Cookbook
ISBN: 059652708X
EAN: 2147483647
Year: 2004
Pages: 375
Authors: Paul DuBois

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net