Section 3.7. Working with Sequences

   

3.7 Working with Sequences

3.7.1 Problem

You want to find any arithmetically increasing sequence in the data. In our example, you want to find any sequences in which the purity level is arithmetically increasing (100, 101, 102, etc.). Any such sequences that are three or more containers long indicate that a production line is overheating .

3.7.2 Solution

Find sequences in the ProductionLine data using the following query:

 SELECT     p1.ContainerId SeqBeg, p2.ContainerId SeqEnd,    p2.ContainerId-p1.ContainerId+1 SequenceSize FROM ProductionLine p1, ProductionLine p2 WHERE     (p1.ContainerId < p2.ContainerId) AND    NOT EXISTS(SELECT * FROM ProductionLine p3        WHERE (p3.Purity-p3.ContainerId!=p1.Purity-p1.ContainerId AND        p3.ContainerId BETWEEN p1.ContainerId AND p2.ContainerId)        OR (p3.ContainerId=p1.ContainerId-1 AND           p3.Purity-p3.ContainerId=p1.Purity-p1.ContainerId)        OR (p3.ContainerId=p2.ContainerId+1 AND           p3.Purity-p3.ContainerId=p1.Purity-p1.ContainerId)) SeqBeg      SeqEnd      SequenceSize  ----------- ----------- ------------  2           5           4 8           9           2 

3.7.3 Discussion

This query uses a framework similar to that used to find regions . The difference is that the subquery contains a WHERE clause condition to identify sequences. To explain exactly how that works, let's begin by looking at the raw data:

 ContainerId Purity      Diff         ----------- ----------- -----------  1           100         99 2           100         98 3           101         98 4           102         98 5           103         98 6           100         94 7           103         96 8           108         100 9           109         100 10          100         90 11          100         89 12          100         88 

Notice that the purity levels of containers 2-5 represent an arithmetically increasing sequence. Notice also that the container IDs represent an arithmetically increasing sequence. We can use the fact that both sequences are monotonic to our advantage. It means that, within any given sequence, the difference between the ContainerId and Purity will be a constant. For example:

 100 - 2 = 98 101 - 3 = 98 ... 

We put knowledge of this pattern to use in the subquery, which uses the following WHERE condition to find any candidates that break the sequence:

 p3.Purity-p3.ContainerId!=p1.Purity-p1.ContainerId 

If any row in a candidate sequence (p3) has the same difference as the first row of the sequence (p1), it is a member of the sequence. If not, the candidate pair (p1, p2) should be discarded.

The rest of the framework is exactly the same as for finding regions, and you can easily extend it for sequences in the same ways that you can when finding regions. For example, to find only sequences larger than three rows, add the following WHERE condition:

 p2.ContainerId-p1.ContainerId>=2 AND 

For example:

 SELECT     p1.ContainerId SeqBeg, p2.ContainerId SeqEnd,    p2.ContainerId-p1.ContainerId+1 SequenceSize FROM ProductionLine p1, ProductionLine p2 WHERE     (p1.ContainerId < p2.ContainerId) AND    p2.ContainerId-p1.ContainerId>=2 AND    NOT EXISTS(SELECT * FROM ProductionLine p3        WHERE (p3.Purity-p3.ContainerId!=p1.Purity-p1.ContainerId AND        p3.ContainerId BETWEEN p1.ContainerId AND p2.ContainerId)        OR (p3.ContainerId=p1.ContainerId-1 AND           p3.Purity-p3.ContainerId=p1.Purity-p1.ContainerId)        OR (p3.ContainerId=p2.ContainerId+1 AND           p3.Purity-p3.ContainerId=p1.Purity-p1.ContainerId)) SeqBeg      SeqEnd      SequenceSize  ----------- ----------- ------------  2           5           4 

With this framework, you can use algorithms for regions and apply them to sequences with minimal changes to the code. Usually, you just have to add an additional condition such as the one we added in this recipe.

   


Transact-SQL Cookbook
Transact-SQL Cookbook
ISBN: 1565927567
EAN: 2147483647
Year: 2005
Pages: 152

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net