Measuring the Success of the New Shipment Method

 < Day Day Up > 

In order to determine whether the new shipment method is a success, we will need to generate purchases for an entire year and examine the number of orders that are generated. To do this, we will need to utilize a revised copy of the GeneratePurchases routine introduced in the last chapter.

A Revised Copy of GeneratePurchases

In Chapter 5 we used the GeneratePurchases routine to simulate daily purchasing activity for each store on a yearly basis. Orders were generated when purchases brought the quantity levels at or below the minimum thresholds allowed. At the beginning of each day, the orders table was checked for unfilled orders and necessary shipments were generated.

The first overloaded GeneratePurchases routine in the LoadSampleData program for Chapter 6 does something similar. The code for this routine is seen as follows:

 'Maximum # of purchases per day Dim nMaxPurchases As Int16 'Maximum # of products per purchase Dim nMaxProducts As Int16 'Maximum value of quantity per product Dim nMaxQuantity As Int16 Dim sYear As String = "2002" Dim nStoreID As Int16 'These are the Product Types in which there 'is an increased chance of product selection 'The default of 1,2 represents snack foods 'and beverages Dim sProductTypeAdj As String 'These are the Vendors in which there is an 'increased chance of product selection 'The default of 20,24,27 represents Kamp, Notch, 'and PNuts as Vendors Dim sVendorAdj As String Dim i As Int16 Try   'Process for all 5 stores   For i = 1 To 5     nStoreID = i     If i = 1 Then       nMaxPurchases = 80       nMaxProducts = 8       nMaxQuantity = 5       sProductTypeAdj = "1,2"       sVendorAdj = "20,24,27"     ElseIf i = 2 Then       nMaxPurchases = 60       nMaxProducts = 10       nMaxQuantity = 3       sProductTypeAdj = "2,6"       sVendorAdj = "13,18"     ElseIf i = 3 Then       nMaxPurchases = 70       nMaxProducts = 6       nMaxQuantity = 4       sProductTypeAdj = "2"       sVendorAdj = "34"     ElseIf i = 4 Then       nMaxPurchases = 90       nMaxProducts = 5       nMaxQuantity = 2       sProductTypeAdj = ""       sVendorAdj = "34,18"     ElseIf i = 5 Then       nMaxPurchases = 50       nMaxProducts = 12       nMaxQuantity = 6      sProductTypeAdj = "6"      sVendorAdj = "24,27,34"    End If    Dim params(2) As SqlParameter    params(0) = New SqlParameter("@ID", SqlDbType.Int)    params(1) = New _      SqlParameter("@ProdTypeAdj", SqlDbType.VarChar, 50)    params(2) = New _      SqlParameter("@VendorAdj", SqlDbType.VarChar, 50)    params(0).Value = nStoreID    params(1).Value = sProductTypeAdj    params(2).Value = sVendorAdj    Dim ds As DataSet = _     SqlHelper.ExecuteDataset(sConn, _     CommandType.StoredProcedure, "GetProductIDS", params)    Dim j As Int16 = 1    Dim dtDate As DateTime = _    Convert.ToDateTime("01/01/" + sYear)    'Loop through everyday of the year    'we assume the store is open everyday    Randomize()    Do Until j = 182      'First thing is check to see if orders      'needs to be fulfilled for this day and store      'We assume that all orders are shipped 3 days      'after the orderdate in one shipment      Dim params1(1) As SqlParameter      params1(0) = New _       SqlParameter("@StoreID", SqlDbType.Int)      params1(1) = New _       SqlParameter("@Date", SqlDbType.SmallDateTime)      params1(0).Value = nStoreID      'order was placed 3 days ago      params1(1).Value = dtDate.AddDays(-3)      SqlHelper.ExecuteReader(sConn, _        CommandType.StoredProcedure, _        "InsertShipment", params1)      'We will also call a new stored procedure here      'called CheckShipmentSchedules which will      'be used to see if the store needs orders      'generated for this day.      Dim params1b(1) As SqlParameter      params1b(0) = New _       SqlParameter("@StoreID", SqlDbType.Int)      params1b(1) = New _       SqlParameter("@Date", SqlDbType.SmallDateTime)      params1b(0).Value = nStoreID      params1b(1).Value = dtDate       SqlHelper.ExecuteNonQuery(sConn, _       CommandType.StoredProcedure, _       "CheckShipmentSchedule", params1b)      Dim x As Int16 = 1      'This will be the total number of      'purchases for this day      Dim nPurchases As Int16 = _        CInt(Int((nMaxPurchases * Rnd()) + 1))      Do Until x = nPurchases + 1         Dim y As Int16 = 1         Dim nEmployeePos As Int16 = _          CInt(Int((ds.Tables(1).Rows.Count * Rnd())))         Dim nEmployeeID As Integer = _          Convert.ToInt32(ds.Tables(1).Rows(nEmployeePos).Item(0))         Dim params2(2) As SqlParameter         params2(0) = New _           SqlParameter("@ID1", SqlDbType.Int)         params2(1) = _           New SqlParameter("@Date", SqlDbType.SmallDateTime)         params2(2) = New SqlParameter("@ID2", SqlDbType.Int)         params2(0).Value = nStoreID         params2(1).Value = dtDate         params2(2).Value = nEmployeeID         Dim nPurchaseID As Integer = _          SqlHelper.ExecuteScalar(sConn, _          CommandType.StoredProcedure, _          "InsertPurchase", params2)         'This is total number of products for         'this(purchase)         Dim nProducts As Int16 = _          CInt(Int((nMaxProducts * Rnd()) + 1))         Do Until y = nProducts + 1           'This is quantity for this purchase           Dim nQty As Int16 = _             CInt(Int((nMaxQuantity * Rnd()) + 1))            'This is the product for this detail record            Dim nProductPos As Int16 = _              CInt(Int((ds.Tables(0).Rows.Count * Rnd())))            Dim nProductID As Integer = _                 Convert.ToInt32(ds.Tables(0).Rows(nProductPos).Item(0))                'Generate the detail record                Dim params3(4) As SqlParameter                params3(0) = New _                 SqlParameter("@StoreID", SqlDbType.Int)                params3(1) = New _                 SqlParameter("@ProductID", SqlDbType.Int)                params3(2) = New _                 SqlParameter("@PurchaseID", SqlDbType.Int)                params3(3) = New _                 SqlParameter("@Qty", SqlDbType.Int)                params3(4) = New _                 SqlParameter("@Date", SqlDbType.SmallDateTime)                params3(0).Value = nStoreID                params3(1).Value = nProductID                params3(2).Value = nPurchaseID                params3(3).Value = nQty                params3(4).Value = dtDate                'Note that the stored procedure name being                'called is different than the one from Chapter 4                SqlHelper.ExecuteScalar(sConn, _                CommandType.StoredProcedure, _                "InsertPurchaseDetailNew", params3)                y += 1              Loop              x += 1            Loop            j += 1            ProgressBar1.Value = j            dtDate = dtDate.AddDays(1) 'Go to the next day         Loop      Next      MessageBox.Show("Purchases for all 5 stores were generated _         successfully")    Catch ex As Exception       MessageBox.Show(ex.Message)    End Try 

It too will simulate random purchase generation for each store. The difference is that we now will use the ShipmentSchedules table to determine when products should be ordered. In addition, the time frame has been lowered to six months to shorten the amount of processing time necessary.

The newly created InsertPurchaseDetailNew stored procedure is used not only to generate purchase records, but also to verify that the product quantity requested is available. If the product quantity is not available, a record is written to the newly created ProductAvailabilityTracking table. Management can use this table to measure the success of the new shipment method. Each record indicates an attempt by a customer to purchase a product not available.

In a real-world situation it would be difficult to utilize such a table. After all, most customers will not announce or even know that they are trying to purchase a product not available. More than likely a store would utilize some other method of tracking product availability. This could involve store managers routinely checking the store aisles or an automated method of querying the database.

The important thing to note is that the application of predictions by data-mining algorithms should be continually measured and evaluated. It is not good enough to simply use Analysis Services to make predictions and then assume that everything will work fine.

For demonstration purposes, you can execute the GeneratePurchases routine by clicking the Generate Purchases button on Form1. The form does not contain input parameters like the ones used in the Chapter 5 application. Instead, the values are hard-coded within the routine itself.

After running the GeneratePurchases routine using the Version 1 database file, we see some interesting results. If you were to perform the following query in Query Analyzer, SELECT COUNT(ShipmentID) FROM Shipments WHERE StoreID = 1 and ShippedDate > '01/01/2002', you might get a return value of 53. So for six months in 2002, only 53 shipments were made to Store 1. This number is significantly less than the 295 shipments made the entire year before. It would be very easy to consider these numbers alone indicative of success, but we must consider something else.

For this chapter, a new table named ProductAvailabilityTracking was created. This table is written to every time a randomly generated purchase is attempted and the product is not available. For Store 1, 1232 records were written to this table. This indicates that in the year 2002 more than 1200 customers were displeased to find that the product they wanted was not available. Any business major will tell you that missed sales opportunities and negative customer impressions can outweigh the benefits of reducing operational costs. In the next section we will examine another method of applying predictions that may lead to better results.

Tuning the Mining Model

Even though the initial run of our Generate Purchases scenario indicated success in that shipments were reduced, further analysis uncovered a flaw. It was discovered that using the mining model would result in a large number of customers not being able to purchase the products they needed.

The biggest factor is the data itself. You will train the mining model based on a set of data that was preselected. For real-world applications this will be a subset of the data that has been scrutinized or cleaned (as was discussed in Chapter 5). Chapter 5 relied on simulated data generated for all five stores in one year. The simulation was based on parameters supplied to the GeneratePurchases routine. Depending on the parameters selected, a varied number of records would be generated.

Utilizing a Version 2 Database File

To demonstrate the significance of the training data utilized, another database file has been provided. The Version 2 database files are also available for download from the book's Web site. This database version was the result of modifying the parameters used to generate purchases in Chapter 5. Table 6.3 lists the parameters along with the values used to build the Version 1 and Version 2 database files.

Table 6.3. Comparison of the input parameters used to generate purchases for Version 1 and Version 2 backups. The values that are different are bolded in the Version 2 values column.

Store

Field Caption

Version 1 Values

Version 2 Values

1

Processing Year

2001

2001

 

Max Purchases

80

80

 

Max Products

20

8

 

Max Quantity

5

5

 

Product Type Adjustment

1, 2

1, 2

 

Vendor Adjustment

20, 24, 27

20, 24, 27

2

Processing Year

2001

2001

 

Max Purchases

60

60

 

Max Products

12

10

 

Max Quantity

7

3

 

Product Type Adjustment

2, 6

2, 6

 

Vendor Adjustment

13, 18

13, 18

3

Processing Year

2001

2001

 

Max Purchases

70

70

 

Max Products

12

6

 

Max Quantity

7

4

 

Product Type Adjustment

2

2

 

Vendor Adjustment

34

34

4

Processing Year

2001

2001

 

Max Purchases

100

90

 

Max Products

5

5

 

Max Quantity

2

2

 

Product Type Adjustment

  
 

Vendor Adjustment

34, 18

34, 18

5

Processing Year

2001

2001

 

Max Purchases

50

50

 

Max Products

15

12

 

Max Quantity

8

6

 

Product Type Adjustment

6

6

 

Vendor Adjustment

24, 27, 34

24, 27, 34


Version 1 is a snapshot of the data as it would look if the steps in Chapter 5 were followed. Version 2 is an alternative version of the database utilizing different input parameters. The input parameters introduced in Chapter 5 have been lowered to introduce more variability and speed up the processing time.

At this time you should attach the Version 2 file by performing the following steps:

1.

Open SQL Server's Enterprise Manager.

2.

Right-click the SavingsMart database node and click All Tasks. . . and Detach Database. You may have to clear any connections currently open.

3.

Copy the SavingsMart.mdf and SavingsMart.ldf files from the book's Web site to the local directory where Version 1 was located. Click OK to replace the files.

4.

Right-click the Database node and click All Tasks. . . and Attach Database.

5.

From the Attach Database dialog, browse to the directory where you copied the database files in step 3 and select the SavingsMart.mdf file. Click OK.

6.

Click OK to attach the database.

Once the files are attached, you will need to click the Apply Shipment Schedules button so that the mining model is refreshed and new values are written to the ShipmentSchedules table. After the routine completes successfully, open up Analysis Manager and expand the SavingsMart database. Right-click the Analyze Shipments mining model and click Browse. You should instantly see that the model content is significantly different. The tree contains more branches and involves product type. The first factor affecting days between shipments is Store ID as opposed to vendor name. If you double-click the Store ID = 1 node, you will drill down farther into the tree. Figure 6.5 is a screenshot of what this view would look like.

Figure 6.5. Screenshot of the Content Detail in Analysis Manager once the model has been refreshed with data from the Version 2 backup file. The next split is based on product type and finally vendor name. This is in stark contrast to the results seen when utilizing the Version 1 backup file.


Since product type is now considered a factor when determining days between shipments, we will have to modify the query used to retrieve predictions from the mining model. The PopulateShipmentSchedulesTable procedure contains the code used to generate this query; this is the commented code that was discussed earlier. The string shown below is an example of what the new query string would look like.

 SELECT Predict([Days Since Last Shipped]),Predict(Quantity) FROM [Analyze Shipments] PREDICTION JOIN (SELECT 'Ables' as VendorName, 1 as StoreID,     'Beverages' as Type) as t ON [Analyze Shipments].[Vendor Name] = t.VendorName AND [Analyze Shipments].[Product Type] = t.Type AND [Analyze Shipments].[Store ID] = t.StoreID 

Note that the query now includes product type as a filter. It also joins the mining model based on product type so that more granularity is achieved.

If you were to execute the GeneratePurchases routine by clicking Generate Purchases, you would attain different results. For instance, the query, SELECT COUNT(ShipmentID) FROM Shipments WHERE StoreID = 1 and ShippedDate > '01/01/2002', might return a result of 4. This result is a definite improvement over the amount of 53 attained when processing the Version 1 database file. The best part is that the entries to the ProductAvailabilityTracking table have also been reduced. They went from 1232 for Store 1 to 118. Thus we have not only reduced the number of shipments, but we have also ensured greater product availability.

Tuning Parameters

Other factors that can affect the predictions of a mining model are based on the number of splits that occur within the decision tree. Some of these factors can be controlled through the use of parameters assigned to the model. Table 6.4 lists two parameters available with the decision trees mining model.

Table 6.4. The two mining-model parameters available with the decision trees algorithm. These parameters are set in the Additional Parameters property using the Relational Mining Model Editor.

Parameter Name

Type

Range

Description

COMPLEXITY_ PENALTY

Float

0 1

Used to restrict the growth of the tree. The decision tree algorithm produces splits that result in the creation of multiple branches. A low value for COMPEXITY_ PENALTY indicates that splits will occur more frequently, and a high value that splits will occur less frequently. The default value varies depending on the number of attributes specified for the mining model.

MINIMUM_ LEAF_CASES

Integer

0 2,147,483,647

Similar to the COMPLEXITY_ PENALTY in that it controls the number of splits that occur. This parameter, though, is based on the number of leaf nodes present in the tree; a low value indicates that splits will occur more frequently, and a high value that they will occur less frequently. The default value is 10.


     < Day Day Up > 


    Building Intelligent  .NET Applications(c) Agents, Data Mining, Rule-Based Systems, and Speech Processing
    Building Intelligent .NET Applications(c) Agents, Data Mining, Rule-Based Systems, and Speech Processing
    ISBN: N/A
    EAN: N/A
    Year: 2005
    Pages: 123

    flylib.com © 2008-2017.
    If you may any questions please contact us: flylib@qtcs.net