3.2. Building an Application
We now understand how this application will work in theory. But, we need to take the theory and
turn
it into a solutionsomething we can give to the
user
. In this section we will build a complete Excel application based on the processes described in the last section.
3.2.1. Design
As we develop an application, things can easily get out of control. Therefore, it is important to start with a plan and some structure. The example has several worksheets, each with its own function.
The Data sheet holds information. It has no formulas. The user is not going to be looking at this sheet, so there is no formatting. This sheet stores the information that
drives
the application. When new information comes in, it goes on the Data sheet. Keeping the data separate has advantages. If the data source changes, only the interface between the data sheet and the data source needs changing.
All of the logic in the application is on the Workarea sheet. This is the only sheet with workbook functions.
The Display sheet is only
concerned
with presentation. It is attached to the Workarea sheet by named
cells
and ranges. There are no formulas, only formatting and information organization.
The Settings sheet contains application options, and there are two specialized sheets: AccuracyChart contains a chart and HourlyForecasts contains a table of hourly
predictions
. These sheets have no formulas and only hold or display information.
Prediction is the
core
of the application. But, to make a complete solution there needs to be a structure around it. What will the user see? How will the user interact with the application? These questions are
answered
by design.
3.2.2. List the Requirements
This is important, and should be shared with users. It forces the developer and the user to think through how the application will be used. In this case there are seven basic requirements:
-
The system will display a full week of predictions.
-
The adjusted prediction for the
next
day will be shown.
-
Hourly forecasts will be calculated and displayed.
-
The system will provide a twenty day measure of prediction accuracy.
-
A method for entering hourly call
volumes
will be provided.
-
The system will detect anomalous situations and
react
to them appropriately.
-
A way of adding new data and deleting/correcting data in the system will be provided.
3.2.3. Consider the Source of Data
The data could come from a variety of sources. It could be in a SQL database, in an XML web service, or it could simply be keyed in. In the sample application the user will key the actual call volume for the previous day into the application. A VBA routine will control the process to ensure that the data is
sensible
and ends up in the right place. The application will, however, work just as well if the user keys it directly onto the Data sheet.
Figure 3-6 shows the layout of the Data sheet. The user will only provide the Actual value. The rest of the data will come from the application itself. There are no formulas on this sheet and formats play no role. Note that the Anomaly Flag is set for July 3rd.
A VBA routine to delete the last day entered will provide a simple editing scheme. To make the application use a different data source, only the VBA needs to be changed. We will examine the VBA later.
3.2.4. Presentation
The next step is to decide what the user will see. There are three things to consider. First, and at this point most important, are the information elements. Information is displayed as blocks of
related
data. The area that displays the forecasts for each day of the current week is an element. The hourly forecast graph is another. The contents and layout of the elements will guide the development of the logic on the Workarea sheet.
The next consideration is the placement of the elements. Related elements should support each other. Principal elements should be prominently placed. The last thing to consider is the formatting. A simple
color
scheme that puts the emphasis on the content is best. Placement and formatting are easy to change.
Figure 3-7 shows the main display, which looks and behaves like the home page of a web site, helping the user understand how the application works. The web page is an established metaphor and
tells
the user what to expect and how to interact with the application. Areas that display information retain an Excel look. This is another use of an established metaphor. Notice that only the information is displayed as black on white. Headings and labels use lower contrast. The emphasis is on the information.
The area on the left is set apart by using a different background color . It contains the navigation
buttons
and displays summary information. The main section shows the detailed forecasts.
The display is based on black, gray, and white. Colorful displays can cause problems. People are better at distinguishing the difference between light and dark than between colors. Some color combinations cause eye strain, and some users will be color vision
deficient
. If the display is printed, information represented by color can be lost.
The information shown on this sheet is actually on the Workarea sheet. All the references are to named cells and ranges. This has several advantages. It makes changing the display sheet easy. Areas on the sheet can simply be cut and pasted
anywhere
. It also
preserves
the application's organization. You know there is no logic on this sheet, only formatting.
Two additional display areas complete the application: a chart showing the accuracy of the predictions during the last twenty days and a table showing hourly predictions for the whole week. They will not fit on this sheet. So, two additional sheets are used. As with the Display sheet, they reference information on the Workarea sheet.
Figure 3-8 shows the accuracy chart.
The hourly forecasts are shown in Figure 3-9.
3.2.5. Conventions and
Names
This application is complex. Using names for important values
simplifies
formulas and makes the application easier to understand. It also aids in troubleshooting or enhancing the logic. Another way to keep things from getting out of hand is following a set of conventions for the Workarea sheet. There are a few simple rules. Information from the Data sheet is displayed in blue font. Information to be displayed has a gray background. Information to be stored on the Data sheet has a blue background. Calculation areas have a border. Calculated values that are used in other calculations or macros are in Column A and are named.
3.2.6. Named Values and Ranges on Settings
The Settings sheet, shown in Figure 3-10, contains options that can be changed by the user.
The application uses named cells and ranges to isolate the interfaces between sheets. These
name
cells and ranges are defined on the Settings sheet.
-
Heading (B1)
-
This is the heading that appears at the top of the Display sheet.
-
Lag (B2)
-
Lag is the number of values per cycle in the data. The sample data uses a five-day week so Lag is 5. The application is designed to work with any value from 2 to 7.
-
Work Item (B3)
-
This is the name of the work item. In the sample data it is Calls, but depending on the work being done it could be checks, orders,
cars
, or any unit or work. This value is used in headings and labels.
-
Confidence Level (B4)
-
The confidence level determines the probability used to set the range for the adjusted prediction.
-
Anomaly Detection (B5)
-
The application calculates the probability that the actual value for the day is an anomaly. This user-set value is used to decide if the current value should be ignored.
-
HourlyDist (E2:K12)
-
This area contains the expected work distribution during the day. It uses an 11-hour
workday
, but any number of divisions is possible. For each day of the week, several weeks of hourly data were used to determine what percentage of the daily calls arrived during each
hour
. This table contains that distribution, and is used to generate the hourly forecasts.
-
Periods (D2:D12)
-
These are the labels for the items in HourlyDist. In the sample data they are hours taken on the half hour.
3.2.7. The Named Values on the Workarea Sheet
The first part of the Workarea sheet is shown in Figure 3-11.
The Workarea sheet contains all the logic. The named values from this sheet are described in Table 3-3.
Table 3-3. Descriptions of the named values from the Workarea sheet
|
Named value
|
Formula
|
Description
|
|
Last_Row
(
cell
A2)
|
={MAX((ROW(Data!B1:B2000)*(Data!B1:B2000<>"")))}
|
This is the row number of the last row used on the Data sheet. It is an array formula that multiplies row
numbers
by a truth value (1 or 0) that is 0 if the cell in that row is empty. This value is used to locate the part of the Data sheet to be used and to determine what row to put new data on.
|
|
MyTop
(cell A4)
|
=Last_Row-56
|
The calculations require 56 days of data. This value is one row above the first row to be used on the Data sheet. It is one row above so that a formula using this value can be filled down. It controls the row numbers of data being linked between the Data and Workarea sheets.
|
|
DayofWeek
(cell A6)
|
=WEEKDAY(C56)
|
Dates from the Data sheet are in column C. Since 56 rows are used, the last day entered will be in C56. This formula gives the day of the week for the last day entered. The application displays predictions for this week and next week. To do this it has to be able to find the row with the first workday of the week. DayofWeek is used in these calculations and for display.
|
|
FirstWorkDay
(cell A13)
|
={MIN(WEEKDAY(C49:C56))}
|
In the sample data the work week starts on Monday. But a workweek could start on any day. The application uses the minimum value of
WEEKDAY
over eight days to determine what day the workweek starts on. The value is also used to find the beginning of the current weeks.
|
|
StartofWeek
(cell A11)
|
=56-((DayofWeek-FirstWorkDay))
|
This is the row where the current week starts on the Workarea sheet. It is used in week-related formulas.
|
|
Interval
(cell A17)
|
={CONFIDENCE(1-Confidence_Level,STDEV(G37:G56-D37:D56),20)}
|
This is the confidence interval for the adjusted prediction. It is an array formula. Column G contains the adjusted predictions and column D the actual values. The sample
size
is 20. Confidence_Level is a named value on the Setting sheet. Interval is a display item.
|
|
CurrentHour
(cell A22)
|
={IF(SUM(T48:T58)<1,0,MAX((T48:T58>0)*ROW(T48:T58)))}
|
When the user enters hourly actual call volumes, they are linked to the T column on the Workarea sheet. The application
adjusts
the forecast during the day as hourly actuals are entered. CurrentHour keeps up with the entry of hourly information. It points to the last hour entered.
|
|
CurrentRatio
(cell A24)
|
=IF(CurrentHour=0,1,INDIRECT("U"&A22)/INDIRECT("V"&A22))
|
This
compares
the expected volume to the volume for today (up to the current hour). The ratio tells if the day (so far) is high or low and by how much. This value is used to adjust the current day's hourly forecasts.
|
|
Prediction
(cell I9)
|
=INT((SUM(I1:I8)-(MAX(I1:I8)+MIN(I1:I8)))/6)
|
This is the weekly prediction, the filtered average. It is based on the calculation area above it (I1:I8), which contains the actual values for the current day of the week for the last eight weeks. This value is
copied
to the D column on the Data sheet when new data is entered.
|
|
Anomaly
(cell L43)
|
=IF(L42>=Anomaly_Detection,1,0)
|
This is the anomaly flag for the current actual value. It is the result of the calculation area above it (L38:L42). Anomaly_Detection is a user-set value on the Settings sheet. This value is copied to the A column on the Data sheet.
|
|
Adjusted Prediction
(cell G57
)
|
=IF(F56=0,"",F57*((1+(E56/F56))/2))
|
This is the adjusted prediction for the current day. The weekly prediction for this day (F57) is multiplied by half the error ratio for the previous day. This value is copied to the E column on the Data sheet.
|
3.2.8. Named Ranges on Workarea
In this section we look at the named ranges on Workarea. Each area
manages
a set of data and calculations for one piece of the logic. They also arrange the data to fit the needs of the display sheet.
This part of Workarea is shown in Figure 3-12.
-
ThisWeek (M3:Q9)
-
This area contains the day of the week, date, weekly prediction, adjusted prediction, and actual value for each day of the current week. The calculation area in column L is a set of flags that tell which days are in use. The basic formula in this area is
=IF(L3=1,INDIRECT("workarea!C" & StartofWeek + ROW(A1)-1),"")
. A reference to a cell is built inside the
INDIRECT
function. It starts as a literal specifying the Workarea sheet's column C. It uses
StartofWeek
and a row number to get the correct row. This formula fills down. Data is pulled from
columns
C, F, G, and D. The formula in column M,
=IF(L3=1,TEXT(WEEKDAY(N3),"dddd"),"")
, returns the name of the day of the week. The
WEEKDAY
function returns a number, and the
TEXT
function
translates
the number into the day of the week.
-
NextWeek (N15:O21)
-
This area contains the dates and weekly predictions for next week. It uses the same basic technique as ThisWeek. The formula is
=IF(L3=1,INDIRECT("workarea!C" & Lag + StartofWeek + ROW(A1)-1),"")
. The only difference is that
Lag
is added to
StartofWeek
.
-
HourlyNextWeek (Z20:AG31)
-
These are the hourly forecasts for the whole week. This area is displayed on the Hourly sheet. The calculation area to the left links to the HourlyDist area on Settings. The values in S20:Y20 are the weekly predictions for each of the seven days of the current week. To get the hourly forecast, the prediction for the day is multiplied by the distribution value for each hour of the day.
Figure 3-13 contains the HourlyNextWeek part of the Workarea sheet.
-
Hours (R48:S58)
-
Times and hourly predictions for the current day are in this range. This works like HourlyNextWeek, but it is only for the current day. So, it has to find the right column in Settings. The formula is
=INDIRECT("Settings!" & ADDRESS(ROW(A2),DayofWeek+4))
. Here the Settings sheet is the target, but both the row and column are calculated. The
ADDRESS
function inside of an
INDIRECT
provides the flexibility to read from any row or column.
4
is added to
DayofWeek
because the hourly distribution value starts in the fifth column of Settings.
-
AdjustedHourly (W48:W58)
-
This area contains the adjusted hourly forecast for the current day. It works with Hours to build the hourly area on the Display sheet. As the user enters actual hourly values, they appear in column T. Column U keeps a running total for the day and column V does the same for the
predicted
hourly values. CurrentRatio contains the total volume entered so far divided by the expected volume for the same hours. The hourly forecasts are adjusted by multiplying them by CurrentRatio.
Figure 3-14 shows the part of Workarea that deals with the hourly calculations.
-
Weekly (H60:H63)
-
The values in this area give the average weekly prediction error and error percentage over the last 20 days. It also contains the average number of daily calls during that period.
-
NextDay (H67:H75)
-
This is summary information about the current day. It gives the adjusted forecast, the confidence interval, the probability of anomaly, and the most recent adjusted forecast based on the hourly actual values.
-
Adjusted (J60:J61)
-
This is the 20 day average error and error percent for the adjusted forecast.
The part of Workarea that handles the error calculations is shown in Figure 3-15.
3.2.9. Other Important Links on Workarea
-
(C37:D56) & (F37:G56)
-
These ranges link to the chart on the AccuracyChart sheet, and link back to the Data sheet.
-
(T48:T85)
-
This area links to H7:H17 on the Display sheet. This is where the user enters the actual hourly values. This information feeds the calculations that adjust the hourly forecasts as the day progresses.
3.2.10. Linking to the Data
We now know what the Workarea sheet has to do. It is time to build the logic. The first step is to link the Workarea to the data on the Data sheet. To handle all the possible lags and days of the week, the
INDIRECT
function is used. The calculations use eight weeks of history and there can be up to seven workdays in a week. So, 56 days of history must be available.
The
Last_Row
named cell on the Workarea sheet contains the row number of the last row used on the Data sheet. The application needs to start 55 rows above the last row. MyTop contains the starting row number. The data starts in cell B1. The formula is
=INDIRECT("Data!A" & MyTop + ROW(A1))
. It references column A on the Data sheet. Note that MyTop points to the cell one row above the first one needed. This allows the formula to be copied down.
Row(A1)
is added to
MyTop
.
Row(A1)
is one, but as the formula copies down
A1
becomes
A2
,
A3
, etc. This gives the correct row for each value. The cells in columns C and D work the same way. The three columns are filled down to row 56.
Column E masks anomalies. If the anomaly flag is set in column B, the actual value for that day is ignored, and the value from the previous week is substituted. The formula is
=IF(B1=0,D1,INDIRECT("Data!C" & (MyTop + ROW(A1)-Lag)))
. This also fills down to row 56. This is the value used in the prediction calculation.
Predictions from the last 20 days are used for several calculations and for a chart. So, we need to bring them onto Workarea. The weekly predictions are in the range F37:F63 and the adjusted predictions are in G37:G56. These cells use the same formula as in columns B, C, and D, but reference a different column on the Data sheet. The weekly predictions in column F fill down to 63 because we are predicting a week into the future.
These formulas keep Workarea linked to the Data sheet. When new information is added to the Data sheet, Workarea automatically updates, and all of the values are ready to use.
3.2.11. Visual Basic
This application uses Visual Basic for Applications (VBA). VBA is a powerful tool and is easily over-used. In general, it is best to do as much as possible on the workbook and only use VBA for things that Excel cannot do. The code is in Module One (the default module for a spreadsheet) and can be
viewed
using the Visual Basic Editor. To reach the editor, select Tools
Macro
Visual Basic Editor. You can also launch the editor by pressing Alt-F11.
VBA is not necessary for the application to work. The most complex VBA operation is adding a new day's data, and that can be done manually by going to the bottom of the Data sheet and typing the values in. In this application VBA is used for three things.
First, for navigation, there are buttons on the sheets that move the user between sheets and ensure the view is set to the top left corner. This works like the sheet tabs, but gives flexibility in securing the application and helps make the application a complete package. Navigation code looks like this.
Sub AccuracyChart( )
'*******************
' Navigation
' This macro takes the user to
' the AccuracyChart worksheet and selects
' cell A1.
'*******************
Sheets("AccuracyChart").Select
Range("A1").Select
End Sub
This code only does two things. It selects the sheet and then it selects cell A1.
Next, this macro adds a new day to the Data sheet. This can easily be modified to use an automated data source such as an SQL database. The variable
NewActual
could be
populated
by any method. This may look like a
fairly
involved routine, but actually it merely moves data around. All the calculations are done by Excel.
Sub AddDay( )
'*****************************************
' This routine allows the user to
' enter the actual for the next day.
' The entry is checked and if it
' is valid the new data is moved to the
' Data sheet.
'******************************************
Dim myItem As String
Dim myDate As Date
Dim NewActual As Variant
Dim NextRow As Integer
Dim Anomaly As Double
Dim Prediction As Integer
Dim AdjustedPrediction As Integer
Dim TheLag As Integer
myItem = Range("Item").Value ' Read the item from Settings
myDate = Range("workarea!c57").Value ' Get the next date
' Use an input box to allow the user to enter the next actual
NewActual = InputBox("Please enter the number of " & LCase(myItem) & _
" for " & myDate & ".", "Enter Actual", 0)
' If cancel or no entry quit
If NewActual = "0" Or NewActual = "" Then Exit Sub
If Not IsNumeric(NewActual) Then ' Is the value a number?
MsgBox ("The Actual must be a number.") ' If not put up a message
Exit Sub ' and quit
End If
If Val(NewActual) < 0 Then ' Is the value less than zero?
MsgBox ("The Actual cannot be negative.")
Exit Sub
End If
' Is the value not an integer?
If Val(NewActual) <> Int(Val(NewActual)) Then
MsgBox ("The Actual must be an integer.")
Exit Sub
End If
NextRow = Range("Last_row").Value + 1 ' Get the row number of the next
' row on the Data sheet
Range("data!b" & NextRow).Value = myDate ' put the new date on the sheet
Range("data!c" & NextRow).Value = NewActual ' put the new actual on the sheet
TheLag = Range("Lag").Value ' get the lag from Settings
' Prediction will update when the new actual is put on the Data sheet
Prediction = Range("Prediction").Value
' Anomaly will also update automatically
Anomaly = Range("Anomaly").Value
' The anomaly goes on the Data sheet next
' since it is used as part of the calculation
' of the adjusted forecast
Range("data!a" & NextRow).Value = Anomaly
' get the adjusted forecast
AdjustedPrediction = Range("AdjustedPrediction").Value
' The weekly prediction goes one lag down from the current day
Range("data!d" & NextRow + TheLag).Value = Prediction
' the adjusted prediction is put on the next row
Range("data!e" & NextRow + 1).Value = AdjustedPrediction
' clear the area where the hourly actuals are entered
Range("h7:h17").ClearContents
Range("a1").Select ' select A1
End Sub
The last macro allows the user to delete the data for the last day entered. This provides
a simple editing capability. All this does is find the cells populated when the last day
was entered then selects and clears them.
Sub DeleteDay( )
'*******************************************
' This sub deletes the last day. All that is
' necessary is to remove the last entry from
' the Data sheet.
'********************************************
Dim LastRow As Integer
Dim TheLag As Integer
LastRow = Range("Last_row").Value ' First we need to know where the
' last row is. The number of the
' last row used on the data sheet
' is in a named cell on the workarea
' sheet. This statement stores the
' row number in a variable called
' LastRow
Sheets("data").Select ' All of the data to be deleted is on the
' the data sheet. So we start by selecting
' that sheet.
' The data is in columns A-E. We need only build a reference to
' the cells and clear the contents.
' For columns A,B, and C the data
' to be deleted is in LastRow. So,
' we can clear them at the same time.
Range("data!a" & LastRow & ":c" & LastRow).ClearContents
TheLag = Range("Lag") ' To determine the row of the last
' weekly prediction we need to know the
' lag. Lag is a named cell on the Settings
' sheet.
' This section deletes the last weekly
' prediction. It is one Lag below the
' LastRow.
Range("data!d" & LastRow + TheLag).ClearContents
' The last adjusted prediction is one
' row below LastRow
Range("data!e" & LastRow + 1).ClearContents
Sheets("display").Select ' Return to the Display sheet
Range("h7:h17").ClearContents ' clear the area where the
' hourly actuals are entered.
Range("a1").Select
End Sub
With Workarea and the VBA complete, it is time to build the application's user interface.
3.2.12. Formatting
Formatting in this application uses
mainly
backgrounds and borders. However, there is some conditional formatting on the Display sheet. In the ranges G23:J28 and M23:M28, the number of rows that will contain data varies with the number of workdays in the week. Therefore, a conditional format is set to include those cells in the display if they are used and to gray them out if they are not.
The hourly sheet is just patterns and borders, as shown in Figure 3-16.
This sheet may be over-formatted, but it shows how metaphor can imply function and guide user expectations.
3.2.13. Running the Application
The application was written using Excel 2002. It uses no third party software. There are some formatting issues if it is run in Excel 95, but all the logic works even in that older version of the software.
Start by opening the application to the Display sheet. In the Next Day area on the left, the value for Current Adjusted is 11,024. That is the adjusted prediction for the current day. The hourly forecasts for the first three hours are 390, 835, and 1,129. Suppose that the actual call counts for those hours are 500, 1,000, and 1,300. In Figure 3-17, in item 1 those volumes have been entered as the actuals for those hours. In item 2, the Hourly Chart is now showing that the day is higher than the forecast. If the first three hours are higher than expected, in this case about 19% higher, we would expect the rest of the day to be high as well. The Current Adjusted has increased to 13,106. In the Hourly Forecasts area the adjusted hourly forecasts have also been increased.
The main display is described in Figure 3-17.
The current day is Friday 9/7/1998. That is the day in the Next Day area on the left. In the This Week area the full current week is shown. Item 3 points to the current day. The Forecast and Adjusted cells are filled in because they are predictions. Actual is blank since this is the day that is now working. In item 4, the Forecast for next Friday is not filled in. This is because we are predicting a week out and cannot predict next Friday until the Actual for this Friday is entered.
Use the Hour Forecast button to view the hourly predictions for the current week. The Accuracy Graph button displays a chart showing the weekly predictions, the adjusted predictions, and the actual values for the last 20 days.
The sample data ends on 8/6/98. The next week of call volumes is:
-
10,864
-
14,711
-
14,977
-
13,255
-
11,972
Click the NewDay button and enter 10,864, the value for the next day. The display will update automatically. This is
illustrated
in Figure 3-18.
Monday 8/10/1998 is now the working day, and all the viewing areas are reset to the beginning of the week. The Delete Day button will remove the last entered day from the application. So if you
mistyped
10,864 or just want to go back to Friday, press Delete Day.
|
The application is just an Excel workbook, so when new data is entered it is important to save changes just as you would with any Excel project.
|
|
3.2.14. Customizing the Application
To use the application with your data, the Data sheet should be cleared, and 56 days of dates and actual volumes should be pasted into columns B and C starting in row 2. Row 1 contains headings. Change the Settings sheet for the heading, item, lag, and sensitivity as needed. It will take one more full cycle of data for all the features to populate. It is
easiest
to enter this week using the AddDay macro, but the values can be typed directly into the Data sheet.
The application is self-contained, and needs no other software except Excel. Later in the book we look at more advanced prediction techniques that could be incorporated into this project by changing the Workarea sheet. It is possible to enhance the application in many ways:
-
A staffing model could be built by adding a new worksheet and some new logic on Workarea. A new entry for units of work per day can be added to Settings and used to calculate required staff.
-
The
AddDay
macro could be enhanced to update the HourlyDist range on Settings. This would keep the expected hourly work distribution current.
-
The Accuracy Chart could be
replaced
with a Pivot Chart, allowing prediction accuracy to be
analyzed
in more detail.
-
The Display sheet can be reformatted and information elements can be moved by Cut and Paste. Changing the information elements
themselves
is more complex and requires changes to the Workarea.
The key to successfully modifying the application is to follow the conventions. Keep the logic on Workarea. If you need to introduce a new data item put it on the Data sheet. New parameters go on Settings. Be sure to name everything you use.
|