Problem

You'd like to smooth data in a time series.

Solution

Use Excel's exponential smoothing feature in the Analysis ToolPak or construct a smoothed data series yourself using spreadsheet functions and VBA.

Discussion

Smoothing data is often desirable to remove unwanted noise in a data series. Computing moving averages as discussed in Recipe 6.3 is actually a smoothing process. Besides the moving averages methods discussed earlier, there are other ways to smooth data. Excel offers an exponential smoothing feature as part of the Analysis ToolPak. Further, you can construct any smoothing operations you desire using standard spreadsheet functions (and/or VBA). You already saw this approach in the previous recipe; however, this time I'll show you how to use a weighted moving average that makes use of a cubic spline interpolation kernel.

Exponential smoothing

Exponential smoothing is also a weighted-averaging technique. The idea behind weighted averaging is to give data values closest to the value being forecast or estimated greater importance, or influence, as opposed to values further away. Exponential smoothing uses the following formula:

Fn+1 is the value being estimated at time interval n + 1. a is a weighting factor, which is called a damping factor in Excel. Fn is the previous estimated value, and Yn is the previous actual value in the original data series.

To use exponential smoothing, select Tools images/U2192.jpg border=0> Data Analysis... from the main menu bar to open the Data Analysis dialog box (shown earlier in Figure 6-7). Select Exponential Smoothing from the list of available analyses. Doing so opens the Exponential Smoothing dialog box shown in Figure 6-11.

Figure 6-11. Exponential Smoothing dialog box

In the Input Range field, type (or select on the spreadsheet) the cell range containing the input data you wish to smooth. Enter a damping factor in the "Damping factor" field or leave it blank to use the default value of 0.3. Enter a reference to the topmost cell in the desired output cell range in the Output Range field.

Press OK when you're done and you should see the results on your spreadsheet. Figure 6-12 shows an example for the same annual temperature data used in the previous recipe. The resulting data series after smoothing is contained in column R.

Figure 6-12. Exponential smoothing results

Just like the Moving Average tool discussed earlier, the Exponential Smoothing tool places a cell formula rather than a value in each cell of the resulting data series. This way, if your original data changes, the smoothed data will update automatically. Cell R10 is selected in Figure 6-12 so you can see what the formulas look like. The formula is shown in the formula bar and is of the form `=0.7*O9+0.3*R9`. As you can see, this formula corresponds to the exponential smoothing equation I showed you earlier. Of course, you can enter these formulas yourself, bypassing the need to use the Analysis ToolPak, if you desire.

Figure 6-13 shows a plot of the smoothed data series superimposed over the original series so you can see the difference between the two.

Figure 6-13. Exponentially smoothed temperature data

Other weighted averaging techniques are commonly used in time series analysis. In the next section I'll show you one based on a cubic spline interpolation kernel.

Kernel smoothing

In this example, I'll show you a weighted averaging technique that uses a cubic spline interpolation kernel designed to approximate a Gaussian kernel. You can use all sorts of different kernels for smoothing with only slight modifications to the technique I'll present. The kernel I'll use is:

r represents the distance between data points and h represents the effective smoothing radius divided by 2. I'll use this smoothing function to compute weights when computing weighted averages of the time series. To make this task easier, I added a VBA function called `Wcs` that you call from the spreadsheet. Example 6-1 shows my VBA implementation for this smoothing kernel.

Example 6-1. Cubic spline kernel

Public Function Wcs(r As Double, h As Double) As Double Dim s As Double Dim w As Double s = r / h If s > 2 Then w = 0 ElseIf (s <= 2) And (s > 1) Then w = (1# / 3#) * (2# - s) ^ 3 ElseIf (s <= 1) And (s >= 0) Then w = (4# / 3#) * (1 - (3# / 2#) * s ^ 2 + (3# / 4#) * s ^ 3) End If Wcs = w End Function |

This is a fairly straightforward implementation of the equation I showed earlier, so I won't go over each line of code. Refer to Chapter 2 if you're not already up to speed on using VBA and adding custom functions and subroutines.

I'll use this function to generate weights in order to smooth the same temperature data discussed earlier. Figure 6-14 shows a weight table I computed using this function.

Figure 6-14. Weight table

The first column in the table, labeled `r`, contains relative indices around the fourth index, labeled `0`. We can do this here to compute r because the time series data is sampled at evenly spaced intervals. The second column contains formulas like `=Wcs(U10,2)`. This is the formula contained in cell V10. All the other cells contain similar formulas. The sum of the weights (computed using the `SUM` formula shown in cell V14) is required to normalize the weighted average computation. You can avoid this step if your kernel integrates to unity over its range of support.

Now, to compute the smoothed data series, I set up another column in my spreadsheet, as shown in Figure 6-15.

Figure 6-15. Kernel-smoothing results

The smoothed data is contained in column S under the heading `Kernel Smoothing`. I selected the first smoothed data value in cell S10 in the figure so you can see the cell formula. The formula is `=SUMPRODUCT(R7:R13,$V$7:$V$13)/$V$14`. As you can see, it uses the `SUMPRODUCT` formula to compute the sum of the products of each corresponding term in the two given cell ranges (see Chapter 7 for more information on this and other handy functions). In this case, the first cell range corresponds to the six data items around the given data item, including the given data item. This means the weighted average in this case is centered over the ith data item and includes the weighted influence of the three previous items and the three following items.

The second cell range contains the weights shown in Figure 6-14. Further, dividing by the sum of weights, contained in cell V14, normalizes the result. Once the first formula was set up, I simply copied and pasted it into the remaining cells in the data series.

Notice that we do lose some data. Specifically, we lose the first and last three data items in the range because of the range of influence of the weighting function I used. You can increase the range of influence by increasing the value for the `h` parameter passed into `Wcs`, or you can decrease it by decreasing this value. The greater the range of influence (hence the greater the number of data items in the series averaged for each smoothed result), the smoother the resulting data series.

Figure 6-16 shows a plot of the smoothed data series compared to the original data series; the effect of smoothing is clearly evident.

Figure 6-16. Kernel-smoothed temperature data

See Also

See Recipe 6.3 to learn how to compute non-weighted moving averages in Excel. Smoothing data using the techniques in this and the previous recipe is actually a form of filtering. You can also filter data in the frequency domain in Excel using Fourier transforms; read Recipe 6.10 to learn more.

Using Excel

- Introduction
- Navigating the Interface
- Entering Data
- Setting Cell Data Types
- Selecting More Than a Single Cell
- Entering Formulas
- Exploring the R1C1 Cell Reference Style
- Referring to More Than a Single Cell
- Understanding Operator Precedence
- Using Exponents in Formulas
- Exploring Functions
- Formatting Your Spreadsheets
- Defining Custom Format Styles
- Leveraging Copy, Cut, Paste, and Paste Special
- Using Cell Names (Like Programming Variables)
- Validating Data
- Taking Advantage of Macros
- Adding Comments and Equation Notes
- Getting Help

Getting Acquainted with Visual Basic for Applications

- Introduction
- Navigating the VBA Editor
- Writing Functions and Subroutines
- Working with Data Types
- Defining Variables
- Defining Constants
- Using Arrays
- Commenting Code
- Spanning Long Statements over Multiple Lines
- Using Conditional Statements
- Using Loops
- Debugging VBA Code
- Exploring VBAs Built-in Functions
- Exploring Excel Objects
- Creating Your Own Objects in VBA
- VBA Help

Collecting and Cleaning Up Data

- Introduction
- Importing Data from Text Files
- Importing Data from Delimited Text Files
- Importing Data Using Drag-and-Drop
- Importing Data from Access Databases
- Importing Data from Web Pages
- Parsing Data
- Removing Weird Characters from Imported Text
- Converting Units
- Sorting Data
- Filtering Data
- Looking Up Values in Tables
- Retrieving Data from XML Files

Charting

- Introduction
- Creating Simple Charts
- Exploring Chart Styles
- Formatting Charts
- Customizing Chart Axes
- Setting Log or Semilog Scales
- Using Multiple Axes
- Changing the Type of an Existing Chart
- Combining Chart Types
- Building 3D Surface Plots
- Preparing Contour Plots
- Annotating Charts
- Saving Custom Chart Types
- Copying Charts to Word
- Recipe 4-14. Displaying Error Bars

Statistical Analysis

- Introduction
- Computing Summary Statistics
- Plotting Frequency Distributions
- Calculating Confidence Intervals
- Correlating Data
- Ranking and Percentiles
- Performing Statistical Tests
- Conducting ANOVA
- Generating Random Numbers
- Sampling Data

Time Series Analysis

- Introduction
- Plotting Time Series Data
- Adding Trendlines
- Computing Moving Averages
- Smoothing Data Using Weighted Averages
- Centering Data
- Detrending a Time Series
- Estimating Seasonal Indices
- Deseasonalization of a Time Series
- Forecasting
- Applying Discrete Fourier Transforms

Mathematical Functions

- Introduction
- Using Summation Functions
- Delving into Division
- Mastering Multiplication
- Exploring Exponential and Logarithmic Functions
- Using Trigonometry Functions
- Seeing Signs
- Getting to the Root of Things
- Rounding and Truncating Numbers
- Converting Between Number Systems
- Manipulating Matrices
- Building Support for Vectors
- Using Spreadsheet Functions in VBA Code
- Dealing with Complex Numbers

Curve Fitting and Regression

- Introduction
- Performing Linear Curve Fitting Using Excel Charts
- Constructing Your Own Linear Fit Using Spreadsheet Functions
- Using a Single Spreadsheet Function for Linear Curve Fitting
- Performing Multiple Linear Regression
- Generating Nonlinear Curve Fits Using Excel Charts
- Fitting Nonlinear Curves Using Solver
- Assessing Goodness of Fit
- Computing Confidence Intervals

Solving Equations

- Introduction
- Finding Roots Graphically
- Solving Nonlinear Equations Iteratively
- Automating Tedious Problems with VBA
- Solving Linear Systems
- Tackling Nonlinear Systems of Equations
- Using Classical Methods for Solving Equations

Numerical Integration and Differentiation

- Introduction
- Integrating a Definite Integral
- Implementing the Trapezoidal Rule in VBA
- Computing the Center of an Area Using Numerical Integration
- Calculating the Second Moment of an Area
- Dealing with Double Integrals
- Numerical Differentiation

Solving Ordinary Differential Equations

- Introduction
- Solving First-Order Initial Value Problems
- Applying the Runge-Kutta Method to Second-Order Initial Value Problems
- Tackling Coupled Equations
- Shooting Boundary Value Problems

Solving Partial Differential Equations

- Introduction
- Leveraging Excel to Directly Solve Finite Difference Equations
- Recruiting Solver to Iteratively Solve Finite Difference Equations
- Solving Initial Value Problems
- Using Excel to Help Solve Problems Formulated Using the Finite Element Method

Performing Optimization Analyses in Excel

- Introduction
- Using Excel for Traditional Linear Programming
- Exploring Resource Allocation Optimization Problems
- Getting More Realistic Results with Integer Constraints
- Tackling Troublesome Problems
- Optimizing Engineering Design Problems
- Understanding Solver Reports
- Programming a Genetic Algorithm for Optimization

Introduction to Financial Calculations

- Introduction
- Computing Present Value
- Calculating Future Value
- Figuring Out Required Rate of Return
- Doubling Your Money
- Determining Monthly Payments
- Considering Cash Flow Alternatives
- Achieving a Certain Future Value
- Assessing Net Present Worth
- Estimating Rate of Return
- Solving Inverse Problems
- Figuring a Break-Even Point

Index

Excel Scientific and Engineering Cookbook (Cookbooks (OReilly))

ISBN: 0596008791

EAN: 2147483647

EAN: 2147483647

Year: N/A

Pages: 206

Pages: 206

Authors: David M Bourg

Similar book on Amazon

- Challenging the Unpredictable: Changeable Order Management Systems
- The Effects of an Enterprise Resource Planning System (ERP) Implementation on Job Characteristics – A Study using the Hackman and Oldham Job Characteristics Model
- Data Mining for Business Process Reengineering
- Intrinsic and Contextual Data Quality: The Effect of Media and Personal Involvement
- A Hybrid Clustering Technique to Improve Patient Data Quality

Flylib.com © 2008-2017.

If you may any questions please contact us: flylib@qtcs.net

If you may any questions please contact us: flylib@qtcs.net