20. DOE Optimizing
Overview
Optimizing a process using Designed experiments involves
Before reading this section, the reader should have read and understood "DOE Introduction," "DOE Characterizing," and "DOE Screening" in this chapter. This section covers only designing, analyzing, and interpreting an Optimizing Design, not the full DOE roadmap. Optimizing Designs (sometimes known as Response Surface Methodology[37]) in Lean Sigma are based on the set of experimental design techniques known as sequential experimentation and rely on Screening Designs as well as Characterizing Designs. Screening Designs narrow the Xs down to the few potential candidates. Characterizing Designs determine, for the shortened list of Xs, how much each X contributes and how it contributes. The result of Screening and Characterizing Designs is a shortlist of critical Xs that need to be controlled to maintain a consistent level of the Y. What they do not provide is an optimized Y.
To explain this more readily, it is best to follow an example. Assume the first DOE work was a Screening Design that narrowed the Xs down to two key Xs. The associated Y is the Yield of the process. A subsequent Full Factorial on the two Xs yielded the results shown in Figure 7.20.1. The model explains 91.06% of the variation in the data and both Xs are significant.
To take this analysis further, an understanding of the solution space needs to be gained. Because there are only two Factors involved here, it is possible to represent the solution space graphically with a Surface Map of the model, as shown in Figure 7.20.2.
Figure 7.20.2. Surface Plot of Yield versus Time and Temperature (output Minitab v14).
Clearly, to improve the Yield, the Time and Temperature need to be changed to levels outside the Design space, preferably toward the top right-hand corner; the question is where. The Surface Map gives a good visual representation of the direction to travel, but the Contour Plot, as shown in Figure 7.20.3, shows more precisely the best direction to travel in terms of Time and Temperature to get an improved Yield.
Figure 7.20.3. Contour Plot of Yield versus Time and Temperature (output Minitab v14).
If the journey is commenced from the Center Point of the Plot, the fastest way to move in the direction of increased Yield is known as the Path of Steepest Ascent and is perpendicular to the contour lines, as drawn on Figure 7.20.3. To proceed, additional data points (experimental runs) should be taken at regular, reasonable intervals up the Path of Steepest Ascent until a maximum Yield is reached. This direction can be approximated by eye if there are only two Factors (as in this case) or calculated from the analysis results in Figure 7.20.1 irrespective of the number of Factors involved. Either way is appropriate in this case, where there are only two Factors; however, Belts can become a little confused with the latter approach, so the former is recommended here.
To approximate the Path of Steepest Ascent, use the Contour Plot as shown in Figure 7.20.3. Extend the arrow out from the Center Point perpendicular to the contours until it reaches the boundary of the square. Read off the coordinates of the point where the arrow touches the square; in this case, Time = 78.2 and Temperature = 132.5. As Temperature goes up 2.5 units from the Center Point, Time needs to go up approximately 3.2 units to stay on the Path of Steepest Ascent. From this, a simple table can be created and populated as shown in Table 7.20.1. Only a few of the steps were actually run; their associated Yield values are recorded in the table.
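The settings along the by-eye Path can also be generated with a few lines of code rather than by hand. The following is a minimal sketch, not the method used to build Table 7.20.1. The function name `path_by_eye` is hypothetical, and the Center Point of (75, 130) is an inference from the text (the arrow tip at 78.2, 132.5 minus one step of 3.2, 2.5), not a value quoted directly. It lists only the X settings; the Yield values have to come from actual runs on the process.

```python
def path_by_eye(center, step, n_steps):
    """List (label, time, temperature) settings along a straight path.

    center: (time, temperature) at the Center Point, in natural units.
    step:   (time step, temperature step) per move along the path.
    """
    rows = []
    for k in range(n_steps + 1):
        if k == 0:
            label = "Origin"
        else:
            label = f"Origin + {k} Step{'s' if k > 1 else ''}"
        time = center[0] + k * step[0]
        temp = center[1] + k * step[1]
        rows.append((label, round(time, 1), round(temp, 1)))
    return rows


# Assumption: Center Point inferred from the text as the arrow tip
# (78.2, 132.5) minus one step (3.2, 2.5).
CENTER = (78.2 - 3.2, 132.5 - 2.5)
STEP = (3.2, 2.5)

if __name__ == "__main__":
    for label, time, temp in path_by_eye(CENTER, STEP, 8):
        print(f"{label:<18} Time = {time:6.1f}   Temperature = {temp:6.1f}")
```

Under these assumptions, the fifth step falls at Time 91.0 and Temperature 142.5, and the eighth at Time 100.6 and Temperature 150.0.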
Clearly the Yield reaches a maximum value somewhere around "Origin + 5 Steps" at [91, 142], so no further runs are conducted after "Origin + 8 Steps." The experimental sequence could be halted at this point with a new, much improved Yield of 85%, but there might be an even better Yield value somewhere in the solution space. The approach at this point would be to start the process again with a new Full Factorial centered somewhere around the highest point found from the Path of Steepest Ascent. Based on attainable settings for the Xs, the Team, in this case, decides to center the experiment on [90, 145] and stretch the levels of the Xs to cover a broader area, as shown in Figure 7.20.4.
Figure 7.20.4. Sequence of experimentation for the Time-Temperature example.
The Team chose new levels:
The Team designed a Full Factorial, shown as Experiment 3 in Figure 7.20.4. The runs in Experiment 3 were conducted on the process (including two Center Points) and the associated Yield recorded. Analysis of the runs gave the results shown in Figure 7.20.5.
At first glance, the results seem a little strange; the linear model is no longer correct (high p-values) and there are now interaction and curvature effects (both with low p-values). Upon further consideration, though, this makes complete sense. The previous experiment showed linear effects representing the side of a hill. Here it looks as though the experiment is sitting on the top of the same hill. The absence of linear effects and the presence of curvature are both indicative of a peak in the response surface. It could be that the experimental region is close to the maximum value for Yield.
If there is curvature present, then a linear model is no longer of any use; a quadratic model would be better. To see curvature in a given direction, at least three points are needed in a line. To do this here, the model is expanded by adding what are known as "Star Points" or "Axial Points" to create a new design, known as a Central Composite Design, as shown in Figure 7.20.6.
Figure 7.20.6. Adding a Central Composite Design to the Time-Temperature example.
This design is efficient in representing curvature. As can be seen from the figure, the addition of the Star Points creates a line of three points horizontally, vertically, and on both diagonals. Only the four Star Points are required as extra runs to create this design from the Full Factorial. Two additional Center Points are usually added to make sure that conditions haven't changed from when the Full Factorial was run to when the additional points were run. All the Full Factorial runs, along with the Star Points and all four Center Points, are analyzed together in the Central Composite Design. A Block is used to distinguish the first set of runs (Full Factorial) from the second set (Star Points) and to check that no conditions have changed. The results are shown in Figure 7.20.7.
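The run list described above can be laid out in coded units with a short script. This is a minimal sketch under stated assumptions, not the output of any particular DOE package. The function name `central_composite_design` is hypothetical, and the Star Point distance defaults to the rotatable value (the fourth root of the number of factorial corners, about 1.414 for two Factors), which is an assumption because the text does not state the distance used. Two Center Points per Block follow the text.

```python
from itertools import product


def central_composite_design(n_factors=2, center_per_block=2, alpha=None):
    """Build a Central Composite Design in coded units, in two Blocks.

    Block 1: Full Factorial corners plus Center Points.
    Block 2: Star (Axial) Points plus additional Center Points.
    Returns a list of dicts with keys 'block', 'type', and 'x'.
    """
    if alpha is None:
        # Assumption: rotatable design, alpha = (number of corners) ** 0.25
        alpha = (2 ** n_factors) ** 0.25

    center = (0.0,) * n_factors
    runs = []

    # Block 1: every combination of -1 and +1, then the Center Points
    for corner in product((-1.0, 1.0), repeat=n_factors):
        runs.append({"block": 1, "type": "factorial", "x": corner})
    for _ in range(center_per_block):
        runs.append({"block": 1, "type": "center", "x": center})

    # Block 2: one pair of Star Points per Factor, then more Center Points
    for i in range(n_factors):
        for sign in (-1.0, 1.0):
            x = [0.0] * n_factors
            x[i] = sign * alpha
            runs.append({"block": 2, "type": "star", "x": tuple(x)})
    for _ in range(center_per_block):
        runs.append({"block": 2, "type": "center", "x": center})

    return runs


if __name__ == "__main__":
    for run in central_composite_design():
        coded = ", ".join(f"{v:+.3f}" for v in run["x"])
        print(f"Block {run['block']}  {run['type']:<9}  ({coded})")
```

For two Factors this gives twelve runs: four factorial corners, four Star Points, and four Center Points split evenly between the two Blocks.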
From the figure, it is clear that the Block has no significant effect (its p-value is above 0.05), so conditions appear not to have changed between the Full Factorial runs and the addition of the Star Points. The analysis can therefore be rerun without the Block term; the results are shown in Figure 7.20.8.
The model seems reasonable in that it explains 89.4% of the variation seen in the data (from the R-Sq value). The R-Sq(adj) value is a little lower at 82.8%, indicating some redundant terms in the model, but that can be explained by the inclusion of the Factors Time and Temperature, even though their p-values indicate they aren't significant. The Factors cannot be removed, because model hierarchy requires the main effects to stay in the model while their interaction and curvature terms are present. The equation for Yield can be determined from the "Coef" column, which lists the coefficients:
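In general form, a two-Factor quadratic model of this kind can be written as below. The symbols b0, b1, and so on are placeholders for the entries in the "Coef" column, not the fitted values, and the assignment of X1 to Time and X2 to Temperature is an assumption made here for illustration.

```latex
\[
\text{Yield} = b_0 + b_1 X_1 + b_2 X_2 + b_{11} X_1^2 + b_{22} X_2^2 + b_{12} X_1 X_2
\]
```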
Remember these are in coded units; most software packages give the equation in actual units as well. Figure 7.20.9 shows the graphical representation of the response surface at this point.
Figure 7.20.9. Contour Plot of the response surface from the Central Composite Design (output from Minitab v14).
From the Contour Plot it seems that there is another area of even higher Yield off to the top left corner. To proceed, the approach would be to determine the Path of Steepest Ascent in that direction and conduct experiments along the Path until a maximum value is reached and then center a further experiment in that region. For the example in question, an expanded response surface is shown in Figure 7.20.10, which clearly shows the progression from the original experiment up to the top of a saddle and then the potential for further experiments leading to a higher Yield in the top left corner.
Figure 7.20.10. The "big picture view" of Yield (output modified from Minitab v13).
Most DOE software packages provide an optimization algorithm, which can be applied to the model to determine the maxima or minima. Many Belts fall into the mode of optimizing beyond practicality. Remember that this is an imperfect model that is based on just a sample of data from a varying process with inherent noise in the Measurement System. To optimize to anything more than two significant figures is fairly meaningless. Use the optimizer instead to identify a region of operation. If there is a choice between operating regions, then choose the flattest one, because this creates a more robust process (variation in the Xs has little effect on the Y).
The final step in the process is to set the specifications for the Xs based on contour lines in the Y. If the Yield has to be kept above a certain value, then simply reading the associated values of the Xs from the Contour Plot can give specifications for the Xs. This is shown graphically in Figure 7.20.11; in this example, to maintain a Yield above 86%, the Xs should be controlled within the following specifications:
Figure 7.20.11. Creating specifications for the Xs (adapted from output from Minitab v14).
Roadmap
The roadmap to designing, analyzing, and interpreting an Optimizing DOE is as follows:
Other Options
Calculating the Path of Steepest Ascent for Two or More Factors
To calculate the exact Path of Steepest Ascent for two or more Factors, the model equation in coded units (-1s and +1s) from the analysis is used. Using the example from "Overview" in this section:
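For any model equation, the Path can be calculated by choosing the step size in one of the process variables, for example, ΔX1 = 1 (usually a step size of 1 or 2 at most). The step size for the other variable(s) is
$$\Delta X_j = \frac{b_j}{b_1}\,\Delta X_1$$
where:
ΔX1 is the chosen step size for X1, in coded units.
ΔXj is the resulting step size for any other variable Xj, in coded units.
b1 and bj are the coefficients of X1 and Xj in the model equation in coded units.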
So for the preceding example, for every increase of 1 coded unit of Time, the associated step size for Temperature to follow the Path of Steepest Ascent is
It should be quite straightforward to create a table using a spreadsheet to represent this, as shown in Table 7.20.2. Remember that the listed equation for Time and Temperature from the analysis is in coded units, so before the experiment can be run, the coded units have to be converted back to real life or "Natural Units," as shown in the table in the columns "Nat Time" and "Nat Temp."
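This is done with the equation:
$$X_{\text{natural}} = X_{\text{center}} + X_{\text{coded}} \times \frac{X_{\text{high}} - X_{\text{low}}}{2}$$
Here X_center is the Center Point setting of the Factor in Natural Units, and X_high and X_low are the Natural Unit settings that correspond to coded +1 and -1.
For the example here, the same conversion is applied once for Time and once for Temperature.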
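The spreadsheet calculation can equally be sketched in code: each Factor steps in proportion to its coded coefficient, and each coded setting is then converted to Natural Units using the Factor's Center Point and half-range. This is a minimal sketch, not the calculation behind Table 7.20.2. The function name `steepest_ascent_path` and its parameters are hypothetical, and the coefficients and half-ranges in the usage example are made-up placeholders, because the fitted equation and the Factor levels are not reproduced in this section.

```python
def steepest_ascent_path(coefs, centers, half_ranges,
                         base=0, base_step=1.0, n_steps=5):
    """Compute points along the Path of Steepest Ascent.

    coefs:       linear coefficients of the coded-unit model, one per Factor.
    centers:     Center Point of each Factor in Natural Units.
    half_ranges: (high - low) / 2 for each Factor in Natural Units.
    base:        index of the Factor whose step size is chosen.
    base_step:   size of that step in coded units (usually 1 or 2).
    Returns a list of (step number, coded settings, natural settings).
    """
    if coefs[base] == 0:
        raise ValueError("The base Factor needs a nonzero coefficient.")

    # Steps are proportional to the coefficients. Dividing by the absolute
    # value keeps each Factor moving in its own uphill direction, while the
    # base Factor moves by base_step per step.
    scale = base_step / abs(coefs[base])
    deltas = [scale * b for b in coefs]

    path = []
    for k in range(n_steps + 1):
        coded = tuple(k * d for d in deltas)
        natural = tuple(c + x * h
                        for c, x, h in zip(centers, coded, half_ranges))
        path.append((k, coded, natural))
    return path


if __name__ == "__main__":
    # Hypothetical placeholders: coefficients and half-ranges are NOT taken
    # from the analysis; only the (75, 130) center is inferred from the text.
    demo = steepest_ascent_path(coefs=[2.0, 1.0],
                                centers=[75.0, 130.0],
                                half_ranges=[5.0, 5.0],
                                n_steps=3)
    for k, coded, natural in demo:
        print(k, coded, natural)
```

Replacing the placeholder inputs with the coefficients from the analysis and the actual Factor levels gives the coded and Natural Unit columns of a table like Table 7.20.2.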
The next steps are the same as in "Overview" in this section; the runs are conducted sequentially on the process until a highest value is determined. At this point a Full Factorial Design is centered on the highest point from the Path of Steepest Ascent.