Six Sigma and Beyond. Statistics and Probability. Authors: Stamatis D.H. Published year: 2003. Pages: 8-11/252
Once you have prepared a data file, you are ready to start analyzing the data. The first step in data analysis is describing the data. You look at the information you have gathered and summarize it in various ways. You count the number of people giving each of the possible responses. You describe the values by calculating averages and seeing how much the responses vary. You look at several characteristics together. How many men and how many women are satisfied with your new product? What are their average ages? You also identify values that appear to be unusual, such as ages in the one hundreds or incomes in the millions, and you check the original records to make sure that these values were picked up correctly. You do not want to waste time analyzing incorrect data.
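As a minimal sketch of these descriptive steps, the Python below counts responses, averages ages, looks at gender and satisfaction together, and flags implausible ages for checking against the original records. The survey records, the field layout, the `describe` helper, and the age cutoff of 100 are all assumptions made for illustration; none of them come from the text.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical survey records: (gender, age, satisfied with product)
records = [
    ("M", 34, "yes"),
    ("F", 29, "yes"),
    ("F", 41, "no"),
    ("M", 52, "yes"),
    ("F", 137, "yes"),  # an age "in the one hundreds": probably a recording error
    ("M", 38, "no"),
]


def describe(records, max_plausible_age=100):
    """Summarize the records; the cutoff of 100 is an assumed threshold."""
    ages = [age for _, age, _ in records]
    return {
        # How many people gave each possible response
        "response_counts": Counter(answer for _, _, answer in records),
        # Two characteristics together: satisfied respondents by gender
        "satisfied_by_gender": Counter(
            gender for gender, _, answer in records if answer == "yes"
        ),
        # Average and spread of the ages
        "mean_age": mean(ages),
        "age_stdev": stdev(ages),
        # Values to check against the original records before analysis
        "suspect_records": [r for r in records if r[1] >= max_plausible_age],
    }


summary = describe(records)
for name, value in summary.items():
    print(name, value)
```

Note that the suspect record inflates the mean age until it is corrected or removed, which is why unusual values are checked before any further analysis.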
Sometimes you have information available for everyone or everything that you are interested in drawing conclusions about, and all you need to do is summarize your data. But usually that is not the case. Instead, you want to draw conclusions about much larger groups of people or objects than those included in your study. You want to know what proportion of all purchasers of your product are satisfied with it, based on the opinions of the relatively small number of purchasers included in your survey. You want to know whether buyers of your product differ from nonbuyers. Are they younger, richer, better educated? You want to be able to draw conclusions about all buyers and nonbuyers based on the people you have included in your study.
To do this (and understand it), you have to learn something about statistical inference. Later chapters in this volume will show you how to test hypotheses and draw conclusions about populations based on samples. You will learn how to test whether you have sufficient evidence to believe that the differences or relationships you find in your sample are true for the whole population.
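To give a flavor of what such an inference looks like, here is a rough sketch that estimates the proportion of all purchasers who are satisfied from a sample, with an approximate confidence interval. It uses the normal-approximation interval, which is one common choice and not necessarily the method developed later in this volume. The function name `proportion_interval` and the sample figures (120 satisfied out of 200 surveyed) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Sample proportion with a normal-approximation confidence interval.

    Assumes a simple random sample that is large enough for the
    approximation to be reasonable.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    # Clip to [0, 1] because a proportion cannot fall outside that range
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Hypothetical survey: 120 of 200 sampled purchasers are satisfied
p_hat, low, high = proportion_interval(120, 200)
print(f"estimate {p_hat:.3f}, interval ({low:.3f}, {high:.3f})")
```

The width of the interval shrinks as the sample grows, which is one way to see how much a small survey can say about all purchasers.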
You often want to determine what the relationship is between two variables. For example, what is the relationship between dollars spent on advertising and sales? How can you predict how many additional sales to expect if you increase your advertising budget by 25%? What is the relationship between the dosage of a drug and the reduction in blood pressure? How can you predict the effect on blood pressure if you cut the dose in half? You can study and model the relationship between pairs of variables in many different ways. You can compute indexes that estimate the strength of the relationship. You can build a model that allows you to predict values of one variable based on the values of another. That is what the last part of the book is about.
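The two ideas in this paragraph, an index of the strength of a relationship and a model for prediction, can be sketched with a correlation coefficient and a least-squares line. This is only an illustration under assumed data: the advertising and sales figures, the `fit_line` helper, and the current budget of 40 are invented, and a straight line is just one of the many possible models.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares slope and intercept, plus the Pearson correlation r."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # strength of the linear relationship
    return slope, intercept, r


# Hypothetical data: advertising dollars and sales, in thousands
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 62, 85, 103]

slope, intercept, r = fit_line(ad_spend, sales)

# Predicted additional sales if an assumed budget of 40 is raised by 25%
current_budget = 40
extra_sales = slope * (0.25 * current_budget)
print(f"r = {r:.3f}, predicted additional sales = {extra_sales:.1f}")
```

A prediction like this is only as good as the model behind it, and it is safest within the range of budgets that were actually observed.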
You must state your ideas clearly if you plan to evaluate them. This advice applies to any kind of work but especially to research design and statistical analysis. Before you begin working on design and analysis, you need to have a clearly defined topic to investigate.
You may have a general suspicion that smoking less makes people feel better. You may think that component A is better than component B. Or you may have an idea for a study method that will make people learn more. Before you begin a study about such intuitions, you should replace vague concepts such as "feeling better" or "smoking less" or "learning more" with definitions that describe measurements that you can make and compare. You might define "better" with a specific performance improvement or a reduction in failure. You might replace "feeling better" with an objective definition such as "the subject experiences no pain for a week." Or you might record the actual dosage of medication required to control pain. If you are interested in smoking, you need a lot of information to describe it. What does each of the subjects smoke: a pipe, cigars, or cigarettes? How much tobacco do the subjects use in a day? How long have they been smoking? Has the number of cigarettes (or cigars or pipes) that they smoke changed?
On the other hand, you must balance your scientific curiosity with the practical problems of obtaining information. If you must rely on people's memory, you cannot ask questions like "What did you have for dinner ten years ago?" You must ask questions that people will be able to answer accurately. If you are trying to show a relationship between diet and disease, for example, you cannot rely on people's memory of what they ate at individual meals. Instead, you have to be satisfied with overall patterns that people can recall. Some information is simply not available to you, however much you would like to have it. It is better to recognize this fact before you begin a study than when you get your questionnaires back and find that people were not able to answer your favorite question. If you think about your topic in advance, you can substitute a better question: one that will give you information you can use, even if it is not the information you wish you could have.