MISSING DATA


For each variable in a study, a special code indicates when information is missing. Data may be missing for legitimate reasons; nevertheless, missing values have to be identified appropriately. Computer software has made this a very easy task. Depending on the software, the coding may differ in both how missing values are identified and how they are processed. Usually, the codes are numbers such as 9, 99, or 999, or a period (.).

If you have used a specific code to identify missing data, the computer will usually treat those values as missing because you told it to by entering a MISSING VALUE command. Data that are missing for this reason are called user-missing, because you (the user) specified them as missing.
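
As an illustration only, the idea of user-missing codes can be sketched in plain Python. The variable name, the codes 99 and 999, and the helper functions below are hypothetical and are not taken from any particular statistical package; the sketch simply shows why declared codes must be excluded before a statistic is computed.

```python
# Hypothetical codes that the user has declared as missing for an "age" variable.
AGE_MISSING_CODES = {99, 999}


def apply_user_missing(values, missing_codes):
    """Return a copy of values with every declared missing code replaced by None."""
    return [None if v in missing_codes else v for v in values]


def mean_of_valid(values):
    """Average only the entries that are not None; return None if none are valid."""
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid) if valid else None


raw_ages = [34, 99, 51, 999, 28]
ages = apply_user_missing(raw_ages, AGE_MISSING_CODES)

# Treating the codes as real ages inflates the mean badly.
naive_mean = sum(raw_ages) / len(raw_ages)

# Excluding the user-missing entries gives the mean of the three real ages.
valid_mean = mean_of_valid(ages)

print(ages)
print(naive_mean, valid_mean)
```

In this sketch the original codes are replaced by None, so the reason for missingness is lost; a real package usually keeps the code and only flags it, which is what lets you change your mind later.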

However, sometimes the software must treat data as missing whether you tell it to or not. Perhaps a case in your data file simply has no values for some variables. Perhaps somebody's fingers slipped on the keyboard and entered a response as "YP" instead of "60." When things like this happen, the statistical software assigns a special value called the system-missing value. Statistics are never computed with system-missing values because they are not proper values at all.
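
A rough sketch of the same behavior in plain Python follows. It assumes that raw entries arrive as text and that a numeric variable is expected; the function name parse_numeric and the use of None as the system-missing marker are choices made for this example, not the behavior of any specific package.

```python
def parse_numeric(raw):
    """Convert a raw entry to a float; return None (system-missing) if it is not a number."""
    if raw is None:
        return None          # the case has no entry for this variable
    text = raw.strip()
    if not text:
        return None          # blank field
    try:
        return float(text)
    except ValueError:
        return None          # e.g. "YP" typed instead of "60"


raw_scores = ["60", "YP", "", "72.5", None]
scores = [parse_numeric(r) for r in raw_scores]

# Statistics use only the proper values.
valid_scores = [s for s in scores if s is not None]
mean_score = sum(valid_scores) / len(valid_scores)

print(scores)
print(mean_score)
```

Note that nobody declared "YP" as a missing code here; the entry is dropped simply because it cannot be read as a number.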

What good are missing values? Why do we have to fool with them? Most of the time, you cannot really do anything with missing values, but you do not want to throw away the whole cases they came from. Other variables in those cases may have perfectly good values. And you may change your mind about user-missing values. People who do not know for whom they will vote are sometimes useless for your analysis, but sometimes they are the most interesting people of all.

Nobody wants missing values in their data, but they always turn up. One of the first things you should do with a new data file is to get a general idea of how many missing values there are and why. You can do that with frequency tables, as will be explained in the next chapter.
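
As a small preview, a frequency table that counts missing values alongside valid ones might be sketched in Python as follows. The voting data, the code 9 for "does not know," and the function name frequency_table are invented for illustration.

```python
from collections import Counter


def frequency_table(values, missing_codes=()):
    """Count each valid value, and tally user-missing and system-missing entries separately."""
    counts = Counter()
    for v in values:
        if v is None:
            counts["system-missing"] += 1   # no usable entry at all
        elif v in missing_codes:
            counts["user-missing"] += 1     # a code the user declared as missing
        else:
            counts[v] += 1
    return counts


# Hypothetical vote variable: 1 and 2 are candidates, 9 means "does not know."
votes = [1, 2, 9, 1, None, 2, 1, 9, None, 1]
table = frequency_table(votes, missing_codes={9})

n_missing = table["system-missing"] + table["user-missing"]
share_missing = n_missing / len(votes)

for key, count in table.items():
    print(key, count)
print("share missing:", share_missing)
```

Keeping the two kinds of missing values in separate rows shows not only how many values are missing but also why, which is the question to settle before any analysis.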




Six Sigma and Beyond: Statistics and Probability, Volume III
ISBN: 1574443127
Year: 2003
Pages: 252
