In almost any research study, there will be missing or
incomplete data. Missing data can arise for a number of reasons: participants
fail to respond to questions, subjects withdraw from studies before they
are completed, or errors occur during data entry.
The problem with missing data is that nearly all statistical
techniques assume or require complete data. Some data, however, can be legitimately
missing; an example might be a survey that asks whether the respondent is married,
and if so, for how long. If you are not married, then you would be correct in
leaving the "how long" portion of the question blank.
It is also important to realize that legitimately missing
data can be meaningful. The missing data allows a validity check and may inform
the status of an individual. Osborne (2013) provides a great example. In cleaning
the data from an adolescent health risk survey, he noticed that some
individuals indicated on one question that they had never used illegal drugs,
but later in the survey, when asked how many times they had used marijuana,
gave an answer greater than 0. In other words, a question they should have
skipped (leaving it missing) showed an unexpected number. The author suggests several
possible explanations, such as that the subject was not paying attention and
answered in error. A more intriguing possibility, however, is that some subjects
did not view marijuana as an illegal drug, which
could be examined in future research.
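To make the validity check concrete, here is a minimal sketch in Python of how such a skip-pattern inconsistency might be flagged. The records, the field names (ever_used_illegal_drugs, marijuana_times), and the helper is_inconsistent are all hypothetical and invented for illustration; they are not taken from the survey described above.

```python
from typing import Optional

# Hypothetical survey records; None marks a skipped (missing) answer.
responses = [
    {"id": 1, "ever_used_illegal_drugs": False, "marijuana_times": None},
    {"id": 2, "ever_used_illegal_drugs": False, "marijuana_times": 3},
    {"id": 3, "ever_used_illegal_drugs": True, "marijuana_times": 5},
    {"id": 4, "ever_used_illegal_drugs": True, "marijuana_times": None},
    {"id": 5, "ever_used_illegal_drugs": False, "marijuana_times": 0},
]


def is_inconsistent(record: dict) -> bool:
    """Return True when a respondent who reported never using illegal drugs
    also reported using marijuana more than 0 times."""
    times: Optional[int] = record["marijuana_times"]
    never_used = not record["ever_used_illegal_drugs"]
    return never_used and times is not None and times > 0


# Collect the ids of respondents whose answers contradict each other.
flagged_ids = [r["id"] for r in responses if is_inconsistent(r)]
print(flagged_ids)
```

Flagged cases are not automatically errors. As the example above suggests, they are worth a closer look before deciding how to treat them.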
One way of dealing with legitimately missing data is to treat
the missing and present data as two separate groups. Using the marriage survey
example, we could exclude unmarried individuals from any analysis of
length of marriage, and analyze marital status itself separately. So instead of
asking the silly research question, "How long, on average, do all people,
even unmarried people, stay married?", we can ask two more refined questions:
"What are the predictors of whether someone is currently married?"
and "Of those who are currently married, how long on average have they been
married?"
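As a rough illustration of splitting one question into two, the sketch below uses made-up records in Python. The field names (married, years_married) and the values are assumptions for the example only. It does not fit a predictive model; it simply shows that marital status is defined for everyone, while length of marriage is summarized only for those who are married.

```python
from statistics import mean

# Hypothetical respondents; years_married is legitimately missing (None)
# for anyone who is not married.
people = [
    {"id": 1, "married": True, "years_married": 10},
    {"id": 2, "married": False, "years_married": None},
    {"id": 3, "married": True, "years_married": 4},
    {"id": 4, "married": False, "years_married": None},
    {"id": 5, "married": True, "years_married": 7},
]

# Question 1 uses everyone: marital status is the outcome of interest.
proportion_married = sum(p["married"] for p in people) / len(people)

# Question 2 uses only the married group, so the legitimately missing
# values never enter the calculation.
married_years = [p["years_married"] for p in people if p["married"]]
mean_years_married = mean(married_years)

print(proportion_married, mean_years_married)
```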
Next time we will consider categories of missing data. Do
you have an issue or a question that you would like me to discuss in a future
post? Would you like to be a guest writer? Send me your ideas!
leann.stadtlander@waldenu.edu
Osborne, J. W. (2013). Best practices in data cleaning. DC: Sage.