First of all, if your research progress is slowed by fear of statistics, you are certainly not alone. Being afraid to “mess up” your stats, and thus your project, is a common lament. But I’m here to tell you that your project is not that fragile! Once your data is collected, entered, cleaned, and ready for analysis, it is time for excitement, not concern! The golden rule here is: BACK UP. I’ll say it again: BACK UP. In case I haven’t been clear so far: BACK UP! By this I mean back up your data. Make double, triple, and quadruple copies of your dataset and KEEP THEM IN DIFFERENT PLACES (e.g., on a server, an external hard drive, a flash drive, etc.). It doesn’t do much good to keep backup copies in the same place as your original, because if something goes wrong where your data is located, the backup copies are likely toast too!
Backing up your data aside, there is little else that can go catastrophically wrong with your analysis. Let this knowledge give you the courage to jump in and work with your data. “Figuring it out” is a valuable part of the analytic process and leaves you well prepared to answer your research questions. I often joke that when things go “too smoothly” during analysis, I likely got lucky somewhere along the way, and I may not be so lucky next time. How will I know how to deal with the roadblocks of data analysis if they’ve never come up before?
Armed with the knowledge that each roadblock gets you closer to the answers to your research questions, fire up SPSS (or your preferred statistical package), bust out your stats class notes, and stop being afraid to screw it up, because everything can be undone. After all, you’ve backed up!
Warning: this blog will be short, sweet, and a bit pithy. The two most common questions I receive about statistical analyses, no matter their kind or purpose, are: “Am I doing it right?” and “Am I allowed to…(fill in a variation of a common analysis here)?” My response to these questions is usually: “Sure, you can do whatever you want, but what will it mean if you do?” I’ve said for a few years now that I don’t see statistics as being about finding truths; instead, I see it as being about building arguments. The critical thing is that you understand the impact of your statistical decisions.
Just like your high school math teachers told you, you must learn the rules before you can know when it is useful to break them. The same is true when building arguments with statistics. There aren’t many things you can do when running a statistical analysis that are inherently “wrong.” However, it is critical that you remain aware of the consequences of your statistical decisions. Next time you find yourself wondering whether you are “allowed to” or “supposed to” do something in an analysis that is not what is typically done, stop asking those questions and instead consider what would be different if you did it, and what it would mean for your interpretation.
Transformations: statistical voodoo or truth serum for your data?
Anyone who has taken a statistics class has probably learned about transforming data at one time or another (although you may be in denial about it). In short, you may want to transform your data if you need to perform a parametric analysis but its underlying assumptions are violated in your dataset. While this seems simple enough, many researchers are hesitant to employ this tactic for handling non-normally distributed data. Often, they can’t quite put their finger on what bothers them about it, but the idea of “artificially changing their data” leaves many feeling uneasy.
As a researcher myself, I can relate: many of the same people who preach the importance of considering analysis assumptions also teach us to feverishly protect the integrity of our data. However, for many of us, I believe this may be where our methodological good intentions lead us astray. While protecting the integrity of our data is indeed of paramount importance, transforming variables to achieve a normal distribution is unfairly characterized as a threat to this integrity. In fact, one could easily argue that making inferences from results that are biased by non-normally distributed variables is a greater threat to the integrity of your data and analysis than any transformation. It is perfectly reasonable for a well-intentioned researcher to worry about the consequences of transformation, and to be wary of a transformation that produces a significant result where one was not present before. However, that researcher can put those worries to rest with the realization that an appropriately applied transformation generally raises the likelihood that their test of significance is UNBIASED, while it DOES NOT typically raise the likelihood of finding significance in the absence of a true relationship (a Type I error). When no true relationship exists between two variables, a test is no more likely to find significance when its variables are transformed than when they are not.
From a pragmatic perspective, transformation offers the benefit of letting a researcher use the techniques they are familiar with and most likely to apply in their current situation, while minimizing the potential bias introduced by non-normally distributed data. In many cases, depending on the type of analysis being used and the design of your study, alternative and possibly more sophisticated techniques may exist for dealing with non-normal data. However, many of these techniques require substantial statistical experience and may be intimidating to those simply seeking to deal with their assumption problems. While transformation is surely not the answer to all problems with assumptions, or even all non-normal data, it is far from “voodoo” and is an attractive alternative to turning a blind eye to the distribution of the data.
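To make the idea concrete, here is a minimal sketch in Python (using NumPy and SciPy rather than SPSS) of what a transformation does. The skewed variable is simulated for illustration, not real study data; a log transformation pulls the right-skewed values toward a symmetric, roughly normal shape without fabricating any relationship that isn’t there:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulate a right-skewed variable (e.g., reaction times or incomes);
# lognormal data is a classic case where a log transform helps.
raw = rng.lognormal(mean=0.0, sigma=0.8, size=500)

# Log transformation (add a small constant first if zeros are possible).
transformed = np.log(raw)

# Compare skewness before and after; values near 0 suggest symmetry.
print(f"skew before: {stats.skew(raw):.2f}")
print(f"skew after:  {stats.skew(transformed):.2f}")

# Shapiro-Wilk normality test on the transformed values:
# a non-significant p suggests normality is now a reasonable assumption.
w, p = stats.shapiro(transformed)
print(f"Shapiro-Wilk on transformed data: W={w:.3f}, p={p:.3f}")
```

In SPSS, the equivalent step is computing a new variable (e.g., via COMPUTE with LN) and re-checking the distribution before running your parametric test. The point is the same either way: the transformation changes the scale on which the variable is analyzed, not the ordering of the observations.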