Wednesday, June 18, 2008

Statistical Functions

Zimmerman, D. W., & Zumbo, B. D. (1992). Parametric alternatives to the Student t test under violation of normality and homogeneity of variance. Perceptual and Motor Skills, 74, 835-844.
*Rank order tests are effective only under violations of normality; modified t-tests are far better under conditions of violated homogeneity of variance
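
As a rough illustration of what a "modified t-test" can look like, here is a minimal sketch of Welch's unequal-variance t test, one commonly used modification. Whether this is the exact variant the authors evaluated is an assumption on my part, and the function name welch_t is made up for this note.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Return (t, df) for Welch's unequal-variance t test (illustrative sketch)."""
    nx, ny = len(x), len(y)
    # Squared standard error of each group mean (sample variance / n)
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


if __name__ == "__main__":
    t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
    print(round(t, 3), round(df, 3))
```

Unlike the pooled-variance t test, the variances are never averaged together, which is why the degrees of freedom shrink when one group is much noisier than the other.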

Clinch, J. J. & Keselman, H. J. (1982). Parametric alternatives to the analysis of variance. Journal of Educational Statistics, 7, 215-231.
*Different tests were compared under various assumption violations via Monte Carlo methods; the ANOVA F test did worst, while the Brown/Forsyth method did best overall

Tomarken, A. J. & Serlin, R. C. (1986). Comparison of ANOVA alternatives under variance heterogeneity and specific noncentrality structures. Psychological Bulletin, 99, 90-99.
*ANOVA F test alternatives were compared on Type I error rates and power under variance heterogeneity; the Welch test did best in all cases except when extreme means were paired with high variances, where the Brown/Forsyth method did better

Wilcox, R. R. (1998). How many discoveries have been lost by ignoring modern statistical methods? American Psychologist, 53, 300-314.
*Many nonsignificant research results could have been significant if modern statistical methods had been used; new methods also create more accurate confidence intervals

Alternatives to Null Hypothesis Significance Testing

Abelson, R. P. (1985). A Variance Explanation Paradox: When a Little Is a Lot. Psychological Bulletin. 97 (1), 129-34.
*Using percent variance to explain the influence of situational factors is misleading

Prentice, D. A., & Miller, D. T. (1992). When small effects are impressive. Psychological Bulletin, 112, 160-164.
*Interpretation of effect size requires careful consideration of the topic being researched

Bem, D. J., & Honorton, C. (1994). Does Psi Exist? Replicable Evidence for an Anomalous Process of Information Transfer. Psychological Bulletin. 115 (1), 4.
*An example of meta-analysis in action, arguing that psi exists (and is replicable)

Kirsch, I. & Sapirstein, G. (1998). Listening to Prozac but hearing placebo: A meta-analysis of antidepressant medication. Prevention & Treatment, 1
*An example of meta-analysis in action, arguing that SSRIs work largely through a placebo effect

Krueger, J. (2001). Null Hypothesis Significance Testing: On the Survival of a Flawed Method. American Psychologist. 56, 16-26.
*Critics and defenders of NHST use Bayesian ideas; the real issue at stake is replicability

Seaman, M. A., & Serlin, R. C. (1998). Equivalence Confidence Intervals for Two-Group Comparisons of Means. Psychological Methods. 3 (4), 403-411.
*Equivalence confidence intervals can and should replace NHST for determining if two group means are practically equivalent

Duckworth, W.M. and Stephenson, W.R. Resampling methods: Not just for statisticians anymore. Invited paper presented at JSM 2003, San Francisco, CA.
*Explores how to teach resampling methods (jackknife and bootstrap) to psychologists
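
To make the two resampling ideas concrete, here is a minimal sketch that estimates the standard error of a statistic by bootstrap and by jackknife. It is not taken from the paper; the function names, the default of 2000 bootstrap replicates, and the fixed seed are all assumptions chosen for illustration.

```python
import math
import random
from statistics import mean, stdev


def bootstrap_se(data, stat=mean, n_boot=2000, seed=0):
    """Bootstrap SE: resample with replacement, recompute the statistic each time."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    reps = [stat(rng.choices(data, k=n)) for _ in range(n_boot)]
    return stdev(reps)


def jackknife_se(data, stat=mean):
    """Jackknife SE: recompute the statistic leaving out one observation at a time."""
    n = len(data)
    loo = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    centre = mean(loo)
    return math.sqrt((n - 1) / n * sum((v - centre) ** 2 for v in loo))


if __name__ == "__main__":
    sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
    print(bootstrap_se(sample), jackknife_se(sample))
```

For the sample mean, the jackknife estimate reproduces the textbook s/sqrt(n) exactly, which makes it a handy sanity check before applying either method to a statistic with no simple formula.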

Controversy in Null Hypothesis Significance Testing

Macdonald, R. R. (1997). On statistical testing in psychology. British Journal of Psychology. 88 (2), 333-348.
*Criticisms of NHST apply to the Neyman-Pearson approach, but not to the Fisherian approach

Huberty, C. J. (1993). Historical Origins of Statistical Testing Practices. Journal of Experimental Education. 61 (4), 317-33.
*Textbooks in psychology confuse the use and interpretation of p-values and alpha-levels

Gigerenzer, G. (1993). The superego, the ego, and the id in statistical reasoning. In G. Keren & C. Lewis (Eds.), A handbook for data analysis in the behavioral sciences: Methodological issues (pp. 311-339). Hillsdale, NJ: Lawrence Erlbaum Associates.
*Textbooks present an incoherent “hybrid logic,” mixing Neyman-Pearson and Fisher

Kaiser, H. (1960). Directional statistical decisions. Psychological Review. 67, 160-167.
*When using two-sided tests, it makes no sense to stop at a non-directional conclusion; a directional decision should be made

Greenwald, A. G. (1975). Consequences of prejudice against the null hypothesis. Psychological Bulletin, 82, 1-20.
*The false ideas that null results are useless or more likely to be due to incompetence prevent (or at least slow) scientific progress

Rosenthal, R. (1979). The “File Drawer Problem” and tolerance for null results. Psychological Bulletin, 86, 638-641.
*Studies that yield null results are rarely published, making a field of research look more “significant” than it might actually be
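
One way to put a number on the file drawer problem is the "fail-safe N": how many unpublished studies averaging a null result would have to exist to drag a combined result down to the significance threshold. The sketch below assumes a Stouffer-style combination of Z scores and a one-tailed critical value of 1.645; the function name and defaults are mine, for illustration only.

```python
def fail_safe_n(z_scores, z_crit=1.645):
    """Number of additional Z = 0 studies needed to bring the combined
    (Stouffer) Z of the observed studies down to z_crit (illustrative sketch)."""
    k = len(z_scores)
    # Solve sum(z) / sqrt(k + N) = z_crit for N
    return sum(z_scores) ** 2 / z_crit ** 2 - k


if __name__ == "__main__":
    # Four hypothetical published studies, each with Z = 2.0
    print(fail_safe_n([2.0, 2.0, 2.0, 2.0]))
```

A small value means a handful of studies sitting in file drawers could overturn the published picture; a large value suggests the combined result is harder to explain away by selective publication.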

Rozeboom, W. W. (1960). The fallacy of the null-hypothesis significance test. Psychological Bulletin. 57, 416-28.
*Using statistical tests to make “decisions” is naïve and rejection criteria are arbitrary; the author calls for confidence intervals and (if possible) Bayesian statistics

Bakan D. (1966). The test of significance in psychological research. Psychological Bulletin. 66 (6), 423-37.
*Statistical results are often misinterpreted; the author calls for Bayesian methods

Hunter, J. E. (1997). Needed: A Ban on the Significance Test. Psychological Science. 8 (1), 3-7.
*NHST breaks down when H0 is false and most studies purposely use H0's they know to be false, causing the error rate (Type I and II) of NHST to be around 60%

Cohen, J. (1994). The Earth Is Round (p <.05). American Psychologist. 49 (12), 997.
*The logic of NHST is flawed and backwards; we need to better understand our data

Cohen, J. (1990). Things I Have Learned (So Far). American Psychologist. 45 (12), 1304-12.
*Informed judgment from the researcher is indispensable; power analysis can help

Loftus, G. R. (1996). Psychology Will Be a Much Better Science When We Change the Way We Analyze Data. Current Directions in Psychological Science. 5 (6), 161-170.
*Null hypotheses are rarely possible, making “significance” useless; power is under-attended and the dichotomy of effects/non-effects is artificial

Harris, R. J. (1997). Reforming significance testing via three-valued logic. In Harlow, L.L., Mulaik, S.A., & Steiger, J.H. (Eds.) What if there were no significance tests? Hillsdale, NJ: Erlbaum.
*Three-valued logic can establish directionality and address Type III error

Wilkinson, L. (1999). Statistical Methods in Psychology Journals: Guidelines and Explanations. American Psychologist. 54 (8), 594-604.
*APA decided not to ban NHST, instead urging researchers to distinguish between statistical and theoretical significance, and to use modern statistical graphics

Abelson, R. P. (1997). On the Surprising Longevity of Flogged Horses: Why There Is a Case for the Significance Test. Psychological Science. 8 (1), 12-15.
*NHST can be used effectively in combination with other methods; enforcing a complete ban on it would be throwing away a tool that can be useful in a number of situations

Greenwald, A. G., Gonzalez, R., Harris, R. J., & Guthrie, D. (1996). Effect Sizes and p Values: What Should Be Reported and What Should Be Replicated? Psychophysiology. 33 (2), 175-183.
*The interpretation of p-values in terms of replicability is widely mistaken

Harris, R. J. (1997). Significance Tests Have Their Place. Psychological Science. 8 (1), 8-11.
*Three-valued logic can help NHST; using confidence intervals as an alternative runs into the same problems as NHST, while providing less information than a p-value would

Jones, L. V., & Tukey, J. W. (2000). A Sensible Formulation of the Significance Test. Psychological Methods. 5, 411-414.
*Yet another iteration of the virtues of three-valued logic applied to NHST

Thursday, June 5, 2008

Balancing work and family

Pleck, J.H. (1999). Balancing work and family. Scientific American Presents: Men's Health.

With women's move into the workplace over the last few decades, men's roles have changed as well. Men spend much less of their lives working than they did in the past. Fathers are not only becoming more available, but more engaged in the family as well. However, these effects are countered by an increase in divorce rates and an increase in the number of unmarried fathers. The family is far more psychologically central to men than work, as is the case for women. Fathers tend to carry their emotions home with them from work, whereas mothers keep their family experiences insulated from the workplace. There are also still large gender differences in the nature of behavioral interaction with children. However, despite changing roles, companies have not caught up in their sympathy for their male employees' work-family problems, so social change is still lagging.