Chapter published in: Contemporary Trends in Hispanic and Lusophone Linguistics: Selected papers from the Hispanic Linguistic Symposium 2015
Edited by Jonathan E. MacDonald
[Issues in Hispanic and Lusophone Linguistics 15] 2018
pp. 143–168
The importance of motivated comparisons in variationist studies
As the state of the field advances empirically, sociolinguists are increasingly expected to utilize statistics in their data analysis. Some researchers have limited formal statistical training, and even for the more experienced researcher, the focus of model construction is often on the independent variables, e.g. interactions or multicollinearity issues. However, dependent variables with three or more variants require careful consideration. Building on Paolillo (2002), I show that identical binomial logistic regression models yield disparate results given differential treatment of a complex dependent variable. I conclude by offering concrete, hands-on advice for linguists working with their data in R with the goal of promoting judicious analyses among Hispanic sociolinguists.
Keywords: Model construction, treatment of data, motivated comparisons, dependent variable, Nicaraguan Spanish, variationist sociolinguistics
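The abstract's central claim, that the same binomial model can give different results depending on how a dependent variable with three variants is collapsed into two, can be illustrated with a small sketch. The Python below is not the chapter's analysis: the counts, the variant labels `A`, `B`, `C`, and the predictor levels `context_x` and `context_y` are all invented for illustration. It relies on the fact that, with a single two-level predictor, the maximum-likelihood slope of a binomial logistic regression equals the log odds ratio of the corresponding 2x2 table, so no model-fitting library is needed.

```python
import math

# Hypothetical token counts (invented for illustration, NOT from the chapter):
# three variants of one dependent variable across two levels of a binary predictor.
counts = {
    "context_x": {"A": 60, "B": 30, "C": 10},
    "context_y": {"A": 30, "B": 30, "C": 40},
}


def log_odds_ratio(application, non_application):
    """Slope of a binomial logistic regression with one binary predictor.

    `application` and `non_application` list the variants assigned to each
    side of the binary comparison; variants in neither list are excluded.
    The slope is for context_x relative to context_y (the reference level).
    """
    def odds(row):
        hits = sum(row[v] for v in application)
        misses = sum(row[v] for v in non_application)
        return hits / misses

    return math.log(odds(counts["context_x"]) / odds(counts["context_y"]))


# Three ways of turning the same three-way variable into a binary one.
comparisons = {
    "A vs. B+C": log_odds_ratio(["A"], ["B", "C"]),
    "A vs. B (C excluded)": log_odds_ratio(["A"], ["B"]),
    "A+B vs. C": log_odds_ratio(["A", "B"], ["C"]),
}

for label, slope in comparisons.items():
    print(f"{label}: slope = {slope:.3f}")
```

Each comparison uses the same tokens and the same predictor, yet the estimated effect differs in size across the three recodings. This is why the choice of comparison needs a linguistic motivation rather than being made by default.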
Published online: 14 February 2018
Baayen, R. H.
Bates, D., Maechler, M., Bolker, B., & Walker, S.
(2014) lme4: Linear mixed-effects models using Eigen and S4. R package version 1.1-7. Retrieved from http://cran.r-project.org/web/packages/lme4/index.html
Breiman, L., Cutler, A., Liaw, A., & Wiener, M.
(2015) Breiman and Cutler’s random forests for classification and regression. R package version 4.6-12. Retrieved from https://cran.r-project.org/web/packages/randomForest/randomForest.pdf
Brown, E. L., & Torres Cacoullos, R.
Bullock, B., Amengual, M., & Toribio, A. J.
Carvalho, A. M.
Chang, C. B.
Chappell, W., & Martínez Ibarra, F.
Chappell, W., & García, C.
Croissant, Y.
(2013) Estimation of multinomial logit models in R. R package ‘mlogit,’ version 0.2-4. Retrieved from https://cran.r-project.org/web/packages/mlogit/mlogit.pdf
Díaz Campos, M.
(2016) New trends in the analysis of socio-phonological variation: Comparing traditional analyses grounded on discrete categories versus continuous categories based on acoustical measurements. Keynote at The 8th International Workshop on Spanish Sociolinguistics. San Juan, Puerto Rico. Retrieved from http://wss8upr2016.com/manuel-diaz-campos/
Ferguson, C. A.
File-Muriel, R. J., & Brown, E. K.
Foulkes, P., & Docherty, G.
Fox, M. A. M.
Gries, S. T.
Hothorn, T., Buehlmann, P., Dudoit, S., Molinaro, A., & Van Der Laan, M.
Hothorn, T., Hornik, K., & Zeileis, A.
Hualde, J. I., & Prieto, P.
Johnson, D. E.
Johnson, M., & Barnes, S.
Morgan, T. A.
Navarro Tomás, T.
Paolillo, J. C.
R Core Team
(2016) R language definition, version 3.3.1. Retrieved from https://cran.r-project.org/doc/manuals/r-release/R-lang.pdf
Ripley, B., & Venables, W.
(2015) Feed-forward neural networks and multinomial log-linear models. R package ‘nnet,’ version 7.3-11. Retrieved from https://cran.r-project.org/web/packages/nnet/nnet.pdf
Sankoff, D., Tagliamonte, S., & Smith, E.
(2015) Goldvarb Yosemite: A multivariate analysis application for Macintosh. Retrieved from http://individual.utoronto.ca/tagliamonte/goldvarb.html
Schmidt, L. B., & Willis, E. W.
Stasinopoulos, M., Rigby, B., Akantziliotou, C., & Voudouris, V.
(2015) Generalized additive models for location, scale and shape. R package version 4.2-7. Retrieved from https://www.jstatsoft.org/article/view/v023i07/v23i07.pdf
Strycharczuk, P., Van’t Veer, M., Bruil, M., & Linke, K.
Tagliamonte, S. A.