Models for Identifying Substantive Clusters and Fitted Subclusters in Social Science Data

Jeff Gill

Editor in Chief, Political Analysis; Distinguished Professor, Department of Government; Professor, Department of Mathematics & Statistics; Member, Center for Behavioral Neuroscience, American University; Visiting Professor, Harvard University

April 18, 2018, 11:30 AM

Unseen grouping, often called latent clustering, is a common feature of social science data. Subjects may intentionally or unintentionally group themselves in ways that complicate the statistical analysis of substantively important relationships. This work introduces a new model-based clustering design that incorporates two sources of heterogeneity. The first source is a random effect that introduces substantively unimportant grouping but must still be accounted for. The second source is more important and more difficult to handle because it is directly related to the relationships of interest in the data. We develop a model to handle both of these challenges and apply it to data on terrorist groups, which are notoriously hard to model with conventional tools.