Knowledge Production in Higher Education Institutions: An Analysis of How Peer Review Policies Caused Review Scores to Change
Peer review is a cornerstone of scientific evaluation with over 200 years of history. High-stakes publication venues have important effects on the careers of individual researchers, the funding of laboratories, the reputation of institutions, and the collective knowledge of society. With 146 R1 universities and $708 billion spent each year on research and development in the United States, an examination of how science is evaluated is necessary. Double-blind review is considered the gold standard of scientific evaluation because it creates an anonymous setting in which reviewers can focus solely on the merits of the scientific finding. While several studies have examined the effects of double-blind review, none has examined how the text of the manuscript may affect review scores. We present a novel observational causal method that treats the text of the manuscript as a potential confounder of the review score. Our method combines state-of-the-art word embeddings with k-nearest-neighbors matching, and we apply it to a top computer science conference, the International Conference on Learning Representations (ICLR). The conference changed its policy from single-blind to double-blind review between 2017 and 2018, creating a natural experiment for analysis. Our results show that the policy change caused review scores to decrease in the double-blind setting relative to the single-blind setting. In our evaluation, human judges preferred our text-matching method over existing baselines 71% of the time. This work contributes both a method and an empirical finding to the study of higher education.
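To make the matching idea concrete, the following is a minimal sketch of nearest-neighbor matching on text embeddings, not the implementation used in this work. It assumes each paper is already represented by an embedding vector and a review score, that similarity is measured by cosine similarity, and that the effect is estimated as the mean difference between each double-blind paper's score and the mean score of its k most similar single-blind papers. The function names (`cosine`, `knn_matched_effect`) and the toy data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def knn_matched_effect(treated, control, k=1):
    """Mean of (treated score - mean score of the k most similar controls).

    `treated` and `control` are lists of (embedding, score) pairs.
    """
    diffs = []
    for emb, score in treated:
        # Rank control papers by text similarity to this treated paper.
        ranked = sorted(control, key=lambda pair: cosine(emb, pair[0]), reverse=True)
        matched = [s for _, s in ranked[:k]]
        # The matched controls stand in for the unobserved counterfactual score.
        diffs.append(score - sum(matched) / len(matched))
    return sum(diffs) / len(diffs)


# Toy data (invented for illustration): (embedding, review score) pairs.
double_blind = [([1.0, 0.0], 5.0), ([0.0, 1.0], 6.0)]
single_blind = [([0.9, 0.1], 6.0), ([0.1, 0.9], 7.0), ([0.7, 0.7], 3.0)]

effect = knn_matched_effect(double_blind, single_blind, k=1)
print(effect)
```

In this toy example each double-blind paper is matched to the single-blind paper whose text is most similar, so the estimate compares scores only between papers with comparable content. A real analysis would also need to choose the embedding model and k, and check the quality of the matches.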