This paper studies the writing features and their indications towards conference / workshop appearances in AI venues.
As the numbers of submissions to conferences grow quickly, the task of assessing the quality of academic papers automatically, convincingly, and with high accuracy attracts increasing attention. We argue that studying interpretable dimensions of these submissions could lead to scalable solutions. We extract a collection of writing features, and construct a suite of prediction tasks to assess the usefulness of these features in predicting citation counts and the publication of AI-related papers. Depending on the venues, the writing features can predict the conference vs. workshop appearance with F1 scores up to 60-90, sometimes even outperforming the content-based tf-idf features and RoBERTa. We show that the features describe writing style more than content. To further understand the results, we estimate the causal impact of the most indicative features. Our analysis on writing features provides a perspective to assessing and refining the writing of academic articles at scale.
Following is Zining’s slide deck presented in SPOC lab meeting in July 2021.