staeiou / cscw19-paper-lengths

See interactive visualizations: http://stuartgeiger.com/papers/cscw19-paper-lengths/

The Rise and Fall of the Note: Changing Paper Lengths in ACM CSCW, 2000-2018

By R. Stuart Geiger, staff ethnographer, Berkeley Institute for Data Science, UC-Berkeley

This repo contains the code and data needed to reproduce the figures in a paper (arxiv link, publisher link) in Proceedings of the ACM on Human-Computer Interaction, the new journal venue for the proceedings of the ACM conference on Computer-Supported Cooperative Work (CSCW). The full study involved text analysis of copyrighted papers, which I am not free to redistribute here. However, the notebook I used for processing the PDFs is available for reference at code/data-cleaning-processing.ipynb. A data file containing all the quantitative statistics for each paper is at data/cscw-pages-notext.csv. This file is loaded by code/analysis-viz.ipynb, which processes it to produce the statistics and graphs presented in the paper. This notebook can also be run interactively for free in the cloud with Binder, so you can change various parameters or visualize the data differently.

This repo now also includes 2019 PACMHCI CSCW data. The notebooks and data files used for the original paper remain in this repo; the new data files and notebooks carry a -2019 suffix.
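
As a rough illustration of working with the per-paper statistics file outside the notebooks, the sketch below reads rows with Python's standard csv module and reports, per year, how many papers there are and what share are four pages or fewer. It is not the code in code/analysis-viz.ipynb. It assumes the CSV has a header row with the column names listed in the Data Dictionary below, and it uses a three-row inline sample (copied from that dictionary's example columns) in place of data/cscw-pages-notext.csv. The function name summarize_by_year and the note_max_pages parameter are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Three rows copied from the Data Dictionary examples. The real file,
# data/cscw-pages-notext.csv, has many more rows and columns; its having a
# header row with these names is an assumption.
SAMPLE_CSV = """filename,year,num_pages,words
2012/p253-muller,2012,4,3096
2017.5/a033-chounta,2017.5,20,10613
2004/p21-hupfer,2004,4,3368
"""


def summarize_by_year(handle, note_max_pages=4):
    """Return {year: (paper_count, share_of_papers_with_at_most_note_max_pages)}."""
    pages_by_year = defaultdict(list)
    for row in csv.DictReader(handle):
        # year is stored as a float because 2017 Online First is coded 2017.5
        pages_by_year[float(row["year"])].append(int(row["num_pages"]))

    summary = {}
    for year, pages in sorted(pages_by_year.items()):
        short = sum(1 for p in pages if p <= note_max_pages)
        summary[year] = (len(pages), short / len(pages))
    return summary


summary = summarize_by_year(io.StringIO(SAMPLE_CSV))
for year, (count, share) in summary.items():
    print(f"{year}: {count} paper(s), {share:.0%} at or under 4 pages")
```

To run this against the real data, pass an open file handle for data/cscw-pages-notext.csv instead of the io.StringIO sample. Page count is only a crude proxy for the note format, since page limits and layouts changed over the period studied.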

Abstract

In this note, I quantitatively examine various trends in the lengths of published papers in ACM CSCW from 2000-2018, focusing on several major transitions in editorial and reviewing policy. The focus is on the rise and fall of the 4-page note, which was introduced in 2004 as a separate submission type to the 10-page double-column "full paper" format. From 2004-2012, 4-page notes of 2,500 to 4,500 words consistently represented about 20-35% of all publications. In 2013, minimum and maximum page lengths were officially removed, with no formal distinction made between full papers and notes. The note soon completely disappeared as a distinct genre, which co-occurred with a trend in steadily rising paper lengths. I discuss such findings both as they directly relate to local concerns in CSCW and in the context of longstanding theoretical discussions around genre theory and how socio-technical structures and affordances impact participation in distributed, computer-mediated organizations and user-generated content platforms. There are many possible explanations for the decline of the note and the emergence of longer and longer papers, which I identify for future work. I conclude by addressing the implications of such findings for the CSCW community, particularly given how genre norms impact what kinds of scholarship and scholars thrive in CSCW, as well as whether new top-down rules or bottom-up guidelines ought to be developed around paper lengths and different kinds of contributions.

Data Dictionary

| Column name | Description | Example 1 | Example 2 | Example 3 |
| --- | --- | --- | --- | --- |
| filename | Filename (minus .pdf) in the original dataset | 2012/p253-muller | 2017.5/a033-chounta | 2004/p21-hupfer |
| words | Total number of words, including references and appendices | 3096 | 10613 | 3368 |
| year_float | Year of publication as a float; 2017 Online First is 2017.5 | 2012 | 2017.5 | 2004 |
| characters | Total number of characters, including references and appendices | 21327 | 74482 | 22637 |
| num_pages | Number of pages in the PDF | 4 | 20 | 4 |
| orientation | PDF paper orientation: 0 is portrait, 90 is landscape | 0 | 0 | 0 |
| year | Year of publication as a float; 2017 Online First is 2017.5 | 2012 | 2017.5 | 2004 |
| words_per_page_total | Number of words per page across the entire document | 774 | 530.65 | 842 |
| chars_per_word_total | Number of characters per word across the entire document | 6.88857 | 7.018 | 6.7212 |
| appx_start | Character position of the beginning of the appendix (False if no appendix) | False | 69201 | False |
| ref_start | Character position of the beginning of the references | 19387 | 60881 | 20976 |
| appx_len_chars | Length of the appendix in characters | 0 | 5281 | 0 |
| ref_len_chars | Length of the reference section in characters | 1940 | 8320 | 1661 |
| appx_len_words | Length of the appendix in words | 0 | 464 | 0 |
| ref_len_words | Length of the reference section in words | 274 | 1097 | 235 |
| words_per_page | Number of words per page across the entire document | 774 | 530.65 | 842 |
| body_len_chars | Length of the main paper in characters (no references or appendices, but includes the front matter) | 19387 | 60881 | 20976 |
| body_len_words | Length of the main paper in words (no references or appendices, but includes the front matter) | 2822 | 9052 | 3133 |
| appx_prop_words | Length of the appendix as a proportion of the total paper length (in words) | 0 | 0.04372 | 0 |
| ref_prop_words | Length of the reference section as a proportion of the total paper length (in words) | 0.0885013 | 0.103364 | 0.0697743 |
| appx_prop_chars | Length of the appendix as a proportion of the total paper length (in characters) | 0 | 0.070903 | 0 |
| ref_prop_chars | Length of the reference section as a proportion of the total paper length (in characters) | 0.0909645 | 0.111705 | 0.0733754 |
| body_words_per_char | Number of characters per word in the main body (despite the column name) | 6.86995 | 6.7257 | 6.69518 |
| ref_words_per_char | Number of characters per word in the reference section (despite the column name) | 7.08029 | 7.58432 | 7.06809 |
| appx_words_per_char | Number of characters per word in the appendix (despite the column name) | NaN | 11.3815 | NaN |
| title_from_text | Title of the paper (imputed from the paper text, may not be perfect) | Lurking As Personal Trait Or Situational Disp... | When To Say “Enough Is Enough!”: A Study On T... | Introducing Collaboration Into An Application... |
| lead_author | Lead author of the paper, according to ACM DL filename | muller | chounta | hupfer |
| title_has_quote | 1 if the title contains a quotation mark, 0 if it does not | 0 | 1 | 0 |
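
The derived columns above appear to be simple ratios and differences of the raw counts. The sketch below recomputes several of them for the 2017.5/a033-chounta example, which is a quick way to check how the columns relate. The formulas are inferred from the example values, not taken from code/data-cleaning-processing.ipynb, and the derive function is a hypothetical helper.

```python
# Raw counts for 2017.5/a033-chounta, copied from the Example 2 values above.
raw = {
    "words": 10613,
    "characters": 74482,
    "num_pages": 20,
    "ref_len_words": 1097,
    "ref_len_chars": 8320,
    "appx_len_words": 464,
    "appx_len_chars": 5281,
}


def derive(r):
    """Recompute derived columns from raw counts (formulas are inferred, not official)."""
    # Body = whole document minus the reference section and the appendix.
    body_len_words = r["words"] - r["ref_len_words"] - r["appx_len_words"]
    body_len_chars = r["characters"] - r["ref_len_chars"] - r["appx_len_chars"]
    return {
        "words_per_page_total": r["words"] / r["num_pages"],
        "chars_per_word_total": r["characters"] / r["words"],
        "body_len_words": body_len_words,
        "body_len_chars": body_len_chars,
        "ref_prop_words": r["ref_len_words"] / r["words"],
        "ref_prop_chars": r["ref_len_chars"] / r["characters"],
        "appx_prop_words": r["appx_len_words"] / r["words"],
        "appx_prop_chars": r["appx_len_chars"] / r["characters"],
        # The *_words_per_char columns hold characters divided by words.
        "body_words_per_char": body_len_chars / body_len_words,
        "ref_words_per_char": r["ref_len_chars"] / r["ref_len_words"],
    }


derived = derive(raw)
for name, value in derived.items():
    print(f"{name}: {value:.6g}")
```

For papers with no appendix, a division such as appx_len_chars / appx_len_words would be 0 / 0, which is consistent with the NaN entries shown for appx_words_per_char.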

License: MIT License