Our dataset consists of student essays with detailed instructor feedback (written or audio) across multiple drafts, collected from first-year English as a Second Language (ESL) and creative writing courses at the University of Pittsburgh’s Department of English. Unlike existing writing-feedback corpora, the essays are substantially longer (2 to 5 pages for ESL, 5 to 8 pages for creative writing), and the feedback targets idea development, narrative plotting, readerly engagement, and clarity rather than surface-level error correction. To date, we have collected 566 essays from 157 students across nine classes and aim to collect 15–20k essays from 500 students. Future iterations will incorporate office-hour audio recordings to capture instructor-student interactions. This dataset supports education research on how students develop as writers through feedback, as well as AI and NLP applications such as feedback classification, feedback generation, and personalized writing support systems.
Xiang Lorraine Li
Assistant Professor
Xiang Lorraine Li is an Assistant Professor in the Department of Computer Science at the University of Pittsburgh. Her research lies at the intersection of natural language processing and machine learning, with a focus on studying current models’ limitations in high-impact domains such as education. She aims to build socially responsible, equitable, and robust models that serve diverse users. Her research on pluralistic evaluation and modeling was featured in the AAAI New Faculty Highlight in 2025, and her work has been published at leading NLP and ML venues.