The dataset is a multitask, multimodal corpus of about 100,000 handwritten responses from K-12 students in Brazilian public schools in low-income regions. Each anonymized sample, collected via an offline-first platform, links a smartphone photo of student work to an OCR-extracted student answer. The corpus covers multiple-choice answers, essays in Portuguese, and mathematical equations. It is designed to support research on automated scoring, feedback generation, and handwriting recognition.
Rafael Ferreira Mello
Associate Professor, CESAR
Rafael Ferreira Mello is an Associate Professor at UFRPE and Senior Researcher at CESAR, holding a Ph.D. in Computer Science from UFPE and a Postdoctoral Fellowship from the University of Edinburgh. His research focuses on Artificial Intelligence in Education, with an emphasis on Natural Language Processing and Large Language Models. Dr. Mello was the lead researcher on national projects funded by the Brazilian Ministry of Education that aim to improve students’ writing and develop Learning Analytics dashboards to support educational policy implementation. His work has also involved partnerships with global organizations such as Google and with academic institutions across Europe, Australia, and Latin America.