Table of Links
2. Background
2.1 Effective Tutoring Practice
2.2 Feedback for Tutor Training
2.3 Sequence Labeling for Feedback Generation
2.4 Large Language Models in Education
3. Method
3.1 Dataset and 3.2 Sequence Labeling
3.3 GPT Facilitated Sequence Labeling
4. Results
6. Limitation and Future Works
APPENDIX
B. Input for Fine-Tuning GPT-3.5
C. Scatter Matrix of the Correlation on the Outcome-based Praise
D. Detailed Results of Fine-Tuned GPT-3.5 Model's Performance
6. LIMITATION AND FUTURE WORKS
Measuring the impact of the proposed feedback system. While the current study demonstrates the potential of using GPT-based models to provide explanatory feedback in a novice tutor training context, we acknowledge the need to validate the effectiveness of feedback with highlighted components through empirical research involving actual users. To this end, we propose a comprehensive study to assess the real-world effectiveness and impact of our feedback system on novice tutors. The planned study will involve a group of novice tutors who will use our automated feedback system during their training sessions, and it will capture both quantitative and qualitative data to provide a holistic evaluation of the system’s performance. Quantitative data will be collected through pre- and post-tests to measure tutors’ learning gains, while qualitative data will be gathered from surveys and interviews to assess tutors’ perceptions and experiences with the feedback.
Expanding the scope of the proposed feedback systems for diverse tutoring scenarios. We aim to empower novice tutors through automated explanatory feedback, enabling them to grasp effective tutoring strategies within our training programs. While the fine-tuned GPT-3.5 model has shown promising results in delivering explanatory feedback for giving effective praise, its applicability and effectiveness across a broader range of tutoring scenarios, such as responding to student errors and assessing student understanding, have yet to be explored. This gap highlights the necessity of broadening the scope of our proposed method. Expanding and rigorously evaluating our approach to encompass diverse educational contexts and lesson types is essential for building a more versatile and universally applicable automated feedback system.
Enhancing the proposed feedback system with data augmentation. We also recognize the inherent challenges associated with sequence labeling for highlighting key components of tutoring practice (e.g., praise components in our study). To achieve satisfactory performance, our study required 50% of the total dataset, equivalent to 65 training samples. This substantial annotation workload raises concerns, particularly when considering extending the fine-tuning of GPT models to more tutor training lessons (e.g., our tutor training platform has designed 20 lessons for different tutoring strategies). To address this issue and reduce the reliance on extensive manual annotation, we are exploring data augmentation techniques such as random swap and synonym replacement [14]. By applying these techniques to merely 10% of the dataset, or 13 training samples, we aim to reduce the dependency on extensive manual annotation efforts.
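The two augmentation techniques named above can be sketched roughly as follows. This is a minimal illustration, not the implementation from [14]: the function names and the tiny `SYNONYMS` table are assumptions for the example, and a real pipeline would draw synonyms from a lexical resource such as WordNet.

```python
import random

# Illustrative synonym table (an assumption for this sketch; in practice
# synonyms would come from a resource like WordNet).
SYNONYMS = {
    "great": ["excellent", "terrific"],
    "effort": ["work", "attempt"],
}

def random_swap(tokens, n_swaps=1, rng=random):
    """Return a copy of tokens with n_swaps random position swaps."""
    tokens = tokens[:]
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def synonym_replace(tokens, rng=random):
    """Replace each token that has a known synonym with a random one."""
    return [rng.choice(SYNONYMS[t]) if t in SYNONYMS else t for t in tokens]

tokens = "great effort on this problem".split()
print(random_swap(tokens, n_swaps=1))
print(synonym_replace(tokens))
```

For sequence labeling, the same swap or replacement would also have to be applied to the token-level labels so that annotations stay aligned with the augmented text.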
Examining the applicability of the proposed feedback system across different platforms. In our future work, we aim to apply sequence labeling methods to analyze real-world tutoring transcripts and diverse datasets, such as teacher comments from educational platforms like ASSISTments [22]. By leveraging fine-tuned GPT models to highlight the key components of instructional strategies (e.g., effective praise, responding to student errors, and engaging with difficult students), we plan to generate comprehensive reports that identify the desired and less desired components in teacher feedback or comments and provide targeted suggestions for improvement. This initiative could offer actionable insights to tutors on enhancing their pedagogical approaches in future sessions.
This paper is available on arxiv under CC BY 4.0 DEED license.
Authors:
(1) Jionghao Lin, Carnegie Mellon University ([email protected]);
(2) Eason Chen, Carnegie Mellon University ([email protected]);
(3) Zeifei Han, University of Toronto ([email protected]);
(4) Ashish Gurung, Carnegie Mellon University ([email protected]);
(5) Danielle R. Thomas, Carnegie Mellon University ([email protected]);
(6) Wei Tan, Monash University ([email protected]);
(7) Ngoc Dang Nguyen, Monash University ([email protected]);
(8) Kenneth R. Koedinger, Carnegie Mellon University ([email protected]).