[Journal] Wu et al. (2020) Using supervised machine learning on large-scale online forums to classify course-related Facebook messages in predicting learning achievement within the personal learning environment.

Wu, J.-Y.*, Hsiao, Y.-C.†, & Nian, M.-W.† (2020). Using supervised machine learning on large-scale online forums to classify course-related Facebook messages in predicting learning achievement within the personal learning environment. Interactive Learning Environments, 28(1), 65-80. https://doi.org/10.1080/10494820.2018.1515085 (SSCI: 2020 IF: 3.928, Q1)

ABSTRACT

This paper demonstrated the use of supervised Machine Learning (Sup. ML) for text classification to predict students’ final course grades in a hybrid Advanced Statistics course and exhibited the potential of using ML-classified messages to identify students at risk of course failure. We built three classification models with training data of 76,936 posts from two large online forums and applied the models to classify messages into statistics-related and non-statistics-related posts in a private Facebook group. Three ML algorithms were compared in terms of classification effectiveness and congruency with human coding. Students with more messages endorsed by two or more ML algorithms as statistics-related had higher final course grades. Students who failed the course also had significantly fewer messages endorsed by all three ML algorithms than those who passed. Results suggest that ML can be used for identifying students in need of support within the personal learning environment and for quality control of large-scale educational data.
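The endorsement counts described in the abstract (messages labelled statistics-related by two or more, or by all three, classifiers) can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so the function name `count_endorsed_messages`, the `(student_id, votes)` layout, and the toy data are all hypothetical; each vote is assumed to be a 0/1 label from one classifier.

```python
from collections import defaultdict


def count_endorsed_messages(predictions, min_votes=2):
    """Count, per student, the messages that at least `min_votes`
    classifiers labelled as statistics-related.

    `predictions` is a list of (student_id, votes) pairs, where `votes`
    is a tuple of 0/1 labels, one per classifier (hypothetical layout).
    """
    counts = defaultdict(int)
    for student_id, votes in predictions:
        counts[student_id] += 0  # make sure every student appears, even with zero
        if sum(votes) >= min_votes:
            counts[student_id] += 1
    return dict(counts)


# Hypothetical toy data: three classifiers, five messages, two students.
toy_predictions = [
    ("s1", (1, 1, 0)),
    ("s1", (1, 1, 1)),
    ("s1", (0, 0, 1)),
    ("s2", (0, 1, 0)),
    ("s2", (1, 1, 1)),
]

majority_counts = count_endorsed_messages(toy_predictions, min_votes=2)
unanimous_counts = count_endorsed_messages(toy_predictions, min_votes=3)
print(majority_counts)
print(unanimous_counts)
```

Per-student counts of this kind could then be related to final course grades or compared between students who passed and those who failed. That analysis step is not shown here.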

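ABSTRACT (translated from the Chinese)

This paper presents a novel demonstration of using supervised machine learning (Supervised ML) to classify discussion text from a Facebook learning group. It also uses the text classified by the Supervised ML algorithms to build a model that predicts students' final course grades in a hybrid Advanced Statistics course, and examines the model's potential for identifying students at risk of falling behind. We built three text classification models using 76,936 posts from large online forums as training data and applied them to the complex text of a Facebook learning group, sorting messages into statistics-related and non-statistics-related categories. We compared the three ML algorithms on classification effectiveness and on agreement with human coding, and used ensemble learning to combine their results. Students with more posts endorsed as statistics-related by two or more ML algorithms had higher final course grades. Students who failed the course also had significantly fewer messages endorsed by all three ML algorithms than those who passed. These breakthrough findings on applying AI technology to online learning processes and learning outcomes indicate that supervised machine learning can identify students in urgent need of support within the personal learning environment, and that it has the potential to be applied at scale to educational data analysis to help ensure high-quality online learning outcomes.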

Editor’s Comments to Author (Joseph Psotka, Co-Editor)

This is a terrific, groundbreaking paper. SVM often is superior to ANN when datasets are relatively small (i.e., in the thousands rather than millions). Will you please undertake to enlarge your datasets for training and for verification? I am happy to publish this groundbreaking paper, but really want to see you pursue this with ever larger datasets. Thank you so very much for considering ILE for publication of your excellent paper. By the way, is there any way to incorporate reinforcement learning with the small graduate student data set to improve the machine learning classifications for future assessments?
