[Journal] Wu (2021) Learning analytics on structured and unstructured heterogeneous data sources: Perspectives from procrastination, help-seeking, and machine-learning defined cognitive engagement.

Wu, J.-Y. (2021). Learning analytics on structured and unstructured heterogeneous data sources: Perspectives from procrastination, help-seeking, and machine-learning defined cognitive engagement. Computers & Education, 163, 104066. doi:10.1016/j.compedu.2020.104066 (SSCI: 2021 IF: 11.182, Q1)

Abstract

Statistics is one of the most challenging courses for university students. The personal learning environment (PLE) pedagogical design was introduced to assist students’ statistics learning. With the PLE pedagogy, this study examined learners’ demographic backgrounds, motivational measures (i.e., help-seeking and academic procrastination due to the use of Information and Communication Technologies, ICT), and ill-structured data (i.e., Facebook posts and comments) to understand how students’ demographic characteristics, what they feel, and what they do affect their statistics learning performance. Seventy-eight students joined Facebook groups to form statistics learning communities. Using weakly supervised machine learning (ML), we categorized students’ Facebook messages as statistics-relevant or statistics-irrelevant. Results of the learning analytics on multimodal sources of student data showed that help-seeking positively predicted statistics achievement. In contrast, academic procrastination with ICT negatively predicted statistics achievement, controlling for students’ demographic information, including age, gender, prior knowledge, and Internet/social media use. Moreover, the ensemble ML classification of messages, constructed by summing the relevance codes (0 or 1) across three selected ML algorithms, was highly aligned with the human coding of the messages in terms of their degree of relevance to statistics. Because of this high consistency with the human-labeled relevance coding, the ensemble ML-classified messages were conceptualized as students’ cognitive engagement in statistics learning, and they were positively associated with statistics achievement with a large effect size. The study contributes to the development of an integrated learner-centered learning analytics framework with the PLE pedagogical design, encompassing learner backgrounds and unstructured learner artifacts.
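The ensemble coding described in the abstract (summing 0/1 relevance codes across three classifiers) can be illustrated with a minimal sketch. The function name, the variable names, and the example codes below are hypothetical; the abstract does not name the three algorithms or report message-level data, so this only shows the summation step, not the study's actual pipeline.

```python
from typing import List, Sequence


def ensemble_relevance(codes_per_model: Sequence[Sequence[int]]) -> List[int]:
    """Sum binary relevance codes (0 or 1) across classifiers, per message.

    codes_per_model holds one sequence per classifier; position i in each
    sequence is that classifier's code for message i.
    """
    if not codes_per_model:
        raise ValueError("at least one classifier's codes are required")
    n_messages = len(codes_per_model[0])
    for codes in codes_per_model:
        if len(codes) != n_messages:
            raise ValueError("every classifier must code the same messages")
        if any(code not in (0, 1) for code in codes):
            raise ValueError("relevance codes must be 0 or 1")
    # zip(*...) regroups the codes message by message before summing.
    return [sum(message_codes) for message_codes in zip(*codes_per_model)]


# Hypothetical codes from three unnamed classifiers for four messages.
model_a = [1, 0, 1, 1]
model_b = [1, 0, 0, 1]
model_c = [1, 0, 1, 0]

scores = ensemble_relevance([model_a, model_b, model_c])
print(scores)  # each score ranges from 0 (no model says relevant) to 3 (all do)
```

With three classifiers, each message receives a score from 0 to 3, which gives a graded rather than binary indication of relevance to statistics.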

Keywords: Social media, Improving classroom teaching, Learning communities, Data science applications in education, Weakly supervised Machine Learning, Transfer Learning, Learning Analytics

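Abstract (Chinese version)

Statistics is one of the most challenging courses that university students face. This study innovatively introduces the Personal Learning Environment (PLE) pedagogical design to help students learn statistics adaptively and actively. It collected learners’ background information, motivational constructs (i.e., help-seeking and self-perceived academic procrastination arising from the use of information and communication technologies, ICT), and unstructured conversational data (i.e., Facebook posts and replies), and proposed, as a breakthrough, "multimodal heterogeneous-data learning analytics" to examine in an integrated way how students’ background information, self-perceived motivation, and conversational engagement affect their statistics learning performance. Seventy-eight students signed consent forms and joined Facebook learning groups to form statistics learning communities. Using weakly supervised machine learning, the study classified students’ Facebook conversation texts as relevant or irrelevant to statistics learning. The multimodal heterogeneous-data learning analytics showed that, after controlling for students’ background information (including age, gender, prior knowledge, and Internet/social media use), self-perceived help-seeking positively predicted final statistics grades. In contrast, academic procrastination caused by ICT use had a negative effect on statistics grades. In addition, the machine-coded results (0 or 1) of three selected ML algorithms were combined into an ensemble coding, which showed a high degree of agreement with the human coding and thus had internal validity. These ensemble ML classification results were therefore innovatively conceptualized as students’ cognitive engagement in statistics learning within the PLE. In the multimodal learning analytics, students’ cognitive engagement across the whole semester was positively correlated with their final statistics grades, with a larger effect size. This breakthrough finding has predictive validity and criterion-related validity, and has value for causal inference. The findings can support innovation in statistics teaching and provide a more complete learner-centered framework for heterogeneous-data learning analytics.

Keywords (Chinese version): Social media, Improving classroom teaching, Learning communities, Data science applications in education, Weakly supervised machine learning, Transfer learning, Learning analytics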