Machine learning applications of data science in education often fall into two areas: (i) using technology to analyze unstructured data, for example through natural language processing, computer vision, and clickstream analysis; and (ii) using big data to make predictions. In the field of education, predictive analytics at the individual level has been restricted by questions about what is being predicted: How do we know the results are valid? Can they be replicated? Are they fair and useful? What approaches are best, and what interpretations can be made? This presentation, which employs a large video archive of TED Talks, will exemplify how machine learning and predictive analytics can be used to predict the results of gold-standard assessment instruments, and then go beyond the instruments. The approach opens up broad new horizons for how data science is currently used in educational research. The talk will share an example that mines video transcripts (using TF-IDF and cosine similarity) and then extends machine learning (ML) by hybridizing it with information from measurement models, in this case through distractor analysis. The talk will also discuss how generalizable such approaches are in education and examine the implications for new tools and approaches in machine learning.
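As a rough illustration of the transcript-mining step named in the abstract, the sketch below computes TF-IDF vectors and pairwise cosine similarity in plain Python. It is not the presenter's pipeline: the function names, the smoothed IDF formula, the simple tokenizer, and the three toy "transcripts" are all assumptions made for illustration, and the strings are invented placeholders rather than TED Talk data.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep only alphabetic tokens (a deliberately simple tokenizer)."""
    cleaned = "".join(ch if ch.isalpha() else " " for ch in text.lower())
    return cleaned.split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document.

    Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1, chosen here so that
    terms appearing in every document still get a nonzero weight.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Term frequency is the raw count normalized by document length.
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in counts.items()})
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy transcripts, invented for this example only.
transcripts = [
    "students learn science through experiments",
    "science experiments help students learn",
    "the orchestra played a quiet symphony",
]
vectors = tfidf_vectors(transcripts)
sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
```

Here the first two toy transcripts share most of their vocabulary and so score higher than the first and third, which share no terms. A real analysis would likely add stop-word handling and a more careful tokenizer before comparing transcripts.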
Kathleen Scalise is a professor at the University of Oregon in the Department of Methodology, Policy and Leadership. Dr. Scalise employs data science at its intersection with measurement and assessment for applied and theoretical research, including for learning in digital social networks and science/engineering education, as well as for network analyses of leadership and collaboration. She has conducted extensive research in the areas of learning, e-learning, large-scale assessment, and instructional technology in the context of STEM education (science, technology, engineering, and mathematics), and also in emergent language, second language acquisition, digital literacy, social collaboration, leadership, and 21st century skills. In addition to data analytics, she is interested in new models for dynamic delivery of differentiated content to support the needs of all learners, innovative item types, and equity, opportunity, and access in education. She currently serves as director of the National Assessment of Educational Progress (NAEP) Science for ETS. She is co-lead for the University of Oregon Social Systems Data Science Network. Her projects have included research on 21st Century Skills Assessments with Cisco, Intel, and Microsoft; STEM Virtual Performance Assessments with Harvard University; and technology-enhanced assessments with Smarter Balanced and NGSS science assessment designs. She has served internationally on OECD’s PISA, and IEA’s eTIMSS and ICILS. She has published extensively in journals and served on the NRC committee for the report on assessment of the Next Generation Science Standards. She holds K-12 teaching credentials (California) for physical sciences and life sciences, a B.A. in biochemistry, and a Ph.D. focusing on Quantitative Measurement from UC Berkeley.