Using Data Mining to Model Student Achievement on the 4th Grade TIMSS 2015 Mathematics Assessment: A Five Nation Study

Annette M Siemssen, University of Texas at El Paso


Data mining has been successfully used by financial and retail companies since the mid-1960s to create predictive models and reveal unexpected relationships. However, it remains underutilized as a tool in educational research. Large-scale standardized assessment programs such as the Trends in International Mathematics and Science Study (TIMSS) provide vast amounts of data with the potential to yield new insights in education. Five nations, the Republic of Korea, the United States, Germany, Kuwait, and Kazakhstan, were selected based on General Response Style theory to represent a spectrum of cultural backgrounds, from acquiescent to midpoint to individualistic (Hastedt & van de Vijver, 2017). The Random Forest data mining technique was used to create a series of models to predict student achievement in mathematics using items from the TIMSS 2015 4th Grade background questionnaires for students, teachers, and principals. The final collective model reduced the number of variables from 398 to 23 and was able to predict student achievement. Important variables included items relating to language, reading, nutrition, the experience of educators, and student perception of mathematical ability. The individual variable importance rankings for each nation indicated that acquiescent and midpoint nations shared more important variables with nations of a similar response style than with the collective model. The variable importance ranking for Kazakhstan, the nation representing the individualistic response style, aligned well with neither the other nations nor the collective model. Only two variables, the number of books in the home and the experience of the principal, were highly ranked by all five nations. The large discrepancies between the nation and collective models indicate the need to address local concerns when forming education policy.
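As a rough illustration of how variable importance rankings from separate national models might be compared, the following minimal sketch computes the overlap between the top-ranked variables of two nations and the set of variables ranked highly by every nation. It is not the method used in the study: the overlap measure (Jaccard similarity of top-k sets), the function names, the nation labels, and the variable names are all assumptions made for illustration, and the rankings are placeholder data rather than TIMSS results.

```python
from itertools import combinations


def top_k(ranking, k):
    """Return the set of the k highest-ranked variables (ranking is ordered best first)."""
    return set(ranking[:k])


def top_k_overlap(ranking_a, ranking_b, k):
    """Jaccard overlap of two top-k sets: 1.0 means identical, 0.0 means disjoint."""
    a, b = top_k(ranking_a, k), top_k(ranking_b, k)
    union = a | b
    if not union:  # guard against k == 0 or empty rankings
        return 0.0
    return len(a & b) / len(union)


def shared_by_all(rankings, k):
    """Return the variables that appear in the top k of every ranking."""
    sets = [top_k(r, k) for r in rankings.values()]
    return set.intersection(*sets)


# Hypothetical rankings, most important variable first.
# Placeholder names and orderings only; these are not results from the study.
rankings = {
    "nation_a": ["v1", "v2", "v3", "v4", "v5"],
    "nation_b": ["v2", "v1", "v3", "v6", "v4"],
    "nation_c": ["v7", "v8", "v1", "v9", "v2"],
}

k = 3
for left, right in combinations(rankings, 2):
    score = top_k_overlap(rankings[left], rankings[right], k)
    print(left, right, round(score, 2))

print(sorted(shared_by_all(rankings, k)))
```

In this placeholder data, the first two nations have identical top-three sets while the third shares only one variable with each of them, which mirrors the kind of contrast the abstract describes between nations of similar and dissimilar response styles.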

Subject Area

Mathematics education|Statistics|Education

Recommended Citation

Siemssen, Annette M, "Using Data Mining to Model Student Achievement on the 4th Grade TIMSS 2015 Mathematics Assessment: A Five Nation Study" (2018). ETD Collection for University of Texas, El Paso. AAI10812051.