Date of Graduation

5-2019

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Educational Statistics and Research Methods (PhD)

Degree Level

Graduate

Department

Rehabilitation, Human Resources and Communication Disorders

Advisor/Mentor

Ronna C. Turner

Committee Member

Xinya Liang

Second Committee Member

Wen-juo Lo

Third Committee Member

Samantha Robinson

Keywords

item response theory, psychometrics

Abstract

Statistical models used for estimating skill or ability levels often vary by field; however, their underlying mathematical models can be very similar. Differences in the underlying models can stem from the need to accommodate data with different formats and structures. As the models from different fields increase in complexity, the range of data types to which they can be applied may also increase. Models applied to educational or psychological data have advanced to accommodate a wide range of data formats, including improved estimation accuracy with sparsely populated data matrices. Meanwhile, the field of online gaming has expanded over the last two decades to include more complex statistical models that provide real-time game matching based on ability estimates. It can be useful to see how statistical models from the educational and gaming fields compare, as different datasets may benefit from different ability estimation procedures. This study compared statistical models typically used in game matchmaking systems (Elo, Glicko) to models used in psychometric modeling (item response theory [IRT] and Bayesian IRT) using both simulated and real data under a variety of conditions. Results indicated that conditions with small numbers of items or matches had the most accurate skill estimates using the Bayesian IRT one-parameter logistic (1PL) model, regardless of whether educational or gaming data were used. This held true for all sample sizes with small numbers of items. However, the Elo and non-Bayesian IRT 1PL models produced estimates close to those of the Bayesian IRT 1PL model for both gaming and educational data. While the two-parameter logistic (2PL) models were not shown to be accurate for the gaming study conditions, the IRT 2PL and Bayesian IRT 2PL models outperformed the 1PL models when 2PL educational data were generated with the larger sample size and item condition. Overall, the Bayesian IRT 1PL model seemed to be the best choice across the smaller sample and match size conditions.
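
The abstract's remark that the gaming and psychometric models can be mathematically very similar can be illustrated with a minimal sketch. It is not taken from the dissertation: the function names, the conventional Elo scale of 400, and the K-factor of 32 are assumptions chosen for illustration. Under those assumptions, the standard Elo expected score equals a 1PL (Rasch) response probability once ratings are rescaled to the logit metric.

```python
import math

ELO_SCALE = 400.0  # conventional Elo scale constant (assumption, not from the source)


def elo_expected(r_a, r_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / ELO_SCALE))


def irt_1pl(theta, b):
    """1PL (Rasch) probability of success for ability theta and difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def elo_to_logit(rating):
    """Rescale an Elo rating to the logit metric used by the 1PL model."""
    return rating * math.log(10.0) / ELO_SCALE


def elo_update(r_a, r_b, score_a, k=32.0):
    """One Elo update for player A; k=32 is an illustrative K-factor."""
    return r_a + k * (score_a - elo_expected(r_a, r_b))


if __name__ == "__main__":
    p_elo = elo_expected(1600.0, 1400.0)
    p_irt = irt_1pl(elo_to_logit(1600.0), elo_to_logit(1400.0))
    print(f"Elo expected score: {p_elo:.6f}")
    print(f"1PL probability:    {p_irt:.6f}")
    print(f"Rating after a win between equal players: {elo_update(1500.0, 1500.0, 1.0)}")
```

In this framing the opponent's rating plays the role of item difficulty. The main difference is in estimation: Elo updates ratings one match at a time, whereas IRT models are usually fitted to the full response matrix.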

Citation

Morrison, B. (2019). Comparing Elo, Glicko, IRT, and Bayesian IRT Statistical Models for Educational and Gaming Data. Graduate Theses and Dissertations. Retrieved from https://scholarworks.uark.edu/etd/3201

Included in

Applied Statistics Commons, Statistical Models Commons

Graduate Theses and Dissertations

Comparing Elo, Glicko, IRT, and Bayesian IRT Statistical Models for Educational and Gaming Data