The purpose of this study is to provide an empirical analysis of the performance of an L1 Japanese automated essay scoring system when applied to L2 Japanese compositions. In particular, the study concerns the use of such a system by teachers in formal L2 Japanese classes, rather than in standardised tests, and the experiments were designed accordingly. Jess, a Japanese essay scoring system, was trialled on L2 Japanese compositions (n = 50). Jess performed very well and was comparable with human raters, in that the correlation between Jess and the average of the nine human raters is at least as high as the correlation among the nine human raters themselves. However, we also found 1) that the performance of Jess is not as good as the reported performance of English automated essay scoring systems in the L2 environment and 2) that very good compositions tend to be under-scored by Jess, indicating that Jess still has room for improvement.
|Journal||Papers in Language Testing and Assessment|
|Publication status||Published - 2013|