The performance of two different procedures for calculating the likelihood ratio (LR) for forensic text comparison (FTC) is empirically compared. One is the multivariate kernel density (MVKD) procedure with so-called lexical features; the MVKD procedure has been successfully applied to various types of evidence, including texts. The other is a procedure based on character N-grams, a widely used, robust probabilistic language model. The effectiveness of character N-grams has been reported in authorship analysis; however, to the best of my knowledge, they have not yet been applied to LR-based FTC. In this study, the log-likelihood-ratio cost (Cllr), an appropriate assessment metric for LR-based systems, is used to assess the performance of the two procedures. It will be reported that the MVKD procedure outperforms the character N-gram procedure. Through the comparison of the two procedures, this study also demonstrates how the weight of evidence (i.e., the LR) can be estimated from text messages.
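The abstract itself gives no implementation details. As a rough, non-authoritative sketch of two ingredients it names — character N-gram extraction and the Cllr metric — the following Python fragment may help; the function names and the toy inputs are my own, and the Cllr formula is the standard one from the speaker-recognition literature, not code from this study.

```python
import math

def char_ngrams(text, n=2):
    """Extract overlapping character N-grams from a text string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def cllr(lrs_same, lrs_diff):
    """Log-likelihood-ratio cost (Cllr).

    lrs_same: LRs from same-author comparisons (ideally > 1)
    lrs_diff: LRs from different-author comparisons (ideally < 1)
    Lower Cllr means better-calibrated, more informative LRs;
    a system that always outputs LR = 1 scores Cllr = 1.
    """
    penalty_same = sum(math.log2(1 + 1 / lr) for lr in lrs_same) / len(lrs_same)
    penalty_diff = sum(math.log2(1 + lr) for lr in lrs_diff) / len(lrs_diff)
    return 0.5 * (penalty_same + penalty_diff)

# Toy illustration (invented values, not the paper's data):
print(char_ngrams("hello", 2))        # ['he', 'el', 'll', 'lo']
print(cllr([1.0], [1.0]))             # 1.0 — an uninformative system
```

The design point the metric captures is that Cllr penalises not only wrong-side LRs but also poorly calibrated magnitudes, which is why it is preferred over simple error rates for LR-based systems.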
Published - 2014
Australian Linguistic Society Conference Proceedings - ALS 2014 - University of Newcastle
Duration: 1 Jan 2014 → …