TY - JOUR
T1 - Strength of forensic text comparison evidence from stylometric features: a multivariate likelihood ratio-based analysis
AU - Ishihara, Shunichi
PY - 2017
Y1 - 2017
N2 - An experiment in forensic text comparison (FTC) within the likelihood ratio (LR) framework is described, in which authorship attribution was modelled with word- and character-based stylometric features. Chatlog messages of 115 authors were selected from a chatlog archive containing real pieces of chatlog evidence used to prosecute paedophiles. Four different text lengths (500, 1000, 1500 or 2500 words) were used for modelling in order to investigate how system performance is influenced by sample size. The strength of the authorship attribution evidence (the LR) was estimated with the Multivariate Kernel Density formula. Performance was primarily assessed with the log-likelihood ratio cost (Cllr), but assessments with other metrics, e.g. credible interval and equal error rate, are also given. Taking into account the small number of features used for modelling authorship attribution, the results are promising. Even with a small sample size of 500 words, the system achieved a discrimination accuracy of c. 76% (Cllr = 0.68258). With a sample size of 2500 words, a discrimination accuracy of c. 94% (Cllr = 0.21707) was obtained. A larger sample size is beneficial to FTC, resulting in an improvement in discriminability, an increase in the magnitude of the consistent-with-fact LRs and a decrease in the magnitude of the contrary-to-fact LRs. It was found that 'Average character number per word token', 'Punctuation character ratio', and vocabulary richness features are robust features that work well regardless of sample size. The results demonstrate the efficacy of the LR framework for analysing authorship attribution evidence.
AB - An experiment in forensic text comparison (FTC) within the likelihood ratio (LR) framework is described, in which authorship attribution was modelled with word- and character-based stylometric features. Chatlog messages of 115 authors were selected from a chatlog archive containing real pieces of chatlog evidence used to prosecute paedophiles. Four different text lengths (500, 1000, 1500 or 2500 words) were used for modelling in order to investigate how system performance is influenced by sample size. The strength of the authorship attribution evidence (the LR) was estimated with the Multivariate Kernel Density formula. Performance was primarily assessed with the log-likelihood ratio cost (Cllr), but assessments with other metrics, e.g. credible interval and equal error rate, are also given. Taking into account the small number of features used for modelling authorship attribution, the results are promising. Even with a small sample size of 500 words, the system achieved a discrimination accuracy of c. 76% (Cllr = 0.68258). With a sample size of 2500 words, a discrimination accuracy of c. 94% (Cllr = 0.21707) was obtained. A larger sample size is beneficial to FTC, resulting in an improvement in discriminability, an increase in the magnitude of the consistent-with-fact LRs and a decrease in the magnitude of the contrary-to-fact LRs. It was found that 'Average character number per word token', 'Punctuation character ratio', and vocabulary richness features are robust features that work well regardless of sample size. The results demonstrate the efficacy of the LR framework for analysing authorship attribution evidence.
U2 - 10.1558/ijsll.30305
DO - 10.1558/ijsll.30305
M3 - Article
VL - 24
SP - 67
EP - 98
JO - The International Journal of Speech, Language and the Law
JF - The International Journal of Speech, Language and the Law
IS - 1
ER -