A Comparative Study of Two Procedures for Calculating Likelihood Ratio in Forensic Text Comparison: Multivariate Kernel Density vs. Gaussian Mixture Model-Universal Background Model

    Research output: Contribution to conference (Paper)

    Abstract

    We compared the performance of two procedures for calculating the likelihood ratio (LR) on the same set of text data. The first was a multivariate kernel density (MVKD) procedure, which has been successfully applied to various types of forensic evidence, including glass fragments, handwriting, fingerprints, voice, and texts. The second was a Gaussian mixture model - universal background model (GMM-UBM) procedure, which has been commonly used in forensic voice comparison (FVC) with so-called automatic features. Previous studies have applied the MVKD system to electronically generated texts to estimate LRs, but no previous study appears to have applied the GMM-UBM system to such texts. The GMM-UBM system has been reported to outperform the MVKD system in FVC. The data used for this study were chatlog messages collected from 115 authors, which were divided into test, background and development databases. Three sample sizes (500, 1500 and 2500 words) were used to investigate how sensitive performance is to sample size. The results show that, regardless of sample size, the GMM-UBM system performed better than the MVKD system with respect to both validity (= accuracy), measured by the log-likelihood-ratio cost (Cllr), and reliability (= precision), measured by the 95% credible interval (CI).
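
For readers unfamiliar with the validity metric named in the abstract, the following is a minimal sketch of the log-likelihood-ratio cost (Cllr) as it is commonly defined in the forensic comparison literature. It averages a logarithmic penalty over same-author and different-author comparisons. The function name `cllr` and the sample LR values are illustrative assumptions and are not taken from the paper, and the abstract does not say how the authors implemented the metric.

```python
import math


def cllr(same_author_lrs, different_author_lrs):
    """Log-likelihood-ratio cost (hypothetical helper, not the authors' code).

    same_author_lrs: LRs from comparisons where both texts share an author.
    different_author_lrs: LRs from comparisons of texts by different authors.
    All LRs must be strictly positive.
    """
    # Same-author comparisons are penalised when the LR is small.
    same_penalty = sum(math.log2(1 + 1 / lr) for lr in same_author_lrs)
    same_penalty /= len(same_author_lrs)

    # Different-author comparisons are penalised when the LR is large.
    diff_penalty = sum(math.log2(1 + lr) for lr in different_author_lrs)
    diff_penalty /= len(different_author_lrs)

    return 0.5 * (same_penalty + diff_penalty)


if __name__ == "__main__":
    # Made-up LRs for illustration only.
    print(cllr([100.0, 20.0], [0.01, 0.2]))
```

Under this definition, a system that always outputs LR = 1 (no information) scores exactly 1, values closer to 0 indicate better validity, and values above 1 indicate LRs that mislead more than they inform.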
    Original language: English
    Pages: 71-79
    Publication status: Published - 2013
    Event: Australasian Language Technology Association Workshop ALTA 2013 - Brisbane, Australia
    Duration: 1 Jan 2013 → …

