Compared to other forensic comparative sciences, studies of the efficacy of the likelihood ratio (LR) framework in forensic authorship analysis are lagging. An experiment is described concerning the estimation of strength of linguistic text evidence within that framework. The LRs were estimated by trialling three different procedures: one is based on the multivariate kernel density (MVKD) formula, with each group of messages being modelled as a vector of authorship attribution features; the other two involve N-grams based on word tokens and characters, respectively. The LRs that were separately estimated from the three different procedures are logistic-regression-fused to obtain a single LR for each author comparison. This study used predatory chatlog messages sampled from 115 authors. To see how the number of word tokens affects the performance of a forensic text comparison (FTC) system, token numbers used for modelling each group of messages were progressively increased: 500, 1000, 1500 and 2500 tokens. The performance of the FTC system is assessed using the log-likelihood-ratio cost (Cllr), which is a gradient metric for the quality of LRs, and the strength of the derived LRs is charted as Tippett plots. It is demonstrated in this study that (i) out of the three procedures, the MVKD procedure with authorship attribution features performed best in terms of Cllr, and that (ii) the fused system outperformed all three of the single procedures. When the token length is 1500, for example, the fused system achieved a Cllr value of 0.15. Some unrealistically strong LRs were observed in the results. Reasons for these are discussed, and a possible solution to the problem, namely the empirical lower and upper bound LR (ELUB) method is trialled and applied to the LRs of the best-achieving fusion system.