Stylometric Techniques for Multiple Author Clustering

David Kernot, Terry Bossomaier, Roger Bradbury

    Research output: Contribution to journal (Article)


    Abstract: In 1598-99, the printer William Jaggard named Shakespeare as the sole author of The Passionate Pilgrim, even though Jaggard included a number of non-Shakespearian poems in the volume. A four-feature technique, RPAS, grounded in a neurolinguistic approach to authorship identification, is used to convert the 21 poems in The Passionate Pilgrim into a multi-dimensional vector. Three complementary analytical techniques are applied to cluster the data and reduce single-technique bias before an alternate method, seriation, is used to measure the distances between clusters and test the strength of the connections. The multivariate techniques are found to be robust and able to allocate nine of the 12 unknown poems to Shakespeare. The authorship of one of the Barnfield poems is questioned, and the analysis suggests that others are collaborations or the works of yet-to-be-acknowledged poets. It is possible that as many as 15 poems were Shakespeare’s and that at least five poets were not acknowledged.
    Original language: English
    Pages (from-to): 1-9
    Journal: International Journal of Advanced Computer Science and Applications
    Issue number: 3
    Publication status: Published - 2017
