Date on Master's Thesis/Doctoral Dissertation
5-2024
Document Type
Doctoral Dissertation
Degree Name
Ph. D.
Department
Computer Engineering and Computer Science
Degree Program
Computer Science and Engineering, PhD
Committee Chair
Yampolskiy, Roman
Committee Co-Chair (if applicable)
Nasraoui, Olfa
Committee Member
Nasraoui, Olfa
Committee Member
Lauf, Adrian
Committee Member
Losavio, Michael
Author's Keywords
Stylometry; multimodal; authorship identification; feature fusion; text mining; source code stylometry
Abstract
This dissertation introduces multimodal stylometry, a novel approach to authorship identification that integrates text and source code features for a comprehensive understanding of an author's unique style. Traditional stylometric methods have primarily focused on either text stylometry or source code stylometry, thereby neglecting the potential insights that multimodality may provide. This research aims to bridge this gap by proposing a framework that combines textual and source code data to enhance the accuracy and reliability of authorship identification. The study begins by reviewing existing literature on authorship identification and stylometry, highlighting the limitations of unimodal approaches. Leveraging recent advancements in multimodal biometrics and feature fusion, the research introduces a methodology that extracts stylometric features from written text and source code. These multimodal features are then integrated using an extended feature fusion technique that introduces an extra layer of feature selection. To validate the proposed approach, a diverse dataset comprising texts and corresponding source code from various authors is curated. The dissertation evaluates the effectiveness of the multimodal approach in comparison with unimodal approaches. Furthermore, the research investigates the transferability of the proposed multimodal stylometry framework to distinguishing AI-generated from human-generated text and source code. The findings not only advance authorship identification techniques but also hold implications for applications in forensic linguistics, digital humanities, and content analysis. Ultimately, this research underscores the significance of multimodal stylometry in estimating the identity of an author.
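As a rough illustration of the pipeline the abstract describes (extract features from each modality, fuse them, then apply a further selection step), the following minimal Python sketch may help. Everything in it is an assumption made for illustration: the abstract does not name the features or the selection method used in the dissertation, so the six toy features, the function names, and the variance filter standing in for the "extra layer of feature selection" are all hypothetical.

```python
import re
from statistics import pvariance


def text_features(text):
    """Toy stylometric features for prose (illustrative choices only)."""
    words = re.findall(r"[A-Za-z']+", text)
    n = max(len(words), 1)
    return [
        sum(len(w) for w in words) / n,          # mean word length
        len({w.lower() for w in words}) / n,     # type-token ratio
        sum(text.count(c) for c in ",;:") / n,   # punctuation marks per word
    ]


def code_features(code):
    """Toy stylometric features for source code (illustrative choices only)."""
    lines = [ln for ln in code.splitlines() if ln.strip()]
    n = max(len(lines), 1)
    return [
        sum(len(ln) for ln in lines) / n,                      # mean line length
        sum(ln.lstrip().startswith("#") for ln in lines) / n,  # comment-line ratio
        sum(ln.startswith((" ", "\t")) for ln in lines) / n,   # indented-line ratio
    ]


def fuse(text, code):
    """Feature-level fusion: concatenate the two modality vectors."""
    return text_features(text) + code_features(code)


def select_by_variance(rows, threshold=1e-9):
    """Assumed stand-in for the selection layer: return the indices of
    fused columns whose variance across samples exceeds the threshold."""
    columns = list(zip(*rows))
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]


# Two hypothetical (text, code) samples, one per author.
samples = [
    ("I think, therefore I am.", "x = 1\n# set y\ny = 2\n"),
    ("Short words win the day", "def f(a):\n    return a + 1\n"),
]
fused = [fuse(t, c) for t, c in samples]
kept = select_by_variance(fused)
reduced = [[row[i] for i in kept] for row in fused]
```

In a realistic setting the reduced vectors would feed a classifier trained on many samples per author; the sketch stops at the fused, filtered representation because the abstract does not specify the classifier.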
Recommended Citation
Adebayo, Glory O., "Multimodal stylometry: A novel approach for authorship identification." (2024). Electronic Theses and Dissertations. Paper 4361.
https://doi.org/10.18297/etd/4361