Date on Master's Thesis/Doctoral Dissertation

12-2017

Document Type

Master's Thesis

Degree Name

M.S.

Department

Computer Engineering and Computer Science

Degree Program

Computer Science, MS

Committee Chair

Yampolskiy, Roman

Committee Co-Chair (if applicable)

Imam, Ibrahim

Committee Member

Imam, Ibrahim

Committee Member

El-Baz, Ayman

Abstract

Authorship analysis, the process of identifying the true writer of a given document, has been studied for decades. However, only a handful of studies address authorship analysis of translators, even though online translation services are widely available and commonly used to translate posts on social networking services automatically. Identifying translation algorithms has the potential to contribute to the investigation of cybercrimes that involve machine translation of scam messages to reach speakers of foreign languages. This study tested the bag-of-words (BOW) approach to authorship attribution alongside existing approaches to translator attribution. We also proposed a simple but accurate feature that extracts combinations of lexical and syntactic information from texts. Our experiments show that the proposed feature is invariant to text size.
