Master's Thesis

Computer Engineering and Computer Science

Computer Science, MS

Almaghraby, Adel

Gentili, Monica

Gentili, Monica

Imam, Ibrahim

Word2vec; SVM; text mining


Natural Language Processing represents a major advance for government and industry, giving them insight into hidden patterns and information within their data. In this thesis, we work on an important task in Natural Language Processing: text classification. Our goal is to help restaurant owners find out which dishes customers like most. To do so, we use a dataset of 150,000 restaurant reviews and count the dishes mentioned most frequently. However, this count is informative only if the reviews are first grouped by restaurant style, since popular dishes differ from one style to another. For this reason, we use Word2vec with a Support Vector Machine (SVM) to classify the reviews into four restaurant-style categories (Mediterranean, Indian, Mexican, and Japanese). The experimental results show a classification accuracy of 87.2%, which indicates that the methodology is effective for classifying review datasets.
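
The counting step described above, applied after reviews have been assigned a restaurant style, might look roughly like the following minimal sketch. It is not the thesis implementation: the dish vocabulary `DISHES`, the helper `top_dishes_by_style`, and the sample reviews are all hypothetical and invented for illustration, and the style labels are assumed to come from a classifier such as the Word2vec and SVM model.

```python
from collections import Counter, defaultdict

# Hypothetical dish vocabulary; the abstract does not list the dishes tracked.
DISHES = ["tacos", "burrito", "sushi", "ramen", "curry", "naan", "hummus", "falafel"]


def top_dishes_by_style(labeled_reviews, dishes=DISHES, k=2):
    """Count dish mentions per restaurant style and return the k most frequent.

    labeled_reviews is an iterable of (style, review_text) pairs, where the
    style is assumed to have been predicted by a classifier beforehand.
    """
    counts = defaultdict(Counter)
    for style, text in labeled_reviews:
        # Simple lowercase whitespace tokenization; punctuation is not handled.
        tokens = text.lower().split()
        for dish in dishes:
            mentions = tokens.count(dish)
            if mentions:
                counts[style][dish] += mentions
    return {style: counter.most_common(k) for style, counter in counts.items()}


# Invented reviews, each paired with an assumed predicted style.
reviews = [
    ("Mexican", "The tacos were great and the burrito was huge"),
    ("Mexican", "Best tacos in town"),
    ("Japanese", "Fresh sushi and decent ramen"),
    ("Japanese", "The sushi was excellent"),
]

result = top_dishes_by_style(reviews)
for style, dishes in result.items():
    print(style, dishes)
```

Grouping the counts by style before ranking keeps a dish that is common in one style from dominating the list for every restaurant, which is the motivation the abstract gives for classifying the reviews first.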