Lexicon-based Sentiment Analysis in Persian
Pp. 154-183 (30)
Mohammad Ehsan Basiri, Nasser Ghasem-Aghaee and Ahmad Reza Naghsh-Nilchi
Sentiment analysis is a field of study concerned with extracting people’s
opinions and attitudes from their writings on the Web. Most research efforts in the area
of sentiment analysis have focused on English texts, and few works have considered the
problem of Persian sentiment analysis. Persian is spoken by more than a hundred
million people around the world and is an official language of Iran, Tajikistan, and
Afghanistan. From a computational point of view, Persian is a challenging language
due to its derivational nature and the use of Arabic words, informal style of writing,
and different forms of writing for compound words. In this chapter, we present a
lexicon-based framework for sentiment analysis in Persian. Specifically, we develop a
Persian lexicon which associates sentiment words with their sentiment strengths.
Furthermore, in the proposed framework, we address several problems of sentiment
analysis in Persian, such as misspelling, word spacing, and stemming. We apply the
proposed framework to polarity detection and rating prediction of cellphone
reviews. The results show that our approach outperforms supervised machine
learning techniques in terms of accuracy and mean absolute error.
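As a rough illustration of what a lexicon-based approach of this kind can look like, the following minimal sketch scores a review by summing the strengths of the sentiment words it contains, derives a polarity label and a star rating from that score, and computes mean absolute error. Everything in it is an assumption made for illustration and is not taken from the chapter: the four-word lexicon and its strength values, the [-5, 5] strength range, the linear score-to-rating mapping, the whitespace tokenizer, and the function names. The only normalization shown is the common mapping of Arabic yeh and kaf to their Persian forms; the misspelling, word-spacing, and stemming steps mentioned above are not reproduced here.

```python
# Hypothetical toy lexicon: Persian sentiment word -> strength in [-5, 5].
# The words mean roughly: excellent, good, bad, weak. Values are invented.
RAW_LEXICON = {"عالی": 4, "خوب": 3, "بد": -3, "ضعیف": -2}


def normalize(text):
    """Map Arabic yeh (U+064A) and kaf (U+0643) to their Persian forms."""
    return text.replace("\u064a", "\u06cc").replace("\u0643", "\u06a9")


# Normalize the keys too, so lookups do not depend on how they were typed.
LEXICON = {normalize(word): strength for word, strength in RAW_LEXICON.items()}


def score(text):
    """Sum the strengths of all lexicon words in a whitespace-tokenized text."""
    return sum(LEXICON.get(token, 0) for token in normalize(text).split())


def polarity(text):
    """Label a text by the sign of its summed sentiment score."""
    s = score(text)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"


def predict_rating(text, max_strength=5):
    """Clip the score to [-max_strength, max_strength] and map it to 1..5."""
    s = max(-max_strength, min(max_strength, score(text)))
    return round(1 + 4 * (s + max_strength) / (2 * max_strength))


def mean_absolute_error(predicted, gold):
    """Average absolute difference between predicted and gold ratings."""
    return sum(abs(p - g) for p, g in zip(predicted, gold)) / len(gold)
```

A real system would need a far larger lexicon and handling for negation and intensifiers; the sketch only shows how a word-strength lexicon can drive both tasks without any training data.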
Keywords: Lexicon-based review classification, Natural language processing,
Opinion mining, Sentiment analysis, Text mining.
Department of Computer Engineering, Faculty of Engineering, Shahrekord University, Shahrekord, Iran