A ROBUST FRAMEWORK OF SENTIMENT ANALYSIS FOR ONLINE CUSTOMER REVIEWS AND BLOGS

FAZAL MASUD KHAN (2015) A ROBUST FRAMEWORK OF SENTIMENT ANALYSIS FOR ONLINE CUSTOMER REVIEWS AND BLOGS. Doctoral thesis, INSTITUTE OF COMPUTING AND INFORMATION TECHNOLOGY GOMAL UNIVERSITY.

Abstract

Web 2.0 has dramatically changed the way people communicate. It marks a major move toward a more community-oriented, highly collaborative, interactive and responsive Web. Today we do not merely use the Internet; we are part of this global network. Social media sites have become the world's largest virtual community, where people express their views about products, events and services, anytime and from anywhere. These views have a great impact on community thinking and decisions. One of the most prominent features of this era is the rise of blogging, which provides a resourceful and open means of expression to anyone, anywhere. These data sources provide a rich basis for sentiment analysis. Statistics show that 80% of consumers have changed a purchase decision based on negative reviews found online. One study found that blogs are 63% more likely to influence purchase decisions than magazines. The evolution of social media has fueled interest in sentiment analysis. There are two main approaches to extracting sentiment automatically: the lexicon-based approach and the statistical or machine learning approach. The latter demands a large amount of training data to learn the lexical items that express sentiment, and its performance drops when the same classifier is used in a different domain. The main focus of this work is to develop a lexicon-based framework for the automatic classification of blogs and reviews with respect to their semantic orientation. The method consists of three major components: sentiment analysis, slang detection and scoring, and a context-aware spelling corrector. Lexicon-based methods for sentiment analysis are robust, perform well across domains and can easily be enhanced with additional sources of knowledge. They perform well on blog posts and reviews and are also a preferable classifier for handling contextual valence shifters. Despite these merits, no single lexicon can perform optimally all the time.
This method uses a dynamic, updateable and comprehensive lexicon, built from existing opinion lexicons, dictionaries and other machine-readable resources, to classify user-generated content into positive, negative and neutral polarity. Slang handling and spelling correction are two vital elements of sentiment analysis, because slang and misspelled words may affect the sentiment score. These two issues were handled using Web resources and a statistical language model. The proposed work was implemented and evaluated on different datasets of reviews and blogs. The empirical results show that the proposed work outperforms existing related methods and achieves 90.3% accuracy on average. The method showed high accuracy in binary classification. All three components of the proposed method performed well across different domains.
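
As a rough illustration of the lexicon-based idea described in the abstract, the following Python sketch scores a text against a small polarity lexicon, maps slang to standard words before scoring, and flips polarity after a negator (a simple contextual valence shifter). It is only a minimal sketch under stated assumptions, not the thesis's implementation: the lexicon entries, scores, slang map, negator list and the function name `classify` are all hypothetical, and the context-aware spelling corrector is not modelled.

```python
import re

# Toy resources: every entry and score below is an illustrative assumption,
# not data taken from the thesis.
LEXICON = {"good": 1.0, "great": 2.0, "love": 2.0,
           "bad": -1.0, "poor": -1.0, "terrible": -2.0}
SLANG = {"gr8": "great", "luv": "love"}  # hypothetical slang-to-standard map
NEGATORS = {"not", "never", "no"}        # simple contextual valence shifters


def classify(text: str) -> str:
    """Return 'positive', 'negative' or 'neutral' for a piece of text."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    tokens = [SLANG.get(t, t) for t in tokens]  # normalise slang before scoring
    score = 0.0
    negate = False
    for token in tokens:
        if token in NEGATORS:
            negate = True
            continue
        value = LEXICON.get(token, 0.0)
        if value != 0.0:
            # A pending negator flips the polarity of the next sentiment word only.
            score += -value if negate else value
            negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `classify("This phone is gr8")` resolves the slang term to a positive lexicon entry, while `classify("not good at all")` is pushed to the negative class by the negator. A realistic system would replace the toy dictionaries with a comprehensive, updateable lexicon of the kind the abstract describes.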

Item Type: Thesis (Doctoral)
Uncontrolled Keywords: ROBUST, FRAMEWORK, SENTIMENT, CUSTOMER REVIEWS, BLOGS
Subjects: Q Science > QA Mathematics
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Depositing User: Unnamed user with email jmemon@hec.gov.pk
Date Deposited: 25 Sep 2017 04:08
Last Modified: 25 Sep 2017 04:08
URI: http://eprints.hec.gov.pk/id/eprint/6347
